
Overview

The Agent class is the central component of Tyler, providing a flexible interface for creating AI agents with tool use, delegation capabilities, and conversation management.

Creating an Agent

from tyler import Agent

agent = Agent(
    name="MyAssistant",
    model_name="gpt-4o",
    purpose="To help users with their tasks",
    temperature=0.7,
    tools=[...],  # Optional tools
    agents=[...]  # Optional sub-agents for delegation
)

All Parameters

name
string
default:"Tyler"
The name of your agent. This is used in the system prompt to give the agent an identity.
model_name
string
default:"gpt-4.1"
The LLM model to use. Supports any LiteLLM compatible model including OpenAI, Anthropic, Gemini, and more.
purpose
string | Prompt
default:"To be a helpful assistant."
The agent’s purpose or system prompt. Can be a string or a Tyler Prompt object for more complex prompts.
temperature
float
default:"0.7"
Controls randomness in responses. Range is 0.0 to 2.0, where lower values make output more focused and deterministic.
drop_params
bool
default:"True"
Whether to automatically drop unsupported parameters for specific models. When True, parameters like temperature are automatically removed for models that don’t support them (e.g., O-series models). This ensures seamless compatibility across different model providers without requiring model-specific configuration.
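A brief sketch of what this enables (the model name and the dropped parameter are illustrative assumptions):
from tyler import Agent

# Sketch: with drop_params=True (the default), parameters like temperature
# are removed automatically for models that don't accept them (e.g., O-series),
# so the same construction works across providers.
agent = Agent(
    name="Reasoner",
    model_name="o3-mini",  # assumed model name for illustration
    temperature=0.7,       # silently dropped for models that reject it
)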
tools
List[Union[str, Dict, Callable, ModuleType]]
default:"[]"
List of tools available to the agent. Can include:
  • Direct tool function references (callables)
  • Tool module namespaces (modules like web, files)
  • Built-in tool module names (strings like "web", "files")
  • Custom tool definitions (dicts with ‘definition’, ‘implementation’, and optional ‘attributes’ keys)
For module names, you can specify specific tools using 'module:tool1,tool2' format, as sketched below.
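A sketch mixing these forms; the specific tool names in the selector, the custom callable, and the exact shape of the 'definition' dict are illustrative assumptions:
from tyler import Agent

async def lookup_weather(city: str) -> str:
    """A hypothetical callable tool passed directly to the agent."""
    return f"Weather for {city}: sunny"

agent = Agent(
    name="ToolDemo",
    tools=[
        "web",                          # built-in tool module by name
        "files:read_file,write_file",   # select specific tools from a module (tool names assumed)
        lookup_weather,                 # direct callable reference
        {                               # custom tool definition dict
            "definition": {             # assumed OpenAI-style function schema
                "type": "function",
                "function": {
                    "name": "lookup_weather_v2",
                    "description": "Hypothetical weather lookup",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            },
            "implementation": lookup_weather,
        },
    ],
)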
agents
List[Agent]
default:"[]"
List of sub-agents that this agent can delegate tasks to. Enables multi-agent systems and task delegation.
max_tool_iterations
int
default:"10"
Maximum number of tool calls allowed per conversation turn. Prevents infinite loops in tool usage.
api_base
string | None
default:"None"
Custom API base URL for the model provider (e.g., for using alternative inference services). You can also use base_url as an alias for this parameter.
base_url
string | None
default:"None"
Alias for api_base. Either parameter can be used to specify a custom API endpoint.
api_key
string | None
default:"None"
API key for the model provider. If not provided, LiteLLM will use environment variables (e.g., OPENAI_API_KEY, WANDB_API_KEY). Use this when you need to explicitly pass an API key, such as with W&B Inference or custom providers.
extra_headers
Dict[str, str] | None
default:"None"
Additional headers to include in API requests. Useful for authentication tokens, API keys, or tracking headers.
reasoning
string | Dict | None
default:"None"
Enable reasoning/thinking tokens for supported models (OpenAI o1/o3, DeepSeek-R1, Claude with extended thinking).
  • String: 'low', 'medium', 'high' (recommended for most use cases)
  • Dict: Provider-specific config (e.g., {'type': 'enabled', 'budget_tokens': 1024} for Anthropic)
When enabled, the model will show its internal reasoning process before generating the final answer.
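A brief sketch of both forms (the model names are assumptions for illustration):
from tyler import Agent

# Effort-level string (recommended for most use cases)
fast_agent = Agent(model_name="o3-mini", reasoning="medium")  # assumed model name

# Provider-specific dict, e.g. Anthropic extended thinking
thinking_agent = Agent(
    model_name="claude-sonnet-4-20250514",  # assumed model name
    reasoning={"type": "enabled", "budget_tokens": 1024},
)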
step_errors_raise
bool
default:"False"
If True, the step() method will raise exceptions instead of returning error messages. Used for backward compatibility and custom error handling.
notes
string | Prompt
default:""
Supporting notes to help the agent accomplish its purpose. These are included in the system prompt and can provide additional context or instructions.
version
string
default:"1.0.0"
Version identifier for the agent. Useful for tracking agent iterations and changes.
thread_store
ThreadStore | None
default:"None"
Thread store instance for managing conversation threads. If not provided, uses the default thread store. This parameter is excluded from serialization.
file_store
FileStore | None
default:"None"
File store instance for managing file attachments. If not provided, uses the default file store. This parameter is excluded from serialization.
message_factory
MessageFactory | None
default:"None"
Custom message factory for creating standardized messages. Advanced users can provide a custom implementation to control message formatting and structure. If not provided, uses the default factory. This parameter is excluded from serialization (recreated on deserialization).
completion_handler
CompletionHandler | None
default:"None"
Custom completion handler for LLM communication. Advanced users can provide a custom implementation to modify how the agent communicates with LLMs. If not provided, uses the default handler. This parameter is excluded from serialization (recreated on deserialization).
response_type
Type[BaseModel] | None
default:"None"
Optional Pydantic model to enforce structured output from the LLM. Uses the output-tool pattern: your Pydantic schema is registered as a special tool, and tool_choice="required" forces the model to call it. The validated model instance is available in AgentResult.structured_data. This approach allows regular tools to work alongside structured output. LiteLLM automatically translates tool_choice for different providers (OpenAI, Anthropic, Gemini, Bedrock, etc.). Can be set at the agent level (default for all runs) or overridden per run() call. See the Structured Output Guide for details.
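A minimal sketch (the Invoice model is hypothetical; a fuller example appears under With Structured Output below):
from pydantic import BaseModel
from tyler import Agent

class Invoice(BaseModel):
    vendor: str
    total: float

# Agent-level default: every run returns a validated Invoice
agent = Agent(model_name="gpt-4o", response_type=Invoice)

# Or override per run:
# result = await agent.run(thread, response_type=Invoice)
# invoice = result.structured_data  # validated Invoice instance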
retry_config
RetryConfig | None
default:"None"
Configuration for automatic retry behavior. When enabled with structured output, the agent will automatically retry LLM calls if validation fails, providing error feedback to help the LLM correct its output.
from tyler import RetryConfig

agent = Agent(
    retry_config=RetryConfig(
        max_retries=3,
        retry_on_validation_error=True,
        backoff_base_seconds=1.0
    )
)
See RetryConfig for all options.
tool_context
Dict[str, Any] | None
default:"None"
Default request identity passed to tools via the ctx parameter. Primarily used for user/org/session identity that answers “who is making this request?” Can be set at the agent level (for system agents or default identity) or per run() call (typical for user requests). When both are provided, run-level merges with and overrides agent-level for conflicting keys.
# System agent with fixed identity
system_agent = Agent(
    model_name="gpt-4o",
    tool_context={"user_id": "system", "role": "admin"}
)

# User-facing agent - identity passed per request
agent = Agent(model_name="gpt-4o", tools=my_tools)
await agent.run(thread, tool_context={
    "user_id": request.user.id,
    "org_id": request.user.org_id,
    "permissions": request.user.permissions
})
Infrastructure (database clients, API clients) should be closed over at tool definition time, not passed in context. See ToolContext for the recommended pattern.
response_format
Literal['json'] | None
default:"None"
Simple JSON mode for when you want any valid JSON without schema validation. Pass response_format="json" to run() to force JSON output.
import json

# Get any valid JSON (no schema validation)
result = await agent.run(thread, response_format="json")
data = json.loads(result.content)  # Parse yourself
Cannot be used with response_type. Use response_type for schema validation, or response_format="json" for simple JSON without validation.

Creating from Config Files

Agent.from_config
classmethod
Create an Agent from a YAML configuration file. Enables reusing the same configuration between CLI and Python code.
from tyler import Agent

# Auto-discover config (searches standard locations)
agent = Agent.from_config()

# Load from specific path
agent = Agent.from_config("my-config.yaml")

# With parameter overrides
agent = Agent.from_config(
    "config.yaml",
    temperature=0.9,
    model_name="gpt-4o"
)

Parameters

config_path
string | None
default:"None"
Path to YAML config file (.yaml or .yml). If None, searches standard locations:
  1. ./tyler-chat-config.yaml (current directory)
  2. ~/.tyler/chat-config.yaml (user home)
  3. /etc/tyler/chat-config.yaml (system-wide)
**overrides
Any
Override any config values. These replace (not merge) config file values using shallow dict update semantics. Examples:
  • tools=["web"] replaces entire tools list
  • temperature=0.9 replaces temperature value
  • mcp={...} replaces entire mcp dict (not merged)

Config File Format

# Agent Identity
name: "MyAgent"
purpose: "To help with tasks"
notes: "Additional instructions"

# Model Configuration
model_name: "gpt-4.1"
temperature: 0.7
max_tool_iterations: 10
reasoning: "low"  # For models supporting thinking tokens

# Tools
tools:
  - "web"           # Built-in tool module
  - "slack"
  - "./my_tools.py" # Custom tool file (relative to config)

# MCP Servers (optional)
mcp:
  servers:
    - name: "docs"
      transport: "streamablehttp"
      url: "https://example.com/mcp"

# Environment variables (keeps secrets safe)
api_key: "${OPENAI_API_KEY}"  # Reads from environment
Config files use the same format as tyler-chat CLI configs, so you can share configurations between interactive CLI sessions and your Python code.

Advanced Config Loading

For more control over config loading, use load_config() directly:
from tyler import load_config, Agent

# Load config into dict
config = load_config("config.yaml")

# Inspect and modify before creating agent
print(f"Model: {config['model_name']}")
config["temperature"] = 0.9
config["notes"] += "\nModified programmatically"

# Create agent from modified config
agent = Agent(**config)

Processing Conversations

Conversations are processed through two primary methods: run() returns a complete result once the agent finishes, while stream() yields output incrementally. Streaming has two modes: high-level execution events (the default) and raw LiteLLM chunks (mode="raw").

Non-Streaming Mode (run())

from tyler import Thread, Message

# Create a conversation thread
thread = Thread()
thread.add_message(Message(role="user", content="Hello!"))

# Process the thread
result = await agent.run(thread)

# Access the response
print(result.content)  # The agent's final response
print(result.thread)  # Updated thread with all messages
print(result.new_messages)  # New messages added in this turn

With Structured Output

Get type-safe, validated responses using Pydantic models:
from pydantic import BaseModel
from tyler import Agent, Thread, Message, RetryConfig

class SupportTicket(BaseModel):
    priority: str
    category: str
    summary: str

agent = Agent(
    name="classifier",
    model_name="gpt-4o",
    retry_config=RetryConfig(max_retries=2)  # Retry on validation failure
)

thread = Thread()
thread.add_message(Message(role="user", content="My payment failed!"))

# Get structured output
result = await agent.run(thread, response_type=SupportTicket)

# Access the validated Pydantic model
ticket: SupportTicket = result.structured_data
print(f"Priority: {ticket.priority}")
print(f"Category: {ticket.category}")

With Tool Context (Request Identity)

Pass request-scoped identity to your tools:
from tyler import ToolContext  # assumed import path for the ToolContext type

# Define tools with infrastructure closed over
def create_order_tools(db):
    async def get_user_orders(ctx: ToolContext, limit: int = 10) -> str:
        user_id = ctx["user_id"]  # Identity from context
        org_id = ctx["org_id"]    # Tenant isolation
        orders = await db.get_orders(user_id, org_id, limit)
        return f"Found {len(orders)} orders"
    return [get_user_orders]

# Create agent with tools (db already closed over)
agent = Agent(model_name="gpt-4o", tools=create_order_tools(database))

# Run with identity context
result = await agent.run(
    thread,
    tool_context={
        "user_id": current_user.id,
        "org_id": current_user.org_id,
        "permissions": current_user.permissions
    }
)
See ToolContext for the recommended pattern.

Event Streaming Mode (stream())

Stream responses as high-level ExecutionEvent objects with full observability:
from tyler import EventType

# Stream responses in real-time as ExecutionEvent objects
async for event in agent.stream(thread):
    
    if event.type == EventType.LLM_STREAM_CHUNK:
        # Content being generated
        print(event.data["content_chunk"], end="", flush=True)
    
    elif event.type == EventType.TOOL_SELECTED:
        # Tool about to be called
        print(f"Using tool: {event.data['tool_name']}")
    
    elif event.type == EventType.MESSAGE_CREATED:
        # New message added to thread
        msg = event.data["message"]
        print(f"New {msg.role} message")

Raw Streaming Mode (stream(mode="raw"))

Raw mode is for advanced use cases requiring OpenAI compatibility. Tools ARE executed, but you only receive raw LiteLLM chunks (no ExecutionEvents for observability).
Stream raw LiteLLM chunks in OpenAI-compatible format for direct integration:
# Get raw chunks for OpenAI compatibility
async for chunk in agent.stream(thread, mode="raw"):
    # chunk is a raw LiteLLM object with OpenAI structure
    if hasattr(chunk, 'choices') and chunk.choices:
        delta = chunk.choices[0].delta
        
        # Access content
        if hasattr(delta, 'content') and delta.content:
            print(delta.content, end="", flush=True)
        
        # Tool-call deltas (tools are executed by the agent; deltas are passed through)
        if hasattr(delta, 'tool_calls') and delta.tool_calls:
            print(f"\nTool call delta: {delta.tool_calls}")
    
    # Usage info in final chunk
    if hasattr(chunk, 'usage') and chunk.usage:
        print(f"\nTokens: {chunk.usage.total_tokens}")
When to use raw mode:
  • Building OpenAI API proxies or gateways
  • Direct integration with OpenAI-compatible clients
  • Debugging provider-specific behavior
  • Minimal latency requirements (no transformation overhead)
How raw mode works:
  • ✅ Tools ARE executed (fully agentic behavior)
  • ✅ Multi-turn iteration supported (continues until task complete)
  • ✅ Raw chunks show tool calls via finish_reason: "tool_calls"
  • ⚠️ No ExecutionEvent telemetry (only raw chunks)
  • ⚠️ Silent during tool execution (brief pauses between chunk streams)
  • ⚠️ Consumer handles chunk formatting (SSE serialization, etc.; see the sketch below)
The pattern matches OpenAI’s Agents SDK: chunks → finish_reason=“tool_calls” → [tools execute silently] → more chunks → repeat until done. See the streaming guide for more details and examples.
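As a sketch of the consumer-side formatting noted above, raw chunks could be forwarded as Server-Sent Events. This assumes the LiteLLM chunk objects expose Pydantic's model_dump_json(); adjust serialization for your LiteLLM version:
import json

async def sse_stream(agent, thread):
    """Hypothetical helper: forward raw LiteLLM chunks as SSE lines."""
    async for chunk in agent.stream(thread, mode="raw"):
        # Serialize the chunk; fall back to str() if it isn't a Pydantic model.
        try:
            payload = chunk.model_dump_json()
        except AttributeError:
            payload = json.dumps({"chunk": str(chunk)})
        yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # OpenAI-style stream terminator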

Return Values

AgentResult (Non-Streaming)

@dataclass
class AgentResult:
    thread: Thread  # Updated thread with all messages
    new_messages: List[Message]  # New messages from this execution
    content: Optional[str]  # The final assistant response
    structured_data: Optional[BaseModel]  # Validated Pydantic model (if response_type used)
When using response_type, the structured_data field contains the validated Pydantic model instance. See AgentResult for full documentation.

ExecutionEvent (Streaming)

@dataclass
class ExecutionEvent:
    type: EventType  # Type of event
    timestamp: datetime  # When the event occurred
    data: Dict[str, Any]  # Event-specific data
    metadata: Optional[Dict[str, Any]]  # Additional metadata

Event Types

  • ITERATION_START - New iteration beginning
  • LLM_REQUEST - Request sent to LLM
  • LLM_RESPONSE - Complete response received
  • LLM_STREAM_CHUNK - Streaming content chunk
  • TOOL_SELECTED - Tool about to be called
  • TOOL_RESULT - Tool execution completed
  • TOOL_ERROR - Tool execution failed
  • MESSAGE_CREATED - New message added
  • EXECUTION_COMPLETE - All processing done
  • EXECUTION_ERROR - Processing failed
  • ITERATION_LIMIT - Max iterations reached
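A hedged sketch of a generic event logger covering several of these types; it relies only on event.type and event.timestamp from the ExecutionEvent dataclass above, since data payloads vary by event type:
from tyler import EventType

# Map event types to short labels for logging.
LABELS = {
    EventType.ITERATION_START: "iteration started",
    EventType.TOOL_SELECTED: "tool selected",
    EventType.TOOL_RESULT: "tool finished",
    EventType.TOOL_ERROR: "tool failed",
    EventType.EXECUTION_COMPLETE: "done",
}

async def log_events(agent, thread):
    async for event in agent.stream(thread):
        label = LABELS.get(event.type, str(event.type))
        print(f"[{event.timestamp:%H:%M:%S}] {label}")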

Execution Details

You can access execution information through the thread and messages:
# Calculate timing from messages
if result.new_messages:
    start_time = min(msg.timestamp for msg in result.new_messages)
    end_time = max(msg.timestamp for msg in result.new_messages)
    duration_ms = (end_time - start_time).total_seconds() * 1000
    print(f"Duration: {duration_ms:.0f}ms")
    print(f"Started: {start_time}")
    print(f"Ended: {end_time}")

# Token usage from thread
token_stats = result.thread.get_total_tokens()
print(f"Total tokens: {token_stats['overall']['total_tokens']}")

# Tool usage from thread
tool_usage = result.thread.get_tool_usage()
if tool_usage['total_calls'] > 0:
    print(f"\nTools used:")
    for tool_name, count in tool_usage['tools'].items():
        print(f"  {tool_name}: {count} calls")

Working with Tools

from lye import WEB_TOOLS, FILES_TOOLS

agent = Agent(
    name="ResearchAssistant",
    model_name="gpt-4o",
    purpose="To research topics and create reports",
    tools=[*WEB_TOOLS, *FILES_TOOLS]
)

# The agent can now browse the web and work with files
result = await agent.run(thread)

# Check which tools were used
tool_usage = result.thread.get_tool_usage()
for tool_name, count in tool_usage['tools'].items():
    print(f"Used {tool_name}: {count} times")

Agent Delegation

researcher = Agent(
    name="Researcher",
    purpose="To find information",
    tools=[*WEB_TOOLS]
)

writer = Agent(
    name="Writer",
    purpose="To create content",
    tools=[*FILES_TOOLS]
)

coordinator = Agent(
    name="Coordinator",
    purpose="To manage research and writing tasks",
    agents=[researcher, writer]  # Can delegate to these agents
)

# The coordinator can now delegate tasks
result = await coordinator.run(thread)

Custom Configuration

Using custom API endpoints

# Use a custom API endpoint
agent = Agent(
    model_name="gpt-4",
    api_base="https://your-api.com/v1",
    extra_headers={"Authorization": "Bearer token"}
)

W&B Inference configuration

import os

# Use W&B Inference with DeepSeek-R1 (thinking tokens)
agent = Agent(
    model_name="openai/deepseek-ai/DeepSeek-R1-0528",
    base_url="https://api.inference.wandb.ai/v1",
    api_key=os.getenv("WANDB_API_KEY"),  # Your W&B API key
    extra_headers={
        "HTTP-Referer": "https://wandb.ai/my-team/my-project",
        "X-Project-Name": "my-team/my-project"
    },
    reasoning="low",  # Enable thinking tokens
    temperature=0.7
)
For W&B Inference, you can also use YAML config with environment variable substitution:
model_name: "openai/deepseek-ai/DeepSeek-R1-0528"
base_url: "https://api.inference.wandb.ai/v1"
api_key: "${WANDB_API_KEY}"  # Reads from environment

Custom storage configuration

from narrator import ThreadStore, FileStore

agent = Agent(
    thread_store=ThreadStore(backend="postgresql"),
    file_store=FileStore(path="/custom/path")
)

Weave Tracing and Serialization

Tyler Agent uses @weave.op() decorators for comprehensive tracing. When you initialize Weave, all agent method calls are automatically traced and logged to your Weave dashboard.
import weave
from tyler import Agent

# Initialize Weave - enables automatic tracing
weave.init("my-project")

# Create an agent
agent = Agent(
    name="MyAgent",
    model_name="gpt-4o",
    temperature=0.7
)

# All calls are automatically traced to Weave!
result = await agent.run(thread)
# View traces at: https://wandb.ai/<entity>/my-project/weave
Don’t use weave.publish(agent): published Agent objects cannot be retrieved and used (Weave returns an unusable ObjectRecord). For reproducibility, publish your configuration instead:
# ✅ DO: Publish config dict for reproducibility
config = {
    "name": "MyAgent",
    "model_name": "gpt-4o",
    "temperature": 0.7,
    "tools": ["web", "files"]
}
weave.publish(config, name="my-agent-config")

# Later, retrieve and recreate
config_ref = weave.ref("my-agent-config:v1").get()
agent = Agent(**config_ref)

# ❌ DON'T: Publish agent object directly
# weave.publish(agent)  # The result cannot be used!

Pydantic Serialization

Agents inherit from pydantic.BaseModel and support standard Pydantic serialization:
# Serialize to dict
agent_dict = agent.model_dump()

# Serialize to JSON
agent_json = agent.model_dump_json()

# Deserialize (helper objects are automatically recreated)
restored_agent = Agent(**agent_dict)

# restored_agent works exactly like the original
result = await restored_agent.run(thread)
The following attributes are excluded from serialization and automatically recreated:
  • thread_store - Database connections
  • file_store - File system state
  • message_factory - Message creation helper
  • completion_handler - LLM communication helper
If you provide custom helpers, they will be preserved during initialization but not serialized.

Best practices

  1. Clear Purpose: Define a specific, focused purpose for each agent
  2. Tool Selection: Only include tools the agent actually needs
  3. Temperature: Use lower values (0.0-0.3) for consistency, higher (0.7-1.0) for creativity
  4. Error Handling: Always handle potential errors in production (see the sketch after this list)
  5. Token Limits: Monitor token usage to avoid hitting limits
  6. Streaming: Use streaming for better user experience in interactive applications
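A minimal error-handling sketch for item 4; the broad except clause is illustrative, so narrow it to Tyler- or provider-specific exceptions in real code:
import logging

logger = logging.getLogger(__name__)

async def safe_run(agent, thread):
    """Hypothetical wrapper: run the agent and degrade gracefully on failure."""
    try:
        result = await agent.run(thread)
        return result.content
    except Exception:
        # Log full details server-side; return a safe message to the user.
        logger.exception("Agent run failed")
        return "Sorry, something went wrong while processing your request."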

Example: Complete Application

import asyncio
from tyler import Agent, Thread, Message, EventType
from lye import WEB_TOOLS

async def main():
    # Create agent
    agent = Agent(
        name="WebAssistant",
        model_name="gpt-4o",
        purpose="To help users find information online",
        tools=WEB_TOOLS,
        temperature=0.3
    )
    
    # Create thread
    thread = Thread()
    
    # Add user message
    thread.add_message(Message(
        role="user",
        content="What's the latest news about AI?"
    ))
    
    # Process with streaming
    print("Assistant: ", end="", flush=True)
    
    async for event in agent.stream(thread):
        if event.type == EventType.LLM_STREAM_CHUNK:
            print(event.data["content_chunk"], end="", flush=True)
        elif event.type == EventType.TOOL_SELECTED:
            print(f"\n[Searching: {event.data['tool_name']}...]\n", end="", flush=True)
    
    print("\n")

if __name__ == "__main__":
    asyncio.run(main())