
Overview

The Agent class is the central component of Tyler, providing a flexible interface for creating AI agents with tool use, delegation capabilities, and conversation management.

Creating an Agent

from tyler import Agent

agent = Agent(
    name="MyAssistant",
    model_name="gpt-4o",
    purpose="To help users with their tasks",
    temperature=0.7,
    tools=[...],  # Optional tools
    agents=[...]  # Optional sub-agents for delegation
)

All Parameters

name
string
default:"Tyler"
The name of your agent. This is used in the system prompt to give the agent an identity.
model_name
string
default:"gpt-4.1"
The LLM model to use. Supports any LiteLLM compatible model including OpenAI, Anthropic, Gemini, and more.
purpose
string | Prompt
default:"To be a helpful assistant."
The agent’s purpose or system prompt. Can be a string or a Tyler Prompt object for more complex prompts.
temperature
float
default:"0.7"
Controls randomness in responses. Range is 0.0 to 2.0, where lower values make output more focused and deterministic.
drop_params
bool
default:"True"
Whether to automatically drop unsupported parameters for specific models. When True, parameters like temperature are automatically removed for models that don’t support them (e.g., O-series models). This ensures seamless compatibility across different model providers without requiring model-specific configuration.
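A brief sketch of what this enables (the model name and the dropped parameter are illustrative assumptions):
from tyler import Agent

# Sketch: with drop_params=True (the default), parameters like temperature
# are removed automatically for models that don't accept them (e.g., O-series),
# so the same construction works across providers.
agent = Agent(
    name="Reasoner",
    model_name="o3-mini",  # assumed model name for illustration
    temperature=0.7,       # silently dropped for models that reject it
)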
tools
List[Union[str, Dict, Callable, ModuleType]]
default:"[]"
List of tools available to the agent. Can include:
  • Direct tool function references (callables)
  • Tool module namespaces (modules like web, files)
  • Built-in tool module names (strings like "web", "files")
  • Custom tool definitions (dicts with ‘definition’, ‘implementation’, and optional ‘attributes’ keys)
For module names, you can specify specific tools using 'module:tool1,tool2' format, as sketched below.
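A sketch mixing these forms; the specific tool names in the selector, the custom callable, and the exact shape of the 'definition' dict are illustrative assumptions:
from tyler import Agent

async def lookup_weather(city: str) -> str:
    """A hypothetical callable tool passed directly to the agent."""
    return f"Weather for {city}: sunny"

agent = Agent(
    name="ToolDemo",
    tools=[
        "web",                          # built-in tool module by name
        "files:read_file,write_file",   # select specific tools from a module (tool names assumed)
        lookup_weather,                 # direct callable reference
        {                               # custom tool definition dict
            "definition": {             # assumed OpenAI-style function schema
                "type": "function",
                "function": {
                    "name": "lookup_weather_v2",
                    "description": "Hypothetical weather lookup",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            },
            "implementation": lookup_weather,
        },
    ],
)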
agents
List[Agent]
default:"[]"
List of sub-agents that this agent can delegate tasks to. Enables multi-agent systems and task delegation.
max_tool_iterations
int
default:"10"
Maximum number of tool calls allowed per conversation turn. Prevents infinite loops in tool usage.
api_base
string | None
default:"None"
Custom API base URL for the model provider (e.g., for using alternative inference services). You can also use base_url as an alias for this parameter.
base_url
string | None
default:"None"
Alias for api_base. Either parameter can be used to specify a custom API endpoint.
api_key
string | None
default:"None"
API key for the model provider. If not provided, LiteLLM will use environment variables (e.g., OPENAI_API_KEY, WANDB_API_KEY). Use this when you need to explicitly pass an API key, such as with W&B Inference or custom providers.
extra_headers
Dict[str, str] | None
default:"None"
Additional headers to include in API requests. Useful for authentication tokens, API keys, or tracking headers.
reasoning
string | Dict | None
default:"None"
Enable reasoning/thinking tokens for supported models (OpenAI o1/o3, DeepSeek-R1, Claude with extended thinking).
  • String: 'low', 'medium', 'high' (recommended for most use cases)
  • Dict: Provider-specific config (e.g., {'type': 'enabled', 'budget_tokens': 1024} for Anthropic)
When enabled, the model will show its internal reasoning process before generating the final answer.
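A brief sketch of both forms (the model names are assumptions for illustration):
from tyler import Agent

# Effort-level string (recommended for most use cases)
fast_agent = Agent(model_name="o3-mini", reasoning="medium")  # assumed model name

# Provider-specific dict, e.g. Anthropic extended thinking
thinking_agent = Agent(
    model_name="claude-sonnet-4-20250514",  # assumed model name
    reasoning={"type": "enabled", "budget_tokens": 1024},
)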
step_errors_raise
bool
default:"False"
If True, the step() method will raise exceptions instead of returning error messages. Used for backward compatibility and custom error handling.
notes
string | Prompt
default:""
Supporting notes to help the agent accomplish its purpose. These are included in the system prompt and can provide additional context or instructions.
version
string
default:"1.0.0"
Version identifier for the agent. Useful for tracking agent iterations and changes.
thread_store
ThreadStore | None
default:"None"
Thread store instance for managing conversation threads. If not provided, uses the default thread store. This parameter is excluded from serialization.
file_store
FileStore | None
default:"None"
File store instance for managing file attachments. If not provided, uses the default file store. This parameter is excluded from serialization.
message_factory
MessageFactory | None
default:"None"
Custom message factory for creating standardized messages. Advanced users can provide a custom implementation to control message formatting and structure. If not provided, uses the default factory. This parameter is excluded from serialization (recreated on deserialization).
completion_handler
CompletionHandler | None
default:"None"
Custom completion handler for LLM communication. Advanced users can provide a custom implementation to modify how the agent communicates with LLMs. If not provided, uses the default handler. This parameter is excluded from serialization (recreated on deserialization).
response_type
Type[BaseModel] | None
default:"None"
Optional Pydantic model to enforce structured output from the LLM. Uses the output-tool pattern: your Pydantic schema is registered as a special tool, and tool_choice="required" forces the model to call it. The validated model instance is available in AgentResult.structured_data. This approach allows regular tools to work alongside structured output. LiteLLM automatically translates tool_choice for different providers (OpenAI, Anthropic, Gemini, Bedrock, etc.). Can be set at the agent level (default for all runs) or overridden per run() call. See the Structured Output Guide for details.
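A minimal sketch (the Invoice model is hypothetical; a fuller example appears under With Structured Output below):
from pydantic import BaseModel
from tyler import Agent

class Invoice(BaseModel):
    vendor: str
    total: float

# Agent-level default: every run returns a validated Invoice
agent = Agent(model_name="gpt-4o", response_type=Invoice)

# Or override per run:
# result = await agent.run(thread, response_type=Invoice)
# invoice = result.structured_data  # validated Invoice instance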
retry_config
RetryConfig | None
default:"None"
Configuration for automatic retry behavior. When enabled with structured output, the agent will automatically retry LLM calls if validation fails, providing error feedback to help the LLM correct its output.
from tyler import RetryConfig

agent = Agent(
    retry_config=RetryConfig(
        max_retries=3,
        retry_on_validation_error=True,
        backoff_base_seconds=1.0
    )
)
See RetryConfig for all options.
tool_context
Dict[str, Any] | None
default:"None"
Default request identity passed to tools via the ctx parameter. Primarily used for user/org/session identity that answers “who is making this request?” Can be set at the agent level (for system agents or default identity) or per run() call (typical for user requests). When both are provided, run-level merges with and overrides agent-level for conflicting keys.
# System agent with fixed identity
system_agent = Agent(
    model_name="gpt-4o",
    tool_context={"user_id": "system", "role": "admin"}
)

# User-facing agent - identity passed per request
agent = Agent(model_name="gpt-4o", tools=my_tools)
await agent.run(thread, tool_context={
    "user_id": request.user.id,
    "org_id": request.user.org_id,
    "permissions": request.user.permissions
})
Infrastructure (database clients, API clients) should be closed over at tool definition time, not passed in context. See ToolContext for the recommended pattern.
response_format
Literal['json'] | None
default:"None"
Simple JSON mode for when you want any valid JSON without schema validation. Pass response_format="json" to run() to force JSON output.
import json

# Get any valid JSON (no schema validation)
result = await agent.run(thread, response_format="json")
data = json.loads(result.content)  # Parse yourself
Cannot be used with response_type. Use response_type for schema validation, or response_format="json" for simple JSON without validation.

Creating from Config Files

Agent.from_config
classmethod
Create an Agent from a YAML configuration file. Enables reusing the same configuration between CLI and Python code.
from tyler import Agent

# Auto-discover config (searches standard locations)
agent = Agent.from_config()

# Load from specific path
agent = Agent.from_config("my-config.yaml")

# With parameter overrides
agent = Agent.from_config(
    "config.yaml",
    temperature=0.9,
    model_name="gpt-4o"
)

Parameters

config_path
string | None
default:"None"
Path to YAML config file (.yaml or .yml). If None, searches standard locations:
  1. ./tyler-chat-config.yaml (current directory)
  2. ~/.tyler/chat-config.yaml (user home)
  3. /etc/tyler/chat-config.yaml (system-wide)
**overrides
Any
Override any config values. These replace (not merge) config file values using shallow dict update semantics. Examples:
  • tools=["web"] replaces entire tools list
  • temperature=0.9 replaces temperature value
  • mcp={...} replaces entire mcp dict (not merged)

Config File Format

# Agent Identity
name: "MyAgent"
purpose: "To help with tasks"
notes: "Additional instructions"

# Model Configuration
model_name: "gpt-4.1"
temperature: 0.7
max_tool_iterations: 10
reasoning: "low"  # For models supporting thinking tokens

# Tools
tools:
  - "web"           # Built-in tool module
  - "slack"
  - "./my_tools.py" # Custom tool file (relative to config)

# MCP Servers (optional)
mcp:
  servers:
    - name: "docs"
      transport: "streamablehttp"
      url: "https://example.com/mcp"

# Environment variables (keeps secrets safe)
api_key: "${OPENAI_API_KEY}"  # Reads from environment
Config files use the same format as tyler-chat CLI configs, so you can share configurations between interactive CLI sessions and your Python code.

Advanced Config Loading

For more control over config loading, use load_config() directly:
from tyler import load_config, Agent

# Load config into dict
config = load_config("config.yaml")

# Inspect and modify before creating agent
print(f"Model: {config['model_name']}")
config["temperature"] = 0.9
config["notes"] += "\nModified programmatically"

# Create agent from modified config
agent = Agent(**config)

Processing Conversations

Conversations are processed through two primary methods: run() returns a complete result once the agent finishes, while stream() yields output incrementally. Streaming has two modes: high-level execution events (the default) and raw LiteLLM chunks (mode="raw").

Non-Streaming Mode (run())

from tyler import Thread, Message

# Create a conversation thread
thread = Thread()
thread.add_message(Message(role="user", content="Hello!"))

# Process the thread
result = await agent.run(thread)

# Access the response
print(result.content)  # The agent's final response
print(result.thread)  # Updated thread with all messages
print(result.new_messages)  # New messages added in this turn

With Structured Output

Get type-safe, validated responses using Pydantic models:
from pydantic import BaseModel
from tyler import Agent, Thread, Message, RetryConfig

class SupportTicket(BaseModel):
    priority: str
    category: str
    summary: str

agent = Agent(
    name="classifier",
    model_name="gpt-4o",
    retry_config=RetryConfig(max_retries=2)  # Retry on validation failure
)

thread = Thread()
thread.add_message(Message(role="user", content="My payment failed!"))

# Get structured output
result = await agent.run(thread, response_type=SupportTicket)

# Access the validated Pydantic model
ticket: SupportTicket = result.structured_data
print(f"Priority: {ticket.priority}")
print(f"Category: {ticket.category}")

With Tool Context (Request Identity)

Pass request-scoped identity to your tools:
from tyler import ToolContext  # assumed import path for the ToolContext type

# Define tools with infrastructure closed over
def create_order_tools(db):
    async def get_user_orders(ctx: ToolContext, limit: int = 10) -> str:
        user_id = ctx["user_id"]  # Identity from context
        org_id = ctx["org_id"]    # Tenant isolation
        orders = await db.get_orders(user_id, org_id, limit)
        return f"Found {len(orders)} orders"
    return [get_user_orders]

# Create agent with tools (db already closed over)
agent = Agent(model_name="gpt-4o", tools=create_order_tools(database))

# Run with identity context
result = await agent.run(
    thread,
    tool_context={
        "user_id": current_user.id,
        "org_id": current_user.org_id,
        "permissions": current_user.permissions
    }
)
See ToolContext for the recommended pattern.

Event Streaming Mode (stream())

Stream responses as high-level ExecutionEvent objects with full observability:
from tyler import EventType

# Stream responses in real-time as ExecutionEvent objects
async for event in agent.stream(thread):
    
    if event.type == EventType.LLM_STREAM_CHUNK:
        # Content being generated
        print(event.data["content_chunk"], end="", flush=True)
    
    elif event.type == EventType.TOOL_SELECTED:
        # Tool about to be called
        print(f"Using tool: {event.data['tool_name']}")
    
    elif event.type == EventType.MESSAGE_CREATED:
        # New message added to thread
        msg = event.data["message"]
        print(f"New {msg.role} message")

Raw Streaming Mode (stream(mode="raw"))

Raw mode is for advanced use cases requiring OpenAI compatibility. Tools ARE executed, but you only receive raw LiteLLM chunks (no ExecutionEvents for observability).
Stream raw LiteLLM chunks in OpenAI-compatible format for direct integration:
# Get raw chunks for OpenAI compatibility
async for chunk in agent.stream(thread, mode="raw"):
    # chunk is a raw LiteLLM object with OpenAI structure
    if hasattr(chunk, 'choices') and chunk.choices:
        delta = chunk.choices[0].delta
        
        # Access content
        if hasattr(delta, 'content') and delta.content:
            print(delta.content, end="", flush=True)
        
        # Tool-call deltas (tools are executed by the agent; deltas are passed through)
        if hasattr(delta, 'tool_calls') and delta.tool_calls:
            print(f"\nTool call delta: {delta.tool_calls}")
    
    # Usage info in final chunk
    if hasattr(chunk, 'usage') and chunk.usage:
        print(f"\nTokens: {chunk.usage.total_tokens}")
When to use raw mode:
  • Building OpenAI API proxies or gateways
  • Direct integration with OpenAI-compatible clients
  • Debugging provider-specific behavior
  • Minimal latency requirements (no transformation overhead)
How raw mode works:
  • ✅ Tools ARE executed (fully agentic behavior)
  • ✅ Multi-turn iteration supported (continues until task complete)
  • ✅ Raw chunks show tool calls via finish_reason: "tool_calls"
  • ⚠️ No ExecutionEvent telemetry (only raw chunks)
  • ⚠️ Silent during tool execution (brief pauses between chunk streams)
  • ⚠️ Consumer handles chunk formatting (SSE serialization, etc.; see the sketch below)
The pattern matches OpenAI’s Agents SDK: chunks → finish_reason=“tool_calls” → [tools execute silently] → more chunks → repeat until done. See the streaming guide for more details and examples.
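As a sketch of the consumer-side formatting noted above, raw chunks could be forwarded as Server-Sent Events. This assumes the LiteLLM chunk objects expose Pydantic's model_dump_json(); adjust serialization for your LiteLLM version:
import json

async def sse_stream(agent, thread):
    """Hypothetical helper: forward raw LiteLLM chunks as SSE lines."""
    async for chunk in agent.stream(thread, mode="raw"):
        # Serialize the chunk; fall back to str() if it isn't a Pydantic model.
        try:
            payload = chunk.model_dump_json()
        except AttributeError:
            payload = json.dumps({"chunk": str(chunk)})
        yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # OpenAI-style stream terminator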

Return Values

AgentResult (Non-Streaming)

@dataclass
class AgentResult:
    thread: Thread  # Updated thread with all messages
    new_messages: List[Message]  # New messages from this execution
    content: Optional[str]  # The final assistant response
    structured_data: Optional[BaseModel]  # Validated Pydantic model (if response_type used)
When using response_type, the structured_data field contains the validated Pydantic model instance. See AgentResult for full documentation.

ExecutionEvent (Streaming)

@dataclass
class ExecutionEvent:
    type: EventType  # Type of event
    timestamp: datetime  # When the event occurred
    data: Dict[str, Any]  # Event-specific data
    metadata: Optional[Dict[str, Any]]  # Additional metadata

Event Types

  • ITERATION_START - New iteration beginning
  • LLM_REQUEST - Request sent to LLM
  • LLM_RESPONSE - Complete response received
  • LLM_STREAM_CHUNK - Streaming content chunk
  • TOOL_SELECTED - Tool about to be called
  • TOOL_RESULT - Tool execution completed
  • TOOL_ERROR - Tool execution failed
  • MESSAGE_CREATED - New message added
  • EXECUTION_COMPLETE - All processing done
  • EXECUTION_ERROR - Processing failed
  • ITERATION_LIMIT - Max iterations reached
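A hedged sketch of a generic event logger covering several of these types; it relies only on event.type and event.timestamp from the ExecutionEvent dataclass above, since data payloads vary by event type:
from tyler import EventType

# Map event types to short labels for logging.
LABELS = {
    EventType.ITERATION_START: "iteration started",
    EventType.TOOL_SELECTED: "tool selected",
    EventType.TOOL_RESULT: "tool finished",
    EventType.TOOL_ERROR: "tool failed",
    EventType.EXECUTION_COMPLETE: "done",
}

async def log_events(agent, thread):
    async for event in agent.stream(thread):
        label = LABELS.get(event.type, str(event.type))
        print(f"[{event.timestamp:%H:%M:%S}] {label}")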

Execution Details

You can access execution information through the thread and messages:
# Calculate timing from messages
if result.new_messages:
    start_time = min(msg.timestamp for msg in result.new_messages)
    end_time = max(msg.timestamp for msg in result.new_messages)
    duration_ms = (end_time - start_time).total_seconds() * 1000
    print(f"Duration: {duration_ms:.0f}ms")
    print(f"Started: {start_time}")
    print(f"Ended: {end_time}")

# Token usage from thread
token_stats = result.thread.get_total_tokens()
print(f"Total tokens: {token_stats['overall']['total_tokens']}")

# Tool usage from thread
tool_usage = result.thread.get_tool_usage()
if tool_usage['total_calls'] > 0:
    print(f"\nTools used:")
    for tool_name, count in tool_usage['tools'].items():
        print(f"  {tool_name}: {count} calls")

Working with Tools

from lye import WEB_TOOLS, FILES_TOOLS

agent = Agent(
    name="ResearchAssistant",
    model_name="gpt-4o",
    purpose="To research topics and create reports",
    tools=[*WEB_TOOLS, *FILES_TOOLS]
)

# The agent can now browse the web and work with files
result = await agent.run(thread)

# Check which tools were used
tool_usage = result.thread.get_tool_usage()
for tool_name, count in tool_usage['tools'].items():
    print(f"Used {tool_name}: {count} times")

Agent Delegation

researcher = Agent(
    name="Researcher",
    purpose="To find information",
    tools=[*WEB_TOOLS]
)

writer = Agent(
    name="Writer",
    purpose="To create content",
    tools=[*FILES_TOOLS]
)

coordinator = Agent(
    name="Coordinator",
    purpose="To manage research and writing tasks",
    agents=[researcher, writer]  # Can delegate to these agents
)

# The coordinator can now delegate tasks
result = await coordinator.run(thread)

Custom Configuration

Using custom API endpoints

# Use a custom API endpoint
agent = Agent(
    model_name="gpt-4",
    api_base="https://your-api.com/v1",
    extra_headers={"Authorization": "Bearer token"}
)

W&B Inference configuration

import os

# Use W&B Inference with DeepSeek-R1 (thinking tokens)
agent = Agent(
    model_name="openai/deepseek-ai/DeepSeek-R1-0528",
    base_url="https://api.inference.wandb.ai/v1",
    api_key=os.getenv("WANDB_API_KEY"),  # Your W&B API key
    extra_headers={
        "HTTP-Referer": "https://wandb.ai/my-team/my-project",
        "X-Project-Name": "my-team/my-project"
    },
    reasoning="low",  # Enable thinking tokens
    temperature=0.7
)
For W&B Inference, you can also use YAML config with environment variable substitution:
model_name: "openai/deepseek-ai/DeepSeek-R1-0528"
base_url: "https://api.inference.wandb.ai/v1"
api_key: "${WANDB_API_KEY}"  # Reads from environment

Custom storage configuration

from narrator import ThreadStore, FileStore

agent = Agent(
    thread_store=ThreadStore(backend="postgresql"),
    file_store=FileStore(path="/custom/path")
)

Weave Tracing and Serialization

Tyler Agent uses @weave.op() decorators for comprehensive tracing. When you initialize Weave, all agent method calls are automatically traced and logged to your Weave dashboard.
import weave
from tyler import Agent

# Initialize Weave - enables automatic tracing
weave.init("my-project")

# Create an agent
agent = Agent(
    name="MyAgent",
    model_name="gpt-4o",
    temperature=0.7
)

# All calls are automatically traced to Weave!
result = await agent.run(thread)
# View traces at: https://wandb.ai/<entity>/my-project/weave
Don’t use weave.publish(agent): published Agent objects cannot be retrieved and used (Weave returns an unusable ObjectRecord). For reproducibility, publish your configuration instead:
# ✅ DO: Publish config dict for reproducibility
config = {
    "name": "MyAgent",
    "model_name": "gpt-4o",
    "temperature": 0.7,
    "tools": ["web", "files"]
}
weave.publish(config, name="my-agent-config")

# Later, retrieve and recreate
config_ref = weave.ref("my-agent-config:v1").get()
agent = Agent(**config_ref)

# ❌ DON'T: Publish agent object directly
# weave.publish(agent)  # The result cannot be used!

Pydantic Serialization

Agents inherit from pydantic.BaseModel and support standard Pydantic serialization:
# Serialize to dict
agent_dict = agent.model_dump()

# Serialize to JSON
agent_json = agent.model_dump_json()

# Deserialize (helper objects are automatically recreated)
restored_agent = Agent(**agent_dict)

# restored_agent works exactly like the original
result = await restored_agent.run(thread)
The following attributes are excluded from serialization and automatically recreated:
  • thread_store - Database connections
  • file_store - File system state
  • message_factory - Message creation helper
  • completion_handler - LLM communication helper
If you provide custom helpers, they will be preserved during initialization but not serialized.

Best practices

  1. Clear Purpose: Define a specific, focused purpose for each agent
  2. Tool Selection: Only include tools the agent actually needs
  3. Temperature: Use lower values (0.0-0.3) for consistency, higher (0.7-1.0) for creativity
  4. Error Handling: Always handle potential errors in production (see the sketch after this list)
  5. Token Limits: Monitor token usage to avoid hitting limits
  6. Streaming: Use streaming for better user experience in interactive applications
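A minimal error-handling sketch for item 4; the broad except clause is illustrative, so narrow it to Tyler- or provider-specific exceptions in real code:
import logging

logger = logging.getLogger(__name__)

async def safe_run(agent, thread):
    """Hypothetical wrapper: run the agent and degrade gracefully on failure."""
    try:
        result = await agent.run(thread)
        return result.content
    except Exception:
        # Log full details server-side; return a safe message to the user.
        logger.exception("Agent run failed")
        return "Sorry, something went wrong while processing your request."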

Example: Complete Application

import asyncio
from tyler import Agent, Thread, Message, EventType
from lye import WEB_TOOLS

async def main():
    # Create agent
    agent = Agent(
        name="WebAssistant",
        model_name="gpt-4o",
        purpose="To help users find information online",
        tools=WEB_TOOLS,
        temperature=0.3
    )
    
    # Create thread
    thread = Thread()
    
    # Add user message
    thread.add_message(Message(
        role="user",
        content="What's the latest news about AI?"
    ))
    
    # Process with streaming
    print("Assistant: ", end="", flush=True)
    
    async for event in agent.stream(thread):
        if event.type == EventType.LLM_STREAM_CHUNK:
            print(event.data["content_chunk"], end="", flush=True)
        elif event.type == EventType.TOOL_SELECTED:
            print(f"\n[Searching: {event.data['tool_name']}...]\n", end="", flush=True)
    
    print("\n")

if __name__ == "__main__":
    asyncio.run(main())