Fabro can be used as a Rust SDK with two primary entry points:
  • fabro-agent — a full AI coding agent with tool use, sandboxed execution, event streaming, and context management. Use this when you want to build an agent that can read files, run commands, and interact with a codebase.
  • fabro-llm — a standalone LLM client for multi-provider completions, streaming, and tool execution loops. Use this when you want direct control over LLM calls without the agent layer.
Both crates can be used independently of Fabro’s workflow engine.

Agent (fabro-agent)

The fabro-agent crate provides a session-based AI agent that runs an LLM with tool use in a sandboxed environment. The agent loop streams LLM responses, executes tool calls (shell, read_file, write_file, edit_file, glob, grep, web_fetch, web_search), feeds results back, and repeats until the model responds with text or hits a safety limit.
Cargo.toml
[dependencies]
fabro-agent = { git = "https://github.com/fabro-sh/fabro" }
fabro-llm = { git = "https://github.com/fabro-sh/fabro" }
tokio = { version = "1", features = ["full"] }

Quick start

use fabro_agent::{
    AnthropicProfile, LocalSandbox, Session, SessionConfig,
};
use fabro_llm::client::Client;
use std::path::PathBuf;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::from_env().await?;
    let sandbox = Arc::new(LocalSandbox::new(PathBuf::from(".")));
    let profile = Arc::new(AnthropicProfile::new("claude-sonnet-4-5"));
    let config = SessionConfig::default();

    let mut session = Session::new(client, profile, sandbox, config);
    session.initialize().await;

    // Subscribe to events before sending input
    let mut events = session.subscribe();
    tokio::spawn(async move {
        while let Ok(event) = events.recv().await {
            if let fabro_agent::AgentEvent::TextDelta { delta } = &event.event {
                print!("{delta}");
            }
        }
    });

    session.process_input("List the files in this directory").await?;
    session.close();
    Ok(())
}

Session

Session is the core type. It holds the LLM client, a provider profile, a sandbox, and configuration. The main loop lives inside process_input(). Constructor:
pub fn new(
    llm_client: Client,
    provider_profile: Arc<dyn ProviderProfile>,
    sandbox: Arc<dyn Sandbox>,
    config: SessionConfig,
) -> Self
Lifecycle methods:
| Method | Description |
| --- | --- |
| initialize().await | Discovers project docs, skills, and MCP servers. Call before process_input. |
| process_input(input).await | Sends user input and runs the agent loop until the model stops or a limit is hit. |
| close() | Ends the session and emits SessionEnded. |
| abort() | Cancels the current process_input call. |
| cancel_token() | Returns a CancellationToken for external cancellation. |
Inspection:
| Method | Description |
| --- | --- |
| state() | Returns SessionState: Idle, Processing, AwaitingInput, or Closed. |
| history() | Returns the conversation as &History (a sequence of Turn values). |
| subscribe() | Returns a broadcast receiver for SessionEvent values. |
Steering:
| Method | Description |
| --- | --- |
| steer(message) | Injects a system-level guidance message into the next LLM call. |
| follow_up(message) | Queues a follow-up user message after the current turn completes. |
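As a sketch of how steering fits into a session's flow (method names are from the tables above; exact signatures are assumptions):

```rust
// Hypothetical steering example: nudge the model and queue a follow-up
// before kicking off processing.
let mut session = Session::new(client, profile, sandbox, SessionConfig::default());
session.initialize().await;

// Guidance injected into the next LLM call:
session.steer("Prefer read-only tools; do not modify files.");

// Queued as a user message once the current turn completes:
session.follow_up("Now summarize what you found.");

session.process_input("Investigate the failing test").await?;
```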

SessionConfig

All fields are public. Key settings with their defaults:
| Field | Default | Description |
| --- | --- | --- |
| max_turns | 0 (unlimited) | Maximum conversation turns before stopping. |
| max_tool_rounds_per_input | 200 | Maximum tool execution rounds per process_input call. |
| default_command_timeout_ms | 10,000 | Default timeout for Bash tool commands. |
| max_command_timeout_ms | 600,000 | Maximum allowed timeout for Bash tool commands. |
| enable_loop_detection | true | Detect and break out of repetitive tool call patterns. |
| enable_context_compaction | true | Automatically summarize old turns when approaching the context window limit. |
| compaction_threshold_percent | 80 | Context window usage percentage that triggers compaction. |
| max_subagent_depth | 1 | Maximum nesting depth for sub-agents. |
| wall_clock_timeout | None | Hard timeout for process_input. Triggers AbortReason::WallClockTimeout. |
| tool_hooks | None | Pre/post hooks around tool execution (see Tool hooks). |
| mcp_servers | [] | MCP server configurations to connect on startup. |
| skill_dirs | None | Directories to discover SKILL.md files. None uses convention defaults. |
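Since all fields are public, a config can be built with struct update syntax. Field names come from the table; the numeric types shown here are assumptions:

```rust
use fabro_agent::SessionConfig;

let config = SessionConfig {
    max_tool_rounds_per_input: 50,    // tighter cap than the default 200
    compaction_threshold_percent: 70, // compact earlier than the default 80%
    enable_loop_detection: true,
    ..Default::default()
};
```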

Sandbox

The Sandbox trait abstracts where tools execute — local filesystem, Docker container, SSH remote, or a cloud sandbox. All tool operations go through this interface.
#[async_trait]
pub trait Sandbox: Send + Sync {
    async fn read_file(&self, path: &str, offset: Option<usize>, limit: Option<usize>) -> Result<String, String>;
    async fn write_file(&self, path: &str, content: &str) -> Result<(), String>;
    async fn delete_file(&self, path: &str) -> Result<(), String>;
    async fn file_exists(&self, path: &str) -> Result<bool, String>;
    async fn list_directory(&self, path: &str, depth: Option<usize>) -> Result<Vec<DirEntry>, String>;
    async fn exec_command(
        &self,
        command: &str,
        timeout_ms: u64,
        working_dir: Option<&str>,
        env_vars: Option<&HashMap<String, String>>,
        cancel_token: Option<CancellationToken>,
    ) -> Result<ExecResult, String>;
    async fn grep(&self, pattern: &str, path: &str, options: &GrepOptions) -> Result<Vec<String>, String>;
    async fn glob(&self, pattern: &str, path: Option<&str>) -> Result<Vec<String>, String>;
    async fn initialize(&self) -> Result<(), String>;
    async fn cleanup(&self) -> Result<(), String>;
    fn working_directory(&self) -> &str;
    fn platform(&self) -> &str;
    fn os_version(&self) -> String;
    // ... optional methods with defaults: is_remote(), refresh_push_credentials(), etc.
}
Built-in implementations:
| Type | Description |
| --- | --- |
| LocalSandbox | Executes directly on the local filesystem. |
| DockerSandbox | Runs inside a Docker container (feature-gated: docker). |
| ReadBeforeWriteSandbox | Decorator that blocks writes to files the agent hasn't read. Wraps any Arc<dyn Sandbox>. |
External crates provide additional implementations: ExeSandbox (SSH-based) in fabro-exe, and SpritesSandbox in fabro-sprites.
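Wrapping a local sandbox in the read-before-write decorator might look like this (the exact constructor name is an assumption; only the "wraps any Arc&lt;dyn Sandbox&gt;" shape is documented above):

```rust
use fabro_agent::{LocalSandbox, ReadBeforeWriteSandbox, Sandbox};
use std::path::PathBuf;
use std::sync::Arc;

// Decorator pattern: the outer sandbox rejects writes to files
// the agent has not read through it first.
let inner: Arc<dyn Sandbox> = Arc::new(LocalSandbox::new(PathBuf::from(".")));
let sandbox = Arc::new(ReadBeforeWriteSandbox::new(inner));
```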

Provider profiles

The ProviderProfile trait encapsulates LLM-specific system prompts, tool definitions, and capability metadata. It controls how the agent presents itself to the model.
pub trait ProviderProfile: Send + Sync {
    fn provider(&self) -> Provider;
    fn model(&self) -> &str;
    fn tool_registry(&self) -> &ToolRegistry;
    fn tool_registry_mut(&mut self) -> &mut ToolRegistry;
    fn build_system_prompt(&self, env: &dyn Sandbox, ...) -> String;
    fn capabilities(&self) -> ProfileCapabilities;
    fn tools(&self) -> Vec<ToolDefinition>;
    // ...
}
Built-in profiles: AnthropicProfile, OpenAiProfile, GeminiProfile.

Events

All operations emit AgentEvent values through a tokio broadcast channel. Subscribe before calling process_input().
let mut rx = session.subscribe();
tokio::spawn(async move {
    while let Ok(event) = rx.recv().await {
        match event.event {
            AgentEvent::TextDelta { delta } => print!("{delta}"),
            AgentEvent::ToolCallStarted { tool_name, .. } => {
                println!("[calling {tool_name}]");
            }
            AgentEvent::ToolCallCompleted { tool_name, is_error, .. } => {
                println!("[{tool_name} done, error={is_error}]");
            }
            AgentEvent::LoopDetected => println!("[loop detected]"),
            AgentEvent::CompactionCompleted { .. } => println!("[context compacted]"),
            _ => {}
        }
    }
});
Key AgentEvent variants:
| Variant | Description |
| --- | --- |
| SessionStarted / SessionEnded | Session lifecycle. |
| TextDelta { delta } | Incremental text from the model. |
| ReasoningDelta { delta } | Incremental reasoning/thinking text. |
| AssistantMessage { text, model, usage, tool_call_count } | Complete assistant turn with token usage. |
| ToolCallStarted { tool_name, tool_call_id, arguments } | A tool call is about to execute. |
| ToolCallCompleted { tool_name, tool_call_id, output, is_error } | A tool call finished. |
| Error { error } | An AgentError occurred. |
| LoopDetected | The agent is repeating itself. |
| TurnLimitReached { max_turns } | Turn limit hit. |
| CompactionStarted / CompactionCompleted | Context window compaction. |
| SubAgentSpawned / SubAgentCompleted | Sub-agent lifecycle. |
| McpServerReady / McpServerFailed | MCP server connection status. |

Tool hooks

Implement ToolHookCallback to intercept tool calls for approval, logging, or transformation:
use fabro_agent::{ToolHookCallback, ToolHookDecision};
use async_trait::async_trait;

struct MyHooks;

#[async_trait]
impl ToolHookCallback for MyHooks {
    async fn pre_tool_use(
        &self,
        tool_name: &str,
        tool_input: &serde_json::Value,
    ) -> ToolHookDecision {
        if tool_name == "shell" {
            println!("Agent wants to run: {}", tool_input["command"]);
        }
        ToolHookDecision::Proceed // or Block { reason }
    }

    async fn post_tool_use(&self, tool_name: &str, _call_id: &str, _output: &str) {
        println!("{tool_name} completed");
    }

    async fn post_tool_use_failure(&self, tool_name: &str, _call_id: &str, error: &str) {
        eprintln!("{tool_name} failed: {error}");
    }
}
Pass hooks via SessionConfig:
let config = SessionConfig {
    tool_hooks: Some(Arc::new(MyHooks)),
    ..Default::default()
};
For simple sync approval, use ToolApprovalAdapter to wrap a closure:
use fabro_agent::ToolApprovalAdapter;
use std::sync::Arc;

let config = SessionConfig {
    tool_hooks: Some(Arc::new(ToolApprovalAdapter(Arc::new(|tool_name, _args| {
        if tool_name == "shell" {
            Err("shell is not allowed".into())
        } else {
            Ok(())
        }
    })))),
    ..Default::default()
};

Error handling

All fallible Session methods return Result<T, AgentError>:
| Variant | Description |
| --- | --- |
| Llm(SdkError) | An error from the LLM provider (wraps fabro_llm::error::SdkError). |
| SessionClosed | process_input was called on a closed session. |
| InvalidState(String) | The session is in an unexpected state. |
| ToolExecution(String) | A tool execution failed. |
| Aborted(AbortReason) | The session was cancelled (Cancelled) or timed out (WallClockTimeout). |
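Matching on these variants around process_input might look like this sketch (variant names from the table above):

```rust
use fabro_agent::{AbortReason, AgentError};

match session.process_input("Refactor the parser").await {
    Ok(_) => println!("done"),
    Err(AgentError::Aborted(AbortReason::WallClockTimeout)) => {
        eprintln!("hit the wall-clock timeout");
    }
    Err(AgentError::Aborted(_)) => eprintln!("cancelled"),
    Err(AgentError::SessionClosed) => eprintln!("session already closed"),
    Err(AgentError::Llm(e)) => eprintln!("provider error: {e}"),
    Err(e) => eprintln!("agent error: {e}"),
}
```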

LLM client (fabro-llm)

The fabro-llm crate is a standalone Rust library for calling LLM providers. It provides a unified client that routes requests to Anthropic, OpenAI, Gemini, and other providers, with built-in streaming, tool execution, retries, and middleware. You can use it independently of Fabro’s workflow engine — add it as a dependency in any Rust project.
Cargo.toml
[dependencies]
fabro-llm = { git = "https://github.com/fabro-sh/fabro" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Quick start

The simplest path is Client::from_env(), which auto-registers providers based on environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, etc.):
use fabro_llm::client::Client;
use fabro_llm::generate::{generate, GenerateParams};
use fabro_llm::set_default_client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::from_env().await?;
    set_default_client(client);

    let result = generate(
        GenerateParams::new("claude-sonnet-4-5")
            .prompt("Explain ownership in Rust in two sentences.")
    ).await?;

    println!("{}", result.text());
    println!("Tokens used: {}", result.total_usage.total_tokens);
    Ok(())
}

Client

Client is the core type that holds provider adapters and middleware. It routes each request to the appropriate provider.

Creating from environment

let client = Client::from_env().await?;
This checks for API key environment variables and registers adapters for each provider found:
| Environment variable | Provider |
| --- | --- |
| ANTHROPIC_API_KEY | Anthropic |
| OPENAI_API_KEY | OpenAI |
| GEMINI_API_KEY or GOOGLE_API_KEY | Gemini |
| KIMI_API_KEY | Kimi |
| ZAI_API_KEY | ZAI |
| MINIMAX_API_KEY | Minimax |
| INCEPTION_API_KEY | Inception |
The first provider registered becomes the default. Optional base URL overrides (e.g. ANTHROPIC_BASE_URL) are also read.
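For example, registering Anthropic behind a gateway before running your binary (the gateway URL is a placeholder):

```shell
export ANTHROPIC_API_KEY=sk-ant-...
# Optional base URL override, e.g. for a proxy or internal gateway:
export ANTHROPIC_BASE_URL=https://llm-gateway.internal.example.com
cargo run
```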

Creating manually

use fabro_llm::client::Client;
use fabro_llm::providers::AnthropicAdapter;
use std::collections::HashMap;
use std::sync::Arc;

let adapter = AnthropicAdapter::new("sk-ant-...")
    .with_base_url("https://custom-proxy.example.com");

let mut providers = HashMap::new();
providers.insert("anthropic".to_string(), Arc::new(adapter) as _);

let client = Client::new(providers, Some("anthropic".to_string()), vec![]);

Low-level calls

For direct control without the tool loop, use complete() and stream() on the client:
use fabro_llm::types::{Request, Message};

let request = Request {
    model: "claude-sonnet-4-5".into(),
    messages: vec![Message::user("Hello")],
    ..Default::default()
};

let response = client.complete(&request).await?;
println!("{}", response.text());

High-level generation

The generate() function wraps the client with automatic tool execution loops, retries, and timeouts. It is the recommended entry point for most use cases.

Basic completion

use fabro_llm::generate::{generate, GenerateParams};

let result = generate(
    GenerateParams::new("claude-sonnet-4-5")
        .system("You are a helpful assistant.")
        .prompt("What is the capital of France?")
        .temperature(0.0)
).await?;

println!("{}", result.text());

Multi-turn conversations

Use .messages() instead of .prompt() to pass a full conversation history:
use fabro_llm::types::Message;

let result = generate(
    GenerateParams::new("claude-sonnet-4-5")
        .messages(vec![
            Message::user("My name is Alice."),
            Message::assistant("Hello Alice! How can I help you?"),
            Message::user("What's my name?"),
        ])
).await?;
You cannot use both .prompt() and .messages() on the same request — this returns SdkError::Configuration.

GenerateParams reference

| Method | Type | Description |
| --- | --- | --- |
| new(model) | impl Into<String> | Required. Model ID or alias (e.g. "opus", "claude-sonnet-4-5"). |
| .prompt(text) | impl Into<String> | Convenience: sends a single user message. |
| .messages(msgs) | Vec<Message> | Full conversation history. |
| .system(text) | impl Into<String> | System prompt. |
| .tools(tools) | Vec<Tool> | Tools available to the model. |
| .tool_choice(choice) | ToolChoice | How the model selects tools. |
| .max_tool_rounds(n) | u32 | Max tool execution rounds (default: 1). |
| .temperature(t) | f64 | Sampling temperature. |
| .top_p(p) | f64 | Nucleus sampling. |
| .max_tokens(n) | i64 | Maximum output tokens. |
| .stop_sequences(seqs) | Vec<String> | Stop sequences. |
| .reasoning_effort(level) | impl Into<String> | e.g. "low", "medium", "high". |
| .provider(name) | impl Into<String> | Force a specific provider. |
| .max_retries(n) | u32 | Retry count for transient errors (default: 2). |
| .timeout(config) | TimeoutConfig | Total and per-step timeouts. |
| .client(client) | Arc<Client> | Override the default client. |
| .abort_signal(token) | CancellationToken | Cancel generation. |
| .stop_when(f) | Fn(&[StepResult]) -> bool | Custom stop condition after each tool round. |
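Several of these compose naturally; for instance, stop_when can cut a tool loop short by inspecting the accumulated steps (a sketch built from the signatures in the table above):

```rust
use fabro_llm::generate::{generate, GenerateParams};

let result = generate(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("Investigate and fix the failing test")
        .max_tool_rounds(20)
        .max_retries(3)
        // Stop early once three tool rounds have run,
        // regardless of whether the model wants to continue.
        .stop_when(|steps| steps.len() >= 3)
).await?;
```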

GenerateResult

GenerateResult dereferences to Response, so you can call response methods directly:
let result = generate(params).await?;

// Response methods (via Deref)
result.text();            // concatenated text output
result.tool_calls();      // Vec<ToolCall> from the final response
result.reasoning();       // Option<String> — extended thinking content

// GenerateResult fields
result.response;          // Response — the final LLM response
result.tool_results;      // Vec<ToolResult> — from the final step
result.total_usage;       // Usage — aggregated across all steps
result.steps;             // Vec<StepResult> — one per tool round
result.output;            // Option<Value> — for structured output

Tools

Tools let the model call functions during generation. There are two kinds:
  • Active tools have an execute handler — Fabro runs them automatically and feeds results back to the model.
  • Passive tools have no handler — Fabro returns the tool calls to you in the response.

Defining an active tool

use fabro_llm::tools::Tool;
use serde_json::json;

let weather = Tool::active(
    "get_weather",
    "Get the current weather for a city",
    json!({
        "type": "object",
        "properties": {
            "city": { "type": "string", "description": "City name" }
        },
        "required": ["city"]
    }),
    |args, _ctx| async move {
        let city = args["city"].as_str().unwrap_or("unknown");
        Ok(json!({ "temperature": "72°F", "city": city }))
    },
);

Using tools with generate

let result = generate(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("What's the weather in San Francisco?")
        .tools(vec![weather])
        .max_tool_rounds(5)
).await?;

// Inspect the tool execution history
for (i, step) in result.steps.iter().enumerate() {
    let calls = step.response.tool_calls();
    println!("Step {i}: {} tool calls, {} results", calls.len(), step.tool_results.len());
}
The generate() function loops automatically: the model calls tools, Fabro executes them, feeds results back, and repeats until the model stops or max_tool_rounds is reached.

Tool choice

Control how the model selects tools:
use fabro_llm::types::ToolChoice;

// Let the model decide (default)
GenerateParams::new("opus").tool_choice(ToolChoice::Auto);

// Force a specific tool
GenerateParams::new("opus").tool_choice(ToolChoice::Named {
    tool_name: "get_weather".into()
});

// Force the model to use some tool
GenerateParams::new("opus").tool_choice(ToolChoice::Required);

// Prevent tool use
GenerateParams::new("opus").tool_choice(ToolChoice::None);

Passive tools

Passive tools let you handle execution yourself:
let search = Tool::passive(
    "search",
    "Search the codebase",
    json!({
        "type": "object",
        "properties": {
            "query": { "type": "string" }
        },
        "required": ["query"]
    }),
);

let result = generate(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("Find all uses of the Config struct")
        .tools(vec![search])
).await?;

// Handle tool calls yourself
for call in result.tool_calls() {
    println!("Model wants to call {} with {}", call.name, call.arguments);
}

Streaming

Text stream

For simple cases where you only need the text deltas:
use fabro_llm::generate::{stream, GenerateParams};
use futures::StreamExt;

let stream_result = stream(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("Write a haiku about Rust")
).await?;

let mut text_stream = stream_result.text_stream();
while let Some(chunk) = text_stream.next().await {
    print!("{}", chunk?);
}

Full event stream

For fine-grained control, consume StreamEvent variants directly:
use fabro_llm::generate::{stream, GenerateParams};
use fabro_llm::types::StreamEvent;
use futures::StreamExt;

let mut stream_result = stream(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("Explain monads")
).await?;

while let Some(event) = stream_result.next().await {
    match event? {
        StreamEvent::TextDelta { delta, .. } => print!("{delta}"),
        StreamEvent::ReasoningDelta { delta } => eprint!("[thinking] {delta}"),
        StreamEvent::ToolCallStart { tool_call } => {
            println!("\n> Calling tool: {}", tool_call.name);
        }
        StreamEvent::StepFinish { usage, .. } => {
            println!("\n[step done, {} tokens]", usage.total_tokens);
        }
        StreamEvent::Finish { response, .. } => {
            println!("\n[done: {:?}]", response.finish_reason);
        }
        _ => {}
    }
}

StreamEvent variants

| Variant | Description |
| --- | --- |
| StreamStart | Stream opened. |
| TextStart { text_id } | Text block started. |
| TextDelta { delta, text_id } | Incremental text chunk. |
| TextEnd { text_id } | Text block ended. |
| ReasoningStart | Extended thinking started. |
| ReasoningDelta { delta } | Incremental reasoning chunk. |
| ReasoningEnd | Extended thinking ended. |
| ToolCallStart { tool_call } | Tool call started. |
| ToolCallDelta { tool_call } | Incremental tool call arguments. |
| ToolCallEnd { tool_call } | Tool call complete. |
| StepFinish { finish_reason, usage, response, tool_calls, tool_results } | A tool round completed (more rounds may follow). |
| Finish { finish_reason, usage, response } | Generation complete. |
| Error { error, raw } | Provider error. |

Structured output

Generate typed JSON objects that conform to a JSON Schema:
use fabro_llm::generate::{generate_object, GenerateParams};
use serde_json::json;

let schema = json!({
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer" },
        "hobbies": {
            "type": "array",
            "items": { "type": "string" }
        }
    },
    "required": ["name", "age", "hobbies"]
});

let result = generate_object(
    GenerateParams::new("claude-sonnet-4-5")
        .prompt("Generate a profile for a fictional character"),
    schema,
).await?;

let profile = result.output.expect("structured output");
println!("Name: {}", profile["name"]);

Middleware

Middleware intercepts requests and responses for logging, caching, or transformation:
use fabro_llm::middleware::{Middleware, NextFn, NextStreamFn};
use fabro_llm::provider::StreamEventStream;
use fabro_llm::types::{Request, Response};
use fabro_llm::error::SdkError;
use async_trait::async_trait;

struct LoggingMiddleware;

#[async_trait]
impl Middleware for LoggingMiddleware {
    async fn handle_complete(
        &self,
        request: Request,
        next: NextFn,
    ) -> Result<Response, SdkError> {
        println!("Request to model: {}", request.model);
        let response = next(request).await?;
        println!("Response: {} tokens", response.usage.total_tokens);
        Ok(response)
    }

    async fn handle_stream(
        &self,
        request: Request,
        next: NextStreamFn,
    ) -> Result<StreamEventStream, SdkError> {
        println!("Streaming request to model: {}", request.model);
        next(request).await
    }
}
Add middleware to the client:
let mut client = Client::from_env().await?;
client.add_middleware(Arc::new(LoggingMiddleware));

Model catalog

The crate embeds a catalog of known models with metadata:
use fabro_llm::catalog;

// Look up a model by ID or alias
let info = catalog::get_model_info("opus").unwrap();
println!("{} ({})", info.display_name, info.provider);
println!("Context: {} tokens", info.limits.context_window);
println!("Tools: {}, Vision: {}", info.features.tools, info.features.vision);

// List all models for a provider
let models = catalog::list_models(Some("anthropic"));

// Get the default model for a provider
let default = catalog::default_model_for_provider("openai").unwrap();

// Find a capability-matched model on a different provider
let equivalent = catalog::closest_model("gemini", &info);
See Models for the full catalog table.

Error handling

All fallible operations return Result<T, SdkError>. The error type classifies failures to enable retry and failover decisions:
use fabro_llm::error::SdkError;

match result {
    Err(SdkError::Provider { kind, detail }) => {
        println!("Provider error ({}): {}", detail.provider, detail.message);
        if let Some(code) = detail.status_code {
            println!("HTTP {code}");
        }
    }
    Err(SdkError::RequestTimeout { message }) => println!("Timeout: {message}"),
    Err(SdkError::Network { message }) => println!("Network: {message}"),
    Err(SdkError::Abort { message }) => println!("Cancelled: {message}"),
    Err(e) => println!("Other: {e}"),
    Ok(_) => {}
}

Error classification

Every SdkError exposes classification methods:
| Method | Returns | Description |
| --- | --- | --- |
| retryable() | bool | Safe to retry with the same provider (e.g. rate limit, server error). |
| failover_eligible() | bool | Safe to try a different provider. |
| retry_after() | Option<f64> | Seconds to wait before retrying (from the provider's Retry-After header). |
| status_code() | Option<u16> | HTTP status code, if applicable. |
| provider_name() | &str | Which provider returned the error. |
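These methods support hand-rolled retry and failover logic around the low-level client. A sketch (generate() already does this for you via max_retries):

```rust
use fabro_llm::error::SdkError;
use std::time::Duration;

match client.complete(&request).await {
    Ok(response) => println!("{}", response.text()),
    Err(e) if e.retryable() => {
        // Honor the provider's Retry-After hint, falling back to 1 second.
        let wait = e.retry_after().unwrap_or(1.0);
        tokio::time::sleep(Duration::from_secs_f64(wait)).await;
        // ...retry against the same provider
    }
    Err(e) if e.failover_eligible() => {
        // ...re-issue the request against a different provider
    }
    Err(e) => return Err(e.into()),
}
```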

Provider error kinds

| Kind | HTTP status | Retryable | Failover |
| --- | --- | --- | --- |
| Authentication | 401 | No | No |
| AccessDenied | 403 | No | No |
| NotFound | 404 | No | No |
| InvalidRequest | 400 | No | No |
| RateLimit | 429 | Yes | Yes |
| Server | 500, 502, 503 | Yes | Yes |
| ContentFilter | varies | No | No |
| ContextLength | varies | No | No |
| QuotaExceeded | varies | No | Yes |
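The HTTP-status rows of this table amount to a small decision function. A standalone illustration of that mapping (not the library's actual code):

```rust
/// Returns (retryable, failover_eligible) for the status-mapped kinds above.
fn classify(status: u16) -> (bool, bool) {
    match status {
        429 | 500 | 502 | 503 => (true, true),   // RateLimit, Server
        400 | 401 | 403 | 404 => (false, false), // client-side errors: don't retry
        _ => (false, false),                     // unknown: be conservative
    }
}

fn main() {
    assert_eq!(classify(429), (true, true));
    assert_eq!(classify(503), (true, true));
    assert_eq!(classify(401), (false, false));
    println!("ok");
}
```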

Retries

The generate() function retries automatically based on max_retries (default: 2). For low-level use, the retry function wraps any async operation:
use fabro_llm::retry::retry;
use fabro_llm::types::RetryPolicy;

let policy = RetryPolicy {
    max_retries: 3,
    base_delay: 1.0,
    max_delay: 60.0,
    backoff_multiplier: 2.0,
    jitter: true,
    on_retry: None,
};

let response = retry(&policy, || {
    let c = client.clone();
    let r = request.clone();
    async move { c.complete(&r).await }
}).await?;
retry() only fires when error.retryable() returns true, and it respects Retry-After headers.
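Assuming the conventional exponential-backoff reading of these fields, the delay for attempt n (jitter aside) is base_delay * backoff_multiplier^n, capped at max_delay. A standalone illustration of that schedule:

```rust
/// Delay in seconds before the nth retry attempt (0-based), jitter omitted.
/// This mirrors the RetryPolicy fields above but is not the library's code.
fn backoff_delay(attempt: u32, base: f64, multiplier: f64, max: f64) -> f64 {
    (base * multiplier.powi(attempt as i32)).min(max)
}

fn main() {
    // With the policy above: base 1.0s, multiplier 2.0, cap 60.0s.
    assert_eq!(backoff_delay(0, 1.0, 2.0, 60.0), 1.0);
    assert_eq!(backoff_delay(2, 1.0, 2.0, 60.0), 4.0);
    assert_eq!(backoff_delay(10, 1.0, 2.0, 60.0), 60.0); // capped at max_delay
    println!("ok");
}
```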

Cancellation

Pass a CancellationToken to abort long-running generation:
use tokio_util::sync::CancellationToken;

let token = CancellationToken::new();
let token_clone = token.clone();

// Cancel after 30 seconds
tokio::spawn(async move {
    tokio::time::sleep(std::time::Duration::from_secs(30)).await;
    token_clone.cancel();
});

let result = generate(
    GenerateParams::new("opus")
        .prompt("Write a novel")
        .abort_signal(token)
).await;
// Returns SdkError::Abort if cancelled

Provider adapters

Each provider has a dedicated adapter. All adapters implement the ProviderAdapter trait and are interchangeable.
| Adapter | Provider | Constructor |
| --- | --- | --- |
| AnthropicAdapter | Anthropic Messages API | ::new(api_key) |
| OpenAiAdapter | OpenAI Responses API | ::new(api_key) |
| GeminiAdapter | Google Gemini API | ::new(api_key) |
| OpenAiCompatibleAdapter | Any OpenAI-compatible endpoint | ::new(api_key, base_url) |
All adapters support .with_base_url() for proxies or custom endpoints. OpenAiAdapter also supports .with_org_id() and .with_project_id().
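For example, pointing OpenAiCompatibleAdapter at a local OpenAI-compatible server, following the manual-creation pattern shown earlier (the URL and provider name here are placeholders):

```rust
use fabro_llm::client::Client;
use fabro_llm::providers::OpenAiCompatibleAdapter;
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical local endpoint; any OpenAI-compatible server works.
let adapter = OpenAiCompatibleAdapter::new("unused-key", "http://localhost:8000/v1");

let mut providers = HashMap::new();
providers.insert("local".to_string(), Arc::new(adapter) as _);

// "local" becomes the default provider for this client.
let client = Client::new(providers, Some("local".to_string()), vec![]);
```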

Custom provider

Implement the ProviderAdapter trait to add a new provider:
use fabro_llm::provider::{ProviderAdapter, StreamEventStream};
use fabro_llm::types::{Request, Response};
use fabro_llm::error::SdkError;
use async_trait::async_trait;

struct MyProvider;

#[async_trait]
impl ProviderAdapter for MyProvider {
    fn name(&self) -> &str { "my-provider" }

    async fn complete(&self, request: &Request) -> Result<Response, SdkError> {
        // Call your provider's API
        todo!()
    }

    async fn stream(&self, request: &Request) -> Result<StreamEventStream, SdkError> {
        // Return a stream of events
        todo!()
    }
}
Register it on the client:
client.register_provider(Arc::new(MyProvider)).await?;