feat(provider): add LiteLLM as embedded AI gateway provider #782

Open · RheagalFire wants to merge 2 commits into evalstate:main
cc @evalstate
Closes #127.
## Summary
Adds LiteLLM as a first-class provider alongside the existing Anthropic / OpenAI / Google / Azure / Bedrock / TensorZero options. It runs in embedded SDK mode (no proxy required): every call goes through `litellm.acompletion(model=...)`, which routes to 100+ underlying providers (Anthropic, OpenAI, AWS Bedrock, Vertex AI, Cohere, Mistral, Groq, Perplexity, Together, Fireworks, Cerebras, Databricks, IBM Watsonx, AI21, Replicate, DeepInfra, NVIDIA NIM, xAI, Sambanova, …) using each backing's standard auth conventions.

The interactive model picker gets a new `LiteLLM [available] (20 curated)` row at the top of the Providers column. Selecting it shows 20 curated entries spanning the major backings; pressing `c` swaps to the "all" scope, which discovers ~2.3k models from `litellm.models_by_provider` at runtime. The ✓/✗ marker on each row reflects whether that backing's credentials are present in the user's environment, so users see at a glance which models will actually chat and which will fail at runtime with a missing-key error.

Three credential modes work today (precedence: env → config → proxy); each is covered as a numbered path under Installation & usage below.
## Quickstart (60 seconds)
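A minimal end-to-end sketch assembled from the three paths described under Installation & usage; the `fast-agent-mcp` package name is an assumption, so install fast-agent however you normally do:

```bash
# Install with the optional extra added by this PR (package name assumed)
pip install "fast-agent-mcp[litellm]"

# Path 1 auth: the LiteLLM SDK reads each backing's standard env var
export ANTHROPIC_API_KEY=sk-ant-...

# Pick LiteLLM -> Anthropic Claude Sonnet in the wizard, or pass the model directly
fast-agent go --model=litellm.anthropic/claude-sonnet-4-6
```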
## Why
A LiteLLM provider lets users:

- switch backings by editing only the model string (`litellm.openai/gpt-4o` → `litellm.anthropic/claude-sonnet-4-6` → `litellm.bedrock/...`) without touching any other config, as in the sketch below.

The picker UX (curated + dynamic discovery + per-row availability markers) keeps fast-agent's "model selection is painless" story intact: users don't have to memorize 2k+ model names.
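For instance, swapping backings is a one-flag change (model specs taken from the examples above):

```bash
fast-agent go --model=litellm.openai/gpt-4o
fast-agent go --model=litellm.anthropic/claude-sonnet-4-6
```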
## Prior art

Prior discussion covered a `request_params` deep-merge for LiteLLM + LangFuse downstream users; this PR builds on the same primary use case.

## Installation & usage
### Install

fast-agent gains a `litellm` optional extra (`litellm>=1.60,<1.85`, see `pyproject.toml` below); the quickstart above shows an install command.
### Path 1 — env vars (simplest, zero config)
The LiteLLM SDK reads each backing's standard env var (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `COHERE_API_KEY`, `MISTRAL_API_KEY`, …). Export whichever backings you want to use (`export ANTHROPIC_API_KEY=sk-ant-...`), then run `fast-agent go`.

In the picker, navigate to LiteLLM, focus the Models column, pick Anthropic Claude Sonnet → `litellm.anthropic/claude-sonnet-4-6` (✓ marker), press Enter, and chat.

### Path 2 — config file (no shell exports)
In `fastagent.config.yaml`:
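A sketch of the relevant sections; the field names are assumed to match whatever fast-agent's native `<provider>:` sections already accept:

```yaml
# Bridged into ANTHROPIC_API_KEY / OPENAI_API_KEY (+ base URL) at startup
anthropic:
  api_key: sk-ant-...
openai:
  api_key: sk-...
  base_url: https://api.openai.com/v1   # optional; bridged to OPENAI_BASE_URL
```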
This reads from the same `<provider>:` sections fast-agent already uses for native providers. `LiteLLMLLM.__init__` bridges them into env vars at startup so the LiteLLM SDK's per-backing auth resolution picks them up. Env vars always win if both are set.

Bridged backings: `anthropic` → `ANTHROPIC_API_KEY`/`ANTHROPIC_BASE_URL`, `openai` → `OPENAI_API_KEY`/`OPENAI_BASE_URL`, `google` → `GEMINI_API_KEY`, `xai` → `XAI_API_KEY`, `groq` → `GROQ_API_KEY`, `deepseek` → `DEEPSEEK_API_KEY`, `openrouter` → `OPENROUTER_API_KEY`. Other backings (Cohere, Mistral, Perplexity, Bedrock, Vertex, …) still use env vars or the proxy. A sketch of the bridging logic follows.
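Illustratively, the bridging amounts to something like the following; the helper and table names here are hypothetical, the real logic lives in `_bridge_fastagent_config_to_litellm_env`:

```python
import os

# Hypothetical mapping table: (config section, field) -> env var the LiteLLM SDK reads.
_CONFIG_TO_ENV = {
    ("anthropic", "api_key"): "ANTHROPIC_API_KEY",
    ("anthropic", "base_url"): "ANTHROPIC_BASE_URL",
    ("openai", "api_key"): "OPENAI_API_KEY",
    ("openai", "base_url"): "OPENAI_BASE_URL",
    ("google", "api_key"): "GEMINI_API_KEY",
    ("xai", "api_key"): "XAI_API_KEY",
    ("groq", "api_key"): "GROQ_API_KEY",
    ("deepseek", "api_key"): "DEEPSEEK_API_KEY",
    ("openrouter", "api_key"): "OPENROUTER_API_KEY",
}


def bridge_config_to_env(config: dict) -> None:
    """Copy configured credentials into the process env for the LiteLLM SDK."""
    for (section, field), env_key in _CONFIG_TO_ENV.items():
        value = (config.get(section) or {}).get(field)
        # Env vars always win if both are set, so never overwrite an existing one.
        if value and not os.environ.get(env_key):
            os.environ[env_key] = str(value)
```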
### Path 3 — LiteLLM proxy server (centralized auth)

Run a LiteLLM proxy with model deployments, for example:
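A minimal `litellm_proxy.yaml` sketch in the LiteLLM proxy's standard `model_list` format (the deployment name is arbitrary), started with `litellm --config litellm_proxy.yaml --port 4000` as in the E2E section below:

```yaml
model_list:
  - model_name: sonnet                       # deployment name, referenced as litellm/sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY  # LiteLLM's env-var indirection syntax
```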
Point fast-agent at it:
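In `fastagent.config.yaml`, using the `api_base` value from the E2E run below; the `api_key` line is only needed if the proxy enforces keys, and can also come from `LITELLM_API_KEY`:

```yaml
litellm:
  api_base: http://localhost:4000
  api_key: sk-litellm-...   # optional in proxy mode
```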
In proxy mode the model spec is `litellm/<deployment-name>` (the name from the proxy's `model_list`), not the upstream `litellm.<backing>/<model>` shape.

### Optional `litellm:` config knobs
The `litellm:` section accepts optional knobs (`api_base`, `api_key`, `timeout`, `drop_params`, `extra_headers`), all forwarded to `litellm.acompletion`; a sketch follows.
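Knob names come from the unit-test coverage list below; the exact YAML nesting and example values are assumptions:

```yaml
litellm:
  api_base: http://localhost:4000   # SDK or proxy endpoint override
  api_key: sk-...                   # only needed for proxy mode
  timeout: 60                       # request timeout passed through to litellm
  drop_params: true                 # drop params the chosen backing doesn't support
  extra_headers:
    x-example-team: research        # hypothetical header
```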
## Architecture

`LiteLLMLLM` extends the existing `OpenAILLM` and only swaps the underlying client. LiteLLM normalizes every backing's response into OpenAI's `ChatCompletion` shape, so the existing OpenAI streaming, tool-call accumulation, structured-output, reasoning-effort, and cache-token handling all work unchanged.

The shim is intentionally narrow: it implements only `__aenter__`/`__aexit__` and `chat.completions.create`. Adding the full AsyncOpenAI surface (files, embeddings, audio, etc.) is out of scope; calls to those raise `NotImplementedError` so the failure mode is loud rather than silent.
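A simplified sketch of that shape (illustrative, not the PR's exact code):

```python
import litellm


class _ChatCompletions:
    async def create(self, **kwargs):
        # LiteLLM returns OpenAI-shaped ChatCompletion objects (or stream chunks),
        # so OpenAILLM's existing handling works unchanged.
        return await litellm.acompletion(**kwargs)


class _Chat:
    completions = _ChatCompletions()


class _LiteLLMClientShim:
    """Duck-typed stand-in for AsyncOpenAI covering only what OpenAILLM calls."""

    chat = _Chat()

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False

    def __getattr__(self, name):
        # The rest of the AsyncOpenAI surface (files, embeddings, audio, ...)
        # fails loudly rather than silently.
        raise NotImplementedError(f"LiteLLM shim does not implement {name!r}")
```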
For wizard rendering, per-row credential markers use a static map (`_LITELLM_BACKING_ENV_KEYS`) instead of `litellm.validate_environment(...)`, because the latter triggers OAuth device-code flows for some backings (e.g. `github_copilot`) and would block the picker render for 60+ seconds.
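Roughly, restricted here to the bridged backings listed earlier (the real map covers more):

```python
import os

# Backing name -> env var whose presence flips the picker marker to ✓.
_LITELLM_BACKING_ENV_KEYS: dict[str, str] = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GEMINI_API_KEY",
    "xai": "XAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def backing_creds_present(backing: str) -> bool:
    """Pure env lookup: cheap, synchronous, and never triggers an OAuth flow."""
    env_key = _LITELLM_BACKING_ENV_KEYS.get(backing)
    return bool(env_key and os.environ.get(env_key))
```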
## Files

- `src/fast_agent/llm/provider/litellm/__init__.py` (1) — package marker.
- `src/fast_agent/llm/provider/litellm/llm_litellm.py` (193) — `LiteLLMLLM(OpenAILLM)`, `_LiteLLMClientShim`, `_bridge_fastagent_config_to_litellm_env`.
- `src/fast_agent/llm/provider_types.py` (+1) — `LITELLM = ("litellm", "LiteLLM")`.
- `src/fast_agent/llm/model_factory.py` (+5) — provider-class dispatch.
- `src/fast_agent/llm/provider_key_manager.py` (+13) — keyless registration; `LITELLM_API_KEY` is read for proxy mode but isn't required.
- `src/fast_agent/llm/provider_model_catalog.py` (+47) — `LiteLLMModelCatalogAdapter` returns the full LiteLLM catalog (~2.3k specs).
- `src/fast_agent/llm/model_selection.py` (+118) — 20 curated entries spanning 16 backings.
- `src/fast_agent/ui/model_picker.py` (+10) — per-row ✓/✗ marker honors the `model.backing_available` override.
- `src/fast_agent/ui/model_picker_common.py` (+135) — `Provider.LITELLM` first in `PICKER_PROVIDER_ORDER`, `_provider_is_active`, `litellm_backing_creds_present` static map, `ModelOption.backing_available` plumbing into the curated + dynamic discovery paths.
- `src/fast_agent/llm/provider/openai/llm_openai.py` (+1) — adds `Provider.LITELLM` to the providers using `_process_stream_manual` (LiteLLM chunks omit fields like `delta.refusal`).
- `src/fast_agent/config.py` (+48) — `LiteLLMSettings` schema.
- `pyproject.toml` (+5) — new `litellm = ["litellm>=1.60,<1.85"]` optional extra; also added to `all-providers`.
- `README.md` (+1 phrase) — the providers paragraph mentions the new optional extra.
- `tests/unit/fast_agent/llm/test_litellm_provider.py` (new, 280) — 24 tests.
- `tests/unit/fast_agent/llm/test_model_factory.py` (+5) — skip LiteLLM in `test_curated_catalog_aliases_are_parseable` (same pattern as `anthropic-vertex`; LiteLLM aliases intentionally mirror native short names).

## Tests
### Unit tests (24 / 24 pass)
Coverage:

- `Provider.LITELLM` in `PICKER_PROVIDER_ORDER`
- `_provider_is_active` true when litellm is importable, false when not
- `api_base`/`api_key`/`timeout`/`drop_params`/`extra_headers` forwarded to `litellm.acompletion` (see the sketch below)
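A hypothetical shape for one of the forwarding tests, written against the shim sketch from the Architecture section rather than the PR's real test helpers:

```python
import asyncio
from unittest.mock import AsyncMock, patch


def test_knobs_forwarded_to_acompletion():
    with patch("litellm.acompletion", new=AsyncMock(return_value={})) as acompletion:
        shim = _LiteLLMClientShim()  # from the Architecture sketch above
        asyncio.run(
            shim.chat.completions.create(
                model="anthropic/claude-sonnet-4-6",
                messages=[{"role": "user", "content": "hi"}],
                api_base="http://localhost:4000",
                timeout=60,
                drop_params=True,
            )
        )
        # The knobs should reach litellm.acompletion untouched.
        assert acompletion.call_args.kwargs["api_base"] == "http://localhost:4000"
```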
### Regression (1499 / 1499 adjacent unit tests pass)

### Type checking (`ty`) clean on touched files
One narrow `# ty: ignore[invalid-method-override]` on `LiteLLMLLM._openai_client` (intentional: the shim is a duck-typed substitute for `AsyncOpenAI`, and `OpenAILLM` only calls `chat.completions.create` on it).

### Lint clean
### Live E2E
- Path 1 — env vars + wizard pick
- Path 2 — config file, no env vars exported
- Path 3 — LiteLLM proxy: verified locally with `litellm --config litellm_proxy.yaml --port 4000` and `litellm.api_base: http://localhost:4000` in the fast-agent config. Round-trips identically.

## Troubleshooting
### `litellm.AuthenticationError: Missing <Provider> API Key`

The chosen model's backing provider doesn't have credentials in your environment. Look up the env var that backing needs (e.g. `OPENAI_API_KEY` for `litellm.openai/...`, `COHERE_API_KEY` for `litellm.cohere/...`), then export it (Path 1) or add it under the matching `<provider>:` section in `fastagent.config.yaml` (Path 2, for bridged backings only).

### `litellm.NotFoundError: ... DeploymentNotFound` / model not found

The backing provider authenticated successfully, but the specific model name isn't deployed at that endpoint. A common cause is a model that is only deployed in a particular region (e.g. `us-central1`). Workaround: press `c` in the picker to switch to the all-scope and find the deployed name, or pass `--model=litellm.<backing>/<exact-name>` directly.

### Wizard shows ✗ on a model whose key is set

Check the exact env var name (`OPENAI_KEY` instead of `OPENAI_API_KEY` is a common slip), and restart `fast-agent go` after exporting new keys.

## Out of scope / future work