🚅 LiteLLM
Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]
LiteLLM Proxy Server (LLM Gateway) | Hosted Proxy (Preview) | Enterprise Tier
LiteLLM manages:
- Translate inputs to the provider's `completion`, `embedding`, and `image_generation` endpoints
- Consistent output: text responses will always be available at `['choices'][0]['message']['content']`
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router (see the sketch after this list)
- Set budgets & rate limits per project, API key, and model - LiteLLM Proxy Server (LLM Gateway)
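To illustrate the retry/fallback routing above, here is a minimal sketch using `litellm.Router`; the deployment names, keys, and endpoints below are placeholders, not real credentials.

```python
from litellm import Router

# Two deployments serving the same public model name; the Router
# load-balances across them and retries/falls back on failure.
model_list = [
    {
        "model_name": "gpt-4o",  # alias callers use
        "litellm_params": {
            "model": "azure/my-gpt4o-deployment",  # hypothetical Azure deployment
            "api_key": "your-azure-key",
            "api_base": "https://my-endpoint.openai.azure.com/",
        },
    },
    {
        "model_name": "gpt-4o",
        "litellm_params": {"model": "openai/gpt-4o", "api_key": "your-openai-key"},
    },
]

router = Router(model_list=model_list, num_retries=2)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hey, how's it going?"}],
)
print(response["choices"][0]["message"]["content"])
```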
Jump to LiteLLM Proxy (LLM Gateway) Docs
Jump to Supported LLM Providers
🚨 Stable Release: Use docker images with the `-stable` tag. These have undergone 12-hour load tests before being published. More information about the release cycle here
Support for more providers. Missing a provider or LLM platform? Raise a feature request.
Usage (Docs)
Important
LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide here.
LiteLLM v1.40.14+ now requires `pydantic>=2.0.0`. No changes required.
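A minimal sketch of a basic `completion` call; the API keys below are placeholders:

```python
from litellm import completion
import os

# Placeholder keys - set the env vars for the providers you actually use
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# OpenAI call
response = completion(model="openai/gpt-4o", messages=messages)

# Anthropic call - same interface, different model string
response = completion(model="anthropic/claude-3-sonnet-20240229", messages=messages)
print(response)
```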
Response (OpenAI Format)
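A representative response shape (IDs, timestamps, and token counts below are illustrative):

```json
{
    "id": "chatcmpl-565d891b-a42e-4c39-8d14-82a1f5208885",
    "created": 1734366691,
    "model": "claude-3-sonnet-20240229",
    "object": "chat.completion",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! I'm doing well, thank you. How can I help you today?",
                "role": "assistant"
            }
        }
    ],
    "usage": {
        "completion_tokens": 26,
        "prompt_tokens": 13,
        "total_tokens": 39
    }
}
```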
Call any model supported by a provider, with `model=<provider_name>/<model_name>`. There may be provider-specific details, so refer to the provider docs for more information.
Async (Docs)
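A minimal async sketch using `acompletion` (assumes the same placeholder keys as above):

```python
from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    # Same call shape as completion(), but awaitable
    response = await acompletion(model="openai/gpt-4o", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)
```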
Streaming (Docs)
liteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)
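A minimal streaming sketch (placeholder key assumed):

```python
from litellm import completion

response = completion(
    model="openai/gpt-4o",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)
for part in response:
    # Each chunk mirrors the OpenAI streaming format; content can be None
    print(part.choices[0].delta.content or "", end="")
```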
Response chunk (OpenAI Format)
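An illustrative chunk (IDs and timestamps are examples):

```json
{
    "object": "chat.completion.chunk",
    "id": "chatcmpl-2be06597-de94-4313-b976-74c52dfe3c35",
    "created": 1684017726,
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "finish_reason": null,
            "delta": {"content": "Hello"}
        }
    ]
}
```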
Logging Observability (Docs)
LiteLLM exposes pre-defined callbacks to send data to Lunary, MLflow, Langfuse, DynamoDB, S3 buckets, Helicone, Promptlayer, Traceloop, Athina, and Slack.
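A minimal sketch wiring up success callbacks; the integration names come from the list above, and the keys are placeholders:

```python
import os
import litellm
from litellm import completion

# Placeholder credentials for the logging integrations you enable
os.environ["LANGFUSE_PUBLIC_KEY"] = "your-langfuse-public-key"
os.environ["LANGFUSE_SECRET_KEY"] = "your-langfuse-secret-key"
os.environ["HELICONE_API_KEY"] = "your-helicone-key"
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# Log successful calls to these integrations
litellm.success_callback = ["langfuse", "helicone"]

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hi 👋"}],
)
```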
LiteLLM Proxy Server (LLM Gateway) - (Docs)
Track spend + Load Balance across multiple projects
The proxy provides:
1. Hooks for auth
2. Hooks for logging
3. Cost tracking
4. Rate limiting
📖 Proxy Endpoints - Swagger Docs
Quick Start Proxy - CLI
Step 1: Start litellm proxy
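For example (the Hugging Face model here is just one choice):

```shell
pip install 'litellm[proxy]'

litellm --model huggingface/bigcode/starcoder
# INFO: Proxy running on http://0.0.0.0:4000
```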
Step 2: Make ChatCompletions Request to Proxy
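For example, using the OpenAI Python SDK pointed at the proxy; the `api_key` can be any placeholder string if the proxy has no key auth configured:

```python
import openai  # openai v1.0.0+

# Point the OpenAI client at the local LiteLLM proxy
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response)
```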
Important: Use LiteLLM Proxy with Langchain (Python, JS), OpenAI SDK (Python, JS), Anthropic SDK, Mistral SDK, LlamaIndex, Instructor, and Curl.
Proxy Key Management (Docs)
Connect the proxy with a Postgres DB to create proxy keys
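One way to do this is via the repo's docker-compose setup; a minimal sketch (the master key value is a placeholder you should change):

```shell
# Get the code
git clone https://github.com/BerriAI/litellm
cd litellm

# Set a master key (placeholder) used to authorize admin/key-management requests
echo 'LITELLM_MASTER_KEY="sk-1234"' > .env
source .env

# Start the proxy and its Postgres DB
docker-compose up
```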
UI on `/ui` on your proxy server
Set budgets and rate limits across multiple projects
Request
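For example, generating a scoped key via the proxy's `/key/generate` endpoint (master key, models, and metadata below are placeholders):

```shell
curl 'http://0.0.0.0:4000/key/generate' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data-raw '{"models": ["gpt-3.5-turbo", "gpt-4", "claude-2"], "duration": "20m", "metadata": {"user": "ishaan@berri.ai", "team": "core-infra"}}'
```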
Expected Response
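An illustrative response (key and expiry values are examples):

```json
{
    "key": "sk-kdEXbIqZRwEeEiHwdg7sFA",
    "expires": "2023-11-19T01:38:25.838000+00:00"
}
```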
Supported Providers (Docs)
| Provider | Completion | Streaming | Async Completion | Async Streaming | Async Embedding | Async Image Generation |
|---|---|---|---|---|---|---|
| openai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Meta - Llama API | ✅ | ✅ | ✅ | ✅ | | |
| azure | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| AI/ML API | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| aws - sagemaker | ✅ | ✅ | ✅ | ✅ | ✅ | |
| aws - bedrock | ✅ | ✅ | ✅ | ✅ | ✅ | |
| google - vertex_ai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| google - palm | ✅ | ✅ | ✅ | ✅ | | |
| google AI Studio - gemini | ✅ | ✅ | ✅ | ✅ | | |
| mistral ai api | ✅ | ✅ | ✅ | ✅ | ✅ | |
| cloudflare AI Workers | ✅ | ✅ | ✅ | ✅ | | |
| cohere | ✅ | ✅ | ✅ | ✅ | ✅ | |
| anthropic | ✅ | ✅ | ✅ | ✅ | | |
| empower | ✅ | ✅ | ✅ | ✅ | | |
| huggingface | ✅ | ✅ | ✅ | ✅ | ✅ | |
| replicate | ✅ | ✅ | ✅ | ✅ | | |
| together_ai | ✅ | ✅ | ✅ | ✅ | | |
| openrouter | ✅ | ✅ | ✅ | ✅ | | |
| ai21 | ✅ | ✅ | ✅ | ✅ | | |
| baseten | ✅ | ✅ | ✅ | ✅ | | |
| vllm | ✅ | ✅ | ✅ | ✅ | | |
| nlp_cloud | ✅ | ✅ | ✅ | ✅ | | |
| aleph alpha | ✅ | ✅ | ✅ | ✅ | | |
| petals | ✅ | ✅ | ✅ | ✅ | | |
| ollama | ✅ | ✅ | ✅ | ✅ | ✅ | |
| deepinfra | ✅ | ✅ | ✅ | ✅ | | |
| perplexity-ai | ✅ | ✅ | ✅ | ✅ | | |
| Groq AI | ✅ | ✅ | ✅ | ✅ | | |
| Deepseek | ✅ | ✅ | ✅ | ✅ | | |
| anyscale | ✅ | ✅ | ✅ | ✅ | | |
| IBM - watsonx.ai | ✅ | ✅ | ✅ | ✅ | ✅ | |
| voyage ai | | | | | ✅ | |
| xinference [Xorbits Inference] | | | | | ✅ | |
| FriendliAI | ✅ | ✅ | ✅ | ✅ | | |
| Galadriel | ✅ | ✅ | ✅ | ✅ | | |
| Novita AI | ✅ | ✅ | ✅ | ✅ | | |
| Featherless AI | ✅ | ✅ | ✅ | ✅ | | |
| Nebius AI Studio | ✅ | ✅ | ✅ | ✅ | ✅ | |
Contributing
Interested in contributing? Contributions to the LiteLLM Python SDK, Proxy Server, and LLM integrations are all accepted and highly encouraged!
Quick start: `git clone` → `make install-dev` → `make format` → `make lint` → `make test-unit`
See our comprehensive Contributing Guide (CONTRIBUTING.md) for detailed instructions.
Enterprise
For companies that need better security, user management, and professional support
This covers:
- ✅ Features under the LiteLLM Commercial License:
- ✅ Feature Prioritization
- ✅ Custom Integrations
- ✅ Professional Support - Dedicated discord + slack
- ✅ Custom SLAs
- ✅ Secure access with Single Sign-On
Contributing
We welcome contributions to LiteLLM! Whether you're fixing bugs, adding features, or improving documentation, we appreciate your help.
Quick Start for Contributors
For detailed contributing guidelines, see CONTRIBUTING.md.
Code Quality / Linting
LiteLLM follows the Google Python Style Guide.
Our automated checks include:
- Black for code formatting
- Ruff for linting and code quality
- MyPy for type checking
- Circular import detection
- Import safety checks
Run all checks locally:
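Assuming the repo's Makefile targets, for example:

```shell
make lint          # Run all linting (Black, Ruff, MyPy, circular imports, import safety)
make format-check  # Check formatting without modifying files
```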
All these checks must pass before your PR can be merged.
Support / talk with founders
- Schedule Demo 👋
- Community Discord 💭
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
Why did we build this
- Need for simplicity: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.
Contributors
Run in Developer mode
Services
- Set up a `.env` file in the repo root
- Run dependent services: `docker-compose up db prometheus`
Backend
- (In root) create a virtual environment: `python -m venv .venv`
- Activate the virtual environment: `source .venv/bin/activate`
- Install dependencies: `pip install -e ".[all]"`
- Start the proxy backend: `uvicorn litellm.proxy.proxy_server:app --host localhost --port 4000 --reload`
Frontend
- Navigate to `ui/litellm-dashboard`
- Install dependencies: `npm install`
- Run `npm run dev` to start the dashboard