
🚅 LiteLLM


Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]

LiteLLM Proxy Server (LLM Gateway) | Hosted Proxy (Preview) | Enterprise Tier


LiteLLM manages:

  • Translating inputs to the provider's `completion`, `embedding`, and `image_generation` endpoints
  • Consistent output: text responses are always available at `['choices'][0]['message']['content']`
  • Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router
  • Budgets & rate limits per project, API key, and model - LiteLLM Proxy Server (LLM Gateway)
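The points above can be sketched as follows (model name is illustrative; assumes the relevant provider key, e.g. `OPENAI_API_KEY`, is set in the environment):

```python
from litellm import completion

messages = [{"role": "user", "content": "Hello, how are you?"}]

# One call shape for any provider; LiteLLM translates the inputs.
response = completion(model="openai/gpt-4o", messages=messages)

# Consistent output location, regardless of provider:
print(response["choices"][0]["message"]["content"])
```

Swapping in a different provider only changes the `model` string; the response access path stays the same.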

Jump to LiteLLM Proxy (LLM Gateway) Docs
Jump to Supported LLM Providers

🚨 Stable Release: Use Docker images with the `-stable` tag. These undergo 12-hour load tests before being published. More information about the release cycle here.

Support for more providers: missing a provider or LLM platform? Raise a feature request.

Usage (Docs)

Important

LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide here.
LiteLLM v1.40.14+ now requires `pydantic>=2.0.0`. No changes required.


Response (OpenAI Format)

Call any model supported by a provider, with `model=<provider_name>/<model_name>`. There may be provider-specific details, so refer to the provider docs for more information.
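For example (model identifiers are illustrative and may change; check the provider docs for current names, and set `ANTHROPIC_API_KEY` / `GROQ_API_KEY` in the environment):

```python
from litellm import completion

messages = [{"role": "user", "content": "Hi"}]

# The <provider_name>/ prefix routes the call to that provider's API.
completion(model="anthropic/claude-3-sonnet-20240229", messages=messages)
completion(model="groq/llama3-8b-8192", messages=messages)
```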

Async (Docs)
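The async variant is `acompletion`, which takes the same arguments; a minimal sketch (assumes `OPENAI_API_KEY` is set):

```python
import asyncio

from litellm import acompletion

async def main():
    response = await acompletion(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())
```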

Streaming (Docs)

LiteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)
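A minimal streaming sketch (model name illustrative; assumes `OPENAI_API_KEY` is set):

```python
from litellm import completion

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True,
)

# Each chunk follows the OpenAI delta format; content may be None on
# the final chunk, hence the `or ""`.
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```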

Response chunk (OpenAI Format)

Logging Observability (Docs)

LiteLLM exposes pre-defined callbacks to send data to Lunary, MLflow, Langfuse, DynamoDB, S3 buckets, Helicone, Promptlayer, Traceloop, Athina, and Slack.
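Callbacks are enabled by assigning to `litellm.success_callback` before making calls; a sketch (assumes the corresponding service credentials, e.g. `LANGFUSE_PUBLIC_KEY`, plus `OPENAI_API_KEY` are set in the environment):

```python
import litellm
from litellm import completion

# Log successful calls to these services.
litellm.success_callback = ["langfuse", "helicone"]

completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hi"}],
)
```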

LiteLLM Proxy Server (LLM Gateway) - (Docs)

Track spend + Load Balance across multiple projects

Hosted Proxy (Preview)

The proxy provides:

  1. Hooks for auth
  2. Hooks for logging
  3. Cost tracking
  4. Rate Limiting

📖 Proxy Endpoints - Swagger Docs

Quick Start Proxy - CLI

Step 1: Start litellm proxy
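A sketch of this step (the model name is illustrative; the corresponding provider key must be set in the environment):

```shell
pip install 'litellm[proxy]'

# Serve one model through the proxy (listens on port 4000 by default)
litellm --model openai/gpt-4o
```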

Step 2: Make ChatCompletions Request to Proxy
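Any OpenAI-compatible client can then be pointed at the proxy; a sketch assuming the proxy from Step 1 is running locally on port 4000:

```python
import openai

# api_key can be any string unless the proxy enforces keys.
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response)
```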

Important

💡 Use LiteLLM Proxy with Langchain (Python, JS), OpenAI SDK (Python, JS), Anthropic SDK, Mistral SDK, LlamaIndex, Instructor, and curl.

Proxy Key Management (Docs)

Connect the proxy with a Postgres DB to create proxy keys

UI on `/ui` on your proxy server

Set budgets and rate limits across multiple projects

POST /key/generate

Request

Expected Response
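A sketch of the request and response shape (assumes the proxy master key is `sk-1234`; all values illustrative):

```shell
curl 'http://0.0.0.0:4000/key/generate' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "duration": "20m", "metadata": {"user": "user@example.com"}}'

# Expected response (shape only):
# {
#   "key": "sk-<generated-key>",
#   "expires": "<expiry timestamp>"
# }
```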

Supported Providers (Docs)

Per-provider support for Completion, Streaming, Async Completion, Async Streaming, Async Embedding, and Async Image Generation is tracked in the docs. Providers:

  • openai
  • Meta - Llama API
  • azure
  • AI/ML API
  • aws - sagemaker
  • aws - bedrock
  • google - vertex_ai
  • google - palm
  • google AI Studio - gemini
  • mistral ai api
  • cloudflare AI Workers
  • cohere
  • anthropic
  • empower
  • huggingface
  • replicate
  • together_ai
  • openrouter
  • ai21
  • baseten
  • vllm
  • nlp_cloud
  • aleph alpha
  • petals
  • ollama
  • deepinfra
  • perplexity-ai
  • Groq AI
  • Deepseek
  • anyscale
  • IBM - watsonx.ai
  • voyage ai
  • xinference [Xorbits Inference]
  • FriendliAI
  • Galadriel
  • Novita AI
  • Featherless AI
  • Nebius AI Studio

Read the Docs

Contributing

Interested in contributing? Contributions to the LiteLLM Python SDK, Proxy Server, and LLM integrations are all accepted and highly encouraged!

Quick start:

```shell
git clone          # clone the repository
make install-dev   # install development dependencies
make format        # format code
make lint          # run linters
make test-unit     # run unit tests
```

See our comprehensive Contributing Guide (CONTRIBUTING.md) for detailed instructions.

Enterprise

For companies that need better security, user management, and professional support.

Talk to founders

This covers:

  • Features under the LiteLLM Commercial License
  • Feature Prioritization
  • Custom Integrations
  • Professional Support - Dedicated discord + slack
  • Custom SLAs
  • Secure access with Single Sign-On


Code Quality / Linting

LiteLLM follows the Google Python Style Guide.

Our automated checks include:

  • Black for code formatting
  • Ruff for linting and code quality
  • MyPy for type checking
  • Circular import detection
  • Import safety checks

Run all checks locally:
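A sketch using the Makefile targets from the contributor quick start above (target names assumed from that list):

```shell
make format      # code formatting (Black)
make lint        # linting and type checks (Ruff, MyPy)
make test-unit   # unit tests
```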

All these checks must pass before your PR can be merged.

Support / talk with founders

Why did we build this

  • Need for simplicity: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.

Contributors

Run in Developer mode

Services

  1. Set up a `.env` file in the root
  2. Run dependent services
    `docker-compose up db prometheus`

Backend

  1. (In root) create a virtual environment
    `python -m venv .venv`
  2. Activate the virtual environment
    `source .venv/bin/activate`
  3. Install dependencies
    `pip install -e ".[all]"`
  4. Start the proxy backend
    `uvicorn litellm.proxy.proxy_server:app --host localhost --port 4000 --reload`

Frontend

  1. Navigate to `ui/litellm-dashboard`
  2. Install dependencies
    `npm install`
  3. Run `npm run dev` to start the dashboard