You can find the Dockerfile used to build the LiteLLM proxy here.
Quick Start
- Basic
- With CLI Args
- Use litellm as a base image
- Kubernetes
- Helm Chart
Step 1. Create a file called `litellm_config.yaml`
Example `litellm_config.yaml` (the `os.environ/` prefix means litellm will read `AZURE_API_BASE` from the env):
```yaml
model_list:
  - model_name: azure-gpt-3.5
    litellm_params:
      model: azure/<your-azure-model-deployment>
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
```
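The `os.environ/` convention can be pictured with a small sketch. This is illustrative only, not LiteLLM's actual implementation, and `resolve_secret` is a hypothetical helper name:

```python
import os

PREFIX = "os.environ/"

def resolve_secret(value: str) -> str:
    """Resolve an 'os.environ/VAR' reference to the value of VAR;
    plain strings pass through unchanged."""
    if isinstance(value, str) and value.startswith(PREFIX):
        var_name = value[len(PREFIX):]
        return os.environ[var_name]  # raises KeyError if the env var is unset
    return value

# Example: the api_base entry from the config above
os.environ["AZURE_API_BASE"] = "https://my-endpoint.openai.azure.com/"
print(resolve_secret("os.environ/AZURE_API_BASE"))
# → https://my-endpoint.openai.azure.com/
```

The point of the indirection is that secrets stay in the environment, so the config file itself can be committed without leaking keys.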
Step 2. Run the litellm docker image
See the latest available ghcr docker image here: https://github.com/berriai/litellm/pkgs/container/litellm
Your litellm config.yaml should be called `litellm_config.yaml` in the directory where you run this command. The `-v` flag mounts that file into the container.
Pass `AZURE_API_KEY` and `AZURE_API_BASE`, since we set them in Step 1.
```shell
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug
```
Step 3. Send a Test Request
Pass `model=azure-gpt-3.5`; this was set in Step 1.
```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Content-Type: application/json' \
    --data '{
        "model": "azure-gpt-3.5",
        "messages": [
            {
                "role": "user",
                "content": "what llm are you"
            }
        ]
    }'
```
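The same test request can be built from Python's standard library. This sketch assumes the proxy from Step 2 is running on localhost port 4000 (the actual network call is left commented out):

```python
import json
import urllib.request

# Same payload as the curl example; the model name must match a
# model_name entry from litellm_config.yaml.
payload = {
    "model": "azure-gpt-3.5",
    "messages": [{"role": "user", "content": "what llm are you"}],
}

request = urllib.request.Request(
    "http://0.0.0.0:4000/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the proxy is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the proxy exposes an OpenAI-compatible API, any OpenAI client SDK pointed at `http://0.0.0.0:4000` should work the same way.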
That's it! That's the quick start to deploy litellm.
Options to deploy LiteLLM

| Docs | When to Use |
|---|---|
| Quick Start | call 100+ LLMs + Load Balancing |
| Deploy with Database | + use Virtual Keys + Track Spend (Note: when deploying with a database, providing a `DATABASE_URL` and `LITELLM_MASTER_KEY` in your env is required) |
| LiteLLM container + Redis | + load balance across multiple litellm containers |
| LiteLLM Database container + PostgresDB + Redis | + use Virtual Keys + Track Spend + load balance across multiple litellm containers |
Deploy with Database
Docker, Kubernetes, Helm Chart
Requirements:
- A Postgres database (e.g. Supabase, Neon, etc). Set `DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>` in your env.
- A `LITELLM_MASTER_KEY`. This is your Proxy Admin key - you can use it to create other keys (🚨 it must start with `sk-`).
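One common pitfall with `DATABASE_URL` is a password containing characters that are unsafe in a URL. A small sketch of building the URL with percent-encoding (the credentials here are made up):

```python
from urllib.parse import quote

# Hypothetical credentials; replace with your own.
user = "litellm"
password = "p@ss:word/1"   # contains characters that are unsafe in a URL
host = "db.example.com"
port = 5432
dbname = "litellm"

# quote(..., safe="") percent-encodes every reserved character,
# so '@', ':' and '/' in the password cannot break URL parsing.
database_url = (
    f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
    f"@{host}:{port}/{dbname}"
)
print(database_url)
# → postgresql://litellm:p%40ss%3Aword%2F1@db.example.com:5432/litellm
```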
- Dockerfile
- Kubernetes
- Helm
- Helm OCI Registry (GHCR)
We maintain a separate Dockerfile to reduce build time when running the LiteLLM proxy with a connected Postgres database.
```shell
docker pull ghcr.io/berriai/litellm-database:main-latest
```
```shell
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e LITELLM_MASTER_KEY=sk-1234 \
    -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm-database:main-latest \
    --config /app/config.yaml --detailed_debug
```
Your OpenAI proxy server is now running on http://0.0.0.0:4000.
LiteLLM container + Redis
Use Redis when you need litellm to load balance across multiple litellm containers.
The only change required is setting Redis in your config.yaml.
LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass `redis_host`, `redis_password`, and `redis_port` to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6  # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992
```
Start docker container with config
```shell
docker run ghcr.io/berriai/litellm:main-latest --config your_config.yaml
```
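In rough outline, shared rpm tracking works by having every instance increment a per-deployment counter keyed by the current minute, and skipping a deployment once its counter reaches its `rpm` limit. A simplified sketch with a plain dict standing in for Redis (illustrative only, not LiteLLM's implementation):

```python
import time

# A dict stands in for Redis here; in production, every litellm
# instance would INCR the same key on the shared Redis server.
redis_stub = {}

def try_acquire(deployment: str, rpm_limit: int, now: float) -> bool:
    """Return True if this request fits within the deployment's rpm limit."""
    minute_bucket = int(now // 60)          # counters reset every minute
    key = f"rpm:{deployment}:{minute_bucket}"
    count = redis_stub.get(key, 0)
    if count >= rpm_limit:
        return False                        # over the limit: router tries another deployment
    redis_stub[key] = count + 1             # Redis equivalent: INCR key; EXPIRE key 60
    return True

now = time.time()
results = [try_acquire("azure/gpt-turbo-small-ca", 6, now) for _ in range(8)]
print(results)  # → first 6 True, remaining 2 False
```

Because the counter lives in Redis rather than in process memory, the limit holds across all litellm containers, not just within one.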
LiteLLM Database container + PostgresDB + Redis
The only change required is setting Redis in your config.yaml.
LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass `redis_host`, `redis_password`, and `redis_port` to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6  # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992
```
Start `litellm-database` docker container with config
```shell
docker run --name litellm-proxy \
    -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
    -p 4000:4000 \
    ghcr.io/berriai/litellm-database:main-latest --config your_config.yaml
```
Advanced Deployment Settings
Customization of the server root path
info
In a Kubernetes deployment, it's possible to use a shared DNS to host multiple applications by modifying the virtual service.
Customize the root path to eliminate the need for multiple DNS configurations during deployment.
👉 Set `SERVER_ROOT_PATH` in your .env and this will be set as your server root path.
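The effect of a root path is simply that every proxy route is served under the configured prefix, so multiple apps can share one hostname. A sketch of the resulting URLs (the `/litellm` value here is an example, not a default):

```python
ROOT_PATH = "/litellm"  # example value for SERVER_ROOT_PATH

def prefixed(route: str) -> str:
    """Join the server root path and a route without doubling slashes."""
    return ROOT_PATH.rstrip("/") + "/" + route.lstrip("/")

for route in ["/chat/completions", "/models", "/health"]:
    print(prefixed(route))
# With SERVER_ROOT_PATH=/litellm, clients would call e.g.
# http://<host>:4000/litellm/chat/completions
```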
Setting SSL Certification
Use this if you need to set SSL certificates for your on-prem litellm proxy.
Pass `ssl_keyfile_path` (path to the SSL keyfile) and `ssl_certfile_path` (path to the SSL certfile) when starting litellm proxy.
Provide an SSL certificate when starting the litellm proxy server:
```shell
docker run ghcr.io/berriai/litellm:main-latest \
    --ssl_keyfile_path ssl_test/keyfile.key \
    --ssl_certfile_path ssl_test/certfile.crt
```
Platform-specific Guide
- AWS EKS - Kubernetes
- AWS Cloud Formation Stack
- Google Cloud Run
- Render deploy
- Railway
Kubernetes - Deploy on EKS
Step 1. Create an EKS Cluster with the following spec
```shell
eksctl create cluster --name=litellm-cluster --region=us-west-2 --node-type=t2.small
```
Step 2. Mount litellm proxy config on the Kubernetes cluster
This will mount your local file called `proxy_config.yaml` on the Kubernetes cluster:
```shell
kubectl create configmap litellm-config --from-file=proxy_config.yaml
```
Step 3. Apply `kub.yaml` and `service.yaml`
Clone the following `kub.yaml` and `service.yaml` files and apply them locally.
- Use this `kub.yaml` file - litellm kub.yaml
- Use this `service.yaml` file - litellm service.yaml
Apply `kub.yaml`
```shell
kubectl apply -f kub.yaml
```
Apply `service.yaml` - this creates an AWS load balancer to expose the proxy
```shell
kubectl apply -f service.yaml

# service/litellm-service created
```
Step 4. Get Proxy Base URL
```shell
kubectl get services

# litellm-service LoadBalancer 10.100.6.31 a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com 4000:30374/TCP 63m
```
Proxy Base URL = `a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com:4000`
That's it, now you can start using LiteLLM Proxy.
Extras
Run with docker compose
Step 1
(Recommended) Use the example `docker-compose.yml` file given in the project root, e.g. https://github.com/BerriAI/litellm/blob/main/docker-compose.yml
Here's an example `docker-compose.yml` file:
```yaml
version: "3.9"
services:
  litellm:
    build:
      context: .
      args:
        target: runtime
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000" # Map the container port to the host; change the host port if necessary
    volumes:
      - ./litellm-config.yaml:/app/config.yaml # Mount the local configuration file
    # You can change the port or number of workers as per your requirements,
    # or pass any other supported CLI argument. Make sure the port passed here
    # matches the container port defined above in `ports`.
    command: [ "--config", "/app/config.yaml", "--port", "4000", "--num_workers", "8" ]

# ...rest of your docker-compose config, if any
```
Step 2
Create a `litellm-config.yaml` file with your LiteLLM config, relative to your `docker-compose.yml` file.
Check the config doc here
Step 3
Run the command `docker-compose up` or `docker compose up`, as per your docker installation.
Use the `-d` flag to run the container in detached mode (background), e.g. `docker compose up -d`
Your LiteLLM container should now be running on the defined port, e.g. 4000.