🐳 Docker, Deploying LiteLLM Proxy

You can find the Dockerfile used to build the litellm proxy here.

Quick Start

Step 1. Create a file called litellm_config.yaml

Example litellm_config.yaml (the os.environ/ prefix means litellm will read AZURE_API_BASE from the env)

model_list:
  - model_name: azure-gpt-3.5
    litellm_params:
      model: azure/<your-azure-model-deployment>
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
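
If you want to sanity-check this config before building a container, you can run the proxy directly with the litellm CLI instead (a sketch, assuming pip is available; the env vars match those passed to Docker in Step 2):

# Install the proxy extras, export the env vars the config references, and start the proxy
pip install 'litellm[proxy]'
export AZURE_API_KEY=d6***********
export AZURE_API_BASE=https://openai-***********/
litellm --config litellm_config.yaml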

Step 2. Run litellm docker image

See the latest available ghcr docker image here: https://github.com/berriai/litellm/pkgs/container/litellm

Your LiteLLM config file should be named litellm_config.yaml and sit in the directory where you run this command. The -v flag mounts that file into the container.

Pass AZURE_API_KEY and AZURE_API_BASE, since we referenced them in Step 1.

docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e AZURE_API_KEY=d6*********** \
  -e AZURE_API_BASE=https://openai-***********/ \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml --detailed_debug

Step 3. Send a Test Request

Pass model=azure-gpt-3.5; this matches the model_name set in Step 1.

curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "azure-gpt-3.5",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
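
You can also list the models the proxy exposes via the OpenAI-compatible /v1/models endpoint (a quick sanity check; no auth header is needed here since this quick start sets no master key):

# List the models configured on the proxy
curl http://0.0.0.0:4000/v1/models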

That's it! That's the quick start for deploying LiteLLM.

Options to deploy LiteLLM

  • Quick Start: call 100+ LLMs + Load Balancing
  • Deploy with Database: + use Virtual Keys + Track Spend (Note: when deploying with a database, DATABASE_URL and LITELLM_MASTER_KEY are required in your env)
  • LiteLLM container + Redis: + load balance across multiple litellm containers
  • LiteLLM Database container + PostgresDB + Redis: + use Virtual Keys + Track Spend + load balance across multiple litellm containers

Deploy with Database

You can deploy with a database via Docker, Kubernetes, or the Helm chart.

Requirements:

  • You need a Postgres database (e.g. Supabase, Neon). Set DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> in your env
  • Set a LITELLM_MASTER_KEY; this is your Proxy Admin key and you can use it to create other keys (🚨 it must start with sk-)

We maintain a separate Dockerfile to reduce build time when running the LiteLLM proxy with a connected Postgres database.

docker pull ghcr.io/berriai/litellm-database:main-latest
docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e LITELLM_MASTER_KEY=sk-1234 \
  -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
  -e AZURE_API_KEY=d6*********** \
  -e AZURE_API_BASE=https://openai-***********/ \
  -p 4000:4000 \
  ghcr.io/berriai/litellm-database:main-latest \
  --config /app/config.yaml --detailed_debug

Your OpenAI proxy server is now running on http://0.0.0.0:4000.
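
With the database connected, you can now create virtual keys. Here's a sketch using the /key/generate endpoint and the master key from the command above (the models and duration fields are optional):

# Create a virtual key scoped to azure-gpt-3.5, valid for 30 days
curl --location 'http://0.0.0.0:4000/key/generate' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data '{"models": ["azure-gpt-3.5"], "duration": "30d"}'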

LiteLLM container + Redis

Use Redis when you need litellm to load balance across multiple litellm containers

The only change required is setting Redis in your config.yaml. LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass redis_host, redis_password, and redis_port to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6 # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992

Start the docker container with your config (mount it into the container so --config can find it):

docker run \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml
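
To actually load balance, you would start several containers pointing at the same config, so they share rpm/tpm state through Redis. A sketch (the container names and host ports are illustrative):

# Instance 1 on host port 4000
docker run -d --name litellm-1 \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml

# Instance 2 on host port 4001, same config and Redis
docker run -d --name litellm-2 \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4001:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml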

LiteLLM Database container + PostgresDB + Redis

The only change required is setting Redis in your config.yaml. LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass redis_host, redis_password, and redis_port to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6 # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992

Start the litellm-database docker container with your config (mount it into the container, and pass the LITELLM_MASTER_KEY required for database deployments):

docker run --name litellm-proxy \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -e LITELLM_MASTER_KEY=sk-1234 \
  -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
  -p 4000:4000 \
  ghcr.io/berriai/litellm-database:main-latest --config /app/config.yaml
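
Once it's running, you can hit the health endpoints to confirm the proxy and its model deployments are reachable (a sketch; exact endpoint behavior may vary by LiteLLM version, and /health uses the master key):

# Readiness probe (no auth) and full model health check (master key)
curl http://0.0.0.0:4000/health/readiness
curl http://0.0.0.0:4000/health -H "Authorization: Bearer sk-1234"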

Advanced Deployment Settings

Customization of the server root path

Info: In a Kubernetes deployment, you can use a shared DNS to host multiple applications by modifying the virtual service.

Customizing the root path eliminates the need for multiple DNS configurations during deployment.

👉 Set SERVER_ROOT_PATH in your .env and this will be set as your server root path
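
For example, a minimal sketch that serves the proxy under an assumed /api/v1 root path by passing SERVER_ROOT_PATH as a container env var:

docker run \
  -e SERVER_ROOT_PATH="/api/v1" \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml

# Requests are then served under the root path, e.g.
# http://0.0.0.0:4000/api/v1/chat/completions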

Setting SSL Certification

Use this if you need to set SSL certificates for your on-prem litellm proxy.

Pass ssl_keyfile_path (Path to the SSL keyfile) and ssl_certfile_path (Path to the SSL certfile) when starting litellm proxy

docker run ghcr.io/berriai/litellm:main-latest \
  --ssl_keyfile_path ssl_test/keyfile.key \
  --ssl_certfile_path ssl_test/certfile.crt
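
If you just need a certificate for local testing, you can generate a self-signed key/cert pair with openssl (an illustration only; use certificates issued by your CA in production):

# Generate a self-signed cert valid for 365 days under ssl_test/
mkdir -p ssl_test
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout ssl_test/keyfile.key \
  -out ssl_test/certfile.crt \
  -subj "/CN=localhost"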


Platform-specific Guide

  • AWS EKS - Kubernetes
  • AWS Cloud Formation Stack
  • Google Cloud Run
  • Render deploy
  • Railway

Kubernetes - Deploy on EKS

Step 1. Create an EKS cluster with the following spec

eksctl create cluster --name=litellm-cluster --region=us-west-2 --node-type=t2.small

Step 2. Mount the litellm proxy config on the Kubernetes cluster

This creates a ConfigMap from your local proxy_config.yaml file:

kubectl create configmap litellm-config --from-file=proxy_config.yaml
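
Optionally, confirm the ConfigMap was created and contains your config:

# Inspect the ConfigMap contents
kubectl get configmap litellm-config -o yaml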

Step 3. Apply kub.yaml and service.yaml. Clone the following kub.yaml and service.yaml files and apply them locally.

Apply kub.yaml

kubectl apply -f kub.yaml

Apply service.yaml - creates an AWS load balancer to expose the proxy

kubectl apply -f service.yaml

# service/litellm-service created

Step 4. Get Proxy Base URL

kubectl get services

# litellm-service LoadBalancer 10.100.6.31 a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com 4000:30374/TCP 63m

Proxy Base URL = a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com:4000
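
To verify, you can send the quick start test request to the load balancer. Substitute the hostname returned by kubectl get services and a model_name from your proxy_config.yaml (both placeholders below):

curl --location 'http://<your-proxy-base-url>:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<model_name-from-proxy_config.yaml>",
    "messages": [{"role": "user", "content": "what llm are you"}]
  }'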

That's it! Now you can start using LiteLLM Proxy.

Extras

Run with docker compose

Step 1

Here's an example docker-compose.yml file

version: "3.9"
services:
  litellm:
    build:
      context: .
      args:
        target: runtime
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000" # Map the container port to the host; change the host port if necessary
    volumes:
      - ./litellm-config.yaml:/app/config.yaml # Mount the local configuration file
    # You can change the port or number of workers as per your requirements, or pass any supported CLI argument. Make sure the port passed here matches the container port defined above in `ports`
    command: [ "--config", "/app/config.yaml", "--port", "4000", "--num_workers", "8" ]

# ...rest of your docker-compose config if any

Step 2

Create a litellm-config.yaml file with your LiteLLM config relative to your docker-compose.yml file.

Check the config doc here

Step 3

Run the command docker-compose up or docker compose up as per your docker installation.

Use the -d flag to run the container in detached mode (background), e.g. docker compose up -d

Your LiteLLM container should now be running on the defined port, e.g. 4000.
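
To confirm it started cleanly, you can follow the container logs (the service name litellm matches the docker-compose.yml above):

# Stream logs from the litellm service
docker compose logs -f litellm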
