🐳 Docker, Deploying LiteLLM Proxy

You can find the Dockerfile used to build the litellm proxy here.

Quick Start

Step 1. Create a file called litellm_config.yaml

Example litellm_config.yaml (the os.environ/ prefix means litellm will read AZURE_API_BASE from the env)

model_list:
  - model_name: azure-gpt-3.5
    litellm_params:
      model: azure/<your-azure-model-deployment>
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
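
If you want to sanity-check this config before building a container, you can run the proxy directly with the litellm CLI instead (a sketch, assuming pip is available; the env vars match those passed to Docker in Step 2):

# Install the proxy extras, export the env vars the config references, and start the proxy
pip install 'litellm[proxy]'
export AZURE_API_KEY=d6***********
export AZURE_API_BASE=https://openai-***********/
litellm --config litellm_config.yaml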

Step 2. Run litellm docker image

See the latest available ghcr docker image here: https://github.com/berriai/litellm/pkgs/container/litellm

Your LiteLLM config file should be named litellm_config.yaml and sit in the directory where you run this command. The -v flag mounts that file into the container.

Pass AZURE_API_KEY and AZURE_API_BASE, since we referenced them in Step 1.

docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e AZURE_API_KEY=d6*********** \
  -e AZURE_API_BASE=https://openai-***********/ \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml --detailed_debug

Step 3. Send a Test Request

Pass model=azure-gpt-3.5; this matches the model_name set in Step 1.

curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "azure-gpt-3.5",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
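
You can also list the models the proxy exposes via the OpenAI-compatible /v1/models endpoint (a quick sanity check; no auth header is needed here since this quick start sets no master key):

# List the models configured on the proxy
curl http://0.0.0.0:4000/v1/models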

That's it! That's the quick start for deploying LiteLLM.

Options to deploy LiteLLM

  • Quick Start: call 100+ LLMs + Load Balancing
  • Deploy with Database: + use Virtual Keys + Track Spend (Note: when deploying with a database, DATABASE_URL and LITELLM_MASTER_KEY are required in your env)
  • LiteLLM container + Redis: + load balance across multiple litellm containers
  • LiteLLM Database container + PostgresDB + Redis: + use Virtual Keys + Track Spend + load balance across multiple litellm containers

Deploy with Database

You can deploy with a database via Docker, Kubernetes, or the Helm chart.

Requirements:

  • You need a Postgres database (e.g. Supabase, Neon). Set DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> in your env
  • Set a LITELLM_MASTER_KEY; this is your Proxy Admin key and you can use it to create other keys (🚨 it must start with sk-)

We maintain a separate Dockerfile to reduce build time when running the LiteLLM proxy with a connected Postgres database.

docker pull ghcr.io/berriai/litellm-database:main-latest
docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e LITELLM_MASTER_KEY=sk-1234 \
  -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
  -e AZURE_API_KEY=d6*********** \
  -e AZURE_API_BASE=https://openai-***********/ \
  -p 4000:4000 \
  ghcr.io/berriai/litellm-database:main-latest \
  --config /app/config.yaml --detailed_debug

Your OpenAI proxy server is now running on http://0.0.0.0:4000.
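
With the database connected, you can now create virtual keys. Here's a sketch using the /key/generate endpoint and the master key from the command above (the models and duration fields are optional):

# Create a virtual key scoped to azure-gpt-3.5, valid for 30 days
curl --location 'http://0.0.0.0:4000/key/generate' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data '{"models": ["azure-gpt-3.5"], "duration": "30d"}'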

LiteLLM container + Redis

Use Redis when you need litellm to load balance across multiple litellm containers

The only change required is setting Redis in your config.yaml. LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass redis_host, redis_password, and redis_port to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6 # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992

Start the docker container with your config (mount it into the container so --config can find it):

docker run \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml
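
To actually load balance, you would start several containers pointing at the same config, so they share rpm/tpm state through Redis. A sketch (the container names and host ports are illustrative):

# Instance 1 on host port 4000
docker run -d --name litellm-1 \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml

# Instance 2 on host port 4001, same config and Redis
docker run -d --name litellm-2 \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -p 4001:4000 \
  ghcr.io/berriai/litellm:main-latest --config /app/config.yaml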

LiteLLM Database container + PostgresDB + Redis

The only change required is setting Redis in your config.yaml. LiteLLM Proxy supports sharing rpm/tpm limits across multiple litellm instances; pass redis_host, redis_password, and redis_port to enable this. (LiteLLM will use Redis to track rpm/tpm usage.)

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6 # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992

Start the litellm-database docker container with your config (mount it into the container, and pass the LITELLM_MASTER_KEY required for database deployments):

docker run --name litellm-proxy \
  -v $(pwd)/your_config.yaml:/app/config.yaml \
  -e LITELLM_MASTER_KEY=sk-1234 \
  -e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
  -p 4000:4000 \
  ghcr.io/berriai/litellm-database:main-latest --config /app/config.yaml
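
Once it's running, you can hit the health endpoints to confirm the proxy and its model deployments are reachable (a sketch; exact endpoint behavior may vary by LiteLLM version, and /health uses the master key):

# Readiness probe (no auth) and full model health check (master key)
curl http://0.0.0.0:4000/health/readiness
curl http://0.0.0.0:4000/health -H "Authorization: Bearer sk-1234"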

Advanced Deployment Settings

Customization of the server root path

Info: In a Kubernetes deployment, you can use a shared DNS to host multiple applications by modifying the virtual service.

Customizing the root path eliminates the need for multiple DNS configurations during deployment.

👉 Set SERVER_ROOT_PATH in your .env and this will be set as your server root path
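
For example, a minimal sketch that serves the proxy under an assumed /api/v1 root path by passing SERVER_ROOT_PATH as a container env var:

docker run \
  -e SERVER_ROOT_PATH="/api/v1" \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml

# Requests are then served under the root path, e.g.
# http://0.0.0.0:4000/api/v1/chat/completions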

Setting SSL Certification

Use this if you need to set SSL certificates for your on-prem litellm proxy.

Pass ssl_keyfile_path (Path to the SSL keyfile) and ssl_certfile_path (Path to the SSL certfile) when starting litellm proxy

docker run ghcr.io/berriai/litellm:main-latest \
  --ssl_keyfile_path ssl_test/keyfile.key \
  --ssl_certfile_path ssl_test/certfile.crt
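
If you just need a certificate for local testing, you can generate a self-signed key/cert pair with openssl (an illustration only; use certificates issued by your CA in production):

# Generate a self-signed cert valid for 365 days under ssl_test/
mkdir -p ssl_test
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout ssl_test/keyfile.key \
  -out ssl_test/certfile.crt \
  -subj "/CN=localhost"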


Platform-specific Guide

  • AWS EKS - Kubernetes
  • AWS Cloud Formation Stack
  • Google Cloud Run
  • Render deploy
  • Railway

Kubernetes - Deploy on EKS

Step 1. Create an EKS cluster with the following spec

eksctl create cluster --name=litellm-cluster --region=us-west-2 --node-type=t2.small

Step 2. Mount the litellm proxy config on the Kubernetes cluster

This creates a ConfigMap from your local proxy_config.yaml file:

kubectl create configmap litellm-config --from-file=proxy_config.yaml
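
Optionally, confirm the ConfigMap was created and contains your config:

# Inspect the ConfigMap contents
kubectl get configmap litellm-config -o yaml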

Step 3. Apply kub.yaml and service.yaml. Clone the following kub.yaml and service.yaml files and apply them locally.

Apply kub.yaml

kubectl apply -f kub.yaml

Apply service.yaml - creates an AWS load balancer to expose the proxy

kubectl apply -f service.yaml

# service/litellm-service created

Step 4. Get Proxy Base URL

kubectl get services

# litellm-service LoadBalancer 10.100.6.31 a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com 4000:30374/TCP 63m

Proxy Base URL = a472dc7c273fd47fd******.us-west-2.elb.amazonaws.com:4000
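
To verify, you can send the quick start test request to the load balancer. Substitute the hostname returned by kubectl get services and a model_name from your proxy_config.yaml (both placeholders below):

curl --location 'http://<your-proxy-base-url>:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<model_name-from-proxy_config.yaml>",
    "messages": [{"role": "user", "content": "what llm are you"}]
  }'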

That's it! Now you can start using LiteLLM Proxy.

Extras

Run with docker compose

Step 1

Here's an example docker-compose.yml file

version: "3.9"
services:
  litellm:
    build:
      context: .
      args:
        target: runtime
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000" # Map the container port to the host; change the host port if necessary
    volumes:
      - ./litellm-config.yaml:/app/config.yaml # Mount the local configuration file
    # You can change the port or number of workers as per your requirements, or pass any supported CLI argument. Make sure the port passed here matches the container port defined above in `ports`
    command: [ "--config", "/app/config.yaml", "--port", "4000", "--num_workers", "8" ]

# ...rest of your docker-compose config if any

Step 2

Create a litellm-config.yaml file with your LiteLLM config relative to your docker-compose.yml file.

Check the config doc here

Step 3

Run the command docker-compose up or docker compose up as per your docker installation.

Use the -d flag to run the container in detached mode (background), e.g. docker compose up -d

Your LiteLLM container should now be running on the defined port, e.g. 4000.
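
To confirm it started cleanly, you can follow the container logs (the service name litellm matches the docker-compose.yml above):

# Stream logs from the litellm service
docker compose logs -f litellm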
