The Tandemn System control plane receives job requests from users, runs placement planning, launches replicas through SkyPilot, coordinates chunks through Redis, and writes output to S3. Deploy it in your AWS environment on a host that CLI users and EC2 replicas can reach.

Clone the repository

git clone --recurse-submodules https://github.com/Tandemn-Labs/Tandemn-server.git
cd Tandemn-server

Run setup

setup.sh installs Python dependencies, verifies AWS and Redis connectivity, and creates your .env file.
bash setup.sh
Review .env before starting the control plane. At minimum, confirm:
  • S3_UPLOAD_BUCKET
  • HF_TOKEN, if you use gated models
  • TD_API_KEY, if the control plane is reachable outside a private network
  • KOI_SERVICE_URL, only if you are enabling optional Koi integration
Do not commit .env files with secrets or deployment-specific credentials.
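A minimal `.env` sketch covering the keys above. All values here are placeholders, and only the variables named in this section are shown; your generated `.env` may contain additional entries.

```
# S3 bucket that receives job output (required)
S3_UPLOAD_BUCKET=my-tandemn-output-bucket

# Hugging Face token, only needed for gated models
HF_TOKEN=hf_xxxxxxxxxxxxxxxx

# API key, recommended when the control plane is reachable outside a private network
TD_API_KEY=change-me

# Optional Koi integration; leave commented out to disable
# KOI_SERVICE_URL=https://koi.internal.example.com
```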

Start Redis

docker run -d -p 6379:6379 redis
Redis is required for multi-replica chunked jobs.
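The command above runs an ephemeral Redis: chunk coordination state is lost if the container is removed. If you want state to survive restarts, one option (assuming the stock `redis` image) is to add a named volume and enable append-only persistence:

```shell
# Persistent Redis: named container, auto-restart, data volume, AOF enabled
docker run -d \
  --name tandemn-redis \
  --restart unless-stopped \
  -p 6379:6379 \
  -v tandemn-redis-data:/data \
  redis redis-server --appendonly yes
```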

Download the performance database

The LLM placement advisor uses a performance database that is not included in the repository.
curl -L https://github.com/Tandemn-Labs/LLM_placement_solver/releases/download/aiconfigurator-v1/data.csv \
  -o LLM_placement_solver/llm_advisor/data/aiconfigurator/data.csv
Without this dataset, tandemn plan falls back to the roofline solver.
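A quick way to confirm the download landed where the advisor expects it. The path below mirrors the curl target above; `have_performance_db` is a hypothetical helper for illustration, not part of the repository.

```python
from pathlib import Path

# Path the curl command above writes to (relative to the repo root)
DATA_CSV = Path("LLM_placement_solver/llm_advisor/data/aiconfigurator/data.csv")

def have_performance_db(path: Path = DATA_CSV) -> bool:
    """Return True if the performance database exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0
```

Run this from the repository root before planning; if it returns False, placement falls back to the roofline solver.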

Start the control plane

Set the public URL that EC2 replicas will use to call back to the control plane.
python server.py --url https://your-public-url.example.com
For local development, you can use a tunnel or VPN endpoint. For production, run the control plane on a small EC2 instance in the same VPC as your inference replicas.
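For a long-running production deployment, one way to keep the control plane alive across reboots is a systemd unit. This is a sketch under assumptions: the user, paths, Python binary, and URL are placeholders for your environment.

```
[Unit]
Description=Tandemn control plane
After=network-online.target
Wants=network-online.target

[Service]
User=ubuntu
WorkingDirectory=/opt/Tandemn-server
EnvironmentFile=/opt/Tandemn-server/.env
ExecStart=/usr/bin/python3 server.py --url https://your-public-url.example.com
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Save as /etc/systemd/system/tandemn.service and start it with `systemctl enable --now tandemn.service`.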

Next steps

Install the CLI

Set up the user-facing command line tool.

Configure the server

Understand what belongs in .env.