The Tandemn System control plane receives job requests from users, runs placement planning, launches replicas through SkyPilot, coordinates chunks through Redis, and writes output to S3. Deploy it in your AWS environment on a host that CLI users and EC2 replicas can reach.

Clone the repository

git clone --recurse-submodules https://github.com/Tandemn-Labs/Tandemn-server.git
cd Tandemn-server

Run setup

setup.sh installs Python dependencies, verifies AWS and Redis connectivity, and creates your .env file.
bash setup.sh
Review .env before starting the control plane. At minimum, confirm:
  • S3_UPLOAD_BUCKET
  • HF_TOKEN, if you use gated models
  • TD_API_KEY, if the control plane is reachable outside a private network
  • KOI_SERVICE_URL, only if you are enabling optional Koi integration
Do not commit .env files with secrets or deployment-specific credentials.
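A minimal `.env` sketch covering the keys above. All values here are placeholders, and only the variables named in this section are shown; your generated `.env` may contain additional entries.

```
# S3 bucket that receives job output (required)
S3_UPLOAD_BUCKET=my-tandemn-output-bucket

# Hugging Face token, only needed for gated models
HF_TOKEN=hf_xxxxxxxxxxxxxxxx

# API key, recommended when the control plane is reachable outside a private network
TD_API_KEY=change-me

# Optional Koi integration; leave commented out to disable
# KOI_SERVICE_URL=https://koi.internal.example.com
```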

Start Redis

docker run -d -p 6379:6379 redis
Redis is required for multi-replica chunked jobs.
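The command above runs an ephemeral Redis: chunk coordination state is lost if the container is removed. If you want state to survive restarts, one option (assuming the stock `redis` image) is to add a named volume and enable append-only persistence:

```shell
# Persistent Redis: named container, auto-restart, data volume, AOF enabled
docker run -d \
  --name tandemn-redis \
  --restart unless-stopped \
  -p 6379:6379 \
  -v tandemn-redis-data:/data \
  redis redis-server --appendonly yes
```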

Download the performance database

The LLM placement advisor uses a performance database that is not included in the repository.
curl -L https://github.com/Tandemn-Labs/LLM_placement_solver/releases/download/aiconfigurator-v1/data.csv \
  -o LLM_placement_solver/llm_advisor/data/aiconfigurator/data.csv
Without this dataset, tandemn plan falls back to the roofline solver.
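A quick way to confirm the download landed where the advisor expects it. The path below mirrors the curl target above; `have_performance_db` is a hypothetical helper for illustration, not part of the repository.

```python
from pathlib import Path

# Path the curl command above writes to (relative to the repo root)
DATA_CSV = Path("LLM_placement_solver/llm_advisor/data/aiconfigurator/data.csv")

def have_performance_db(path: Path = DATA_CSV) -> bool:
    """Return True if the performance database exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0
```

Run this from the repository root before planning; if it returns False, placement falls back to the roofline solver.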

Start the control plane

Set the public URL that EC2 replicas will use to call back to the control plane.
python server.py --url https://your-public-url.example.com
For local development, you can use a tunnel or VPN endpoint. For production, run the control plane on a small EC2 instance in the same VPC as your inference replicas.
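For a long-running production deployment, one way to keep the control plane alive across reboots is a systemd unit. This is a sketch under assumptions: the user, paths, Python binary, and URL are placeholders for your environment.

```
[Unit]
Description=Tandemn control plane
After=network-online.target
Wants=network-online.target

[Service]
User=ubuntu
WorkingDirectory=/opt/Tandemn-server
EnvironmentFile=/opt/Tandemn-server/.env
ExecStart=/usr/bin/python3 server.py --url https://your-public-url.example.com
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Save as /etc/systemd/system/tandemn.service and start it with `systemctl enable --now tandemn.service`.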

Next steps

Install the CLI

Set up the user-facing command line tool.

Configure the server

Understand what belongs in .env.