Treat the Tandemn server as shared infrastructure. It receives job requests from users and coordinates their execution across accelerated resources.
Deployment checklist
Choose a server host
Pick a machine in your AWS environment that CLI users and EC2 inference replicas can reach.
Prepare AWS access
Configure AWS credentials, IAM permissions, S3 bucket access, and quota visibility.
Clone the server repository
Fetch the Tandemn server repository and submodules.
Run setup
Install dependencies, verify AWS and Redis connectivity, and create .env.
Start Redis
Run Redis for chunked multi-replica jobs.
Start the control plane
Run python server.py --url <public-or-private-url>.
Verify from a client
Run tandemn check from a separate machine, using the same server URL you passed to --url.
Commands
git clone --recurse-submodules https://github.com/Tandemn-Labs/Tandemn-server.git
cd Tandemn-server
bash setup.sh
docker run -d -p 6379:6379 redis
python server.py --url https://your-public-url.example.com
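Before starting the control plane, it can help to confirm that the Redis container started above is actually answering. The following is a minimal sketch, not part of the Tandemn server: it assumes an unauthenticated Redis on 127.0.0.1:6379 (matching the docker command above) and sends a plain inline PING over a TCP socket. The function names redis_ping and is_pong are hypothetical helpers introduced for illustration.

```python
import socket


def is_pong(reply: bytes) -> bool:
    """Return True if the reply is the simple string Redis sends for PING."""
    return reply.startswith(b"+PONG")


def redis_ping(host: str = "127.0.0.1", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send an inline PING to Redis and report whether it answered PONG.

    Assumption: Redis requires no password, as with the plain docker command.
    A password-protected Redis replies with an error, so this returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return is_pong(conn.recv(64))
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling redis_ping() on the server host should return True once the container is up. A False result usually means the container is not running, the port is not published, or Redis requires authentication.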
Manual installation
Use this path when you do not want to run setup.sh.
uv venv .venv --python 3.12 --seed
source .venv/bin/activate
uv pip install -r requirements.txt
sky check
Create a .env file in the project root:
S3_UPLOAD_BUCKET=your-s3-bucket
HF_TOKEN=hf_your_token_here
TD_API_KEY=your-secret-key
ANTHROPIC_API_KEY=sk-ant-...
KOI_SERVICE_URL=http://localhost:8090
KOI_SERVICE_URL is optional. Leave it unset for standalone Orca behavior.
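A quick sanity check of the .env file can catch empty or missing values before the server starts. The sketch below is illustrative only and is not how the server itself loads configuration. It assumes simple KEY=VALUE lines and treats every key listed above except KOI_SERVICE_URL as required, which is an assumption. parse_env, missing_keys, and REQUIRED_KEYS are hypothetical names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


# Assumption: these keys are required; KOI_SERVICE_URL is optional per the text above.
REQUIRED_KEYS = ("S3_UPLOAD_BUCKET", "HF_TOKEN", "TD_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]
```

For example, missing_keys(parse_env(open(".env").read())) returns an empty list when every assumed-required key has a value.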
Operating model
The administrator owns the server deployment, AWS credentials, S3 bucket, Redis, environment configuration, network exposure, and cluster access. Users own their prompt files, model choices, SLOs, and CLI environment.
Document the server URL, supported models, and expected SLO values for your users before sharing access.
Network model
For local development, the control plane can be exposed through a tunnel or VPN. For production, deploy the control plane on a small EC2 instance in the same VPC as the inference replicas to avoid public exposure and tunnel latency.
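For the production case, one rough way to confirm that a server URL points inside the VPC is to check whether its host is a private IP address. This is a sketch under stated assumptions: it only handles URLs whose host is an IP literal and does not resolve DNS names, so a hostname is reported as not private. host_is_private_ip is a hypothetical helper, and the address and port in the usage note are placeholders.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_private_ip(url: str) -> bool:
    """Return True if the URL's host is an IP literal in a private range."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # The host is a DNS name, not an IP literal; this sketch does not resolve it.
        return False
```

For instance, host_is_private_ip("http://10.0.1.25:8000") is True, while a URL with a public IP or a DNS name yields False and needs a manual check of how that name resolves.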
Updates
When a new Tandemn server version is available, review the release notes, back up deployment-specific configuration, and update in a controlled maintenance window.
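Backing up deployment-specific configuration can be as simple as copying .env to a timestamped file before pulling the new version. The following sketch assumes .env is the only file to preserve, which may not hold for your deployment. backup_file and the backup directory are hypothetical names.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_file(source: str, backup_dir: str) -> Path:
    """Copy source into backup_dir under a UTC-timestamped name and return the new path."""
    src = Path(source)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    # copy2 preserves file metadata such as modification time where possible.
    shutil.copy2(src, dest)
    return dest
```

Running backup_file(".env", "backups") from the project root before an update leaves a copy you can restore if the new version changes the expected configuration. Because .env holds secrets, keep the backup directory out of version control.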