Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.tandemn.com/llms.txt

Use this file to discover all available pages before exploring further.

What is Tandemn?

Tandemn is an infrastructure layer for running inference workloads across accelerated clusters. It gives administrators a server to operate and users a CLI to submit jobs.

Who should use Tandemn?

Tandemn is built for teams that run inference workloads and want to reduce manual infrastructure decisions. It is especially useful when an organization has mixed GPU supply or idle accelerated capacity.

What problem does Tandemn solve?

Inference infrastructure often requires users to choose hardware, size jobs, and manage placement manually. Tandemn moves those decisions into an orchestration layer so users can submit work through a simpler interface.

Is Tandemn only for batch inference?

The current docs focus on batch inference jobs submitted through the CLI. Batch workflows are a natural fit when jobs can be queued, planned, and executed across available resources.

What does a user need to get started?

A user needs the Tandemn CLI, the server URL from an administrator, and a valid OpenAI-style JSONL input file.
pip install tandemn
export TD_SERVER_URL=<your-server-url>
tandemn check

What does an administrator need to get started?

An administrator needs Python 3.12+, AWS credentials, IAM access for EC2/S3/service quotas, an S3 bucket, Redis, and a network-reachable host for the self-hosted control plane.

How do I know which models are available?

Model availability is deployment-specific. Ask your administrator which model identifiers are supported in your Tandemn environment.

Where should I start?

Use the Quickstart for an end-to-end setup, or use Requirements to understand what you need before deploying Tandemn.