Use these pages to understand how Tandemn works before operating it in a shared environment.

Architecture

See how the CLI, server, and GPU resources fit together.

Batch inference

Learn why Tandemn focuses on queued inference workloads.

Job lifecycle

Follow a job from input file to execution.

Models and routing

Understand how model choice and hardware placement relate.

CLI-first design

Tandemn is a CLI-first product: users submit work with tandemn deploy, while administrators operate the server and cluster environment behind that interface.
