- Aggregates and routes jobs to the top open-source models.
- Deploys across heterogeneous GPU pools, drawing on idle capacity from mixed GPU instances.
- Comparable inference latency at a significantly lower cost.
Welcome to the Tandemn docs!