1. Create an account

Go to Tandemn’s gateway and create an account. We currently support sign-in with GitHub, Gmail, and Apple accounts.

2. Generate an API key

After signing in, go to the keys page and follow the instructions to generate an API key.
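
Once you have a key, a common pattern is to keep it out of your source code and read it from the environment. The sketch below is one way to do that. The variable name `TANDEMN_API_KEY` and the helper `build_headers` are hypothetical names chosen for illustration, not names Tandemn prescribes.

```python
import os


def build_headers(env_var: str = "TANDEMN_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    The default variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

After exporting the key in your shell, `build_headers()` returns a dictionary that can be passed as the `headers` argument of `requests.post`.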

3. Send a request

The example below runs inference on Llama 3.3 70B. The Python code sends a request without streaming, and the cURL command sends a request with streaming.
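import requests

# Chat completions endpoint
url = "https://api.tandemn.com/api/v1/chat/completions"
headers = {
    # Replace <your-api-key> with the key generated in step 2
    "Authorization": "Bearer <your-api-key>",
    "Content-Type": "application/json",
}
data = {
    "model": "casperhansen/llama-3.3-70b-instruct-awq",
    "messages": [
        {"role": "user", "content": "Hello! Can you explain quantum computing?"}
    ],
}

# Send the request (no streaming) and print the JSON response
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # raise an error on a 4xx/5xx response
print(response.json())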
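
For readers who want streaming from Python as well, here is a minimal sketch. It assumes the endpoint follows the widely used OpenAI-style convention: a `"stream": True` field in the request body, and a response made of server-sent event lines of the form `data: {json}` that end with `data: [DONE]`. None of this is confirmed by the text above, so check it against the API's actual behavior. The function names `iter_stream_text` and `stream_chat` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent event lines.

    Assumes OpenAI-style chunks: 'data: {json}' lines, ending with 'data: [DONE]'.
    """
    for line in lines:
        if not line or not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(api_key: str, prompt: str) -> Iterator[str]:
    """Send a streaming chat request and yield the reply piece by piece."""
    import requests  # third-party; imported here so the parser above has no dependencies

    url = "https://api.tandemn.com/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    data = {
        "model": "casperhansen/llama-3.3-70b-instruct-awq",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # assumption: OpenAI-style streaming flag
    }
    with requests.post(url, headers=headers, json=data, stream=True, timeout=60) as response:
        response.raise_for_status()
        # iter_lines() yields bytes by default, so decode each line explicitly
        decoded = (raw.decode("utf-8") for raw in response.iter_lines())
        yield from iter_stream_text(decoded)
```

A typical use would be `for piece in stream_chat("<your-api-key>", "Hello!"): print(piece, end="", flush=True)`. Keeping the parser separate from the network call makes it possible to test the parsing logic on sample lines without sending a request.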