Chat Completions

Create Chat Completion

Creates a model response for the given chat conversation. All requests are TEE-protected.
POST https://api.redpill.ai/v1/chat/completions

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., gpt-4, phala/qwen-2.5-7b-instruct) |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2), default 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Stream responses, default false |
| top_p | number | No | Nucleus sampling (0-1) |
| n | integer | No | Number of completions, default 1 |
| stop | string/array | No | Stop sequences |
| presence_penalty | number | No | Presence penalty (-2 to 2) |
| frequency_penalty | number | No | Frequency penalty (-2 to 2) |
| tools | array | No | Function calling tools |
| tool_choice | string/object | No | Control tool usage |
| response_format | object | No | Output format ({"type": "json_object"}) |
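
As a rough illustration of how the parameters above fit together, the following sketch builds a request body and rejects values outside the documented ranges before anything is sent. The helper name `build_payload` and the `RANGES` table are hypothetical and not part of the RedPill API; the server may validate differently.

```python
import json

# Ranges copied from the parameter table; the helper itself is hypothetical.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def build_payload(model, messages, **options):
    """Build a request body, rejecting values outside the documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    for name, (low, high) in RANGES.items():
        value = options.get(name)
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    payload = {"model": model, "messages": messages}
    # Leave unset options out so the server applies its own defaults.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


body = build_payload(
    "openai/gpt-4o",
    [{"role": "user", "content": "What is RedPill AI?"}],
    temperature=0.2,
    max_tokens=256,
)
print(json.dumps(body, indent=2))
```

Checking ranges on the client is optional, but it surfaces mistakes without spending a request.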

Message Object

{
  "role": "user" | "assistant" | "system",
  "content": "string" | array
}
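
A multi-turn conversation is simply a list of these objects in chronological order. The sketch below assumes only the three roles listed above; `make_message` is a hypothetical convenience helper, not an SDK function.

```python
# Roles listed in the message object above.
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role, content):
    """Return a message dict after basic shape checks (hypothetical helper)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if not isinstance(content, (str, list)):
        raise TypeError("content must be a string or a list of content parts")
    return {"role": role, "content": content}


# Earlier turns are resent with every request so the model sees the full history.
conversation = [
    make_message("system", "You are a concise assistant."),
    make_message("user", "What is a TEE?"),
    make_message("assistant", "A trusted execution environment isolates code and data."),
    make_message("user", "Give one example."),
]
```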

Example Requests

curl https://api.redpill.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "What is RedPill AI?"}
    ]
  }'

Response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "RedPill AI is a privacy-first AI platform..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 25,
    "total_tokens": 38
  }
}
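
To show how the fields of this response are typically consumed, here is a small sketch that parses the sample above with the standard library. The `summarize` helper is hypothetical, and treating a `finish_reason` of "length" as truncation is an assumption borrowed from the OpenAI convention, not something this page states.

```python
import json

# The sample response from above, as raw JSON text.
raw = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "RedPill AI is a privacy-first AI platform..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 25, "total_tokens": 38}
}
"""


def summarize(response):
    """Pull out the reply text, finish reason, and token usage (hypothetical helper)."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Assumption: "length" signals the reply was cut off by max_tokens.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize(json.loads(raw))
print(summary["text"])
```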

Streaming

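Set stream to true to receive the response incrementally as it is generated:
from openai import OpenAI

# Assumes the OpenAI Python SDK pointed at the RedPill base URL
client = OpenAI(base_url="https://api.redpill.ai/v1", api_key="YOUR_API_KEY")

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no choices or an empty delta; skip them
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)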
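
If the full text is needed after streaming finishes, the deltas can be accumulated instead of only printed. The sketch below runs offline against stand-in chunks built with `SimpleNamespace`; `collect_text` and `fake_chunk` are hypothetical helpers, and the chunk shape is assumed to match what the loop above reads.

```python
from types import SimpleNamespace


def collect_text(stream):
    """Join the content deltas of a stream into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # a chunk may arrive without any choices
        delta = chunk.choices[0].delta
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)


def fake_chunk(content):
    """Stand-in for a streamed chunk, used only for this offline demo."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=content))]
    )


fake_stream = [
    fake_chunk("Once "),
    fake_chunk(None),             # empty delta, skipped
    fake_chunk("upon a time"),
    SimpleNamespace(choices=[]),  # chunk without choices, skipped
]
story = collect_text(fake_stream)
print(story)  # Once upon a time
```

With a real stream, pass the object returned by `client.chat.completions.create(..., stream=True)` in place of `fake_stream`.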

Vision (Multimodal)

To send an image to a vision-capable model, pass content as an array that mixes text and image_url parts:
response = client.chat.completions.create(
    model="phala/qwen2.5-vl-72b-instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://..."}}
        ]
    }]
)
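
For images that are not hosted at a public URL, OpenAI-compatible APIs commonly accept a base64 data URL in the same `image_url` field. Whether a given RedPill model accepts data URLs is an assumption here; the helpers `image_part_from_bytes` and `vision_message` are hypothetical.

```python
import base64


def image_part_from_bytes(data, mime_type="image/png"):
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message that pairs a text prompt with one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part_from_bytes(image_bytes, mime_type),
        ],
    }


# Placeholder bytes (the PNG signature) stand in for a real image file.
png_signature = b"\x89PNG\r\n\x1a\n"
message = vision_message("What's in this image?", png_signature)
```

In practice the bytes would come from reading a file in binary mode, and `message` would be passed inside the `messages` list.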

Function Calling

Define tools/functions for the model to call:
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=tools
)
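
When the model decides to call a tool, the caller is expected to run the function and send the result back. The sketch below assumes the OpenAI-compatible response shape (a `tool_calls` list whose `arguments` field is a JSON string, answered with a message of role "tool"), which this page does not spell out. The tool calls are shown as plain dicts, and `get_weather` returns canned data.

```python
import json


def get_weather(location, unit="celsius"):
    """Hypothetical local implementation that returns canned data."""
    return {"location": location, "temperature": 18, "unit": unit}


# Maps tool names declared in `tools` to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested call and build the follow-up tool messages."""
    results = []
    for call in tool_calls:
        fn = TOOL_REGISTRY[call["function"]["name"]]
        # Assumption: arguments arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results


# Stand-in for the tool_calls field of an assistant message.
sample_calls = [{
    "id": "call_1",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "San Francisco"}',
    },
}]
tool_messages = run_tool_calls(sample_calls)
print(tool_messages[0]["content"])
```

The resulting messages would be appended to the conversation, after the assistant message that requested the calls, and sent in a second request so the model can produce its final answer.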

Function Calling Guide

Learn more about function calling →

Error Handling

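import openai

try:
    response = client.chat.completions.create(...)
except openai.AuthenticationError:
    # 401: the API key is missing or invalid
    print("Invalid API key")
except openai.RateLimitError:
    # 429: too many requests in a given period
    print("Rate limit exceeded")
except openai.BadRequestError as e:
    # 400: the request body is malformed or uses invalid parameters
    print(f"Bad request: {e}")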
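
Rate limit errors are usually transient, so a common pattern is to retry with exponential backoff rather than fail immediately. The following is a minimal sketch of that pattern, not a RedPill recommendation. `call_with_retry` is a hypothetical helper, and `FakeRateLimitError` stands in for `openai.RateLimitError` so the example runs offline.

```python
import time


def call_with_retry(fn, retry_on, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on `retry_on` with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** attempt)


class FakeRateLimitError(Exception):
    """Stand-in for openai.RateLimitError in this offline demo."""


attempts = {"count": 0}


def flaky():
    """Fails twice, then succeeds, to exercise the retry loop."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("rate limited")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky, FakeRateLimitError, sleep=delays.append)
print(result, delays)
```

With the SDK, this would look like `call_with_retry(lambda: client.chat.completions.create(...), openai.RateLimitError)`. Authentication and bad request errors are not worth retrying, since the same request will fail again.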

Supported Models

  • OpenAI: gpt-4-turbo, gpt-4, gpt-3.5-turbo
  • Anthropic: anthropic/claude-3.5-sonnet
  • Google: google/gemini-1.5-pro
  • Meta: meta-llama/llama-3.3-70b-instruct
  • Phala TEE: phala/deepseek-chat-v3-0324
  • +200 more models

All Models

View all 218+ supported models →