Chat Completions
POST /v1/chat/completions - Create chat completion responses
Create Chat Completion
Creates a model response for the given chat conversation. All requests are TEE-protected.
Try it now! Click the “Try it” button above to test the API in the playground. You’ll need:
- Your API key (add it when prompted)
- Fill in messages like: [{"role":"user","content":"Hello"}]
Request Body
- model: Model ID to use for completion. Examples: z-ai/glm-5.1, z-ai/glm-5, phala/qwen3.5-27b, openai/gpt-5, anthropic/claude-sonnet-4.5
- messages: Array of message objects. Each message needs role and content.
- temperature: Sampling temperature (0-2), default 1
- max_tokens: Maximum tokens to generate. Note: Newer models (GPT-5, O3, O4) use max_completion_tokens instead.
- max_completion_tokens: Maximum completion tokens (for GPT-5, O3, O4 models). Use this parameter instead of max_tokens for newer OpenAI models: openai/gpt-5, openai/gpt-5-mini, openai/gpt-5-nano, openai/o3, openai/o4-mini
- stream: Stream responses, default false
- top_p: Nucleus sampling (0-1)
- n: Number of completions, default 1
- presence_penalty: Presence penalty (-2 to 2)
- frequency_penalty: Frequency penalty (-2 to 2)
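The choice between max_tokens and max_completion_tokens described above can be sketched as a small payload builder. This is an illustrative helper, not part of the API: the name build_payload, the prefix list, and option names such as temperature (taken from the OpenAI-compatible convention) are assumptions.

```python
import json

# Assumption: these prefixes cover the "newer OpenAI models" listed above
# (openai/gpt-5*, openai/o3, openai/o4-mini).
NEWER_OPENAI_PREFIXES = ("openai/gpt-5", "openai/o3", "openai/o4")


def build_payload(model, messages, max_output_tokens=None, **options):
    """Build a chat completion request body (hypothetical helper)."""
    payload = {"model": model, "messages": messages}
    if max_output_tokens is not None:
        # Newer OpenAI models take max_completion_tokens instead of max_tokens.
        if model.startswith(NEWER_OPENAI_PREFIXES):
            payload["max_completion_tokens"] = max_output_tokens
        else:
            payload["max_tokens"] = max_output_tokens
    # Pass through optional fields, skipping any left as None.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


body = build_payload(
    "openai/gpt-5-mini",
    [{"role": "user", "content": "Hello"}],
    max_output_tokens=256,
    temperature=0.7,
)
print(json.dumps(body, indent=2))
```

Keeping the token-limit decision in one place means callers can switch models without remembering which field each model family expects.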
Message Object
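Each message needs a role and content. A minimal sketch of building a conversation with a system message follows; the helper make_message and the set of accepted roles are assumptions for illustration, since this section does not list the valid roles.

```python
# Assumption: the role set follows the common OpenAI-compatible convention.
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role, content):
    """Return a message dict after checking the two required fields."""
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    if not content:
        raise ValueError("content must not be empty")
    return {"role": role, "content": content}


messages = [
    make_message("system", "You are a helpful assistant."),
    make_message("user", "Hello"),
]
print(messages)
```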
Example Requests
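A request to POST /v1/chat/completions might be assembled as below, using only the Python standard library. The host in API_URL and the Bearer authorization header are assumptions (this section states neither), and "sk-example" is a placeholder key.

```python
import json
import urllib.request

# Assumption: the API host is not stated in this section; adjust as needed.
API_URL = "https://api.redpill.ai/v1/chat/completions"


def make_request(api_key, payload):
    """Build the HTTP request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_chat_completion(api_key, payload, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(make_request(api_key, payload), timeout=timeout) as resp:
        return json.load(resp)


req = make_request(
    "sk-example",
    {"model": "z-ai/glm-5", "messages": [{"role": "user", "content": "Hello"}]},
)
print(req.get_method(), req.full_url)
```

Calling create_chat_completion with a real key performs the network request; make_request alone is enough to inspect what would be sent.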
Response
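The response schema is not reproduced in this section. Assuming an OpenAI-compatible shape (a choices array whose entries carry a message), the reply text could be extracted as follows. The sample body is invented for illustration and is not real API output.

```python
import json

# Hypothetical response body, assuming an OpenAI-compatible shape.
sample = json.loads("""
{
  "id": "chatcmpl-example",
  "model": "z-ai/glm-5",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ]
}
""")


def first_reply(response):
    """Return the text of the first choice, or raise if there is none."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


print(first_reply(sample))
```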
Streaming
Enable stream: true for real-time responses.
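With streaming enabled, the reply arrives in pieces. The sketch below assumes the stream uses the OpenAI-compatible server-sent events format (lines prefixed with "data:", content under choices[].delta.content, and a final "[DONE]" marker); none of that is confirmed by this section. The sample lines are hand-written stand-ins for a real stream.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece


# Hand-written sample lines standing in for a real HTTP response body.
sample_lines = [
    b'data: {"choices":[{"delta":{"role":"assistant"}}]}\n',
    b"\n",
    b'data: {"choices":[{"delta":{"content":"Hel"}}]}\n',
    b'data: {"choices":[{"delta":{"content":"lo"}}]}\n',
    b"data: [DONE]\n",
]
text = "".join(iter_stream_text(sample_lines))
print(text)
```

Because an HTTP response object from urllib.request.urlopen can be iterated line by line, it can be passed to iter_stream_text in place of sample_lines.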
Vision (Multimodal)
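An image can be attached to a user message. The sketch below assumes the OpenAI-compatible content-parts format (a list mixing text and image_url entries, with the image passed as a base64 data URL); the exact format accepted here is not shown in this section, and image_message is a hypothetical helper.

```python
import base64


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message combining text with an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Assumption: images may be sent inline as data URLs.
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes; in practice read them from an image file.
msg = image_message("What is in this image?", b"placeholder image bytes")
print(msg["content"][1]["image_url"]["url"][:40])
```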
Image inputs require a vision-capable model.
Function Calling
Define tools/functions for the model to call.
Function Calling Guide
Learn more about function calling →
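As a rough illustration of function calling, the sketch below defines one tool and dispatches a tool call to a local function. It assumes the OpenAI-compatible tools schema and tool-call shape (arguments delivered as a JSON string, results returned in a "tool" role message); get_weather, TOOLS, REGISTRY, and the sample call are all hypothetical.

```python
import json


def get_weather(city):
    """Hypothetical local function the model may ask to call."""
    return {"city": city, "forecast": "sunny"}


# Tool definition to send in the request body (assumed "tools" field).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather forecast for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Execute one tool call and wrap the result as a tool message."""
    fn = tool_call["function"]
    handler = REGISTRY[fn["name"]]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(handler(**args)),
    }


# Hand-written example of what a model-issued tool call might look like.
sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}
result = run_tool_call(sample_call)
print(result)
```

The returned message would be appended to the conversation and sent back so the model can use the function's result.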
Error Handling
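This section does not list the API's error codes. A generic approach is to retry transient HTTP failures with exponential backoff and surface everything else immediately. In the sketch below, the set of retryable status codes is an assumption based on common HTTP practice, and call_with_retries and flaky_send are hypothetical names.

```python
import time
import urllib.error

# Assumption: rate limits and server errors are worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retries(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying retryable HTTP errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except urllib.error.HTTPError as err:
            if err.code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise  # client errors (e.g. a bad API key) are not retried
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a fake sender that fails twice with 429, then succeeds.
calls = []


def flaky_send():
    calls.append(1)
    if len(calls) < 3:
        raise urllib.error.HTTPError(
            "https://example.invalid", 429, "Too Many Requests", None, None
        )
    return {"ok": True}


delays = []
result = call_with_retries(flaky_send, sleep=delays.append)
print(result, delays)
```

Passing sleep as a parameter keeps the demo instant; in real use the default time.sleep applies the backoff.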
Supported Models
- GPU TEE: z-ai/glm-5.1, z-ai/glm-5, phala/qwen3.5-27b
- OpenAI: openai/gpt-5, openai/gpt-5-mini, openai/o4-mini
- Anthropic: anthropic/claude-sonnet-4.5, anthropic/claude-opus-4.1
- Google: google/gemini-1.5-pro
- Meta: meta-llama/llama-3.3-70b-instruct
- 50+ total models
All Models
View all supported models →