Overview
RedPill supports 48+ active AI models across 9 providers, all accessible through a single TEE-protected API. Every request is hardware-protected regardless of which model you choose.
Currently active: OpenAI, Anthropic, Google, xAI, DeepSeek, Qwen, ZhipuAI, Meta, MoonshotAI, NousResearch, and more
List Models via API
Get the latest models programmatically ->
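The catalog is exposed at the OpenAI-compatible `GET /v1/models` endpoint. A minimal sketch using only the standard library — the base URL is an assumption here, and the `providers` field is the one documented under Model Properties below:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.redpill.ai/v1"  # assumed gateway URL


def list_models(api_key: str) -> list[dict]:
    """Fetch the model catalog from the OpenAI-compatible GET /v1/models endpoint."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["data"]


def filter_by_provider(models: list[dict], provider: str) -> list[str]:
    """Return IDs of models served by the given infrastructure provider."""
    return [m["id"] for m in models if provider in m.get("providers", [])]


if __name__ == "__main__" and "REDPILL_API_KEY" in os.environ:
    catalog = list_models(os.environ["REDPILL_API_KEY"])
    print(filter_by_provider(catalog, "phala"))
```

The same filtering can be done server-side with the `/v1/models/phala` endpoint mentioned in the FAQs.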
Model Categories
Chat Models
Conversational AI for chatbots and assistants
Instruction Models
Task completion and code generation
Vision Models
Image understanding and analysis
Embedding Models
Text embeddings for search and similarity
Featured Models
OpenAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| openai/gpt-5.2 | 400K | $1.75/M | $14/M |
| openai/gpt-5.1 | 400K | $1.25/M | $10/M |
| openai/gpt-5 | 400K | $1.25/M | $10/M |
| openai/gpt-5-mini | 400K | $0.25/M | $2/M |
| openai/gpt-5-nano | 400K | $0.05/M | $0.4/M |
| openai/o3 | 200K | $2/M | $8/M |
| openai/o4-mini | 200K | $1.1/M | $4.4/M |
| openai/gpt-4.1 | 1M | $2/M | $8/M |
| openai/gpt-4.1-mini | 1M | $0.4/M | $1.6/M |
| openai/gpt-4.1-nano | 1M | $0.1/M | $0.4/M |
| openai/gpt-4o | 128K | $2.5/M | $10/M |
| openai/gpt-4o-mini | 128K | $0.15/M | $0.6/M |
| openai/gpt-4 | 8K | $30/M | $60/M |
| openai/gpt-3.5-turbo | 16K | $0.5/M | $1.5/M |
New: GPT-5.2 is OpenAI’s latest flagship model with enhanced reasoning and improved performance across all benchmarks.
Anthropic Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| anthropic/claude-opus-4.6 | 1M | $10/M | $37.5/M |
| anthropic/claude-sonnet-4.6 | 1M | $3/M | $15/M |
| anthropic/claude-opus-4.5 | 200K | $5/M | $25/M |
| anthropic/claude-sonnet-4.5 | 1M | $3/M | $15/M |
| anthropic/claude-opus-4.1 | 200K | $15/M | $75/M |
| anthropic/claude-opus-4 | 200K | $15/M | $75/M |
| anthropic/claude-sonnet-4 | 1M | $3/M | $15/M |
| anthropic/claude-haiku-4.5 | 200K | $1/M | $5/M |
| anthropic/claude-3.7-sonnet | 200K | $3/M | $15/M |
| anthropic/claude-3.5-haiku | 200K | $0.8/M | $4/M |
New: Claude Opus 4.6 and Sonnet 4.6 are the latest Anthropic models. Claude Opus 4.5 and Haiku 4.5 are also now available.
Google Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| google/gemini-3-pro-preview | 1M | $4/M | $18/M |
| google/gemini-2.5-pro | 1M | $2.5/M | $15/M |
| google/gemini-2.5-flash | 1M | $0.3/M | $2.5/M |
| google/gemini-2.5-flash-lite | 1M | $0.1/M | $0.4/M |
New: Gemini 3 Pro Preview is Google’s next-generation model with advanced reasoning capabilities. Gemini 2.5 Pro pricing updated to $2.5/M prompt.
xAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| x-ai/grok-4 | 256K | $3/M | $15/M |
| x-ai/grok-4.1-fast | 2M | $0.2/M | $0.5/M |
| x-ai/grok-code-fast-1 | 256K | $0.2/M | $1.5/M |
New: Grok 4.1 Fast now supports a massive 2M token context window at just $0.2/M prompt.
DeepSeek Models
| Model ID | Context | Prompt | Completion | Provider |
|---|---|---|---|---|
| deepseek/deepseek-v3.2 | 163K | $0.27/M | $0.4/M | Chutes |
| deepseek/deepseek-chat-v3.1 | 163K | $1/M | $2.5/M | Near AI |
| deepseek/deepseek-r1-0528 | 163K | $2/M | $2/M | Tinfoil |
New: DeepSeek V3.2 is the latest version available through Chutes. DeepSeek R1 is a reasoning model available with Tinfoil TEE protection.
MoonshotAI Models
| Model ID | Context | Prompt | Completion | Provider |
|---|---|---|---|---|
| moonshotai/kimi-k2.5 | 262K | $0.6/M | $3/M | Chutes |
| moonshotai/kimi-k2-thinking | 262K | $2/M | $2/M | Tinfoil |
New: Kimi K2.5 is a native multimodal model with state-of-the-art visual coding. Kimi K2 Thinking is an advanced reasoning model optimized for long-horizon agentic tasks.
Qwen Models
| Model ID | Context | Prompt | Completion | Provider |
|---|---|---|---|---|
| qwen/qwen3-coder-480b-a35b-instruct | 262K | $2/M | $2/M | Tinfoil |
| qwen/qwen3-vl-30b-a3b-instruct | 128K | $0.2/M | $0.7/M | Phala |
| qwen/qwen3-30b-a3b-instruct-2507 | 262K | $0.15/M | $0.45/M | Near AI |
| qwen/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M | Phala |
ZhipuAI Models
| Model ID | Context | Prompt | Completion | Provider |
|---|---|---|---|---|
| z-ai/glm-5 | 202K | $1.2/M | $3.5/M | Phala |
| z-ai/glm-4.7 | 131K | $0.85/M | $3.3/M | Near AI |
| z-ai/glm-4.7-flash | 202K | $0.1/M | $0.43/M | Phala |
New: GLM-5 is ZhipuAI’s latest flagship model for systems engineering. GLM 4.7 Flash offers excellent speed-to-quality ratio with 202K context.
Other Models
| Model ID | Context | Prompt | Completion | Provider |
|---|---|---|---|---|
| openai/gpt-oss-120b | 131K | $0.1/M | $0.49/M | Phala |
| openai/gpt-oss-20b | 131K | $0.04/M | $0.15/M | Phala |
| google/gemma-3-27b-it | 53K | $0.11/M | $0.4/M | Phala |
| meta-llama/llama-3.3-70b-instruct | 131K | $2/M | $2/M | Tinfoil |
| nousresearch/hermes-3-llama-3.1-405b | 131K | $1/M | $1/M | OpenRouter |
| phala/uncensored-24b | 32K | $0.2/M | $0.9/M | Phala |
GPU TEE Confidential Models
RedPill offers 17 confidential AI models running entirely in GPU TEEs across 4 providers:
Phala Network (8 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| phala/glm-5 | 202K | $1.2/M | $3.5/M |
| phala/gpt-oss-120b | 131K | $0.1/M | $0.49/M |
| phala/qwen3-vl-30b-a3b-instruct | 128K | $0.2/M | $0.7/M |
| phala/gemma-3-27b-it | 53K | $0.11/M | $0.4/M |
| phala/uncensored-24b | 32K | $0.2/M | $0.9/M |
| phala/gpt-oss-20b | 131K | $0.04/M | $0.15/M |
| phala/glm-4.7-flash | 202K | $0.1/M | $0.43/M |
| phala/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M |
Tinfoil (4 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-r1-0528 | 163K | $2/M | $2/M |
| qwen/qwen3-coder-480b-a35b-instruct | 262K | $2/M | $2/M |
| moonshotai/kimi-k2-thinking | 262K | $2/M | $2/M |
| meta-llama/llama-3.3-70b-instruct | 131K | $2/M | $2/M |
Near AI (3 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-chat-v3.1 | 163K | $1/M | $2.5/M |
| qwen/qwen3-30b-a3b-instruct-2507 | 262K | $0.15/M | $0.45/M |
| z-ai/glm-4.7 | 131K | $0.85/M | $3.3/M |
Chutes (2 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-v3.2 | 163K | $0.27/M | $0.4/M |
| moonshotai/kimi-k2.5 | 262K | $0.6/M | $3/M |
Learn About Confidential AI
Explore all 17 GPU TEE models in detail ->
Vision Models
Models that understand images:
| Model ID | Context | Features | Provider |
|---|---|---|---|
| phala/qwen3-vl-30b-a3b-instruct | 128K | Vision + Text | Phala |
| phala/gemma-3-27b-it | 53K | Vision + Text | Phala |
| moonshotai/kimi-k2.5 | 262K | Vision + Text | Chutes |
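A request to a vision model attaches the image as a base64 data URL inside an OpenAI-style message. A minimal sketch — the helper name is illustrative, and the message shape follows the standard OpenAI content-parts format:

```python
import base64


def build_vision_message(prompt: str, image_bytes: bytes,
                         mime: str = "image/jpeg") -> dict:
    """Build an OpenAI-style user message with an inline base64 image part."""
    data_url = f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

Pass the result in `messages` to a chat completion call against a vision model such as phala/qwen3-vl-30b-a3b-instruct.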
Embedding Models
Generate vector embeddings for semantic search:
| Model ID | Dimensions | Max Tokens |
|---|---|---|
| openai/text-embedding-3-large | 3072 | 8191 |
| openai/text-embedding-3-small | 1536 | 8191 |
| openai/text-embedding-ada-002 | 1536 | 8191 |
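Embeddings come back as plain float vectors (3072 dimensions for text-embedding-3-large), so ranking documents by similarity needs nothing beyond cosine similarity. A minimal sketch:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

Embed the query and each document once, then sort documents by their similarity to the query vector.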
Provider Coverage
Supported Providers
- OpenAI - GPT-5.2, GPT-5.1, GPT-5, GPT-4.1, O3, O4-mini, GPT-OSS, embeddings
- Anthropic - Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, Claude Opus 4.1, Claude Haiku 4.5, Claude 3.7 Sonnet, Claude 3.5 Haiku
- Google - Gemini 3 Pro Preview, Gemini 2.5 Pro/Flash/Flash-Lite
- xAI - Grok 4, Grok 4.1 Fast, Grok Code Fast
- DeepSeek - DeepSeek V3.2, DeepSeek V3.1, DeepSeek R1
- MoonshotAI - Kimi K2.5, Kimi K2 Thinking
- Qwen - Qwen3 Coder 480B, Qwen3 VL, Qwen3 30B, Qwen 2.5
- ZhipuAI - GLM-5, GLM-4.7, GLM-4.7 Flash
- Meta - Llama 3.3 70B
- NousResearch - Hermes 3 Llama 405B
Pricing Tiers
Budget Models
Balance of cost and quality:
- phala/qwen-2.5-7b-instruct - $0.04/M prompt (TEE)
- phala/gpt-oss-20b - $0.04/M prompt (TEE)
- openai/gpt-5-nano - $0.05/M prompt
- phala/glm-4.7-flash - $0.1/M prompt (TEE)
- google/gemini-2.5-flash-lite - $0.1/M prompt
- openai/gpt-4o-mini - $0.15/M prompt
Mid-Range Models
Production-ready with strong quality:
- openai/gpt-5-mini - $0.25/M prompt
- google/gemini-2.5-flash - $0.3/M prompt
- moonshotai/kimi-k2.5 - $0.6/M prompt
- anthropic/claude-3.5-haiku - $0.8/M prompt
Premium Models
Best quality for production:
- openai/gpt-5.2 - $1.75/M prompt
- google/gemini-2.5-pro - $2.5/M prompt
- anthropic/claude-sonnet-4.5 - $3/M prompt
- anthropic/claude-opus-4.6 - $10/M prompt
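Prices are quoted per million tokens, so the cost of a request is a simple proportion. For example, 10,000 prompt tokens plus 2,000 completion tokens on openai/gpt-5-mini ($0.25/M prompt, $2/M completion) cost $0.0065:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_usd_per_m: float, completion_usd_per_m: float) -> float:
    """Estimate request cost in USD from per-million-token prices."""
    return (prompt_tokens * prompt_usd_per_m
            + completion_tokens * completion_usd_per_m) / 1_000_000


# openai/gpt-5-mini: $0.25/M prompt, $2/M completion
print(estimate_cost(10_000, 2_000, 0.25, 2.0))
```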
Model Selection Guide
By Use Case
Chatbots & Assistants
Best: anthropic/claude-sonnet-4.5, openai/gpt-5
Budget: openai/gpt-5-mini, phala/qwen-2.5-7b-instruct
Code Generation
Best: openai/gpt-5, anthropic/claude-opus-4.6
Budget: x-ai/grok-code-fast-1, phala/glm-4.7-flash
Text Analysis
Best: anthropic/claude-sonnet-4.5, google/gemini-2.5-pro
Budget: google/gemini-2.5-flash, phala/qwen-2.5-7b-instruct
Image Understanding
Best: phala/qwen3-vl-30b-a3b-instruct, moonshotai/kimi-k2.5
Budget: phala/gemma-3-27b-it
High-Privacy Workloads
TEE-Protected: All Phala models
- phala/glm-5 - Best quality
- phala/gpt-oss-120b - OpenAI architecture
- phala/qwen-2.5-7b-instruct - Budget option
Long Context
Best: x-ai/grok-4.1-fast (2M tokens), google/gemini-2.5-pro (1M tokens), anthropic/claude-opus-4.6 (1M tokens)
TEE: phala/glm-5 (202K tokens), phala/glm-4.7-flash (202K tokens)
Agentic / Reasoning
Best: moonshotai/kimi-k2-thinking, deepseek/deepseek-r1-0528, openai/o3
Budget: phala/glm-4.7-flash, openai/o4-mini
Get Latest Models
Via API
Via SDK
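Both paths return the same catalog. A sketch using the OpenAI Python SDK pointed at the RedPill gateway — the base URL and the presence of a `created` timestamp on each model are assumptions here:

```python
import os


def newest_model_ids(models: list[dict], n: int = 5) -> list[str]:
    """Return the n most recently added model IDs, ranked by `created` timestamp."""
    ranked = sorted(models, key=lambda m: m.get("created", 0), reverse=True)
    return [m["id"] for m in ranked[:n]]


if __name__ == "__main__" and "REDPILL_API_KEY" in os.environ:
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url="https://api.redpill.ai/v1",  # assumed gateway URL
        api_key=os.environ["REDPILL_API_KEY"],
    )
    catalog = [m.model_dump() for m in client.models.list().data]
    print(newest_model_ids(catalog))
```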
Model Properties
Each model includes:
| Property | Description |
|---|---|
| id | Model identifier for API calls |
| name | Human-readable name |
| context_length | Maximum tokens in context window |
| pricing.prompt | Cost per token (prompt) |
| pricing.completion | Cost per token (completion) |
| input_modalities | Input types (text, image, file, audio) |
| output_modalities | Output types (text) |
| providers | Array of infrastructure providers (phala, tinfoil, near-ai, etc.) |
| metadata.appid | Phala TEE application ID (for attestation) |
Model Compatibility
OpenAI SDK Compatible
All models work with the OpenAI SDK.
Streaming Support
All chat models support streaming.
Function Calling
Supported models:
- All OpenAI GPT models
- Anthropic Claude 3+ models
- Google Gemini models
- Meta Llama 3.2+ models
- Phala GLM and GPT-OSS models
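The streaming interface follows the standard OpenAI chunk format, where each chunk carries an incremental text delta. A minimal sketch — the model choice and base URL are assumptions:

```python
import os


def accumulate_stream(chunks) -> str:
    """Join the incremental text deltas from an OpenAI-style streaming response."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)


if __name__ == "__main__" and "REDPILL_API_KEY" in os.environ:
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url="https://api.redpill.ai/v1",  # assumed gateway URL
        api_key=os.environ["REDPILL_API_KEY"],
    )
    stream = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello"}],
        stream=True,
    )
    print(accumulate_stream(c.model_dump() for c in stream))
```

Function calling uses the same request shape with the standard `tools` parameter on the supported models listed above.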
FAQs
Are all models protected by TEE?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala, Tinfoil, Near AI, or Chutes confidential models.
How do I identify which models run on Phala?
Three ways: (1) use the /v1/models/phala endpoint, (2) check the providers field for "phala", or (3) use the phala/ prefix in model IDs.
What's the difference between regular and TEE models?
Regular models: TEE-protected gateway only (your request is protected in transit).
TEE models (Phala/Tinfoil/Near AI/Chutes): full end-to-end TEE (gateway plus inference in a GPU TEE).
How often are new models added?
We add new models regularly. Check the API or docs for the latest additions.
Can I request a specific model?
Yes! Email support@redpill.ai with your model request.
Next Steps
Start Using Models
Make your first request
Confidential AI Models
Explore all TEE models in detail
API Reference
Models API endpoint
Pricing Details
Understand pricing