Overview
RedPill supports 60+ active AI models accessible through a single TEE-protected API, with 66+ provider integrations in the codebase enabling easy expansion. Every request is hardware-protected regardless of which model you choose.
Currently Active: OpenAI, Anthropic, Google, xAI, DeepSeek, Qwen, ZhipuAI, Meta, NousResearch, and more
Available Integrations: 66+ providers including Mistral, Groq, Together AI, Fireworks, Replicate, Cohere, Cerebras, Lambda, and others
List Models via API
Get the latest models programmatically →
Model Categories
Chat Models
Conversational AI for chatbots and assistants
Instruction Models
Task completion and code generation
Vision Models
Image understanding and analysis
Embedding Models
Text embeddings for search and similarity
Featured Models
OpenAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| openai/gpt-5.2 | 400K | $1.25/M | $10/M |
| openai/gpt-5.1 | 400K | $1.25/M | $10/M |
| openai/gpt-5 | 400K | $1.25/M | $10/M |
| openai/gpt-5-mini | 400K | $0.25/M | $2/M |
| openai/gpt-5-nano | 400K | $0.05/M | $0.4/M |
| openai/o4-mini | 200K | $1.1/M | $4.4/M |
| openai/o3 | 200K | $2/M | $8/M |
| openai/gpt-4.1 | 1M | $2/M | $8/M |
| openai/gpt-4.1-mini | 1M | $0.4/M | $1.6/M |
New: GPT-5.2 is OpenAI’s latest flagship model with enhanced reasoning and improved performance across all benchmarks.
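The Prompt and Completion columns are quoted in US dollars per million tokens, so the cost of a request is a weighted sum of its token counts. A minimal sketch, assuming the prices in the table above are current; `PRICES_PER_MILLION` and `estimate_cost` are hypothetical names used for illustration and are not part of any RedPill SDK:

```python
# Prices copied from the OpenAI table above, in USD per 1M tokens.
# Each value is (prompt_price, completion_price); illustrative subset only.
PRICES_PER_MILLION = {
    "openai/gpt-5": (1.25, 10.0),
    "openai/gpt-5-mini": (0.25, 2.0),
    "openai/gpt-5-nano": (0.05, 0.4),
}


def estimate_cost(model_id: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    prompt_price, completion_price = PRICES_PER_MILLION[model_id]
    total = prompt_tokens * prompt_price + completion_tokens * completion_price
    return total / 1_000_000
```

For example, `estimate_cost("openai/gpt-5", 2_000, 500)` charges 2,000 prompt tokens at $1.25/M and 500 completion tokens at $10/M.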
Anthropic Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| anthropic/claude-sonnet-4.5 | 1M | $3/M | $15/M |
| anthropic/claude-opus-4.1 | 200K | $15/M | $75/M |
| anthropic/claude-opus-4 | 200K | $15/M | $75/M |
| anthropic/claude-sonnet-4 | 1M | $3/M | $15/M |
| anthropic/claude-3.7-sonnet | 200K | $3/M | $15/M |
| anthropic/claude-3.5-haiku | 200K | $0.8/M | $4/M |
Google Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| google/gemini-3-pro-preview | 1M | $1.25/M | $10/M |
| google/gemini-2.5-pro | 1M | $1.25/M | $10/M |
| google/gemini-2.5-flash | 1M | $0.3/M | $2.5/M |
| google/gemini-2.5-flash-lite | 1M | $0.1/M | $0.4/M |
| google/gemma-3-27b-it | 53K | $0.11/M | $0.4/M |
New: Gemini 3 Pro Preview is Google’s next-generation model with advanced reasoning capabilities.
xAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| x-ai/grok-4 | 128K | $3/M | $15/M |
| x-ai/grok-4.1-fast | 128K | $1/M | $5/M |
| x-ai/grok-code-fast-1 | 128K | $1/M | $5/M |
New: xAI’s Grok models are now available! Grok 4 is the flagship model, while Grok 4.1 Fast and Grok Code Fast offer optimized performance for speed and code generation.
DeepSeek Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-v3.2 | 128K | $0.28/M | $1.14/M |
| deepseek/deepseek-chat-v3.1 | 128K | $0.28/M | $1.14/M |
| deepseek/deepseek-chat-v3-0324 | 128K | $0.28/M | $1.14/M |
| deepseek/deepseek-r1-0528 | 128K | $0.55/M | $2.19/M |
| deepseek/deepseek-chat | 128K | $0.14/M | $0.28/M |
New: DeepSeek V3.2 is the latest version with improved performance. DeepSeek R1 is a reasoning model optimized for complex multi-step tasks.
Qwen Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| qwen/qwen3-coder-480b-a35b-instruct | 131K | $0.3/M | $1.49/M |
| qwen/qwen3-vl-30b-a3b-instruct | 131K | $0.15/M | $0.6/M |
| qwen/qwen3-30b-a3b-instruct-2507 | 131K | $0.15/M | $0.6/M |
| qwen/qwen2.5-vl-72b-instruct | 64K | $0.59/M | $0.59/M |
| qwen/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M |
New: Qwen3 Coder 480B is the largest coding model available, with massive 480B parameters. Qwen3 VL and Qwen3 30B offer excellent multimodal and general capabilities.
ZhipuAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| z-ai/glm-4.6 | 128K | $0.5/M | $2/M |
New: GLM-4.6 is ZhipuAI’s latest large language model with strong Chinese and English bilingual capabilities.
GPU TEE Confidential Models
RedPill offers 14 confidential AI models running entirely in GPU TEE across 3 providers:
Phala Network (7 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-v3.2 | 128K | $0.28/M | $1.14/M |
| deepseek/deepseek-chat-v3-0324 | 163K | $0.28/M | $1.14/M |
| openai/gpt-oss-120b | 131K | $0.1/M | $0.49/M |
| openai/gpt-oss-20b | 131K | $0.04/M | $0.15/M |
| qwen/qwen2.5-vl-72b-instruct | 128K | $0.59/M | $0.59/M |
| qwen/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M |
| google/gemma-3-27b-it | 53K | $0.11/M | $0.4/M |
Tinfoil (4 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-r1-0528 | 128K | $0.55/M | $2.19/M |
| qwen/qwen3-coder-480b-a35b-instruct | 131K | $0.3/M | $1.49/M |
| qwen/qwen3-vl-30b-a3b-instruct | 131K | $0.15/M | $0.6/M |
| meta-llama/llama-3.3-70b-instruct | 128K | $0.1/M | $0.4/M |
Near AI (3 models)
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
| deepseek/deepseek-chat-v3.1 | 128K | $0.28/M | $1.14/M |
| qwen/qwen3-30b-a3b-instruct-2507 | 131K | $0.15/M | $0.6/M |
| z-ai/glm-4.6 | 128K | $0.5/M | $2/M |
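When a workload requires inference inside a GPU TEE, it can help to check a model ID against the tables above before sending a request. A small sketch: the `GPU_TEE_PROVIDERS` mapping is transcribed from the three tables, and `tee_provider` is a hypothetical helper, not a RedPill API:

```python
from typing import Optional

# Model IDs transcribed from the Phala Network, Tinfoil, and Near AI tables above.
GPU_TEE_PROVIDERS = {
    "Phala Network": [
        "deepseek/deepseek-v3.2",
        "deepseek/deepseek-chat-v3-0324",
        "openai/gpt-oss-120b",
        "openai/gpt-oss-20b",
        "qwen/qwen2.5-vl-72b-instruct",
        "qwen/qwen-2.5-7b-instruct",
        "google/gemma-3-27b-it",
    ],
    "Tinfoil": [
        "deepseek/deepseek-r1-0528",
        "qwen/qwen3-coder-480b-a35b-instruct",
        "qwen/qwen3-vl-30b-a3b-instruct",
        "meta-llama/llama-3.3-70b-instruct",
    ],
    "Near AI": [
        "deepseek/deepseek-chat-v3.1",
        "qwen/qwen3-30b-a3b-instruct-2507",
        "z-ai/glm-4.6",
    ],
}


def tee_provider(model_id: str) -> Optional[str]:
    """Return the GPU TEE provider listed for model_id, or None if it is not listed."""
    for provider, model_ids in GPU_TEE_PROVIDERS.items():
        if model_id in model_ids:
            return provider
    return None
```

A caller could refuse to send sensitive prompts whenever `tee_provider(model_id)` returns `None`.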
Learn About Confidential AI
Explore all 14 GPU TEE models in detail →
Vision Models
Models that understand images:
| Model ID | Context | Features | TEE Provider |
|---|---|---|---|
| qwen/qwen2.5-vl-72b-instruct | 128K | Vision + Text | Phala |
| qwen/qwen3-vl-30b-a3b-instruct | 131K | Vision + Text | Tinfoil |
Embedding Models
Generate vector embeddings for semantic search:
| Model ID | Dimensions | Max Tokens |
|---|---|---|
| openai/text-embedding-3-large | 3072 | 8191 |
| openai/text-embedding-3-small | 1536 | 8191 |
| openai/text-embedding-ada-002 | 1536 | 8191 |
| cohere/embed-english-v3.0 | 1024 | 512 |
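Embedding vectors are typically compared with cosine similarity when ranking search results. The sketch below uses only the standard library and toy 3-dimensional vectors; real vectors from the models above would have the dimensions listed in the table. `cosine_similarity` and `rank_by_similarity` are illustrative helpers, not part of any SDK:

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (document_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Vectors produced by different embedding models are not comparable with each other, so the query and the documents should be embedded with the same model.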
Provider Coverage
Supported Providers
- OpenAI - GPT-5.2, GPT-5.1, GPT-5, GPT-4.1, O3, O4, embeddings
- Anthropic - Claude Sonnet 4.5, Claude Opus 4.1, Claude 3.7, Claude 3.5 Haiku
- Google - Gemini 3 Pro Preview, Gemini 2.5 Pro/Flash/Flash-Lite, Gemma 3
- xAI - Grok 4, Grok 4.1 Fast, Grok Code Fast
- DeepSeek - DeepSeek V3.2, DeepSeek V3.1, DeepSeek R1
- Qwen - Qwen3 Coder 480B, Qwen3 VL, Qwen 2.5
- ZhipuAI - GLM-4.6 bilingual model
- Meta - Llama 3.3
- NousResearch - Hermes 3 Llama 405B
- And more providers
Pricing Tiers
Free Models
Great for testing and low-volume use:
- qwen/qwen-2.5-7b-instruct - $0.04/M prompt
- liquid/lfm-40b - Free
Budget Models
Balance of cost and quality:
- openai/gpt-3.5-turbo - $0.5/M prompt
- openai/gpt-4o-mini - $0.15/M prompt
- google/gemini-2.5-flash-lite - $0.1/M prompt
- qwen/qwen-2.5-7b-instruct - $0.04/M prompt
Premium Models
Best quality for production:
- openai/gpt-5 - $1.25/M prompt
- anthropic/claude-sonnet-4.5 - $3/M prompt
- google/gemini-2.5-pro - $1.25/M prompt
- phala/deepseek-chat-v3-0324 - $0.28/M prompt (TEE)
Model Selection Guide
By Use Case
Chatbots & Assistants
Best: anthropic/claude-sonnet-4.5, openai/gpt-5
Budget: openai/gpt-5-mini, qwen/qwen-2.5-7b-instruct
Code Generation
Best: openai/gpt-5, anthropic/claude-opus-4.1
Budget: qwen/qwen2.5-vl-72b-instruct, google/gemini-2.5-flash
Text Analysis
Best: anthropic/claude-sonnet-4.5, google/gemini-2.5-pro
Budget: google/gemini-2.5-flash, qwen/qwen-2.5-7b-instruct
Image Understanding
Best: qwen/qwen3-vl-235b-a22b-instruct, qwen/qwen2.5-vl-72b-instruct
Budget: phala/qwen2.5-vl-72b-instruct
High-Privacy Workloads
TEE-Protected: All Phala models
- phala/deepseek-chat-v3-0324 - Best quality
- phala/gpt-oss-120b - OpenAI architecture
- phala/qwen-2.5-7b-instruct - Budget option
Long Context
Best: google/gemini-2.5-pro (1M tokens), anthropic/claude-sonnet-4.5 (1M tokens)
Others: openai/gpt-4.1 (1M tokens), qwen/qwen3-vl-235b-a22b-instruct (131K tokens)
Get Latest Models
Via API
Via SDK
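A minimal sketch of fetching the model list over HTTP with the Python standard library. It assumes an OpenAI-style `GET /models` endpoint under `https://api.redpill.ai/v1` that accepts a bearer token and returns a JSON object with a `data` array; the base URL, path, and response shape are assumptions to verify against the API reference, and the function names are illustrative:

```python
import json
import urllib.request
from typing import Any, Dict, List

# Assumed base URL; confirm against the RedPill API reference.
BASE_URL = "https://api.redpill.ai/v1"


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the GET request for the model list."""
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: Dict[str, Any]) -> List[str]:
    """Pull the sorted model IDs out of an OpenAI-style list response."""
    return sorted(model["id"] for model in payload.get("data", []))


def list_model_ids(api_key: str) -> List[str]:
    """Send the request and return the model IDs (performs a network call)."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as response:
        return extract_model_ids(json.load(response))
```

Calling `list_model_ids(api_key)` with a valid key would return the current IDs. Request building and response parsing are kept separate so they can be exercised without network access.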
Model Properties
Each model includes:
| Property | Description |
|---|---|
| id | Model identifier for API calls |
| name | Human-readable name |
| context_length | Maximum tokens in context window |
| pricing.prompt | Cost per 1K prompt tokens |
| pricing.completion | Cost per 1K completion tokens |
| quantization | Model quantization (e.g., FP8, FP16) |
| modality | Input/output types (text, image) |
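These properties make it straightforward to filter the catalog on the client side. A sketch assuming each model is a dictionary shaped like the table above, with prices nested under a `pricing` key and readable as numbers; the `cheapest_with_context` helper and the sample records in the usage note are hypothetical:

```python
from typing import Any, Dict, List, Optional


def cheapest_with_context(
    models: List[Dict[str, Any]], min_context: int
) -> Optional[Dict[str, Any]]:
    """Return the model with the lowest prompt price whose context window is large enough.

    Returns None when no model satisfies min_context.
    """
    candidates = [m for m in models if m["context_length"] >= min_context]
    if not candidates:
        return None
    # float() tolerates prices delivered as strings as well as numbers.
    return min(candidates, key=lambda m: float(m["pricing"]["prompt"]))
```

For instance, given made-up records `{"id": "example/small", "context_length": 32000, "pricing": {"prompt": "0.01"}}` and `{"id": "example/large", "context_length": 128000, "pricing": {"prompt": "0.05"}}`, a call with `min_context=100000` selects `example/large`.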
Model Compatibility
OpenAI SDK Compatible
All models work with OpenAI SDK:
Streaming Support
All chat models support streaming:
Function Calling
Supported models:
- All OpenAI GPT models
- Anthropic Claude 3+ models
- Google Gemini models
- Meta Llama 3.2+ models
- Mistral models
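For the streaming support noted above, an OpenAI-compatible API normally returns server-sent events whose `data:` lines carry JSON chunks and end with `data: [DONE]`. The sketch below builds a request body and extracts text from such lines using only the standard library; it assumes RedPill follows the standard OpenAI `chat/completions` chunk format, and the helper names are illustrative:

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def build_chat_payload(
    model: str, messages: List[Dict[str, str]], stream: bool = True
) -> Dict[str, Any]:
    """Build the JSON body for an OpenAI-style chat completion request."""
    return {"model": model, "messages": messages, "stream": stream}


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full assistant message; in a real client the lines would come from the decoded HTTP response body.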
FAQs
Are all 60+ models (66+ provider integrations available) protected by TEE?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala confidential models.
Can I use my favorite model?
If it’s one of the 60+ models (66+ provider integrations available), yes! If not, request it and we’ll consider adding it.
What's the difference between regular and Phala models?
- Regular models: TEE-protected gateway only
- Phala models: Full end-to-end TEE (gateway + inference in GPU TEE)
How often are new models added?
We add new models weekly. Check the API or docs for the latest additions.
Can I request a specific model?
Yes! Email [email protected] with your model request.