Overview
RedPill supports 50+ AI models from leading providers, all accessible through a single TEE-protected API. Every request is hardware-protected regardless of which model you choose.List Models via API
Get the latest models programmatically →
Model Categories
Chat Models
Conversational AI for chatbots and assistants
Instruction Models
Task completion and code generation
Vision Models
Image understanding and analysis
Embedding Models
Text embeddings for search and similarity
Featured Models
OpenAI Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
openai/gpt-5 | 400K | $1.25/M | $10/M |
openai/gpt-5-mini | 400K | $0.25/M | $2/M |
openai/gpt-5-nano | 400K | $0.05/M | $0.4/M |
openai/o4-mini | 200K | $1.1/M | $4.4/M |
openai/o3 | 200K | $2/M | $8/M |
openai/gpt-4.1 | 1M | $2/M | $8/M |
openai/gpt-4.1-mini | 1M | $0.4/M | $1.6/M |
Anthropic Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
anthropic/claude-sonnet-4.5 | 1M | $3/M | $15/M |
anthropic/claude-opus-4.1 | 200K | $15/M | $75/M |
anthropic/claude-opus-4 | 200K | $15/M | $75/M |
anthropic/claude-sonnet-4 | 1M | $3/M | $15/M |
anthropic/claude-3.7-sonnet | 200K | $3/M | $15/M |
anthropic/claude-3.5-haiku | 200K | $0.8/M | $4/M |
Google Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
google/gemini-2.5-pro | 1M | $1.25/M | $10/M |
google/gemini-2.5-flash | 1M | $0.3/M | $2.5/M |
google/gemini-2.5-flash-lite | 1M | $0.1/M | $0.4/M |
google/gemma-3-27b-it | 53K | $0.11/M | $0.4/M |
Qwen Models
| Model ID | Context | Prompt | Completion |
|---|---|---|---|
qwen/qwen2.5-vl-72b-instruct | 64K | $0.59/M | $0.59/M |
qwen/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M |
qwen/qwen3-vl-235b-a22b-instruct | 131K | $0.3/M | $1.49/M |
Phala Confidential AI Models
Native TEE models running entirely in GPU secure enclaves:| Model ID | Context | Prompt | Completion | Quantization |
|---|---|---|---|---|
phala/deepseek-chat-v3-0324 | 163K | $0.28/M | $1.14/M | FP8 |
phala/gemma-3-27b-it | 53K | $0.11/M | $0.4/M | FP8 |
phala/gpt-oss-120b | 131K | $0.1/M | $0.49/M | FP8 |
phala/gpt-oss-20b | 131K | $0.04/M | $0.15/M | FP8 |
phala/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M | FP8 |
phala/qwen2.5-vl-72b-instruct | 128K | $0.59/M | $0.59/M | FP8 |
phala/qwen3-vl-235b-a22b-instruct | 131K | $0.3/M | $1.49/M | FP8 |
Learn About Confidential AI
Explore Phala TEE models in detail →
Vision Models
Models that understand images:| Model ID | Context | Features |
|---|---|---|
phala/qwen2.5-vl-72b-instruct | 128K | TEE-protected vision |
phala/qwen3-vl-235b-a22b-instruct | 131K | TEE-protected vision |
Embedding Models
Generate vector embeddings for semantic search:| Model ID | Dimensions | Max Tokens |
|---|---|---|
openai/text-embedding-3-large | 3072 | 8191 |
openai/text-embedding-3-small | 1536 | 8191 |
openai/text-embedding-ada-002 | 1536 | 8191 |
cohere/embed-english-v3.0 | 1024 | 512 |
Provider Coverage
Supported Providers
- OpenAI - GPT-5, GPT-4.1, O3, O4, embeddings
- Anthropic - Claude Sonnet 4.5, Claude Opus 4.1, Claude 3.7, Claude 3.5 Haiku
- Google - Gemini 2.5 Pro/Flash/Flash-Lite
- Qwen - Qwen 2.5, Qwen-VL, Qwen 3
- Phala - Confidential AI models in TEE
- Cohere - Command, Embed
- DeepSeek - Chat, Code models
- And more providers
Pricing Tiers
Free Models
Great for testing and low-volume use:qwen/qwen-2.5-7b-instruct- $0.04/M promptliquid/lfm-40b- Free
Budget Models
Balance of cost and quality:openai/gpt-3.5-turbo- $0.5/M promptopenai/gpt-4o-mini- $0.15/M promptgoogle/gemini-2.5-flash-lite- $0.1/M promptqwen/qwen-2.5-7b-instruct- $0.04/M prompt
Premium Models
Best quality for production:openai/gpt-5- $1.25/M promptanthropic/claude-sonnet-4.5- $3/M promptgoogle/gemini-2.5-pro- $1.25/M promptphala/deepseek-chat-v3-0324- $0.28/M prompt (TEE)
Model Selection Guide
By Use Case
Chatbots & Assistants
Chatbots & Assistants
Best:
anthropic/claude-sonnet-4.5, openai/gpt-5Budget: openai/gpt-5-mini, qwen/qwen-2.5-7b-instructCode Generation
Code Generation
Best:
openai/gpt-5, anthropic/claude-opus-4.1Budget: qwen/qwen2.5-vl-72b-instruct, google/gemini-2.5-flashText Analysis
Text Analysis
Best:
anthropic/claude-sonnet-4.5, google/gemini-2.5-proBudget: google/gemini-2.5-flash, qwen/qwen-2.5-7b-instructImage Understanding
Image Understanding
Best:
qwen/qwen3-vl-235b-a22b-instruct, qwen/qwen2.5-vl-72b-instructBudget: phala/qwen2.5-vl-72b-instructHigh-Privacy Workloads
High-Privacy Workloads
TEE-Protected: All Phala models
phala/deepseek-chat-v3-0324- Best qualityphala/gpt-oss-120b- OpenAI architecturephala/qwen-2.5-7b-instruct- Budget option
Long Context
Long Context
Best:
google/gemini-2.5-pro (1M tokens), anthropic/claude-sonnet-4.5 (1M tokens)Others: openai/gpt-4.1 (1M tokens), qwen/qwen3-vl-235b-a22b-instruct (131K tokens)Get Latest Models
Via API
Via SDK
Model Properties
Each model includes:| Property | Description |
|---|---|
id | Model identifier for API calls |
name | Human-readable name |
context_length | Maximum tokens in context window |
pricing.prompt | Cost per 1K prompt tokens |
pricing.completion | Cost per 1K completion tokens |
quantization | Model quantization (e.g., FP8, FP16) |
modality | Input/output types (text, image) |
Model Compatibility
OpenAI SDK Compatible
All models work with OpenAI SDK:Streaming Support
All chat models support streaming:Function Calling
Supported models:- All OpenAI GPT models
- Anthropic Claude 3+ models
- Google Gemini models
- Meta Llama 3.2+ models
- Mistral models
FAQs
Are all 50+ models protected by TEE?
Are all 50+ models protected by TEE?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala confidential models.
Can I use my favorite model?
Can I use my favorite model?
If it’s one of the 50+ models, yes! If not, request it and we’ll consider adding it.
What's the difference between regular and Phala models?
What's the difference between regular and Phala models?
Regular models: TEE-protected gateway onlyPhala models: Full end-to-end TEE (gateway + inference in GPU TEE)
How often are new models added?
How often are new models added?
We add new models weekly. Check the API or docs for the latest additions.
Can I request a specific model?
Can I request a specific model?
Yes! Email support@redpill.ai with your model request.