Overview
RedPill supports 218+ AI models from leading providers, all accessible through a single TEE-protected API. Every request is hardware-protected regardless of which model you choose.
List Models via API
Get the latest models programmatically →
Model Categories
Chat Models
Conversational AI for chatbots and assistants
Instruction Models
Task completion and code generation
Vision Models
Image understanding and analysis
Embedding Models
Text embeddings for search and similarity
Featured Models
OpenAI Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
openai/gpt-4-turbo | 128K | $0.01/1K | $0.03/1K |
openai/gpt-4 | 8K | $0.03/1K | $0.06/1K |
openai/gpt-3.5-turbo | 16K | $0.0005/1K | $0.0015/1K |
openai/o1-preview | 128K | $0.015/1K | $0.06/1K |
openai/o1-mini | 128K | $0.003/1K | $0.012/1K |
Anthropic Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
anthropic/claude-3.5-sonnet | 200K | $0.003/1K | $0.015/1K |
anthropic/claude-3-opus | 200K | $0.015/1K | $0.075/1K |
anthropic/claude-3-haiku | 200K | $0.00025/1K | $0.00125/1K |
Google Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
google/gemini-1.5-pro | 2M | $0.00125/1K | $0.005/1K |
google/gemini-1.5-flash | 1M | $0.000075/1K | $0.0003/1K |
google/gemini-flash-1.5-8b | 1M | $0.0000375/1K | $0.00015/1K |
Meta Llama Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
meta-llama/llama-3.3-70b-instruct | 131K | $0.00035/1K | $0.0004/1K |
meta-llama/llama-3.2-90b-vision-instruct | 131K | $0.00035/1K | $0.0004/1K |
meta-llama/llama-3.2-11b-vision-instruct | 131K | $0.000055/1K | $0.000055/1K |
meta-llama/llama-3.2-3b-instruct | 131K | $0.00003/1K | $0.00005/1K |
Mistral AI Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
mistralai/mistral-large-latest | 128K | $0.002/1K | $0.006/1K |
mistralai/mixtral-8x22b-instruct | 64K | $0.00065/1K | $0.00065/1K |
mistralai/ministral-8b | 128K | $0.0000001/1K | $0.0000001/1K |
mistralai/ministral-3b | 128K | $0.00000004/1K | $0.00000004/1K |
Qwen Models
Model ID | Context | Prompt | Completion |
---|---|---|---|
qwen/qwen-2.5-72b-instruct | 131K | $0.00035/1K | $0.0004/1K |
qwen/qwen-2.5-7b-instruct | 131K | $0.00027/1K | $0.00027/1K |
qwen/qwen-2-vl-72b-instruct | 33K | $0.0004/1K | $0.0004/1K |
Phala Confidential AI Models
Native TEE models running entirely in GPU secure enclaves:
Model ID | Context | Prompt | Completion | Quantization |
---|---|---|---|---|
phala/deepseek-chat-v3-0324 | 164K | $0.00049/1K | $0.00114/1K | FP8 |
phala/gpt-oss-120b | 131K | $0.0001/1K | $0.00049/1K | FP8 |
phala/gpt-oss-20b | 131K | $0.0001/1K | $0.0004/1K | FP8 |
phala/qwen2.5-vl-72b-instruct | 128K | $0.00059/1K | $0.00059/1K | FP8 |
phala/qwen-2.5-7b-instruct | 33K | $0.00004/1K | $0.0001/1K | FP8 |
phala/gemma-3-27b-it | 54K | $0.00011/1K | $0.0004/1K | FP8 |
Learn About Confidential AI
Explore Phala TEE models in detail →
Vision Models
Models that understand images (a usage sketch follows the table):
Model ID | Context | Features |
---|---|---|
meta-llama/llama-3.2-90b-vision-instruct | 131K | High-quality vision |
meta-llama/llama-3.2-11b-vision-instruct | 131K | Efficient vision |
qwen/qwen-2-vl-72b-instruct | 33K | Chinese + English |
phala/qwen2.5-vl-72b-instruct | 128K | TEE-protected vision |
google/gemini-1.5-pro | 2M | Long context vision |
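The sketch below shows one way an image could be sent to a vision model through an OpenAI-compatible chat endpoint. The base URL, API key placeholder, and image URL are illustrative assumptions, not confirmed RedPill values.

```python
# Sketch: image understanding via the OpenAI-compatible chat endpoint.
# base_url is an assumption; use the endpoint from your RedPill dashboard.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.ai/v1",  # assumed endpoint
    api_key="YOUR_REDPILL_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.2-90b-vision-instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```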
Embedding Models
Generate vector embeddings for semantic search (a usage sketch follows the table):
Model ID | Dimensions | Max Tokens |
---|---|---|
openai/text-embedding-3-large | 3072 | 8191 |
openai/text-embedding-3-small | 1536 | 8191 |
openai/text-embedding-ada-002 | 1536 | 8191 |
cohere/embed-english-v3.0 | 1024 | 512 |
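A minimal sketch of generating embeddings, assuming the gateway mirrors the OpenAI embeddings API at the base URL below (both are assumptions):

```python
# Sketch: creating embeddings for semantic search.
# base_url is an assumption; swap in the official RedPill endpoint.
from openai import OpenAI

client = OpenAI(base_url="https://api.redpill.ai/v1", api_key="YOUR_REDPILL_API_KEY")

result = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input=["confidential AI inference", "TEE-protected API gateway"],
)
vectors = [item.embedding for item in result.data]
print(len(vectors), len(vectors[0]))  # 2 vectors, 1536 dimensions each
```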
Provider Coverage
Supported Providers
- OpenAI - GPT-4, GPT-3.5, o1, o3, embeddings
- Anthropic - Claude 3.5, Claude 3, Claude 2
- Google - Gemini 1.5 Pro/Flash, PaLM
- Meta - Llama 3.3, Llama 3.2, Llama 3.1
- Mistral - Large, Medium, Small, Mixtral
- Qwen - Qwen 2.5, Qwen-VL
- Phala - Confidential AI models in TEE
- Cohere - Command, Embed
- Perplexity - Sonar models
- NVIDIA - Nemotron models
- DeepSeek - Chat, Code models
- And 60+ more providers
Pricing Tiers
Free Models
Great for testing and low-volume use:
- mistralai/ministral-3b - $0.00000004/1K tokens
- mistralai/ministral-8b - $0.0000001/1K tokens
- meta-llama/llama-3.2-1b-instruct - $0.00000001/1K prompt
- liquid/lfm-40b - Free
Budget Models
Balance of cost and quality:
- openai/gpt-3.5-turbo - $0.0005/1K prompt
- anthropic/claude-3-haiku - $0.00025/1K prompt
- google/gemini-flash-1.5-8b - $0.0000375/1K prompt
- meta-llama/llama-3.2-3b-instruct - $0.00003/1K prompt
Premium Models
Best quality for production:
- openai/gpt-4-turbo - $0.01/1K prompt
- anthropic/claude-3.5-sonnet - $0.003/1K prompt
- google/gemini-1.5-pro - $0.00125/1K prompt
- phala/deepseek-chat-v3-0324 - $0.00049/1K prompt (TEE)
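All prices are per 1K tokens, so an estimated request cost is (prompt tokens / 1000) times the prompt price plus (completion tokens / 1000) times the completion price. A minimal sketch using the openai/gpt-3.5-turbo rates listed above:

```python
# Sketch: estimating request cost from per-1K-token prices.
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price_per_1k: float, completion_price_per_1k: float) -> float:
    return (prompt_tokens / 1000) * prompt_price_per_1k + \
           (completion_tokens / 1000) * completion_price_per_1k

# openai/gpt-3.5-turbo: $0.0005/1K prompt, $0.0015/1K completion
print(f"${estimate_cost(10_000, 2_000, 0.0005, 0.0015):.4f}")  # $0.0080
```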
Model Selection Guide
By Use Case
Chatbots & Assistants
Best: anthropic/claude-3.5-sonnet, openai/gpt-4-turbo
Budget: openai/gpt-3.5-turbo, meta-llama/llama-3.3-70b-instruct
Code Generation
Best: openai/gpt-4-turbo, anthropic/claude-3-opus
Budget: meta-llama/llama-3.3-70b-instruct, qwen/qwen-2.5-72b-instruct
Text Analysis
Best: anthropic/claude-3.5-sonnet, google/gemini-1.5-pro
Budget: google/gemini-1.5-flash, qwen/qwen-2.5-7b-instruct
Image Understanding
Best: meta-llama/llama-3.2-90b-vision-instruct, google/gemini-1.5-pro
Budget: meta-llama/llama-3.2-11b-vision-instruct
High-Privacy Workloads
TEE-Protected: All Phala models
- phala/deepseek-chat-v3-0324 - Best quality
- phala/gpt-oss-120b - OpenAI architecture
- phala/qwen-2.5-7b-instruct - Budget option
Long Context
Best: google/gemini-1.5-pro (2M tokens), google/gemini-1.5-flash (1M tokens)
Others: anthropic/claude-3.5-sonnet (200K), qwen/qwen-2.5-72b-instruct (131K)
Get Latest Models
Via API
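A minimal sketch over plain HTTP, assuming an OpenAI-style GET /v1/models endpoint at the base URL below (both the path and the URL are assumptions; check the API reference):

```python
# Sketch: listing available models over HTTP.
# Endpoint path and base URL are assumptions modeled on OpenAI-style APIs.
import requests

resp = requests.get(
    "https://api.redpill.ai/v1/models",  # assumed endpoint
    headers={"Authorization": "Bearer YOUR_REDPILL_API_KEY"},
)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])
```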
Via SDK
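The same listing through the OpenAI Python SDK pointed at the RedPill gateway (the base URL is again an assumption):

```python
# Sketch: listing models with the OpenAI SDK against the gateway.
from openai import OpenAI

client = OpenAI(base_url="https://api.redpill.ai/v1", api_key="YOUR_REDPILL_API_KEY")
for model in client.models.list():
    print(model.id)
```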
Model Properties
Each model includes the following properties (a read-out sketch follows the table):
Property | Description |
---|---|
id | Model identifier for API calls |
name | Human-readable name |
context_length | Maximum tokens in context window |
pricing.prompt | Cost per 1K prompt tokens |
pricing.completion | Cost per 1K completion tokens |
quantization | Model quantization (e.g., FP8, FP16) |
modality | Input/output types (text, image) |
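As an illustration of reading these fields from a raw models response; the endpoint and exact response shape are assumptions based on the property names above:

```python
# Sketch: reading per-model properties from the models endpoint.
# Endpoint and response shape are assumptions based on the fields listed above.
import requests

resp = requests.get(
    "https://api.redpill.ai/v1/models",  # assumed endpoint
    headers={"Authorization": "Bearer YOUR_REDPILL_API_KEY"},
)
for model in resp.json().get("data", []):
    pricing = model.get("pricing", {})
    print(model["id"], model.get("context_length"),
          pricing.get("prompt"), pricing.get("completion"))
```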
Model Compatibility
OpenAI SDK Compatible
All models work with the OpenAI SDK:
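A minimal sketch, assuming the gateway exposes an OpenAI-compatible endpoint at the base URL below (an assumption; substitute the documented URL) and that only the model ID changes between providers:

```python
# Sketch: calling any listed model with the standard OpenAI SDK.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.ai/v1",  # assumed endpoint
    api_key="YOUR_REDPILL_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",  # any model ID from the tables above
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```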
Streaming Support
All chat models support streaming:
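A streaming sketch under the same base-URL assumption, using the standard stream=True flag of the OpenAI SDK:

```python
# Sketch: streaming tokens as they are generated.
from openai import OpenAI

client = OpenAI(base_url="https://api.redpill.ai/v1", api_key="YOUR_REDPILL_API_KEY")

stream = client.chat.completions.create(
    model="openai/gpt-4-turbo",
    messages=[{"role": "user", "content": "Write a haiku about secure enclaves."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```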
Function Calling
Supported models (a usage sketch follows this list):
- All OpenAI GPT models
- Anthropic Claude 3+ models
- Google Gemini models
- Meta Llama 3.2+ models
- Mistral models
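A function-calling sketch in the OpenAI tools format, under the same base-URL assumption; the get_weather tool is a made-up example, not a RedPill feature:

```python
# Sketch: OpenAI-style function calling (tools) through the gateway.
# The get_weather tool is hypothetical, defined only for this example.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.redpill.ai/v1", api_key="YOUR_REDPILL_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="openai/gpt-4-turbo",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```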
FAQs
Are all 218 models protected by TEE?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala confidential models.
Can I use my favorite model?
If it’s one of the 218+ models, yes! If not, request it and we’ll consider adding it.
What's the difference between regular and Phala models?
- Regular models: TEE-protected gateway only
- Phala models: Full end-to-end TEE (gateway + inference in GPU TEE)
How often are new models added?
We add new models weekly. Check the API or docs for the latest additions.
Can I request a specific model?
Yes! Email support@redpill.ai with your model request.