Overview

RedPill supports 50+ AI models from leading providers, all accessible through a single TEE-protected API. Every request is hardware-protected regardless of which model you choose.

List Models via API

Get the latest models programmatically →

Model Categories

Chat Models

Conversational AI for chatbots and assistants

Instruction Models

Task completion and code generation

Vision Models

Image understanding and analysis

Embedding Models

Text embeddings for search and similarity

OpenAI Models

Model ID              Context   Prompt    Completion
openai/gpt-5          400K      $1.25/M   $10/M
openai/gpt-5-mini     400K      $0.25/M   $2/M
openai/gpt-5-nano     400K      $0.05/M   $0.4/M
openai/o4-mini        200K      $1.1/M    $4.4/M
openai/o3             200K      $2/M      $8/M
openai/gpt-4.1        1M        $2/M      $8/M
openai/gpt-4.1-mini   1M        $0.4/M    $1.6/M

Anthropic Models

Model ID                      Context   Prompt   Completion
anthropic/claude-sonnet-4.5   1M        $3/M     $15/M
anthropic/claude-opus-4.1     200K      $15/M    $75/M
anthropic/claude-opus-4       200K      $15/M    $75/M
anthropic/claude-sonnet-4     1M        $3/M     $15/M
anthropic/claude-3.7-sonnet   200K      $3/M     $15/M
anthropic/claude-3.5-haiku    200K      $0.8/M   $4/M

Google Models

Model ID                       Context   Prompt    Completion
google/gemini-2.5-pro          1M        $1.25/M   $10/M
google/gemini-2.5-flash        1M        $0.3/M    $2.5/M
google/gemini-2.5-flash-lite   1M        $0.1/M    $0.4/M
google/gemma-3-27b-it          53K       $0.11/M   $0.4/M

Qwen Models

Model ID                           Context   Prompt    Completion
qwen/qwen2.5-vl-72b-instruct       64K       $0.59/M   $0.59/M
qwen/qwen-2.5-7b-instruct          32K       $0.04/M   $0.1/M
qwen/qwen3-vl-235b-a22b-instruct   131K      $0.3/M    $1.49/M

Phala Confidential AI Models

Native TEE models running entirely in GPU secure enclaves:

Model ID                            Context   Prompt    Completion   Quantization
phala/deepseek-chat-v3-0324         163K      $0.28/M   $1.14/M      FP8
phala/gemma-3-27b-it                53K       $0.11/M   $0.4/M       FP8
phala/gpt-oss-120b                  131K      $0.1/M    $0.49/M      FP8
phala/gpt-oss-20b                   131K      $0.04/M   $0.15/M      FP8
phala/qwen-2.5-7b-instruct          32K       $0.04/M   $0.1/M       FP8
phala/qwen2.5-vl-72b-instruct       128K      $0.59/M   $0.59/M      FP8
phala/qwen3-vl-235b-a22b-instruct   131K      $0.3/M    $1.49/M      FP8
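Switching to a confidential model is only a model-ID change: the request is the same OpenAI-compatible chat-completions call used elsewhere on this page. A minimal sketch (the helper name and prompt text are illustrative):

```python
import json

def chat_request_body(model: str, prompt: str, stream: bool = False) -> dict:
    # Standard OpenAI-compatible chat-completions body; only the
    # model ID selects the TEE-protected Phala deployment.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

body = chat_request_body("phala/gpt-oss-20b", "Hello from a GPU enclave")
print(json.dumps(body, indent=2))
# POST this body to https://api.redpill.ai/v1/chat/completions
# with an "Authorization: Bearer YOUR_API_KEY" header.
```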

Learn About Confidential AI

Explore Phala TEE models in detail →

Vision Models

Models that understand images:
Model ID                            Context   Features
phala/qwen2.5-vl-72b-instruct       128K      TEE-protected vision
phala/qwen3-vl-235b-a22b-instruct   131K      TEE-protected vision
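Vision models take OpenAI-style multimodal messages, where the user content is a list of text and image parts. A minimal sketch, assuming an image reachable by URL (the helper name and URL are illustrative):

```python
def vision_messages(image_url: str, question: str) -> list:
    # One user turn with a text part and an image_url part, in the
    # OpenAI multimodal content format.
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]

messages = vision_messages("https://example.com/chart.png",
                           "What does this chart show?")

# Pass the messages to any vision model on this page, e.g.:
# response = client.chat.completions.create(
#     model="phala/qwen2.5-vl-72b-instruct",
#     messages=messages,
# )
```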

Embedding Models

Generate vector embeddings for semantic search:
Model ID                        Dimensions   Max Tokens
openai/text-embedding-3-large   3072         8191
openai/text-embedding-3-small   1536         8191
openai/text-embedding-ada-002   1536         8191
cohere/embed-english-v3.0       1024         512
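Once you have vectors back (e.g. from `client.embeddings.create(model="openai/text-embedding-3-small", input=[...])`), semantic search ranks documents by cosine similarity. A pure-Python sketch of the scoring step:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # dot(a, b) / (|a| * |b|); 1.0 = identical direction, 0.0 = orthogonal.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

For real workloads a vectorized library (NumPy, a vector database) does the same math at scale; the formula is the point here.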

Provider Coverage

Supported Providers

  • OpenAI - GPT-5, GPT-4.1, o3, o4-mini, embeddings
  • Anthropic - Claude Sonnet 4.5, Claude Opus 4.1, Claude 3.7, Claude 3.5 Haiku
  • Google - Gemini 2.5 Pro/Flash/Flash-Lite
  • Qwen - Qwen 2.5, Qwen-VL, Qwen 3
  • Phala - Confidential AI models in TEE
  • Cohere - Command, Embed
  • DeepSeek - Chat, Code models
  • And more providers

Pricing Tiers

Free and Near-Free Models

Great for testing and low-volume use:
  • liquid/lfm-40b - Free
  • qwen/qwen-2.5-7b-instruct - $0.04/M prompt

Budget Models

Balance of cost and quality:
  • openai/gpt-3.5-turbo - $0.5/M prompt
  • openai/gpt-4o-mini - $0.15/M prompt
  • google/gemini-2.5-flash-lite - $0.1/M prompt
  • qwen/qwen-2.5-7b-instruct - $0.04/M prompt

Premium Models

Best quality for production:
  • openai/gpt-5 - $1.25/M prompt
  • anthropic/claude-sonnet-4.5 - $3/M prompt
  • google/gemini-2.5-pro - $1.25/M prompt
  • phala/deepseek-chat-v3-0324 - $0.28/M prompt (TEE)
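The $/M rates above make per-request costs easy to estimate: multiply each token count by its rate and divide by one million. A sketch using rates from this page (the token counts are made-up examples):

```python
# USD per million tokens (prompt, completion), copied from the tables above.
PRICES_PER_M = {
    "openai/gpt-5": (1.25, 10.0),
    "anthropic/claude-sonnet-4.5": (3.0, 15.0),
    "phala/deepseek-chat-v3-0324": (0.28, 1.14),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    # Cost = tokens * rate / 1M, summed over prompt and completion.
    prompt_rate, completion_rate = PRICES_PER_M[model]
    return (prompt_tokens * prompt_rate
            + completion_tokens * completion_rate) / 1_000_000

# 10K prompt tokens + 2K completion tokens on gpt-5:
print(request_cost("openai/gpt-5", 10_000, 2_000))  # 0.0325
```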

Model Selection Guide

By Use Case

  • Best: anthropic/claude-sonnet-4.5, openai/gpt-5. Budget: openai/gpt-5-mini, qwen/qwen-2.5-7b-instruct.
  • Best: openai/gpt-5, anthropic/claude-opus-4.1. Budget: qwen/qwen2.5-vl-72b-instruct, google/gemini-2.5-flash.
  • Best: anthropic/claude-sonnet-4.5, google/gemini-2.5-pro. Budget: google/gemini-2.5-flash, qwen/qwen-2.5-7b-instruct.
  • Vision: best: qwen/qwen3-vl-235b-a22b-instruct, qwen/qwen2.5-vl-72b-instruct. Budget: phala/qwen2.5-vl-72b-instruct.
  • TEE-protected: all Phala models.
      • phala/deepseek-chat-v3-0324 - Best quality
      • phala/gpt-oss-120b - OpenAI architecture
      • phala/qwen-2.5-7b-instruct - Budget option
  • Long context: best: google/gemini-2.5-pro (1M tokens), anthropic/claude-sonnet-4.5 (1M tokens). Others: openai/gpt-4.1 (1M tokens), qwen/qwen3-vl-235b-a22b-instruct (131K tokens).

Get Latest Models

Via API

# All models
curl https://api.redpill.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Phala confidential models only
curl https://api.redpill.ai/v1/models/phala \
  -H "Authorization: Bearer YOUR_API_KEY"

Via SDK

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.redpill.ai/v1"
)

# List all models (the SDK's Model object exposes the id)
models = client.models.list()
for model in models.data:
    print(model.id)

Model Properties

Each model includes:
Property             Description
id                   Model identifier for API calls
name                 Human-readable name
context_length       Maximum tokens in the context window
pricing.prompt       Cost per 1K prompt tokens
pricing.completion   Cost per 1K completion tokens
quantization         Model quantization (e.g., FP8, FP16)
modality             Input/output types (text, image)
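To show how those properties look in practice, here is an illustrative entry as it might come back from the models endpoint. The field names match the table above; the concrete values are examples only, not guaranteed API output:

```python
import json

# Hypothetical /v1/models entry for illustration; field names from the
# property table above, values made up to be consistent with this page.
sample = json.loads("""{
  "id": "phala/gpt-oss-20b",
  "name": "GPT-OSS 20B",
  "context_length": 131072,
  "pricing": {"prompt": "0.00004", "completion": "0.00015"},
  "quantization": "FP8",
  "modality": "text"
}""")

print(sample["id"], sample["context_length"])
print("prompt $/1K tokens:", sample["pricing"]["prompt"])
```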

Model Compatibility

OpenAI SDK Compatible

All models work with the OpenAI SDK:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.redpill.ai/v1"
)

# Use any model
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",  # ✅ Works
    messages=[{"role": "user", "content": "Hello"}]
)

Streaming Support

All chat models support streaming:
stream = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

# Print tokens as they arrive
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Function Calling

Supported models:
  • All OpenAI GPT models
  • Anthropic Claude 3+ models
  • Google Gemini models
  • Meta Llama 3.2+ models
  • Mistral models
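Function calling follows the OpenAI tools format: you pass a JSON Schema description of each function, and the model may respond with a tool call instead of text. A minimal sketch (the function name and parameters, get_weather and city, are made up for illustration):

```python
# One OpenAI-style tool definition: a function described by JSON Schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Pass `tools` to chat.completions.create on any supporting model:
# response = client.chat.completions.create(
#     model="openai/gpt-5",
#     messages=[{"role": "user", "content": "Weather in Paris?"}],
#     tools=tools,
# )
```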

FAQs

Are all requests TEE-protected?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala confidential models.

Do you support my model?
If it's one of the 50+ models, yes! If not, request it and we'll consider adding it.

What's the difference between regular and Phala models?
Regular models: TEE-protected gateway only. Phala models: full end-to-end TEE (gateway plus inference in a GPU TEE).

How often are new models added?
We add new models weekly. Check the API or docs for the latest additions.

Can I request a specific model?
Yes! Email support@redpill.ai with your model request.

Next Steps