
Overview

RedPill supports 48+ active AI models from 10 providers, all accessible through a single TEE-protected API. Every request is hardware-protected regardless of which model you choose.
Currently Active: OpenAI, Anthropic, Google, xAI, DeepSeek, Qwen, ZhipuAI, Meta, MoonshotAI, and NousResearch

List Models via API

Get the latest models programmatically ->

Model Categories

Chat Models

Conversational AI for chatbots and assistants

Instruction Models

Task completion and code generation

Vision Models

Image understanding and analysis

Embedding Models

Text embeddings for search and similarity

OpenAI Models

Model ID | Context | Prompt | Completion
openai/gpt-5.2 | 400K | $1.75/M | $14/M
openai/gpt-5.1 | 400K | $1.25/M | $10/M
openai/gpt-5 | 400K | $1.25/M | $10/M
openai/gpt-5-mini | 400K | $0.25/M | $2/M
openai/gpt-5-nano | 400K | $0.05/M | $0.4/M
openai/o3 | 200K | $2/M | $8/M
openai/o4-mini | 200K | $1.1/M | $4.4/M
openai/gpt-4.1 | 1M | $2/M | $8/M
openai/gpt-4.1-mini | 1M | $0.4/M | $1.6/M
openai/gpt-4.1-nano | 1M | $0.1/M | $0.4/M
openai/gpt-4o | 128K | $2.5/M | $10/M
openai/gpt-4o-mini | 128K | $0.15/M | $0.6/M
openai/gpt-4 | 8K | $30/M | $60/M
openai/gpt-3.5-turbo | 16K | $0.5/M | $1.5/M
New: GPT-5.2 is OpenAI’s latest flagship model with enhanced reasoning and improved performance across all benchmarks.

Anthropic Models

Model ID | Context | Prompt | Completion
anthropic/claude-opus-4.6 | 1M | $10/M | $37.5/M
anthropic/claude-sonnet-4.6 | 1M | $3/M | $15/M
anthropic/claude-opus-4.5 | 200K | $5/M | $25/M
anthropic/claude-sonnet-4.5 | 1M | $3/M | $15/M
anthropic/claude-opus-4.1 | 200K | $15/M | $75/M
anthropic/claude-opus-4 | 200K | $15/M | $75/M
anthropic/claude-sonnet-4 | 1M | $3/M | $15/M
anthropic/claude-haiku-4.5 | 200K | $1/M | $5/M
anthropic/claude-3.7-sonnet | 200K | $3/M | $15/M
anthropic/claude-3.5-haiku | 200K | $0.8/M | $4/M
New: Claude Opus 4.6 and Sonnet 4.6 are the latest Anthropic models. Claude Opus 4.5 and Haiku 4.5 are also now available.

Google Models

Model ID | Context | Prompt | Completion
google/gemini-3-pro-preview | 1M | $4/M | $18/M
google/gemini-2.5-pro | 1M | $2.5/M | $15/M
google/gemini-2.5-flash | 1M | $0.3/M | $2.5/M
google/gemini-2.5-flash-lite | 1M | $0.1/M | $0.4/M
New: Gemini 3 Pro Preview is Google’s next-generation model with advanced reasoning capabilities. Gemini 2.5 Pro pricing updated to $2.5/M prompt.

xAI Models

Model ID | Context | Prompt | Completion
x-ai/grok-4 | 256K | $3/M | $15/M
x-ai/grok-4.1-fast | 2M | $0.2/M | $0.5/M
x-ai/grok-code-fast-1 | 256K | $0.2/M | $1.5/M
New: Grok 4.1 Fast now supports a massive 2M token context window at just $0.2/M prompt.

DeepSeek Models

Model ID | Context | Prompt | Completion | Provider
deepseek/deepseek-v3.2 | 163K | $0.27/M | $0.4/M | Chutes
deepseek/deepseek-chat-v3.1 | 163K | $1/M | $2.5/M | Near AI
deepseek/deepseek-r1-0528 | 163K | $2/M | $2/M | Tinfoil
New: DeepSeek V3.2 is the latest version available through Chutes. DeepSeek R1 is a reasoning model available with Tinfoil TEE protection.

MoonshotAI Models

Model ID | Context | Prompt | Completion | Provider
moonshotai/kimi-k2.5 | 262K | $0.6/M | $3/M | Chutes
moonshotai/kimi-k2-thinking | 262K | $2/M | $2/M | Tinfoil
New: Kimi K2.5 is a native multimodal model with state-of-the-art visual coding. Kimi K2 Thinking is an advanced reasoning model optimized for long-horizon agentic tasks.

Qwen Models

Model ID | Context | Prompt | Completion | Provider
qwen/qwen3-coder-480b-a35b-instruct | 262K | $2/M | $2/M | Tinfoil
qwen/qwen3-vl-30b-a3b-instruct | 128K | $0.2/M | $0.7/M | Phala
qwen/qwen3-30b-a3b-instruct-2507 | 262K | $0.15/M | $0.45/M | Near AI
qwen/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M | Phala

ZhipuAI Models

Model ID | Context | Prompt | Completion | Provider
z-ai/glm-5 | 202K | $1.2/M | $3.5/M | Phala
z-ai/glm-4.7 | 131K | $0.85/M | $3.3/M | Near AI
z-ai/glm-4.7-flash | 202K | $0.1/M | $0.43/M | Phala
New: GLM-5 is ZhipuAI’s latest flagship model for systems engineering. GLM 4.7 Flash offers excellent speed-to-quality ratio with 202K context.

Other Models

Model ID | Context | Prompt | Completion | Provider
openai/gpt-oss-120b | 131K | $0.1/M | $0.49/M | Phala
openai/gpt-oss-20b | 131K | $0.04/M | $0.15/M | Phala
google/gemma-3-27b-it | 53K | $0.11/M | $0.4/M | Phala
meta-llama/llama-3.3-70b-instruct | 131K | $2/M | $2/M | Tinfoil
nousresearch/hermes-3-llama-3.1-405b | 131K | $1/M | $1/M | OpenRouter
phala/uncensored-24b | 32K | $0.2/M | $0.9/M | Phala

GPU TEE Confidential Models

RedPill offers 17 confidential AI models running entirely in GPU TEE across 4 providers:

Phala Network (8 models)

Model ID | Context | Prompt | Completion
phala/glm-5 | 202K | $1.2/M | $3.5/M
phala/gpt-oss-120b | 131K | $0.1/M | $0.49/M
phala/qwen3-vl-30b-a3b-instruct | 128K | $0.2/M | $0.7/M
phala/gemma-3-27b-it | 53K | $0.11/M | $0.4/M
phala/uncensored-24b | 32K | $0.2/M | $0.9/M
phala/gpt-oss-20b | 131K | $0.04/M | $0.15/M
phala/glm-4.7-flash | 202K | $0.1/M | $0.43/M
phala/qwen-2.5-7b-instruct | 32K | $0.04/M | $0.1/M

Tinfoil (4 models)

Model ID | Context | Prompt | Completion
deepseek/deepseek-r1-0528 | 163K | $2/M | $2/M
qwen/qwen3-coder-480b-a35b-instruct | 262K | $2/M | $2/M
moonshotai/kimi-k2-thinking | 262K | $2/M | $2/M
meta-llama/llama-3.3-70b-instruct | 131K | $2/M | $2/M

Near AI (3 models)

Model ID | Context | Prompt | Completion
deepseek/deepseek-chat-v3.1 | 163K | $1/M | $2.5/M
qwen/qwen3-30b-a3b-instruct-2507 | 262K | $0.15/M | $0.45/M
z-ai/glm-4.7 | 131K | $0.85/M | $3.3/M

Chutes (2 models)

Model ID | Context | Prompt | Completion
deepseek/deepseek-v3.2 | 163K | $0.27/M | $0.4/M
moonshotai/kimi-k2.5 | 262K | $0.6/M | $3/M

Learn About Confidential AI

Explore all 17 GPU TEE models in detail ->

Vision Models

Models that understand images:
Model ID | Context | Features | Provider
phala/qwen3-vl-30b-a3b-instruct | 128K | Vision + Text | Phala
phala/gemma-3-27b-it | 53K | Vision + Text | Phala
moonshotai/kimi-k2.5 | 262K | Vision + Text | Chutes
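
In the OpenAI-compatible API, image inputs are passed as multimodal content parts alongside text. A minimal sketch of building such a message (the helper name and example URL are illustrative, not part of any SDK):

```python
def vision_messages(prompt: str, image_url: str) -> list:
    """Build an OpenAI-style multimodal message: one text part + one image part."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]

# Pass the result to the standard chat endpoint, e.g.:
# client.chat.completions.create(
#     model="phala/qwen3-vl-30b-a3b-instruct",
#     messages=vision_messages("What is in this image?", "https://example.com/photo.jpg"),
# )
```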

Embedding Models

Generate vector embeddings for semantic search:
Model ID | Dimensions | Max Tokens
openai/text-embedding-3-large | 3072 | 8191
openai/text-embedding-3-small | 1536 | 8191
openai/text-embedding-ada-002 | 1536 | 8191
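
For semantic search, the returned vectors are typically compared with cosine similarity. A small sketch (the helper is illustrative; the commented lines show the standard OpenAI-style embeddings call):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# With the OpenAI SDK pointed at RedPill:
# resp = client.embeddings.create(
#     model="openai/text-embedding-3-small",
#     input=["search query", "candidate document"],
# )
# q, d = resp.data[0].embedding, resp.data[1].embedding
# score = cosine_similarity(q, d)  # higher = more similar
```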

Provider Coverage

Supported Providers

  • OpenAI - GPT-5.2, GPT-5.1, GPT-5, GPT-4.1, O3, O4-mini, GPT-OSS, embeddings
  • Anthropic - Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, Claude Opus 4.1, Claude Haiku 4.5, Claude 3.7 Sonnet, Claude 3.5 Haiku
  • Google - Gemini 3 Pro Preview, Gemini 2.5 Pro/Flash/Flash-Lite
  • xAI - Grok 4, Grok 4.1 Fast, Grok Code Fast
  • DeepSeek - DeepSeek V3.2, DeepSeek V3.1, DeepSeek R1
  • MoonshotAI - Kimi K2.5, Kimi K2 Thinking
  • Qwen - Qwen3 Coder 480B, Qwen3 VL, Qwen3 30B, Qwen 2.5
  • ZhipuAI - GLM-5, GLM-4.7, GLM-4.7 Flash
  • Meta - Llama 3.3 70B
  • NousResearch - Hermes 3 Llama 405B

Pricing Tiers

Budget Models

Lowest-cost options:
  • phala/qwen-2.5-7b-instruct - $0.04/M prompt (TEE)
  • phala/gpt-oss-20b - $0.04/M prompt (TEE)
  • openai/gpt-5-nano - $0.05/M prompt
  • phala/glm-4.7-flash - $0.1/M prompt (TEE)
  • google/gemini-2.5-flash-lite - $0.1/M prompt
  • openai/gpt-4o-mini - $0.15/M prompt

Mid-Range Models

Production-ready with strong quality:
  • openai/gpt-5-mini - $0.25/M prompt
  • google/gemini-2.5-flash - $0.3/M prompt
  • moonshotai/kimi-k2.5 - $0.6/M prompt
  • anthropic/claude-3.5-haiku - $0.8/M prompt

Premium Models

Best quality for production:
  • openai/gpt-5.2 - $1.75/M prompt
  • anthropic/claude-sonnet-4.5 - $3/M prompt
  • google/gemini-2.5-pro - $2.5/M prompt
  • anthropic/claude-opus-4.6 - $10/M prompt
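
Per-request cost follows directly from the per-million-token rates in the tiers above. A small illustrative helper (not part of any SDK; the gpt-5-mini rates are taken from the table):

```python
def estimate_cost(prompt_tokens, completion_tokens, prompt_per_m, completion_per_m):
    """Estimate request cost in dollars from per-million-token rates."""
    return (prompt_tokens * prompt_per_m
            + completion_tokens * completion_per_m) / 1_000_000

# e.g. openai/gpt-5-mini at $0.25/M prompt and $2/M completion:
cost = estimate_cost(10_000, 2_000, 0.25, 2.0)  # -> 0.0065 dollars
```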

Model Selection Guide

By Use Case

  • Best: anthropic/claude-sonnet-4.5, openai/gpt-5; Budget: openai/gpt-5-mini, phala/qwen-2.5-7b-instruct
  • Best: openai/gpt-5, anthropic/claude-opus-4.6; Budget: x-ai/grok-code-fast-1, phala/glm-4.7-flash
  • Best: anthropic/claude-sonnet-4.5, google/gemini-2.5-pro; Budget: google/gemini-2.5-flash, phala/qwen-2.5-7b-instruct
  • Best: phala/qwen3-vl-30b-a3b-instruct, moonshotai/kimi-k2.5; Budget: phala/gemma-3-27b-it
TEE-Protected: All Phala models
  • phala/glm-5 - Best quality
  • phala/gpt-oss-120b - OpenAI architecture
  • phala/qwen-2.5-7b-instruct - Budget option
  • Best: x-ai/grok-4.1-fast (2M tokens), google/gemini-2.5-pro (1M tokens), anthropic/claude-opus-4.6 (1M tokens); TEE: phala/glm-5 (202K tokens), phala/glm-4.7-flash (202K tokens)
  • Best: moonshotai/kimi-k2-thinking, deepseek/deepseek-r1-0528, openai/o3; Budget: phala/glm-4.7-flash, openai/o4-mini

Get Latest Models

Via API

# All models
curl https://api.redpill.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Phala confidential models only
curl https://api.redpill.ai/v1/models/phala \
  -H "Authorization: Bearer YOUR_API_KEY"

Via SDK

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.redpill.ai/v1"
)

# List all models
models = client.models.list()
for model in models.data:
    print(f"{model.id}")

# Filter for TEE models
import requests
resp = requests.get(
    "https://api.redpill.ai/v1/models",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)
for model in resp.json()["data"]:
    if "phala" in model.get("providers", []):
        print(f"TEE: {model['id']}")

Model Properties

Each model includes:
Property | Description
id | Model identifier for API calls
name | Human-readable name
context_length | Maximum tokens in context window
pricing.prompt | Cost per token (prompt)
pricing.completion | Cost per token (completion)
input_modalities | Input types (text, image, file, audio)
output_modalities | Output types (text)
providers | Array of infrastructure providers (phala, tinfoil, near-ai, etc.)
metadata.appid | Phala TEE application ID (for attestation)
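
A sketch of reading these properties from one entry of a /v1/models response (the field values below are illustrative, shaped after the table above; is_tee is a hypothetical helper, not part of the API):

```python
# Illustrative shape of one entry from GET /v1/models
model = {
    "id": "phala/gpt-oss-120b",
    "name": "GPT-OSS 120B",
    "context_length": 131072,
    "pricing": {"prompt": "0.0000001", "completion": "0.00000049"},
    "input_modalities": ["text"],
    "output_modalities": ["text"],
    "providers": ["phala"],
}

TEE_PROVIDERS = {"phala", "tinfoil", "near-ai", "chutes"}

def is_tee(m: dict) -> bool:
    """True if the model runs on at least one GPU TEE provider."""
    return bool(TEE_PROVIDERS & set(m.get("providers", [])))

# Per-token pricing converts to the per-million rates shown in the tables:
prompt_per_m = float(model["pricing"]["prompt"]) * 1_000_000  # ~0.1 ($/M)
```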

Model Compatibility

OpenAI SDK Compatible

All models work with the OpenAI SDK:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.redpill.ai/v1"
)

# Use any model
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello"}]
)

Streaming Support

All chat models support streaming:
stream = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

# Print tokens as they arrive
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Function Calling

Supported models:
  • All OpenAI GPT models
  • Anthropic Claude 3+ models
  • Google Gemini models
  • Meta Llama 3.2+ models
  • Phala GLM and GPT-OSS models
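
These models accept tool definitions in the standard OpenAI function-calling format. A sketch of one tool definition (the get_weather tool and its parameters are illustrative):

```python
# A tools array in OpenAI function-calling format (JSON Schema parameters)
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Pass it to any supported model:
# response = client.chat.completions.create(
#     model="openai/gpt-5-mini",
#     messages=[{"role": "user", "content": "Weather in Tokyo?"}],
#     tools=tools,
# )
# tool_call = response.choices[0].message.tool_calls[0]
```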

FAQs

Are all requests TEE-protected?
Yes! All requests flow through the TEE-protected gateway, regardless of model. For end-to-end TEE protection (including model inference), use Phala, Tinfoil, Near AI, or Chutes confidential models.

How do I find TEE models?
Three ways: (1) use the /v1/models/phala endpoint, (2) check the providers field for "phala", or (3) use the phala/ prefix in model IDs.

What is the difference between regular and TEE models?
  • Regular models: TEE-protected gateway only (your request is protected in transit)
  • TEE models (Phala/Tinfoil/Near AI/Chutes): full end-to-end TEE (gateway + inference in GPU TEE)

How often are new models added?
We add new models regularly. Check the API or docs for the latest additions.

Can I request a model that isn't listed?
Yes! Email support@redpill.ai with your model request.

Next Steps

Start Using Models

Make your first request

Confidential AI Models

Explore all TEE models in detail

API Reference

Models API endpoint

Pricing Details

Understand pricing