
What is Confidential AI?

Confidential AI refers to AI models that run entirely inside Trusted Execution Environments (TEEs), providing end-to-end privacy from input to output. Unlike regular models, where only the gateway is TEE-protected, confidential AI models run the entire inference process inside secure enclaves.

[Diagram: Confidential AI Architecture]

RedPill’s Two-Layer TEE Protection

RedPill offers dual privacy protection:

Layer 1: TEE-Protected Gateway (All Models)

  • ✅ Applies to all 60+ models across 66+ provider integrations
  • ✅ Request processing in TEE
  • ✅ Response handling in TEE
  • ✅ No additional cost

Layer 2: TEE-Protected Inference (Phala Models)

  • ✅ Model weights in GPU TEE
  • ✅ Inference computation in TEE
  • ✅ Complete end-to-end protection
  • ✅ Cryptographic attestation

  • 17 TEE Models - From 4 confidential providers
  • GPU TEE - NVIDIA H100/H200 secure enclaves
  • 4 Providers - Phala, Tinfoil, Near AI, Chutes
  • Verifiable - Cryptographic attestation

GPU TEE Providers

RedPill offers 17 confidential AI models across 4 GPU TEE providers:

Phala Network (8 models)

| Model | Parameters | Context | Use Case |
|---|---|---|---|
| phala/glm-5 | Large | 202K | Systems engineering |
| phala/gpt-oss-120b | 117B (MoE) | 131K | OpenAI-compatible |
| phala/qwen3-vl-30b-a3b-instruct | 30B (MoE) | 128K | Vision + language |
| phala/gemma-3-27b-it | 27B | 53K | Multilingual |
| phala/uncensored-24b | 24B | 32K | Unrestricted |
| phala/gpt-oss-20b | 21B (MoE) | 131K | Efficient inference |
| phala/glm-4.7-flash | ~30B | 202K | Agentic coding |
| phala/qwen-2.5-7b-instruct | 7B | 32K | Budget-friendly |

Tinfoil (4 models)

| Model | Parameters | Context | Use Case |
|---|---|---|---|
| deepseek/deepseek-r1-0528 | 685B (MoE) | 163K | Reasoning model |
| qwen/qwen3-coder-480b-a35b-instruct | 480B (MoE) | 262K | Code generation |
| moonshotai/kimi-k2-thinking | 1T (MoE) | 262K | Agentic reasoning |
| meta-llama/llama-3.3-70b-instruct | 70B | 131K | General purpose |

Near AI (3 models)

| Model | Parameters | Context | Use Case |
|---|---|---|---|
| deepseek/deepseek-chat-v3.1 | 671B (MoE) | 163K | Hybrid reasoning |
| qwen/qwen3-30b-a3b-instruct-2507 | 30B (MoE) | 262K | General purpose |
| z-ai/glm-4.7 | 130B | 131K | Bilingual (CN/EN) |

Chutes (2 models)

| Model | Parameters | Context | Use Case |
|---|---|---|---|
| deepseek/deepseek-v3.2 | 685B (MoE) | 163K | Latest DeepSeek |
| moonshotai/kimi-k2.5 | Large (MoE) | 262K | Visual coding |

All TEE Model Details

Explore all 17 confidential models →

How It Works

1. Model Loading in TEE

Model weights are decrypted only inside the GPU TEE.

2. Request Processing

Every node in the request path is TEE-protected: your data never leaves hardware security.

3. Cryptographic Attestation

Every request generates verifiable proof:

```shell
# Get attestation report
curl "https://api.redpill.ai/v1/attestation/report?model=phala/qwen-2.5-7b-instruct" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Returns:
  • GPU TEE measurements - Proves genuine NVIDIA H100 TEE
  • Model hash - Verifies exact model version
  • Code hash - Confirms inference code integrity
  • Cryptographic signature - Signed by TEE hardware
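On the client side, a minimal structural check of the attestation payload can catch malformed reports before any deeper verification. The field names below are assumptions inferred from the bullets above and the `signing_address` used in the example later on this page; consult the verification guide for the authoritative schema.

```python
# Hypothetical attestation schema: field names are illustrative assumptions.
REQUIRED_FIELDS = (
    "gpu_measurements",   # proves a genuine NVIDIA H100/H200 TEE
    "model_hash",         # pins the exact model version
    "code_hash",          # confirms inference code integrity
    "signing_address",    # key the TEE hardware signs with
)

def check_attestation(report: dict) -> list:
    """Return the required fields missing from an attestation report."""
    return [field for field in REQUIRED_FIELDS if field not in report]

# A report carrying every field passes the structural check.
sample = {
    "gpu_measurements": {"gpu0": "<measurement>"},
    "model_hash": "sha256:<model digest>",
    "code_hash": "sha256:<code digest>",
    "signing_address": "0x<tee signing key>",
}
print(check_attestation(sample))  # -> []
```

This only checks shape; the cryptographic signature itself must still be verified as described in the attestation guide.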

Verify Attestation

Learn how to verify TEE proofs →

Privacy Guarantees

What CANNOT Be Accessed

Even with full system access, nobody can see:
| Data Type | Accessible? | Protection |
|---|---|---|
| Your prompts | ❌ No | GPU TEE encrypted |
| Model responses | ❌ No | GPU TEE encrypted |
| Model weights | ❌ No | Encrypted at rest & in-use |
| Intermediate activations | ❌ No | GPU TEE memory isolation |
| Gradients (fine-tuning) | ❌ No | TEE-protected |

Trust Model

You must trust:
  • NVIDIA GPU vendor - H100/H200 TEE correctness
  • Phala Network - Model deployment integrity
  • Open source code - Auditable on GitHub
You do NOT need to trust:
  • ❌ RedPill operators
  • ❌ Cloud provider (AWS, GCP, Azure)
  • ❌ System administrators
  • ❌ Other users on same hardware

Performance

Near-Native Speed

GPU TEE adds minimal overhead:
| Metric | Native | TEE Mode | Overhead |
|---|---|---|---|
| Throughput | 100 tok/s | 99 tok/s | ~1% |
| Latency | 50 ms | 51 ms | ~2% |
| TFLOPS | 1979 | 1959 | ~1% |

99% efficiency on NVIDIA H100 GPUs.
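The overhead column is just the relative change between the native and TEE columns; a quick sketch reproducing the figures from the table above:

```python
def overhead_pct(native: float, tee: float) -> float:
    """Relative throughput drop of TEE mode versus native, in percent."""
    return (native - tee) / native * 100

# Values from the table above:
print(round(overhead_pct(100, 99), 1))     # throughput: 1.0 (%)
print(round(overhead_pct(1979, 1959), 1))  # TFLOPS: 1.0 (%)

# Latency overhead is the relative *increase*, so the ratio flips:
print(round((51 - 50) / 50 * 100, 1))      # latency: 2.0 (%)
```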

Benchmark Results

See detailed performance benchmarks →

Use Cases

Healthcare

Process patient data with HIPAA compliance

Financial Services

Analyze confidential financial data

Legal

Handle privileged communications

Enterprise AI

Protect trade secrets and IP

Government

Classified data processing

Research

Sensitive research data analysis

Example Usage

```python
import requests
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.redpill.ai/v1"
)

# Use a Phala confidential model
response = client.chat.completions.create(
    model="phala/glm-5",
    messages=[
        {
            "role": "user",
            "content": "Analyze this confidential financial report: ..."
        }
    ]
)

print(response.choices[0].message.content)

# Verify TEE execution: fetch the attestation report for the model
attestation = requests.get(
    "https://api.redpill.ai/v1/attestation/report",
    params={"model": "phala/glm-5"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
print("Signing Address:", attestation.json()["signing_address"])
```

vs Regular Models

| Feature | Regular Models | Confidential AI Models |
|---|---|---|
| Gateway TEE | ✅ Yes | ✅ Yes |
| Inference TEE | ❌ No | ✅ Yes |
| Model in TEE | ❌ No | ✅ Yes |
| End-to-end TEE | ❌ No | ✅ Yes |
| Attestation | ✅ Gateway only | ✅ Full stack |
| Model count | 48+ | 17 |
| Price | Provider pricing | Competitive |

Integration with Phala Network

RedPill’s confidential AI is powered by Phala Network, pioneers in:
  • GPU TEE - First GPU-based confidential computing
  • Verifiable AI - Cryptographic proof of execution
  • dstack - Open source TEE infrastructure
  • Decentralized - Distributed trust model

Phala Documentation

Learn more about Phala’s TEE technology →

Compliance

Confidential AI helps meet regulatory requirements:
  • HIPAA - Healthcare data protection
  • GDPR - European data privacy
  • CCPA - California privacy law
  • SOC 2 - Security controls
  • ISO 27001 - Information security
  • FedRAMP - US government (in progress)

FAQs

What's the difference between the TEE gateway and confidential AI?
  • Gateway TEE: Protects request routing for all 60+ models (66+ provider integrations available)
  • Confidential AI: Protects the entire inference pipeline (Phala, Tinfoil, Near AI, Chutes models)
For maximum privacy, use confidential AI models.

Is confidential AI slower?
No. TEE mode runs at 99% of native speed, so the performance impact is minimal.

Can I fine-tune a confidential model?
Custom fine-tuning is available for enterprise customers. Contact sales@redpill.ai.

How do I verify that my request ran in a TEE?
Use the attestation API to get cryptographic proof. See the Attestation Guide.

Which confidential model should I choose?
  • Best quality: phala/glm-5 - Systems engineering flagship
  • OpenAI-compatible: phala/gpt-oss-120b (117B MoE)
  • Reasoning: deepseek/deepseek-r1-0528 (685B MoE, Tinfoil)
  • Vision + language: phala/qwen3-vl-30b-a3b-instruct (30B MoE)
  • Budget-friendly: phala/qwen-2.5-7b-instruct (7B)
  • Lowest cost: deepseek/deepseek-v3.2 ($0.27/M, Chutes)

Can I deploy my own model in GPU TEE?
Yes. Enterprise customers can deploy custom models in GPU TEE. Contact sales@redpill.ai.
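The model picks above can be wrapped in a small lookup for application code. This is a hypothetical convenience helper, not part of the RedPill API; the model IDs come from the tables on this page.

```python
# Hypothetical mapping from use-case keyword to confidential model ID,
# following the recommendations above.
RECOMMENDED = {
    "best_quality": "phala/glm-5",
    "openai_compatible": "phala/gpt-oss-120b",
    "reasoning": "deepseek/deepseek-r1-0528",
    "vision": "phala/qwen3-vl-30b-a3b-instruct",
    "budget": "phala/qwen-2.5-7b-instruct",
    "lowest_cost": "deepseek/deepseek-v3.2",
}

def recommend(use_case: str) -> str:
    """Return the recommended confidential model ID for a use case."""
    return RECOMMENDED[use_case]

print(recommend("budget"))  # -> phala/qwen-2.5-7b-instruct
```

The returned ID can be passed directly as the `model` argument in the example usage shown earlier on this page.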

Next Steps

Phala Models

Explore all 17 confidential models

Attestation

Verify TEE execution

Verification

Signature verification guide

Get Started

Start using confidential AI