What is Confidential AI?
Confidential AI refers to AI models that run entirely inside Trusted Execution Environments (TEE), providing end-to-end privacy from input to output. Unlike regular models where only the gateway is TEE-protected, confidential AI models run the entire inference process inside secure enclaves.
RedPill’s Two-Layer TEE Protection
RedPill offers dual privacy protection:Layer 1: TEE-Protected Gateway (All Models)
✅ Applies to all 218+ models ✅ Request processing in TEE ✅ Response handling in TEE ✅ No additional costLayer 2: TEE-Protected Inference (Phala Models)
✅ Model weights in GPU TEE ✅ Inference computation in TEE ✅ Complete end-to-end protection ✅ Cryptographic attestation6 Phala Models
Native confidential AI models
GPU TEE
NVIDIA H100/H200 secure enclaves
FP8 Quantization
99% native performance
Verifiable
Cryptographic attestation
Phala Confidential Models
RedPill offers 6 confidential AI models powered by Phala Network:Model | Parameters | Context | Use Case |
---|---|---|---|
phala/deepseek-chat-v3-0324 | 685B (MoE) | 164K | Advanced reasoning |
phala/gpt-oss-120b | 117B (MoE) | 131K | OpenAI architecture |
phala/gpt-oss-20b | 21B (MoE) | 131K | Efficient inference |
phala/qwen2.5-vl-72b-instruct | 72B | 128K | Vision + language |
phala/qwen-2.5-7b-instruct | 7B | 33K | Budget-friendly |
phala/gemma-3-27b-it | 27B | 54K | Multilingual |
Model Details
Explore all Phala models →
How It Works
1. Model Loading in TEE
Model weights are decrypted only inside the GPU TEE.2. Request Processing
All pink nodes are TEE-protected - your data never leaves hardware security.3. Cryptographic Attestation
Every request generates verifiable proof:- GPU TEE measurements - Proves genuine NVIDIA H100 TEE
- Model hash - Verifies exact model version
- Code hash - Confirms inference code integrity
- Cryptographic signature - Signed by TEE hardware
Verify Attestation
Learn how to verify TEE proofs →
Privacy Guarantees
What CANNOT Be Accessed
Even with full system access, nobody can see:Data Type | Accessible? | Protection |
---|---|---|
Your prompts | ❌ No | GPU TEE encrypted |
Model responses | ❌ No | GPU TEE encrypted |
Model weights | ❌ No | Encrypted at rest & in-use |
Intermediate activations | ❌ No | GPU TEE memory isolation |
Gradients (fine-tuning) | ❌ No | TEE-protected |
Trust Model
You must trust:- NVIDIA GPU vendor - H100/H200 TEE correctness
- Phala Network - Model deployment integrity
- Open source code - Auditable on GitHub
- ❌ RedPill operators
- ❌ Cloud provider (AWS, GCP, Azure)
- ❌ System administrators
- ❌ Other users on same hardware
Performance
Near-Native Speed
GPU TEE adds minimal overhead:Metric | Native | TEE Mode | Overhead |
---|---|---|---|
Throughput | 100 tok/s | 99 tok/s | ~1% |
Latency | 50ms | 51ms | ~2% |
TFLOPS | 1979 | 1959 | ~1% |
Benchmark Results
See detailed performance benchmarks →
Use Cases
Healthcare
Process patient data with HIPAA compliance
Financial Services
Analyze confidential financial data
Legal
Handle privileged communications
Enterprise AI
Protect trade secrets and IP
Government
Classified data processing
Research
Sensitive research data analysis
Example Usage
vs Regular Models
Feature | Regular Models | Phala Confidential Models |
---|---|---|
Gateway TEE | ✅ Yes | ✅ Yes |
Inference TEE | ❌ No | ✅ Yes |
Model in TEE | ❌ No | ✅ Yes |
End-to-end TEE | ❌ No | ✅ Yes |
Attestation | ✅ Gateway only | ✅ Full stack |
Model count | 218+ | 6 |
Price | Provider pricing | Competitive |
Integration with Phala Network
RedPill’s confidential AI is powered by Phala Network, pioneers in:- GPU TEE - First GPU-based confidential computing
- Verifiable AI - Cryptographic proof of execution
- dstack - Open source TEE infrastructure
- Decentralized - Distributed trust model
Phala Documentation
Learn more about Phala’s TEE technology →
Compliance
Confidential AI helps meet regulatory requirements:- HIPAA - Healthcare data protection
- GDPR - European data privacy
- CCPA - California privacy law
- SOC 2 - Security controls
- ISO 27001 - Information security
- FedRAMP - US government (in progress)
FAQs
What's the difference between gateway TEE and confidential AI?
What's the difference between gateway TEE and confidential AI?
- Gateway TEE: Protects request routing (all 218+ models)
- Confidential AI: Protects entire inference (Phala models only)
Are Phala models slower?
Are Phala models slower?
No! TEE mode runs at 99% of native speed. Performance impact is minimal.
Can I fine-tune Phala models?
Can I fine-tune Phala models?
Custom fine-tuning is available for enterprise customers. Contact sales@redpill.ai
How do I verify TEE execution?
How do I verify TEE execution?
Use the attestation API to get cryptographic proof. See Attestation Guide.
Which model should I choose?
Which model should I choose?
- Best quality:
phala/deepseek-chat-v3-0324
(685B) - OpenAI-like:
phala/gpt-oss-120b
(117B) - Vision:
phala/qwen2.5-vl-72b-instruct
(72B) - Budget:
phala/qwen-2.5-7b-instruct
(7B)
Can I add custom Phala models?
Can I add custom Phala models?
Yes! Enterprise customers can deploy custom models in GPU TEE. Contact sales@redpill.ai