Deploy machine learning in production

Cortex makes it simple to deploy machine learning models in production.

What you can do with Cortex

Write APIs in Python

Define any real-time or batch inference pipeline as simple Python APIs, regardless of framework or hardware.
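As a rough illustration of what "a simple Python API" could look like, here is a minimal sketch of a predictor class. The class name `PythonPredictor`, the `config` dictionary, the `payload` shape, and the toy scoring logic are all assumptions made for this example; this page does not specify Cortex's actual interface, so check the Cortex documentation before relying on any of these names.

```python
class PythonPredictor:
    """Hypothetical predictor: constructed once, then called once per request."""

    def __init__(self, config):
        # `config` is assumed to be a dict of user-supplied settings.
        self.labels = config.get("labels", ["negative", "positive"])
        self.threshold = config.get("threshold", 0.5)
        # Stand-in for loading a real model: a tiny positive-word lexicon.
        self.positive_words = {"good", "great", "love"}

    def _score(self, text):
        # Fraction of words that appear in the positive lexicon.
        words = text.lower().split()
        if not words:
            return 0.0
        return sum(w in self.positive_words for w in words) / len(words)

    def predict(self, payload):
        # `payload` is assumed to be the parsed JSON body of the request.
        score = self._score(payload["text"])
        label = self.labels[1] if score >= self.threshold else self.labels[0]
        return {"label": label, "score": score}


if __name__ == "__main__":
    predictor = PythonPredictor({"threshold": 0.3})
    print(predictor.predict({"text": "I love this great product"}))
```

The split between `__init__` (expensive setup such as loading weights) and `predict` (per-request work) is the general pattern for serving code; a real deployment would replace the lexicon with a model from whichever framework you use.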

Configure infrastructure in YAML

Define autoscaling behavior, monitoring, compute resources, update strategies, and more—or use Cortex's defaults.

Focus on machine learning

You don't need to bring your own cluster or containerize your models—Cortex automates your cloud infrastructure.

Scale to handle production traffic

Handle traffic with request-based autoscaling. Optimize spend with spot instances and multi-model APIs.
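To make "request-based autoscaling" concrete, the sketch below shows one common way such a policy can be expressed: divide the number of in-flight requests by a per-replica target and clamp the result. This is a generic illustration, not Cortex's actual algorithm; the function name, parameters, and default limits are hypothetical.

```python
import math


def desired_replicas(in_flight_requests, target_per_replica,
                     min_replicas=1, max_replicas=10):
    """Return a replica count for the current load (illustrative policy only)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Replicas needed so that each one handles at most the target concurrency.
    needed = math.ceil(in_flight_requests / target_per_replica)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    for load in (0, 25, 1000):
        print(load, desired_replicas(load, target_per_replica=4))
```

Production autoscalers usually add smoothing, such as averaging load over a window and rate-limiting scale-downs, so that short traffic spikes do not cause replicas to thrash.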
How AI Dungeon scaled to 1,000,000 users with Cortex
Case study

Integrate with your stack

Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.

Run on your AWS account

Run Cortex on your AWS account (GCP coming soon), maintaining control over resource utilization and data access.

Get started

Open Source

Run Cortex on your AWS account
Spot instances
Rolling updates
Log streaming
Prediction monitoring


Coming soon
Get Cortex as a service
Join the waitlist
Fully managed
Team workflows
Admin dashboard
Advanced security
Customer success