
Managed Compute Deployments

🗂️ The Model Catalog 8 min 70 BASE XP

Deploy Models to Your Own Infrastructure

For models not available as serverless APIs, or when you need full control, use Managed Compute deployments.

Serverless vs Managed Compute

Aspect          Serverless API                Managed Compute
Infrastructure  Fully managed by Microsoft    You manage VM quota
Billing         Per-token / PTU               Per-hour (VM hosting)
Setup           Minutes                       15-30 minutes
Control         Limited                       Full (GPU type, scaling)
Best For        OpenAI models, quick starts   Open-source models, custom configs
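
The billing row can be made concrete with a little arithmetic. The sketch below compares a per-token bill with a per-hour VM bill and estimates the monthly token volume at which the two cost the same. All prices are hypothetical placeholders chosen for illustration, not Azure list prices, and the function names are not part of any SDK.

```python
HOURS_PER_MONTH = 730  # common approximation of the hours in one month


def serverless_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Per-token billing: cost scales with the number of tokens processed."""
    return tokens / 1000 * price_per_1k_tokens


def managed_compute_cost(hours: float, price_per_vm_hour: float,
                         instance_count: int = 1) -> float:
    """Per-hour billing: cost scales with VM uptime, not with traffic."""
    return hours * price_per_vm_hour * instance_count


def break_even_tokens(hours: float, price_per_vm_hour: float,
                      price_per_1k_tokens: float,
                      instance_count: int = 1) -> float:
    """Token volume at which serverless spend equals the VM hosting bill."""
    vm_bill = managed_compute_cost(hours, price_per_vm_hour, instance_count)
    return vm_bill / price_per_1k_tokens * 1000


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    vm_hourly = 4.00      # assumed cost of one GPU VM per hour
    per_1k_tokens = 0.002  # assumed serverless price per 1,000 tokens

    tokens = break_even_tokens(HOURS_PER_MONTH, vm_hourly, per_1k_tokens)
    print(f"Break-even volume: about {tokens:,.0f} tokens per month")
```

Below the break-even volume, per-token billing is the cheaper option under these assumptions; above it, the fixed hourly charge wins. The hourly charge accrues for as long as the VM is running, even when no requests arrive.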

Under the hood, managed compute uses Azure ML online endpoints: the model is deployed to VMs whose SKU determines the GPU you get (for example, A100 or H100).

🚧 Important: Managed compute requires VM quota approval in your Azure subscription. Request quota for GPU SKUs (e.g., Standard_NC24ads_A100_v4) before attempting deployment — approval can take 1-3 business days.
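
Before filing a quota request, it can help to estimate how much quota a deployment needs. The sketch below assumes quota is counted in vCPUs per VM family and that Standard_NC24ads_A100_v4 provides 24 vCPUs per instance. The quota figures in the usage example are hypothetical, the helper names are not part of any Azure SDK, and the calculation ignores any extra headroom the platform may reserve.

```python
# vCPUs per instance for the SKUs you plan to use (assumption: the "24" in
# Standard_NC24ads_A100_v4 is its vCPU count). Add other SKUs as needed.
VCPUS_PER_INSTANCE = {
    "Standard_NC24ads_A100_v4": 24,
}


def required_vcpus(sku: str, instance_count: int) -> int:
    """Total vCPUs consumed by `instance_count` instances of `sku`."""
    if sku not in VCPUS_PER_INSTANCE:
        raise KeyError(f"Unknown SKU {sku!r}; add it to VCPUS_PER_INSTANCE")
    return VCPUS_PER_INSTANCE[sku] * instance_count


def quota_shortfall(sku: str, instance_count: int,
                    quota_limit: int, quota_used: int) -> int:
    """Extra vCPUs to request, or 0 if the current quota already suffices."""
    available = quota_limit - quota_used
    return max(0, required_vcpus(sku, instance_count) - available)


if __name__ == "__main__":
    # Hypothetical subscription state: 24 vCPUs granted, none in use.
    missing = quota_shortfall("Standard_NC24ads_A100_v4", instance_count=2,
                              quota_limit=24, quota_used=0)
    if missing:
        print(f"Request at least {missing} more vCPUs before deploying")
    else:
        print("Current quota is sufficient")
```

Running a check like this ahead of time lets you file the quota request early, so the approval wait does not block the deployment itself.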
FOUNDRY VERIFICATION
QUERY 1 // 1
What is the key difference between Serverless and Managed Compute deployments?
A. Serverless uses better models
B. Serverless is fully managed; Managed Compute gives you control over VMs and GPU hardware
C. There is no difference
D. Managed Compute is always cheaper
Managed Compute Deployments | The Model Catalog — Azure Foundry Academy