Model Endpoints vs APIs
☁️ The Vertex AI Ecosystem 10m 150 BASE XP
Deployment Paradigms
Vertex AI offers two distinct ways to interact with models:
- Foundation Model APIs: Serverless endpoints for Gemini models. You just call the API; Google handles all scaling and infrastructure. Billing is per token (or per character, depending on the model).
- Custom Endpoints: When you fine-tune an open-source model (like Llama 3) from the Model Garden, you deploy it to a dedicated Endpoint. Billing is per hour for the underlying Compute Engine VMs (GPUs/TPUs), whether or not the endpoint is serving traffic.
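The billing difference above is the practical decision point: pay-per-token is cheap at low volume, while a dedicated endpoint costs the same whether it serves one request or a million. A back-of-envelope sketch makes the trade-off concrete. All prices here are hypothetical placeholders, not real Google pricing; check the Vertex AI pricing page for actual rates.

```python
# Rough cost comparison between the two paradigms.
# NOTE: both prices below are ASSUMED placeholders, not real Vertex AI rates.
SERVERLESS_PRICE_PER_1K_TOKENS = 0.0005  # $/1k tokens (hypothetical)
DEDICATED_PRICE_PER_HOUR = 2.50          # $/hour for a GPU-backed VM (hypothetical)

def serverless_cost(tokens: int) -> float:
    """Pay-per-token: cost scales with usage, and is zero when idle."""
    return tokens / 1000 * SERVERLESS_PRICE_PER_1K_TOKENS

def dedicated_cost(hours: float) -> float:
    """Pay-per-hour: flat VM cost regardless of traffic."""
    return hours * DEDICATED_PRICE_PER_HOUR

def break_even_tokens_per_hour() -> float:
    """Throughput at which a dedicated endpoint starts winning on cost."""
    return DEDICATED_PRICE_PER_HOUR / SERVERLESS_PRICE_PER_1K_TOKENS * 1000

if __name__ == "__main__":
    print(f"1M tokens, serverless: ${serverless_cost(1_000_000):.2f}")
    print(f"24h, dedicated:        ${dedicated_cost(24):.2f}")
    print(f"Break-even throughput: {break_even_tokens_per_hour():,.0f} tokens/hour")
```

Under these assumed rates, a dedicated endpoint only pays off past a sustained break-even throughput; below it, the serverless API is cheaper because you pay nothing while idle.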