When you cache a large prompt (such as a codebase or a one-hour video), Google processes the input once and stores the resulting Key-Value (KV) attention cache in memory.
Subsequent queries against that cached content skip the initial processing (prefill) phase, which results in lower latency and reduced billing for the cached input tokens.
```python
import datetime

from vertexai.preview import caching

# Cache a massive 1-hour video (a minimum of 32k tokens is required
# for cached content). `video_part` is the video Part prepared earlier.
cache = caching.CachedContent.create(
    model_name="gemini-3.1-pro-001",
    system_instruction="You are a video analyst.",
    contents=[video_part],
    ttl=datetime.timedelta(minutes=60),
)
```
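Once the cache exists, queries reuse it by binding a model to the cache handle. A minimal sketch, assuming the Vertex AI preview SDK's `GenerativeModel.from_cached_content` constructor; `ask_cached_video` is a hypothetical helper name, not part of the SDK:

```python
import datetime


def ask_cached_video(cache, question: str) -> str:
    # Hypothetical helper: bind a model to an existing CachedContent
    # handle so the query reuses the stored KV cache instead of
    # reprocessing the full video input.
    from vertexai.preview.generative_models import GenerativeModel

    model = GenerativeModel.from_cached_content(cached_content=cache)
    response = model.generate_content(question)
    return response.text


# The TTL in the creation call above keeps the cache alive for one hour;
# after that the cache expires and the content must be re-cached.
one_hour = datetime.timedelta(minutes=60)
```

Each call to `ask_cached_video` pays only for the new question tokens at the full rate; the cached video tokens are billed at the discounted cached-input rate.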