[ ABORT TO HUD ]
SEQ. 1
SEQ. 2

Thresholds, Costs, ZDR

Prompt Caching Framework15 min400 BASE XP

Economic and Technical Limits

Caching is not free; it involves a write premium during the initial serialization of the context. However, all subsequent 'reads' of that cache hit a massive 90% discount. To justify the overhead, Anthropic enforces minimum token thresholds: 1024 tokens for Claude Sonnet 4.6 and 2048 tokens for Claude Opus 4.7. If your context is smaller than this, the cache header is simply ignored.

Lifecycle & Compliance

Caches are ephemeral and have a default TTL (Time To Live) of 5 minutes. Every time a cache is 'read', the TTL timer resets. For enterprises focused on security, this system is fully compatible with Zero Data Retention (ZDR)—the cached bits are held recursively in the inference boundary and vaporize immediately upon expiry, ensuring no persistent logs are generated natively.

SYNAPSE VERIFICATION
QUERY 1 // 3
If you send a 500-token prompt with a cache header to Sonnet, what happens?
It caches successfully
The system ignores the header because the context is below the 1024-token minimum
It costs double tokens
It throws a 400 error
Watch: 139x Rust Speedup
Thresholds, Costs, ZDR | Prompt Caching Framework — Claude Academy