For large-scale tasks (ETL, bulk summarization) that don't need instant feedback, use the Batch API. You prepare a JSONL file where each line is a standard Messages API request. Anthropic processes this asynchronously, typically within 24 hours (SLA), though usually much faster.
Because the Batch API allows Anthropic to optimize their GPU routing and timing, they offer a flat 50% discount on all batch tokens. This makes it the only viable solution for processing millions of documents or performing massive content moderation tasks in high-scale enterprises.