The Batch API

💰 Production & Cost Optimization 12 min 250 BASE XP

50% Cost Savings for Async Work

The Batch API lets you submit large batches of requests asynchronously. In exchange for accepting a completion window of up to 24 hours instead of an immediate response, you get a 50% discount on input and output tokens.
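To make the discount concrete, here is a rough cost sketch. The per-token prices and the `estimateCost` helper are hypothetical placeholders, not real rates or part of any SDK; only the 50% factor comes from the text above.

```javascript
// Hypothetical per-million-token prices in USD; substitute your model's real rates.
const PRICE_PER_MILLION = { input: 2.0, output: 8.0 };
const BATCH_DISCOUNT = 0.5; // the 50% discount described above

// Estimate the cost of a workload, with or without the batch discount.
function estimateCost(inputTokens, outputTokens, useBatch) {
  const fullPrice =
    (inputTokens / 1e6) * PRICE_PER_MILLION.input +
    (outputTokens / 1e6) * PRICE_PER_MILLION.output;
  return useBatch ? fullPrice * (1 - BATCH_DISCOUNT) : fullPrice;
}

// 10,000 items at roughly 500 input and 100 output tokens each
const inputTokens = 10_000 * 500;  // 5,000,000
const outputTokens = 10_000 * 100; // 1,000,000
const syncCost = estimateCost(inputTokens, outputTokens, false);
const batchCost = estimateCost(inputTokens, outputTokens, true);
console.log(`synchronous: $${syncCost.toFixed(2)}, batch: $${batchCost.toFixed(2)}`);
// prints "synchronous: $18.00, batch: $9.00"
```

With these assumed prices, the same workload drops from $18.00 to $9.00 when it can tolerate the longer completion window.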

When to Use Batch API

✅ Good Fit                         | ❌ Bad Fit
Bulk classification (10K+ items)   | Real-time chat responses
Dataset labeling/annotation        | User-facing interactions
Content moderation queues          | Time-sensitive queries
Embedding generation at scale      | Interactive agents
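import fs from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// 1. Create a JSONL file of requests (one JSON request object per line)
// 2. Upload it
const file = await openai.files.create({
  file: fs.createReadStream("batch_requests.jsonl"),
  purpose: "batch"
});

// 3. Submit the batch
const batch = await openai.batches.create({
  input_file_id: file.id,
  endpoint: "/v1/responses",
  completion_window: "24h"
});

// 4. Poll for completion until the batch reaches a terminal state
const terminal = ["completed", "failed", "expired", "cancelled"];
let status = await openai.batches.retrieve(batch.id);
while (!terminal.includes(status.status)) {
  await new Promise((resolve) => setTimeout(resolve, 60_000)); // wait 60 s between checks
  status = await openai.batches.retrieve(batch.id);
}
// status.status: "completed" → download results via status.output_file_id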
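Step 1 and the final download step are only comments in the snippet above, so here is a minimal sketch of both ends of the round trip. The helper names `toBatchLine` and `parseBatchOutput`, the sample items, and the model name are illustrative assumptions. The line shape (`custom_id`, `method`, `url`, `body`) and the output fields (`response`, `error`) follow the documented batch file format as best understood here, so verify them against the current API reference.

```javascript
// Build one JSONL line per request. custom_id lets you match results to inputs
// later, since output order is not guaranteed to match input order.
function toBatchLine(customId, prompt) {
  return JSON.stringify({
    custom_id: customId,
    method: "POST",
    url: "/v1/responses", // should match the endpoint passed to batches.create
    body: { model: "gpt-4o-mini", input: prompt } // model name is a placeholder
  });
}

// Hypothetical items to classify.
const items = [
  { id: "review-1", text: "Great product, works as advertised." },
  { id: "review-2", text: "Arrived broken and support never replied." }
];
const jsonl = items
  .map((item) => toBatchLine(item.id, `Classify the sentiment: ${item.text}`))
  .join("\n");
// Writing `jsonl` to batch_requests.jsonl would produce the file uploaded in step 2.

// Parse the text of a downloaded output file into a Map keyed by custom_id.
// Each entry holds the error object if the request failed, else the response body.
function parseBatchOutput(outputText) {
  const results = new Map();
  for (const line of outputText.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline
    const record = JSON.parse(line);
    results.set(record.custom_id, record.error ?? record.response?.body);
  }
  return results;
}
```

Keying everything on `custom_id` means you can join results back to your own records without relying on line order.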
💡 Pro Tip: The Batch API is not tied to a single endpoint. It supports Responses, Chat Completions, and Embeddings, among others, but not every endpoint, so check the current API reference for the supported list. Use it for any high-volume, non-urgent workload.
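Because a batch is created against one `endpoint`, a quick pre-upload check can catch lines that target a different URL or reuse an ID. This is a sketch under assumptions: `validateBatchLines` is a hypothetical helper, and the rules it enforces (unique `custom_id`, each line's `url` equal to the batch endpoint) reflect the documented input format as understood here rather than an official validator.

```javascript
// Returns a list of human-readable problems; an empty list means the lines look consistent.
// Line numbers count non-empty lines only.
function validateBatchLines(jsonlText, endpoint) {
  const problems = [];
  const seenIds = new Set();
  const lines = jsonlText.split("\n").filter((line) => line.trim() !== "");
  lines.forEach((line, index) => {
    let request;
    try {
      request = JSON.parse(line);
    } catch {
      problems.push(`line ${index + 1}: not valid JSON`);
      return;
    }
    if (request === null || typeof request !== "object") {
      problems.push(`line ${index + 1}: not a JSON object`);
      return;
    }
    if (typeof request.custom_id !== "string" || request.custom_id === "") {
      problems.push(`line ${index + 1}: missing custom_id`);
    } else if (seenIds.has(request.custom_id)) {
      problems.push(`line ${index + 1}: duplicate custom_id "${request.custom_id}"`);
    } else {
      seenIds.add(request.custom_id);
    }
    if (request.url !== endpoint) {
      problems.push(`line ${index + 1}: url "${request.url}" does not match "${endpoint}"`);
    }
  });
  return problems;
}

// Two sample lines (model names are placeholders): the second reuses an ID
// and targets a different endpoint than the batch.
const sample = [
  '{"custom_id":"doc-1","method":"POST","url":"/v1/embeddings","body":{"model":"text-embedding-3-small","input":"hello"}}',
  '{"custom_id":"doc-1","method":"POST","url":"/v1/responses","body":{"model":"gpt-4o-mini","input":"hello"}}'
].join("\n");
console.log(validateBatchLines(sample, "/v1/embeddings"));
```

Running the check on `sample` reports two problems, both on line 2: the duplicate `custom_id` and the mismatched `url`.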
SYNAPSE VERIFICATION
QUERY 1 // 3
How much discount does the Batch API provide?
10%
25%
50% on input/output tokens
75%
The Batch API | Production & Cost Optimization — OpenAI Academy