Model Distillation is the process of using a large, expensive model (teacher) to generate training data, then fine-tuning a smaller, cheaper model (student) to replicate the teacher's behavior.
| Metric | GPT-5.4 Thinking | Distilled Mini | Savings |
|---|---|---|---|
| Cost per 1M tokens | ~$15 | ~$0.40 | 97% |
| Latency | ~3-8s | ~0.3s | 90% |
| Quality (on your task) | 98% | 92-95% | Minimal loss |
When you set `store: true` in the Responses API, OpenAI stores your completions server-side. You can then use these stored outputs directly as fine-tuning data for distillation, with no manual data collection needed.
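As a minimal sketch of that workflow, the snippet below builds the request payload you would send to the teacher model. The `store` and `metadata` parameter names follow the OpenAI API; the model name, prompt, and metadata tag are placeholders for illustration, and the commented-out SDK call assumes the official `openai` Python package and a configured API key.

```python
# Payload for a teacher-model call whose output will be stored for
# later distillation. `store` and `metadata` are OpenAI API parameters;
# the model name and metadata values here are illustrative placeholders.
teacher_request = {
    "model": "gpt-5.4-thinking",  # teacher model (from the table above)
    "input": "Summarize this support ticket in two sentences.",
    "store": True,  # persist the completion server-side
    "metadata": {"use_case": "distillation-v1"},  # tag for filtering later
}

# With the official `openai` SDK (requires an API key), this becomes:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.responses.create(**teacher_request)
```

Tagging requests via `metadata` matters in practice: when you later assemble the fine-tuning dataset for the student model, you can filter stored completions down to just the ones generated for this distillation run.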