Unlike Chat Completions, where you manually manage message history on every request, the Responses API can persist conversation state server-side.
// First message
const r1 = await openai.responses.create({
  model: "gpt-5.4",
  store: true,
  input: "My name is Alex and I'm building a SaaS app."
});

// Follow-up — references the previous response
const r2 = await openai.responses.create({
  model: "gpt-5.4",
  store: true,
  previous_response_id: r1.id,
  input: "What tech stack would you recommend for my project?"
});
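For contrast, here's a minimal sketch of the bookkeeping Chat Completions pushes onto the client: you keep the transcript yourself and re-send all of it on every turn. The `addTurn` helper and the commented-out API call are illustrative, not part of any SDK.

```typescript
// Manual history management, Chat Completions style:
// the client owns the transcript; nothing is stored server-side.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

function addTurn(
  history: ChatMessage[],
  role: ChatMessage["role"],
  content: string
): ChatMessage[] {
  // Append a turn to the client-side transcript.
  return [...history, { role, content }];
}

let messages = addTurn([], "user", "My name is Alex and I'm building a SaaS app.");
// const reply = await openai.chat.completions.create({ model: "gpt-5.4", messages });
// messages = addTurn(messages, "assistant", reply.choices[0].message.content ?? "");
messages = addTurn(messages, "user", "What tech stack would you recommend for my project?");
// The entire `messages` array must be sent again with every request.
```

With `store: true` and `previous_response_id`, that entire array disappears: each request carries only the new input plus a pointer to the prior response.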
When you have dozens of function tools or MCP servers, loading all of their schemas into context wastes tokens. Tool Search defers loading each tool's definition until the model actually needs it.
const response = await openai.responses.create({
  model: "gpt-5.4",
  tools: [
    { type: "function", name: "get_weather", ... },
    { type: "function", name: "book_flight", ... },
    // ... 50 more functions
  ],
  tool_search: true, // Only inject relevant tools
  input: "What's the weather in London?"
});
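Whichever tools get injected, your code still has to execute the call the model emits. A minimal dispatch sketch for a function-call output item follows; the output-item shape mirrors the Responses API, but the `handlers` table and the `get_weather` stub are hypothetical, and in practice the item would come out of `response.output` rather than being constructed inline.

```typescript
// Sketch: routing a function_call output item to a local handler.
type FunctionCall = { type: "function_call"; name: string; arguments: string };

// Hypothetical handler table — one entry per registered tool.
const handlers: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) => `Sunny in ${args.city}`, // stub implementation
};

function dispatch(item: FunctionCall): string {
  const handler = handlers[item.name];
  if (!handler) throw new Error(`No handler for tool: ${item.name}`);
  // Tool arguments arrive as a JSON string and must be parsed.
  return handler(JSON.parse(item.arguments));
}

// Simulated output item, standing in for one found in response.output:
const result = dispatch({
  type: "function_call",
  name: "get_weather",
  arguments: JSON.stringify({ city: "London" }),
});
```

Keeping the dispatch table keyed by tool name means it works identically whether Tool Search injected two tools or fifty.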