Claude treats images as first-class citizens in the messages array. You have two options for passing visual data:
source object with type: "base64", media_type (e.g., image/jpeg), and the raw base64-encoded string.source object with type: "url" and a public URL. Claude will fetch and process the image directly, eliminating backend encoding overhead.The Claude 4.6 model family automatically resizes images that exceed internal limits. The maximum dimension is typically capped at 1568px. Every image is converted into a grid of 'tokens' (tiles). A typical 1024x768 image costs approximately 1,600 input tokens. Understanding this mapping is essential for managing costs in high-frequency vision applications.
Claude now natively supports PDF ingestion — you can pass multi-page PDF documents directly as content blocks. Each page is rendered and analyzed at the model's native resolution, making it ideal for contract review, invoice processing, and regulatory document analysis.