[ ABORT TO HUD ]
SEQ. 1

Base64 Tensors & Calculations

👁️ Vision & Multimodality15 min650 BASE XP

Direct Optical Processing

Claude treats images as first-class citizens in the messages array. You have two options for passing visual data:

  • Base64: Provide a source object with type: "base64", media_type (e.g., image/jpeg), and the raw base64-encoded string.
  • URL (2026+): Provide a source object with type: "url" and a public URL. Claude will fetch and process the image directly, eliminating backend encoding overhead.

Resolution and Token Costs

The Claude 4.6 model family automatically resizes images that exceed internal limits. The maximum dimension is typically capped at 1568px. Every image is converted into a grid of 'tokens' (tiles). A typical 1024x768 image costs approximately 1,600 input tokens. Understanding this mapping is essential for managing costs in high-frequency vision applications.

PDF Document Processing

Claude now natively supports PDF ingestion — you can pass multi-page PDF documents directly as content blocks. Each page is rendered and analyzed at the model's native resolution, making it ideal for contract review, invoice processing, and regulatory document analysis.

OCR Tip: While Claude has elite spatial perception, reading tiny font (below 8pt) from dense scans remains a challenge. For high-precision document analysis, it is best practice to pass the visual image AND the structural text extracted via a standard OCR engine simultaneously.
SYNAPSE VERIFICATION
QUERY 1 // 2
What are the two valid ways to pass an image to Claude in 2026?
Email and FTP
Base64-encoded source objects or URL-based source objects
Only PDF attachments
Via a local file path string
Watch: 139x Rust Speedup
Base64 Tensors & Calculations | Vision & Multimodality — Claude Academy