Cross-Document Analysis
Analyze and cross-reference multiple documents together. Unlike /analyze (single
document) or /analyze/batch (many documents independently), this endpoint lets
the agent reason across all documents simultaneously.
Use cases:
- Validate an invoice against a contract or purchase order
- Check that a report’s numbers match a source spreadsheet
- Compare terms across multiple agreements
- Reconcile data from different sources
Upload 2-5 files and/or provide URLs. Each document gets a label (auto-generated from filename, or you can provide custom labels). The agent can search, read, and compare data across all documents.
Provide a typed schema — each field has a type and description.
Example:
{
"rates_match": {"type": "boolean", "description": "Do the hourly rates on the invoice match the contract?"},
"total_valid": {"type": "boolean", "description": "Does the invoice total equal the sum of line items?"},
"correct_vendor": {"type": "boolean", "description": "Is the vendor name on the invoice the same as the contracting party?"}
}
Pricing: 5 credits/document + 3 credits/page (minimum 10).
Headers
Body
JSON schema — keys are field names, values describe what to validate or compute across documents
Files to analyze (2-5). Labeled by filename or via document_labels.
JSON array of URLs, or JSON object mapping labels to URLs (e.g. {"invoice": "https://..."}). Combined with files, total must be 2-5.
JSON-encoded headers for URL auth
JSON array of labels for the uploaded files, in order. Must match files count. E.g. ["invoice", "contract"]
Include the agent's reasoning trace in each answer
Set to false to process asynchronously. Returns a task_id to poll.
URL to POST the result to when async processing completes.
Response
Successful Response
Response for cross-document analysis — flat parallel dicts matching /extract.
Computed/derived values keyed by field name
Metadata for each document in the analysis
Model used for analysis
Combined text length across all documents
Combined page count across all documents
Per-field confidence scores (0.0–1.0)
Per-field source text snippets (prefixed with [doc_label])
Per-field step-by-step explanation referencing specific documents
Per-field agent reasoning trace. Empty unless include_steps=true.
Credits consumed
Schema format: always 'typed'
typed