Cross-Analyze - The Drive AI

Cross-analyze lets the agent reason across 2-5 documents simultaneously. Unlike /analyze/batch (which processes documents independently), cross-analyze compares, validates, and reconciles data between documents.

Use cases

Validate an invoice against a contract or purchase order
Check that a report’s numbers match a source spreadsheet
Compare terms across multiple agreements
Reconcile data from different sources

Example

from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.analyze_cross(
    files=["invoice.pdf", "contract.pdf"],
    document_labels=["invoice", "contract"],
    schema={
        "rates_match": {
            "type": "boolean",
            "description": "Do the hourly rates on the invoice match the contract?"
        },
        "total_valid": {
            "type": "boolean",
            "description": "Does the invoice total equal the sum of line items?"
        },
        "correct_vendor": {
            "type": "boolean",
            "description": "Is the vendor name on the invoice the same as the contracting party?"
        },
    },
)

print(result.data)
# {'rates_match': False, 'total_valid': True, 'correct_vendor': True}

print(result.sources["rates_match"])
# ['[invoice] "Rate: $150.00/hr"', '[contract] "Hourly rate: $125.00"']

print(result.reasoning["rates_match"])
# "Invoice hourly rate ($150) differs from contract rate ($125). Discrepancy of $25/hr."

for doc in result.documents:
    print(f"{doc.label}: {doc.total_pages} pages")

The schema can also be defined with Pydantic or Zod; see Schemas.

Providing documents

You can mix uploaded files and URLs in a single request:

result = client.analyze_cross(
    files=["local_invoice.pdf"],
    urls={"contract": "https://example.com/contract.pdf"},
    document_labels=["invoice"],
    schema={...},
)

URL formats

| Format | Example | Label behavior |
| --- | --- | --- |
| Array | `urls=["https://a.com/inv.pdf", "https://b.com/ctr.pdf"]` | Auto-generated from filename |
| Object | `urls={"invoice": "https://a.com/inv.pdf"}` | Uses the key as the label |

Document labels

Each document gets a label used in source citations ([invoice] "...", [contract] "..."). Labels are auto-generated from filenames, or you can set them explicitly via document_labels (for uploaded files) or URL object keys (for URLs).
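
Because each snippet in `sources` carries its document label as a `[label]` prefix, citations can be grouped by document on the client side. The following is a minimal sketch, assuming every snippet has exactly the `[label] "text"` shape shown in the example output. `parse_citation` and `group_by_label` are hypothetical helpers written for illustration, not part of the SDK.

```python
import re
from collections import defaultdict

# Assumed citation shape, taken from the example output: [label] "quoted snippet"
_CITATION = re.compile(r'^\[(?P<label>[^\]]+)\]\s*"(?P<text>.*)"$', re.DOTALL)


def parse_citation(citation: str) -> tuple[str, str]:
    """Split one source citation into (label, snippet text)."""
    match = _CITATION.match(citation.strip())
    if match is None:
        raise ValueError(f"unrecognized citation format: {citation!r}")
    return match.group("label"), match.group("text")


def group_by_label(citations: list[str]) -> dict[str, list[str]]:
    """Group snippet texts under the document label they cite."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for citation in citations:
        label, text = parse_citation(citation)
        grouped[label].append(text)
    return dict(grouped)


# The two citations from the example above.
sources = ['[invoice] "Rate: $150.00/hr"', '[contract] "Hourly rate: $125.00"']
grouped = group_by_label(sources)
print(grouped)
```

Raising on an unrecognized shape, rather than silently skipping it, makes it obvious if the real citation format differs from the one assumed here.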

Response

Same structure as /analyze, plus per-document metadata:

| Field | Description |
| --- | --- |
| `data` | Computed answers, type-enforced |
| `confidence` | Per-field confidence scores (0.0 to 1.0) |
| `reasoning` | Per-field explanation referencing specific documents |
| `sources` | Per-field text snippets prefixed with `[doc_label]` |
| `documents` | List of `{label, filename, content_type, total_pages, text_length}` |
| `total_pages` | Combined page count across all documents |
| `credits_used` | Credits consumed |
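
One way to use the per-field `confidence` scores is to route low-scoring answers to a human reviewer. The sketch below works on a plain dictionary shaped like the response fields listed above. The 0.8 threshold, the `fields_needing_review` helper, and the confidence values are all assumptions chosen for illustration.

```python
def fields_needing_review(response: dict, threshold: float = 0.8) -> list[str]:
    """Return field names whose confidence is below the threshold, lowest first."""
    confidence = response["confidence"]
    flagged = [name for name, score in confidence.items() if score < threshold]
    return sorted(flagged, key=lambda name: confidence[name])


# Hand-written stand-in for a response; the confidence values are made up.
response = {
    "data": {"rates_match": False, "total_valid": True, "correct_vendor": True},
    "confidence": {"rates_match": 0.97, "total_valid": 0.62, "correct_vendor": 0.75},
}

print(fields_needing_review(response))
# ['total_valid', 'correct_vendor']
```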

Limits

| Constraint | Value |
| --- | --- |
| Documents per request | 2 to 5 |
| Total combined size | 100 MB |
| Fields per schema | 10 |
| Timeout | 5 minutes (use `sync=false` for async) |
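
These limits can be checked locally before sending a request. The following is a rough sketch, not SDK functionality: `check_limits` is a hypothetical helper, and it assumes "100 MB" means 100 × 1024 × 1024 bytes and that the field limit counts top-level schema keys.

```python
MIN_DOCUMENTS = 2
MAX_DOCUMENTS = 5
MAX_TOTAL_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB
MAX_SCHEMA_FIELDS = 10               # assumption: counts top-level schema keys


def check_limits(document_count: int, total_bytes: int, schema: dict) -> list[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if not MIN_DOCUMENTS <= document_count <= MAX_DOCUMENTS:
        problems.append(
            f"expected {MIN_DOCUMENTS} to {MAX_DOCUMENTS} documents, got {document_count}"
        )
    if total_bytes > MAX_TOTAL_BYTES:
        problems.append(f"combined size {total_bytes} bytes exceeds {MAX_TOTAL_BYTES}")
    if len(schema) > MAX_SCHEMA_FIELDS:
        problems.append(f"schema has {len(schema)} fields, limit is {MAX_SCHEMA_FIELDS}")
    return problems


schema = {"rates_match": {"type": "boolean"}, "total_valid": {"type": "boolean"}}
print(check_limits(document_count=2, total_bytes=3_500_000, schema=schema))
# []
```

Collecting all violations in a list, instead of raising on the first one, lets a caller report every problem in a single pass.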

Pricing

5 credits per document + 3 credits per page (minimum 10).
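
As a worked example of the formula, the sketch below estimates the cost of a request from the page count of each document. `estimate_credits` is a hypothetical helper, and it assumes the minimum of 10 applies to the request total; the authoritative figure is the `credits_used` field in the response.

```python
def estimate_credits(page_counts: list[int]) -> int:
    """Estimate credits: 5 per document + 3 per page, floored at 10."""
    documents = len(page_counts)
    pages = sum(page_counts)
    return max(10, 5 * documents + 3 * pages)


# A 2-page invoice and a 12-page contract: 5*2 + 3*14 = 52 credits.
print(estimate_credits([2, 12]))
# 52
```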
