Cross-analyze lets the agent reason across 2-5 documents simultaneously. Unlike /analyze/batch (which processes documents independently), cross-analyze compares, validates, and reconciles data between documents.

Use cases

  • Validate an invoice against a contract or purchase order
  • Check that a report’s numbers match a source spreadsheet
  • Compare terms across multiple agreements
  • Reconcile data from different sources

Example

from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.analyze_cross(
    files=["invoice.pdf", "contract.pdf"],
    document_labels=["invoice", "contract"],
    schema={
        "rates_match": {
            "type": "boolean",
            "description": "Do the hourly rates on the invoice match the contract?"
        },
        "total_valid": {
            "type": "boolean",
            "description": "Does the invoice total equal the sum of line items?"
        },
        "correct_vendor": {
            "type": "boolean",
            "description": "Is the vendor name on the invoice the same as the contracting party?"
        },
    },
)

print(result.data)
# {"rates_match": false, "total_valid": true, "correct_vendor": true}

print(result.sources["rates_match"])
# ['[invoice] "Rate: $150.00/hr"', '[contract] "Hourly rate: $125.00"']

print(result.reasoning["rates_match"])
# "Invoice hourly rate ($150) differs from contract rate ($125). Discrepancy of $25/hr."

for doc in result.documents:
    print(f"{doc.label}: {doc.total_pages} pages")
Schemas can also be defined with Pydantic or Zod; see Schemas.
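
Once a result comes back, a common next step is to collect the checks that failed together with their evidence. The following is a minimal sketch, assuming `result.data`, `result.reasoning`, and `result.sources` behave like plain dicts keyed by field name, as the printed output above suggests. `failed_checks` is a hypothetical helper, not part of the SDK, and the sample values are copied from the example output.

```python
def failed_checks(data, reasoning, sources):
    """Return one entry per boolean field whose value is False."""
    failures = []
    for field, value in data.items():
        if value is False:
            failures.append({
                "field": field,
                "reasoning": reasoning.get(field),
                "sources": sources.get(field, []),
            })
    return failures


# Sample values copied from the example output above.
data = {"rates_match": False, "total_valid": True, "correct_vendor": True}
reasoning = {
    "rates_match": "Invoice hourly rate ($150) differs from contract rate ($125). "
                   "Discrepancy of $25/hr."
}
sources = {
    "rates_match": ['[invoice] "Rate: $150.00/hr"', '[contract] "Hourly rate: $125.00"']
}

for failure in failed_checks(data, reasoning, sources):
    print(failure["field"], "->", failure["reasoning"])
```

With a real response, the same helper could be called as `failed_checks(result.data, result.reasoning, result.sources)`, provided those attributes are dict-like.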

Providing documents

You can mix uploaded files and URLs in a single request:
result = client.analyze_cross(
    files=["local_invoice.pdf"],
    urls={"contract": "https://example.com/contract.pdf"},
    document_labels=["invoice"],
    schema={...},
)

URL formats

| Format | Example | Label behavior |
| --- | --- | --- |
| Array | urls=["https://a.com/inv.pdf", "https://b.com/ctr.pdf"] | Auto-generated from filename |
| Object | urls={"invoice": "https://a.com/inv.pdf"} | Uses the key as the label |
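
If you start with a plain list of URLs but want predictable labels, one option is to build the object form yourself. The sketch below is an illustration only: `label_urls` is a hypothetical helper, and its naming rule (file stem, with a numeric suffix on collisions) is an assumption that does not necessarily match how the service auto-generates labels.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def label_urls(urls):
    """Map each URL to a label derived from its file stem (hypothetical rule)."""
    labeled = {}
    for url in urls:
        # "/inv.pdf" -> "inv"; fall back to "document" when the path has no name.
        stem = PurePosixPath(urlparse(url).path).stem or "document"
        label, suffix = stem, 2
        while label in labeled:  # keep labels unique
            label = f"{stem}_{suffix}"
            suffix += 1
        labeled[label] = url
    return labeled


urls = label_urls(["https://a.com/inv.pdf", "https://b.com/ctr.pdf"])
print(urls)
```

The resulting dict can be passed as the `urls` argument in object form, so each key is used as the label.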

Document labels

Each document gets a label used in source citations ([invoice] "...", [contract] "..."). Labels are auto-generated from filenames, or you can set them explicitly via document_labels (for uploaded files) or URL object keys (for URLs).
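
Because every citation starts with its label in square brackets, the label and the quoted snippet can be separated with a small parser. This is a sketch that assumes citations always follow the `[label] "snippet"` shape shown in the example output; `parse_citation` is a hypothetical helper, not an SDK function.

```python
import re

# Matches: [label] "snippet"
CITATION = re.compile(r'^\[(?P<label>[^\]]+)\]\s*"(?P<snippet>.*)"$', re.DOTALL)


def parse_citation(citation):
    """Split a citation into (label, snippet); label is None if the shape differs."""
    match = CITATION.match(citation.strip())
    if match is None:
        return None, citation
    return match.group("label"), match.group("snippet")


label, snippet = parse_citation('[invoice] "Rate: $150.00/hr"')
print(label, "|", snippet)
```

Grouping snippets by label this way makes it straightforward to show, per document, which passages supported a given field.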

Response

Same structure as /analyze, plus per-document metadata:
| Field | Description |
| --- | --- |
| data | Computed answers, type-enforced |
| confidence | Per-field confidence scores (0.0 - 1.0) |
| reasoning | Per-field explanation referencing specific documents |
| sources | Per-field text snippets prefixed with [doc_label] |
| documents | List of {label, filename, content_type, total_pages, text_length} |
| total_pages | Combined page count across all documents |
| credits_used | Credits consumed |
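
The per-field confidence scores can be used to route uncertain answers to manual review. A minimal sketch, assuming `confidence` is a dict of field name to a float between 0.0 and 1.0; the 0.8 threshold, the `low_confidence_fields` helper, and the sample scores are all illustrative, not values from the API.

```python
def low_confidence_fields(confidence, threshold=0.8):
    """Return the field names whose score falls below the threshold, sorted."""
    return sorted(field for field, score in confidence.items() if score < threshold)


# Illustrative scores, not real API output.
sample_confidence = {"rates_match": 0.97, "total_valid": 0.62, "correct_vendor": 0.91}
print(low_confidence_fields(sample_confidence))
```

The right threshold depends on how costly a wrong answer is for your workflow, so treat 0.8 as a starting point only.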

Limits

| Constraint | Value |
| --- | --- |
| Documents per request | 2-5 |
| Total combined size | 100 MB |
| Fields per schema | 10 |
| Timeout | 5 minutes (use sync=false for async) |
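
These limits can be checked locally before sending a request. The sketch below is a hypothetical pre-flight helper, not part of the SDK. It assumes 1 MB means 1024 * 1024 bytes and that the schema is a dict with one key per field; the sizes of URL documents are not known locally, so only their count is checked.

```python
MAX_DOCUMENTS = 5
MIN_DOCUMENTS = 2
MAX_FIELDS = 10
MAX_TOTAL_BYTES = 100 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_limits(file_sizes, url_count, schema):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    doc_count = len(file_sizes) + url_count
    if not MIN_DOCUMENTS <= doc_count <= MAX_DOCUMENTS:
        problems.append(f"{doc_count} documents (allowed: 2-5)")
    if sum(file_sizes) > MAX_TOTAL_BYTES:
        problems.append("local files exceed 100 MB combined")
    if len(schema) > MAX_FIELDS:
        problems.append(f"{len(schema)} schema fields (allowed: 10)")
    return problems


# One 2 MB local file plus one URL, with a three-field schema.
schema = {"rates_match": {}, "total_valid": {}, "correct_vendor": {}}
print(check_limits([2 * 1024 * 1024], url_count=1, schema=schema))
```

For local files, the sizes can be gathered with `os.path.getsize(path)` from the standard library.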

Pricing

5 credits per document plus 3 credits per page, with a minimum of 10 credits.
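
As a worked example of this formula, the sketch below estimates the cost of a request. `estimate_credits` is a hypothetical helper, and it assumes the minimum applies as a floor on the total for the request; the authoritative figure is the `credits_used` field in the response.

```python
def estimate_credits(num_documents, total_pages):
    """Estimate credits: 5 per document + 3 per page, floored at 10."""
    return max(10, 5 * num_documents + 3 * total_pages)


# Two documents with 12 pages combined: 5*2 + 3*12 = 46
print(estimate_credits(num_documents=2, total_pages=12))
```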