Analyze - The Drive AI

Unlike /extract, which pulls explicitly stated values, /analyze uses a multi-step reasoning agent that can compute, infer, and derive answers that aren’t written in the document.

When to use Analyze vs Extract

Use case	Endpoint
“What is the invoice number?”	`/extract`
“What is the sum of all line items?”	`/analyze`
“Does this contract auto-renew?”	`/analyze`
“Rate the legal risk: low/medium/high”	`/analyze`
“Pull the vendor name and address”	`/extract`
“Are any line items duplicated?”	`/analyze`

Rule of thumb: if the answer is directly stated in the document, use /extract. If it requires computation, comparison, or judgment, use /analyze.

Example

from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.analyze(
    file="contract.pdf",
    schema={
        "total_value": {
            "type": "number",
            "description": "Total contract value (sum all payment amounts)"
        },
        "auto_renews": {
            "type": "boolean",
            "description": "Does this contract auto-renew?"
        },
        "termination_notice_days": {
            "type": "integer",
            "description": "How many days' notice is required to terminate?"
        },
        "risk_level": {
            "type": "string",
            "enum": ["low", "medium", "high"],
            "description": "Overall legal risk level"
        },
    },
)

print(result.data)
# {"total_value": 250000, "auto_renews": true, "termination_notice_days": 90, "risk_level": "medium"}

print(result.reasoning["auto_renews"])
# "Section 8.2 states 'This agreement shall automatically renew for successive one-year terms
#  unless either party provides written notice of non-renewal at least 90 days prior...'"

print(result.sources["total_value"])
# ["Year 1: $100,000 (Section 4.1)", "Year 2: $150,000 (Section 4.2)"]

Pydantic and Zod schemas work too; see Schemas.

Response

Field	Description
`data`	Computed answers, type-enforced to match your schema
`confidence`	Per-field confidence scores (0.0 - 1.0)
`reasoning`	Per-field step-by-step explanation of how the answer was derived
`sources`	Per-field text snippets from the document supporting the answer
`steps`	Agent tool call trace (only when `include_steps=true`)
`total_pages`	Number of pages in the document
`credits_used`	Credits consumed
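
One way to consume these per-field maps is to join them into a single record per schema field. The sketch below works on a plain dict shaped like the table above rather than on the SDK's result object. It assumes that `data`, `confidence`, `reasoning`, and `sources` are each keyed by field name, which the table implies but does not spell out. The helper name `summarize_fields` is hypothetical, and the confidence values in the sample are invented for illustration.

```python
from typing import Any


def summarize_fields(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Join the per-field maps of an analyze response into one row per field."""
    rows = []
    for name, value in response["data"].items():
        rows.append({
            "field": name,
            "value": value,
            # .get() keeps the sketch tolerant of fields with no entry.
            "confidence": response.get("confidence", {}).get(name),
            "reasoning": response.get("reasoning", {}).get(name),
            "sources": response.get("sources", {}).get(name, []),
        })
    return rows


# Illustrative response; the confidence values are made up for this sketch.
sample = {
    "data": {"total_value": 250000, "auto_renews": True},
    "confidence": {"total_value": 0.97, "auto_renews": 0.9},
    "sources": {
        "total_value": [
            "Year 1: $100,000 (Section 4.1)",
            "Year 2: $150,000 (Section 4.2)",
        ]
    },
}

for row in summarize_fields(sample):
    print(row["field"], row["value"], row["confidence"], len(row["sources"]))
```

Flattening the response this way makes it easy to log or review each answer next to its evidence.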

Confidence scores

Range	Meaning
0.95 - 1.0	Directly stated or computed from unambiguous data
0.80 - 0.94	Clearly derivable, minor interpretation required
0.60 - 0.79	Reasonably inferred, some assumptions made
0.30 - 0.59	Ambiguous or indirect evidence
0.00 - 0.29	Speculative or not found
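
The bands above can be turned into a small lookup for routing low-confidence fields to manual review. This is a minimal sketch, not part of the SDK: the function names and the 0.80 review threshold are assumptions chosen for illustration. Because the table lists ranges to two decimals, a score such as 0.945 is treated here as belonging to the lower band.

```python
# Lower bound of each band from the table above, highest first.
BANDS = [
    (0.95, "directly stated or computed"),
    (0.80, "clearly derivable"),
    (0.60, "reasonably inferred"),
    (0.30, "ambiguous or indirect"),
    (0.00, "speculative or not found"),
]


def confidence_band(score: float) -> str:
    """Return the label of the band that contains `score`."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0.0, 1.0], got {score}")
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the last band starts at 0.0")


def needs_review(confidence: dict[str, float], threshold: float = 0.80) -> list[str]:
    """List field names whose confidence is below `threshold` (an arbitrary cutoff)."""
    return sorted(name for name, score in confidence.items() if score < threshold)
```

For example, `needs_review({"total_value": 0.97, "risk_level": 0.62})` returns `["risk_level"]`.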

How the agent works

The analyze endpoint runs a multi-step reasoning agent that has access to:

Tool	Purpose
`search`	Semantic + keyword search. Understands synonyms and handles typos.
`read_pages`	Read specific pages by number
`regex_search`	Find patterns: dates, amounts, percentages, IDs
`compute`	Execute Python code for calculations. All arithmetic is run as code rather than estimated by the model.
`extract_table`	Get structured table data from PDFs, CSVs, or spreadsheets
`view_page_image`	Visual analysis of scanned pages, charts, or diagrams (PDFs only)

The agent plans its approach, navigates to relevant sections, gathers evidence, cross-references, and verifies its answer before responding. Set include_steps=true to see the full reasoning trace.
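
To make the pattern-then-compute idea concrete, the sketch below mimics what a `regex_search` step followed by a `compute` step might do for the `total_value` field in the example: find dollar amounts in the source snippets and add them up in code. It is an illustration only and does not reproduce the agent's actual implementation; the regular expression and the function name are assumptions.

```python
import re
from decimal import Decimal

# Matches amounts such as $100,000 or $1,234.50 and captures the digits.
AMOUNT = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")


def sum_dollar_amounts(snippets: list[str]) -> Decimal:
    """Add up every dollar amount found in the given text snippets."""
    total = Decimal("0")
    for snippet in snippets:
        for match in AMOUNT.findall(snippet):
            # Strip thousands separators before converting.
            total += Decimal(match.replace(",", ""))
    return total


# Snippets taken from the `sources` output in the example above.
snippets = [
    "Year 1: $100,000 (Section 4.1)",
    "Year 2: $150,000 (Section 4.2)",
]
print(sum_dollar_amounts(snippets))  # 250000
```

Using `Decimal` avoids floating-point rounding when amounts include cents.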

Pricing

Input	Cost
Documents	2 credits/page (minimum 5)
Websites	10 credits flat
