The Two-Agent Pattern for Production Document AI

DocAI Fabric Team

When teams first build document AI systems, they typically follow one of two patterns:

Pattern A: Fully Deterministic. A rigid pipeline with hardcoded rules. Reliable and auditable, but it can’t adapt to new document formats or learn from mistakes.

Pattern B: Fully Autonomous AI. An LLM agent that handles everything end-to-end. Flexible and adaptive, but unpredictable: results differ from run to run, there is no audit trail, and failures are hard to debug.

We’ve found that production document AI needs both — a deterministic runtime for reliability and an intelligent agent for improvement. We call this the Two-Agent Pattern.

Agent 1: The Document Processing Agent

The Document Processing Agent is the runtime engine. It processes documents through a fixed pipeline:

Upload → OCR → Split → Classify → Extract → Validate → Output

Key characteristics:

  • Deterministic: the same input and the same configuration produce the same output
  • Fast: optimized for throughput, processes documents in seconds
  • Auditable: every step produces a trace with confidence scores and source references
  • Stateless: no learning happens during processing
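
A fixed stage order with a per-step trace can be sketched as below. This is a minimal illustration, not DocAI Fabric's implementation: the stage bodies are toy stand-ins, only three of the stages are shown (Upload, Split, Validate, and Output are omitted for brevity), and the names `TraceEntry`, `PIPELINE`, and `run_pipeline` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class TraceEntry:
    step: str
    confidence: float
    source_ref: str

# Each stage maps the working state to (new state, confidence, source reference).
Stage = Callable[[dict[str, Any]], tuple[dict[str, Any], float, str]]

def ocr(state):
    # Stand-in for a real OCR engine: the "document" is already text here.
    return {**state, "text": state["raw"].strip()}, 0.99, "page 1"

def classify(state):
    doc_type = "invoice" if "Invoice" in state["text"] else "unknown"
    return {**state, "doc_type": doc_type}, 0.90, "page 1, line 1"

def extract(state):
    # Toy extraction: take the first token after the "Total:" label.
    total = state["text"].split("Total:")[1].split()[0]
    return {**state, "fields": {"total": total}}, 0.85, "page 1, 'Total:' line"

# The order is fixed in code; nothing at runtime can reorder or skip stages.
PIPELINE: list[tuple[str, Stage]] = [
    ("ocr", ocr),
    ("classify", classify),
    ("extract", extract),
]

def run_pipeline(raw: str):
    """Run every stage in order and record one trace entry per stage."""
    state: dict[str, Any] = {"raw": raw}
    trace: list[TraceEntry] = []
    for name, stage in PIPELINE:
        state, confidence, source_ref = stage(state)
        trace.append(TraceEntry(name, confidence, source_ref))
    return state, trace
```

Because the stages keep no state between calls, running `run_pipeline` twice on the same text returns the same state and the same trace. The confidence values here are hard-coded placeholders; in a real system they would come from the OCR engine and the model.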

This agent uses AI models (GPT-4o, Claude, etc.) for classification and extraction, but wraps them in a deterministic framework:

  1. The prompt is generated from a fixed template + document-type schema
  2. The model’s response is parsed against a strict JSON schema
  3. Business rules validate the extracted data
  4. If validation fails, a retry is triggered with modified instructions
  5. The final output includes confidence scores and source text references
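
The five steps above could look roughly like this in code. It is a sketch under stated assumptions: `call_model` is a hypothetical callable that takes a prompt string and returns the model's raw text, the schema is reduced to a mapping of field name to description, the business rule is invented for illustration, and confidence scores and source references (step 5) are left out to keep it short.

```python
import json
from typing import Callable

PROMPT_TEMPLATE = (
    "Extract the following fields from the document and reply with JSON only.\n"
    "Fields:\n{field_lines}\n{extra}\nDocument:\n{text}"
)

def build_prompt(schema: dict[str, str], text: str, extra: str = "") -> str:
    # Fields are sorted so the same schema always renders the same prompt.
    field_lines = "\n".join(f"- {name}: {desc}" for name, desc in sorted(schema.items()))
    return PROMPT_TEMPLATE.format(field_lines=field_lines, extra=extra, text=text)

def parse_strict(raw: str, schema: dict[str, str]) -> dict[str, str]:
    """Reject anything that is not a JSON object with exactly the schema's keys."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != set(schema):
        raise ValueError("response keys do not match the schema")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("all field values must be strings")
    return data

def validate(data: dict[str, str]) -> list[str]:
    """Illustrative business rule: total_premium must be a non-negative number."""
    try:
        if float(data["total_premium"]) < 0:
            return ["total_premium must not be negative"]
    except ValueError:
        return ["total_premium must be numeric"]
    return []

def extract(call_model: Callable[[str], str], schema: dict[str, str],
            text: str, max_attempts: int = 2) -> dict:
    extra = ""
    errors: list[str] = []
    for attempt in range(1, max_attempts + 1):
        raw = call_model(build_prompt(schema, text, extra))
        try:
            data = parse_strict(raw, schema)
            errors = validate(data)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            errors = [str(exc)]
        if not errors:
            return {"fields": data, "attempts": attempt}
        # Retry with modified instructions that name the failure.
        extra = "Previous attempt failed: " + "; ".join(errors)
    raise RuntimeError(f"validation failed after {max_attempts} attempts: {errors}")
```

The model call is the only non-deterministic part; everything around it (prompt rendering, parsing, validation, the retry limit) is ordinary code that can be unit-tested with a fake `call_model`.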

The Document Processing Agent is not a chatbot or a general-purpose AI. It’s a deterministic pipeline that uses AI as a component — not as the orchestrator.

Agent 2: The Supervisor Agent

The Supervisor Agent operates offline — it doesn’t process documents directly. Instead, it:

  1. Analyzes processing results — reviews extracted data, confidence scores, and validation failures
  2. Identifies patterns — finds systematic errors across documents (e.g., “the model consistently misreads the ‘Policy Effective Date’ on ACORD 125 forms”)
  3. Suggests improvements — proposes prompt modifications, field description changes, or business rule additions
  4. Learns from corrections — when a human corrects an extraction error, the Supervisor incorporates that feedback
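
As a rough illustration of the "identifies patterns" step, the sketch below groups human review outcomes by document type and field and flags fields whose correction rate crosses a threshold. The `Review` record, the function name, and both thresholds are assumptions made for this example, not part of the product.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Review:
    doc_type: str
    field: str
    extracted: str
    corrected: Optional[str]  # None means the reviewer accepted the value

def find_error_patterns(reviews, min_reviews=5, min_error_rate=0.2):
    """Return fields that reviewers correct often enough to look systematic."""
    totals, errors = Counter(), Counter()
    for review in reviews:
        key = (review.doc_type, review.field)
        totals[key] += 1
        if review.corrected is not None and review.corrected != review.extracted:
            errors[key] += 1
    patterns = []
    for (doc_type, field), total in totals.items():
        rate = errors[(doc_type, field)] / total
        # Require a minimum sample so one bad document is not called a pattern.
        if total >= min_reviews and rate >= min_error_rate:
            patterns.append(
                {"doc_type": doc_type, "field": field, "reviews": total, "error_rate": rate}
            )
    return sorted(patterns, key=lambda p: p["error_rate"], reverse=True)
```

A real Supervisor would go further and hand the flagged examples to a model to draft a suggested fix; this sketch covers only the counting that decides which fields deserve that attention.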

How the Supervisor Improves Accuracy

The key insight is that the Supervisor Agent doesn’t change the runtime pipeline. It changes the configuration that the Document Processing Agent uses.

Example flow:

  1. Human reviews extracted data and corrects a field value
  2. Supervisor Agent sees the correction
  3. Supervisor analyzes: “The field ‘Total Premium’ was extracted from the wrong table. The model confused ‘Quoted Premium’ with ‘Total Premium’ on 3 out of 10 similar documents.”
  4. Supervisor suggests: Update the field description to: “Total Premium — the final premium amount after all adjustments. Located in the ‘Premium Summary’ section. Do NOT use the ‘Quoted Premium’ from the proposal section.”
  5. Human approves the suggestion
  6. The updated description is saved to the document type configuration
  7. Next time the Document Processing Agent processes this document type, it uses the improved description

The runtime is deterministic. The improvement is controlled.
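
One way to picture the approval gate in this flow is a small versioned configuration store, sketched below. All names here (`Suggestion`, `ConfigStore`, and their methods) are hypothetical, and the store is in-memory only. The point is the shape: the Supervisor can only queue suggestions, a human decision is what creates a new version, and the processing side gets a read-only view.

```python
from dataclasses import dataclass
from types import MappingProxyType

@dataclass(frozen=True)
class Suggestion:
    doc_type: str
    field: str
    new_description: str
    rationale: str

class ConfigStore:
    def __init__(self, initial: dict[str, dict[str, str]]):
        # Per document type: an append-only list of field-description versions.
        self._versions = {dt: [dict(fields)] for dt, fields in initial.items()}
        self._pending: dict[int, Suggestion] = {}
        self._next_id = 0

    def current(self, doc_type: str):
        """Read path for the processing side: (version number, read-only view)."""
        versions = self._versions[doc_type]
        return len(versions), MappingProxyType(versions[-1])

    def suggest(self, suggestion: Suggestion) -> int:
        """Write path for the Supervisor: queues a suggestion, never applies it."""
        suggestion_id = self._next_id
        self._next_id += 1
        self._pending[suggestion_id] = suggestion
        return suggestion_id

    def review(self, suggestion_id: int, approved: bool) -> int:
        """Human decision: only an approval creates a new configuration version."""
        s = self._pending.pop(suggestion_id)
        if approved:
            latest = self._versions[s.doc_type][-1]
            self._versions[s.doc_type].append({**latest, s.field: s.new_description})
        return self.current(s.doc_type)[0]
```

Keeping old versions rather than overwriting them means a past extraction can be traced back to the exact descriptions that were active when it ran.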

Why Two Agents Instead of One?

Reliability vs. Adaptability

A single agent that both processes documents AND learns from feedback creates a dangerous coupling. If the learning affects the processing logic, you lose determinism:

  • Yesterday’s correct extraction might become wrong today because the agent “learned” something from a different document
  • You can’t reproduce a previous result because the agent’s state has changed
  • You can’t audit the decision because the reasoning is entangled with learning

The Two-Agent Pattern solves this by separating concerns:

Concern           | Document Processing Agent    | Supervisor Agent
------------------+------------------------------+----------------------------------------
When it runs      | Real-time, on every document | Offline, periodically
What it does      | Processes documents          | Analyzes results, suggests improvements
State             | Stateless                    | Stateful (tracks patterns over time)
Determinism       | Fully deterministic          | Non-deterministic (creative analysis)
Human oversight   | Not needed per document      | Required for approving changes

Enterprise Requirements

Enterprises need:

  • Reproducibility — given the same input, produce the same output (for audit)
  • Explainability — explain why a particular value was extracted (for compliance)
  • Controlled change — changes to processing logic go through approval (for governance)

The Two-Agent Pattern delivers all three: the Processing Agent provides reproducibility and explainability, while the Supervisor Agent provides improvement with human-approved change control.
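
Reproducibility is easier to audit when each result records exactly which document and which configuration produced it. The sketch below is one possible way to do that with content hashes; the record layout and function names are assumptions for illustration, not a description of how DocAI Fabric stores its traces.

```python
import hashlib
import json

def fingerprint(obj) -> str:
    """Stable SHA-256 of a JSON-serializable object (keys sorted, fixed separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit_record(document_bytes: bytes, config: dict, config_version: int, output: dict) -> dict:
    """Record what went in and what came out, without storing the payloads themselves."""
    return {
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "config_version": config_version,
        "config_sha256": fingerprint(config),
        "output_sha256": fingerprint(output),
    }

def same_inputs(a: dict, b: dict) -> bool:
    """True when two records describe the same document under the same configuration."""
    return all(a[key] == b[key] for key in ("document_sha256", "config_sha256"))

def reproducibility_violated(a: dict, b: dict) -> bool:
    """Same inputs but different outputs is what an auditor needs to hear about."""
    return same_inputs(a, b) and a["output_sha256"] != b["output_sha256"]
```

With records like these, re-running an old document under its original configuration version and comparing `output_sha256` gives a direct check of the reproducibility requirement.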

Implementation Architecture

                  ┌─────────────────────────────┐
                  │  Supervisor Agent           │
                  │  (offline, analytical)      │
                  │                             │
                  │  • Analyze results          │
                  │  • Identify error patterns  │
                  │  • Suggest config changes   │
                  │  • Learn from corrections   │
                  └──────────────┬──────────────┘
                                 │
                                 │  Approved changes
                                 │  (updated prompts,
                                 │   field descriptions,
                                 │   business rules)
                                 ▼
┌───────────┐     ┌─────────────────────────────┐     ┌───────────┐
│ Documents │ ──► │  Document Processing Agent  │ ──► │ Extracted │
│           │     │  (real-time, deterministic) │     │   Data    │
└───────────┘     │                             │     └───────────┘
                  │  • OCR                      │
                  │  • Classify                 │
                  │  • Extract (with schema)    │
                  │  • Validate (business rules)│
                  │  • Retry on failure         │
                  └─────────────────────────────┘

The two agents share a configuration store (document type definitions, field schemas, prompt templates, business rules) but operate independently. The Processing Agent reads configuration; the Supervisor Agent writes suggestions that humans approve.

Getting Started

DocAI Fabric implements the Two-Agent Pattern out of the box. When you set up a new document type:

  1. The Document Processing Agent starts processing immediately using your natural language descriptions
  2. As you review and correct results, the Supervisor Agent learns from your feedback
  3. Over time, accuracy improves — but every change goes through your approval

The result: production-grade document processing from day one, with continuous improvement built in.

Try the platform and see the Two-Agent Pattern in action.
