Essay March 4, 2026

The Anatomy of a Document Agent

What does it actually mean for a document to “decide” something? The four components that give an invoice or contract genuine agency.

We’ve talked about treating invoices and contracts as autonomous agents rather than passive data objects. But what does that actually mean? How does a document “decide” anything? What gives it agency?

This post breaks down the mechanics.

The Four Components of a Document Agent

Every document-agent has four essential components:

1. Identity and State

An invoice knows what it is. Not just the data fields (amount, customer, due date) but its complete situational identity. In its own voice, that might read:

I am invoice #4521
I am owed by Acme Corp, a customer since 2019 with a 94% on-time payment rate
I am 12 days past due
I have been acknowledged but not disputed
Two reminder emails have been sent, both opened but not responded to
The customer’s AP contact is Sarah Chen, who typically responds within 48 hours
Acme Corp has three other open invoices with us, all current

This isn’t a database record. It’s self-awareness. The document maintains a living understanding of where it stands and what context surrounds it.
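To make that concrete, here is a minimal sketch of how such a situational identity might be held in code. Nothing here describes a real system: the InvoiceState class, its field names, and the describe method are all assumptions introduced for illustration. Only the values come from the example invoice above.

```python
from dataclasses import dataclass


@dataclass
class InvoiceState:
    """Hypothetical container for an invoice-agent's situational identity."""
    invoice_number: int
    customer: str
    customer_since: int
    on_time_rate: float            # fraction of past invoices paid on time
    days_past_due: int
    acknowledged: bool
    disputed: bool
    reminders_sent: int
    reminders_opened: int
    reminders_answered: int
    ap_contact: str
    typical_response_hours: int
    other_open_invoices: int
    other_invoices_current: bool

    def describe(self) -> list[str]:
        """Render part of the state as first-person statements."""
        return [
            f"I am invoice #{self.invoice_number}",
            f"I am owed by {self.customer}, a customer since "
            f"{self.customer_since} with a {self.on_time_rate:.0%} "
            f"on-time payment rate",
            f"I am {self.days_past_due} days past due",
        ]


# Values taken from the example invoice in the text.
invoice = InvoiceState(
    invoice_number=4521, customer="Acme Corp", customer_since=2019,
    on_time_rate=0.94, days_past_due=12, acknowledged=True, disputed=False,
    reminders_sent=2, reminders_opened=2, reminders_answered=0,
    ap_contact="Sarah Chen", typical_response_hours=48,
    other_open_invoices=3, other_invoices_current=True,
)
print("\n".join(invoice.describe()))
```

The point of the sketch is that the record carries behavioral context (response habits, reminder history, sibling invoices) alongside the usual accounting fields.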

2. Goals and Priorities

An invoice-agent has an objective: get paid. But that’s not a simple goal. It’s a hierarchy of priorities that shift based on context.

Primary goal: Collect the full amount owed.

Secondary goal: Maintain the customer relationship.

Tertiary goal: Minimize collection cost and effort.

These goals sometimes conflict. An aggressive collection approach might accelerate payment but damage a valuable relationship. The invoice-agent has to navigate this tension.

Different invoices have different goal weightings. A $50,000 invoice to a strategic account prioritizes relationship preservation. A $500 invoice to a one-time customer prioritizes efficiency.

The document knows which category it falls into.
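One way to picture goal weightings is as a weighted score over candidate actions. The sketch below is purely illustrative: GoalVector, weights_for, ACTION_EFFECTS, and every number in it are assumptions made up for this example, not measured values or a prescribed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoalVector:
    """One number per goal; used both for weights and for action effects."""
    collect: float       # collect the full amount owed
    relationship: float  # maintain the customer relationship
    cost: float          # minimize collection cost and effort


def weights_for(strategic_account: bool) -> GoalVector:
    """Return an illustrative weighting; the numbers are invented."""
    if strategic_account:
        return GoalVector(collect=0.4, relationship=0.5, cost=0.1)
    return GoalVector(collect=0.5, relationship=0.1, cost=0.4)


# Assumed effect of each action on each goal, from -1 (hurts) to 1 (helps).
ACTION_EFFECTS = {
    "firm_demand": GoalVector(collect=0.8, relationship=-0.6, cost=0.2),
    "personal_call": GoalVector(collect=0.6, relationship=0.5, cost=-0.7),
}


def best_action(weights: GoalVector) -> str:
    """Pick the action with the highest weighted score."""
    def score(effects: GoalVector) -> float:
        return (weights.collect * effects.collect
                + weights.relationship * effects.relationship
                + weights.cost * effects.cost)
    return max(ACTION_EFFECTS, key=lambda name: score(ACTION_EFFECTS[name]))


print(best_action(weights_for(strategic_account=True)))
print(best_action(weights_for(strategic_account=False)))
```

With these invented numbers, the strategic account favors the relationship-preserving call and the one-time customer favors the cheaper, firmer message, which is the tension described above.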

3. Perception and Triggers

An agent must sense its environment. An invoice-agent monitors for:

Payment events (partial payment received, payment initiated, payment failed)
Communication events (email opened, response received, out-of-office reply)
Customer context changes (new invoices issued, disputes filed on other invoices, credit risk signals)
Calendar events (due date approaching, grace period expiring, escalation threshold reached)
Relationship signals (customer mentioned in CRM notes, sales team flagged the account, executive involvement)

Each of these inputs can trigger the agent to re-evaluate its situation and consider action.

This is different from workflow automation. A workflow triggers on a single condition: “if 30 days past due, send template B.” An agent perceives a full context and decides what that context implies.
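The contrast can be sketched in a few lines. The names below (EventKind, Event, workflow_rule, agent_context) are hypothetical, and the event categories simply mirror the list above. The workflow function looks at one condition, while the agent function gathers everything it has perceived into a single context for later reasoning.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventKind(Enum):
    PAYMENT = "payment"
    COMMUNICATION = "communication"
    CUSTOMER_CONTEXT = "customer_context"
    CALENDAR = "calendar"
    RELATIONSHIP = "relationship"


@dataclass(frozen=True)
class Event:
    kind: EventKind
    detail: str


def workflow_rule(days_past_due: int) -> Optional[str]:
    """Single-condition trigger: 'if 30 days past due, send template B'."""
    return "send template B" if days_past_due == 30 else None


def agent_context(days_past_due: int, events: list[Event]) -> dict:
    """Collect every perceived signal, grouped by kind, into one context."""
    signals = {
        kind.value: [e.detail for e in events if e.kind is kind]
        for kind in EventKind
    }
    return {
        "days_past_due": days_past_due,
        "signals": signals,
        "needs_reevaluation": bool(events),  # any new input prompts a rethink
    }


events = [
    Event(EventKind.COMMUNICATION, "email opened"),
    Event(EventKind.RELATIONSHIP, "sales team flagged the account"),
]
print(workflow_rule(12))
print(agent_context(12, events)["needs_reevaluation"])
```

At 12 days past due the workflow rule has nothing to say, while the agent already has two signals worth thinking about.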

4. Decision and Action

Here’s where agency lives. Given its identity, goals, and perception of current circumstances, the document decides what to do next.

This isn’t rule-following. It’s reasoning.

The invoice evaluates: I’m 12 days past due. Two emails sent, both opened, no response. This customer has a 94% on-time rate. Their AP contact usually responds within 48 hours. She hasn’t. That’s unusual.

Options: Send a third email. Call the AP contact. Escalate to AR manager. Wait another 48 hours. Check if there’s an unlogged dispute.

Decision: Before escalating, check whether Sarah Chen is out of office or whether Acme Corp has filed a dispute I’m not aware of. If neither, send a different type of message: not a reminder, but a genuine inquiry asking whether something is wrong.

The invoice took an action that no workflow would have prescribed. It recognized a pattern anomaly and adapted.
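A real agent would presumably hand this reasoning to a model, not to hard-coded branches. Still, the shape of the inputs and the output can be shown with a small sketch. Everything here is an assumption: the Situation fields, the function names, and in particular what happens when the contact is out of office or a dispute turns up, which the example above leaves open.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    hours_since_last_open: float   # time since the last reminder was opened
    typical_response_hours: float  # how quickly this contact usually replies
    contact_out_of_office: bool
    unlogged_dispute_found: bool


def silence_is_unusual(s: Situation) -> bool:
    """Flag silence that has outlasted the contact's usual response time."""
    return s.hours_since_last_open > s.typical_response_hours


def next_step(s: Situation) -> str:
    """Walk the pre-escalation checks from the example, in order."""
    if not silence_is_unusual(s):
        return "wait"
    if s.contact_out_of_office:
        return "wait for contact to return"  # assumed handling
    if s.unlogged_dispute_found:
        return "route to dispute handling"   # assumed handling
    return "send genuine inquiry"


print(next_step(Situation(72, 48, False, False)))
```

The anomaly check is the interesting part: the same 72 hours of silence would not be unusual for a contact who normally takes a week to reply.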

The Agent Loop

Document-agents don’t act once and wait. They run a continuous loop:

Observe → What has changed since my last evaluation? Any new payments, communications, context signals?

Orient → Given my current state and goals, what does this new information mean? Am I on track? Is something unexpected happening?

Decide → What are my options? Which option best serves my goal hierarchy given current context?

Act → Execute the chosen action. Send the communication. File the escalation. Initiate the call.

Update → Record what I did, what response I received, and how this changes my state.

This loop is driven by relevance, not by a schedule. When something meaningful changes, the agent wakes up and evaluates.
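The five steps above can be sketched as a small event-driven loop. This is a toy under stated assumptions: the InvoiceAgent class, its method bodies, and the event dictionaries are invented for illustration, and the decision step is deliberately trivial.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InvoiceAgent:
    amount_due: float
    history: list = field(default_factory=list)

    def observe(self, event: dict) -> dict:
        # Observe: what has changed? Fold new payments into current state.
        if event["type"] == "payment_received":
            self.amount_due -= event["amount"]
        return event

    def orient(self, event: dict) -> str:
        # Orient: given current state and goals, where do I stand?
        return "settled" if self.amount_due <= 0 else "outstanding"

    def decide(self, assessment: str) -> str:
        # Decide: pick the option that best serves the goals (toy logic).
        return "close_invoice" if assessment == "settled" else "send_follow_up"

    def act(self, action: str) -> str:
        # Act: a real agent would send an email or file an escalation here.
        return f"executed {action}"

    def update(self, event: dict, action: str, result: str) -> None:
        # Update: record what happened so the next pass starts from it.
        self.history.append((event["type"], action, result))

    def handle(self, event: dict) -> None:
        observed = self.observe(event)
        action = self.decide(self.orient(observed))
        self.update(observed, action, self.act(action))


# The agent wakes only when an event arrives, not on a timer.
inbox = deque([
    {"type": "email_opened"},
    {"type": "payment_received", "amount": 1000.0},
])
agent = InvoiceAgent(amount_due=1000.0)
while inbox:
    agent.handle(inbox.popleft())
print([action for _, action, _ in agent.history])
```

Driving the loop from an inbox of events, not from a scheduler, is what makes it run on relevance: no event, no evaluation.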

How Contracts Work as Agents

Contracts are more complex because they govern ongoing relationships with multiple obligations flowing in both directions.

A contract-agent maintains:

Obligation tracking. Both parties have commitments. The contract knows who owes what, by when. It monitors fulfillment on both sides.

Term interpretation. The contract understands its own clauses. Not as text, but as operational logic. If Section 4.2 says payment is due within 30 days of milestone completion, the contract knows to watch for milestone completion and start the payment clock.

Amendment history. Contracts evolve. The agent carries the full history of changes, superseded terms, and negotiated exceptions.

Remedies and escalation paths. When a breach occurs, the contract knows what remedies it contains. Cure periods. Penalty clauses. Termination rights. It can invoke these autonomously or recommend them to human decision-makers.
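As an illustration of a clause treated as operational logic, here is a minimal sketch of the Section 4.2 example, where payment is due within 30 days of milestone completion. The MilestonePaymentClause class and its method names are hypothetical, and the sketch assumes calendar days, which the example clause does not specify.

```python
from datetime import date, timedelta
from typing import Optional


class MilestonePaymentClause:
    """Hypothetical operational form of 'due within 30 days of completion'."""

    def __init__(self, days_to_pay: int = 30) -> None:
        self.days_to_pay = days_to_pay
        self.payment_due: Optional[date] = None  # clock has not started yet

    def on_milestone_completed(self, completed_on: date) -> date:
        """Start the payment clock when the milestone is completed."""
        self.payment_due = completed_on + timedelta(days=self.days_to_pay)
        return self.payment_due

    def is_overdue(self, today: date) -> bool:
        """True only once the clock has started and the deadline has passed."""
        return self.payment_due is not None and today > self.payment_due


clause = MilestonePaymentClause()
due = clause.on_milestone_completed(date(2026, 3, 4))
print(due.isoformat())
print(clause.is_overdue(date(2026, 4, 4)))
```

Before the milestone event arrives, the clause has no deadline at all, which is the behavior the text describes: the contract watches for completion and only then starts counting.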

A contract-agent’s loop is similar but slower. It’s less about daily collection activity and more about longer-cycle monitoring:

Is the counterparty performing?
Are we performing?
Is renewal approaching?
Have circumstances changed enough to warrant renegotiation?
Has a trigger event occurred that invokes specific provisions?

Coordination Between Document Agents

An invoice doesn’t exist in isolation. A customer might have ten invoices outstanding. These need to coordinate.

If Acme Corp has three invoices past due, you don’t want three separate agents sending three separate collection emails on the same day. Document-agents communicate with each other.

Invoice #4521 to Invoice #4522 and #4523: Acme Corp is past due on all three of us. I’ll take the lead on communication. You two wait. If I get a response, I’ll share the context.

This coordination extends to contracts. If the Acme Corp master services agreement contains provisions about payment terms, the invoices issued under that agreement inherit that context. The contract-agent and invoice-agents share information.
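The lead-taking behavior can be sketched as a simple grouping step. How a lead is actually chosen is not specified here, so the rule below (lowest invoice number leads) is an assumption picked only because it reproduces the example; assign_leads is likewise a hypothetical name.

```python
from collections import defaultdict


def assign_leads(past_due: list[tuple[str, int]]) -> dict[str, dict]:
    """Group past-due invoices by customer and choose one lead per customer.

    Assumed rule: the lowest invoice number takes the lead; the rest wait.
    """
    by_customer: dict[str, list[int]] = defaultdict(list)
    for customer, invoice_number in past_due:
        by_customer[customer].append(invoice_number)

    plan = {}
    for customer, numbers in by_customer.items():
        lead = min(numbers)
        plan[customer] = {
            "lead": lead,
            "waiting": sorted(n for n in numbers if n != lead),
        }
    return plan


plan = assign_leads([
    ("Acme Corp", 4522),
    ("Acme Corp", 4521),
    ("Acme Corp", 4523),
])
print(plan["Acme Corp"])
```

Whatever the selection rule, the useful property is that each customer hears from one voice at a time while the other invoices hold back and share context.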

Where Humans Fit

Document-agents don’t replace human judgment. They surface the moments where human judgment is needed.

Most collection activity doesn’t require a human. “Send a reminder email on day 7” is not a decision that benefits from human involvement.

But “Acme Corp’s CFO just called our CEO about cash flow problems” is a context that changes everything. The invoice-agent recognizes this signal, pauses its autonomous activity, and escalates for human guidance.

The agent’s job is to handle the routine so that humans focus on the exceptional. It’s also to recognize the exceptional when it appears.

The Implementation Reality

This isn’t science fiction. The components exist:

State management: Modern databases and document systems can maintain rich, contextual records.
Goal specification: Objectives and priorities can be encoded and adjusted per document type or customer segment.
Perception: Event streams, webhooks, and integration platforms provide real-time signals.
Decision and action: Large language models can reason about context, evaluate options, and generate appropriate communications.

The technical infrastructure for document-agents exists. What’s missing is the architectural decision to build this way: to treat documents as the locus of intelligence rather than as passive inputs to centralized systems.

Chris Couch is Head of Product for B2B at Flywire. He writes about AI in B2B finance.