What is provenance in agent systems?
Provenance is knowing who wrote what, when, based on what input, and whether it is authoritative.
Four answers, every write. That is the whole concept. The rest of this article unpacks why those four answers are non-negotiable, what breaks when they are missing, and why provenance is the bridge between shared state and shared trust.
Shared state without provenance
The promise of shared state is that humans and agents read and write the same surface. Plans, tasks, drafts, decisions, contributions. One canonical place, everyone aligned.
The problem with that promise is that a shared surface without provenance is just a wall of edits. You read an entry. You do not know if a senior researcher wrote it or a confused agent on its third retry overwrote it. You do not know if it reflects the latest user input or a hallucinated detail from yesterday. You do not know if it is the agreed plan or somebody's draft that never got finalized.
Shared state without provenance is shared ambiguity. The surface looks authoritative. It is not. Every reader has to decide for themselves whether to trust each entry, and they do not have the information to decide.
This is the failure mode that quietly kills multi-agent systems. Not the dramatic kind where an agent goes off the rails. The slow kind, where stale entries accumulate, wrong answers blend with right answers, and trust in the surface erodes until nobody reads it anymore. The system technically still works. The state technically still exists. But the humans have stopped relying on it, and the agents are reading garbage.
The four questions
Every write to shared state needs to answer four questions. None of them are optional.
Who. Which agent or human created this entry. Not a generic label. The specific identity. The drafter agent at version two, the senior reviewer named in the org, the user who logged in at this timestamp. Identity is the first thing a reader needs to know because everything else hinges on who is talking.
When. The timestamp. Not just the date. The exact moment of the write. Time matters because freshness matters. An entry from an hour ago carries more weight than one from a week ago. A reader who does not know when something was written cannot judge whether it still applies.
Based on what. The input that produced this entry. The user message that triggered it. The previous state version it was responding to. The retrieved document it cited. The reasoning trace, summarized or linked. Without this, the entry is a claim with no foundation. With it, the claim becomes inspectable.
Authority. Whether this entry is final or still in motion. Draft or published. Tentative or agreed. Proposed or accepted. Authority is the answer to the question every reader silently asks: should I act on this. An entry that does not declare its authority leaves every reader to guess.
Four answers. Every write. Anything less than that is a system that produces uncertainty as a byproduct of its own operation.
What goes wrong without it
Three failure modes dominate.
The first is memory rot. An agent writes a plan into shared state. The plan was right at the time. The user changes their mind, the constraints shift, the world moves on. Nobody updates the entry. A week later another agent retrieves the old plan and acts on it. The output is technically consistent with the state. It is also wrong. There was no signal in the entry that it was no longer authoritative, because there was no authority field, no timestamp anyone checked, no provenance trail showing what input it was responding to.
Memory rot is undetectable in the moment. You only notice it after the agent has produced something based on stale truth, and by then the cost is already paid. The fix is not better memory. The fix is provenance that lets a reader tell stale from fresh at a glance.
The second is stale entries quietly winning retrieval. An agent searches the state for relevant context. The search returns five entries. Three of them are current. Two are old and wrong. Without provenance, the agent treats all five equally. The two stale entries influence the next action just as much as the three fresh ones. The agent's output is a weighted average of truth and obsolescence, and nobody can tell from the output which was which.
This is how systems develop a slow drift. Each interaction is mostly right. Some fraction of each interaction is contaminated by stale state. Over time the contamination compounds. Outputs that would have been reliable become subtly wrong in ways that are very hard to diagnose. The agents look fine. The state looks fine. The system is decaying.
The third is oversight as forensic reconstruction. A human notices something went sideways. They want to figure out what happened. Without provenance, they have to interview each agent in turn, ask what it did, when, based on what. They have to piece together the sequence of writes from logs that were not designed for this. The reconstruction is approximate at best. Most of the time it is wrong.
This is the opposite of structural oversight. Real oversight reads the state and sees what happened. Forensic oversight tries to recover what happened after the fact, and pays the full cost of the investigation every time. A system without provenance taxes its operators with forensic work every time something needs to be reviewed.
Provenance as the trust primitive
Trust in an agent system is not a feeling. It is a property of the surface. A surface where every entry tells you who wrote it, when, based on what, and with what authority is a surface you can trust. A surface that hides any of those is a surface that requires faith.
Trust built on faith does not scale. The first time the system produces a bad output, the faith collapses, and now the operator has to read every entry skeptically. Trust built on provenance scales. The same surface that lets you act quickly when things are going well lets you investigate quickly when they are not. The information you need is already there.
This is the bridge from shared state to shared trust. Shared state without provenance is collaboration without accountability. Shared state with provenance is collaboration with a paper trail that anyone can read. The trail is the trust.
The teams that take this seriously treat provenance as part of the data model, not as a logging concern. The four fields are columns on the entry, not metadata stashed in a separate table. Every read surfaces them. Every write requires them. The validation that prevents a malformed entry is the same validation that prevents an entry without proper provenance.
Provenance and authority drift
There is a subtle thing about authority that matters more than it looks.
A draft becomes a final. A proposal becomes a decision. A tentative plan becomes an agreed plan. These transitions are themselves writes, and they themselves need provenance. Who promoted the draft. When. Based on what review. With what authority did they have the power to promote it.
Without that, you get authority drift. An agent decides on its own that a draft is final and starts citing it as such. Other agents read the cited draft and treat it as authoritative. The provenance trail looks intact at the entry level, but the transitions between states are unaccounted for. The surface technically has provenance, but the meaning of the entries has shifted without anyone noticing.
The fix is to treat state changes as first-class writes. Every promotion, demotion, retraction, or revision has the same four-field provenance as the original entry. The surface gets denser, but the denseness is what keeps the meaning stable as the work moves.
What this looks like in practice
In a system that takes provenance seriously, you can answer a reader question like "is this still the plan" by looking at one entry. You see who wrote it last. You see when. You see what input they were responding to. You see whether it is marked draft, proposed, or agreed. You make a judgment in seconds.
In a system that does not, the same question is a research project. You search the chat history. You ask the agents what they think. You try to remember when the last meaningful change happened. You guess.
The cost difference between those two operations, multiplied across every read in the system, is the cost of provenance. Cheap to add when you build the surface. Brutally expensive to retrofit when the surface is already in use.
How to add it now
If you already have shared state and it does not have provenance, you do not need to rebuild. You need to add the four fields to every write path and start enforcing them. New entries get full provenance. Old entries get a flag that says provenance is unknown, which is itself useful information for any reader.
Then you wait. As the system operates, new entries replace old ones. Provenance coverage grows. Within a few weeks of normal operation, most of the state has clean provenance, and the readers can rely on what they are reading.
The teams that do this stop having mysterious bugs that take days to investigate. The bugs they have are still real, but the investigation takes minutes because the trail is in the state itself.
The shape of the work
Provenance is unglamorous. It is fields on records. Timestamps. Identity tokens. References to input. Authority flags. None of it looks like progress in a demo.
It is the difference between a system you can trust and a system you have to babysit. Every multi-agent system eventually gets to a size where babysitting stops being feasible. The teams that built provenance from the start keep moving. The teams that did not get stuck explaining what their agents did, why, and whether it was right.
The trust is in the trail. Build the trail.