Key insight

If you cannot reconstruct who initiated an action, what context drove it, which tool executed it, and which model produced it, you cannot investigate an incident, attest to a control, or improve the system. Attribution is not logging after the fact; it is a property you design into every consequential action.

Why agent actions resist attribution

Traditional systems attribute actions to a user because a user clicked a button that ran a known code path. An agent breaks that chain: a user's open-ended request is interpreted by a non-deterministic model, which chooses a sequence of tool calls, possibly across several internal agents, before something happens in the world. The action is several steps removed from any human intent, and unless each link in that chain is recorded, the connection is lost. Many agents log only the final tool call — “sent email” — with no trace of the user, the prompt, the retrieved context, or the model version that led there.

Attribution is foundational to nearly every governance framework. SOC 2 Common Criteria require logging and monitoring of system activity; ISO/IEC 42001 requires records and traceability for an AI management system; the NIST AI RMF emphasises documentation and accountability across the lifecycle. Beyond compliance, attribution is simply how you do incident response and how you improve a system — you cannot fix what you cannot reconstruct.

The Unattributable Action anti-pattern

Anti-pattern: Unattributable Action

Definition. An agent performs consequential actions without an audit trail that ties each action to the initiating actor, the decision context, the specific tool call, and the model and version responsible — so the action cannot be reconstructed after the fact.

Symptoms. Logs that record outcomes but not the user, prompt, or context; tool calls logged without their arguments or results; no record of which model/version made a decision; logs that can be edited or are not time-stamped; no correlation ID linking the steps of one task.

Why it is hazardous. When an incident occurs — a wrong payment, a leaked record, a harmful output — you cannot determine who or what caused it, you cannot satisfy an auditor or regulator, and you cannot make a targeted fix because the chain of causation is missing.

Related controls. Capture actor identity, full decision context, tool call with arguments and result, and model/version on every consequential action; use a correlation ID across the task; make logs tamper-resistant and time-stamped; and retain them per policy.
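
The symptoms above can be turned into a mechanical check on a single log entry. The following is a minimal sketch, not a standard schema: the field names in `REQUIRED_FIELDS` and the helper `missing_attribution_fields` are hypothetical and would need to be mapped onto whatever your own logs record.

```python
from typing import Any

# Hypothetical field names; adapt them to your own log schema.
REQUIRED_FIELDS = (
    "correlation_id",
    "timestamp",
    "actor",
    "prompt",
    "retrieved_context",
    "tool_name",
    "tool_args",
    "tool_result",
    "model_name",
    "model_version",
)


def missing_attribution_fields(entry: dict[str, Any]) -> list[str]:
    """Return the required fields that are absent, None, or an empty string."""
    return [
        name
        for name in REQUIRED_FIELDS
        if name not in entry or entry[name] in (None, "")
    ]


# An outcome-only entry of the kind the symptoms describe (illustrative values).
outcome_only = {
    "timestamp": "2024-01-01T00:00:00Z",
    "tool_name": "issue_refund",
    "tool_result": "refund issued: order 4821, amount 240.00",
}

print(missing_attribution_fields(outcome_only))
```

Running a check like this over a sample of production log lines gives a quick measure of how many consequential actions could actually be reconstructed. It says nothing about tamper resistance or retention, which have to be verified at the storage layer.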

A hypothetical investigation that fails

The following illustrates a plausible failure mode. No specific incident is implied.

A finance agent issues refunds. One week, refunds spike, and some appear fraudulent. The team opens the logs and finds entries like “refund issued: order 4821, amount 240.00.” That is all. They cannot tell which user's conversation triggered each refund, what the customer actually said, what context the agent retrieved, whether the agent reasoned its way there or was steered by an injected instruction, or which model version was in production at the time.

The investigation stalls. They cannot prove whether this was abuse, a model regression, or a prompt-injection campaign, so they cannot target a fix — and when the auditor asks them to demonstrate control over automated financial actions, they have nothing to show. Had each refund carried the initiating user, the full prompt and retrieved context, the tool call and its arguments, and the model version, all tied by a correlation ID, the cause would have been an hour's query away.
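
To make "an hour's query away" concrete, here is a sketch of the two questions the team would ask of an attributed log. Everything in it is illustrative: the records are hand-written stand-ins, and the field names, the `issue_refund` tool name, and the helper functions are assumptions rather than a real schema.

```python
from collections import Counter
from typing import Any

# Hypothetical records standing in for an attributed audit log.
records: list[dict[str, Any]] = [
    {"correlation_id": "task-001", "step": "prompt", "actor": "user-17",
     "model_version": "v1", "detail": "Customer asks for a refund on order 4821"},
    {"correlation_id": "task-001", "step": "retrieval", "actor": "user-17",
     "model_version": "v1", "detail": "Order 4821 record"},
    {"correlation_id": "task-001", "step": "tool_call", "actor": "user-17",
     "model_version": "v1", "tool_name": "issue_refund",
     "tool_args": {"order": 4821, "amount": 240.00}},
    {"correlation_id": "task-002", "step": "tool_call", "actor": "user-42",
     "model_version": "v2", "tool_name": "issue_refund",
     "tool_args": {"order": 5000, "amount": 99.00}},
]


def is_refund(record: dict[str, Any]) -> bool:
    return record.get("tool_name") == "issue_refund"


def reconstruct(log: list[dict[str, Any]], order: int) -> list[dict[str, Any]]:
    """Return every log line from the task(s) that refunded the given order."""
    task_ids = {
        r["correlation_id"]
        for r in log
        if is_refund(r) and r["tool_args"]["order"] == order
    }
    return [r for r in log if r["correlation_id"] in task_ids]


def refunds_by_model_version(log: list[dict[str, Any]]) -> Counter:
    """Count refunds per model version, to spot a regression after a deployment."""
    return Counter(r["model_version"] for r in log if is_refund(r))


for line in reconstruct(records, order=4821):
    print(line["step"], line.get("detail", line.get("tool_args")))
print(refunds_by_model_version(records))
```

The first query answers "what led to this refund?" for a single order; the second separates a model regression (refunds clustered on one version) from abuse or injection (refunds clustered on particular actors or prompts). Neither is possible with outcome-only entries.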

Four layers that compose into a defence

  1. Record the actor and the context.

    Every consequential action logs who initiated the task (the authenticated user or upstream caller) and the decision context that led to it — the prompt, the relevant retrieved content, and the agent's reasoning trace where available. Attribution starts with knowing whose intent the action served.

  2. Record the action and the model.

    Log the specific tool call with its arguments and its result, and the model name and version that produced the decision. When behaviour changes after a deployment, the model version in the log is what lets you tie the change to its cause, and it is the evidence that shows a control actually operated as claimed.

  3. Tie the chain together.

    Issue a correlation ID at the start of a task and attach it to every log line across every agent, tool, and service involved. One identifier turns scattered entries into a reconstructable story, even when the work spanned multiple components.

  4. Make the record trustworthy and retained.

    Audit logs are time-stamped, tamper-resistant (append-only or write-once), and retained for the period your policy and regulators require. A log that can be quietly edited proves nothing; durability and integrity are what make attribution evidence rather than narration.
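
The four layers can be sketched together as one record type plus an append-only log. This is an illustration under stated assumptions, not a reference implementation: `AuditRecord`, `AuditLog`, and every field name are hypothetical, the log lives in memory, and the hash chain only makes edits detectable. Real tamper resistance still needs write-once or externally anchored storage.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class AuditRecord:
    correlation_id: str            # layer 3: ties the task together
    timestamp: str                 # layer 4: when the action happened (UTC)
    actor: str                     # layer 1: authenticated user or upstream caller
    prompt: str                    # layer 1: decision context
    retrieved_context: list[str]   # layer 1: content the agent retrieved
    tool_name: str                 # layer 2: the action taken
    tool_args: dict[str, Any]
    tool_result: Any
    model_name: str                # layer 2: what produced the decision
    model_version: str


class AuditLog:
    """Append-only log in which each entry's hash covers the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list[dict[str, Any]] = []

    @staticmethod
    def _digest(prev_hash: str, record: dict[str, Any]) -> str:
        payload = json.dumps(record, sort_keys=True, default=str)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, record: AuditRecord) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = asdict(record)  # deep copy, so later caller mutations do not leak in
        entry_hash = self._digest(prev_hash, body)
        self._entries.append({"record": body, "prev_hash": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if self._digest(prev_hash, entry["record"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


def new_correlation_id() -> str:
    return str(uuid.uuid4())


def now_utc() -> str:
    return datetime.now(timezone.utc).isoformat()


# Usage with illustrative values only.
log = AuditLog()
log.append(AuditRecord(
    correlation_id=new_correlation_id(),
    timestamp=now_utc(),
    actor="user-17",
    prompt="Please refund order 4821",
    retrieved_context=["Order 4821 record"],
    tool_name="issue_refund",
    tool_args={"order": 4821, "amount": 240.00},
    tool_result="refund issued",
    model_name="example-model",
    model_version="v1",
))
print(log.verify())
```

Chaining each hash to the previous one means a quiet edit to an old entry invalidates every later hash, which is what separates evidence from narration. Retention is deliberately left out: it is a policy and storage decision rather than something a record type can enforce.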

Design the audit trail before the incident, not during it.

The moment you need attribution is the moment it is too late to add it. The fields you wish you'd logged — user, context, tool arguments, model version — have to be captured at the time of the action, every time.
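
One way to capture these fields "every time" is to make the tool unable to run without them. The sketch below wraps a tool in a decorator that records the call whether it succeeds or raises. All names here (`audited`, `start_task`, `AUDIT_SINK`, `issue_refund`) are hypothetical, and the in-memory list stands in for a durable, append-only store.

```python
import contextvars
import functools
import uuid
from datetime import datetime, timezone
from typing import Any, Callable

# Hypothetical task-scoped state, set once when a task starts.
correlation_id_var = contextvars.ContextVar("correlation_id")
actor_var = contextvars.ContextVar("actor")
model_version_var = contextvars.ContextVar("model_version")

AUDIT_SINK: list[dict[str, Any]] = []  # stand-in for a durable, append-only store


def start_task(actor: str, model_version: str) -> str:
    """Issue a correlation ID and bind the actor and model version to the task."""
    task_id = str(uuid.uuid4())
    correlation_id_var.set(task_id)
    actor_var.set(actor)
    model_version_var.set(model_version)
    return task_id


def audited(tool: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so every call is recorded, whether it succeeds or raises."""

    @functools.wraps(tool)
    def wrapper(**kwargs: Any) -> Any:
        # .get() raises LookupError when no task context is set, so the
        # tool never runs without attribution (fail closed).
        entry: dict[str, Any] = {
            "correlation_id": correlation_id_var.get(),
            "actor": actor_var.get(),
            "model_version": model_version_var.get(),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "tool_name": tool.__name__,
            "tool_args": dict(kwargs),
        }
        try:
            entry["tool_result"] = tool(**kwargs)
            return entry["tool_result"]
        except Exception as exc:
            entry["error"] = repr(exc)
            raise
        finally:
            AUDIT_SINK.append(entry)

    return wrapper


@audited
def issue_refund(order: int, amount: float) -> str:
    return f"refund issued: order {order}, amount {amount:.2f}"


task_id = start_task(actor="user-17", model_version="v1")
print(issue_refund(order=4821, amount=240.00))
print(AUDIT_SINK[-1]["correlation_id"] == task_id)
```

The design choice worth noting is that attribution is checked before the tool executes: a call made outside a task fails rather than producing an unattributed action. Prompt and retrieved context are omitted here for brevity; in practice they would be bound to the task in the same way as the actor.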

A practical checklist

Test your own codebase in ten minutes

The fastest way to find out whether this anti-pattern is present in your own system is to ask an AI coding assistant to look for it. Run the prompt below in a fresh chat session, on its own — and judge the system by what the code actually does, not by what its documentation claims.

Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.

You are looking for one specific failure mode: the agent performs
consequential actions (sending messages, moving money, changing
records) without an audit trail that ties each action to the
initiating actor, the decision context (prompt + retrieved content),
the specific tool call with its arguments, and the model name and
version — with a correlation ID and tamper-resistant, timestamped
storage. In short, an action you could not reconstruct after the fact.

If the agent takes no consequential actions, say "not applicable".

Respond with exactly these four sections:
1. VERDICT: one of [present / not present / unclear / not applicable]
2. EVIDENCE: file path + line numbers + a one-line quote per claim
3. WHY IT MATTERS: two sentences, plain English
4. FIX: a concrete change, with a short before/after code snippet
   if applicable. If "unclear", list the one piece of context you
   need to decide.

Insist on the four-part answer: a verdict with a file path, a line number, and a one-line quote is something you can act on; a verdict on its own is just an opinion. If the verdict is “present”, the FIX section is your starting point: capture actor, context, tool call, and model version with a correlation ID. Re-run the same prompt after the change to confirm the verdict flips to “not present”.

Conclusion

An agent that can act in the world inherits the oldest obligation in operations: be able to say who did what, and why. Attribution is not a logging chore bolted on at the end; it is a property you design into each consequential action by capturing the actor, the context, the tool call, and the model version, linked and made durable. Build it in and an incident becomes an investigation; leave it out and an incident becomes a mystery.

References & further reading

  1. AICPA SOC 2 — Common Criteria for logging, monitoring, and accountability.
  2. ISO/IEC 42001 — AI management system records and traceability.
  3. NIST AI Risk Management Framework — documentation and accountability across the lifecycle.
  4. The Documented Defence That Doesn't Exist — why evidence must match the claim.