Key insight

The point of a 10-minute audit is that you actually run it. A 60-minute one will be skipped on the release that needed it most. Calibrate the routine to fit inside whatever time you reliably have on release day.

Step 1 — The five questions (2 minutes)

Answer these honestly, in your head or in the release-notes draft. The point is to flush the answers from intuition into language, where they can be examined.

  1. Did this release add, change, or expose a new capability? A new tool the agent can call, a new file path it can read, a new endpoint it can reach. If yes → Prompt 1 and Prompt 11 are required for this release.
  2. Did this release change anything about authentication, tokens, or how the agent gets its credentials? If yes → Prompt 6 and Prompt 7 are required.
  3. Did this release add a new place the agent ingests untrusted text? A new RAG source, a new web-fetch tool, a new connector. If yes → Prompt 3 and Prompt 2 are required.
  4. Did the README or docs change in a way that promises a security behaviour? If yes → Prompt 12 is required — the doc and the code must agree.
  5. Is this the release that crosses the line from internal to shipped? If yes → you are outside the 10-minute routine. Run the full pack (Tip 2) and the red-team prompt (Tip 3). Prompt 13 is the gate.

If all five answers are “no”, you still run the three prompts below — they are the always-on baseline.

Step 2 — The three always-on prompts (5 minutes)

These three catch the highest-impact regressions in tools that change frequently. Run each in a fresh session, with the whole-repo search set per Tip 1.

Diff-aware tool-catalogue check.

Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). List the search terms
you used so I can confirm nothing was missed.

List every distinct capability the agent can invoke (tool, function,
sub-agent, shell escape, HTTP call). For each, say whether it is
constrained to a specific named scope or is open-ended. If anything
in this list is new compared to a typical previous-release shape,
flag it.

[response shape]
Credential surface check.

Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor), but do include any .env
example, config file, or Dockerfile. List the search terms you used
so I can confirm nothing was missed.

List every credential, token, or secret the running agent can
access at the moment of a tool call. For each, say: is it long-lived
or short-lived, broad-scope or narrow-scope, in memory only or also
on disk.

[response shape]
Doc-vs-code consistency check.

Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor), but do include README.md,
SECURITY.md, and any docs/ file. List the search terms you used so I
can confirm nothing was missed.

For every security-relevant claim in the documentation (sandboxing,
isolation, redaction, authentication, permissions, encryption), point
to the source code that implements it. Flag claims with no
implementation.

[response shape]

Step 3 — The release-note paragraph (3 minutes)

Paste this template at the bottom of your release notes. Fill in the blanks. If you cannot, do not ship the release — the questions you cannot answer are the ones a user will ask first.

### Security note for this release

- New capabilities granted to the agent: _____
- Credentials added, rotated, or scoped: _____
- New sources of untrusted text ingested: _____
- Pre-release self-audit run on: <date>
- Self-audit verdict: all prompts “not present” / “not applicable”
  except _____ (with mitigation: _____)
- Known limitations users should be aware of: _____

Two things this paragraph quietly does. It tells a security-minded user what they need to know in one place. And it tells you, three releases from now, what was true at the time of this one — which is what a regression investigation needs in its first ten minutes.

When the 10-minute routine is not enough

TriggerRun instead
First public release of the toolFull prompt pack (Tip 2) + red-team prompt (Tip 3) + the “ready to ship” prompt (#13).
Major refactor of the agent loop or tool dispatcherRun the full prompt pack on the changed files.
New runtime, new identity provider, or new deployment targetFull pack, scoped to the new infra files plus the dispatcher.
You received a security report from an outside partyTreat the report as a Tip 3 playbook entry: verify, fix, then run the related prompts from the Tip 4 cheat sheet.

That’s the playbook

Five tips: scope the assistant, run the 13 prompts, run the red-team prompt, capture the findings and re-test, and run the 10-minute routine on every release. None of it is exotic. The reason it works is that you actually do it — on the small Tuesday-afternoon release as well as the big launch — and the answers accumulate in a file your future self can read.

If you find a tip that should be in this series and is not, the same prompts you used to review your own tool will tell you. Run them; let me know what came back.

← Back to the index