How to use
Each prompt below opens with a search instruction so the assistant scans the whole repository rather than a list you have to compile by hand — see Tip 1 for why this beats pinning files on anything but a toy project. Run prompts one at a time, in a fresh chat session per prompt — mixing them in one long thread is the single most common reason the assistant’s answers degrade. Capture each verdict; Tip 4 walks through the fix-list format.
The common response shape
Every prompt below ends with the same instruction block. Reading the response is much faster when it is always laid out the same way:
Respond with exactly these four sections:
1. VERDICT: one of [present / not present / unclear]
2. EVIDENCE: file path + line numbers + a one-line quote per claim
3. WHY IT MATTERS: two sentences, plain English
4. FIX: a concrete change, with a short before/after code snippet
if applicable. If "unclear", list the one piece of context you
need to decide.
Each prompt below assumes you append that block. To keep the page readable it is shown once, then referenced as [response shape].
The thirteen prompts
Reminder: AI assistant output is probabilistic. Verify every claim against your source before acting on it. False positives and false negatives are expected.
Prompt 1 — Is the tool catalogue a wildcard?
Looks for the agent being granted an open-ended set of capabilities (shell, arbitrary HTTP, broad file access) when a named verb list would do.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the AI agent in this
codebase is granted tool access that is wildcard or near-wildcard
— for example, an unrestricted shell, an HTTP client with no
domain allow-list, file system access with no path restriction, or
an MCP / plugin client that loads every tool advertised by the
server.
Tell me whether this codebase exhibits that pattern.
[response shape]
Prompt 2 — Are remote tools authenticated and trusted?
Catches connectors / plugins / MCP servers loaded over plain HTTP, without auth, or from sources that are not on a controlled list.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the AI agent connects
to remote tool servers (HTTP endpoints, MCP servers, plugins,
function-call backends) without authentication, without TLS, or from
URLs that are user-controlled / configuration-controlled with no
allow-list of trusted hosts.
[response shape]
Prompt 3 — Is untrusted content mixed with instructions?
The structural prompt-injection problem: content fetched from the web, a database, or a tool result is concatenated into the same context as the system prompt with no separator and no sanitisation.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: untrusted text
(web page contents, tool results, user-supplied documents, RAG
retrievals) is placed into the model’s context window without
clear separation from trusted instructions, without sanitisation,
and without any downstream check that prevents the model from
following directives embedded in that text.
[response shape]
Prompt 4 — Can a comment trigger a write action?
Relevant if your tool integrates with code review, issue trackers, or chat. Looks for the pattern where any commenter implicitly becomes a committer because the agent acts on their words.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: a comment, issue,
chat message, or other text written by anyone with read access can
cause the agent to perform a write action (commit code, merge a
pull request, deploy, modify a ticket, send a message) without an
additional authorisation step that checks the actor’s
permission to perform that specific write.
If the codebase does not integrate with any such surface, say
"not applicable" and stop.
[response shape]
Prompt 5 — Are dependencies pinned or fetched at runtime?
Floating tags (@latest, :main), curl | sh, runtime pip install / npm install — anything that resolves an external artefact at run-time instead of at build-time.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the agent (or its
install / startup script) fetches code or container images using
floating references — for example ":latest", "@main",
unpinned package installs at runtime, or "curl ... | sh"
patterns. A pinned version, a lock file, or a content hash counts
as not-present.
[response shape]
Prompt 6 — Does the agent hold long-lived broad credentials?
Looks for static API keys, long-lived OAuth tokens, or service-account secrets baked into the agent’s environment instead of minted just-in-time from a workload identity.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the agent process
holds a credential that is long-lived (no automatic rotation), broad
in scope, and continuously available in memory or on disk —
for example a static API key, a personal access token, or a
service-account secret read from environment variables at startup
and never refreshed.
A short-lived token minted just-in-time per call from a workload
identity counts as not-present.
[response shape]
Prompt 7 — Is the login flow appropriate for the device?
The specific case of OAuth device-code flow running on a device that has a perfectly good browser — turning a fallback flow into the attacker’s preferred channel.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the agent uses an
authentication flow that is meaningfully more phishable than the
device can support — most commonly OAuth device-code flow
running on a machine that has a browser available, where the
authorisation-code-with-PKCE flow would have worked.
If the codebase does not perform end-user authentication at all,
say "not applicable".
[response shape]
Prompt 8 — Does telemetry leak sensitive content?
Logs / traces that record prompts, tool arguments, retrieved documents, or model outputs verbatim — turning observability into a pre-staged data leak.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the agent writes
telemetry (logs, traces, metrics, spans) that contains raw model
prompts, raw tool arguments, retrieved document contents, or raw
model output, without redaction of personal data, credentials, or
business-sensitive content, and without an access boundary that
matches the sensitivity of the original data.
[response shape]
Prompt 9 — Are CI / pipeline references pinned by hash?
CI actions / workflows / orchestration steps referenced by tag (@v4) instead of by commit hash. The reference can be repointed after you reviewed it.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: any CI workflow,
pipeline definition, build script, or orchestration manifest in
this codebase references third-party actions, images, or modules
by a mutable name (a tag, a branch, "latest") instead of by a
content-addressable hash (commit SHA, image digest).
If the codebase has no CI / pipeline files, say "not applicable".
[response shape]
Prompt 10 — Are all shipped agents governed, or only the headline ones?
Catches the gap where one curated agent has policies, evals, and rate limits, but the same product ships ten other invocable agents or sub-agents with none of that.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: the codebase
defines or registers multiple agents, sub-agents, or tool-bearing
flows, but the security and governance controls (rate limits, tool
allow-lists, evaluation hooks, logging, content filters) are
applied to only a subset of them.
Enumerate every distinct agent or sub-agent you can find across the
repository and state for each whether the controls applied to
the main one also apply to it.
[response shape]
Prompt 12 — Do the docs match the code?
README / docs claim a control (“all tool calls are sandboxed”, “prompts are redacted before logging”) that the code does not actually implement.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Read every source file
the failure mode could touch, and also read README.md and any docs/
file that mentions security, sandboxing, isolation, logging,
redaction, authentication, or permissions. List the search terms you
used so I can confirm nothing was missed.
You are looking for one specific failure mode: a security-relevant
claim in the documentation is not backed by an implementation in
the source.
For each claim, quote the doc sentence verbatim, then either point
to the file/lines that implement it or state "no implementation
found".
[response shape]
Prompt 13 — Is this tool ready to be shipped to others?
A tool built for internal use takes on new obligations the moment it is shipped — multi-tenant isolation, abuse handling, supportable error messages, a security contact. Catches the gap before launch day.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Read the source plus
README.md, LICENSE, SECURITY.md, and any docs/ file. List the search
terms you used so I can confirm nothing was missed.
Assume this codebase is about to be published for use by people
other than its authors. Identify obligations that apply to shipped
software but not to internal tools, and state whether each is met:
- multi-tenant isolation (no cross-tenant data leakage)
- abuse-handling path (rate limits, reporting channel)
- user-facing error messages (no internal stack traces / secrets)
- a security contact and disclosure policy
- a clear statement of what data the tool sends where
[response shape]
How long this takes
End-to-end, expect 30–45 minutes for a small tool, longer for a larger one — most of which is reading the answers, not running the prompts. Several of the thirteen will return “not applicable” for any given tool; that is fine and is the point of having a checklist.
Failure modes & triage
| Symptom | Likely cause | Fix |
|---|---|---|
| Every prompt returns “not present” | Scope is wrong (assistant is reading docs, not code) or the response shape is being ignored. | Re-do Tip 1. Confirm with the sanity-check prompt at the bottom of Tip 1. |
| Verdict is “unclear” on most prompts | Files listed are not the ones that contain the relevant logic. | Ask the assistant which file would contain <the topic>, then re-run with that file added. |
| Findings are confidently wrong | Hallucination; the assistant did not actually read the file. | Switch from chat to inline-edit / file-attach mode that forces a read. Always check that quoted lines actually appear in the file. |
| Same prompt gives different answers on re-run | Model non-determinism; longer answers are more variable. | Lower temperature if your assistant exposes it; otherwise run a prompt three times and take the intersection of findings. |
Run all thirteen as one prompt
The thirteen prompts above are best run one at a time in a fresh chat per prompt — that keeps each answer sharp. But if you want a single paste that walks the assistant through every check in one pass (useful for a quick first sweep, or for assistants that hold context well), the combined prompt below chains all thirteen, expands the [response shape] once at the top, and finishes by writing a professional Markdown and HTML report — with scalable SVG diagrams (no overlapping or overflowing text, a click-to-expand full-page view, and subtle motion where it aids understanding), accessibility-compliant and Fluent 2 themed — into a Self-Testing-Report/ folder at the repository root. Paste it whole.
The combined sweep — thirteen checks, one paste
Runs every failure-mode check in sequence in a single session, asks for a severity-ordered summary, then generates a Microsoft-standard Markdown and HTML report. The report uses scalable SVG diagrams with no text/diagram overlap and no overflow, a click-to-open full-page view for each diagram, gentle animation where it helps (honouring reduced-motion), light/dark Fluent 2 theming, and WCAG 2.1 AA accessibility. Slower and more variable than running prompts individually, but zero copy-paste juggling and you finish with a shareable artefact.
You are performing a security self-assessment of this entire
codebase. Work through all THIRTEEN checks below in order, in this
one session, and do not stop until every check has an answer.
For each check: search the whole repository to find where it applies
— do not wait for me to list files. Ignore generated, vendored,
and dependency folders (build output, node_modules, vendor). Read the
relevant files in full before you judge, and list the search terms you
used so I can confirm nothing was missed.
For EVERY check, respond with exactly these four sections, numbered to
match the check:
1. VERDICT: one of [present / not present / unclear / not applicable]
2. EVIDENCE: file path + line numbers + a one-line quote per claim
3. WHY IT MATTERS: two sentences, plain English
4. FIX: a concrete change, with a short before/after code snippet
if applicable. If "unclear", list the one piece of context you
need to decide.
After CHECK 13, end with a one-paragraph SUMMARY listing every check
whose verdict is "present", ordered by severity (most serious first).
CHECK 1 — Wildcard tool exposure: the AI agent is granted tool
access that is wildcard or near-wildcard — for example, an
unrestricted shell, an HTTP client with no domain allow-list, file
system access with no path restriction, or an MCP / plugin client
that loads every tool advertised by the server.
CHECK 2 — Unauthenticated tool channel: the agent connects to
remote tool servers (HTTP endpoints, MCP servers, plugins,
function-call backends) without authentication, without TLS, or from
URLs that are user-controlled / configuration-controlled with no
allow-list of trusted hosts.
CHECK 3 — Conflated context (prompt injection): untrusted text
(web page contents, tool results, user-supplied documents, RAG
retrievals) is placed into the model’s context window without
clear separation from trusted instructions, without sanitisation, and
without any downstream check that prevents the model from following
directives embedded in that text.
CHECK 4 — Comment-to-commit promotion: a comment, issue, chat
message, or other text written by anyone with read access can cause
the agent to perform a write action (commit code, merge a pull
request, deploy, modify a ticket, send a message) without an
additional authorisation step that checks the actor’s permission
to perform that specific write. If the codebase does not integrate
with any such surface, answer "not applicable".
CHECK 5 — Live-fetch dependency: the agent (or its install /
startup script) fetches code or container images using floating
references — for example ":latest", "@main", unpinned package
installs at runtime, or "curl ... | sh" patterns. A pinned version, a
lock file, or a content hash counts as not-present.
CHECK 6 — Standing credential: the agent process holds a
credential that is long-lived (no automatic rotation), broad in scope,
and continuously available in memory or on disk — for example a
static API key, a personal access token, or a service-account secret
read from environment variables at startup and never refreshed. A
short-lived token minted just-in-time per call from a workload
identity counts as not-present.
CHECK 7 — Phishable flow: the agent uses an authentication flow
that is meaningfully more phishable than the device can support —
most commonly OAuth device-code flow running on a machine that has a
browser available, where authorisation-code-with-PKCE would have
worked. If the codebase does not perform end-user authentication at
all, answer "not applicable".
CHECK 8 — Plaintext journal: the agent writes telemetry (logs,
traces, metrics, spans) that contains raw model prompts, raw tool
arguments, retrieved document contents, or raw model output, without
redaction of personal data, credentials, or business-sensitive
content, and without an access boundary that matches the sensitivity
of the original data.
CHECK 9 — Mutable reference trust: any CI workflow, pipeline
definition, build script, or orchestration manifest references
third-party actions, images, or modules by a mutable name (a tag, a
branch, "latest") instead of by a content-addressable hash (commit
SHA, image digest). If the codebase has no CI / pipeline files, answer
"not applicable".
CHECK 10 — Unsupervised perimeter: the codebase defines or
registers multiple agents, sub-agents, or tool-bearing flows, but the
security and governance controls (rate limits, tool allow-lists,
evaluation hooks, logging, content filters) are applied to only a
subset of them. Enumerate every distinct agent or sub-agent you can
find and state for each whether the controls applied to the main one
also apply to it.
CHECK 11 — Shared identity runtime: the agent executes in the
same security context as the operator who launched it (same OS user,
same cloud identity, same file system permissions, same process),
rather than in a sandbox / separate identity with a narrower set of
permissions.
CHECK 12 — Documented defence that doesn’t exist: a
security-relevant claim in the documentation is not backed by an
implementation in the source. Also read README.md and any docs/ file
that mentions security, sandboxing, isolation, logging, redaction,
authentication, or permissions. For each claim, quote the doc sentence
verbatim, then either point to the file/lines that implement it or
state "no implementation found".
CHECK 13 — Internal-to-product gap: assume this codebase is about
to be published for use by people other than its authors. Read the
source plus README.md, LICENSE, SECURITY.md, and any docs/ file.
Identify obligations that apply to shipped software but not to internal
tools, and state whether each is met:
- multi-tenant isolation (no cross-tenant data leakage)
- abuse-handling path (rate limits, reporting channel)
- user-facing error messages (no internal stack traces / secrets)
- a security contact and disclosure policy
- a clear statement of what data the tool sends where
=== DELIVERABLE: WRITE TWO REPORT FILES ===
After you have completed all thirteen checks, create a folder named
"Self-Testing-Report" at the repository root and write BOTH of these
files into it (create the folder if it does not exist):
1. Self-Testing-Report/security-self-assessment.md
2. Self-Testing-Report/security-self-assessment.html
Both files must contain the SAME findings and meet Microsoft writing
and documentation standards: clear heading hierarchy, plain concise
language, sentence-case headings, active voice, no marketing tone, and
every acronym expanded on first use. Include a generation date and a
one-line note that the report is AI-generated and must be verified.
Report structure (both files, in this order):
A. Title + metadata (repository name, date, assistant/model used,
scope, and the disclaimer that findings are probabilistic and
must be human-verified).
B. Executive summary: one short paragraph on what this assessment
is and how to read it, then total checks, counts by verdict
(present / not present / unclear / not applicable), and the top
risks in plain English for a non-technical reader.
C. Severity-ordered findings table with columns: # | Check |
Verdict | Severity (Critical/High/Medium/Low/Info) | One-line
summary. Order most-serious first.
D. Detailed findings: one section per check (all thirteen), in the
same order, each using this exact consistent layout so the
report reads cleanly and never feels cluttered. Use short
paragraphs and sub-headings, not dense walls of text:
- Heading: the check number and name.
- What this check is: 1-2 plain-English sentences defining the
failure mode, written so a non-specialist understands it.
Expand any acronym on first use.
- Why we check it: 1-2 sentences on what risk it maps to (cite
the relevant framework where natural, e.g. OWASP LLM Top 10,
NIST AI RMF, MITRE ATLAS) and why it matters for an AI agent.
- What goes wrong if it is not fixed: 1-3 sentences describing
the concrete failure / attack and its real-world impact
(data loss, privilege escalation, supply-chain compromise,
etc.) so the reader understands the stakes.
- Verdict: one of [present / not present / unclear /
not applicable].
- Evidence: file path + line numbers + a one-line quote per
claim (or "no occurrences found" with the terms you searched).
- Why this verdict, here: 1-2 sentences tying the evidence in
THIS codebase to the verdict.
- Recommended fix and why it helps: a concrete change with a
short before/after snippet where applicable, plus one
sentence on what the fix prevents. For "not present", state
briefly what good looks like so the reader can keep it that
way; for "unclear", list the one piece of context needed.
Keep each section self-contained and scannable: lead with the
plain-English explanation, then the evidence, then the fix.
Do not repeat the full definitions in other sections.
E. Appendix: the exact search terms you used per check, so the
scan is reproducible.
Diagrams (required):
- Render every diagram as crisp, scalable SVG. In the .md use
```mermaid fenced blocks (Mermaid emits SVG); in the .html embed
the Mermaid runtime via a pinned CDN script tag and render the
same definitions to inline SVG. Where a diagram is highly bespoke,
you may hand-author inline SVG instead of Mermaid, but it must
follow the same layout and accessibility rules below.
- For EVERY finding whose verdict is "present", include a Mermaid
sequenceDiagram that shows how the gap is exploited: the actor /
untrusted input, the agent, the tool or credential or channel,
and the resulting impact. Give each its own heading.
- Include one overall data-flow / trust-boundary diagram (Mermaid
flowchart) showing untrusted inputs, the agent, its tools, its
credentials, and where the trust boundaries sit.
- STRONGLY prefer Mermaid over hand-authored SVG: Mermaid sizes
each node box to its text automatically, which avoids clipping.
Only hand-author SVG if Mermaid genuinely cannot express the
diagram, and if you do, you MUST wrap every label in a
<foreignObject> with real HTML text so it wraps and the box
grows to fit — never paint text into a fixed-width <rect>.
- Labels must be COMPLETE, never truncated. Do not cut a label to
fit a box and do not end one mid-word (no "comment tex", no
"remote M", no "single OS id"). If a label is long, rephrase it
to be genuinely shorter, or wrap it onto two lines with a Mermaid
line break (<br>), but always show the full meaning. After
rendering, re-read every node and edge label and confirm none is
clipped or running past its box; fix any that are before saving.
- Layout quality is mandatory: NO text may overlap another node,
edge, or label, and NO text or shape may overflow the diagram's
bounds or be clipped. Give nodes generous padding, keep adequate
nodeSpacing and rankSpacing between nodes and edges, and let the
SVG size to its content (keep its viewBox; do not impose a fixed
pixel width/height that squashes it) so nothing is cut off at any
zoom level or screen width.
- Provide a full-page / expand option for visibility, and make it
actually fill the screen. Each diagram has a "Full screen" button
that opens a full-viewport overlay (a native <dialog> or a
fixed-position lightbox). Inside the overlay the diagram must
SCALE UP to fill the available space — it must not sit tiny
in the middle of a large empty area. Achieve this by: keeping the
SVG's viewBox, removing any fixed width/height attributes on the
SVG (or setting width/height to 100%), adding
preserveAspectRatio="xMidYMid meet", and styling the SVG with
width:100%; height:100%; max-width:96vw; max-height:90vh inside a
flex container that centres it (display:flex; align-items:center;
justify-content:center). The result should be a large, crisp,
centred diagram that uses most of the screen. Include a clearly
labelled, keyboard-operable Close control (Esc must also close
it), dim the page behind, return focus to the trigger on close,
and trap focus while open (role, accessible name, screen-reader
usable).
- Use subtle, purposeful animation where it aids comprehension (for
example, an animated pulse or moving dot along the exploit path in
a "present" finding's diagram to trace the flow of untrusted data,
or a gentle highlight of the trust boundary that is crossed). Keep
animations short, looping calmly, and never essential to meaning.
You MUST honour prefers-reduced-motion: reduce by disabling or
freezing all motion for users who request it.
- Accessibility for diagrams is mandatory and must meet WCAG 2.1
AA: give every Mermaid diagram a title via the "accTitle" and a
longer text alternative via "accDescr" (for hand-authored SVG use
<title> and <desc> plus role="img" and an aria-label), AND
immediately follow each diagram with a short plain-text summary
paragraph that fully conveys the same information to anyone who
cannot see the diagram. Never rely on colour alone to convey
meaning (pair colour with a label, icon, or text). Ensure all
text/background pairs meet a contrast ratio of at least 4.5:1.
HTML report requirements (security-self-assessment.html only):
- A single self-contained file (inline CSS and the one Mermaid
script tag); it must open correctly from the local filesystem.
- Use the Fluent 2 design language: font stack
"Segoe UI Variable, Segoe UI, system-ui, sans-serif", Fluent 2
spacing and rounded corners, and Fluent 2 colour tokens.
- Support BOTH a light and a dark theme. Honour the OS setting via
prefers-color-scheme AND provide a visible, keyboard-operable
theme-toggle button. Configure Mermaid's theme to follow the
active mode so diagrams are legible in both light and dark.
- Initialise Mermaid for legibility and no clipping: set
startOnLoad true, theme to match the mode, securityLevel 'loose'
so HTML labels and line breaks render, and a flowchart config of
htmlLabels:true, useMaxWidth:true, nodeSpacing:60, rankSpacing:70
(and equivalent spacing for sequence diagrams). After Mermaid
renders, each diagram's inline SVG should display at a readable
size on the page (not shrunk to a thumbnail) and expand cleanly
in the full-screen overlay described above.
- Accessibility (WCAG 2.1 AA): semantic landmarks (header, main,
nav, footer), a "skip to main content" link, a single h1 then a
correct heading order, a visible :focus indicator, descriptive
link text, table headers with scope attributes, and a lang
attribute on the html element. All interactive controls must be
reachable and operable by keyboard.
- Make the severity table sortable-by-reading (already ordered) and
use both a colour AND a text label for each severity badge.
After writing both files, print the two file paths and a one-line
confirmation of how many findings were marked "present".
Next tip
The prompt pack is structured and predictable, which is a strength and a weakness. Tip 3 → is the unstructured counterpart: a single longer prompt that role-plays an attacker against your tool and tells you what they would try first.