Key insight
A nearest-neighbour search ranks by meaning, not by permission. If your retrieval layer does not constrain the search to what the caller is allowed to see — enforced on the server, not requested by the client — a shared vector store will eventually hand one tenant another tenant's data.
Similarity is not authorisation
A vector store answers the question “which chunks are closest to this query?” It says nothing about who is allowed to read them. When multiple tenants, teams, or trust levels share one index (the common design, because it is cheaper and simpler than per-tenant infrastructure), an unscoped query ranks every document in the store, and the top result may belong to a tenant whose data the caller has no right to read. The retriever did its job perfectly; the job was just the wrong one.
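To make the point concrete, here is a minimal sketch of an unscoped nearest-neighbour search over a shared index. Everything in it is illustrative: the `Chunk` type, the tenant names, and the hand-written three-dimensional embeddings are assumptions, not any real vector store's API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    embedding: tuple[float, ...]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def unscoped_search(index, query_embedding, k=1):
    # Ranks every chunk in the shared index; tenant_id is never consulted.
    ranked = sorted(index, key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return ranked[:k]


# Toy embeddings written by hand; a real system would get them from an embedding model.
index = [
    Chunk("tenant-a", "Q3 marketing plan", (0.1, 0.9, 0.2)),
    Chunk("tenant-b", "Q3 revenue figures", (0.9, 0.1, 0.3)),
]

query_from_tenant_a = (0.8, 0.2, 0.3)  # stands in for "what was Q3 revenue?"
top = unscoped_search(index, query_from_tenant_a)[0]
print(top.tenant_id)  # prints "tenant-b"
```

The search is correct as a similarity ranking and wrong as an access decision: the caller is tenant A, and the best match belongs to tenant B.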
This weakness is catalogued as LLM08, Vector and Embedding Weaknesses, in the 2025 edition of the OWASP Top 10 for LLM Applications. It covers both directions of the problem: data leaking out of the store to the wrong reader, and information leaking about the store through repeated probing, where an attacker who can query the index can infer or reconstruct content they were never shown. The root cause in both is the same: authorisation was not part of the retrieval decision.
The Porous Vector Store anti-pattern
Anti-pattern
Porous Vector Store
Definition. A retrieval system runs similarity search over an index that mixes documents from multiple tenants or trust levels, without enforcing — server-side — that results are limited to what the caller is authorised to read.
Symptoms. One shared index for all tenants with no per-query scoping; access filters supplied by the client and trusted by the server; permission checked when documents are written but not when they are retrieved; no re-check of document ACLs after retrieval; no logging that would reveal cross-tenant access.
Why it is hazardous. A single query can return another customer's data into the model's context and then into the answer, and a determined caller can probe the index to reconstruct content they were never granted — a confidentiality breach that often looks, in logs, like an ordinary search.
Related controls. Scope every query to the caller's tenant; enforce the filter server-side; prefer per-tenant indexes or namespaces for strong isolation; re-check permissions on retrieved documents; log and monitor retrieval for cross-boundary access.
A hypothetical leak
The following illustrates a plausible failure mode. No specific incident is implied.
A SaaS analytics product gives each customer an assistant that answers questions over the customer's uploaded documents. To keep costs down, every customer's chunks live in one large shared index, each tagged with a tenant_id. The application is supposed to add a tenant_id filter to each query, but in one code path — a newly added “related insights” feature — the filter is omitted.
A user at customer A asks a question. The retriever, now searching the whole index unfiltered, returns the three most similar chunks — one of which is a financial figure from customer B, who happens to operate in the same industry and uses similar language. The model weaves customer B's confidential number into customer A's answer. No alarm fires: from the system's point of view, a search ran and returned relevant results.
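The scenario can be reduced to a few lines. This is a sketch under stated assumptions, not a reconstruction of any real product: `search`, `answer_question`, and `related_insights` are hypothetical names, and the similarity scores are hard-coded rather than computed. What it shows is that an optional filter makes the omission silent.

```python
from typing import Optional

# Hypothetical shared index: (tenant_id, text, similarity score for the current query).
SHARED_INDEX = [
    ("customer-a", "Churn fell in March", 0.71),
    ("customer-b", "Confidential revenue figure", 0.93),
    ("customer-a", "Support tickets by region", 0.55),
]


def search(query: str, tenant_id: Optional[str] = None, k: int = 3):
    """Stand-in retriever. The filter is optional, so forgetting it raises no error."""
    rows = [r for r in SHARED_INDEX if tenant_id is None or r[0] == tenant_id]
    return sorted(rows, key=lambda r: r[2], reverse=True)[:k]


def answer_question(query, session_tenant):
    # Original code path: passes the tenant filter.
    return search(query, tenant_id=session_tenant)


def related_insights(query, session_tenant):
    # Newly added code path: the filter was left out, and nothing complains.
    return search(query)


safe = answer_question("revenue trend", "customer-a")
leaky = related_insights("revenue trend", "customer-a")
print([r[0] for r in safe])   # ['customer-a', 'customer-a']
print([r[0] for r in leaky])  # ['customer-b', 'customer-a', 'customer-a']
```

Both calls succeed and both return "relevant" results, which is why the second one leaves no obvious trace. Making `tenant_id` a required argument with no default would turn the omission into an immediate error instead of a leak.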
Four layers that compose into a defence
- Make authorisation part of the query.
Every retrieval call carries the caller's identity and tenant, and the search is constrained to documents that identity may read — via a mandatory metadata filter or, more strongly, a per-tenant index or namespace. Retrieval without a scope is rejected, not run wide.
- Enforce the scope server-side.
The tenant filter is applied by the trusted retrieval service from the authenticated session, never accepted as a client-supplied parameter. A client that can name the tenant it wants to read from can name someone else's.
- Re-check permissions on what comes back.
After retrieval, verify that each returned document is still readable by the caller before it enters the model's context. ACLs change; an index can lag. The post-retrieval check closes the gap between what was indexed and what is permitted now.
- Log retrieval and watch the boundary.
Record who queried, with what scope, and which documents were returned. Cross-tenant access — a query for tenant A returning a tenant B document — should be a detectable, alertable event, not an invisible one. Logging also surfaces probing patterns that signal reconstruction attempts.
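The four layers can be sketched together in one small retrieval service. This is an illustration, not a definitive implementation: `Session`, `Doc`, `ScopeError`, `RetrievalService`, and the `search_fn(query, tenant_id, k)` signature are all assumed names, and the ACL is a plain dictionary standing in for whatever permission system is actually in use.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("retrieval")


@dataclass(frozen=True)
class Session:
    """Authenticated session, built by the auth layer rather than from request parameters."""
    user_id: str
    tenant_id: str


@dataclass(frozen=True)
class Doc:
    doc_id: str
    tenant_id: str
    text: str


class ScopeError(Exception):
    """Raised when retrieval is attempted without a tenant scope."""


class RetrievalService:
    def __init__(self, search_fn, acl):
        self._search = search_fn  # hypothetical store call: (query, tenant_id, k) -> list of Doc
        self._acl = acl           # current ACLs: doc_id -> set of user_ids allowed to read

    def retrieve(self, session, query, k=3):
        # Layer 1: a retrieval without a scope is rejected, not run wide.
        if session is None or not session.tenant_id:
            raise ScopeError("retrieval requires an authenticated tenant scope")

        # Layer 2: the tenant comes from the session; callers have no parameter to override it.
        hits = self._search(query, session.tenant_id, k)

        permitted = []
        for doc in hits:
            # Layer 4: a hit from another tenant is dropped and logged as an alertable event.
            if doc.tenant_id != session.tenant_id:
                log.error("cross-tenant hit: caller_tenant=%s doc=%s doc_tenant=%s",
                          session.tenant_id, doc.doc_id, doc.tenant_id)
                continue
            # Layer 3: re-check the current ACL before the document can reach the model.
            if session.user_id not in self._acl.get(doc.doc_id, set()):
                continue
            permitted.append(doc)

        # Layer 4: record who queried, with what scope, and what was returned.
        log.info("retrieval user=%s tenant=%s returned=%s",
                 session.user_id, session.tenant_id, [d.doc_id for d in permitted])
        return permitted


DOCS = [
    Doc("a-1", "tenant-a", "Churn report"),
    Doc("a-2", "tenant-a", "Board minutes"),
    Doc("b-1", "tenant-b", "Revenue forecast"),
]
ACL = {"a-1": {"alice"}, "a-2": {"carol"}, "b-1": {"bob"}}


def buggy_search(query, tenant_id, k):
    # Simulates a store call that ignores the tenant filter it was given.
    return DOCS[:k]


service = RetrievalService(buggy_search, ACL)
result = service.retrieve(Session("alice", "tenant-a"), "forecast")
print([d.doc_id for d in result])  # ['a-1']
```

Even with a store call that ignores the filter, only `a-1` survives: `b-1` is dropped and logged as a cross-tenant hit, and `a-2` fails the ACL re-check. The layers are deliberately redundant, so that a mistake in one does not become a leak.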
You would never run a SQL query without a tenant predicate and trust the client to add it. A vector search is a query; it deserves the same server-enforced row-level security.
A practical checklist
- Every retrieval query is scoped to the caller's tenant or permission set.
- The scope is enforced server-side from the authenticated session, not supplied by the client.
- Tenants that require strong isolation use a per-tenant index or namespace rather than a shared index with a filter.
- Retrieved documents are re-checked against current ACLs before entering the model's context.
- A query that returns no permitted results returns nothing — it does not fall back to an unscoped search.
- Retrieval is logged with caller, scope, and returned document IDs.
- Cross-boundary retrieval is an alertable event; probing patterns are monitored.
- Newly added retrieval code paths are reviewed specifically for a missing tenant filter.
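Two of these items, the no-fallback rule and the review of new code paths, lend themselves to an automated check. The sketch below is one possible shape for it, assuming a toy index of dictionaries; `scoped_search`, `search_with_fallback`, and `assert_tenant_isolated` are hypothetical names introduced for illustration.

```python
def scoped_search(index, tenant_id, predicate, k=3):
    """Return up to k of the caller's own rows that match; never widen the scope."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    own = [row for row in index if row["tenant_id"] == tenant_id and predicate(row)]
    return own[:k]  # an empty list is a valid answer


def search_with_fallback(index, tenant_id, predicate, k=3):
    """Anti-pattern: retries without the tenant filter when the scoped search is empty."""
    hits = scoped_search(index, tenant_id, predicate, k)
    return hits or [row for row in index if predicate(row)][:k]


def assert_tenant_isolated(entry_point, index, tenant_id, predicate):
    """Regression check to run against every retrieval entry point, including new ones."""
    for row in entry_point(index, tenant_id, predicate):
        assert row["tenant_id"] == tenant_id, f"leaked {row['id']} to {tenant_id}"


INDEX = [
    {"id": "b-1", "tenant_id": "tenant-b", "text": "pricing model"},
    {"id": "a-1", "tenant_id": "tenant-a", "text": "onboarding guide"},
]


def mentions_pricing(row):
    return "pricing" in row["text"]


# Tenant A has nothing on pricing, so the correct result is an empty list.
print(scoped_search(INDEX, "tenant-a", mentions_pricing))  # []
assert_tenant_isolated(scoped_search, INDEX, "tenant-a", mentions_pricing)

try:
    assert_tenant_isolated(search_with_fallback, INDEX, "tenant-a", mentions_pricing)
except AssertionError as exc:
    print("caught:", exc)  # caught: leaked b-1 to tenant-a
```

Running a check like this against every retrieval entry point, with a query that only another tenant can satisfy, is a cheap way to catch both a missing filter and a well-meaning fallback.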
Test your own codebase in ten minutes
The fastest way to find out whether this anti-pattern is present in your own system is to ask an AI coding assistant to look for it. Run the prompt below in a fresh chat session, on its own — and judge the system by what the code actually does, not by what its documentation claims.
Search the whole repository to find where this applies — do not
wait for me to list files. Ignore generated, vendored, and dependency
folders (build output, node_modules, vendor). Identify every location
the failure mode below could occur, read those files in full before
you judge, and list the search terms you used so I can confirm nothing
was missed.
You are looking for one specific failure mode: similarity search runs
over a vector / embedding index that holds documents from more than
one tenant or trust level, and the query is not scoped server-side to
what the authenticated caller may read — the tenant filter is
missing, optional, or supplied by the client, and retrieved documents
are not re-checked against current permissions before use.
If there is no multi-tenant retrieval, say "not applicable".
Respond with exactly these four sections:
1. VERDICT: one of [present / not present / unclear]
2. EVIDENCE: file path + line numbers + a one-line quote per claim
3. WHY IT MATTERS: two sentences, plain English
4. FIX: a concrete change, with a short before/after code snippet
if applicable. If "unclear", list the one piece of context you
need to decide.
Insist on the four-part answer: a verdict with a file path, a line number, and a one-line quote is something you can act on; a verdict on its own is just an opinion. If the verdict is “present”, the FIX section is your starting point: enforce a server-side tenant scope and re-check ACLs after retrieval. Re-run the same prompt after the change to confirm the verdict flips to “not present”.
Conclusion
A vector store is a wonderfully efficient way to find relevant text and a completely indifferent one when it comes to permission. Treat retrieval as a query against protected data: scope it to the caller, enforce that scope on the server, re-verify what comes back, and log the boundary. Efficiency from a shared index is fine — as long as authorisation, not similarity, decides what each caller is allowed to see.
References & further reading
- OWASP Top 10 for LLM Applications — LLM08: Vector and Embedding Weaknesses.
- OWASP Agentic AI — Threats and Mitigations — retrieval and data-access controls for agents.
- NIST AI Risk Management Framework — confidentiality and access-control guidance for AI data.