Curated Research
OWASP LLM Top 10 (v2025): The Baseline Threat Model for LLM-Backed Applications
The baseline threat model for any LLM-backed application. If your team has not read this, they are not ready to deploy.
1. Why this document is the one you start with
There is now a large and growing literature on LLM security. Papers on prompt injection, reports on agentic misuse, vendor whitepapers on defence-in-depth architectures, regulatory guidance from ENISA and NIST, academic work on model extraction and membership inference. Most of it is useful. Almost none of it is where you should start.
The OWASP Top 10 for Large Language Model Applications is the document that every engineering and security team deploying an LLM-backed application should read before the first architecture review, and should re-read before every major release. It is not the most technically deep reference on any single threat class; specialist papers go further on prompt injection, on model supply chain, on data poisoning. It is not the most current reference on the agentic threat landscape, where the state of the art moves faster than any standard can track. What it is, uniquely, is the shared vocabulary that your security team, your engineering team, your legal team, and your board can all use to describe the same risks.
That shared vocabulary is the thing most LLM deployment programmes lack. Security raises concerns in language borrowed from web application security that does not map cleanly to LLM systems. Engineering describes failure modes in model-specific terminology that security cannot evaluate. Legal hears about "hallucination risk" and "prompt injection" and cannot place them on a familiar impact scale. The OWASP Top 10 is the Rosetta Stone that makes cross-functional risk conversations possible, and for that reason alone it is the document you read first.
The 2025 revision of the list (the current version as of this guide's review date) reflects two years of field experience with production LLM deployments. The changes from the original 2023 draft are substantial and worth understanding in their own right, because they tell you which threats turned out to matter in practice and which turned out to be less consequential than the early threat-modelling predicted.
2. What the Top 10 is and what it is not
The OWASP Top 10 for LLMs is a consensus document produced by a working group of practitioners from security vendors, research institutions, cloud providers, and deploying enterprises. It identifies the ten most consequential risk categories for applications that integrate LLMs into production workflows, and for each category it provides a definition, an example, prevention guidance, and reference scenarios.
It is modelled on the original OWASP Top 10 for web applications, which has been the baseline threat taxonomy for web security for two decades. The LLM variant inherits the strengths of that lineage: it is consensus-driven, it is opinionated about what matters most, and it is deliberately scoped to application-layer risks rather than attempting to catalogue every possible threat.
It is important to be clear about what the document is not. It is not a compliance framework; it does not map to specific regulatory regimes, though it is referenced by many of them. It is not a comprehensive threat catalogue; it deliberately omits threats that are either too specialised or too infrastructure-dependent to apply broadly. It is not a prescriptive architecture; it tells you what to defend against, not precisely how. And it is not, in any meaningful sense, adversarial-AI research; the threats it describes are the ones that matter for real applications deployed by real organisations, not the edge-case jailbreaks that dominate social media.
The practical consequence of these scoping choices is that the Top 10 is usable by teams that are not LLM security specialists. That is a feature, not a bug. A document that only a specialist could apply would not solve the cross-functional communication problem that is the real bottleneck in most deployments.
3. The ten categories, in the order that matters
The list below is the 2025 version of the OWASP Top 10 for LLMs. We present each category with a short definition, an interpretation of what it actually means for a production deployment, and a note on how it tends to manifest in the field.
LLM01: Prompt Injection
The best-known category on the list, and still the most misunderstood. Prompt injection is the manipulation of an LLM's behaviour by adversarial input, either from a user directly (direct injection) or from content the model processes on behalf of the user (indirect injection). The direct variant is the one that dominates public discussion; the indirect variant is the one that causes most actual production incidents.
The field-level interpretation is this: any LLM that processes untrusted content (a web page, an email, a PDF, a document from a third party) can have its behaviour altered by that content, and the alteration can persist across the session. A model summarising an email that contains hidden instructions can be induced to exfiltrate data, mislead the user, or take actions it would not otherwise take. This is not a bug that can be patched; it is a structural property of how LLMs process text. The correct response is architectural: treat model outputs as untrusted, put authorisation boundaries around any action the model can take, and do not give the model capabilities that a malicious prompt could misuse.
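The authorisation boundary described above can be sketched in a few lines. This is a minimal illustration, not the OWASP guidance itself; the tool names, grant labels, and `authorise` helper are all hypothetical.

```python
# Sketch of an authorisation boundary outside the model: a prompt-injected
# instruction can make the model *propose* an action, but cannot authorise it.
# All names here (Action, TOOL_REQUIREMENTS, grant labels) are illustrative.
from dataclasses import dataclass

@dataclass
class Action:
    tool: str        # tool the model proposes to invoke
    argument: str    # what it wants to act on

TOOL_REQUIREMENTS = {
    "summarise_email": {"read:inbox"},
    "send_email": {"send:mail"},   # deliberately not granted below
}

def authorise(action: Action, session_grants: set) -> bool:
    """Deny by default; allow only when the session's grants cover the tool."""
    required = TOOL_REQUIREMENTS.get(action.tool)
    return required is not None and required <= session_grants

grants = {"read:inbox"}
assert authorise(Action("summarise_email", "msg-42"), grants)
assert not authorise(Action("send_email", "msg-42"), grants)
assert not authorise(Action("unknown_tool", "x"), grants)   # unknown: deny
```

The point of the pattern is that the check lives in code the model cannot rewrite: however thoroughly a malicious document manipulates the model's text output, the grants themselves never change.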
The category has held its number-one position across both versions of the Top 10 for a reason. In our engagement experience, prompt injection is the single most common root cause of LLM-related security incidents, and indirect injection specifically is the variant that most deployments underestimate.
LLM02: Sensitive Information Disclosure
The 2025 revision moved this category up from its original position, reflecting how consistently it appears in post-incident analysis. The threat is that LLMs can reveal sensitive information (proprietary data, personal data, credentials, system prompts, training data) through their outputs, either in direct response to adversarial queries or inadvertently as part of normal operation.
The interpretation that matters in the field: LLM outputs should be treated as having the same sensitivity classification as the highest-sensitivity data the model has ever seen, for the duration of any session in which that data was present. System prompts leak. Training data leaks. Context-window contents leak. Retrieval results leak. The defensive posture is not "prevent the model from revealing this information" (that is not reliably achievable) but "ensure the model never has access to information it is not authorised to reveal in the current context."
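The "never has access" posture reduces to filtering context by the caller's entitlements before anything reaches the model. A minimal sketch, assuming a hypothetical document shape and classification labels:

```python
# Sketch: filter retrieved documents by the caller's entitlements *before*
# they enter the context window, so the model cannot leak what it never saw.
# The document fields and classification labels are illustrative assumptions.

def build_context(docs: list, user_entitlements: set) -> list:
    """Keep only documents this caller is authorised to see."""
    return [d for d in docs if d["classification"] in user_entitlements]

docs = [
    {"id": "a", "classification": "public", "text": "press release"},
    {"id": "b", "classification": "restricted", "text": "salary data"},
]
context = build_context(docs, user_entitlements={"public"})
assert [d["id"] for d in context] == ["a"]   # restricted doc never enters
```

Output-side filters can supplement this, but they are best-effort; the input-side filter is the control you can actually reason about.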
The category subsumes what earlier versions of the list treated as separate concerns around training data leakage and system prompt exposure. The consolidation is sensible; the underlying defensive principle is the same across all of them.
LLM03: Supply Chain
LLM applications have a dependency graph that extends well beyond the conventional software supply chain. The model itself is a dependency, often from a third party. The training data is a dependency. Fine-tuning datasets, tokenisers, embedding models, vector databases, retrieval indices, and the inference stack are all dependencies. Each introduces its own supply chain risk, and several of them (notably model weights) do not have the provenance tooling that has become standard for software packages.
This category has risen in importance as production deployments have matured past the "call a hosted API" stage into the "fine-tuned, retrieval-augmented, multi-model" architectures that are now typical. The field-level reality is that most organisations cannot currently attest to the provenance of every component in their LLM stack. Model cards are inconsistent, training data is opaque even for open-weight models, and the supply chain for specialised components (embedding models in particular) is poorly documented.
The practical work here is unglamorous: maintain an inventory of every model and ancillary component in production, track the provenance and licence of each, and have a vulnerability response process that can handle "the model we fine-tuned from has a newly disclosed issue" as a first-class scenario.
LLM04: Data and Model Poisoning
The threat that an adversary introduces crafted data into training, fine-tuning, or retrieval corpora in order to manipulate the resulting model's behaviour. The 2025 revision broadened this category to include poisoning of retrieval-augmented systems and vector databases, which had become a meaningful attack surface during the period between revisions.
The interpretation in practice: any dataset used for training or retrieval is a control surface for the attacker who can influence it. For public web data used in training, the threat is diffuse but real. For retrieval corpora populated from user-generated content, customer uploads, or third-party document feeds, the threat is direct and exploitable. Most retrieval-augmented deployments we see have no data validation between ingestion and indexing; a malicious document inserted into the corpus can poison the outputs of every subsequent query that touches it.
The defensive pattern is provenance-aware retrieval: every document in the corpus is tagged with its source and trust level, retrieval respects trust boundaries, and output generation is aware of the trust level of the context it was given. This is more work than most teams budget for, and it is one of the most common gaps in production deployments.
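The provenance-aware pattern can be sketched as a retrieval function that carries trust labels through to generation. The trust tiers, corpus shape, and substring matching below are all illustrative stand-ins for a real retrieval stack:

```python
# Sketch of provenance-aware retrieval: every document carries a source and
# trust level, retrieval enforces a trust floor, and hits keep their labels
# so the generation layer can caveat low-trust context. All names are
# illustrative; real systems would use a vector index, not substring match.

TRUST_RANK = {"verified": 2, "internal": 1, "untrusted": 0}

def retrieve(corpus: list, query_terms: list, min_trust: str = "internal") -> list:
    floor = TRUST_RANK[min_trust]
    hits = []
    for doc in corpus:
        if TRUST_RANK[doc["trust"]] < floor:
            continue                      # below the trust floor: never retrieved
        if any(t in doc["text"] for t in query_terms):
            hits.append({"text": doc["text"],
                         "trust": doc["trust"],
                         "source": doc["source"]})
    return hits

corpus = [
    {"text": "refund policy v3", "trust": "verified", "source": "legal-wiki"},
    {"text": "refund policy hack", "trust": "untrusted", "source": "user-upload"},
]
hits = retrieve(corpus, ["refund"], min_trust="internal")
assert [h["source"] for h in hits] == ["legal-wiki"]   # upload excluded
```

The two habits that matter are visible here: trust is assigned at ingestion, not inferred at query time, and the label travels with the hit rather than being discarded after filtering.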
LLM05: Improper Output Handling
The category that captures what happens when a downstream system treats LLM output as trusted input. The classic example is a model output piped into a SQL query, a shell command, or an HTML template without validation, producing injection vulnerabilities of the conventional kind but with a novel source.
This is the category that most directly connects LLM security to conventional application security, and it is where the existing security-team playbook transfers most cleanly. The interpretation is straightforward: LLM output is user input for every system that receives it, and must be validated, escaped, and bounded accordingly. The subtlety is that many teams building LLM applications come from ML backgrounds rather than web application security backgrounds, and the "treat all input as hostile" reflex that is universal in web security is not universal in ML engineering.
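The "LLM output is user input" rule looks exactly like its web-security ancestor in practice. A sketch using `sqlite3` as the downstream system; the injection-shaped model output is a hypothetical example:

```python
# Sketch: model output bound as a query parameter, never spliced into SQL.
# sqlite3 stands in for any downstream data store; the model output string
# is a hypothetical injection-shaped example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

model_output = "alice' OR '1'='1"   # injection-shaped text from the model

# Parameterised query: the output is treated as data, so the payload is inert.
rows = conn.execute("SELECT name FROM users WHERE name = ?",
                    (model_output,)).fetchall()
assert rows == []                   # no user literally named the payload

rows = conn.execute("SELECT name FROM users WHERE name = ?",
                    ("alice",)).fetchall()
assert rows == [("alice",)]
```

The same discipline applies to shell commands (argument arrays, never string interpolation) and HTML templates (context-aware escaping); only the source of the hostile input is novel.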
In the 2025 revision this category gained additional scope around structured output handling (JSON, function calls, tool-use responses) as function-calling and agent architectures became standard. The threat surface there is larger than the simple injection case: a model that produces a malformed function call can cause cascading failures across an agent system, and the handling of that malformation is often the weakest link in the architecture.
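Fail-closed handling of model-produced function calls can be sketched as schema validation before dispatch. The tool name, schema shape, and `parse_tool_call` helper are illustrative assumptions, not a real framework's API:

```python
# Sketch: validate a model-produced tool call against an explicit schema and
# reject malformation outright, rather than letting a bad call cascade through
# an agent system. All names here are illustrative.
import json

TOOL_SCHEMAS = {
    "get_weather": {"required": {"city"}, "allowed": {"city", "unit"}},
}

def parse_tool_call(raw: str):
    """Return the call dict if well-formed and in-schema, else None."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None                                  # malformed JSON: reject
    if not isinstance(call, dict):
        return None
    schema = TOOL_SCHEMAS.get(call.get("name"))
    if schema is None:
        return None                                  # unknown tool: reject
    args = set(call.get("args", {}))
    if not schema["required"] <= args or not args <= schema["allowed"]:
        return None                                  # missing or extra args
    return call

assert parse_tool_call('{"name": "get_weather", "args": {"city": "Oslo"}}')
assert parse_tool_call('{"name": "get_weather", "args": {}}') is None
assert parse_tool_call('not json') is None
```

A rejected call should route to a retry-or-escalate path, not to a best-effort repair that guesses what the model meant.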
LLM06: Excessive Agency
The category that most directly captures the risks of agentic systems. Excessive agency refers to giving an LLM capabilities, permissions, or autonomy beyond what its specific task requires, such that a failure (malicious or otherwise) produces disproportionate consequences.
The 2025 revision elevated this category significantly, reflecting the mainstream adoption of agent architectures. The threat model has three dimensions: excessive functionality (the model has tools it does not need), excessive permissions (the tools it has operate with broader authorisation than necessary), and excessive autonomy (the model can take consequential actions without human confirmation).
The defensive principle is least authority, applied ruthlessly. An agent that drafts emails does not need the ability to send them. An agent that reads a calendar does not need write access. An agent that operates on behalf of a user should operate with that user's permissions, not with an application service-account that has broader scope. Every tool available to an agent should be justified against a specific task requirement, and every action with irreversible consequences should have a human-confirmation gate.
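The email-drafting example above can be made concrete as a per-role tool allowlist plus a confirmation gate on irreversible actions. Role names, tool names, and the `dispatch` helper are all illustrative:

```python
# Sketch of least authority: an explicit tool allowlist per agent role, plus
# a human-confirmation gate on irreversible actions. Names are illustrative.

ROLE_TOOLS = {
    "email_drafter": {"read_inbox", "draft_email"},   # no send capability
}
IRREVERSIBLE = {"send_email", "delete_event"}

def dispatch(role: str, tool: str, confirmed: bool = False) -> str:
    if tool not in ROLE_TOOLS.get(role, set()):
        raise PermissionError(f"{role} may not use {tool}")
    if tool in IRREVERSIBLE and not confirmed:
        return "pending_human_confirmation"           # hold for a human
    return "executed"

assert dispatch("email_drafter", "draft_email") == "executed"
try:
    dispatch("email_drafter", "send_email")
    raise AssertionError("send should have been denied")
except PermissionError:
    pass   # denied at the capability layer, before any confirmation question
```

Note the ordering: the capability check comes first, so the confirmation gate is a second line of defence rather than the only one.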
This is the category where the gap between public demos and production-ready architecture is widest. Public agent demos routinely grant models permissions that would be indefensible in production. The field-level reality is that most production agent deployments now operate under substantially more constrained authority than public demos suggest, and rightly so.
LLM07: System Prompt Leakage
New in the 2025 revision, though it was implicitly covered under earlier categories. The explicit elevation reflects how consistently system prompts have been shown to leak across adversarial prompting, how much sensitive information they contain in practice, and how many deployment teams still treat system prompts as confidential when they cannot be.
The interpretation: assume your system prompt is public, design accordingly. Do not embed credentials, API keys, internal URLs, business logic that must not be disclosed, or information about other users in system prompts. Treat the system prompt as a behavioural configuration, not a security boundary. Any secret a user should not know must be enforced outside the model, typically at the tool-use or API gateway layer.
This sounds obvious. It is violated in the majority of production systems we review. The most common leakage vector is credentials or internal API endpoints embedded in system prompts to enable tool use, a pattern that should be replaced with proper secret injection at the tool-execution layer.
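The replacement pattern, secret resolution at the tool-execution layer, can be sketched briefly. The connector name, environment-variable name, and `call_crm_api` function are hypothetical:

```python
# Sketch: the credential lives in the executor's environment, resolved at
# call time; the system prompt names the capability and carries no secret.
# CRM_API_KEY and call_crm_api are illustrative, not a real integration.
import os

def call_crm_api(query: str) -> dict:
    api_key = os.environ.get("CRM_API_KEY")   # never enters the context window
    if api_key is None:
        raise RuntimeError("CRM credentials not configured")
    # ... perform the authenticated request using api_key ...
    return {"query": query, "authenticated": True}

SYSTEM_PROMPT = "You can look up customers with the crm_lookup tool."
assert "CRM_API_KEY" not in SYSTEM_PROMPT     # the prompt carries no secret
```

Because the key never appears in any prompt or model output, no amount of adversarial prompting can exfiltrate it through the model; the attack surface moves to the executor, where conventional secret-management controls apply.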
LLM08: Vector and Embedding Weaknesses
Also new in the 2025 revision, reflecting the now-ubiquitous use of retrieval-augmented generation and vector databases in production. The category covers attacks that exploit the embedding and retrieval layer: adversarial documents crafted to be retrieved inappropriately, cross-tenant data leakage through shared vector stores, membership inference against embedding databases, and manipulation of similarity search to alter retrieval results.
The field-level issue this category addresses: vector databases are data stores with their own access-control and isolation requirements, and they have frequently been deployed with weaker controls than the traditional data stores they augment. A vector database that indexes documents from multiple tenants without proper isolation is a cross-tenant data leak waiting to happen; a vector database that accepts user-controlled content into its corpus without validation is a retrieval-poisoning vulnerability.
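Tenant isolation in the retrieval layer amounts to a server-side metadata filter applied before scoring, the vector-store analogue of row-level security. The in-memory index and cosine scoring below are stand-ins for a real vector database:

```python
# Sketch: tenant isolation enforced as a mandatory filter on every similarity
# query. The in-memory list and cosine scoring stand in for a real vector DB;
# tenant IDs and document shapes are illustrative.
import math

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

INDEX = [
    {"tenant": "t1", "vec": [1.0, 0.0], "text": "t1 contract"},
    {"tenant": "t2", "vec": [1.0, 0.1], "text": "t2 contract"},
]

def search(query_vec: list, tenant: str, k: int = 1) -> list:
    # The filter is applied before scoring and cannot be widened by the caller:
    # a cross-tenant document is never even a candidate.
    pool = [d for d in INDEX if d["tenant"] == tenant]
    return sorted(pool, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:k]

hits = search([1.0, 0.0], tenant="t1")
assert [h["text"] for h in hits] == ["t1 contract"]   # t2's document is invisible
```

The failure mode to audit for is the inverse shape: score first across the whole index, then filter, which leaks cross-tenant information through ranking behaviour and near-miss results.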
This category is likely to grow in importance over the next revision cycle as the threat surface matures. Treat it as a signal to audit your vector stack with the same rigour you apply to your relational databases.
LLM09: Misinformation
The category covering the ways an LLM can produce false, misleading, or inappropriately confident output, and the downstream consequences of systems that treat that output as authoritative. This is the category that most directly addresses what the public calls "hallucination," though the scope is broader: it includes confidently-wrong outputs, fabricated citations, made-up APIs in generated code, and the general problem of models producing plausible-sounding content that is not grounded in fact.
The defensive posture is architectural rather than model-level. No amount of prompting, fine-tuning, or model selection reduces misinformation risk to zero. The deployment patterns that actually work involve grounding model outputs in retrieved sources with citation, restricting generation to verifiable outputs where stakes are high, applying validation layers appropriate to the output domain (compilers for code, schema validators for structured data, fact-checking systems for factual claims), and, crucially, shaping user interfaces so that users are not invited to treat uncertain outputs as certain.
The category also covers the subtler failure mode where an LLM's output influences human decisions in ways the user does not notice. This is the threat model that matters most for high-stakes applications: medical, legal, financial, safety-critical. The defensive question is not "how do we prevent hallucination" but "how do we prevent hallucination from producing a consequential action without human verification."
LLM10: Unbounded Consumption
The category that covers resource exhaustion attacks: adversarial inputs that cause excessive token consumption, runaway agent loops, denial-of-service through prompt amplification, and the financial exposure that LLM applications face when a per-request cost is not capped.
This category replaces what earlier versions called "model denial of service" and broadens the scope to include the financial-exhaustion variant, which has become a meaningful threat as LLM-backed applications have scaled. A misconfigured agent loop can run up hundreds of thousands of dollars in API charges before detection. A crafted prompt that induces long outputs can multiply per-request costs by orders of magnitude. A recursive tool-calling pattern that does not have hard iteration caps can consume unbounded resources.
The defensive pattern is straightforward: rate limiting, token budgets per request, cost budgets per user and per tenant, circuit breakers on agent loops, and alerting on anomalous consumption patterns. These are routine in mature deployments and absent in most first-year deployments.
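The agent-loop portion of that pattern can be sketched as a hard iteration cap plus a per-request token budget, both enforced outside the model. The limits and the `run_agent` shape are illustrative:

```python
# Sketch of a circuit-broken agent loop: a hard iteration cap and a token
# budget enforced by the harness, not the model. Limits are illustrative.

MAX_ITERATIONS = 8
TOKEN_BUDGET = 4_000

def run_agent(step_fn) -> tuple:
    """step_fn(i) returns (tokens_used, done); the loop enforces both caps."""
    spent = 0
    for i in range(MAX_ITERATIONS):
        tokens, done = step_fn(i)
        spent += tokens
        if spent > TOKEN_BUDGET:
            return ("aborted_budget", spent)      # cost circuit breaker
        if done:
            return ("completed", spent)
    return ("aborted_iterations", spent)          # runaway-loop breaker

# A runaway step function that never signals completion trips the budget
# breaker on the seventh step (7 * 600 = 4,200 tokens):
status, spent = run_agent(lambda i: (600, False))
assert status == "aborted_budget" and spent == 4_200
```

Per-user and per-tenant cost budgets sit one layer above this, aggregating `spent` across requests, and anomaly alerting watches the aggregates.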
4. How the 2025 revision changed the list
Understanding what changed between the original 2023 list and the 2025 revision tells you something important: which threats turned out to matter in practice. Five shifts are worth noting.
Excessive agency was elevated. In the 2023 list, this category appeared but was positioned as a developing concern. Two years of production agent deployments moved it up to a first-tier risk. The lesson: agentic architectures amplify every other category on this list, and the defensive work around agent authority is substantially more consequential than the 2023 draft anticipated.
System prompt leakage became its own category. Previously treated implicitly under other categories, system prompt leakage was elevated because the pattern of embedding secrets and business logic in system prompts persisted in production despite being a known anti-pattern. The explicit category is an intervention, not a new discovery.
Vector and embedding weaknesses became their own category. Retrieval-augmented generation moved from emerging architecture in 2023 to default architecture in 2025, and the category reflects that transition. The threats here were underestimated in the original list and are still underestimated in most deployments.
Model denial of service was subsumed into unbounded consumption. The financial-exhaustion variant turned out to matter more than the availability variant, and the scope broadened accordingly.
Insecure plugin design was retired as a standalone category. The threats it described were absorbed into excessive agency and improper output handling, reflecting that the plugin architecture of 2023 has given way to the tool-use and function-calling patterns of 2025. The underlying threats persist; the architectural framing has moved on.
5. How to use the Top 10 in your organisation
A threat taxonomy is only useful if it is actually applied to real systems. Four practical uses of the OWASP Top 10 are worth codifying in your deployment programme.
Architecture review gate
Every LLM-backed application moving toward production should pass a review that explicitly addresses each of the ten categories. For each category, the review asks: is this threat present in this architecture, what is the mitigation, and what is the residual risk? A category that is not present in the architecture should be documented as not-applicable with a reason, not silently skipped. This is the single highest-leverage use of the document, and it takes roughly two hours per application in a well-run programme.
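The gate's "addressed or explicitly not-applicable, never silently skipped" rule is mechanical enough to encode. A sketch, with the record shape and `gate_passes` check as illustrative assumptions; only the category IDs come from the 2025 list:

```python
# Sketch of a review-gate record: every one of the ten categories must be
# marked applicable (with mitigation and residual risk) or not-applicable
# with a documented reason. A silent gap fails the gate. The record shape
# is illustrative; category IDs follow the 2025 list.

CATEGORIES = [f"LLM{i:02d}" for i in range(1, 11)]   # LLM01 .. LLM10

def gate_passes(review: dict) -> bool:
    for cat in CATEGORIES:
        entry = review.get(cat)
        if entry is None:
            return False                             # silently skipped: fail
        if entry.get("applicable"):
            if not entry.get("mitigation") or "residual_risk" not in entry:
                return False                         # mitigation undocumented
        elif not entry.get("na_reason"):
            return False                             # N/A needs a reason
    return True

review = {c: {"applicable": False, "na_reason": "no such component"}
          for c in CATEGORIES}
review["LLM01"] = {"applicable": True,
                   "mitigation": "output gating and tool authorisation",
                   "residual_risk": "low"}
assert gate_passes(review)
del review["LLM07"]
assert not gate_passes(review)                       # a skipped category fails
```

Teams that keep these records in version control get the incident-classification benefit of section 5 almost for free: the review history and the incident history share a vocabulary.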
Security team training
Every security engineer who will work on LLM systems should have read the document in full, and ideally worked through the reference scenarios. The document is short enough that this is a one-afternoon exercise, and it dramatically improves the quality of subsequent architecture reviews. The working-group reference scenarios are particularly useful because they translate abstract categories into concrete exploit narratives.
Vendor evaluation
When evaluating an LLM product or service (whether a foundation model API, a retrieval platform, an agent framework, or an AI-enabled SaaS vendor) the ten categories provide a consistent evaluation lens. Ask each vendor how they address each category, where the security boundary sits between them and you, and what their shared-responsibility model looks like. The answers are informative not just about the product, but about the maturity of the vendor's security programme.
Incident classification
When an LLM-related incident occurs, classify it against the ten categories in the post-incident review. This produces two benefits over time: your organisation develops internal data on which categories actually produce incidents in your environment, and the incident history accumulates in a form that can be directly compared against industry data. Both are valuable for the board-level conversation about AI risk.
6. What the Top 10 does not cover, and what to read next
For all its strengths, the OWASP Top 10 is scoped to the deploying-organisation's perspective on LLM-backed applications. There are important threat classes it deliberately does not address, and a complete security programme needs to layer additional references on top.
Model-level adversarial robustness. The document largely treats model behaviour as a property to be designed around rather than something the deploying organisation can meaningfully improve. For organisations that train or heavily fine-tune their own models, the NIST AI Risk Management Framework and the academic literature on adversarial robustness are the relevant next references.
Jurisdictional and regulatory risk. The Top 10 is technology-focused and does not address the EU AI Act, GDPR implications of LLM deployments, sector-specific regimes (HIPAA, FINRA, etc.), or the emerging national AI regulations. The ENISA AI threat landscape reports and the NIST AI RMF profiles are better starting points for that material.
Societal and reputational risk. Bias, fairness, representation, and the reputational consequences of LLM-generated content are outside the document's scope by design. Organisations deploying in public-facing or regulated contexts need a separate workstream for these risks.
Emerging agentic threat classes. The 2025 revision captures agent risks well, but the field continues to move. Research on multi-agent collusion, on tool-use chain exploits, and on long-horizon autonomous systems is ahead of any current standard. For organisations deploying serious agent systems, the specialist literature (including ongoing work from MITRE on ATLAS and from the major labs' own safety research publications) is necessary reading on top of the Top 10.
Prompt injection research specifically. Because LLM01 is such a deep category, a specialist literature has grown up around it. Simon Willison's running coverage, the indirect-injection papers from the 2023-2024 cycle, and the more recent work on multimodal prompt injection are the natural follow-ups for teams that want to go deeper than the Top 10 provides.
7. The uncomfortable conclusion
Here is the thing nobody in the OWASP working group says out loud, because it is not the working group's place to say it: if the engineering team on an LLM deployment has not read this document, the deployment is not ready to go to production. Not because reading the document produces security (reading anything does not produce security) but because the absence of a shared threat vocabulary across the team is a strong predictor of the kind of failure mode this list describes.
We have reviewed LLM deployments whose teams had never heard of indirect prompt injection. We have reviewed agent architectures that violated every principle of excessive agency, from teams that had built the system in good faith without a framework for asking whether they should. We have reviewed retrieval systems that had no concept of document provenance because the term had not entered the team's design conversations. In every case, the gap was not that the team lacked the technical ability to fix the problem. The gap was that the team did not have the vocabulary to notice the problem existed.
The OWASP Top 10 is thirty pages. It takes ninety minutes to read the first time, another hour to work through the reference scenarios, and roughly a week of application to start noticing its categories in your own systems. That is the cheapest investment available in LLM security, and it is the one every organisation deploying these systems should be making before the first production traffic flows.
8. Where to find the document
The OWASP Top 10 for Large Language Model Applications is freely available from the OWASP project website. The 2025 revision is the current version as of this guide's review date. The project also publishes the reference scenarios, the mapping to related standards (NIST AI RMF, MITRE ATLAS, ISO/IEC 42001), and a growing library of community-contributed deployment patterns aligned to the ten categories. All of it is worth knowing about; the document itself is where to start.