The AI Defense Matrix

A structured framework for defending AI systems

Identify gaps, assign ownership, and select controls for defending AI systems. Aligned with NIST CSF 2.0, it’s a “security for AI” companion to the Cyber Defense Matrix. Created by Lenny Zeltser and Sounil Yu.


The AI Defense Matrix

The AI Defense Matrix is a structured framework for defending AI systems. Each row is an AI-specific asset class. Columns are NIST CSF 2.0 functions. Cells show the AI-specific control category, objective, or representative tooling for each intersection. You can download the AI Defense Matrix as CSV, YAML, and Markdown files.

Each asset class below lists its cell for each of the six NIST CSF 2.0 functions: ⚖️ Govern, 🔍 Identify, 🛡️ Protect, 📡 Detect, Respond, and ♻️ Recover.

DEVICES

AI-Workload Platforms
  Govern: AI-platform standards
  Identify: AI security posture management
  Protect: AI-workload hardening; model-loading supply-chain verification
  Detect: AI-workload runtime detection
  Respond: Generic container IR
  Recover: Generic platform restore

AI Orchestration Tools
  Govern: AI application governance
  Identify: AIBOM for applications; agent-framework discovery
  Protect: System-prompt hardening; plugin allowlisting
  Detect: Prompt-injection testing; agent anomaly detection
  Respond: Agent runtime IR; plugin disable
  Recover: Framework config; prompt rollback

APPLICATIONS

AI-Generated Code
  Govern: AI coding standards; code-review policy; license and provenance policy
  Identify: AI-code provenance; origin tracking
  Protect: AI-aware SAST
  Detect: Hallucinated-dependency and insecure-pattern detection
  Respond: PR block; revert of AI-generated commits
  Recover: Code rewrite; replacement of flagged artifacts

NETWORKS

AI Gateways and Routers
  Govern: AI egress policy; approved-service registry
  Identify: AI traffic discovery
  Protect: AI gateways for egress; MCP gateways for tool gating
  Detect: Anomalous AI traffic; RAG-leakage egress detection
  Respond: AI traffic blocking; shadow AI takedown
  Recover: Generic network failover

DATA

AI Model
  Govern: Model selection; provider evaluation
  Identify: Model inventory; AIBOM
  Protect: Model firewalls; weight protection
  Detect: Model drift; integrity monitoring
  Respond: Model rollback; provider coordination for consumed models
  Recover: Model version restore; provider re-selection

Training Data
  Govern: Dataset provenance; licensing policy
  Identify: Dataset inventory; lineage
  Protect: Data access control
  Detect: Poisoning and backdoor detection
  Respond: Dataset quarantine; retraining trigger
  Recover: Dataset restore from golden copies; model retraining

Runtime AI Data
  Govern: Prompt and RAG policy; memory-retention governance; interaction-history policy
  Identify: RAG-source and LLM-oversharing inventory
  Protect: Prompt-injection defense; RAG sanitization; memory-poisoning defense; AI-content DLP
  Detect: Prompt anomalies; jailbreak attempts; RAG leakage; memory tampering
  Respond: Session termination; RAG source isolation
  Recover: Vector DB restore; re-indexing

USERS

AI Agent Identities
  Govern: AI agent identity policy; authorization standards; OAuth for agents
  Identify: AI agent and non-human principal inventory
  Protect: Agent OAuth; capability scoping; short-lived credentials
  Detect: Agent behavioral monitoring; runtime authorization drift
  Respond: Credential revocation; agent quarantine; session termination
  Recover: Agent identity re-provisioning
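
For teams that want the matrix in machine-readable form, one row of the downloadable YAML might look like the sketch below. This is illustrative only: the field names are assumptions, and the published CSV, YAML, and Markdown files may use a different schema.

```yaml
# Illustrative sketch of one matrix row; field names are assumed,
# not taken from the published download files.
- asset_class: AI Model
  theme: Data
  cells:
    govern: Model selection; provider evaluation
    identify: Model inventory; AIBOM
    protect: Model firewalls; weight protection
    detect: Model drift; integrity monitoring
    respond: Model rollback; provider coordination for consumed models
    recover: Model version restore; provider re-selection
```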

Using the AI Defense Matrix

The AI Defense Matrix helps you identify gaps, assign ownership, and select controls for defending AI systems. It focuses on “security for AI,” covering AI-specific components enterprises must defend and the controls that secure them. “AI for security” stays with the Cyber Defense Matrix, since AI is now incorporated into most cybersecurity products.

Practitioners

Gap Analysis

Walk through every cell and ask whether process, technology, or both cover that asset class and NIST CSF function. Start with Govern to understand your current state regarding AI ownership, risk appetite, and policy.

Read the left-to-right progression as a maturity signal. Mark each cell as covered, partial, or absent to produce your gap inventory, then develop a plan to address the gaps based on your priorities.
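
The marking exercise above can be sketched in code. The sketch below is illustrative: `gap_inventory`, the cell statuses, and the assessment dictionary are hypothetical names, not part of any published matrix tooling. It orders gaps left to right so earlier NIST CSF functions surface first, matching the maturity reading.

```python
# Hypothetical gap-analysis pass over the AI Defense Matrix.
# Each (asset class, function) cell is marked "covered", "partial",
# or "absent"; unassessed cells default to "absent".
FUNCTIONS = ["Govern", "Identify", "Protect", "Detect", "Respond", "Recover"]

ASSET_CLASSES = [
    "AI-Workload Platforms", "AI Orchestration Tools", "AI-Generated Code",
    "AI Gateways and Routers", "AI Model", "Training Data",
    "Runtime AI Data", "AI Agent Identities",
]

def gap_inventory(assessment):
    """Return (asset, function, status) for every cell that is not fully
    covered, ordered left to right so Govern-stage gaps surface first."""
    gaps = []
    for asset in ASSET_CLASSES:
        for idx, fn in enumerate(FUNCTIONS):
            status = assessment.get((asset, fn), "absent")
            if status != "covered":
                gaps.append((idx, asset, fn, status))
    gaps.sort(key=lambda g: g[0])  # stable sort: Govern gaps first
    return [(asset, fn, status) for _, asset, fn, status in gaps]

# Example: a mostly empty assessment; everything unlisted is a gap.
assessment = {
    ("AI Model", "Govern"): "covered",
    ("AI Model", "Identify"): "partial",
}
```

The output doubles as the prioritized gap list the text describes: partial and absent cells, with the governance column leading.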

Vendors

Product Positioning

Walk through every cell and identify the intersections of asset class and NIST CSF function that your product addresses. Map your capabilities to those specific cells rather than claim broad coverage across the matrix.

Treat thinly covered cells as opportunities for differentiation or new products that solve underserved customer needs. Use the cell map to sharpen product roadmap, sales narrative, and other go-to-market decisions.

Design Criteria

Each row must satisfy the primary criterion. The secondary criterion resolves edge cases where two asset classes might otherwise be collapsed or split.

Primary

AI-Specific Defense Required

A new row should exist only if defending that asset requires AI-specific considerations, processes, or tools. If a generic security approach handles the defense well, or if the defense mirrors that of a non-AI asset, those entries belong in the Cyber Defense Matrix.

For example, hardening an LLM inference server requires awareness of model loading paths, safetensors provenance, GPU memory isolation, and CUDA-library supply chain. Generic Kubernetes hardening misses these risks entirely.

Secondary

Shared Defender Team & Tooling

Two asset classes fold into one row when the same defender team handles both with the same tools. If either the team or the tools differ, the assets likely deserve separate rows.

For example, self-hosted models and consumed-as-a-service models are distinct assets. Their model-layer trust concerns and defender tooling overlap enough to keep them in a single AI Model row.

Why Each Row Passes the Criteria

Every asset class in the AI Defense Matrix requires AI-specific defense that traditional security tooling cannot adequately deliver.

🖥️ AI-Workload Platforms (Devices theme)
Model-serving platform CVEs, model-loading supply-chain attacks, and ML orchestration framework exploits require AI-platform-specific hardening. Generic Kubernetes or container security does not reach these risks. AI security posture management (AI-SPM) tooling anchors this row.

⚙️ AI Orchestration Tools (Devices theme)
Supply chain of AI scaffolding, prompt-injection testing of system prompts, and plugin governance all require AI-specific AppSec tooling. AI application security, AIBOM scanning, and prompt-defense tooling anchor this row.

💻 AI-Generated Code (Applications theme)
AI-suggested code introduces hallucinated dependencies, insecure patterns inherited from training data, and licensing ambiguity. Generic SAST does not catch hallucinated APIs or AI-specific insecure patterns. AI-aware static analysis tooling anchors this row.

🔀 AI Gateways and Routers (Networks theme)
New protocols such as MCP and AI-aware content inspection require AI-specific gateway and egress tooling. Generic firewalls miss prompt and RAG leakage. AI gateway tooling covers egress and cost controls; MCP-specific gateway tooling covers tool gating and scope enforcement.

🧠 AI Model (Data theme)
Self-hosted models need model-weight protection, model firewalls, and AIBOM scanning. Consumed-as-a-service models need provider evaluation, model card review, and version pinning. No traditional security tool addresses either. Model security and AI governance tooling anchor this row.

📚 Training Data (Data theme)
Traditional DLP and data classification tools do not address poisoning and backdoor attacks on training data. The defense requires AI-specific provenance and poisoning detection, alongside data access control and lineage governance tooling.

Runtime AI Data (Data theme)
Prompt injection, RAG poisoning, vector DB access control, and AI-content DLP require AI-native tooling. Persistent agent memory adds memory-poisoning attacks that survive model swaps. AI-native prompt defense, knowledge-layer access control, and guardrail tooling anchor this row.

🤖 AI Agent Identities (Users theme)
AI agents scale differently from human users, with ephemeral, high-volume credentials across services. Traditional IAM does not handle agent lifecycle, capability scoping, or delegation-chain attenuation at runtime. Non-human identity tooling anchors this row.

Framework Alignment

The AI Defense Matrix rows accommodate major AI security frameworks. Select a framework below to see how its concepts map to each asset class. You can download the crossmapping as CSV, YAML, and Markdown files.

NIST IR 8596: NIST IR 8596 identifies AI system components organizations should protect. Each concept below maps to a row in the AI Defense Matrix or to the Cyber Defense Matrix.

AI-Workload Platforms: Containers, microservices, and libraries (AI-specific subset); inference endpoints (platform side)
AI Orchestration Tools: Agents as deployed artifacts; system prompts and templates
AI-Generated Code: (not explicitly named in IR 8596)
AI Gateways and Routers: AI data flows; APIs; inference endpoints (traffic side); model registries and dataset sources
AI Model: Models; algorithms (model configuration)
Training Data: Training data
Runtime AI Data: Prompts (runtime); inference and RAG data
AI Agent Identities: Agents as autonomous principals; keys; integrations and permissions
Cyber Defense Matrix: Hardware and GPUs; generic containers and microservices (non-AI-specific)

CSA AI Controls Matrix: CSA AICM organizes AI security controls into 18 domains. The primary domain(s) for each asset class are listed below. Auditors using STAR for AI can use this mapping directly.

AI-Workload Platforms: Infrastructure Security; Threat & Vulnerability Management
AI Orchestration Tools: Application and Interface Security; Supply Chain Management
AI-Generated Code: Application and Interface Security; Supply Chain Management
AI Gateways and Routers: Infrastructure Security; Interoperability and Portability
AI Model: Model Security; Governance, Risk and Compliance
Training Data: Data Security and Privacy Lifecycle Management; Model Security
Runtime AI Data: Data Security and Privacy Lifecycle Management; Application and Interface Security
AI Agent Identities: IAM; Governance, Risk and Compliance
Cyber Defense Matrix: IT & Cloud Security; Endpoint & Network Security; IAM (non-AI-specific domains)

ISO 42001: ISO 42001 Annex A defines controls for an AI management system. Each asset class maps to one or more Annex A clauses. Non-AI-specific controls fall under ISO/IEC 27001.

AI-Workload Platforms: A.6 AI system life cycle; A.4 Resources for AI systems
AI Orchestration Tools: A.6 AI system life cycle; A.5 Assessing impacts of AI systems
AI-Generated Code: A.6 AI system life cycle
AI Gateways and Routers: A.8 Information for interested parties; A.9 Use of AI systems; A.10 Third-party and customer relationships
AI Model: A.6 AI system life cycle; A.10 Third-party and customer relationships; A.5 Assessing impacts of AI systems
Training Data: A.7 Data for AI systems
Runtime AI Data: A.7 Data for AI systems; A.8 Information for interested parties
AI Agent Identities: A.9 Use of AI systems; A.3 Internal organization; A.5 Assessing impacts of AI systems
Cyber Defense Matrix: ISO/IEC 27001 Annex A (general IT security controls)

Google SAIF: Google SAIF organizes AI security into six principles covering infrastructure, model, data, and application layers. The matrix rows fall fully within SAIF's coverage, and SAIF's Focus on Agents section maps directly to the AI Agent Identities row.

AI-Workload Platforms: Expand strong security foundations; secure and harden the AI deployment environment
AI Orchestration Tools: Secure the AI supply chain; application and pipeline security; agent orchestration controls
AI-Generated Code: Secure the AI pipeline; code provenance and supply chain integrity
AI Gateways and Routers: Harden and monitor infrastructure; network-level access and egress controls
AI Model: Protect the AI model; ensure model integrity, provenance, and weight security
Training Data: Secure training data; data-security foundations; dataset provenance and integrity
Runtime AI Data: Expand AI red-teaming; runtime input and output safety; prompt defense
AI Agent Identities: Focus on Agents (explicit SAIF section); identity, authorization, and delegation controls
Cyber Defense Matrix: Expand strong security foundations (non-AI-specific infrastructure, endpoint, and identity security)

MITRE ATLAS: ATLAS techniques populate matrix cells (the Identify, Protect, and Detect columns) rather than rows. Techniques are listed here by the asset class most directly affected.

AI-Workload Platforms: AML.T0010 ML Supply Chain Compromise; AML.T0012 Valid Accounts (platform credential abuse); container and inference-server exploits
AI Orchestration Tools: AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0016 Obtain Capabilities (malicious plugins)
AI-Generated Code: AML.T0018 Backdoor ML Model (via training-poisoned code); hallucinated-package injection
AI Gateways and Routers: AML.T0057 LLM Meta Prompt Extraction; network-level exfiltration via AI egress channels
AI Model: AML.T0043 Craft Adversarial Data; AML.T0034 Model Inversion Attack; AML.T0006 Adversarial ML Attack; AML.T0024 Exfiltration via ML Inference API
Training Data: AML.T0020 Poison Training Data; AML.T0019 Publish Poisoned Datasets
Runtime AI Data: AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0056 Embedding Inversion Attack
AI Agent Identities: AML.T0053 Compromised ML Software Dependencies; credential and delegation-chain abuse; unauthorized tool invocation
Cyber Defense Matrix: Standard MITRE ATT&CK techniques apply to underlying infrastructure (Initial Access, Persistence, Lateral Movement)

OWASP AI Exchange: OWASP AI Exchange classifies threats across development-time, input, and runtime phases. Each asset class sits primarily in one or two of those phases.

AI-Workload Platforms: Development-time threats (supply chain attacks, model-platform CVEs, container escape)
AI Orchestration Tools: Development-time threats (agent framework supply chain); runtime threats (plugin abuse, prompt injection via tools)
AI-Generated Code: Development-time threats (insecure code generation, license risk, hallucinated dependencies)
AI Gateways and Routers: Runtime threats (data leakage via AI egress, network-level access control gaps)
AI Model: Development-time and runtime model threats (model inversion, extraction, evasion, poisoning)
Training Data: Development-time threats (data poisoning, backdoor injection, dataset integrity violations)
Runtime AI Data: Input threats (prompt injection, adversarial inputs, evasion); runtime threats (RAG poisoning, memory tampering)
AI Agent Identities: Runtime threats (unauthorized agent actions, capability abuse, delegation chain exploitation)
Cyber Defense Matrix: Standard OWASP secure software development (SSDF) and application security practices

OWASP LLM Top 10: Each LLM risk maps to one or two rows. The table shows primary mapping; several risks span multiple asset classes.

AI-Workload Platforms: LLM03 Supply Chain (compromised ML platform components); LLM04 Data and Model Poisoning (via platform)
AI Orchestration Tools: LLM01 Prompt Injection; LLM02 Insecure Output Handling; LLM07 System Prompt Leakage; LLM10 Unbounded Consumption
AI-Generated Code: LLM06 Excessive Agency (code execution); insecure or vulnerable code patterns inherited from training data
AI Gateways and Routers: LLM10 Unbounded Consumption (cost and rate control); shadow AI egress and output handling
AI Model: LLM03 Supply Chain; LLM04 Data and Model Poisoning; LLM09 Misinformation
Training Data: LLM04 Data and Model Poisoning; LLM03 Supply Chain (dataset provenance)
Runtime AI Data: LLM01 Prompt Injection; LLM08 Vector and Embedding Weaknesses; LLM05 Improper Output Handling
AI Agent Identities: LLM06 Excessive Agency; LLM05 Improper Output Handling; unauthorized actions by AI agents
Cyber Defense Matrix: Traditional OWASP Top 10 (injection, broken access control, etc.) applies to underlying web and API infrastructure

OWASP Agentic Security Top 10: OWASP ASI reinforces AI Agent Identities and AI Orchestration Tools as the primary rows. Memory and context poisoning touches Runtime AI Data; unexpected code execution touches AI-Generated Code.

AI-Workload Platforms: ASI-06 Resource Oversubscription; platform-level agent execution environment risks
AI Orchestration Tools: ASI-01 Agent Goal and Instruction Manipulation; ASI-04 Excessive Autonomy; ASI-07 Prompt Injection; ASI-08 Misuse of Tool Calls
AI-Generated Code: ASI-09 Unexpected Code Execution; ASI-06 Resource Oversubscription via generated code
AI Gateways and Routers: ASI-03 Identity and Authorization Exploitation; API and tool invocation scope enforcement
AI Model: ASI-02 Compromised AI Model; model integrity and manipulation issues
Training Data: ASI-10 Trust Boundary Violations; malicious data injection affecting model behavior
Runtime AI Data: ASI-07 Prompt Injection Attacks; ASI-05 Context Manipulation and Memory Poisoning
AI Agent Identities: ASI-01 Agent Goal and Instruction Manipulation; ASI-03 Identity and Authorization Exploitation; ASI-08 Misuse of Tool Calls
Cyber Defense Matrix: Supporting identity, network, and endpoint controls that underpin agentic infrastructure

Considered and Rejected

These asset classes were evaluated but excluded because they fold into existing rows in the AI Defense Matrix or into the Cyber Defense Matrix.

Human Using AI

Captured in the Users row of the Cyber Defense Matrix, since a human’s usage of AI extends traditional patterns and tooling.

Workforce AI Tools

Captured in the Devices and Applications rows of the Cyber Defense Matrix. AI-aware variants of endpoint DLP exist, but controls for these tools mirror traditional endpoint or SaaS governance.

Prompts

User prompts fall under Runtime AI Data; system prompts fall under AI Orchestration Tools.

Model registries and AIBOM

Folds into the AI Model row, where the defender team and emerging tooling overlap.

Service accounts and API keys

Captured in AI Agent Identities. Agent credentials are inseparable from agent identity.

MLOps and training pipelines

Captured in AI-Workload Platforms (build systems), AI Gateways and Routers (artifact movement), and Training Data (governance).

Embedded AI in third-party SaaS

Captured in AI Orchestration Tools (SSPM), AI Agent Identities (embedded-agent identity), and cross-cutting third-party AI risk.

Change Log

As we make revisions, we capture them here along with our reasoning. Feedback welcome; reach out to Lenny Zeltser or Sounil Yu.

AI-Workload Platforms
  Prior state: GPUs, Containers, Isolates, Unikernels
  Why changed: Raw GPUs, containers, isolates, and unikernels fail the primary criterion. AI-Workload Platforms are the AI-specific sliver.

AI Orchestration Tools
  Prior state: Agent Frameworks
  Why changed: Renamed and broadened to cover agentic orchestration tools, plugins, harnesses, system prompts, and MCP clients on user devices.

AI-Generated Code
  Prior state: AI generated code
  Why changed: Retained as its own row. SAST with AI-pattern awareness and IDE-level policy form a distinct tool category from AI Orchestration Tools.

AI Gateways and Routers
  Prior state: MCP Proxy, LLM Routers, MCP Gateways, MCP Servers
  Why changed: Renamed for clarity and widened to cover outbound AI-service traffic, shadow AI egress, and model-registry traffic.

AI Model
  Prior state: Self-hosted AI Models, Consumed-as-a-Service AI Models
  Why changed: Scope widened to cover self-hosted artifacts plus consumed-as-a-service models. Neither is handled by traditional security tooling.

Runtime AI Data
  Prior state: Input/Inference Data/RAG Data/Vector DB
  Why changed: Renamed for clarity and widened to include user prompts, interaction history, and persistent agent memory. The vector DB platform moves to AI-Workload Platforms.

AI Agent Identities
  Prior state: Agents
  Why changed: Name clarified. Scope widened to include identities, credentials, keys, permission scopes, service accounts, and agent-to-agent delegation chains.