The AI Defense Matrix

A structured framework for defending AI systems

Identify gaps, assign ownership, and select controls for defending AI systems. Aligned with NIST CSF 2.0, it’s a “security for AI” companion to the Cyber Defense Matrix. Created by Lenny Zeltser and Sounil Yu.


The AI Defense Matrix

The AI Defense Matrix is a structured framework for defending AI systems. Each row is an AI-specific asset class. Columns are NIST CSF 2.0 functions. Cells show the AI-specific control category, objective, or representative tooling for each intersection. You can download the AI Defense Matrix as CSV, YAML, and Markdown files.

Each asset class below lists its cell for each of the six NIST CSF 2.0 functions: ⚖️ Govern, 🔍 Identify, 🛡️ Protect, 📡 Detect, Respond, and ♻️ Recover.

DEVICES

AI-Workload Platforms
  Govern: AI-platform standards
  Identify: AI security posture management
  Protect: AI-workload hardening; model-loading supply-chain verification
  Detect: AI-workload runtime detection
  Respond: Generic container IR
  Recover: Generic platform restore

AI Orchestration Tools
  Govern: AI application governance
  Identify: AIBOM for applications; agent-framework discovery
  Protect: System-prompt hardening; plugin allowlisting
  Detect: Prompt-injection testing; agent anomaly detection
  Respond: Agent runtime IR; plugin disable
  Recover: Framework config; prompt rollback

APPLICATIONS

AI-Generated Code
  Govern: AI coding standards; code-review policy; license and provenance policy
  Identify: AI-code provenance; origin tracking
  Protect: AI-aware SAST
  Detect: Hallucinated-dependency and insecure-pattern detection
  Respond: PR block; revert of AI-generated commits
  Recover: Code rewrite; replacement of flagged artifacts

NETWORKS

AI Gateways and Routers
  Govern: AI egress policy; approved-service registry
  Identify: AI traffic discovery
  Protect: AI gateways for egress; MCP gateways for tool gating
  Detect: Anomalous AI traffic; RAG-leakage egress detection
  Respond: AI traffic blocking; shadow AI takedown
  Recover: Generic network failover

DATA

AI Model
  Govern: Model selection; provider evaluation
  Identify: Model inventory; AIBOM
  Protect: Model firewalls; weight protection
  Detect: Model drift; integrity monitoring
  Respond: Model rollback; provider coordination for consumed models
  Recover: Model version restore; provider re-selection

Training Data
  Govern: Dataset provenance; licensing policy
  Identify: Dataset inventory; lineage
  Protect: Data access control
  Detect: Poisoning and backdoor detection
  Respond: Dataset quarantine; retraining trigger
  Recover: Dataset restore from golden copies; model retraining

Runtime AI Data
  Govern: Prompt and RAG policy; memory-retention governance; interaction-history policy
  Identify: RAG-source and LLM-oversharing inventory
  Protect: Prompt-injection defense; RAG sanitization; memory-poisoning defense; AI-content DLP
  Detect: Prompt anomalies; jailbreak attempts; RAG leakage; memory tampering
  Respond: Session termination; RAG source isolation
  Recover: Vector DB restore; re-indexing

USERS

AI Agent Identities
  Govern: AI agent identity policy; authorization standards; OAuth for agents
  Identify: AI agent and non-human principal inventory
  Protect: Agent OAuth; capability scoping; short-lived credentials
  Detect: Agent behavioral monitoring; runtime authorization drift
  Respond: Credential revocation; agent quarantine; session termination
  Recover: Agent identity re-provisioning
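
For teams that want the matrix in machine-readable form, one row of the downloadable YAML might look like the sketch below. This is illustrative only: the field names are assumptions, and the published CSV, YAML, and Markdown files may use a different schema.

```yaml
# Illustrative sketch of one matrix row; field names are assumed,
# not taken from the published download files.
- asset_class: AI Model
  theme: Data
  cells:
    govern: Model selection; provider evaluation
    identify: Model inventory; AIBOM
    protect: Model firewalls; weight protection
    detect: Model drift; integrity monitoring
    respond: Model rollback; provider coordination for consumed models
    recover: Model version restore; provider re-selection
```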

Using the AI Defense Matrix

The AI Defense Matrix helps you identify gaps, assign ownership, and select controls for defending AI systems. It focuses on “security for AI,” covering AI-specific components enterprises must defend and the controls that secure them. “AI for security” stays with the Cyber Defense Matrix, since AI is now incorporated into most cybersecurity products.

Practitioners

Gap Analysis

Walk through every cell and ask whether process, technology, or both cover that asset class and NIST CSF function. Start with Govern to understand your current state regarding AI ownership, risk appetite, and policy.

Read the left-to-right progression as a maturity signal. Mark each cell as covered, partial, or absent to produce your gap inventory, then develop a plan to address the gaps based on your priorities.
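
The marking exercise above can be sketched in code. The sketch below is illustrative: `gap_inventory`, the cell statuses, and the assessment dictionary are hypothetical names, not part of any published matrix tooling. It orders gaps left to right so earlier NIST CSF functions surface first, matching the maturity reading.

```python
# Hypothetical gap-analysis pass over the AI Defense Matrix.
# Each (asset class, function) cell is marked "covered", "partial",
# or "absent"; unassessed cells default to "absent".
FUNCTIONS = ["Govern", "Identify", "Protect", "Detect", "Respond", "Recover"]

ASSET_CLASSES = [
    "AI-Workload Platforms", "AI Orchestration Tools", "AI-Generated Code",
    "AI Gateways and Routers", "AI Model", "Training Data",
    "Runtime AI Data", "AI Agent Identities",
]

def gap_inventory(assessment):
    """Return (asset, function, status) for every cell that is not fully
    covered, ordered left to right so Govern-stage gaps surface first."""
    gaps = []
    for asset in ASSET_CLASSES:
        for idx, fn in enumerate(FUNCTIONS):
            status = assessment.get((asset, fn), "absent")
            if status != "covered":
                gaps.append((idx, asset, fn, status))
    gaps.sort(key=lambda g: g[0])  # stable sort: Govern gaps first
    return [(asset, fn, status) for _, asset, fn, status in gaps]

# Example: a mostly empty assessment; everything unlisted is a gap.
assessment = {
    ("AI Model", "Govern"): "covered",
    ("AI Model", "Identify"): "partial",
}
```

The output doubles as the prioritized gap list the text describes: partial and absent cells, with the governance column leading.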

Vendors

Product Positioning

Walk through every cell and identify the intersections of asset class and NIST CSF function that your product addresses. Map your capabilities to those specific cells rather than claim broad coverage across the matrix.

Treat thinly covered cells as opportunities for differentiation or new products that solve underserved customer needs. Use the cell map to sharpen product roadmap, sales narrative, and other go-to-market decisions.

Design Criteria

Each row must satisfy the primary criterion. The secondary criterion resolves edge cases where two asset classes might otherwise be collapsed or split.

Primary

AI-Specific Defense Required

A new row should exist only if defending that asset requires AI-specific considerations, processes, or tools. If a generic security approach handles the defense well, or if the defense mirrors that of a non-AI asset, those entries belong in the Cyber Defense Matrix.

For example, hardening an LLM inference server requires awareness of model loading paths, safetensors provenance, GPU memory isolation, and CUDA-library supply chain. Generic Kubernetes hardening misses these risks entirely.

Secondary

Shared Defender Team & Tooling

Two asset classes fold into one row when the same defender team handles both with the same tools. If either the team or the tools differ, the assets likely deserve separate rows.

For example, self-hosted models and consumed-as-a-service models are distinct assets. Their model-layer trust concerns and defender tooling overlap enough to keep them in a single AI Model row.

Why Each Row Passes the Criteria

Every asset class in the AI Defense Matrix requires AI-specific defense that traditional security tooling cannot adequately deliver.

🖥️ AI-Workload Platforms (Devices theme)
Model-serving platform CVEs, model-loading supply-chain attacks, and ML orchestration framework exploits require AI-platform-specific hardening. Generic Kubernetes or container security does not reach these risks. AI security posture management (AI-SPM) tooling anchors this row.

⚙️ AI Orchestration Tools (Devices theme)
Supply chain of AI scaffolding, prompt-injection testing of system prompts, and plugin governance all require AI-specific AppSec tooling. AI application security, AIBOM scanning, and prompt-defense tooling anchor this row.

💻 AI-Generated Code (Applications theme)
AI-suggested code introduces hallucinated dependencies, insecure patterns inherited from training data, and licensing ambiguity. Generic SAST does not catch hallucinated APIs or AI-specific insecure patterns. AI-aware static analysis tooling anchors this row.

🔀 AI Gateways and Routers (Networks theme)
New protocols such as MCP and AI-aware content inspection require AI-specific gateway and egress tooling. Generic firewalls miss prompt and RAG leakage. AI gateway tooling covers egress and cost controls; MCP-specific gateway tooling covers tool gating and scope enforcement.

🧠 AI Model (Data theme)
Self-hosted models need model-weight protection, model firewalls, and AIBOM scanning. Consumed-as-a-service models need provider evaluation, model card review, and version pinning. No traditional security tool addresses either. Model security and AI governance tooling anchor this row.

📚 Training Data (Data theme)
Traditional DLP and data classification tools do not address poisoning and backdoor attacks on training data. The defense requires AI-specific provenance and poisoning detection, alongside data access control and lineage governance tooling.

Runtime AI Data (Data theme)
Prompt injection, RAG poisoning, vector DB access control, and AI-content DLP require AI-native tooling. Persistent agent memory adds memory-poisoning attacks that survive model swaps. AI-native prompt defense, knowledge-layer access control, and guardrail tooling anchor this row.

🤖 AI Agent Identities (Users theme)
AI agents scale differently from human users, with ephemeral, high-volume credentials across services. Traditional IAM does not handle agent lifecycle, capability scoping, or delegation-chain attenuation at runtime. Non-human identity tooling anchors this row.

Framework Alignment

The AI Defense Matrix rows accommodate major AI security frameworks. Select a framework below to see how its concepts map to each asset class. You can download the crossmapping as CSV, YAML, and Markdown files.

NIST IR 8596: NIST IR 8596 identifies AI system components organizations should protect. Each concept below maps to a row in the AI Defense Matrix or to the Cyber Defense Matrix.

AI-Workload Platforms: Containers, microservices, and libraries (AI-specific subset); inference endpoints (platform side)
AI Orchestration Tools: Agents as deployed artifacts; system prompts and templates
AI-Generated Code: (not explicitly named in IR 8596)
AI Gateways and Routers: AI data flows; APIs; inference endpoints (traffic side); model registries and dataset sources
AI Model: Models; algorithms (model configuration)
Training Data: Training data
Runtime AI Data: Prompts (runtime); inference and RAG data
AI Agent Identities: Agents as autonomous principals; keys; integrations and permissions
Cyber Defense Matrix: Hardware and GPUs; generic containers and microservices (non-AI-specific)

CSA AI Controls Matrix: CSA AICM organizes AI security controls into 18 domains. The primary domain(s) for each asset class are listed below. Auditors using STAR for AI can use this mapping directly.

AI-Workload Platforms: Infrastructure Security; Threat & Vulnerability Management
AI Orchestration Tools: Application and Interface Security; Supply Chain Management
AI-Generated Code: Application and Interface Security; Supply Chain Management
AI Gateways and Routers: Infrastructure Security; Interoperability and Portability
AI Model: Model Security; Governance, Risk and Compliance
Training Data: Data Security and Privacy Lifecycle Management; Model Security
Runtime AI Data: Data Security and Privacy Lifecycle Management; Application and Interface Security
AI Agent Identities: IAM; Governance, Risk and Compliance
Cyber Defense Matrix: IT & Cloud Security; Endpoint & Network Security; IAM (non-AI-specific domains)

ISO 42001: ISO 42001 Annex A defines controls for an AI management system. Each asset class maps to one or more Annex A clauses. Non-AI-specific controls fall under ISO/IEC 27001.

AI-Workload Platforms: A.6 AI system life cycle; A.4 Resources for AI systems
AI Orchestration Tools: A.6 AI system life cycle; A.5 Assessing impacts of AI systems
AI-Generated Code: A.6 AI system life cycle
AI Gateways and Routers: A.8 Information for interested parties; A.9 Use of AI systems; A.10 Third-party and customer relationships
AI Model: A.6 AI system life cycle; A.10 Third-party and customer relationships; A.5 Assessing impacts of AI systems
Training Data: A.7 Data for AI systems
Runtime AI Data: A.7 Data for AI systems; A.8 Information for interested parties
AI Agent Identities: A.9 Use of AI systems; A.3 Internal organization; A.5 Assessing impacts of AI systems
Cyber Defense Matrix: ISO/IEC 27001 Annex A (general IT security controls)

Google SAIF: Google SAIF organizes AI security into six principles covering infrastructure, model, data, and application layers. The matrix rows fall fully within SAIF's coverage, and SAIF's Focus on Agents section maps directly to the AI Agent Identities row.

AI-Workload Platforms: Expand strong security foundations; secure and harden the AI deployment environment
AI Orchestration Tools: Secure the AI supply chain; application and pipeline security; agent orchestration controls
AI-Generated Code: Secure the AI pipeline; code provenance and supply chain integrity
AI Gateways and Routers: Harden and monitor infrastructure; network-level access and egress controls
AI Model: Protect the AI model; ensure model integrity, provenance, and weight security
Training Data: Secure training data; data-security foundations; dataset provenance and integrity
Runtime AI Data: Expand AI red-teaming; runtime input and output safety; prompt defense
AI Agent Identities: Focus on Agents (explicit SAIF section); identity, authorization, and delegation controls
Cyber Defense Matrix: Expand strong security foundations (non-AI-specific infrastructure, endpoint, and identity security)

MITRE ATLAS: ATLAS techniques populate matrix cells (the Identify, Protect, and Detect columns) rather than rows. Techniques are listed here by the asset class most directly affected.

AI-Workload Platforms: AML.T0010 ML Supply Chain Compromise; AML.T0012 Valid Accounts (platform credential abuse); container and inference-server exploits
AI Orchestration Tools: AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0016 Obtain Capabilities (malicious plugins)
AI-Generated Code: AML.T0018 Backdoor ML Model (via training-poisoned code); hallucinated-package injection
AI Gateways and Routers: AML.T0057 LLM Meta Prompt Extraction; network-level exfiltration via AI egress channels
AI Model: AML.T0043 Craft Adversarial Data; AML.T0034 Model Inversion Attack; AML.T0006 Adversarial ML Attack; AML.T0024 Exfiltration via ML Inference API
Training Data: AML.T0020 Poison Training Data; AML.T0019 Publish Poisoned Datasets
Runtime AI Data: AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0056 Embedding Inversion Attack
AI Agent Identities: AML.T0053 Compromised ML Software Dependencies; credential and delegation-chain abuse; unauthorized tool invocation
Cyber Defense Matrix: Standard MITRE ATT&CK techniques apply to underlying infrastructure (Initial Access, Persistence, Lateral Movement)

OWASP AI Exchange: OWASP AI Exchange classifies threats across development-time, input, and runtime phases. Each asset class sits primarily in one or two of those phases.

AI-Workload Platforms: Development-time threats (supply chain attacks, model-platform CVEs, container escape)
AI Orchestration Tools: Development-time threats (agent framework supply chain); runtime threats (plugin abuse, prompt injection via tools)
AI-Generated Code: Development-time threats (insecure code generation, license risk, hallucinated dependencies)
AI Gateways and Routers: Runtime threats (data leakage via AI egress, network-level access control gaps)
AI Model: Development-time and runtime model threats (model inversion, extraction, evasion, poisoning)
Training Data: Development-time threats (data poisoning, backdoor injection, dataset integrity violations)
Runtime AI Data: Input threats (prompt injection, adversarial inputs, evasion); runtime threats (RAG poisoning, memory tampering)
AI Agent Identities: Runtime threats (unauthorized agent actions, capability abuse, delegation chain exploitation)
Cyber Defense Matrix: Standard OWASP secure software development (SSDF) and application security practices

OWASP LLM Top 10: Each LLM risk maps to one or two rows. The table shows primary mapping; several risks span multiple asset classes.

AI-Workload Platforms: LLM03 Supply Chain (compromised ML platform components); LLM04 Data and Model Poisoning (via platform)
AI Orchestration Tools: LLM01 Prompt Injection; LLM02 Insecure Output Handling; LLM07 System Prompt Leakage; LLM10 Unbounded Consumption
AI-Generated Code: LLM06 Excessive Agency (code execution); insecure or vulnerable code patterns inherited from training data
AI Gateways and Routers: LLM10 Unbounded Consumption (cost and rate control); shadow AI egress and output handling
AI Model: LLM03 Supply Chain; LLM04 Data and Model Poisoning; LLM09 Misinformation
Training Data: LLM04 Data and Model Poisoning; LLM03 Supply Chain (dataset provenance)
Runtime AI Data: LLM01 Prompt Injection; LLM08 Vector and Embedding Weaknesses; LLM05 Improper Output Handling
AI Agent Identities: LLM06 Excessive Agency; LLM05 Improper Output Handling; unauthorized actions by AI agents
Cyber Defense Matrix: Traditional OWASP Top 10 (injection, broken access control, etc.) applies to underlying web and API infrastructure

OWASP Agentic Security Top 10: OWASP ASI reinforces AI Agent Identities and AI Orchestration Tools as the primary rows. Memory and context poisoning touches Runtime AI Data; unexpected code execution touches AI-Generated Code.

AI-Workload Platforms: ASI-06 Resource Oversubscription; platform-level agent execution environment risks
AI Orchestration Tools: ASI-01 Agent Goal and Instruction Manipulation; ASI-04 Excessive Autonomy; ASI-07 Prompt Injection; ASI-08 Misuse of Tool Calls
AI-Generated Code: ASI-09 Unexpected Code Execution; ASI-06 Resource Oversubscription via generated code
AI Gateways and Routers: ASI-03 Identity and Authorization Exploitation; API and tool invocation scope enforcement
AI Model: ASI-02 Compromised AI Model; model integrity and manipulation issues
Training Data: ASI-10 Trust Boundary Violations; malicious data injection affecting model behavior
Runtime AI Data: ASI-07 Prompt Injection Attacks; ASI-05 Context Manipulation and Memory Poisoning
AI Agent Identities: ASI-01 Agent Goal and Instruction Manipulation; ASI-03 Identity and Authorization Exploitation; ASI-08 Misuse of Tool Calls
Cyber Defense Matrix: Supporting identity, network, and endpoint controls that underpin agentic infrastructure

Considered and Rejected

These asset classes were evaluated but excluded because they fold into existing rows in the AI Defense Matrix or into the Cyber Defense Matrix.

Human Using AI

Captured in the Users row of the Cyber Defense Matrix, since a human’s usage of AI extends traditional patterns and tooling.

Workforce AI Tools

Captured in the Devices and Applications rows of the Cyber Defense Matrix. AI-aware variants of endpoint DLP exist, but controls for these tools mirror traditional endpoint or SaaS governance.

Prompts

User prompts fall under Runtime AI Data; system prompts fall under AI Orchestration Tools.

Model registries and AIBOM

Folds into the AI Model row, where the defender team and emerging tooling overlap.

Service accounts and API keys

Captured in AI Agent Identities. Agent credentials are inseparable from agent identity.

MLOps and training pipelines

Captured in AI-Workload Platforms (build systems), AI Gateways and Routers (artifact movement), and Training Data (governance).

Embedded AI in third-party SaaS

Captured in AI Orchestration Tools (SSPM), AI Agent Identities (embedded-agent identity), and cross-cutting third-party AI risk.

Change Log

As we make revisions, we capture them here along with our reasoning. Feedback welcome; reach out to Lenny Zeltser or Sounil Yu.

AI-Workload Platforms
  Prior state: GPUs, Containers, Isolates, Unikernels
  Why changed: Raw GPUs, containers, isolates, and unikernels fail the primary criterion. AI-Workload Platforms are the AI-specific sliver.

AI Orchestration Tools
  Prior state: Agent Frameworks
  Why changed: Renamed and broadened to cover agentic orchestration tools, plugins, harnesses, system prompts, and MCP clients on user devices.

AI-Generated Code
  Prior state: AI generated code
  Why changed: Retained as its own row. SAST with AI-pattern awareness and IDE-level policy form a distinct tool category from AI Orchestration Tools.

AI Gateways and Routers
  Prior state: MCP Proxy, LLM Routers, MCP Gateways, MCP Servers
  Why changed: Renamed for clarity and widened to cover outbound AI-service traffic, shadow AI egress, and model-registry traffic.

AI Model
  Prior state: Self-hosted AI Models, Consumed-as-a-Service AI Models
  Why changed: Scope widened to cover self-hosted artifacts plus consumed-as-a-service models. Neither is handled by traditional security tooling.

Runtime AI Data
  Prior state: Input/Inference Data/RAG Data/Vector DB
  Why changed: Renamed for clarity and widened to include user prompts, interaction history, and persistent agent memory. The vector DB platform moves to AI-Workload Platforms.

AI Agent Identities
  Prior state: Agents
  Why changed: Name clarified. Scope widened to include identities, credentials, keys, permission scopes, service accounts, and agent-to-agent delegation chains.