AI agents are no longer experimental. They are writing code, triaging support tickets, processing financial data, managing infrastructure, and making decisions that affect customers. In most organizations, this adoption has outpaced security's ability to govern it.
If you are a CISO, you do not need to become an AI researcher to secure your organization's agent deployments. You need to ask the right questions and know what good answers sound like. These five questions will give you a clear picture of where you stand, what gaps exist, and what to prioritize.
1 How many AI agents are running in our environment right now?
Why it matters: You cannot secure what you cannot see. The first principle of any security program is asset inventory. If you do not know how many agents exist, where they run, and who owns them, every other security control is built on an incomplete foundation. This is the AI equivalent of not knowing how many servers you have.
The typical answer: Most organizations can name the agents that went through formal deployment processes. These are usually the flagship projects: the customer-facing chatbot, the internal knowledge base assistant, perhaps a code review tool. The number they give is typically 3 to 10. The actual number, including informal automations, personal scripts calling LLM APIs, and embedded AI features in third-party tools, is usually 3x to 10x higher.
What the right answer looks like: A maintained inventory that includes every agent, whether it was deployed by engineering, data science, operations, or an individual employee. Each entry should include: the agent's purpose, its owner, where it runs, which LLM provider it calls, what data it accesses, and when it was last reviewed. The inventory should be updated through a combination of automated discovery (network monitoring for LLM API traffic, code scanning for API keys) and periodic self-reporting from teams.
What to do if the answer is wrong: Start with automated discovery. Scan your network logs for outbound traffic to known LLM API endpoints (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com). Scan code repositories and CI/CD configurations for API key patterns. Audit billing records for AI provider charges. Then supplement with a company-wide survey, framed as an inventory exercise rather than an enforcement action. Give teams 30 days to register their agents. The goal is visibility, not punishment.
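The discovery steps above can be sketched in a few lines. This is a minimal illustration, not a production scanner: the key-prefix regexes are simplified approximations of real provider key formats, and dedicated tools (secret scanners, egress proxies) do this far more thoroughly.

```python
import re

# Known LLM API hosts to flag in egress logs (extend as needed).
LLM_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

# Illustrative key-prefix patterns; real scanners use much larger rule sets.
KEY_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),      # Anthropic-style keys
    re.compile(r"sk-(?!ant-)[A-Za-z0-9_-]{20,}"),  # OpenAI-style keys
    re.compile(r"AIza[A-Za-z0-9_-]{35}"),          # Google API keys
]

def flag_llm_egress(log_lines):
    """Return log lines whose destination is a known LLM API host."""
    return [line for line in log_lines
            if any(host in line for host in LLM_HOSTS)]

def find_key_candidates(text):
    """Return substrings that look like LLM provider API keys."""
    hits = []
    for pattern in KEY_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits
```

Run `flag_llm_egress` over firewall or proxy logs and `find_key_candidates` over repository contents; every hit from an unexpected team or host is a candidate shadow agent to follow up on.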
2 For each agent, what data can it access and what actions can it take?
Why it matters: An agent's risk profile is determined by two factors: the sensitivity of data it can read and the impact of actions it can perform. An agent that summarizes public documentation is low risk. An agent that reads customer PII and can send emails is high risk. An agent that accesses financial systems and can initiate transactions is critical risk. Without understanding these dimensions for each agent, you cannot prioritize your security efforts.
The typical answer: Teams can usually describe what data the agent is intended to access, but they rarely know the full scope of what it can access. An agent built to query a single database table often runs with credentials that grant access to the entire database. An agent designed to read files from one S3 bucket often has an IAM role that permits access to dozens of buckets. The principle of least privilege is almost never applied to agent deployments.
What the right answer looks like: For each agent, a documented mapping of: (a) data sources it connects to, with the specific scope of access (which tables, which buckets, which API endpoints), (b) tools or functions it can invoke, with the parameters each tool accepts, (c) external services it communicates with, and (d) a data classification label indicating the highest sensitivity level of data it touches. This mapping should be enforced through technical controls, not just documented as policy.
```yaml
# Example: Agent access profile (what "right" looks like)
agent: support-ticket-router
owner: support-engineering
environment: production
llm_provider: anthropic (corporate account)
last_reviewed: 2026-03-15
data_access:
  - source: support_db
    tables: [tickets, ticket_categories]  # NOT customers, NOT billing
    operations: [SELECT]                  # Read only
    row_filter: "status = 'open'"         # Only open tickets
tools:
  - name: assign_ticket
    params: [ticket_id, category, priority]
    constraints:
      priority: ["low", "medium", "high"]  # Cannot set "critical"
  - name: add_internal_note
    params: [ticket_id, note_text]
    constraints:
      note_text_max_length: 500
external_services: none
data_classification: PII (customer names in ticket text)
approval_required_for: [reassign to human, escalate to critical]
```
What to do if the answer is wrong: For each agent in your inventory, conduct an access audit. Review the credentials the agent uses and map their actual permissions, not just the intended ones. Identify over-provisioned access and create a remediation plan to scope credentials down to the minimum required. Implement tool-level allowlists that restrict which functions the agent can call, enforced at the infrastructure layer rather than the prompt layer.
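A tool-level allowlist enforced in infrastructure can be very small. The sketch below checks a proposed tool call against a static profile before anything executes; the agent name, tool names, and constraints mirror the example profile earlier in this section and are illustrative, not a real API.

```python
# Illustrative profile: which tools each agent may call, and which
# parameter values are permitted. In production this would live in
# config, not code.
AGENT_PROFILE = {
    "support-ticket-router": {
        "assign_ticket": {"priority": {"low", "medium", "high"}},
        "add_internal_note": {},  # no enum constraints; length checked below
    }
}

def authorize_tool_call(agent, tool, params):
    """Return True only if the tool and its parameters pass the profile."""
    allowed_tools = AGENT_PROFILE.get(agent, {})
    if tool not in allowed_tools:
        return False  # tool is not on this agent's allowlist
    for param, allowed_values in allowed_tools[tool].items():
        if param in params and params[param] not in allowed_values:
            return False  # parameter value outside the permitted set
    if tool == "add_internal_note" and len(params.get("note_text", "")) > 500:
        return False  # enforce the 500-character note limit
    return True
```

The point of the design is that this check runs in your code path, outside the model: no matter what the prompt says, a call to an unlisted tool or a disallowed parameter value never executes.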
3 What happens if someone sends a crafted malicious input to one of our agents?
Why it matters: Prompt injection is the SQL injection of AI systems. Any agent that processes external input (user messages, emails, uploaded documents, data from third-party APIs) is a potential injection target. A successful injection can cause the agent to ignore its instructions, access unauthorized data, take unintended actions, or leak sensitive information. If your agents handle external input and you have not tested them against injection attacks, you have an open vulnerability.
The typical answer: "We have a strong system prompt that tells the agent not to do anything harmful." This is the equivalent of preventing SQL injection by telling developers to "be careful with user input." System prompts are guidance for the model, not enforcement boundaries. They can be overridden by well-crafted input. Relying on them as your primary defense is insufficient.
What the right answer looks like: A layered defense that does not depend on the model following instructions. Input validation that detects and flags known injection patterns before they reach the model. Tool-level access controls that limit what the agent can do regardless of what it is told. Parameter validation that ensures tool arguments stay within expected bounds. Output filtering that catches sensitive data before it reaches the user. Approval gates for high-impact actions. And regular red team testing to verify these controls work against current attack techniques.
What to do if the answer is wrong: Start with a threat model for each externally facing agent. Identify the input sources, the tools available, and the worst-case outcome of a successful injection. Prioritize agents where the blast radius is largest (those with access to sensitive data or high-impact actions). Implement infrastructure-level controls that enforce security independently of the model. Then schedule regular adversarial testing, either with an internal red team or an external firm that specializes in AI agent security.
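One of the layers described above, input validation that flags known injection phrasings, can be sketched as a simple pattern check. The patterns here are illustrative and deliberately incomplete; treat this as one weak signal for logging and alerting, never as the primary defense.

```python
import re

# A few common prompt-injection phrasings. Attackers adapt quickly,
# so this list is a starting point, not a gate.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal (your )?(system )?prompt", re.I),
    re.compile(r"disregard (your|the) (rules|guidelines|instructions)", re.I),
]

def flag_injection(text):
    """Return the patterns that matched, for logging and alerting."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
```

A non-empty result should raise an alert and add scrutiny to the request, but the tool-level and parameter-level controls must hold even when this check matches nothing.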
4 If an agent misbehaves right now, how fast can we detect and stop it?
Why it matters: Even with strong preventive controls, agents will occasionally do unexpected things. Models are probabilistic, inputs are unpredictable, and edge cases are inevitable. The question is not whether an agent will ever misbehave. The question is whether you will notice when it does, and whether you can stop it before the impact escalates. This is the detection and response side of agent security, and it is almost universally neglected.
The typical answer: Silence. Most organizations have no monitoring specific to agent behavior. Agent actions may appear in general application logs, but there are no alerts tuned to detect anomalous agent activity. If an agent starts accessing data outside its normal scope, sending unusual volumes of requests, or producing outputs that contain sensitive information, nobody is watching. Detection relies on someone noticing downstream effects, a customer complaint, an unexpected bill, or a data breach discovered by other means.
What the right answer looks like: Real-time monitoring of agent actions with alerting on anomalous behavior. Every tool call, every data access, and every external communication should be logged with enough context to reconstruct what happened and why. Alerts should fire on: tool calls outside the agent's normal pattern, access to data outside the agent's defined scope, sudden changes in request volume or frequency, outputs flagged by content filters, and failed policy checks. There should be a documented runbook for agent incidents, including how to shut down a specific agent, revoke its credentials, and assess the blast radius.
```yaml
# Agent monitoring: key signals to watch
monitoring:
  tool_calls:
    alert_on:
      - tool not in agent's allowlist (blocked, but indicates injection attempt)
      - tool call frequency > 3x baseline in 5-minute window
      - parameter values outside expected ranges
  data_access:
    alert_on:
      - query targets table/bucket outside agent's scope
      - row count returned > 10x normal for this agent
      - access outside agent's normal operating hours
  outputs:
    alert_on:
      - PII detected in user-facing response
      - credentials or API keys in response
      - response length > 5x normal (possible data exfiltration)
kill_switch:
  mechanism: revoke API key + disable tool access
  target_time: under 60 seconds from alert to shutdown
  runbook: /docs/incident-response/agent-misbehavior.md
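The frequency signal above ("3x baseline in a 5-minute window") reduces to a sliding-window counter. This is a minimal sketch; the class name, window size, and multiplier are illustrative choices, and production systems would compute the baseline from observed history rather than a hardcoded number.

```python
from collections import deque

class FrequencyAlert:
    """Fire when tool-call frequency exceeds a multiple of baseline.

    A minimal sliding-window sketch of the "3x baseline in 5 minutes"
    alert; thresholds and window size here are illustrative.
    """

    def __init__(self, baseline_per_window, window_seconds=300, multiplier=3):
        self.limit = baseline_per_window * multiplier
        self.window = window_seconds
        self.calls = deque()  # timestamps of recent tool calls

    def record(self, ts):
        """Record one tool call; return True if the alert should fire."""
        self.calls.append(ts)
        # Drop timestamps that have aged out of the window.
        while self.calls and ts - self.calls[0] > self.window:
            self.calls.popleft()
        return len(self.calls) > self.limit
```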
What to do if the answer is wrong: Implement structured logging for all agent actions as the first step. Every tool call should be logged with a timestamp, the agent's identity, the tool name, the parameters, and the result. Build dashboards that show agent activity patterns over time, so anomalies become visible. Define alert thresholds based on baseline behavior. And document a kill switch for each agent: the exact steps to shut it down, revoke its access, and contain the damage.
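The structured-logging step above can be as simple as one JSON line per tool call. The field names below are assumptions, not a standard schema; what matters is that every entry carries enough context to reconstruct who did what, with which arguments, and what the result was.

```python
import json
import time

def log_tool_call(agent_id, tool, params, result_status):
    """Emit one structured JSON log line per agent tool call."""
    entry = {
        "ts": time.time(),       # when the call happened
        "agent": agent_id,       # which agent made it
        "tool": tool,            # which tool was invoked
        "params": params,        # the exact arguments passed
        "status": result_status, # allowed / blocked / error
    }
    print(json.dumps(entry, sort_keys=True))  # ship to your log pipeline in production
    return entry
```

Because each line is machine-readable, the dashboards and baseline thresholds described above can be built directly on this stream.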
5 Are there AI agents running on personal credentials that security has no visibility into?
Why it matters: This is the shadow agent question, and it is arguably the most important one on this list. Shadow agents represent unknown risk. They operate outside every control you have built: no access governance, no monitoring, no incident response capability, no data loss prevention. They are the dark matter of your AI security posture. They exert gravitational force on your risk profile, but you cannot see them directly.
The typical answer: "I don't think so" or "We have a policy against that." Neither answer is based on evidence. Policies do not prevent shadow agents any more than policies prevent shadow IT. People build unofficial tools because the official path is too slow or too restrictive. If your organization has developers, data scientists, or technically capable employees (and of course it does), you almost certainly have shadow agents.
What the right answer looks like: "We actively scan for them, and here is what we found." The right answer is based on ongoing discovery efforts: network monitoring for LLM API traffic from unauthorized sources, code scanning for personal API keys, billing audits for AI provider charges, and regular surveys of technical teams. The organization has a fast-track process for bringing shadow agents into compliance, so employees are incentivized to self-report rather than hide.
What to do if the answer is wrong: Launch a shadow agent discovery initiative. Monitor network egress for traffic to LLM API endpoints. Scan repositories for API key patterns. Run an anonymous survey asking employees about their AI tool usage. Announce an amnesty period for self-reporting. Then build the infrastructure to make sanctioned agent usage easier than unsanctioned usage: a shared corporate LLM API account, a lightweight approval process, and clear minimum security requirements that are simple to follow.
Score Yourself
For each question above, give yourself one point if your organization has a credible, evidence-based answer. Be honest. "We're working on it" is a zero. "We have a policy but haven't verified compliance" is a zero. Only count it if you have implemented controls and can demonstrate they work.
Score: 0 to 1 out of 5
Your AI agent security posture has critical gaps. You likely do not have visibility into what agents exist, what they can access, or whether they are behaving correctly. This is not unusual; most organizations are here. But it means you are operating with material unquantified risk. Prioritize discovery (Questions 1 and 5) immediately. You need to know what you are dealing with before you can secure it. Consider engaging an external firm for a rapid assessment, because internal teams often lack the specialized tooling and experience to discover shadow agents efficiently.
Score: 2 to 3 out of 5
You have foundational visibility but significant gaps in enforcement or detection. You probably know about your officially deployed agents and have some access controls in place, but you may be missing shadow agents and you likely lack real-time monitoring. Focus on closing the gaps in order: access controls (Question 2), then injection defenses (Question 3), then monitoring (Question 4). Each layer you add substantially reduces your risk surface. You are in a strong position to build a comprehensive program.
Score: 4 to 5 out of 5
You are ahead of the vast majority of organizations. Your agent security program has coverage across discovery, access control, adversarial defense, and monitoring. Your next steps are about maturity: red team exercises against your controls, automation of your discovery and monitoring processes, and integration of agent security into your broader security operations workflow. Consider publishing your approach; the industry needs more examples of what good looks like.
Regardless of your score, the fact that you are asking these questions puts you ahead. AI agent adoption is accelerating, and the security practices around it are still forming. The organizations that establish strong foundations now will avoid the painful breaches and compliance failures that will inevitably hit those that wait. Start with visibility. Build toward enforcement. The questions above give you a roadmap.
Find out where you stand.
Our free AI agent security assessment evaluates your organization against all five dimensions. You will get a detailed report with prioritized recommendations in under a week.
Get Your Free Assessment