Skip navigation
AI Security

Agentic AI security: Three threats your team should know

AI agents are in your environment right now. They’re reading databases, sending messages on behalf of employees, and executing multi-step workflows across production systems. If you’re a security leader, you already know this introduces risk.

The hard part isn’t awareness. It’s the pressure to keep pace. Every week brings new agentic capabilities, new integrations, new competitive advantages your organization can’t afford to sit out. So you let adoption move forward and accept a certain level of risk, because falling behind feels worse.

That’s a reasonable trade-off. But most organizations are accepting risk they haven’t actually scoped. The agentic AI security challenge covers more ground than traditional security models account for, and without a way to think about it, it’s hard to know which exposures matter, which ones are already present, and where your existing controls fall short.

In our research at Cisco Duo into how AI agents interact with enterprise systems through protocols like MCP (Model Context Protocol), we keep seeing threats cluster into three categories. The point here isn’t to slow down adoption. It’s to give security teams a framework for reasoning about where the real exposure is, so you can keep moving forward with your eyes open.

Category 1: Misconfiguration: “The threats hiding in your setup”

The most common agentic threats aren’t attacks. They’re configuration mistakes and tooling limitations.

When organizations deploy agents, they connect them to enterprise tools and grant permissions. The urgency is part of it, but the bigger issue is tooling. The policy and configuration systems most organizations rely on were designed for human users. They don’t map cleanly to agents, which are non-human identities that need per-action scoping, tighter delegation boundaries, and identity models that most traditional tools simply don’t support. So teams do the best they can with what they have, and the result is permissions that are too broad, too persistent, and too loosely scoped. You don’t need an attacker to exploit these conditions. They’re exploitable by design.

Over-privileged agents are the simplest example. A developer wants a coding assistant to review pull requests and leave comments. They hand it their personal GitHub access token, the same one they use for CLI work. That token carries every permission the developer has: push code, merge branches, delete repositories, access private repos across the org. The agent was meant to read and comment. Now it can do everything the developer can, with no guardrails and no one in the loop.

Delegation scope drift is harder to spot. An agent starts with read-only Salesforce access. Over eight months, support tickets lead to adding write access, then export, then full API access. Each change makes sense at the time. Nobody reviews the cumulative result, which now far exceeds the original intent.

Cross-user boundary violations show up when agents can reach data belonging to someone other than the user who authorized them. An enterprise Slack integration grants agents access to “all channels the app is installed in” rather than “channels the delegating user belongs to.” One user’s agent can now read another user’s private channels and DMs.

Shared agent identities are an attribution problem. A platform team creates a single “team-devops-bot” identity shared by 20 engineers, connected to AWS, Kubernetes, and Terraform. When the agent runs a destructive terraform destroy, logs show “team-devops-bot.” Good luck figuring out which engineer triggered it.

None of these are edge cases. They’re the default outcome when you apply existing identity and access patterns to agents without rethinking the model. The fixes aren’t exotic: least-privilege scoping per agent, user-level isolation, individual identities with clear ownership, periodic review of accumulated permissions. Most teams just haven’t had the bandwidth to get there yet.

Category 2: Non-deterministic execution: “When agents go off-script”

Perfect permissions don’t prevent all damage. Agents interpret instructions, make judgment calls, and sometimes get it wrong. No attacker required.

Dangerous tool sequences are where traditional access controls break down. An automation agent (1) reads database credentials from a secrets vault, (2) queries customer PII from a database, and (3) uploads a “backup” to an S3 bucket. Every individual action is allowed. Strung together, it’s data exfiltration. If your policy only evaluates actions one at a time, you’ll never catch this.

Runaway execution is the least sophisticated failure mode and potentially the most disruptive. In March 2026, a multi-agent research system’s Analysis and Verification agents entered an undetected recursive feedback loop. The Analysis Agent expanded content based on Verification feedback, which triggered new verification questions, which triggered more analysis. Every API call succeeded. Every response was well-formed. The loop ran for eleven days before anyone noticed. Cost: $47,000 (Dev|Journal, 2026). This pattern is endemic across AI coding tools: Cursor, Copilot, and Claude Code have all had documented infinite loop incidents, with individual cases racking up hundreds to thousands of dollars in minutes.

The thread connecting these is that access control alone won’t save you. You need behavioral monitoring: what does normal agent activity actually look like, and what deviates from that? You need to evaluate tool interactions as sequences, not individual calls in isolation.

Category 3: Malicious attacks: "When there's an adversary on the other end"

The first two categories are self-inflicted. This one involves an adversary on the other end.

As agents become standard enterprise infrastructure, attackers are adapting. The attack surface is there, and they’ve already started working it.

The confused deputy is a classic attack pattern that gets significantly more dangerous with agents. In 2025, four critical-severity vulnerabilities (CVSS 9.3-9.4) hit Anthropic, Microsoft, ServiceNow, and Salesforce, all following the same pattern: an attacker injects hidden instructions into content the agent processes (an email, a web form, a Slack message), and the agent uses its legitimate permissions to exfiltrate data to the attacker. Microsoft’s EchoLeak vulnerability (CVE-2025-32711) was a zero-click attack: the victim never even opened the malicious email. Copilot’s retrieval engine ingested the payload alongside trusted SharePoint files and encoded sensitive data into an outbound URL. The agent isn’t over-privileged in any of these cases. These are identity-based attacks where the agent’s own credentials become the weapon, manipulated into misusing its legitimate access on behalf of someone who doesn’t have it.

Agent credential theft is scaling fast, and it’s a different problem than stolen passwords. When a human credential leaks, the attacker gets one person’s access, usually gated by multi-factor authentication (MFA), to a limited set of systems. Agent credentials are bearer tokens. There’s no second factor. Whoever has the key IS the agent. And because agents tend to accumulate access across multiple services (AWS, GitHub, Slack, databases), a single compromised credential can grant broad cross-system access at machine speed. Making it worse: research has found that 53% of MCP servers rely on long-lived static secrets, and only 8.5% use OAuth (ReversingLabs, 2025). These aren’t short-lived tokens that expire in an hour. They’re keys that sit valid for months. In February 2026, researchers discovered a misconfigured database on the AI agent platform Moltbook that exposed 1.5 million of these keys in plaintext (prplbx, 2026) (OpenAI, Anthropic, AWS, GitHub, Google Cloud). Any attacker with those keys could fully impersonate any agent on the platform. The broader trend is accelerating: 67% of compromised organizations experienced credential theft against cloud management consoles in 2025, and 61% of organizations now cite AI as their top data security concern (AICerts, 2026).

MCP server security is a genuinely new concern, and the numbers are sobering. The MCPTox benchmark found that 5.5% of MCP servers exhibit tool poisoning attacks, 43% are vulnerable to command injection, and a third allow unrestricted network access (MCPTox, 2026). In the first two months of 2026 alone, over 30 CVEs were filed against MCP servers, clients, and infrastructure (heyuan110, 2026). One of the most notable, CVE-2025-6514, was a CVSS 9.6 remote code execution flaw in mcp-remote (Amla Labs, 2025), an npm OAuth proxy package with over 437,000 downloads. On the supply chain side, open source MCP servers have been found with hidden reverse shells and single-line code updates that silently forward data to third-party servers (Docker, 2026).

What this means for security teams

Each category demands something different. Misconfigurations are a governance problem: better defaults, tighter guardrails, regular review. Non-deterministic execution and malicious attacks are both observability and detection problems, but different kinds: the first requires behavioral monitoring across sequences of actions, not just individual calls; the second layers on threat intelligence, credential hygiene, and infrastructure validation.

No single control covers all three. But AI agent security starts with a few things that help everywhere:

  • Least-privilege authorization at the tool-call level. Agents should get exactly the permissions they need for their specific task, evaluated per action, not granted in bulk.

  • User isolation by design. An agent acting on behalf of one user should never be able to touch another user’s data or sessions.

  • Infrastructure validation. The tools and servers agents connect to need verification. Governing the agents themselves isn’t enough if the infrastructure underneath them is compromised.

  • AI agent monitoring across sequences. A single tool call might look fine. The pattern across a session is where risk shows up.

The agentic threat landscape is growing fast, and it’s more specific than “agents might go rogue.” Keeping up with AI capability in your organization matters. So does keeping up with security practices in this space, and the two should move in lockstep, not six months apart.

Looking ahead

At Duo, we’re actively researching these threat categories as part of our work on agentic identity and MCP security. To see how we’re applying least-privilege authorization, user isolation, and infrastructure validation to the agent ecosystem, visit our Agentic AI Security page or start a free trial.

Frequently asked questions

  • What is agentic AI security?

    Agentic AI security is the practice of managing the risks that emerge when AI agents operate autonomously in enterprise systems, including the tools they connect to, the permissions they hold, and the actions they take. These risks cluster into three categories: misconfiguration (agents granted too much access), non-deterministic execution (agents making harmful decisions without an attacker involved), and malicious attacks (adversaries targeting the agent layer directly).

  • How is agentic AI security different from traditional application security?
  • What is MCP server security and why does it matter?
  • How can I protect against AI agent credential theft?