Data Security in the age of Agentic AI
24/4/2026

1. The new paradigm for Data Security

For decades, the discipline of data security rested on a seemingly stable premise: data lived in known places, moved through known paths, and was accessed by known users. Enterprises protected it by building walls - perimeter firewalls, VPNs, access control lists - around on-premises data centres where the enterprise data warehouse sat at the centre of a clearly bounded architecture. Security teams could reason about the attack surface because it was finite: a handful of databases, a corporate network, and a defined set of privileged identities. The challenge was hard, but the map was readable. That map has since been largely redrawn three times in the last ten years.

The first redrawing came with cloud adoption. As organizations migrated workloads to hyperscalers and embraced cloud-native architectures, data escaped the perimeter entirely. Object storage buckets, serverless functions, managed databases, and SaaS applications distributed sensitive information across environments that no single team controlled or could even fully inventory. Traditional Data Loss Prevention (DLP) tools, designed to inspect flows at a known exit point, were blind to data that never transited a corporate gateway. A new discipline, Data Security Posture Management (DSPM), emerged precisely to fill this gap: providing continuous discovery, classification, and risk assessment of data wherever it resided across multi-cloud and hybrid environments.

The second redrawing, still underway, is the generalization of AI across enterprise IT. The initial wave of generative AI introduced large language models as conversational interfaces over enterprise data, raising immediate questions about prompt injection, sensitive data exposure in model responses, and the governance of AI-accessible data stores. Shadow AI — employees and teams connecting sanctioned and unsanctioned LLM services to sensitive data without security oversight — emerged as an important risk vector.[1] The relationship between LLMs and humans became a key concern for data security.

But the third and most consequential redrawing is now arriving: the agentic AI era. Agentic systems do not merely respond to queries: they plan, decide, and act. These agents perceive data from diverse sources, reason over it using foundation models, and take real-world actions through tool integrations: querying databases, calling APIs, writing records, sending communications, and orchestrating downstream agents.[2] In doing so, they dissolve the boundary between data at rest and data in action, which has profound security implications.

The convergence of DSPM, DLP, and the emerging discipline of AI Security Posture Management (AISPM) appears to be the way to answer this challenge — extending posture management from static data assets to dynamic, autonomous data actors.

2. Market Landscape

2.1. Data Loss Prevention (DLP)

DLP is undergoing its most significant architectural transformation to date. The emergence of Gen AI and Agentic AI has fundamentally broken the assumptions on which legacy DLP was built, namely that sensitive data moves through structured, identifiable channels (email, USB, file uploads) and can be detected by pattern-matching against fixed signatures.[3] Today, sensitive data flows through natural language prompts, LLM context windows, and autonomous agent tool calls, creating a new class of exfiltration vectors that conventional DLP cannot see, classify, or block.

Where legacy DLP falls short in the AI era

  1. Language-based transformations bypass detection. When an employee pastes a confidential document into ChatGPT and asks for a "summary," the data leaves the organization through a paraphrase rather than a recognizable structured record. Legacy DLP sees a text string, not a leak.
  2. Agentic tool calls are invisible to network-layer DLP. AI agents that invoke APIs, execute code, or query internal databases create exfiltration paths through permitted endpoints.[4] Traditional DLP inspects data at rest or in motion, but agent-mediated transfers use legitimate channels at machine speed.
  3. Context is absent from rule engines. A DLP rule tuned to block "Social Security Number" patterns cannot assess whether an agent's output, which may synthesize customer records, proprietary algorithms, and market intelligence, constitutes a concentrated sensitive payload.
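The first limitation can be illustrated with a minimal sketch. The rule, the function name `legacy_dlp_flags`, and both sample strings are hypothetical and stand in for a signature-based detector; they are not taken from any specific DLP product.

```python
import re

# Hypothetical legacy-style rule: a fixed signature for US Social Security Numbers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def legacy_dlp_flags(text: str) -> bool:
    """Return True if the text matches the fixed signature."""
    return SSN_PATTERN.search(text) is not None


# The same fact expressed as a structured record and as a paraphrase.
structured = "Customer: Jane Doe, SSN 123-45-6789"
paraphrased = (
    "Jane's social security number starts with one two three, "
    "then forty-five, and ends in sixty-seven eighty-nine."
)

print(legacy_dlp_flags(structured))   # True: the signature matches
print(legacy_dlp_flags(paraphrased))  # False: same information, no match
```

Both strings carry the same sensitive fact, but only the first has the surface form the rule expects.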

AI-Native DLP

  1. Language-native (semantic) detection: Next‑gen DLP starts from the assumption that sensitive data is expressed in natural language, not just in neat, machine‑readable fields.[5] Instead of asking “does this text match a known pattern?”, it asks “what is this text actually saying?”. That shift enables the system to recognize sensitivity in context. Semantic models compare meaning, not surface form, so they can still flag a risk when confidential documents are paraphrased, summarized, or translated into another language. The same applies to LLM outputs that inadvertently reproduce internal training data or proprietary knowledge.
  2. Real-time, natural language policy definition: In fast‑moving AI environments, “sensitive data” can’t be frozen into a static policy pack. Security teams need to describe new risks in plain language, such as “anything referencing Project ABC launch timing”, “non‑public quarterly metrics”, or “customer data from region X”, and have the system enforce those definitions immediately. Modern DLP therefore exposes a natural‑language policy surface: you write what you care about, and the platform turns that into a detector without model‑training cycles or engineering tickets. Equally important is latency. Detection has to sit inline with prompts and responses so it can block, redact, or warn before content leaves the organization or is stored in an agent’s long‑term memory.
  3. GenAI & Agent awareness (reasoning-chain coverage): Once agents can browse, call APIs, and write back to internal systems, the real risk isn’t a single prompt or response; it is the chain of actions the agent takes over time. Next‑gen DLP therefore needs visibility into the full reasoning loop: what data an agent reads, how that data flows through intermediate steps, which tools it invokes, and where the results are sent (e.g., an agent pulls customer records from a CRM, combines them with pricing spreadsheets, summarizes the result, and then attempts to post that summary into an external collaboration tool). Traditional DLP would only see the final message; agent‑aware DLP reconstructs the entire path and can intervene earlier in the chain.
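One way to picture reasoning-chain coverage is as label propagation along an agent run: sensitivity labels picked up when data is read persist through later steps and are checked when the agent tries to write somewhere. The sketch below is a simplified illustration under that assumption. The class `AgentTrace`, the source names, and the sink name are all hypothetical, and real products may work quite differently.

```python
from dataclasses import dataclass, field

# Hypothetical catalogue of sensitive sources and external sinks (illustration only).
SENSITIVE_SOURCES = {
    "crm.customers": "customer_pii",
    "finance.pricing": "confidential_pricing",
}
EXTERNAL_SINKS = {"external_chat.post"}


@dataclass
class AgentTrace:
    """Accumulates the sensitivity labels an agent run has touched so far."""

    labels: set[str] = field(default_factory=set)
    steps: list[str] = field(default_factory=list)

    def record_read(self, source: str) -> None:
        self.steps.append(f"read:{source}")
        label = SENSITIVE_SOURCES.get(source)
        if label:
            self.labels.add(label)

    def record_transform(self, name: str) -> None:
        # Summarising or combining data does not remove its sensitivity,
        # so the labels collected so far are left untouched.
        self.steps.append(f"transform:{name}")

    def check_write(self, sink: str) -> str:
        """Decide on a tool call to `sink` using the whole run history."""
        self.steps.append(f"write:{sink}")
        if sink in EXTERNAL_SINKS and self.labels:
            return "block"
        return "allow"


# The example chain from the text: CRM + pricing -> summary -> external tool.
trace = AgentTrace()
trace.record_read("crm.customers")
trace.record_read("finance.pricing")
trace.record_transform("summarize")
decision = trace.check_write("external_chat.post")
print(decision, sorted(trace.labels))
```

A detector that only inspected the final summary would have to infer sensitivity from the text alone. Here the decision rests on what the agent read earlier in the chain.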

2.2. AI Security Posture Management (AISPM)

AI Security Posture Management (AISPM) is the discipline used to continuously discover, assess, and improve the security of AI systems, with a particular focus on how they handle sensitive data. It extends ideas from Cloud Security Posture Management (CSPM) and DSPM into the AI stack (models, agents, training pipelines, and GenAI applications) so organizations can see which AI workloads touch confidential data, how that data flows, and where misconfigurations or misuse might lead to leakage. We can think of AISPM as the “control plane” that connects AI usage back to data‑security objectives: preventing data poisoning, unauthorized access, and sensitive data exposure during training, inference, and agentic execution.

  1. Continuous discovery & data-centric visibility: AISPM, like any security platform, starts with building an accurate inventory of all AI assets and their data relationships. Tools scan cloud accounts, repositories, pipelines, and agent frameworks to find models, vector stores, notebooks, and services, including unsanctioned “shadow AI”. They also identify which datasets, tables, and object stores feed these systems, and classify the sensitive information they contain. This gives data‑security teams a graph of “which data is used by which AI,” bridging traditional DSPM with AI‑specific usage patterns. Without this mapping, it is impossible to enforce meaningful least‑privilege for data or to reason about downstream leakage if a model or agent is compromised.
  2. Risk assessment & secure configuration: Once assets and data flows are mapped, AISPM evaluates risk in context by analyzing how AI systems are configured and which sensitive datasets they touch. It inspects encryption, logging, retention, identity & access management (IAM), and network controls that could expose training or inference data. It then scores and prioritizes issues based on data sensitivity, exposure paths, attack complexity, and business impact, focusing remediation on the riskiest misconfigurations.
  3. Runtime monitoring & policy enforcement: Static posture is not enough; AISPM adds continuous runtime monitoring focused on data misuse. It analyses prompts, responses, and agent actions to spot exfiltration patterns, jailbreaks, and anomalous access to sensitive data sources. In addition, policies can enforce guardrails in real time, blocking or redacting outputs that contain classified information, or preventing over‑permissive agents from calling dangerous tools with sensitive inputs. This runtime layer complements DLP by being AI‑aware: it understands model identities, agent workflows, and AI‑specific attack paths, not just network traffic or file events.
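The first two capabilities can be sketched as an inventory plus a scoring function. Everything below is illustrative: the asset and dataset names, the 0 to 3 sensitivity scale, and the score weights are assumptions chosen for readability, not an industry formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    sensitivity: int  # hypothetical scale: 0 = public ... 3 = highly sensitive


@dataclass(frozen=True)
class AIAsset:
    name: str
    sanctioned: bool        # False models "shadow AI"
    internet_exposed: bool
    reads: tuple[str, ...]  # names of the data sources this asset consumes


def data_to_ai_graph(assets: list[AIAsset]) -> dict[str, list[str]]:
    """Build the 'which data is used by which AI' mapping."""
    graph: dict[str, list[str]] = {}
    for asset in assets:
        for source in asset.reads:
            graph.setdefault(source, []).append(asset.name)
    return graph


def risk_score(asset: AIAsset, sources: dict[str, DataSource]) -> int:
    """Toy score: most sensitive dataset touched, plus exposure penalties."""
    max_sensitivity = max((sources[s].sensitivity for s in asset.reads), default=0)
    score = max_sensitivity * 10  # assumed weight
    if asset.internet_exposed:
        score += 10               # assumed weight
    if not asset.sanctioned:
        score += 5                # assumed weight
    return score


sources = {
    "crm.customers": DataSource("crm.customers", 3),
    "docs.public": DataSource("docs.public", 0),
}
assets = [
    AIAsset("support_agent", True, True, ("crm.customers",)),
    AIAsset("notebook_rag", False, False, ("crm.customers", "docs.public")),
    AIAsset("faq_bot", True, True, ("docs.public",)),
]

graph = data_to_ai_graph(assets)
ranked = sorted(assets, key=lambda a: risk_score(a, sources), reverse=True)
for asset in ranked:
    print(asset.name, risk_score(asset, sources))
```

With these assumed weights, the internet-exposed agent reading customer data ranks first, the unsanctioned notebook second, and the public FAQ bot last. The graph also answers the reverse question: which AI assets are affected if `crm.customers` is reclassified or compromised.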

For data security, AISPM provides the missing connective tissue between where sensitive data lives and how AI systems actually use it. By combining continuous discovery of AI/data assets, contextual risk assessment of configurations and datasets, and AI‑aware runtime controls, organizations can prevent data leaks, poisoning, and misuse across the AI lifecycle — not just react after incidents occur. In modern architectures, AISPM and DLP are complementary.

For context, the table below summarizes the differences among CSPM, DSPM, ASPM, and AISPM:

2.3. Data Management for Security

Over the past decade, enterprise security has focused heavily on improving detection, response, and automation. Organizations have deployed increasingly sophisticated security tools and invested substantially in Security Operations Centers (SOCs). Yet, despite this progress, security teams continue to struggle with escalating costs, analyst overload, and diminishing returns from additional tooling. At the root of this challenge sits an often overlooked layer of the security stack: how security data is collected, stored, and managed at scale. This is where Security Data Management (SDM) is becoming increasingly critical.

The status quo (and why it fails)

Today, most enterprises rely on an architecture centred on the SIEM (Security Information and Event Management).[6] Security telemetry (logs and events) from endpoints, networks, cloud infrastructure, SaaS applications, and security tools such as EDR (Endpoint Detection and Response) or NDR (Network Detection and Response) is ingested into a SIEM, which is used for detection, correlation, investigation, and compliance reporting. For long‑term retention and audits, some data is also stored in lower‑cost data lakes.

This model no longer scales. Security data volumes continue to grow rapidly, driven by cloud and AI adoption, tool sprawl, and increased monitoring. SIEM platforms are largely priced on data ingestion volume, causing costs to rise faster than security value. Enterprises are therefore forced to limit what data they send to the SIEM, even though regulatory, forensic, and incident‑response requirements increasingly demand broader and longer data retention. Vendor lock‑in and operational complexity make SIEMs difficult to replace, while specialised skills are often needed to operate them effectively. The result is a structural trade‑off between cost control and security visibility.

The opportunity for SDM startups

This gap creates a clear opportunity for new SDM entrants. SDM platforms sit upstream of the SIEM, ingesting telemetry once and then filtering, enriching, normalising, and routing data to different destinations (e.g., SIEM for real‑time use, data lakes for retention). By separating data management from analytics, SDM platforms reduce SIEM costs, preserve access to complete datasets, and lower vendor lock‑in.

From an enterprise perspective, SDM solutions are attractive where they can:

  • materially reduce SIEM ingestion costs without losing critical security data,
  • intelligently route data between “hot” environments (SIEMs used for active detection) and “cold” environments (data lakes used for retention and audits),
  • normalise and enrich telemetry to reduce analyst workload,
  • be deployed and operated with minimal data‑engineering overhead,
  • integrate seamlessly into existing SOC (Security Operations Center) workflows rather than requiring disruptive changes,
  • and support emerging standards such as OCSF (Open Cybersecurity Schema Framework) or OpenTelemetry to future‑proof data flows.
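The normalise, enrich, and route pattern described above can be sketched in a few lines. This is a simplified illustration: the field names, the severity-based routing rule, and the `asset_owners` lookup are hypothetical, and the common event shape is not the OCSF schema.

```python
from typing import Iterable

# Assumed policy: only these severities are sent to the "hot" SIEM tier.
HOT_SEVERITIES = {"high", "critical"}


def normalise(raw: dict) -> dict:
    """Map vendor-specific field names onto one common (hypothetical) shape."""
    return {
        "source": raw.get("src") or raw.get("source", "unknown"),
        "severity": str(raw.get("sev") or raw.get("severity", "info")).lower(),
        "message": raw.get("msg") or raw.get("message", ""),
    }


def enrich(event: dict, asset_owners: dict[str, str]) -> dict:
    """Attach context so an analyst does not have to look it up later."""
    return {**event, "owner": asset_owners.get(event["source"], "unassigned")}


def route(events: Iterable[dict], asset_owners: dict[str, str]) -> dict[str, list[dict]]:
    """Ingest once; keep everything in the data lake, send only hot events to the SIEM."""
    destinations: dict[str, list[dict]] = {"siem": [], "data_lake": []}
    for raw in events:
        event = enrich(normalise(raw), asset_owners)
        destinations["data_lake"].append(event)  # complete dataset for retention
        if event["severity"] in HOT_SEVERITIES:
            destinations["siem"].append(event)   # reduced volume for detection
    return destinations


raw_events = [
    {"src": "edr-01", "sev": "CRITICAL", "msg": "malware detected"},
    {"source": "fw-02", "severity": "info", "message": "connection allowed"},
]
routed = route(raw_events, {"edr-01": "endpoint-team"})
print(len(routed["siem"]), len(routed["data_lake"]))
```

In this toy run, both events are retained in the data lake while only the critical one reaches the SIEM. That separation is how the approach aims to cut ingestion-priced volume without discarding data.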

Startups in the market and how they differentiate

The most appropriate Security Data Management (SDM) tools vary depending on the size of the enterprise and how security telemetry is used - specifically whether the primary need is general observability or security‑focused analysis.

  • For general observability, several vendors focus on flexible data pipelines that can be adapted across multiple use cases. This includes more customisable, developer‑oriented platforms such as Confluent, Cribl, and Mezmo, as well as more out‑of‑the‑box solutions such as Calyptia (acquired by Chronosphere), observo.ai, and Sawmills.
  • For security‑focused use cases, a different set of vendors emphasise security‑native context, intelligence, and workflows. Players in this category include DataBahn.ai, Onum, and Tarsal, as well as Tenzir, Auguria, and Ziggiz. DataBahn.ai, for example, positions itself around security‑native pipelines with built‑in intelligence, predefined routing logic, and automation, targeting faster time‑to‑value and reduced operational burden for security teams.

Across both segments, differentiation increasingly depends on:

  • the depth of security context, rather than generic observability capabilities,
  • the level of automation and intelligence embedded in routing and enrichment decisions,
  • usability for non‑specialist teams, limiting reliance on bespoke data engineering,
  • and the ability to support modular, vendor‑agnostic security architectures that integrate seamlessly with existing tooling.

3. Conclusion: Data as the security control plane

Enterprise data security is undergoing a structural shift. The assumptions that once made perimeter controls, static DLP, and SIEM‑centric architectures effective have been eroded by cloud, generative AI, and now agentic systems that act autonomously on sensitive data. As data becomes more distributed, dynamic, and machine‑operated, security failures increasingly stem from a lack of visibility and control over how data moves, transforms, and is used, rather than from missing detection tools.

This reality is driving a gradual convergence between AI‑native DLP, AI Security Posture Management (AISPM), and Security Data Management (SDM). Rather than relying on fixed inspection points, the security stack is moving toward continuous, context‑aware control across the full data lifecycle. SDM plays a practical enabling role in this shift by shaping and routing security data before scale and cost overwhelm downstream systems.

The implication is straightforward. As AI becomes embedded in core workflows, data itself becomes the effective control plane for security.[7] Protecting data is no longer just about where it is stored, but about governing how it is interpreted, transformed, and used by machines operating at speed. Enterprises that adapt to this model will be better positioned to scale AI responsibly. Vendors that enable it are building infrastructure that will matter regardless of how individual security tools evolve.

References and Resources

The information contained in this article is provided for informational and educational purposes only and does not constitute an investment recommendation or any other type of professional advice. The views and opinions are those of the author at the time of publication and are subject to change at any time. Any mention of a company name or security is not a recommendation to purchase.

Published on:
24/4/2026

Authors

Labinot Braimi

Principal

Julien van den Rul

Vice President
