The AI Layer: What “Layers of Data Protection” Missed
TL;DR: The classic data protection layers (connectivity, cyber, recovery) were designed for a world of human users and known attackers. AI changes that. It is a new kind of tenant in your environment: fast, capable, under-supervised, and already inside the walls. This post walks through what each existing layer looks like when AI is in the mix, why your backups are now a governance problem as well as a recovery asset, and why “AI governance” has to be treated as its own layer with its own owner.
In the last post, we stacked up the layers that keep you recoverable when things go sideways: connectivity, cybersecurity, and data protection/recovery: three disciplines that have to work together rather than sit as three independent shopping lists.
And then I published it, re‑read it a few days later, and realised I’d conveniently skipped over something that’s quietly rewriting all three.
AI.
Not “AI” in the cheerful vendor-slide sense of “now with AI‑powered anomaly detection!”, but AI as a genuine new tenant in your environment. Something that reads your data, writes your data, and increasingly takes decisions on your data, often with permissions nobody ever consciously granted it.
This post is about that gap. Call it the fourth layer, or the layer that cuts sideways through all the others. Either way, if your resilience story doesn’t account for it yet, it’s already out of date.
The uncomfortable bit first
Let’s be honest about what’s actually happening in most organisations right now:
- Developers are shipping AI-generated code to production daily, often without a meaningful review.
- “Must‑use AI” clauses are appearing in enterprise contracts, so deployment is a commercial requirement, not an engineering choice.
- Employees are installing AI tools on personal devices and quietly pointing them at corporate data to “save time”.
- SaaS vendors are turning on AI features by default, then asking forgiveness via release notes.
None of this is hypothetical. This is what the last eighteen months have actually looked like. Policy is lagging capability by a mile, and the gap is where all the interesting failure modes live.
So when we talk about layers of protection, we now need to ask a harder question: protection from whom, and from what? Because AI doesn’t fit neatly into the “external attacker” or “insider threat” buckets we’ve been using for twenty years. It’s something new: a well-meaning, tireless, extremely fast thing with access to a lot of your data and very little judgment.
Layer 1 (connectivity) revisited: AI traffic is not like user traffic
The old assumption was: users connect in, data flows back. Design your network around that and you’re mostly fine.
AI traffic breaks that assumption in subtle ways:
- Agents don’t sleep. They’ll hammer an API at 4am in ways no human workflow ever would.
- Context windows are hungry. One “innocent” prompt can pull thousands of documents across your file shares, SaaS tools and email archives in a few seconds.
- Outbound is the new inbound. A compromised or badly-configured AI tool quietly exfiltrating to a third-party model endpoint looks a lot like legitimate business traffic.
Building blocks to look for:
- Egress controls and logging for AI/model endpoints, not just user web traffic.
- Segmentation that treats AI agents as their own zone, with its own rules.
- Out-of-band access to your management and DR tooling that doesn’t depend on the same AI-integrated platforms you’re trying to contain.
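To make the first of those concrete, here is a minimal sketch of a deny-by-default egress check for model endpoints. The hostnames and log shape are invented for illustration; in practice this logic lives in your firewall or proxy policy, not application code.

```python
# Hypothetical egress policy for AI/model endpoints: deny by default,
# log every decision. Hostnames here are examples, not recommendations.
from urllib.parse import urlparse

APPROVED_MODEL_ENDPOINTS = {
    "api.openai.com",           # example: a sanctioned assistant
    "models.internal.example",  # example: a self-hosted inference gateway
}

def egress_allowed(url: str, audit_log: list) -> bool:
    """Allow outbound AI traffic only to approved hosts; record everything."""
    host = urlparse(url).hostname or ""
    allowed = host in APPROVED_MODEL_ENDPOINTS
    audit_log.append({"host": host, "allowed": allowed})
    return allowed
```

The point isn’t the ten lines of Python; it’s that every AI-bound connection produces an audit record, so the blind spot described above becomes visible.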
The uncomfortable takeaway: if you can’t see what your AI tools are talking to, you’ve got a connectivity blind spot, not just a security one.
Layer 2 (cyber) revisited: ACLs were designed for humans
Most of our access control thinking still assumes a person on the other end, clicking through folders. AI tooling doesn’t behave like that. It crawls. It indexes. It “helpfully” ingests anything it’s allowed to touch, then serves it up to whoever asks the right question.
The result is a quiet, legitimate form of data leakage that doesn’t show up on any of your usual security dashboards:
- SharePoint folders marked “internal” because nobody imagined a bot would enumerate them.
- API credentials hardcoded in AI deployment scripts because it was quicker than doing it properly.
- Email archives synced into AI assistants for “context” with no security review.
- HR databases exposed to a personal ChatGPT account because an exec wanted a faster summary.
None of this is malicious. All of it is a breach waiting to be named.
Building blocks to look for:
- Least privilege for agents, not just users. Read-only. Specific fields. Specific scopes.
- Data classification that AI tools actually respect: tags that drive behaviour, not decoration.
- Gate and log what AI can request. API gateways and MCP-style rules exist for a reason.
- Zero trust for AI: verify every request, not just the initial connection.
If your cyber story still ends at “the user authenticated and has rights to the share”, you’re protecting against yesterday’s problem.
Layer 3 (recovery) revisited: your backups are now a very tempting dataset
Here’s the one that keeps me up at night.
Your backups are, by design, the most complete, richest, most comprehensively permissioned copy of your corporate data that exists anywhere. Historically, that was a recovery feature. In an AI world, it’s also a governance problem.
Two flavours of that problem:
1. Legitimate AI tooling pointed at your backups. Someone, somewhere, is going to decide it would be great to give an LLM read-access to your backup index “just for search”. Maybe it already has been. Immutability protects the bits. It doesn’t protect you from a well-meaning integration quietly making ten years of deleted emails queryable again.
2. Attackers using AI to accelerate the old playbook. Ransomware already routinely targets backups. Add AI-driven adaptation on top and the window between “something’s wrong” and “your last good copy is gone” shrinks from days to seconds. Signature-based defences don’t keep up. Neither do humans who only look at the backup console once a week.
Building blocks to look for:
- Immutability and isolation as non-negotiable defaults, not add‑ons.
- Retention as an attack‑surface decision, not just a storage one. Not all data needs to live forever. Some of it is a liability the moment an AI tool finds it.
- Clean-room recovery: somewhere isolated, with different credentials, that isn’t wired into your AI-integrated production estate.
- Real tests that include AI-flavoured scenarios: not just “can we restore a VM”, but “can we restore into an environment that we trust isn’t already compromised”.
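The retention point can be made mechanical. As a sketch (the catalogue structure and retention windows below are invented, not from any real backup product), a periodic job can flag anything older than its class’s window for purge, so old data is a deliberate keep, not a default one:

```python
# Sketch of retention as an attack-surface decision: backup items older
# than their data class's window are flagged for purge rather than kept
# "just in case". Classes, windows, and the catalogue shape are invented.
from datetime import date, timedelta

RETENTION_DAYS = {"financial": 7 * 365, "email": 3 * 365, "scratch": 90}

def purge_candidates(catalogue, today=None):
    """Return ids of items past their retention window."""
    today = today or date.today()
    stale = []
    for item in catalogue:
        limit = timedelta(days=RETENTION_DAYS[item["class"]])
        if today - item["created"] > limit:
            stale.append(item["id"])
    return stale
```

Ten-year-old mailbox backups don’t show up in this list by accident; they show up because someone chose a seven-year window for them, and that choice has a name attached.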
If you haven’t tested recovery, you don’t have recovery. If you’ve tested it but only against 2019-era threat models, you also don’t have recovery; you’ve got nostalgia.
The fourth layer nobody wants to own: AI governance
The honest answer is that AI doesn’t slot neatly into any of the three existing layers, because it cuts across all of them. So the pragmatic thing to do is treat it as its own layer, with its own owners, its own controls, and its own place in the plan.
What does that actually look like in practice?
- An inventory. Every AI tool, every integration, every agent. What it’s connected to, what it can read, what it can write. If you can’t draw it on a whiteboard, you can’t defend it.
- Explicit scopes per agent. Not “AI can see SharePoint” but “this agent can read these sites, for these users, for this purpose, logged here”.
- A human in the loop for anything destructive. Agentic AI making production decisions at 3am without approvals isn’t innovation; it’s an incident waiting to write its own post-mortem.
- A shared definition of “acceptable”: the same social contract idea from the DR/BC post. Who signs their name next to “the AI did it”?
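The human-in-the-loop control is simple enough to sketch directly. Everything here (the action names, the queue shape) is illustrative, but the pattern is the whole point: destructive actions proposed by an agent are parked for a named human, never executed on the agent’s say-so.

```python
# Sketch of a human-in-the-loop gate: destructive actions an agent
# proposes are queued for approval instead of executed. The action
# names and queue format are invented for illustration.
DESTRUCTIVE = {"delete", "drop", "overwrite", "rotate_credentials"}

def submit(action: str, target: str, approval_queue: list) -> str:
    """Run safe actions immediately; park destructive ones for a human."""
    if action in DESTRUCTIVE:
        approval_queue.append(
            {"action": action, "target": target, "status": "pending"}
        )
        return "awaiting_approval"
    return "executed"
```

Whoever drains that queue at 3am is the person whose name goes next to “the AI did it”, which is exactly the social contract the last bullet is asking for.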
That last one matters more than people think. In a real incident, you’re going to have to explain to a regulator, a customer, or a board what happened and why. “The model decided” is not an answer anyone will accept, but it’s the answer a lot of organisations are currently building towards by accident.
Bringing it together: a practical checklist
If you want something tangible to take back to your own environment:
- Inventory your AI. Every tool, integration, and agent. Today, not next quarter.
- Map it onto the layers.
  - What does it touch on the connectivity side (endpoints, egress)?
  - What does it touch on the cyber side (permissions, data classes)?
  - What does it touch on the recovery side (backups, archives, runbooks)?
- Find the orphans. Tools nobody owns, integrations nobody reviewed, scopes nobody agreed to.
- Add AI scenarios to your tests. A compromised agent. A hallucinated delete. A shadow deployment exfiltrating quietly. If your DR tests don’t include any of those, they’re rehearsing for the wrong decade.
- Revisit the social contract. For each critical service, add a line: what happens if an AI tool is involved when this goes wrong, and who decides?
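The inventory and orphan-hunt steps above don’t need a product either; a list of records and one filter gets you started. The fields and entries below are toy data, but the rule is real: no owner, no runtime.

```python
# Toy AI inventory: every agent records an owner, what it reads, and
# what it writes. All names and fields here are invented examples.
INVENTORY = [
    {"name": "sales-assistant", "owner": "j.doe",
     "reads": ["crm:accounts"], "writes": []},
    {"name": "legacy-summariser", "owner": None,
     "reads": ["sharepoint:all"], "writes": ["email:drafts"]},
]

def orphans(inventory):
    """Agents nobody owns: the first things to review or switch off."""
    return [a["name"] for a in inventory if not a["owner"]]
```

An unowned agent with `sharepoint:all` read access and the ability to draft email is precisely the kind of entry this exercise is designed to surface.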
None of this requires a new product. Most of it is what good architecture has always looked like: least privilege, segmentation, tested recovery, clear ownership, applied honestly to a new kind of tenant in the environment.
The point
AI isn’t the end of data protection. It’s not the start of it either. It’s a new kind of participant: fast, capable, under-supervised, and already inside the walls. The protective layers we spent years building need to be re‑read with that participant in mind.
The good news: if you’ve already done the social-contract work, and you’ve already stacked your connectivity, cyber and recovery layers so that they fail safely together, you’re most of the way there. You just need to add a fourth question to the conversation.
> “When the lights go out, what was the AI doing, and did anyone sign their name next to that?”
That’s where the next version of real data protection begins.

