
On-device vs network-based AI DLP: what's actually different

8 min read · Updated May 16, 2026

Every AI DLP vendor sits in one of three architectural buckets, and the bucket determines what they can and can't see. Pick the wrong one and your “protection” has a giant hole in the middle of it. This is a practical breakdown of the three, with the failure modes vendors don't put on their datasheets.

The three architectures

  • Browser extension. A Chrome / Edge / Firefox add-on that hooks into the page DOM and intercepts prompts before they're submitted. Examples: Harmonic Protect, parts of Nightfall, parts of Prompt Security.
  • Network-based DLP / proxy. Traffic goes through a corporate proxy (often part of a SASE / SWG platform) that TLS-inspects HTTPS and scans for sensitive content. Examples: Zscaler AI Protect, Cisco Umbrella, Netskope, Palo Alto Prisma Access.
  • Device-level agent. A small daemon on the user's laptop installs a local trusted CA and proxies outbound HTTPS, intercepting AI prompts before they reach the provider. Capture is on-device; where the redaction runs varies by product. NexusNest captures on the device but redacts server-side, before the prompt reaches the provider. Examples: NexusNest, Cyberhaven endpoint, parts of Nightfall's agent.

Some products combine two of these (e.g. Nightfall has both a browser extension and an endpoint agent). The honest question isn't “which architecture” - it's “which surfaces does the architecture actually cover.”

What each architecture catches

Browser extension

Catches: prompts typed into supported web AI tools (ChatGPT, Claude, Gemini, Perplexity) in the browser you installed the extension into.

Misses: native desktop apps (the ChatGPT macOS app, Claude desktop, Cursor, Grok desktop), IDE assistants (GitHub Copilot in VS Code / JetBrains), CLI AI tools (Claude Code, the OpenAI CLI), mobile apps, and anything in a browser the user installs but you don't manage (Brave, Arc, Comet, Firefox on a BYOD laptop).

Browser extensions are the easiest deploy - push via MDM and you're done in minutes. They're also the easiest to bypass: an employee opens a different browser, the extension isn't there, the protection isn't there.

Network proxy / SASE-based DLP

Catches: any AI traffic that traverses the corporate network - in-office, on a corporate VPN, or routed through the SASE PoP from a managed device.

Misses: AI traffic from personal hotspots, home wifi without the SASE client, anything routed through certificate-pinned channels the proxy can't TLS-inspect, and AI tools the network classifier doesn't recognise yet. Network proxies also tend to have trouble with the “temporary chat” and unauthenticated flows in ChatGPT and Gemini.

Network DLP is the right answer if you already run a SASE platform and want one more checkbox on it. The cost is architectural lock-in: you're committing to that vendor's entire network stack, and rolling AI DLP out means rolling SASE out. That's a 6-12 month programme, not a Tuesday-afternoon install.

Device-level agent

Catches: any outbound HTTPS to a recognised AI endpoint, regardless of which app sent it - browser, native desktop app, IDE plugin, CLI, terminal-spawned API call. Works whether the user is on the corporate network, a coffee shop wifi, or a personal hotspot.

Misses: AI tools the agent doesn't have a schema for yet (admin can add them), and traffic the user actively works to hide (e.g. running their own forward proxy outside the device). Both are edge cases for a workforce actually trying to get their job done.
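To make "recognised AI endpoint" concrete, here is a minimal sketch of how an agent might map an outbound URL to a known tool, and how an admin-added entry closes a gap. It is an illustration only, not how any named product works: the `KNOWN_AI_HOSTS` table, the `classify` function and the `ai.example.test` host are all hypothetical, and a real agent would match on more than the hostname.

```python
from urllib.parse import urlsplit

# Hypothetical registry: hostname -> tool label. Illustrative entries only.
KNOWN_AI_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "api.openai.com": "OpenAI API",
    "claude.ai": "Claude",
    "api.anthropic.com": "Anthropic API",
}


def classify(url, registry=KNOWN_AI_HOSTS):
    """Return the tool label for a URL, or None if the host is not recognised."""
    host = (urlsplit(url).hostname or "").lower()
    for known, tool in registry.items():
        # Match the exact host or any subdomain of it, but not look-alike
        # hosts that merely end with the same characters.
        if host == known or host.endswith("." + known):
            return tool
    return None


# An unrecognised tool is missed until an admin adds it to the registry.
extended = dict(KNOWN_AI_HOSTS, **{"ai.example.test": "Internal pilot tool"})
print(classify("https://ai.example.test/v1/chat"))            # None
print(classify("https://ai.example.test/v1/chat", extended))  # Internal pilot tool
```

The match is by destination rather than by sending application, which is why the same lookup applies to a browser tab, a desktop app, or a CLI call.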

The trade-off is that a device-level agent is a heavier deploy than a browser extension - you're installing a daemon and a trusted CA on every laptop. With modern MDM and silent-install flags this is straightforward, but it's not as point-and-shoot as “push a Chrome extension.”

The surface matrix

Map your actual AI surface against the three architectures. Most teams discover within ten minutes that browser-only or network-only leaves more than half their AI usage uncovered.

  • ChatGPT browser - all three architectures cover this.
  • ChatGPT macOS / Windows desktop app - only device-level agent or network proxy.
  • Claude desktop app - device-level or network.
  • Cursor, Continue, JetBrains AI - device-level or network. Cursor specifically encrypts heavily and negotiates over a small number of endpoints that some network products handle poorly.
  • Claude Code, OpenAI CLI, Gemini CLI - device-level (catches it as outbound HTTPS) or network. No browser extension covers a terminal AI client.
  • Perplexity Comet browser - device-level (catches the underlying HTTPS regardless of which Chromium variant) or network. A Chrome-only extension is invisible here.
  • Mobile (iOS / Android AI apps) - the honest answer is that none of the three cover personal mobile devices well. Managed mobile (MDM + per-app VPN) gets you partial network coverage. Most teams accept the gap and write policy around it.
  • Personal-hotspot laptop usage - only device-level (the agent runs locally on the laptop, not on the network it's connected to).
  • Temporary / unauthenticated chats - device-level catches them by URL pattern. Network proxies often classify them as unknown traffic and miss them.
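One way to run this mapping exercise is to write the matrix down as data and query it. The sketch below encodes the list above. The surface labels and the `uncovered` helper are made up for illustration, and two judgment calls are assumptions: temporary chats are treated as not covered by network proxies (the text says only "often"), and personal mobile is treated as covered by nothing.

```python
from enum import Enum


class Arch(Enum):
    BROWSER_EXTENSION = "browser extension"
    NETWORK_PROXY = "network proxy"
    DEVICE_AGENT = "device-level agent"


NET_OR_DEVICE = {Arch.NETWORK_PROXY, Arch.DEVICE_AGENT}

# Surface label (hypothetical names) -> architectures that cover it.
COVERAGE = {
    "chatgpt_browser": set(Arch),
    "chatgpt_desktop": NET_OR_DEVICE,
    "claude_desktop": NET_OR_DEVICE,
    "ide_assistants": NET_OR_DEVICE,
    "cli_tools": NET_OR_DEVICE,
    "comet_browser": NET_OR_DEVICE,
    "personal_hotspot": {Arch.DEVICE_AGENT},
    "temporary_chats": {Arch.DEVICE_AGENT},  # assumption: proxies "often" miss these
    "personal_mobile": set(),                # assumption: no architecture covers this well
}


def uncovered(surfaces, arch):
    """Return the surfaces in use that the given architecture does not cover."""
    return [s for s in surfaces if arch not in COVERAGE[s]]


in_use = ["chatgpt_browser", "cli_tools", "personal_hotspot"]
for arch in Arch:
    print(arch.value, "->", uncovered(in_use, arch))
```

Replace `in_use` with the surfaces your own teams actually use; the output is the gap list for each architecture.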

Where redaction physically happens matters too

Independent of where the capture happens, the redaction can happen in different places - and that changes who sees the original.

  • On the device. The original prompt never leaves the user's laptop. Only the redacted version goes upstream. Best for compliance (data residency, DPDP, HIPAA).
  • In the vendor's cloud. The original prompt is shipped to the DLP vendor, scanned, redacted, then forwarded. You've replaced one third party (the AI provider) with two. The DLP vendor's retention and residency become a compliance question.
  • In your VPC / on-prem. The original prompt goes to a server you control. Adds operational cost but fully solves the data-residency question for regulated shops.

Network-based DLP almost always scans at the SASE PoP, which is operationally fine but means the vendor sees the un-redacted text in transit. Browser extensions vary - Harmonic claims local small-language-model (SLM) inference; others ship prompts to a cloud scanner. Device-level agents can do either; ask specifically.
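As a rough illustration of the on-device option, the sketch below redacts a prompt locally so that only the redacted string would be sent upstream. It is deliberately simplistic and not any vendor's method: the two regular expressions (email addresses and US-SSN-shaped numbers) and the `redact` function are assumptions chosen for the example, and real detectors use far richer techniques.

```python
import re

# Hypothetical detectors; real products use much broader and smarter ones.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt):
    """Replace each match with a [LABEL] placeholder.

    Returns the redacted prompt and the list of labels that fired.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        prompt, count = pattern.subn(f"[{label}]", prompt)
        if count:
            findings.append(label)
    return prompt, findings


original = "Mail jane@example.com, SSN 123-45-6789"
safe, found = redact(original)
print(safe)   # Mail [EMAIL], SSN [SSN]
print(found)  # ['EMAIL', 'SSN']
```

The residency question comes down to which machine runs `redact`: on the laptop, the `original` string never leaves it; in a vendor cloud or your own VPC, it travels there first.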

How to pick

The honest decision tree:

  • Already running a SASE platform from Zscaler, Netskope, Palo Alto, or Cisco, and AI is one of many controls you want on it? Use that vendor's AI module. Don't add a second product just for AI.
  • Browser is genuinely the only AI surface (no native desktop apps, no IDE AI, no CLI tools)? A browser extension is fine and easier to deploy.
  • Mixed surface - some browser, some native apps, some IDE - and you're not buying into a SASE platform? A device-level agent is the only architecture that covers all of it from a single install.
  • Regulated and care about where redaction physically happens? Insist on a no-retention guarantee for original prompts, or in-VPC / on-prem redaction, regardless of architecture.
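The same decision tree can be written as a small function, which makes the precedence explicit: an existing SASE platform wins first, a browser-only surface second, and the device-level agent is the fallback. The function name, parameters and return shape are illustrative assumptions; the recommendations themselves are taken from the bullets above.

```python
def recommend(has_sase, browser_only, regulated):
    """Return (architecture, extra_requirements) following the bullets above."""
    if has_sase:
        pick = "SASE vendor's AI module"
    elif browser_only:
        pick = "browser extension"
    else:
        pick = "device-level agent"

    # The redaction-location requirement applies regardless of architecture.
    requirements = []
    if regulated:
        requirements.append(
            "no-retention guarantee for original prompts, "
            "or in-VPC / on-prem redaction"
        )
    return pick, requirements


pick, requirements = recommend(has_sase=False, browser_only=False, regulated=True)
print(pick)          # device-level agent
print(requirements)
```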

Stuck on the decision? The AI DLP buyer's checklist walks through the 12 questions that surface the architecture answer indirectly. Or see how NexusNest's device-level approach compares to the network-based path on the vs Zscaler AI Protect page.
