
How ChatGPT, Claude, Gemini, and Copilot handle your data

10 min read · Reference · Last verified May 12, 2026

This is a reference, not a hot take. Each section summarises what the provider publicly states about retention, training, and access controls - and points at where the real risk sits once those policies are in place.

None of these policies do anything about what an employee types. They're about what happens to the data after it reaches the provider. The exfiltration vector is the prompt itself - that's the gap NexusNest closes.

OpenAI / ChatGPT

Consumer ChatGPT (Free / Plus / Pro)

  • Training: by default conversations are used to improve OpenAI's models. Users can opt out per-conversation with Temporary Chat, or globally in Settings → Data Controls.
  • Retention: conversations are stored indefinitely unless the user deletes them; deleted conversations are still retained for up to 30 days for abuse review.
  • Real risk: nothing the user types is treated as confidential. Unless they have opted out, employees using personal accounts for work are effectively contributing company data to the training of a public model.

ChatGPT Team / Enterprise

  • Training: business data is not used for training by default.
  • Retention: admin-configurable. Enterprise can be set to zero-day retention.
  • Access controls: SAML SSO, SCIM provisioning, audit log API, IP allow-listing on Enterprise.
  • Real risk: employees still paste sensitive content in plaintext. Retention limits and training opt-outs reduce downstream exposure but don't change what reaches the model in the first place. If the conversation is logged for compliance, the secret in it is logged too.

Anthropic / Claude

claude.ai consumer (Free / Pro / Max)

  • Training: by default, consumer chats are not used for training unless the user explicitly opts in. Safety-flagged conversations may still be reviewed.
  • Retention: chats are kept until the user deletes them. Feedback-flagged conversations may be stored for up to 5 years.

Claude for Work / Enterprise / API

  • Training: commercial products are governed by a separate policy. Anthropic's commercial terms say customer prompts are not used to train default models.
  • Retention: standard 30-day retention is widely cited; zero-data-retention is available for enterprise customers on request.
  • Real risk: Claude is a popular choice for long-form work, so employees paste big artefacts - docs, threads, drafts. The context window (1M tokens on Opus 4) invites larger pastes than ChatGPT.

Google / Gemini

Consumer Gemini

  • Training: a subset of chats is reviewed by human raters (including Google's service providers) to improve Google services. Turning off Gemini Apps Activity stops future chats being used to train Google's AI models.
  • Retention: auto-deleted after 18 months by default. Configurable to 3 or 36 months, or to never auto-delete. Users can delete activity any time.
  • Real risk: personal Google accounts in a corporate setting are a major exposure - Workspace audit logs don't cover them.

Gemini for Google Workspace

  • Training: Workspace content is not used to train models that serve other customers.
  • Retention: follows your Workspace data retention rules.
  • Real risk: the giant context window invites pasting full Docs / Sheets content. Workspace DLP rules scan documents at rest, not what users type into Gemini.

GitHub / Microsoft Copilot

Copilot for Individuals

  • Training: users can opt in/out of sharing prompts and completions for model improvement.
  • Real risk: regurgitation (verbatim reproduction of training code in suggestions) has been documented. Personal accounts carry weaker guarantees than the business tiers.

Copilot for Business / Enterprise

  • Training: customer prompts and completions are never used for training.
  • Retention: prompts retained for a short window for service operation.
  • Real risk: Copilot ships nearby file content as context. A .env open in another tab can be uploaded as part of a completion request. Enterprise terms don't prevent the upload - they just say what happens to it after.
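One way to reason about the open-file problem is a deny-list check applied before any file is offered as context. The following is a minimal sketch, not a description of how Copilot itself selects context: the pattern list and the function names `is_sensitive` and `split_context` are hypothetical, chosen only for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical deny-list (an assumption); tune it to your own repositories.
SENSITIVE_PATTERNS = [".env", ".env.*", "*.pem", "*.key", "id_rsa", "credentials*.json"]


def is_sensitive(path: str) -> bool:
    """Return True if the file name matches any deny-list pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS)


def split_context(open_files: list[str]) -> tuple[list[str], list[str]]:
    """Split open files into (safe to share, withheld)."""
    safe = [p for p in open_files if not is_sensitive(p)]
    withheld = [p for p in open_files if is_sensitive(p)]
    return safe, withheld


if __name__ == "__main__":
    safe, withheld = split_context(["src/app.py", "config/.env", "keys/server.pem"])
    print("safe:", safe)
    print("withheld:", withheld)
```

A file-name check like this only catches secrets that live in conventionally named files; a credential pasted into an ordinary source file passes straight through.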

Cursor

  • Training: Cursor's privacy policy states that they do not use Inputs or Suggestions to train their models - with narrow exceptions (security review, explicit feedback, or explicit user agreement). Users manage their preferences in-app.
  • Routing: Cursor proxies to OpenAI, Anthropic, or its own models depending on settings. Bring-your-own-key changes who you trust but not what data leaves the laptop.
  • Real risk: Cursor sends full file context and walks across files in agent mode. The training opt-out controls who can use the data after the fact; it does not redact what was actually sent.

The pattern

Every enterprise tier from every provider says the same thing in different words: we don't train on your data, we don't retain it long, and we limit who sees it. Those are all true and all valuable. None of them prevents the actual leak, which happens the moment the prompt is sent to the model.

If the only thing standing between an employee's clipboard and a third-party model is a checkbox in a settings menu, the risk is what gets sent - not the retention policy.
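To make "control what gets sent" concrete, here is a minimal sketch of pre-send redaction using regular expressions. It is an illustration under stated assumptions, not NexusNest's or any provider's implementation: the `PATTERNS` table and the `redact` function are hypothetical, and the three patterns (AWS-style access key IDs, email addresses, PEM private key blocks) are examples rather than a complete rule set.

```python
import re

# Illustrative patterns only (assumptions); real coverage needs a far broader set.
PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PRIVATE_KEY": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}


def redact(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a typed placeholder and count hits per type."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        prompt, n = pattern.subn(f"[REDACTED_{label}]", prompt)
        if n:
            counts[label] = n
    return prompt, counts


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    text, hits = redact("Ask ops@example.com why AKIAIOSFODNN7EXAMPLE fails")
    print(text)
    print(hits)
```

Typed placeholders keep the prompt readable for the model while the original value never leaves the machine. Pattern matching of this kind misses anything it has no rule for, so it narrows the exposure rather than eliminating it.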

Want a control that catches it before the prompt is sent? NexusNest redacts sensitive data in every prompt in-flight, before it reaches any of these providers. See how it works →

Sources

All claims in this article are taken from each provider's own public documentation. Last verified May 2026 - providers update terms frequently; check the source pages for the current version.