DLP

What is DLP?

DLP (Data Loss Prevention, or Data Leak Prevention) is the set of technologies, policies and processes that detect and block the unauthorized exfiltration of sensitive information: personal data, intellectual property, source code, trade secrets, financial information or any other classified data. It identifies content by pattern (card numbers, ID numbers, health data), by label (sensitivity labels) or by contextual analysis, and applies block, alert or quarantine rules when that content tries to leave by email, endpoint, cloud or network.
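As a rough illustration of identification by pattern, the sketch below looks for card-number candidates in text and keeps only those that pass the Luhn checksum, which is a common way to cut false positives from arbitrary digit runs. The regular expression and the function names are assumptions made for this example, not the rules of any particular DLP product.

```python
import re

# Assumption: a card candidate is 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return the candidates in text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Calling `find_card_numbers` on a message that contains the well-known test number 4111 1111 1111 1111 next to an arbitrary 16-digit order number returns only the test number, because the order number fails the checksum.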

Why does it matter?

Under GDPR, a personal data breach can trigger fines of up to 4% of global annual turnover or EUR 20 million, whichever is higher. Under NIS2 and DORA, notification obligations apply to significant (NIS2) or major (DORA) incidents, which can include incidents affecting the confidentiality of critical information. Under PCI DSS, a cardholder data leak can lead to penalties from the card brands and, in the worst case, to losing the ability to process card payments. Beyond compliance, internal leaks are no less common than external attacks: an employee downloading the client portfolio before leaving for a competitor is a real, frequent case, and one that is hard to catch without properly designed DLP controls. Data leakage prevention is also an explicit control in ISO/IEC 27001:2022 (Annex A, control 8.12).

Key points

Three DLP families: endpoint (agents controlling USB, copy-paste, printing, web uploads), email and messaging (filters scanning outbound traffic), and network/cloud (proxies and CASB controlling traffic to SaaS and external services).

Prior data classification is a prerequisite. Without knowing what's 'sensitive' in your organization, generic DLP rules generate massive false positives. Labeling sensitive documents first saves work later.

Block mode vs alert mode: always start in alert+log mode, measure false positives for 4-8 weeks, and only then move to blocking on specific policies. Blocking from day one creates resistance and creative bypass.
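The alert-first rollout can be supported with a simple measurement over the triaged alerts. The sketch below computes the false-positive rate per policy and lists the policies that might be candidates for blocking. The `Alert` structure, the 5% threshold and the minimum of 20 alerts are illustrative assumptions, not recommended values.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    policy: str           # hypothetical policy name
    false_positive: bool  # verdict assigned during analyst triage


def false_positive_rates(alerts: list[Alert]) -> dict[str, float]:
    """False-positive rate per policy over the observation window."""
    totals = Counter(a.policy for a in alerts)
    fps = Counter(a.policy for a in alerts if a.false_positive)
    return {policy: fps[policy] / totals[policy] for policy in totals}


def ready_to_block(alerts: list[Alert],
                   max_fp_rate: float = 0.05,
                   min_alerts: int = 20) -> list[str]:
    """Policies with enough alerts and a low enough false-positive rate."""
    totals = Counter(a.policy for a in alerts)
    rates = false_positive_rates(alerts)
    return sorted(
        policy for policy, rate in rates.items()
        if rate <= max_fp_rate and totals[policy] >= min_alerts
    )
```

Policies that fire rarely are deliberately left in alert mode here: a handful of alerts is too small a sample to judge how a policy would behave once it blocks.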

DLP + CASB + IAM are the usual triad for protecting data in SaaS. DLP detects, CASB controls the usage context, IAM closes the access loop.

Example: blocking exfiltration to personal Gmail

An employee tries to upload an Excel file with 12,000 client records (name, email, phone, billed amount) to their personal Gmail from the corporate laptop browser. The endpoint DLP agent detects the file, analyzes it, identifies a "client list with PII" pattern (multiple columns with sensitive fields) and applies the rule: block + notify + open ticket to the SOC. The employee sees an explanatory message, the security and compliance team is notified in real time, and a record of the attempt is kept. Without DLP, the data would have left untracked, and later detection would have required outbound traffic auditing and luck.
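A much simplified version of the "client list with PII" check could look like the sketch below. It assumes the spreadsheet has been exported to CSV text, and the header keywords, thresholds and action names are hypothetical. A real endpoint agent would parse the Excel file itself and inspect cell contents as well as headers.

```python
import csv
import io

# Assumption: these header names mark PII columns in this organization.
PII_HEADERS = {"name", "email", "phone"}


def inspect_csv(text: str, min_pii_columns: int = 2, min_rows: int = 1000) -> dict:
    """Flag a file that combines several PII columns with many rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    pii_columns = sorted({h.strip().lower() for h in header} & PII_HEADERS)
    rows = sum(1 for _ in reader)  # data rows after the header

    is_client_list = len(pii_columns) >= min_pii_columns and rows >= min_rows
    return {
        "verdict": "block" if is_client_list else "allow",
        # Hypothetical action names mirroring block + notify + SOC ticket.
        "actions": ["block_upload", "notify_user", "open_soc_ticket"] if is_client_list else [],
        "pii_columns": pii_columns,
        "rows": rows,
    }
```

Requiring both several PII columns and a minimum row count is one way to let a short contact list through while still catching bulk exports.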

Common mistakes

Buying DLP without classifying data first. A DLP rule needs to know what it protects. Without prior classification, all you generate is noise.
Limiting DLP to email. Today's leaks increasingly happen through SaaS (Google Drive, Slack, GitHub), copies to USB drives, screen captures and prompts to generative AI. All meaningful channels must be covered.
Not educating users. A DLP block without explanation feels like an obstacle and encourages bypass. A block with contextual messaging ('this file contains PII classified confidential; use the authorized folder') reduces friction.
Confusing DLP with encryption. Encryption protects data at rest and in transit when it travels through authorized channels. DLP decides whether data may leave at all. They are distinct, complementary layers.

Related services

This concept may relate to services such as:

Microsoft 365 security
GDPR compliance
Cybersecurity consulting

Frequently asked questions

Does DLP cover leakage via prompts to generative AI (ChatGPT, Copilot, etc.)?

Yes, in modern architectures. Next-generation DLP detects sending of sensitive information to public generative-AI interfaces and applies policy (block, anonymize, route to authorized enterprise version). It's one of the fastest-growing use cases in 2025-2026.
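The "anonymize" option can be pictured as a redaction step applied to the prompt before it leaves the organization. The following is a minimal sketch with two hypothetical patterns (email addresses and IBAN-shaped strings). Production tools use far richer detectors, and nothing here reflects a specific vendor's behavior.

```python
import re

# Assumption: simplified patterns, good enough for illustration only.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
IBAN_LIKE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def anonymize_prompt(prompt: str) -> tuple[str, int]:
    """Replace sensitive matches with placeholders; return text and match count."""
    redacted, n_email = EMAIL.subn("[EMAIL]", prompt)
    redacted, n_iban = IBAN_LIKE.subn("[IBAN]", redacted)
    return redacted, n_email + n_iban
```

The returned count lets the caller decide what to do next: forward the redacted prompt, log the event, or route the user to the authorized enterprise version.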

Is DLP viable without a prior data classification project?

Results are very limited. You can start with basic rules (card patterns, national IDs, IBAN) without classification, but for useful coverage you have to classify sensitive assets first. Both projects often run in parallel.
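As an example of such a basic rule, an IBAN detector can go beyond a bare pattern by verifying the standard mod-97 checksum, which filters out most random strings that merely look like an IBAN. The sketch below checks only the generic shape and the checksum. It does not validate country-specific lengths, and the function name is an assumption for this example.

```python
import re

# Generic IBAN shape: country code, two check digits, 11 to 30 alphanumerics.
IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}")


def iban_valid(candidate: str) -> bool:
    """Return True if the candidate has IBAN shape and a valid mod-97 checksum."""
    iban = re.sub(r"\s+", "", candidate).upper()
    if not IBAN_SHAPE.fullmatch(iban):
        return False
    # Move the first four characters to the end, then map letters to
    # numbers (A=10 ... Z=35); digits map to themselves.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```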

What's the difference between DLP and CASB?

DLP is the discipline and engine that decides what's considered sensitive and what to do with it. CASB is the specific enforcement point for cloud/SaaS services. A CASB can apply DLP rules in the context of a specific SaaS use (who, from where, on which device).
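The division of labor can be sketched as a small decision function: the DLP engine supplies the sensitivity label, and the CASB combines it with the usage context. The label names, the context fields and the outcomes below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    user_group: str       # who (carried for logging; not used in this sketch)
    managed_device: bool  # on which device
    sanctioned_app: bool  # whether the SaaS destination is approved


def casb_decision(label: str, ctx: Context) -> str:
    """Combine a DLP sensitivity label with usage context."""
    if label == "public":
        return "allow"
    if not ctx.sanctioned_app:
        return "block"  # sensitive data never goes to unapproved SaaS
    if not ctx.managed_device:
        # Approved app, unmanaged device: tolerate internal data with an alert.
        return "alert" if label == "internal" else "block"
    return "allow"
```

The same label can therefore produce different outcomes: a confidential file is allowed from a managed laptop to an approved SaaS tenant, but blocked from a personal device.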
