DataLossPrevention
Data Loss Prevention (DLP) Sample Data Files

You’ve been there too: setting up a data loss prevention (DLP) solution can be a damn long project if you need to support multiple languages and don’t have adequate data sources.
This repository consolidates Data Loss/Leak Prevention insights and sample files (e.g., datasets) that I have collected and used over the years. Your quality assurance library does not have to be unique; everyone strives for consistency.
Fork this repository and improve your library. Even better, send me an update :laughing:.
A DLP solution is a set of enterprise processes, tools, and techniques that monitor sensitive information and prevent data exfiltration.
What problem does it solve and why is it useful?
I wasn't happy with the provided bundle of mock files to test my DLP policies and demonstrate compliance. They were either too simple or not localized for my use case.
Friends don’t let friends test the effectiveness of a DLP solution with production data. You need realistic test data[^1] in several formats, such as CSV, JSON, SQL, TXT, and Excel, to make sure your DLP policies are working correctly, especially after a significant change.
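As an illustration of what "realistic" can mean, here is a minimal Python sketch that produces synthetic records and serializes them to CSV and JSON. It is not part of this repository. It assumes the detector under test validates card numbers with a Luhn checksum, so the fake numbers are built to pass that check. The function names, the field names, and the 16-digit layout starting with `4` are hypothetical choices made for the example.

```python
import csv
import io
import json
import random


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a string of digits."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` is doubled
    # because the check digit will be appended to its right.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Return True when the full digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def fake_card_number(rng: random.Random) -> str:
    """Build a 16-digit synthetic number (assumed layout) that passes Luhn."""
    body = "4" + "".join(str(rng.randint(0, 9)) for _ in range(14))
    return body + luhn_check_digit(body)


def make_records(count: int, seed: int = 0) -> list:
    """Generate reproducible synthetic records; every value is made up."""
    rng = random.Random(seed)
    return [
        {"id": i, "name": f"Test User {i}", "card_number": fake_card_number(rng)}
        for i in range(1, count + 1)
    ]


def to_csv(records: list) -> str:
    """Serialize the records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name", "card_number"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_json(records: list) -> str:
    """Serialize the records to pretty-printed JSON text."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    sample = make_records(3, seed=42)
    print(to_csv(sample))
    print(to_json(sample))
```

Seeding the generator keeps the sample files reproducible, which helps when you rerun the same DLP policy test after a change and want to compare results.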
dataLossPrevention by Benoît H. Dicaire is shared under the Unlicense. For more information, please refer to unlicense.org.
[^1]: Refer to the sensitive information type entity definitions provided by Microsoft for more information about the required structure.
Fake sensitive information generators
| Name | Cybersecurity | Finance | Legal | Personal | Technology |
| :-- | :--: | :--: | :--: | :--: | :--: |
| DLP Test | X | X | X | X | X |
| Fake Person Generator | X | X | X | X | X |
| Fake Generator | X | X | X | X | X |
| GenerateData.com[^2] | X | X | X | X | X |
| Get Fake Data | X | X | X | X | X |
| Get Bored Human | X | X | X | X | X |
| Mockaroo | X | X | X | X | X |
| Mock Turtle | X | X | X | X | X |
| Venkom | X | X | X | X | X |

[^2]: Source code is available on GitHub/benkeen/generatedata
You can also search GitHub for library code and C tools related to data-generator, fake-data, mock-data, mock-data-generator, and test data.
