Notes on keeping data out of models.
Privacy, PII redaction, AI regulation, and the engineering of a safe LLM boundary, written for the people who have to build it.
-
Data anonymization: techniques and tools
The practical guide to data anonymization: masking, generalization, k-anonymity, differential privacy, pseudonymization, and tokenization, plus how to choose between them and an overview of the tools landscape (Presidio, ARX, Anonde).
-
What is PII? Examples, and PII vs PHI
A plain-English definition of PII: direct vs quasi-identifiers, real examples, whether an email is PII, sensitive PII, and how PII differs from PHI under HIPAA.
-
OWASP Top 10 for LLMs: where a PII boundary fits
The OWASP Top 10 for LLM Applications mapped to a PII boundary: where redacting PII before the model helps with sensitive information disclosure, and where it does not stop prompt injection.
-
AI and data privacy: using LLMs on personal data safely
What actually happens to data you send to ChatGPT or another LLM, the real risks, and how to use a model on personal data safely by keeping PII out of the prompt.
-
How to redact PII before sending it to an LLM
A hands-on guide: detect personal data, replace it with stable tokens before the API call, and reveal the real values only inside your own trust boundary.
-
HIPAA and LLMs: handling PHI safely
What counts as PHI, why a prompt can be a disclosure, the two HIPAA de-identification methods, and how tokenizing PHI before the model reduces exposure.
-
Anonde vs Microsoft Presidio
A fair comparison of two open-source PII tools: where Presidio's Python SDK shines, and where Anonde's LLM trust-boundary and reveal-on-return workflow fits.
-
Anonymization vs pseudonymization vs tokenization
Clear definitions and when each applies: what falls outside GDPR, what stays personal data, and which approach is realistic for an LLM pipeline.
-
GDPR and LLMs: keep personal data in bounds
How GDPR applies when you send personal data to LLMs: the core principles, your duties as controller and processor, and where a PII boundary fits.
-
The EU AI Act and your LLM pipeline
What the EU AI Act actually requires when you run personal data through LLMs: the risk tiers, the dates that matter, and where a PII boundary fits.