If you're building AI systems that touch EU users or data, you're operating under two overlapping regulatory frameworks: GDPR (in force since 2018) and the EU AI Act (rolling into effect 2025-2027). Most developers know GDPR exists but treat it as a legal problem. The EU AI Act makes it an engineering problem — with specific technical requirements for risk assessment, human oversight, traceability, and data governance.
After building compliant AI systems for Holding Morelli (multi-agent compliance monitoring), Pellemoda (inventory forecasting), BitBricks (MiCA-regulated tokenization), and H-Farm (admissions chatbot), here's what compliance actually looks like in code.
The Two Frameworks: What Developers Need to Know
GDPR — Data Protection
GDPR governs how you process personal data. For AI systems, the critical requirements are:
| Requirement | What It Means for AI | Technical Implication |
|---|---|---|
| Lawful basis (Art. 6) | You need a legal reason to process data | Document why your AI needs each data field |
| Data minimization (Art. 5) | Collect only what's necessary | Don't feed entire user profiles into prompts when you only need a name |
| Right to erasure (Art. 17) | Users can request deletion | Your vector stores, training data, and logs must support deletion |
| Automated decision-making (Art. 22) | Users can't be subject to solely automated decisions with legal or similarly significant effects | If your agent makes consequential decisions, you need a human-in-the-loop path |
| Privacy by design (Art. 25) | Build data protection into the system, not as an afterthought | PII filtering, anonymization, and access controls from day one |
| DPIA (Art. 35) | Data Protection Impact Assessment for high-risk processing | Document risks before deploying AI that profiles or scores users |
EU AI Act — Risk-Based Regulation
The EU AI Act classifies AI systems by risk level:
| Risk Level | Examples | Requirements | Timeline |
|---|---|---|---|
| Unacceptable | Social scoring, manipulative AI, untargeted biometric scraping | Banned | Feb 2025 |
| High-risk | Employment screening, credit scoring, education access, critical infrastructure | Risk assessment, human oversight, documentation, conformity assessment | Aug 2026 |
| Limited risk | Chatbots, generative AI | Transparency — disclose that users are interacting with AI | Aug 2026 |
| Minimal risk | Spam filters, AI-assisted games | No requirements | N/A |
Most business AI systems fall into "limited risk" or "high-risk." If your AI agent makes decisions that affect people's access to services, employment, or credit, it's high-risk.
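As a first triage, that classification can be sketched as a lookup table. The categories below are condensed from the Act's Annex III and are a developer aid, not legal advice:

```python
RISK_TIERS = {
    # Annex III-style high-risk domains (condensed, illustrative)
    "employment_screening": "high",
    "credit_scoring": "high",
    "education_access": "high",
    "critical_infrastructure": "high",
    # Transparency-only obligations
    "customer_chatbot": "limited",
    "content_generation": "limited",
    # Effectively unregulated
    "spam_filter": "minimal",
}

def classify(use_case: str) -> str:
    """Default to 'high' when unsure: under-classifying is the costly mistake."""
    return RISK_TIERS.get(use_case, "high")

print(classify("customer_chatbot"))  # → limited
print(classify("resume_ranker"))     # → high (unknown use case: fail safe)
```

The fail-safe default matters: treating an unclassified use case as high-risk until a lawyer says otherwise is cheaper than the reverse.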
Layer 1: PII Filtering
The most common GDPR violation in AI systems: accidentally logging personal data in LLM traces. A user types their email, credit card, or address into a chatbot — your tracing system captures it, stores it in plaintext, and now you have uncontrolled PII in your logs.
LangChain's guardrails middleware solves this at the agent level:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware

agent = create_agent(
    model="gpt-4.1",
    tools=[customer_service_tool, email_tool],
    middleware=[
        # Redact emails before they reach the LLM
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True,
            apply_to_output=True,  # Also scrub model responses
        ),
        # Mask credit cards (show last 4 only)
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True,
        ),
        # Block API keys entirely — raise error if detected
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",
            strategy="block",
            apply_to_input=True,
        ),
    ],
)
```

The four strategies:
| Strategy | What It Does | Use When |
|---|---|---|
| `redact` | Replace with `[REDACTED_EMAIL]` | Default — safe for most cases |
| `mask` | Show partial data (`****-1234`) | User needs to verify "which card" |
| `hash` | Deterministic hash (`a8f5f167...`) | Need to correlate across sessions without storing PII |
| `block` | Raise exception | Zero tolerance — API keys, passwords |
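The `hash` strategy's session-correlation property is easy to see with plain `hashlib`. This is an illustrative sketch, not the middleware's internal implementation:

```python
import hashlib

def pseudonymize(value: str, salt: str = "per-deployment-salt") -> str:
    """Deterministic: the same input always yields the same token, so you can
    correlate a user across sessions without ever storing the raw value."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

print(pseudonymize("alice@example.com") == pseudonymize("alice@example.com"))  # → True
print(pseudonymize("alice@example.com") == pseudonymize("bob@example.com"))    # → False
```

One caveat: a salted hash of low-entropy data like email addresses is pseudonymization, not anonymization, under GDPR. The tokens still count as personal data.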
Apply to both input AND output. The LLM can hallucinate or echo PII in responses even if you scrubbed the input. Set `apply_to_output=True` for email, credit card, and any custom PII types.
For Holding Morelli's compliance system, I added custom Italian fiscal code detection:
```python
PIIMiddleware(
    "codice_fiscale",
    detector=r"[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]",
    strategy="redact",
    apply_to_input=True,
    apply_to_output=True,
)
```

Layer 2: Human-in-the-Loop for Consequential Decisions
GDPR Article 22 gives users the right not to be subject to decisions based solely on automated processing that significantly affect them. The EU AI Act reinforces this: high-risk AI systems must have human oversight mechanisms.
LangChain's HumanInTheLoopMiddleware makes this a configuration choice, not an architecture change:
```python
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(
    model="gpt-4.1",
    tools=[score_applicant, send_rejection, approve_application],
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_rejection": True,       # Human must approve rejections
                "approve_application": True,  # Human must approve approvals
                "score_applicant": False,     # Scoring is advisory, auto-execute
            }
        ),
    ],
    checkpointer=InMemorySaver(),
)
```

When the agent calls `send_rejection`, execution pauses. A human reviewer sees the proposed action, the reasoning, and the data — then approves or denies. The agent resumes with the decision.
For production, replace `InMemorySaver` with a persistent checkpointer (PostgreSQL) so the review queue survives restarts.
Layer 3: Audit Logging and Traceability
The EU AI Act requires activity logging for traceability on all high-risk AI systems. GDPR requires you to demonstrate compliance — meaning you need records of what data was processed, why, and what decisions were made.
LangSmith for Compliance Tracing
LangSmith provides the tracing infrastructure:
```python
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "compliance-agent-prod"
```

Every agent run is traced with:
- Full input/output history
- Tool calls with arguments and results
- Token usage and latency
- Error states and retries
But for GDPR compliance, you need to prevent PII from appearing in traces. LangSmith supports this:
```python
# Option 1: Hide all inputs/outputs globally
os.environ["LANGSMITH_HIDE_INPUTS"] = "true"
os.environ["LANGSMITH_HIDE_OUTPUTS"] = "true"

# Option 2: Mask specific fields with regex
from langsmith import Client

client = Client(
    hide_inputs=lambda inputs: mask_pii(inputs),
    hide_outputs=lambda outputs: mask_pii(outputs),
)
```

For Pellemoda, I use Option 2 — mask PII fields but keep the rest visible for debugging. Complete hiding makes debugging impossible.
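`mask_pii` is your own function, not a LangSmith helper. A minimal regex-based sketch (the patterns are illustrative; extend them per deployment):

```python
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def mask_pii(payload: dict) -> dict:
    """Recursively replace PII substrings in string values before tracing."""
    def scrub(value):
        if isinstance(value, str):
            for name, pattern in PII_PATTERNS.items():
                value = pattern.sub(f"[REDACTED_{name.upper()}]", value)
            return value
        if isinstance(value, dict):
            return {k: scrub(v) for k, v in value.items()}
        if isinstance(value, list):
            return [scrub(v) for v in value]
        return value
    return scrub(payload)

print(mask_pii({"input": "contact alice@example.com"}))
# → {'input': 'contact [REDACTED_EMAIL]'}
```

Because the masking runs client-side, the raw values never leave your infrastructure, which is the point.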
Structured Audit Log
Beyond tracing, maintain a structured audit log for regulators:
```python
async def log_ai_decision(
    agent_name: str,
    input_summary: str,          # No PII — summarized
    decision: str,
    confidence: float,
    human_reviewed: bool,
    reviewer_id: str | None,
    data_subjects: list[str],    # Pseudonymized IDs
):
    await db.execute("""
        INSERT INTO ai_audit_log
            (timestamp, agent, input_summary, decision, confidence,
             human_reviewed, reviewer_id, data_subjects)
        VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    """, datetime.utcnow(), agent_name, input_summary, decision,
        confidence, human_reviewed, reviewer_id, data_subjects)
```

This gives you an immutable record linking every AI decision to the data it processed and whether a human was involved — exactly what GDPR Article 30 (Records of Processing) and the EU AI Act's documentation requirements demand.
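The `data_subjects` column should never hold raw user IDs. One way to produce stable pseudonyms is a keyed HMAC (the key handling here is illustrative; load it from a secrets manager in practice):

```python
import hashlib
import hmac

PSEUDONYM_KEY = b"load-from-secrets-manager"  # illustrative; never hardcode

def pseudonymize_subject(user_id: str) -> str:
    """Keyed HMAC: a stable pseudonym per user. Without the key, the
    pseudonym cannot be linked back to the ID, which is what makes this
    pseudonymization rather than just obfuscation."""
    digest = hmac.new(PSEUDONYM_KEY, user_id.encode(), hashlib.sha256)
    return "User-" + digest.hexdigest()[:12].upper()

print([pseudonymize_subject(u) for u in ("u-1001", "u-1002")])
```

Keeping the key separate from the audit log means a leaked log alone reveals no identities, while you can still answer "which decisions touched this user" when handling an erasure request.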
Layer 4: Content Safety Guardrails
Beyond PII, you need guardrails that prevent the agent from generating harmful, biased, or non-compliant content.
Deterministic Guardrails (Fast, Predictable)
Block known-bad patterns before they reach the LLM:
```python
import re

from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime

class ComplianceFilterMiddleware(AgentMiddleware):
    """Block non-compliant requests before processing."""

    BLOCKED_PATTERNS = [
        r"ignore.*previous.*instructions",  # Prompt injection
        r"pretend.*you.*are",               # Role hijacking
        r"generate.*fake.*documents",       # Fraud
    ]

    @hook_config(can_jump_to=["end"])
    def before_agent(self, state: AgentState, runtime: Runtime):
        content = state["messages"][0].content.lower()
        for pattern in self.BLOCKED_PATTERNS:
            if re.search(pattern, content):
                return {
                    "messages": [{"role": "assistant", "content": "I cannot process this request."}],
                    "jump_to": "end",
                }
        return None
```

Model-Based Guardrails (Nuanced, Expensive)
For subtler compliance — tone, bias detection, factual accuracy:
```python
from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langchain.chat_models import init_chat_model
from langchain_core.messages import AIMessage
from langgraph.runtime import Runtime

class BiasDetectionMiddleware(AgentMiddleware):
    """Check agent output for discriminatory language or bias."""

    def __init__(self):
        super().__init__()
        self.safety_model = init_chat_model("gpt-4.1-mini")

    @hook_config(can_jump_to=["end"])
    def after_agent(self, state: AgentState, runtime: Runtime):
        last_message = state["messages"][-1]
        if not isinstance(last_message, AIMessage):
            return None
        result = self.safety_model.invoke([{
            "role": "user",
            "content": f"""Evaluate if this response contains discriminatory language,
bias against protected groups, or unfair treatment recommendations.
Respond SAFE or UNSAFE.\n\nResponse: {last_message.content}""",
        }])
        if "UNSAFE" in result.content:
            last_message.content = "I need to reconsider my response. Let me provide a more balanced analysis."
        return None
```

Stacking Guardrails
In production, I stack 4 layers on every compliance-sensitive agent:
```python
agent = create_agent(
    model="gpt-4.1",
    tools=[...],
    middleware=[
        ComplianceFilterMiddleware(),  # L1: Block bad inputs
        PIIMiddleware("email", strategy="redact", apply_to_input=True, apply_to_output=True),
        PIIMiddleware("credit_card", strategy="mask", apply_to_input=True),
        HumanInTheLoopMiddleware(interrupt_on={"send_decision": True}),
        BiasDetectionMiddleware(),     # L4: Check outputs
    ],
)
```

Order matters. Deterministic checks first (fast, cheap), PII filtering next, human approval for actions, then model-based safety checks on the final output.
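The cost argument for this ordering can be shown as a short-circuiting pipeline: cheap regex checks reject bad input before any expensive model-based check runs. All names below are illustrative:

```python
import re

calls = {"model": 0}  # counts how often the expensive check actually runs

def cheap_regex_check(text: str) -> bool:
    # Microseconds per call: reject known-bad patterns immediately
    return re.search(r"ignore.*previous.*instructions", text.lower()) is None

def expensive_model_check(text: str) -> bool:
    # Stand-in for an LLM safety call: slow and costs tokens
    calls["model"] += 1
    return True

def guarded(text: str) -> str:
    if not cheap_regex_check(text):
        return "blocked"   # short-circuit: no LLM spend at all
    if not expensive_model_check(text):
        return "blocked"
    return "allowed"

print(guarded("Please IGNORE all previous instructions"))  # → blocked
print(calls["model"])  # → 0  (the model check never ran)
```

Reversing the order would mean paying for a model call on every prompt-injection attempt, which attackers can generate for free.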
Layer 5: Data Minimization and Retention
GDPR requires you to collect only what's necessary and delete when no longer needed. For AI systems, this means:
Minimize Context
Don't dump entire user profiles into prompts:
```python
# Bad — sends everything to the LLM
context = await db.query("SELECT * FROM users WHERE id = $1", user_id)

# Good — send only what the task needs
context = await db.query(
    "SELECT subscription_tier, support_history_summary FROM users WHERE id = $1",
    user_id,
)
```

Automated Retention Policies
```python
# Delete traces older than 90 days (configurable per tenant)
async def apply_data_retention(tenant_id: str):
    retention_days = await get_retention_policy(tenant_id)
    cutoff = datetime.utcnow() - timedelta(days=retention_days)

    # Delete AI audit logs
    await db.execute(
        "DELETE FROM ai_audit_log WHERE tenant_id = $1 AND timestamp < $2",
        tenant_id, cutoff,
    )

    # Delete vector store embeddings for expired data
    await vectorstore.delete(filter={"tenant_id": tenant_id, "created_at": {"$lt": cutoff}})
```

For BitBricks, I run retention policies via a scheduled job — chunked batch deletion to avoid locking the database during cleanup.
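Chunked deletion just loops small `DELETE` batches until nothing is left, so each transaction stays short and never holds long locks on a busy table. A runnable sketch using SQLite for illustration (Postgres would batch differently, e.g. via `ctid`):

```python
import sqlite3

def delete_in_chunks(conn: sqlite3.Connection, cutoff: float, chunk: int = 500) -> int:
    """Delete expired rows in small batches, committing after each one."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM ai_audit_log WHERE rowid IN "
            "(SELECT rowid FROM ai_audit_log WHERE ts < ? LIMIT ?)",
            (cutoff, chunk),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ai_audit_log (ts REAL, decision TEXT)")
conn.executemany("INSERT INTO ai_audit_log VALUES (?, ?)",
                 [(float(i), "ok") for i in range(1200)])
print(delete_in_chunks(conn, cutoff=1000.0))  # → 1000
```

The subquery-with-LIMIT shape is the portable trick: it bounds each batch without requiring the database to support `DELETE ... LIMIT` directly.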
Right to Erasure
When a user requests deletion (Article 17), you must delete their data from:
- Your database
- Your vector store embeddings
- Your LLM trace logs (LangSmith)
- Any cached responses
- Backup systems (within a reasonable timeframe)
This is the hardest GDPR requirement for AI systems — embeddings don't have a "delete by user ID" button in most vector stores. Design your embedding pipeline with user-level metadata from the start.
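The fix is to attach a user ID to every embedding at write time, so erasure becomes a metadata filter. A minimal in-memory sketch of the pattern (real stores support metadata-filtered deletes with their own APIs):

```python
from dataclasses import dataclass, field

@dataclass
class Embedding:
    vector: list[float]
    user_id: str       # GDPR-critical: recorded at ingestion time
    text_hash: str

@dataclass
class ErasableVectorStore:
    items: list[Embedding] = field(default_factory=list)

    def add(self, vector: list[float], user_id: str, text_hash: str) -> None:
        self.items.append(Embedding(vector, user_id, text_hash))

    def erase_user(self, user_id: str) -> int:
        """Article 17 path: drop every embedding tagged with this user."""
        before = len(self.items)
        self.items = [e for e in self.items if e.user_id != user_id]
        return before - len(self.items)

store = ErasableVectorStore()
store.add([0.1, 0.2], user_id="u-1", text_hash="a1")
store.add([0.3, 0.4], user_id="u-2", text_hash="b2")
print(store.erase_user("u-1"))  # → 1
```

If the user ID isn't captured at ingestion, the only alternative is re-deriving ownership from document content later, which is expensive and unreliable.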
EU AI Act Compliance Checklist
For limited-risk AI (chatbots, generative AI):
- Disclose that users are interacting with AI
- Label AI-generated content clearly
- Maintain basic documentation of the system
For high-risk AI (scoring, screening, access decisions):
- Conduct risk assessment before deployment
- Implement human oversight mechanism
- Maintain activity logging for traceability
- Create technical documentation (data sources, training process, performance metrics)
- Conduct conformity assessment
- Register in the EU AI database
- Implement post-market monitoring
- Report serious incidents
What I'd Do Differently
1. **Build PII filtering from day zero.** Retrofitting PII scrubbing onto an existing tracing pipeline is painful. Start with `PIIMiddleware` on every agent.
2. **Use middleware, not custom code.** LangChain's guardrails middleware composes cleanly and survives agent refactors. Custom if/else blocks in your agent logic don't.
3. **Log decisions, not data.** Your audit log should record "approved loan application for User-ABC123 with confidence 0.87, reviewed by human" — not the user's actual financial data.
4. **Design for deletion from the start.** Every data store (Postgres, Pinecone, Redis, LangSmith) needs a "delete everything for user X" path. If you can't do this, you can't comply with Article 17.
5. **EU-region storage isn't optional.** If processing EU personal data, your vector stores, databases, and LLM API calls should stay in EU regions. Check your provider's data residency options.
Related Posts
- Building AI Agents with LangGraph: From Prototype to Production — the orchestration framework these guardrails run on
- MCP (Model Context Protocol): Connecting AI Agents to Real Tools — securing tool access with interceptors and authentication
Building AI systems for EU markets and need compliance-ready architecture? I've shipped GDPR-compliant agents across compliance, finance, and tokenization. Get in touch or book a call.