RAG for Legal Documents: HallucinationGuard and CitationValidator in Production
AI confidently cites non-existent articles and fabricates case numbers. In the legal domain, this is not just an error; it is malpractice. We built layered protection: HallucinationGuard verifies every claim, CitationValidator validates every citation, and every cited decision carries its precedent status. Zero tolerance for fabrication.
The Problem: AI Lies Confidently
Ask ChatGPT to name court decisions on copyright protection in Ukraine. It will produce 5 case numbers. Check them — 4 out of 5 do not exist. The fifth exists but is about an entirely different topic.
For a legal platform, this is unacceptable. Every case number, every legislation article, every citation — must be real.
Protection Architecture
Layer 1: HallucinationGuard
HallucinationGuard runs before the response reaches the user and verifies every factual claim in the AI response:
- Claim extraction — parses the response into individual factual claims
- Source lookup — for each claim, searches for confirmation in tool call results
- Classification: supported (found in sources), unsupported (not in sources), contradicted (conflicts with sources)
- Decision: unsupported claims are flagged or removed, contradicted claims are always removed
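The classification and decision steps above can be sketched roughly as follows. This is a minimal illustration, not the production HallucinationGuard: the names `Verdict`, `classify_claim`, and `apply_guard` are hypothetical, claim extraction is assumed to have happened already, and the `judge` callable stands in for whatever model or matcher compares a single claim against a single source. The rule that a contradiction in any source outweighs support elsewhere is also an assumption.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    SUPPORTED = "supported"        # found in sources
    UNSUPPORTED = "unsupported"    # not in sources
    CONTRADICTED = "contradicted"  # conflicts with sources


# Hypothetical signature: judge(claim, source_text) -> Verdict for that pair.
Judge = Callable[[str, str], Verdict]


def classify_claim(claim: str, sources: list[str], judge: Judge) -> Verdict:
    """Aggregate per-source verdicts into one verdict for the claim."""
    verdicts = [judge(claim, source) for source in sources]
    # Assumption: a contradiction in any source wins over support elsewhere.
    if Verdict.CONTRADICTED in verdicts:
        return Verdict.CONTRADICTED
    if Verdict.SUPPORTED in verdicts:
        return Verdict.SUPPORTED
    return Verdict.UNSUPPORTED


def apply_guard(
    claims: list[str],
    sources: list[str],
    judge: Judge,
    drop_unsupported: bool = False,
) -> list[str]:
    """Return the claims that may reach the user."""
    kept = []
    for claim in claims:
        verdict = classify_claim(claim, sources, judge)
        if verdict is Verdict.CONTRADICTED:
            continue  # contradicted claims are always removed
        if verdict is Verdict.UNSUPPORTED:
            if drop_unsupported:
                continue  # removed
            kept.append(f"[unverified] {claim}")  # flagged
        else:
            kept.append(claim)
    return kept
```

The `drop_unsupported` switch mirrors the "flagged or removed" choice for unsupported claims; which one applies in a given context is a policy decision outside this sketch.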
Layer 2: CitationValidator
CitationValidator checks specific references:
- Case numbers — verifies existence through the ZakonOnline API
- Legislation articles — validates through the Verkhovna Rada API
- Decision quotes — compares against the actual decision text
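A rough sketch of the case-number and quote checks might look like this. It does not call the real ZakonOnline or Verkhovna Rada APIs: `fetch_decision_text` is a hypothetical callable that returns the decision text for a case number, or `None` if no such case exists. The case-number pattern is an illustrative assumption and would need tuning to the formats actually in use.

```python
import re
from typing import Callable, Optional

# Illustrative pattern for numbers such as 757/1234/19 (an assumption,
# not an official specification of the case-number format).
CASE_NUMBER_RE = re.compile(r"\b\d{1,4}/\d{1,6}/\d{2}\b")

# Hypothetical lookup: case number -> full decision text, or None if unknown.
FetchDecision = Callable[[str], Optional[str]]


def _normalize(text: str) -> str:
    """Collapse whitespace and ignore case so formatting does not matter."""
    return " ".join(text.split()).casefold()


def quote_matches(quote: str, decision_text: str) -> bool:
    """True if the quote appears in the decision text after normalization."""
    return _normalize(quote) in _normalize(decision_text)


def validate_citations(answer: str, fetch_decision_text: FetchDecision) -> dict[str, bool]:
    """Map every case number found in the answer to whether it exists."""
    results = {}
    for case_number in CASE_NUMBER_RE.findall(answer):
        results[case_number] = fetch_decision_text(case_number) is not None
    return results
```

Any case number mapped to `False` would be treated as fabricated. Substring matching for quotes is deliberately strict here; a fuzzy comparison is a possible alternative, at the cost of letting paraphrases through.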
Layer 3: Precedent Status
Every decision is returned with a status:
- valid — in force, not overturned
- limited — narrowed by a higher court
- overruled — reversed
- questioned — under doubt
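The four statuses map naturally onto an enumeration attached to each returned decision. The sketch below is illustrative only: `PrecedentStatus`, `Decision`, and `render_label` are hypothetical names, and the labeling rule (show the status whenever it is not `valid`) is an assumption about how the status could be surfaced to the lawyer.

```python
from dataclasses import dataclass
from enum import Enum


class PrecedentStatus(Enum):
    VALID = "valid"            # in force, not overturned
    LIMITED = "limited"        # narrowed by a higher court
    OVERRULED = "overruled"    # reversed
    QUESTIONED = "questioned"  # under doubt


@dataclass(frozen=True)
class Decision:
    case_number: str
    status: PrecedentStatus


def render_label(decision: Decision) -> str:
    """Show the case number, with the status appended unless it is valid."""
    if decision.status is PrecedentStatus.VALID:
        return decision.case_number
    return f"{decision.case_number} [{decision.status.value}]"
```

For example, `render_label(Decision("757/1234/19", PrecedentStatus.OVERRULED))` yields `757/1234/19 [overruled]`, so a reversed decision cannot be cited without its status being visible.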
System Prompt Rule #1
"Never generate case numbers, legislation articles, or court decisions from memory. Always use tools to obtain factual data."
This is not a recommendation — it is a hard instruction. The AI cannot name any Civil Code article without calling get_legislation_article. It cannot reference a case without finding it via search_legal_precedents.
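A prompt rule alone cannot guarantee compliance, so one way to back it up is to check after generation that every case number in the answer actually came from a tool call. The following is a minimal sketch under assumptions: the `ToolCall` record and `ungrounded_case_numbers` are hypothetical, tool outputs are treated as plain text, and the case-number pattern is illustrative. Only the tool name `search_legal_precedents` comes from the rule above.

```python
import re
from dataclasses import dataclass

# Illustrative case-number pattern (an assumption, not an official format).
CASE_NUMBER_RE = re.compile(r"\b\d{1,4}/\d{1,6}/\d{2}\b")


@dataclass
class ToolCall:
    name: str    # e.g. "search_legal_precedents"
    output: str  # raw text returned by the tool


def ungrounded_case_numbers(answer: str, tool_calls: list[ToolCall]) -> set[str]:
    """Return case numbers cited in the answer that no tool call returned."""
    grounded: set[str] = set()
    for call in tool_calls:
        if call.name == "search_legal_precedents":
            grounded.update(CASE_NUMBER_RE.findall(call.output))
    cited = set(CASE_NUMBER_RE.findall(answer))
    return cited - grounded
```

A non-empty result means the model named a case from memory, and the response can be blocked or regenerated. The same idea would extend to legislation articles checked against `get_legislation_article` outputs.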
Result
Every reference in the response is clickable. Click a case number — the full text opens. Click a legislation article — see the current version. The lawyer does not take AI at its word — they verify in one click.
Zero tolerance for hallucinations is not a feature. It is the foundation.