# HallucinationGuard
HallucinationGuard compares LLM outputs against source context to detect unsupported or contradicted claims using Natural Language Inference (NLI).
## Overview

| Property    | Value               |
|-------------|---------------------|
| Latency     | 20–100 ms           |
| Memory      | ~350 MB (NLI model) |
| Async       | Yes                 |
| ML Required | Yes                 |
| License     | Professional        |
## Algorithm

1. Split the output into individual claims (sentences).
2. Split the context into context sentences.
3. For each claim, run NLI against every context sentence (O(n·m) pairs).
4. Per claim, take the maximum entailment and maximum contradiction scores across all context sentences.
5. Classify each claim as Supported, Contradicted, or Unsupported.
6. Aggregate the claim verdicts into a hallucination score.
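The steps above can be sketched in Python. The `nli` function here is a hypothetical stand-in for the real NLI classifier (it returns entailment/contradiction probabilities for a premise–hypothesis pair), and the naive sentence splitter is illustrative only:

```python
import re

def nli(premise: str, hypothesis: str) -> tuple[float, float]:
    # Hypothetical stand-in for the NLI model: returns (entailment,
    # contradiction) probabilities. The real guard calls its configured
    # NLI classifier here.
    if "Paris" in premise and "Berlin" in hypothesis:
        return (0.02, 0.95)   # claim contradicts this evidence
    return (0.10, 0.05)       # evidence neither supports nor contradicts

def split_sentences(text: str) -> list[str]:
    # Naive splitter; production code would use a proper segmenter.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def score_claims(output: str, context: str, threshold: float = 0.7):
    claims = split_sentences(output)
    evidence = split_sentences(context)
    verdicts = []
    for claim in claims:
        # O(n*m): compare each claim against every context sentence.
        scores = [nli(e, claim) for e in evidence]
        max_entail = max(s[0] for s in scores)
        max_contra = max(s[1] for s in scores)
        if max_contra >= threshold:
            verdicts.append((claim, "Contradicted"))
        elif max_entail >= threshold:
            verdicts.append((claim, "Supported"))
        else:
            verdicts.append((claim, "Unsupported"))
    # Aggregate: fraction of claims that are not supported by the context.
    flagged = sum(1 for _, v in verdicts if v != "Supported")
    return verdicts, flagged / max(len(verdicts), 1)
```

Running it on the example from the Usage section classifies the single claim as Contradicted and yields a hallucination score of 1.0.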
## Claim Verdicts

| Verdict      | Description                            |
|--------------|----------------------------------------|
| Supported    | Claim is entailed by context evidence  |
| Contradicted | Claim contradicts context evidence     |
| Unsupported  | Claim has no supporting context        |
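The verdict logic can be expressed as a simple precedence rule over the per-claim maximum scores, with contradiction checked before support. The function below is illustrative, not the library's API:

```python
def classify(max_entailment: float, max_contradiction: float,
             threshold: float = 0.7) -> str:
    # Contradiction takes precedence: explicit conflict with the context
    # is a stronger signal than partial support.
    if max_contradiction >= threshold:
        return "Contradicted"
    if max_entailment >= threshold:
        return "Supported"
    # Neither score clears the threshold: no usable evidence either way.
    return "Unsupported"
```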
## Usage

### Rust

```rust
use oxide_hallucination::HallucinationGuard;
use oxideshield_guard::Guard;

// `nli_classifier` is a previously loaded NLI model.
let guard = HallucinationGuard::new("hallucination", nli_classifier)
    .with_threshold(0.7);

let result = guard.check_with_context(
    "The capital of France is Berlin",
    "France is a country in Europe. Its capital is Paris.",
);

assert!(!result.passed);
```
### Python

```python
from oxideshield import hallucination_guard

guard = hallucination_guard(threshold=0.7)
result = guard.check_with_context(
    output="The capital of France is Berlin",
    context="France is a country in Europe. Its capital is Paris.",
)

assert not result.passed
```
## Configuration

```yaml
guards:
  - type: hallucination
    threshold: 0.7
    action: block
```
## Research References