HallucinationGuard

HallucinationGuard compares LLM outputs against source context to detect unsupported or contradicted claims using Natural Language Inference (NLI).

Overview

| Property | Value |
|---|---|
| Latency | 20-100 ms |
| Memory | ~350 MB (NLI model) |
| Async | Yes |
| ML Required | Yes |
| License | Professional |

Algorithm

  1. Split output into individual claims (sentences)
  2. Split context into context sentences
  3. For each claim, run NLI against every context sentence (O(n*m))
  4. Per claim: take max entailment and max contradiction across context sentences
  5. Classify each claim as Supported, Contradicted, or Unsupported
  6. Aggregate into a hallucination score
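
The steps above can be sketched in Python. A toy lexical-overlap scorer stands in for the real NLI model here, and the function names and score formula are illustrative assumptions, not the guard's implementation (in particular, a lexical stand-in cannot detect contradictions the way an NLI model can):

```python
import re


def nli_scores(claim: str, evidence: str) -> tuple[float, float]:
    """Stand-in for an NLI model: returns (entailment, contradiction).

    Hypothetical scorer for illustration only. A real implementation
    would run a cross-encoder NLI model over the (evidence, claim) pair.
    """
    claim_words = set(re.findall(r"\w+", claim.lower()))
    ev_words = set(re.findall(r"\w+", evidence.lower()))
    overlap = len(claim_words & ev_words) / max(len(claim_words), 1)
    # Toy heuristic: heavy word overlap ~ entailment; contradiction is
    # always 0.0 because lexical overlap cannot detect it.
    return overlap, 0.0


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter (steps 1 and 2)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def hallucination_score(output: str, context: str) -> float:
    claims = split_sentences(output)       # step 1: claims
    evidence = split_sentences(context)    # step 2: context sentences
    per_claim = []
    for claim in claims:
        # Steps 3-4: score the claim against every context sentence
        # (O(n*m) NLI calls) and keep the best entailment seen.
        max_ent = max(nli_scores(claim, ev)[0] for ev in evidence)
        per_claim.append(max_ent)
    # Step 6: the score rises as claims lose support (formula assumed).
    return 1.0 - sum(per_claim) / len(per_claim)
```

With this sketch, an output restating the context scores lower (less hallucinated) than an output the context does not support.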

Claim Verdicts

| Verdict | Description |
|---|---|
| Supported | Claim is entailed by context evidence |
| Contradicted | Claim contradicts context evidence |
| Unsupported | Claim has no supporting context |
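
One way to express this three-way mapping, assuming a single shared threshold and that contradiction is checked before entailment (both are assumptions; the guard's actual thresholds and tie-breaking are not specified here):

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    CONTRADICTED = "contradicted"
    UNSUPPORTED = "unsupported"


def classify_claim(max_entailment: float, max_contradiction: float,
                   threshold: float = 0.7) -> Verdict:
    """Map a claim's best per-context NLI scores to a verdict.

    Threshold value and check order are illustrative assumptions.
    """
    if max_contradiction >= threshold:
        return Verdict.CONTRADICTED
    if max_entailment >= threshold:
        return Verdict.SUPPORTED
    return Verdict.UNSUPPORTED
```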

Usage

Rust

```rust
use oxide_hallucination::HallucinationGuard;
use oxideshield_guard::Guard;

// `nli_classifier` is an NLI model handle constructed elsewhere.
let guard = HallucinationGuard::new("hallucination", nli_classifier)
    .with_threshold(0.7);

let result = guard.check_with_context(
    "The capital of France is Berlin",
    "France is a country in Europe. Its capital is Paris.",
);
assert!(!result.passed);
```

Python

```python
from oxideshield import hallucination_guard

guard = hallucination_guard(threshold=0.7)
result = guard.check_with_context(
    output="The capital of France is Berlin",
    context="France is a country in Europe. Its capital is Paris.",
)
assert not result.passed
```

Configuration

```yaml
guards:
  - type: hallucination
    threshold: 0.7
    action: block
```

Research References