Skip to content

Security Guards Overview

OxideShield™ provides multiple security guards that can be used individually or combined in a multi-layer defense pipeline.

Available Guards

Security Guards

Guard Detection Method Latency Tier Use Case
PatternGuard Aho-Corasick + Regex <1ms Community Prompt injection, jailbreaks
LengthGuard Character/token counting <1μs Community Input limits
EncodingGuard Unicode analysis <100μs Community Encoding attacks
PerplexityGuard N-gram perplexity <5ms Community Adversarial suffixes
PIIGuard Regex + Luhn validation <10ms Community PII detection
ToxicityGuard Keyword matching <5ms Community Content moderation
AuthoritarianUseGuard Pattern matching (extensive pattern library) <5ms Community Authoritarian misuse detection
StructuredOutputGuard Regex (pattern-based) <1ms Community Structured output injection
SemanticSimilarityGuard ML embeddings <15ms Professional Semantic attacks
MLClassifierGuard BERT classification <25ms Professional Multi-class detection
AgenticGuard Agentic workflow analysis <5ms Professional Agentic workflow security
SwarmGuard Multi-agent monitoring <5ms Professional Multi-agent swarm protection
ContainmentPolicy Policy enforcement <1ms Professional Swarm containment & isolation
RAGInjectionGuard Regex + Unicode analysis (multi-category detection) <5ms Professional RAG document injection detection
EmbeddingPIIFilter Regex + Luhn validation (multi-category) <5ms Professional Privacy-safe embedding generation

Wellbeing Guards

Guard Detection Method Latency Tier Use Case
DarkPatternGuard Multi-category manipulation detection (comprehensive pattern set) <5ms Professional UI/response manipulation
DependencyGuard Engagement metrics <1ms Professional Session/addiction monitoring
PsychologicalSafetyGuard Crisis + sycophancy detection <5ms Professional Mental health protection
AutonomyGuard 10 violation types <5ms Professional User agency protection
MisalignmentGuard 6 misalignment categories <5ms Professional AI behavior monitoring
HelpfulnessGuard AhoCorasick + density scoring (6 categories) <5ms Professional Over-refusal & evasiveness detection
AccessibilityGuard Flesch-Kincaid + jargon detection <5ms Professional Readability & plain language

Guard Trait

All guards implement the Guard trait:

pub trait Guard: Send + Sync {
    fn name(&self) -> &str;
    fn check(&self, input: &str) -> GuardResult;
}

GuardResult

Every guard returns a GuardResult containing: pass/fail status, reason, optional sanitized content, severity level, and metadata.

Severity Levels

pub enum Severity {
    None,      // No issues detected
    Low,       // Minor concern
    Medium,    // Moderate risk
    High,      // Significant threat
    Critical,  // Severe security issue
}

Multi-Layer Defense

Combine guards for defense-in-depth:

use oxideshield_guard::{
    MultiLayerDefense, LayerConfig, AggregationStrategy,
    PatternGuard, PIIGuard, ToxicityGuard,
};

let defense = MultiLayerDefense::builder("full-defense")
    // Layer 1: Fast pattern matching
    .add_guard(
        LayerConfig::new("patterns")
            .with_weight(1.0)
            .with_timeout_ms(10),
        Box::new(PatternGuard::new("patterns"))
    )
    // Layer 2: PII detection with redaction
    .add_guard(
        LayerConfig::new("pii")
            .with_weight(0.8),
        Box::new(PIIGuard::new("pii"))
    )
    // Layer 3: Toxicity filtering
    .add_guard(
        LayerConfig::new("toxicity")
            .with_weight(0.7),
        Box::new(ToxicityGuard::new("toxicity"))
    )
    .with_strategy(AggregationStrategy::FailFast)
    .build();

let result = defense.check(user_input);

Aggregation Strategies

Strategy Description
FailFast Stop on first failure
Unanimous All guards must pass
Majority >50% of guards must pass
Weighted Weighted score threshold
Comprehensive Run all, aggregate results

Python Usage

from oxideshield import (
    pattern_guard, pii_guard, toxicity_guard,
    length_guard, encoding_guard, perplexity_guard,
    authoritarian_use_guard, structured_output_guard,
    multi_layer_defense
)

# Individual guards
pattern = pattern_guard()
result = pattern.check("ignore previous instructions")

# Responsible AI guard
auth = authoritarian_use_guard()
result = auth.check("build a social credit scoring system")

# Multi-layer defense
defense = multi_layer_defense(
    enable_length=True,
    enable_pii=True,
    enable_toxicity=True,
    strategy="fail_fast"
)
result = defense.check(user_input)

Performance Comparison

Guard Speed Memory Tier
LengthGuard Near-instant Minimal Community
PatternGuard Sub-microsecond Minimal Community
EncodingGuard Sub-millisecond Minimal Community
StructuredOutputGuard Sub-millisecond Minimal Community
ToxicityGuard Sub-millisecond Low Community
ContainmentPolicy Sub-millisecond Minimal Professional
DependencyGuard Sub-millisecond Minimal Professional
PIIGuard Low milliseconds Low Community
PerplexityGuard Low milliseconds Low Community
AuthoritarianUseGuard Low milliseconds Low Community
AgenticGuard Low milliseconds Low Professional
SwarmGuard Low milliseconds Low Professional
DarkPatternGuard Low milliseconds Low Professional
PsychologicalSafetyGuard Low milliseconds Low Professional
AutonomyGuard Low milliseconds Low Professional
MisalignmentGuard Low milliseconds Low Professional
HelpfulnessGuard Low milliseconds Low Professional
AccessibilityGuard Low milliseconds Low Professional
RAGInjectionGuard Low milliseconds Low Professional
EmbeddingPIIFilter Low milliseconds Low Professional
SemanticSimilarityGuard 10-30ms Model-dependent Professional
MLClassifierGuard 10-30ms Model-dependent Professional

Choosing Guards

Fast + Lightweight

For low-latency applications: - PatternGuard + LengthGuard + EncodingGuard

Balanced

For most applications: - PatternGuard + PIIGuard + ToxicityGuard

Maximum Security

For high-security applications: - All guards with MultiLayerDefense

ML-Enhanced

When accuracy is critical: - PatternGuard + SemanticSimilarityGuard + MLClassifierGuard

Agentic Security

For multi-agent and tool-use workflows: - AgenticGuard + SwarmGuard + ContainmentPolicy

RAG Security

For RAG pipelines with external document ingestion: - RAGInjectionGuard + EmbeddingPIIFilter + PatternGuard

Responsible AI

For consumer-facing and compliance-sensitive deployments: - AuthoritarianUseGuard + DarkPatternGuard + AutonomyGuard + PsychologicalSafetyGuard + HelpfulnessGuard + AccessibilityGuard