Skip to content

Multi-Layer Defense

OxideShield™ implements a defense-in-depth architecture inspired by research-backed approaches. Combine multiple guards into a layered defense that balances security coverage with performance.

Architecture

Layer 1: Fast Regex (PatternGuard)     → <1ms, ~70% detection
Layer 2: Perplexity Analysis           → <5ms, +10% detection
Layer 3: Semantic/ML (if enabled)      → <25ms, +15% detection
Layer 4: PII/Toxicity Filters          → <10ms, comprehensive

Why Multi-Layer?

Single Guard Multi-Layer Defense
Single point of failure Redundant detection
Optimized for one attack type Broad coverage
Attacker only needs one bypass Attacker must bypass all layers
~70% detection rate ~95% detection rate

Research Foundation

OxideShield™'s multi-layer defense is inspired by peer-reviewed research:

  • PromptGuard - Nature Scientific Reports, 2025
  • 4-layer defense: regex + MiniBERT + semantic + adaptive
  • Achieved F1=0.91, 67% injection reduction

  • The Attacker Moves Second

  • 12 defenses bypassed at >90% with adaptive attacks
  • Validates need for defense-in-depth

Layer Types

Type Speed Use Case Default Weight
Regex <1ms Pattern matching, known attacks 1.0
Perplexity <5ms Adversarial suffix detection 1.0
PII <10ms Personal data protection 1.0
Toxicity <10ms Content moderation 1.0
MLClassifier <100ms ML-based classification 1.5
Semantic <100ms Embedding similarity 1.5
Custom Varies User-defined guards 1.0

Aggregation Strategies

Choose how guard results combine to determine the final action.

FailFast (Default)

Stops at first blocking detection. Fastest response time.

multilayer:
  strategy: fail_fast

Behavior: - First guard that blocks → immediate block - Best for: High-security environments - Trade-off: May miss additional detections for logging

Input → PatternGuard [BLOCK] → STOP
                        Return: Block

Unanimous

Requires all guards to detect a threat before blocking. Minimizes false positives.

multilayer:
  strategy: unanimous

Behavior: - All guards must block → block - Any guard passes → allow - Best for: User-facing applications where false positives are costly - Trade-off: May allow edge-case attacks

Input → PatternGuard [BLOCK] → Perplexity [BLOCK] → PII [PASS]
                                                  Return: Allow

Majority

More than 50% of guards must detect to block. Balanced approach.

multilayer:
  strategy: majority

Behavior: - >50% guards block → block - ≤50% guards block → allow - Best for: General-purpose deployments - Trade-off: Moderate security/usability balance

5 guards: 3 block, 2 pass → Block (60% blocked)
5 guards: 2 block, 3 pass → Allow (40% blocked)

Early Termination: Majority strategy includes smart early termination: - Stops once majority threshold is reached - Stops if remaining guards can't change outcome

Weighted

Guards contribute based on their configured weight. ML-based guards can have higher influence.

multilayer:
  strategy: weighted
  layers:
    - type: regex
      weight: 1.0
    - type: semantic
      weight: 1.5  # Higher confidence
    - type: perplexity
      weight: 1.2

Behavior: - Sum blocked weights vs total weights - Block if blocked_weight/total_weight > 0.5 - Best for: Fine-tuned deployments - Trade-off: Requires weight calibration

Pattern (1.0) blocks, Semantic (1.5) passes, Perplexity (1.2) blocks
Blocked weight: 2.2, Total: 3.7, Ratio: 0.59 → Block

Comprehensive

Runs all guards and returns combined results. Maximum visibility.

multilayer:
  strategy: comprehensive

Behavior: - Always runs all guards - Blocks if any guard blocks - Returns all matches from all guards - Best for: Security auditing, debugging - Trade-off: Highest latency

Configuration

YAML Configuration

# oxideshield.yaml
multilayer:
  name: production_defense
  strategy: fail_fast
  enable_telemetry: true

  layers:
    # Layer 1: Fast pattern matching
    - name: pattern
      type: regex
      weight: 1.0
      enabled: true
      timeout_ms: 10
      config:
        categories:
          - prompt_injection
          - jailbreak

    # Layer 2: Entropy analysis
    - name: perplexity
      type: perplexity
      weight: 1.0
      enabled: true
      timeout_ms: 50
      config:
        threshold: 100

    # Layer 3: Semantic similarity (requires semantic feature)
    - name: semantic
      type: semantic
      weight: 1.5
      enabled: true
      timeout_ms: 100
      config:
        threshold: 0.85

    # Layer 4: PII protection
    - name: pii
      type: pii
      weight: 1.0
      enabled: true
      timeout_ms: 50
      config:
        action: sanitize

Rust API

use oxide_guard::multilayer::{
    MultiLayerDefense,
    AggregationStrategy,
    LayerConfig
};
use oxide_guard::guards::{PatternGuard, PerplexityGuard, PIIGuard};

let defense = MultiLayerDefense::builder("production")
    .add_guard(
        LayerConfig::regex().with_weight(1.0),
        Box::new(PatternGuard::new("pattern")),
    )
    .add_guard(
        LayerConfig::perplexity().with_weight(1.0),
        Box::new(PerplexityGuard::new("perplexity")),
    )
    .add_guard(
        LayerConfig::pii().with_weight(1.0),
        Box::new(PIIGuard::new("pii")),
    )
    .with_strategy(AggregationStrategy::FailFast)
    .with_telemetry(true)
    .build();

// Check content
let result = defense.check("user input here");

if !result.passed {
    println!("Blocked by: {}", result.summary);
    for layer in &result.layer_results {
        println!("  {} ({:?}): passed={}, duration={}ms",
            layer.layer_name,
            layer.layer_type,
            layer.result.passed,
            layer.duration.as_millis()
        );
    }
}

Python API

from oxideshield import MultiLayerDefense, AggregationStrategy

defense = MultiLayerDefense.builder("production") \
    .add_pattern_guard(weight=1.0) \
    .add_perplexity_guard(weight=1.0) \
    .add_pii_guard(weight=1.0) \
    .with_strategy(AggregationStrategy.FailFast) \
    .with_telemetry(True) \
    .build()

result = defense.check("user input here")

if not result.passed:
    print(f"Blocked: {result.summary}")
    for layer in result.layer_results:
        print(f"  {layer.name}: {layer.passed} ({layer.duration_ms}ms)")

Telemetry

Multi-layer defense includes built-in telemetry for monitoring and optimization.

Enabling Telemetry

let defense = MultiLayerDefense::builder("monitored")
    .add_guard(config, guard)
    .with_telemetry(true)  // Enable telemetry
    .build();

Available Metrics

let telemetry = defense.telemetry().unwrap();

// Overall statistics
println!("Total checks: {}", telemetry.total_checks());
println!("Passed: {}", telemetry.passed_checks());
println!("Blocked: {}", telemetry.blocked_checks());
println!("Block rate: {:.2}%", telemetry.block_rate() * 100.0);

// Per-layer statistics
for (name, stats) in telemetry.layer_stats() {
    println!("{}: {} checks, {} detections, {:.2}ms avg",
        name,
        stats.checks,
        stats.detections,
        stats.avg_duration_ms
    );
}

Prometheus Metrics

When used with the proxy, multi-layer telemetry exports to Prometheus:

# Layer-level metrics
oxideshield_multilayer_checks_total{defense="production"} 15000
oxideshield_multilayer_blocks_total{defense="production"} 150
oxideshield_multilayer_block_rate{defense="production"} 0.01

# Per-layer metrics
oxideshield_layer_checks_total{defense="production",layer="pattern"} 15000
oxideshield_layer_detections_total{defense="production",layer="pattern"} 120
oxideshield_layer_duration_ms{defense="production",layer="pattern",quantile="0.99"} 0.8

Strategy Selection Guide

Use Case Recommended Strategy Reason
API gateway FailFast Minimize latency
Chat application Majority Balance UX and security
Financial services Unanimous Minimize false positives
Security research Comprehensive Full visibility
Custom ML pipeline Weighted Leverage model confidence

Decision Flowchart

                    ┌─────────────────────┐
                    │ What's your priority? │
                    └─────────┬───────────┘
          ┌───────────────────┼───────────────────┐
          │                   │                   │
      Latency            Accuracy           Visibility
          │                   │                   │
    ┌─────▼─────┐      ┌─────▼─────┐      ┌─────▼─────┐
    │ FailFast  │      │ Are false │      │Comprehensive│
    └───────────┘      │ positives │      └───────────┘
                       │ critical? │
                       └─────┬─────┘
                    Yes ─────┼───── No
                    ┌────────▼────────┐
                    │   Unanimous or  │
                    │    Majority     │
                    └─────────────────┘

Performance Considerations

Layer Ordering

Order layers from fastest to slowest for FailFast strategy:

layers:
  - type: regex      # <1ms - check first
  - type: perplexity # <5ms
  - type: pii        # <10ms
  - type: semantic   # <100ms - check last

Timeouts

Configure per-layer timeouts to prevent slow guards from blocking requests:

layers:
  - type: semantic
    timeout_ms: 100  # Fail open after 100ms

Disabling Layers

Conditionally disable expensive layers:

layers:
  - type: semantic
    enabled: ${ENABLE_SEMANTIC:-false}  # Disabled by default

Result Structure

The MultiLayerResult contains complete information about the check:

Field Type Description
passed bool Overall pass/fail
action GuardAction Final action (Block, Allow, etc.)
layer_results Vec<LayerResult> Results from each layer
all_matches Vec<Match> Combined matches from all layers
total_duration Duration Total execution time
strategy AggregationStrategy Strategy used
summary String Human-readable summary

LayerResult

Field Type Description
layer_name String Layer identifier
layer_type LayerType Type of layer
result GuardCheckResult Full guard result
duration Duration Layer execution time
weight f32 Layer weight

Example: Production Configuration

# production-multilayer.yaml
multilayer:
  name: production_defense
  strategy: fail_fast
  enable_telemetry: true

  layers:
    # Layer 1: Known attack patterns
    - name: known_attacks
      type: regex
      weight: 1.0
      timeout_ms: 10
      config:
        categories: [prompt_injection, jailbreak, system_prompt_leak]
        severity: high

    # Layer 2: Adversarial detection
    - name: adversarial
      type: perplexity
      weight: 1.0
      timeout_ms: 50
      config:
        threshold: 100
        action: block

    # Layer 3: Semantic similarity (premium feature)
    - name: semantic_match
      type: semantic
      weight: 1.5
      enabled: ${OXIDESHIELD_PREMIUM:-false}
      timeout_ms: 100
      config:
        threshold: 0.85
        embeddings_path: /opt/oxideshield/embeddings.bincode

    # Layer 4: Privacy protection
    - name: privacy
      type: pii
      weight: 1.0
      timeout_ms: 50
      config:
        action: sanitize
        entities: [email, phone, ssn, credit_card]

Next Steps