Multi-Layer Defense¶

OxideShield™ implements a defense-in-depth architecture inspired by research-backed approaches. Combine multiple guards into a layered defense that balances security coverage with performance.

Architecture¶

Layer 1: Fast Regex (PatternGuard)     → <1ms, ~70% detection
Layer 2: Perplexity Analysis           → <5ms, +10% detection
Layer 3: Semantic/ML (if enabled)      → <25ms, +15% detection
Layer 4: PII/Toxicity Filters          → <10ms, comprehensive

Why Multi-Layer?¶

Single Guard	Multi-Layer Defense
Single point of failure	Redundant detection
Optimized for one attack type	Broad coverage
Attacker only needs one bypass	Attacker must bypass all layers
~70% detection rate	~95% detection rate

Research Foundation¶

OxideShield™'s multi-layer defense is inspired by peer-reviewed research:

PromptGuard - Nature Scientific Reports, 2025
4-layer defense: regex + MiniBERT + semantic + adaptive
Achieved F1=0.91, 67% injection reduction
The Attacker Moves Second
12 defenses bypassed at >90% with adaptive attacks
Validates need for defense-in-depth

Layer Types¶

Type	Speed	Use Case	Default Weight
`Regex`	<1ms	Pattern matching, known attacks	1.0
`Perplexity`	<5ms	Adversarial suffix detection	1.0
`PII`	<10ms	Personal data protection	1.0
`Toxicity`	<10ms	Content moderation	1.0
`MLClassifier`	<100ms	ML-based classification	1.5
`Semantic`	<100ms	Embedding similarity	1.5
`Custom`	Varies	User-defined guards	1.0

Aggregation Strategies¶

Choose how guard results combine to determine the final action.

FailFast (Default)¶

Stops at first blocking detection. Fastest response time.

multilayer:
  strategy: fail_fast

Behavior: - First guard that blocks → immediate block - Best for: High-security environments - Trade-off: May miss additional detections for logging

Input → PatternGuard [BLOCK] → STOP
                              ↓
                        Return: Block

Unanimous¶

Requires all guards to detect a threat before blocking. Minimizes false positives.

multilayer:
  strategy: unanimous

Behavior: - All guards must block → block - Any guard passes → allow - Best for: User-facing applications where false positives are costly - Trade-off: May allow edge-case attacks

Input → PatternGuard [BLOCK] → Perplexity [BLOCK] → PII [PASS]
                                                        ↓
                                                  Return: Allow

Majority¶

More than 50% of guards must detect to block. Balanced approach.

multilayer:
  strategy: majority

Behavior: - >50% guards block → block - ≤50% guards block → allow - Best for: General-purpose deployments - Trade-off: Moderate security/usability balance

5 guards: 3 block, 2 pass → Block (60% blocked)
5 guards: 2 block, 3 pass → Allow (40% blocked)

Early Termination: Majority strategy includes smart early termination: - Stops once majority threshold is reached - Stops if remaining guards can't change outcome

Weighted¶

Guards contribute based on their configured weight. ML-based guards can have higher influence.

multilayer:
  strategy: weighted
  layers:
    - type: regex
      weight: 1.0
    - type: semantic
      weight: 1.5  # Higher confidence
    - type: perplexity
      weight: 1.2

Behavior: - Sum blocked weights vs total weights - Block if blocked_weight/total_weight > 0.5 - Best for: Fine-tuned deployments - Trade-off: Requires weight calibration

Pattern (1.0) blocks, Semantic (1.5) passes, Perplexity (1.2) blocks
Blocked weight: 2.2, Total: 3.7, Ratio: 0.59 → Block

Comprehensive¶

Runs all guards and returns combined results. Maximum visibility.

multilayer:
  strategy: comprehensive

Behavior: - Always runs all guards - Blocks if any guard blocks - Returns all matches from all guards - Best for: Security auditing, debugging - Trade-off: Highest latency

Configuration¶

YAML Configuration¶

# oxideshield.yaml
multilayer:
  name: production_defense
  strategy: fail_fast
  enable_telemetry: true

  layers:
    # Layer 1: Fast pattern matching
    - name: pattern
      type: regex
      weight: 1.0
      enabled: true
      timeout_ms: 10
      config:
        categories:
          - prompt_injection
          - jailbreak

    # Layer 2: Entropy analysis
    - name: perplexity
      type: perplexity
      weight: 1.0
      enabled: true
      timeout_ms: 50
      config:
        threshold: 100

    # Layer 3: Semantic similarity (requires semantic feature)
    - name: semantic
      type: semantic
      weight: 1.5
      enabled: true
      timeout_ms: 100
      config:
        threshold: <threshold>  # Configure per deployment

    # Layer 4: PII protection
    - name: pii
      type: pii
      weight: 1.0
      enabled: true
      timeout_ms: 50
      config:
        action: sanitize

Rust API¶

use oxideshield_guard::multilayer::{
    MultiLayerDefense,
    AggregationStrategy,
    LayerConfig
};
use oxideshield_guard::guards::{PatternGuard, PerplexityGuard, PIIGuard};

let defense = MultiLayerDefense::builder("production")
    .add_guard(
        LayerConfig::regex().with_weight(1.0),
        Box::new(PatternGuard::new("pattern")),
    )
    .add_guard(
        LayerConfig::perplexity().with_weight(1.0),
        Box::new(PerplexityGuard::new("perplexity")),
    )
    .add_guard(
        LayerConfig::pii().with_weight(1.0),
        Box::new(PIIGuard::new("pii")),
    )
    .with_strategy(AggregationStrategy::FailFast)
    .with_telemetry(true)
    .build();

// Check content
let result = defense.check("user input here");

if !result.passed {
    println!("Blocked by: {}", result.summary);
    for layer in &result.layer_results {
        println!("  {} ({:?}): passed={}, duration={}ms",
            layer.layer_name,
            layer.layer_type,
            layer.result.passed,
            layer.duration.as_millis()
        );
    }
}

Python API¶

from oxideshield import MultiLayerDefense, AggregationStrategy

defense = MultiLayerDefense.builder("production") \
    .add_pattern_guard(weight=1.0) \
    .add_perplexity_guard(weight=1.0) \
    .add_pii_guard(weight=1.0) \
    .with_strategy(AggregationStrategy.FailFast) \
    .with_telemetry(True) \
    .build()

result = defense.check("user input here")

if not result.passed:
    print(f"Blocked: {result.summary}")
    for layer in result.layer_results:
        print(f"  {layer.name}: {layer.passed} ({layer.duration_ms}ms)")

Telemetry¶

Multi-layer defense includes built-in telemetry for monitoring and optimization.

Enabling Telemetry¶

let defense = MultiLayerDefense::builder("monitored")
    .add_guard(config, guard)
    .with_telemetry(true)  // Enable telemetry
    .build();

Available Metrics¶

let telemetry = defense.telemetry().unwrap();

// Overall statistics
println!("Total checks: {}", telemetry.total_checks());
println!("Passed: {}", telemetry.passed_checks());
println!("Blocked: {}", telemetry.blocked_checks());
println!("Block rate: {:.2}%", telemetry.block_rate() * 100.0);

// Per-layer statistics
for (name, stats) in telemetry.layer_stats() {
    println!("{}: {} checks, {} detections, {:.2}ms avg",
        name,
        stats.checks,
        stats.detections,
        stats.avg_duration_ms
    );
}

Prometheus Metrics¶

When used with the proxy, multi-layer telemetry exports to Prometheus:

# Layer-level metrics
oxideshield_multilayer_checks_total{defense="production"} 15000
oxideshield_multilayer_blocks_total{defense="production"} 150
oxideshield_multilayer_block_rate{defense="production"} 0.01

# Per-layer metrics
oxideshield_layer_checks_total{defense="production",layer="pattern"} 15000
oxideshield_layer_detections_total{defense="production",layer="pattern"} 120
oxideshield_layer_duration_ms{defense="production",layer="pattern",quantile="0.99"} 0.8

Strategy Selection Guide¶

Use Case	Recommended Strategy	Reason
API gateway	FailFast	Minimize latency
Chat application	Majority	Balance UX and security
Financial services	Unanimous	Minimize false positives
Security research	Comprehensive	Full visibility
Custom ML pipeline	Weighted	Leverage model confidence

Decision Flowchart¶

                    ┌─────────────────────┐
                    │ What's your priority? │
                    └─────────┬───────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
      Latency            Accuracy           Visibility
          │                   │                   │
    ┌─────▼─────┐      ┌─────▼─────┐      ┌─────▼─────┐
    │ FailFast  │      │ Are false │      │Comprehensive│
    └───────────┘      │ positives │      └───────────┘
                       │ critical? │
                       └─────┬─────┘
                             │
                    Yes ─────┼───── No
                             │
                    ┌────────▼────────┐
                    │   Unanimous or  │
                    │    Majority     │
                    └─────────────────┘

Performance Considerations¶

Layer Ordering¶

Order layers from fastest to slowest for FailFast strategy:

layers:
  - type: regex      # <1ms - check first
  - type: perplexity # <5ms
  - type: pii        # <10ms
  - type: semantic   # <100ms - check last

Timeouts¶

Configure per-layer timeouts to prevent slow guards from blocking requests:

layers:
  - type: semantic
    timeout_ms: 100  # Fail open after 100ms

Disabling Layers¶

Conditionally disable expensive layers:

layers:
  - type: semantic
    enabled: ${ENABLE_SEMANTIC:-false}  # Disabled by default

Result Structure¶

The MultiLayerResult contains complete information about the check:

Field	Type	Description
`passed`	bool	Overall pass/fail
`action`	GuardAction	Final action (Block, Allow, etc.)
`layer_results`	`Vec<LayerResult>`	Results from each layer
`all_matches`	`Vec<Match>`	Combined matches from all layers
`total_duration`	Duration	Total execution time
`strategy`	AggregationStrategy	Strategy used
`summary`	String	Human-readable summary

LayerResult¶

Field	Type	Description
`layer_name`	String	Layer identifier
`layer_type`	LayerType	Type of layer
`result`	GuardCheckResult	Full guard result
`duration`	Duration	Layer execution time
`weight`	f32	Layer weight

Example: Production Configuration¶

# production-multilayer.yaml
multilayer:
  name: production_defense
  strategy: fail_fast
  enable_telemetry: true

  layers:
    # Layer 1: Known attack patterns
    - name: known_attacks
      type: regex
      weight: 1.0
      timeout_ms: 10
      config:
        categories: [prompt_injection, jailbreak, system_prompt_leak]
        severity: high

    # Layer 2: Adversarial detection
    - name: adversarial
      type: perplexity
      weight: 1.0
      timeout_ms: 50
      config:
        threshold: 100
        action: block

    # Layer 3: Semantic similarity (premium feature)
    - name: semantic_match
      type: semantic
      weight: 1.5
      enabled: ${OXIDESHIELD_PREMIUM:-false}
      timeout_ms: 100
      config:
        threshold: <threshold>  # Configure per deployment
        embeddings_path: /opt/oxideshield/embeddings.bincode

    # Layer 4: Privacy protection
    - name: privacy
      type: pii
      weight: 1.0
      timeout_ms: 50
      config:
        action: sanitize
        entities: [email, phone, ssn, credit_card]

Next Steps¶

Pattern Guard - Configure regex-based detection
Perplexity Guard - Adversarial suffix detection
PII Guard - Personal data protection
Proxy Advanced Features - Rate limiting and alerts