Collective Policy Engine

Enables democratic stakeholder input into runtime safety policies. Stakeholders propose guard rules, vote on changes, reach consensus, and export to OxideShield's YAML policy format for simulation and deployment. Based on the Collective Constitutional AI methodology.

Enterprise License Required

The Collective Policy Engine requires an Enterprise license. See Licensing for details.

Executive Summary

The Problem

AI safety policies are typically set by a small group of engineers or compliance officers. This creates blind spots, lacks diverse perspectives, and can produce policies that don't reflect the values of affected communities. Research such as Anthropic's Collective Constitutional AI study suggests that structured public input can produce safety policies with broader consensus and fewer blind spots.

How It Works

Stakeholders → Proposals → Voting → Consensus → Policy YAML → Deployment
  1. Stakeholders register with roles (Admin, PolicyAuthor, Reviewer, Observer)
  2. Proposals describe policy changes (add/remove/modify guards, change strategies)
  3. Voting with configurable quorum, thresholds, and weighted votes
  4. Consensus resolution with admin veto capability
  5. Export to OxideShield YAML policy format for deployment

Industry Context

The Collective Constitutional AI study (Anthropic, 2023) demonstrated that 1,000 participants across 38,252 votes produced AI safety policies with strong consensus on accessibility, helpfulness, and safety. The OECD AI Principles (2024) recommend multi-stakeholder governance incorporating diverse perspectives.

Sources: Collective Constitutional AI (Anthropic, 2023), OECD AI Principles (2024), Polis deliberation platform


Stakeholder Roles

Role          Create Proposals  Vote            Comment  View
Admin         Yes               Yes (weighted)  Yes      Yes
PolicyAuthor  Yes               Yes             Yes      Yes
Reviewer      No                Yes             Yes      Yes
Observer      No                No              Yes      Yes

Weighted Voting

Stakeholders can have different voting weights (default: 1.0). Admins typically have higher weights to reflect their responsibility:

let admin = Stakeholder::new("Security Lead", StakeholderRole::Admin)
    .with_weight(2.0);  // Admin vote counts double
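As a back-of-the-envelope illustration, the weighted tally reduces to a ratio of weights. A plain-Python sketch (hypothetical names, not the shipped API):

```python
# Hypothetical sketch of weighted approval tallying (not the shipped API).
def weighted_approval(votes, weights):
    """votes: {stakeholder_id: approved?}; weights: {stakeholder_id: weight}."""
    cast = sum(weights[sid] for sid in votes)
    approved = sum(weights[sid] for sid, ok in votes.items() if ok)
    return approved / cast if cast else 0.0

# A weight-2.0 Admin approval against a weight-1.0 rejection:
weighted_approval({"admin": True, "reviewer": False},
                  {"admin": 2.0, "reviewer": 1.0})
# 2.0 / 3.0 ≈ 0.67 — clears the default 0.6 approval threshold
```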

Admin Veto

When enabled (default), any Admin rejection automatically vetoes the proposal, regardless of other votes.
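The veto check can be sketched as follows (hypothetical Python, not the shipped API):

```python
# Hypothetical sketch of the Admin veto rule (not the shipped API).
def is_vetoed(votes, roles, admin_veto=True):
    """votes: {stakeholder_id: decision}; roles: {stakeholder_id: role name}."""
    if not admin_veto:
        return False
    return any(decision == "reject" and roles.get(sid) == "Admin"
               for sid, decision in votes.items())

# One Admin rejection vetoes regardless of how many approvals there are:
is_vetoed({"lead": "reject", "r1": "approve", "r2": "approve"},
          {"lead": "Admin", "r1": "Reviewer", "r2": "Reviewer"})  # True
```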


Proposal Lifecycle

Draft → Open → Closed ─→ Accepted → Apply to Policy
                      └→ Rejected

Proposal Statuses

Status      Description
Draft       Not yet open for voting
Open        Accepting votes
Closed      Voting ended, awaiting resolution
Accepted    Approved and ready to apply
Rejected    Rejected by vote or veto
Superseded  Replaced by a newer proposal

Proposed Changes

Change Type          Description
AddGuard             Add a new guard to the policy
RemoveGuard          Remove an existing guard
ModifyGuard          Change a guard's configuration
SetPipelineStrategy  Change aggregation strategy (FailFast, Unanimous, etc.)
SetEnforcementMode   Change enforcement mode (Strict, Permissive, Audit)
SetThreshold         Adjust a specific guard threshold

Consensus Rules

Rule                    Default       Description
quorum                  0.5 (50%)     Minimum fraction of eligible voters who must vote
approval_threshold      0.6 (60%)     Fraction of approvals needed to pass
use_weighted_votes      true          Whether to weight votes by stakeholder weight
admin_veto              true          Whether admin rejection vetoes the proposal
auto_close_after_hours  168 (7 days)  Hours before auto-closing proposals
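The rules above combine into a simple decision procedure. A simplified Python sketch (hypothetical names; abstentions and the veto rule are omitted for brevity):

```python
# Simplified consensus resolution (hypothetical sketch, not the shipped API).
# Veto handling and abstentions are omitted for brevity.
def resolve(votes, weights, eligible, quorum=0.5, approval_threshold=0.6,
            use_weighted_votes=True):
    """votes: {id: 'approve' | 'reject'}; eligible: ids of all eligible voters."""
    if len(votes) / len(eligible) < quorum:
        return "open"  # quorum not met: proposal stays open until auto-close
    w = weights if use_weighted_votes else {sid: 1.0 for sid in eligible}
    cast = sum(w[sid] for sid in votes)
    approved = sum(w[sid] for sid, d in votes.items() if d == "approve")
    return "accepted" if approved / cast >= approval_threshold else "rejected"

# Four eligible voters, default rules (50% quorum, 60% approval):
weights = {sid: 1.0 for sid in "abcd"}
resolve({"a": "approve"}, weights, list("abcd"))            # "open" (no quorum)
resolve({"a": "approve", "b": "approve", "c": "reject"},
        weights, list("abcd"))                              # "accepted" (2/3 ≥ 0.6)
```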

Developer Guide

Basic Usage

use oxide_policy::collective::{
    CollectivePolicyEngine, Stakeholder, StakeholderRole,
    PolicyProposal, ProposedChange, ProposalStatus, Vote, VoteDecision,
};
use oxide_policy::schema::{SecurityPolicy, GuardConfig, GuardAction};

// Load base policy
let policy = SecurityPolicy::from_yaml(include_str!("policy.yaml"))?;

// Create engine
let mut engine = CollectivePolicyEngine::new(policy)?;

// Add stakeholders
let admin = Stakeholder::new("Security Lead", StakeholderRole::Admin)
    .with_weight(2.0);
let admin_id = admin.id;
engine.add_stakeholder(admin);

let reviewer = Stakeholder::new("ML Engineer", StakeholderRole::Reviewer);
let reviewer_id = reviewer.id;
engine.add_stakeholder(reviewer);

// Create a proposal
let proposal = PolicyProposal::new(
    "Add helpfulness monitoring",
    "Monitor AI responses for over-refusal patterns",
    admin_id,
).with_change(ProposedChange::AddGuard {
    config: GuardConfig::new("helpfulness")
        .with_action(GuardAction::Suggest),
});

let proposal_id = engine.create_proposal(proposal)?;

// Open for voting
engine.open_proposal(proposal_id)?;

// Cast votes
engine.add_vote(proposal_id, Vote::new(admin_id, VoteDecision::Approve))?;
engine.add_vote(proposal_id, Vote::new(reviewer_id, VoteDecision::Approve))?;

// Resolve
let status = engine.resolve_proposal(proposal_id)?;
assert_eq!(status, ProposalStatus::Accepted);

// Apply accepted proposals to create new policy
let new_policy = engine.apply_accepted_proposals()?;
println!("New version: {}", new_policy.metadata.version.unwrap());

The same workflow using the Python bindings:

from oxideshield import (
    CollectivePolicyEngine, Stakeholder, StakeholderRole,
    PolicyProposal, ProposedChange, Vote, VoteDecision,
)

# Create engine with base policy
engine = CollectivePolicyEngine.from_yaml("policy.yaml")

# Add stakeholders
admin = Stakeholder("Security Lead", StakeholderRole.Admin, weight=2.0)
engine.add_stakeholder(admin)

reviewer = Stakeholder("ML Engineer", StakeholderRole.Reviewer)
engine.add_stakeholder(reviewer)

# Create and vote on proposal
proposal = PolicyProposal(
    title="Add helpfulness monitoring",
    description="Monitor AI responses for over-refusal",
    proposed_by=admin.id,
)
proposal.add_change(ProposedChange.add_guard("helpfulness", action="suggest"))

proposal_id = engine.create_proposal(proposal)
engine.open_proposal(proposal_id)
engine.add_vote(proposal_id, Vote(admin.id, VoteDecision.Approve))
engine.add_vote(proposal_id, Vote(reviewer.id, VoteDecision.Approve))

status = engine.resolve_proposal(proposal_id)
print(f"Status: {status}")  # "accepted"

Preview and Simulate

Preview policy changes before applying them:

// Preview what the policy would look like
let preview = engine.preview_policy(proposal_id)?;
println!("Guards after change: {}", preview.spec.guards.len());

// Original policy is unchanged
let current = engine.export_policy();
assert_ne!(preview.spec.guards.len(), current.spec.guards.len());

// Simulate against test suite
let report = engine.simulate(&test_suite)?;
println!("Pass rate: {:.1}%", report.pass_rate * 100.0);

Export

// Export as YAML
let yaml = engine.export_yaml()?;
std::fs::write("policy.yaml", yaml)?;

// Export as SecurityPolicy struct
let policy = engine.export_policy();

Configuration

YAML Policy Format

The Collective Policy Engine operates on OxideShield's standard YAML policy format:

apiVersion: oxideshield.ai/v1
kind: SecurityPolicy
metadata:
  name: my-policy
  version: "1.0.0"
spec:
  guards:
    - name: pattern
      enabled: true
      action: block
    - name: helpfulness
      enabled: true
      action: suggest
  pipeline:
    strategy: fail_fast
  enforcement:
    mode: strict
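
As a quick structural sanity check, the same policy can be mirrored as plain data and validated field by field (a hypothetical sketch; OxideShield's own schema validation may differ):

```python
# The YAML above, mirrored as plain data, with a minimal structural check
# (hypothetical sketch; OxideShield's own schema validator may differ).
policy = {
    "apiVersion": "oxideshield.ai/v1",
    "kind": "SecurityPolicy",
    "metadata": {"name": "my-policy", "version": "1.0.0"},
    "spec": {
        "guards": [
            {"name": "pattern", "enabled": True, "action": "block"},
            {"name": "helpfulness", "enabled": True, "action": "suggest"},
        ],
        "pipeline": {"strategy": "fail_fast"},
        "enforcement": {"mode": "strict"},
    },
}

def check(policy):
    assert policy["kind"] == "SecurityPolicy"
    assert "version" in policy["metadata"]
    for guard in policy["spec"]["guards"]:
        assert {"name", "enabled", "action"} <= guard.keys()
    return True
```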

Custom Consensus Rules

use oxide_policy::collective::ConsensusRules;

let rules = ConsensusRules {
    quorum: 0.75,                   // 75% must vote
    approval_threshold: 0.8,        // 80% approval needed
    use_weighted_votes: true,
    admin_veto: true,
    auto_close_after_hours: 48,     // 2-day voting window
};

let engine = CollectivePolicyEngine::new(policy)?
    .with_consensus_rules(rules);

Best Practices

1. Start with Read-Only Stakeholders

Begin by adding Observers and Reviewers to build familiarity before granting PolicyAuthor access.

2. Use Preview Before Apply

Always preview and simulate proposed changes before applying them to production:

let preview = engine.preview_policy(proposal_id)?;
let report = engine.simulate(&test_suite)?;
// Review report before resolving

3. Set Appropriate Quorum

For small teams, lower the quorum to avoid deadlocks. For high-stakes policies, raise it:

// Small team (3-5 people)
rules.quorum = 0.5;

// Large organization (50+ people)
rules.quorum = 0.3;  // Lower quorum, higher threshold
rules.approval_threshold = 0.75;

4. Version Control Policies

The engine automatically bumps minor versions on each apply_accepted_proposals() call. Track the PolicyHistory for audit trails.
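A minor-version bump of the kind described here amounts to the following (a sketch; the engine's exact semver handling is an assumption):

```python
# Hypothetical sketch of the automatic minor-version bump (semver assumed).
def bump_minor(version: str) -> str:
    """'1.0.0' -> '1.1.0': increment the minor component, reset the patch."""
    major, minor, _patch = version.split(".")
    return f"{major}.{int(minor) + 1}.0"

bump_minor("1.0.0")  # "1.1.0"
```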


References

Research Sources

  • Collective Constitutional AI (Anthropic, 2023)
    https://www.anthropic.com/research/collective-constitutional-ai-aligning-a-language-model-with-the-collective-input-of-13k-people
    1,000 participants, 38,252 votes, 1,369 statements via the Polis platform
  • OECD AI Principles (2024 update)
    https://oecd.ai/en/ai-principles
    Multi-stakeholder governance recommendation
  • Polis (pol.is)
    Open-source deliberation platform used in the Collective Constitutional AI study