Collective Policy Engine¶
Enables democratic stakeholder input into runtime safety policies. Stakeholders propose guard rules, vote on changes, reach consensus, and export to OxideShield's YAML policy format for simulation and deployment. Based on the Collective Constitutional AI methodology.
Enterprise License Required
The Collective Policy Engine requires an Enterprise license. See Licensing for details.
Executive Summary¶
The Problem¶
AI safety policies are typically set by a small group of engineers or compliance officers. This approach creates blind spots, excludes diverse perspectives, and can produce policies that don't reflect the values of affected communities. Research on collective input, such as the Collective Constitutional AI study, suggests that involving the public can yield workable safety policies with broad consensus.
How It Works¶
- Stakeholders register with roles (Admin, PolicyAuthor, Reviewer, Observer)
- Proposals describe policy changes (add/remove/modify guards, change strategies)
- Voting with configurable quorum, thresholds, and weighted votes
- Consensus resolution with admin veto capability
- Export to OxideShield YAML policy format for deployment
Industry Context¶
The Collective Constitutional AI study (Anthropic, 2023) reported that roughly 1,000 participants, casting 38,252 votes, produced AI safety policies with strong consensus on accessibility, helpfulness, and safety. The OECD AI Principles (2024) recommend multi-stakeholder governance incorporating diverse perspectives.
Sources: Collective Constitutional AI (Anthropic, 2023), OECD AI Principles (2024), Polis deliberation platform
Stakeholder Roles¶
| Role | Create Proposals | Vote | Comment | View |
|---|---|---|---|---|
| Admin | Yes | Yes (weighted) | Yes | Yes |
| PolicyAuthor | Yes | Yes | Yes | Yes |
| Reviewer | No | Yes | Yes | Yes |
| Observer | No | No | Yes | Yes |
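The role table can be read as a simple permission lookup. The sketch below is plain Python written for illustration only; the `Role`, `Action`, `PERMISSIONS`, and `can` names are hypothetical and are not part of the OxideShield API.

```python
from enum import Enum, auto


class Role(Enum):
    ADMIN = auto()
    POLICY_AUTHOR = auto()
    REVIEWER = auto()
    OBSERVER = auto()


class Action(Enum):
    CREATE_PROPOSAL = auto()
    VOTE = auto()
    COMMENT = auto()
    VIEW = auto()


# One entry per row of the role table (hypothetical representation).
PERMISSIONS = {
    Role.ADMIN: {Action.CREATE_PROPOSAL, Action.VOTE, Action.COMMENT, Action.VIEW},
    Role.POLICY_AUTHOR: {Action.CREATE_PROPOSAL, Action.VOTE, Action.COMMENT, Action.VIEW},
    Role.REVIEWER: {Action.VOTE, Action.COMMENT, Action.VIEW},
    Role.OBSERVER: {Action.COMMENT, Action.VIEW},
}


def can(role, action):
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS[role]


print(can(Role.REVIEWER, Action.VOTE))   # True
print(can(Role.OBSERVER, Action.VOTE))   # False
```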
Weighted Voting¶
Stakeholders can have different voting weights (default: 1.0). Admins typically have higher weights to reflect their responsibility:
let admin = Stakeholder::new("Security Lead", StakeholderRole::Admin)
    .with_weight(2.0); // Admin vote counts double
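To make the effect of weights concrete, the following standalone sketch computes the approved share of the cast weight. It assumes that the weighted approval fraction is approved weight divided by total cast weight; the engine's exact formula is not stated here, and `weighted_approval` is a hypothetical helper.

```python
def weighted_approval(ballots):
    """ballots: list of (weight, approve) pairs. Returns the approved share of cast weight."""
    total = sum(weight for weight, _ in ballots)
    if total == 0:
        return 0.0  # nobody voted, so nothing was approved
    approved = sum(weight for weight, approve in ballots if approve)
    return approved / total


ballots = [
    (2.0, True),   # Admin with weight 2.0 approves
    (1.0, False),  # Reviewer rejects
    (1.0, True),   # Reviewer approves
]
print(weighted_approval(ballots))  # 0.75
```

With equal weights the same three ballots would give 2/3; the doubled Admin weight lifts the share to 3/4.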
Admin Veto¶
When enabled (default), any Admin rejection automatically vetoes the proposal, regardless of other votes.
Proposal Lifecycle¶
Proposal Statuses¶
| Status | Description |
|---|---|
| Draft | Not yet open for voting |
| Open | Accepting votes |
| Closed | Voting ended, awaiting resolution |
| Accepted | Approved and ready to apply |
| Rejected | Rejected by vote or veto |
| Superseded | Replaced by a newer proposal |
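The statuses suggest a small state machine. The transition map below is an assumption inferred from the status descriptions, not a documented rule set: in particular, it models closing and resolving as two separate steps, whereas the engine may close an open proposal implicitly when it is resolved. `Status`, `ALLOWED`, and `transition` are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    OPEN = "Open"
    CLOSED = "Closed"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"
    SUPERSEDED = "Superseded"


# Assumed transitions; the real engine may permit others.
ALLOWED = {
    Status.DRAFT: {Status.OPEN, Status.SUPERSEDED},
    Status.OPEN: {Status.CLOSED, Status.SUPERSEDED},
    Status.CLOSED: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: {Status.SUPERSEDED},
    Status.REJECTED: set(),
    Status.SUPERSEDED: set(),
}


def transition(current, target):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = transition(Status.DRAFT, Status.OPEN)
print(status.value)  # Open
```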
Proposed Changes¶
| Change Type | Description |
|---|---|
| AddGuard | Add a new guard to the policy |
| RemoveGuard | Remove an existing guard |
| ModifyGuard | Change a guard's configuration |
| SetPipelineStrategy | Change aggregation strategy (FailFast, Unanimous, etc.) |
| SetEnforcementMode | Change enforcement mode (Strict, Permissive, Audit) |
| SetThreshold | Adjust a specific guard threshold |
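As an illustration of what each change type does to a policy, the sketch below applies changes to a plain dictionary shaped like the YAML policy. It is not the engine's implementation: the dictionary encoding of a change, the `threshold` key, and the `apply_change` function are all hypothetical.

```python
import copy


def apply_change(policy, change):
    """Return a new policy dict with one change applied; the input is left untouched."""
    new = copy.deepcopy(policy)
    spec = new["spec"]
    kind = change["type"]
    if kind == "AddGuard":
        spec["guards"].append(dict(change["config"]))
    elif kind == "RemoveGuard":
        spec["guards"] = [g for g in spec["guards"] if g["name"] != change["name"]]
    elif kind == "ModifyGuard":
        for guard in spec["guards"]:
            if guard["name"] == change["name"]:
                guard.update(change["config"])
    elif kind == "SetThreshold":
        for guard in spec["guards"]:
            if guard["name"] == change["name"]:
                guard["threshold"] = change["value"]  # hypothetical key name
    elif kind == "SetPipelineStrategy":
        spec["pipeline"]["strategy"] = change["strategy"]
    elif kind == "SetEnforcementMode":
        spec["enforcement"]["mode"] = change["mode"]
    else:
        raise ValueError(f"unknown change type: {kind}")
    return new


policy = {
    "spec": {
        "guards": [{"name": "pattern", "enabled": True, "action": "block"}],
        "pipeline": {"strategy": "fail_fast"},
        "enforcement": {"mode": "strict"},
    }
}
updated = apply_change(policy, {
    "type": "AddGuard",
    "config": {"name": "helpfulness", "enabled": True, "action": "suggest"},
})
print(len(policy["spec"]["guards"]), len(updated["spec"]["guards"]))  # 1 2
```

Working on a deep copy keeps the current policy unchanged, which is the same property a preview needs.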
Consensus Rules¶
| Rule | Default | Description |
|---|---|---|
| quorum | 0.5 (50%) | Minimum fraction of eligible voters who must vote |
| approval_threshold | 0.6 (60%) | Fraction of approvals needed to pass |
| use_weighted_votes | true | Whether to weight votes by stakeholder weight |
| admin_veto | true | Whether admin rejection vetoes the proposal |
| auto_close_after_hours | 168 (7 days) | Hours before auto-closing proposals |
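The rules above can be combined into a single resolution step. The sketch below is one plausible reading, not the engine's actual algorithm. It assumes that the veto is checked first, that quorum is measured by headcount rather than weight, and that a missed quorum resolves as Rejected. `Rules`, `resolve`, and `closes_at` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Rules:
    quorum: float = 0.5
    approval_threshold: float = 0.6
    use_weighted_votes: bool = True
    admin_veto: bool = True
    auto_close_after_hours: int = 168


def resolve(rules, eligible_voters, ballots):
    """ballots: list of (role, weight, approve) tuples. Returns (status, reason)."""
    # 1. Any Admin rejection vetoes the proposal outright.
    if rules.admin_veto and any(role == "Admin" and not ok for role, _, ok in ballots):
        return "Rejected", "admin veto"
    # 2. Enough eligible voters must have voted (headcount, by assumption).
    if eligible_voters == 0 or len(ballots) / eligible_voters < rules.quorum:
        return "Rejected", "quorum not met"
    # 3. Compare the approved share with the threshold.
    weighted = [(w if rules.use_weighted_votes else 1.0, ok) for _, w, ok in ballots]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return "Rejected", "no votes cast"
    approved = sum(w for w, ok in weighted if ok)
    if approved / total >= rules.approval_threshold:
        return "Accepted", "threshold met"
    return "Rejected", "threshold not met"


def closes_at(opened_at, rules):
    """Time at which an open proposal would auto-close."""
    return opened_at + timedelta(hours=rules.auto_close_after_hours)


ballots = [("Admin", 2.0, True), ("Reviewer", 1.0, True)]
print(resolve(Rules(), 2, ballots))  # ('Accepted', 'threshold met')
```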
Developer Guide¶
Basic Usage¶
// ProposalStatus is needed for the assertion below; its module path is
// assumed to match the other collective types.
use oxide_policy::collective::{
    CollectivePolicyEngine, Stakeholder, StakeholderRole,
    PolicyProposal, ProposalStatus, ProposedChange, Vote, VoteDecision,
};
use oxide_policy::schema::{SecurityPolicy, GuardConfig, GuardAction};

// Load base policy
let policy = SecurityPolicy::from_yaml(include_str!("policy.yaml"))?;

// Create engine
let mut engine = CollectivePolicyEngine::new(policy)?;

// Add stakeholders (copy each id before the stakeholder is moved into the engine)
let admin = Stakeholder::new("Security Lead", StakeholderRole::Admin)
    .with_weight(2.0);
let admin_id = admin.id;
engine.add_stakeholder(admin);

let reviewer = Stakeholder::new("ML Engineer", StakeholderRole::Reviewer);
let reviewer_id = reviewer.id;
engine.add_stakeholder(reviewer);

// Create a proposal
let proposal = PolicyProposal::new(
    "Add helpfulness monitoring",
    "Monitor AI responses for over-refusal patterns",
    admin_id,
).with_change(ProposedChange::AddGuard {
    config: GuardConfig::new("helpfulness")
        .with_action(GuardAction::Suggest),
});
let proposal_id = engine.create_proposal(proposal)?;

// Open for voting
engine.open_proposal(proposal_id)?;

// Cast votes
engine.add_vote(proposal_id, Vote::new(admin_id, VoteDecision::Approve))?;
engine.add_vote(proposal_id, Vote::new(reviewer_id, VoteDecision::Approve))?;

// Resolve
let status = engine.resolve_proposal(proposal_id)?;
assert_eq!(status, ProposalStatus::Accepted);

// Apply accepted proposals to create new policy
let new_policy = engine.apply_accepted_proposals()?;
println!("New version: {}", new_policy.metadata.version.unwrap());
# Python bindings: the same flow as the Rust example above
from oxideshield import (
    CollectivePolicyEngine, Stakeholder, StakeholderRole,
    PolicyProposal, ProposedChange, Vote, VoteDecision,
)

# Create engine with base policy
engine = CollectivePolicyEngine.from_yaml("policy.yaml")

# Add stakeholders
admin = Stakeholder("Security Lead", StakeholderRole.Admin, weight=2.0)
engine.add_stakeholder(admin)
reviewer = Stakeholder("ML Engineer", StakeholderRole.Reviewer)
engine.add_stakeholder(reviewer)

# Create and vote on proposal
proposal = PolicyProposal(
    title="Add helpfulness monitoring",
    description="Monitor AI responses for over-refusal",
    proposed_by=admin.id,
)
proposal.add_change(ProposedChange.add_guard("helpfulness", action="suggest"))
proposal_id = engine.create_proposal(proposal)
engine.open_proposal(proposal_id)
engine.add_vote(proposal_id, Vote(admin.id, VoteDecision.Approve))
engine.add_vote(proposal_id, Vote(reviewer.id, VoteDecision.Approve))
status = engine.resolve_proposal(proposal_id)
print(f"Status: {status}") # "accepted"
Preview and Simulate¶
Preview policy changes before applying them:
// Preview what the policy would look like
let preview = engine.preview_policy(proposal_id)?;
println!("Guards after change: {}", preview.spec.guards.len());
// Original policy is unchanged
let current = engine.export_policy();
assert_ne!(preview.spec.guards.len(), current.spec.guards.len());
// Simulate against a test suite (`test_suite` is assumed to be defined elsewhere)
let report = engine.simulate(&test_suite)?;
println!("Pass rate: {:.1}%", report.pass_rate * 100.0);
Export¶
Configuration¶
YAML Policy Format¶
The Collective Policy Engine operates on OxideShield's standard YAML policy format:
apiVersion: oxideshield.ai/v1
kind: SecurityPolicy
metadata:
  name: my-policy
  version: "1.0.0"
spec:
  guards:
    - name: pattern
      enabled: true
      action: block
    - name: helpfulness
      enabled: true
      action: suggest
  pipeline:
    strategy: fail_fast
  enforcement:
    mode: strict
Custom Consensus Rules¶
use oxide_policy::collective::ConsensusRules;
let rules = ConsensusRules {
    quorum: 0.75,               // 75% must vote
    approval_threshold: 0.8,    // 80% approval needed
    use_weighted_votes: true,
    admin_veto: true,
    auto_close_after_hours: 48, // 2-day voting window
};
let engine = CollectivePolicyEngine::new(policy)?
    .with_consensus_rules(rules);
Best Practices¶
1. Start with Read-Only Stakeholders¶
Begin by adding Observers and Reviewers to build familiarity before granting PolicyAuthor access.
2. Use Preview Before Apply¶
Always preview and simulate proposed changes before applying them to production:
let preview = engine.preview_policy(proposal_id)?;
let report = engine.simulate(&test_suite)?;
// Review report before resolving
3. Set Appropriate Quorum¶
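Match the quorum to your group size. Small teams can keep the default because most members vote. Large organizations often need a lower quorum to avoid deadlocks from low turnout; pair it with a higher approval threshold so that decisions still need broad support: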
// Small team (3-5 people)
rules.quorum = 0.5;
// Large organization (50+ people)
rules.quorum = 0.3; // Lower quorum, higher threshold
rules.approval_threshold = 0.75;
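A quick way to sanity-check a quorum setting is to convert the fraction into a headcount. The helper below is a hypothetical illustration that assumes quorum is counted per voter and rounded up; the engine's rounding behavior is not documented here.

```python
import math
from fractions import Fraction


def votes_needed(eligible_voters, quorum):
    """Smallest number of ballots that reaches the quorum fraction."""
    # Fraction(str(...)) keeps the arithmetic exact and avoids float rounding surprises.
    return math.ceil(Fraction(str(quorum)) * eligible_voters)


print(votes_needed(5, 0.5))   # 3
print(votes_needed(60, 0.3))  # 18
```

A five-person team at the default quorum needs three ballots, while a 60-person organization at 0.3 needs 18.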
4. Version Control Policies¶
The engine automatically bumps minor versions on each apply_accepted_proposals() call. Track the PolicyHistory for audit trails.
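For reference, a minor version bump on a three-part version string can be sketched as follows. This assumes semantic-versioning conventions, where the patch component resets to zero; whether the engine resets the patch number is not stated, and `bump_minor` is a hypothetical helper.

```python
def bump_minor(version):
    """Increment the minor component of a MAJOR.MINOR.PATCH version string."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


print(bump_minor("1.0.0"))  # 1.1.0
```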
References¶
Research Sources¶
- Collective Constitutional AI (Anthropic, 2023)
  - https://www.anthropic.com/research/collective-constitutional-ai-aligning-a-language-model-with-the-collective-input-of-13k-people
  - 1,000 participants, 38,252 votes, 1,369 statements via Polis platform
- OECD AI Principles (2024 update)
  - https://oecd.ai/en/ai-principles
  - Multi-stakeholder governance recommendation
- Polis (pol.is)
  - Open-source deliberation platform used in Collective CAI study
Related Features¶
- Policy Engine - Core policy engine and YAML schema
- Licensing - Enterprise license requirements
- Guards Overview - All available security guards