Policy Testing
This guide explains how to test AxonFlow policies to ensure they work as expected before deploying to production.
Overview
Testing policies is critical to:
- Verify pattern matching works correctly
- Ensure actions trigger as expected
- Prevent false positives/negatives
- Validate performance impact
Testing Methods
1. CLI Testing with axonctl
The axonctl CLI provides built-in policy testing:
# Test a single policy against input
axonctl policy test --policy <policy-id> --input "<test-input>"
# Test all policies against input
axonctl policy test --all --input "<test-input>"
# Test from file
axonctl policy test --policy <policy-id> --input-file test-input.txt
# Verbose output
axonctl policy test --policy <policy-id> --input "<test-input>" -v
2. API Testing
Test policies via the API:
POST /api/v1/policies/test
{
  "policy_id": "block-ssn",
  "input": "My SSN is 123-45-6789",
  "context": {
    "user_id": "test-user",
    "organization_id": "test-org"
  }
}
Response:
{
  "policy_id": "block-ssn",
  "matched": true,
  "action": "block",
  "matches": [
    {
      "pattern": "\\b(\\d{3})[- ]?(\\d{2})[- ]?(\\d{4})\\b",
      "matched_text": "123-45-6789",
      "position": {"start": 10, "end": 21}
    }
  ],
  "message": "Social Security Numbers cannot be processed.",
  "processing_time_ms": 0.5
}
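When asserting on a test response, it can help to confirm that the reported offsets really point at the matched text. The sketch below is not part of AxonFlow; it reuses the sample response above and assumes that position.start and position.end are zero-based character offsets with an exclusive end, which is consistent with the sample (10 to 21) but not confirmed by it. The helper name offsets_are_consistent is hypothetical.

```python
import json

# Trimmed copy of the sample response shown above.
response = json.loads("""
{
  "policy_id": "block-ssn",
  "matched": true,
  "matches": [
    {"matched_text": "123-45-6789", "position": {"start": 10, "end": 21}}
  ]
}
""")


def offsets_are_consistent(input_text, resp):
    """Return True if every reported span slices out its matched_text.

    Assumes zero-based, end-exclusive offsets (an assumption, see lead-in).
    """
    for match in resp["matches"]:
        start = match["position"]["start"]
        end = match["position"]["end"]
        if input_text[start:end] != match["matched_text"]:
            return False
    return True


ok = offsets_are_consistent("My SSN is 123-45-6789", response)
print(ok)
```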
3. Batch Testing
Test multiple inputs at once:
POST /api/v1/policies/test/batch
{
  "policy_id": "block-ssn",
  "inputs": [
    "My SSN is 123-45-6789",
    "No SSN here",
    "Multiple: 111-22-3333 and 444-55-6666"
  ]
}
Response:
{
  "policy_id": "block-ssn",
  "results": [
    {"input_index": 0, "matched": true, "matches_count": 1},
    {"input_index": 1, "matched": false, "matches_count": 0},
    {"input_index": 2, "matched": true, "matches_count": 2}
  ],
  "summary": {
    "total": 3,
    "matched": 2,
    "not_matched": 1
  }
}
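A client-side check on a batch response might recompute the summary from the per-input results and compare the two. This is a minimal sketch using the sample response above as a Python dict; recompute_summary is a hypothetical helper, not an AxonFlow function.

```python
# Sample batch response from above, written as a Python dict.
batch_response = {
    "policy_id": "block-ssn",
    "results": [
        {"input_index": 0, "matched": True, "matches_count": 1},
        {"input_index": 1, "matched": False, "matches_count": 0},
        {"input_index": 2, "matched": True, "matches_count": 2},
    ],
    "summary": {"total": 3, "matched": 2, "not_matched": 1},
}


def recompute_summary(results):
    """Rebuild the summary block from the per-input results."""
    matched = sum(1 for result in results if result["matched"])
    return {
        "total": len(results),
        "matched": matched,
        "not_matched": len(results) - matched,
    }


summary_ok = recompute_summary(batch_response["results"]) == batch_response["summary"]
print(summary_ok)
```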
Test Case Structure
Basic Test Case
# test-cases/pii-tests.yaml
test_suite: PII Detection Tests
policy_id: block-ssn
test_cases:
  - name: Valid SSN with dashes
    input: "My SSN is 123-45-6789"
    expected:
      matched: true
      action: block
      matches_count: 1
  - name: Valid SSN with spaces
    input: "SSN: 123 45 6789"
    expected:
      matched: true
  - name: Valid SSN no separators
    input: "123456789"
    expected:
      matched: true
  - name: No SSN present
    input: "No sensitive data here"
    expected:
      matched: false
  - name: Invalid SSN format
    input: "12-345-6789"
    expected:
      matched: false
  - name: Multiple SSNs
    input: "SSN1: 111-22-3333, SSN2: 444-55-6666"
    expected:
      matched: true
      matches_count: 2
Running Test Suites
# Run a test suite
axonctl policy test-suite --file test-cases/pii-tests.yaml
# Output:
# PII Detection Tests
# ==================
# ✅ Valid SSN with dashes - PASSED
# ✅ Valid SSN with spaces - PASSED
# ✅ Valid SSN no separators - PASSED
# ✅ No SSN present - PASSED
# ✅ Invalid SSN format - PASSED
# ✅ Multiple SSNs - PASSED
#
# Results: 6/6 passed (100%)
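Before running the suite through axonctl, the expectations can be sanity-checked locally. The sketch below re-creates the six cases in Python and evaluates them with the SSN pattern from the API response example. It assumes the block-ssn policy uses exactly that pattern and that Python's re module behaves like the AxonFlow regex engine for it; neither is confirmed here. TEST_CASES and run_case are hypothetical names.

```python
import re

# Pattern taken from the API response example (assumed to be the policy's pattern).
SSN_PATTERN = re.compile(r"\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b")

TEST_CASES = [
    {"name": "Valid SSN with dashes", "input": "My SSN is 123-45-6789",
     "matched": True, "matches_count": 1},
    {"name": "Valid SSN with spaces", "input": "SSN: 123 45 6789", "matched": True},
    {"name": "Valid SSN no separators", "input": "123456789", "matched": True},
    {"name": "No SSN present", "input": "No sensitive data here", "matched": False},
    {"name": "Invalid SSN format", "input": "12-345-6789", "matched": False},
    {"name": "Multiple SSNs", "input": "SSN1: 111-22-3333, SSN2: 444-55-6666",
     "matched": True, "matches_count": 2},
]


def run_case(case):
    """Return True if the pattern's behaviour agrees with the case's expectations."""
    hits = [m.group(0) for m in SSN_PATTERN.finditer(case["input"])]
    if bool(hits) != case["matched"]:
        return False
    expected_count = case.get("matches_count")
    return expected_count is None or len(hits) == expected_count


results = {case["name"]: run_case(case) for case in TEST_CASES}
for name, passed in results.items():
    print(f"{'PASSED' if passed else 'FAILED'}  {name}")
```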
Testing Patterns
Regex Pattern Testing
Test regex patterns in isolation:
# Test pattern directly
axonctl policy test-pattern \
  --pattern '\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b' \
  --input "SSN: 123-45-6789"
# Output:
# Pattern matched: true
# Matches: ["123-45-6789"]
# Groups: [["123", "45", "6789"]]
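The same pattern can be tried outside axonctl. This short sketch uses Python's re module as a stand-in, assuming it treats this pattern the same way as the AxonFlow engine, and collects the full match and its capture groups.

```python
import re

pattern = re.compile(r"\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b")

# Each entry pairs the full matched text with its three capture groups.
found = [(m.group(0), m.groups()) for m in pattern.finditer("SSN: 123-45-6789")]
print(found)
```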
Pattern Debugging
Debug why a pattern isn't matching:
axonctl policy test-pattern \
  --pattern '\b[A-Z]{3}[PCHABGJLFT][A-Z][0-9]{4}[A-Z]\b' \
  --input "PAN: abcpd1234e" \
  --debug
# Output:
# Pattern: \b[A-Z]{3}[PCHABGJLFT][A-Z][0-9]{4}[A-Z]\b
# Input: "PAN: abcpd1234e"
# Matched: false
# Debug:
# - Pattern requires uppercase [A-Z]
# - Input contains lowercase "abcpd1234e"
# - Suggestion: Add (?i) flag for case-insensitive matching
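The effect of the suggested (?i) flag can be reproduced in isolation. The sketch below uses Python's re module, assuming it handles the inline flag the same way as the AxonFlow engine, and compares the pattern with and without the flag on the lowercase input.

```python
import re

PAN = r"\b[A-Z]{3}[PCHABGJLFT][A-Z][0-9]{4}[A-Z]\b"
text = "PAN: abcpd1234e"

# Without the flag, the uppercase-only character classes reject the input.
strict = re.search(PAN, text)

# With (?i) at the start of the pattern, matching ignores case.
relaxed = re.search("(?i)" + PAN, text)

print(strict, relaxed.group(0) if relaxed else None)
```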
Test Coverage
Coverage Report
Generate a coverage report for your policies:
axonctl policy coverage --test-dir test-cases/
# Output:
# Policy Coverage Report
# =====================
#
# Policy | Tests | Coverage | Status
# ---------------------|-------|----------|--------
# block-ssn | 12 | 95% | ✅
# redact-credit-card | 8 | 88% | ✅
# block-sql-injection | 15 | 100% | ✅
# admin-access-control | 3 | 60% | ⚠️
#
# Overall Coverage: 87%
# Recommendation: Add more tests for admin-access-control
Coverage Requirements
Set minimum coverage requirements:
# .axonflow/policy-tests.yaml
coverage:
  minimum: 80
  require_tests_for:
    - severity: critical
    - severity: high
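To illustrate what a minimum of 80 means for the sample report above, the sketch below flags the policies that fall under the threshold. How axonctl computes and enforces coverage internally is not described here, so this is only an illustration of the threshold comparison; the dictionary simply copies the per-policy figures from the sample report.

```python
MINIMUM = 80  # mirrors coverage.minimum in the config above

# Per-policy coverage percentages copied from the sample coverage report.
coverage = {
    "block-ssn": 95,
    "redact-credit-card": 88,
    "block-sql-injection": 100,
    "admin-access-control": 60,
}

# Policies that would need more tests to reach the minimum.
below_minimum = sorted(name for name, pct in coverage.items() if pct < MINIMUM)
print(below_minimum)
```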
Testing Actions
Test Block Action
test_cases:
  - name: Test block action
    input: "DROP TABLE users;"
    expected:
      matched: true
      action: block
      blocked: true
      message_contains: "not permitted"
Test Redact Action
test_cases:
  - name: Test redaction
    input: "Card number: 4111111111111111"
    expected:
      matched: true
      action: redact
      output: "Card number: [CARD-REDACTED]"
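The expected output of a redact test can be worked out by hand before running it. The sketch below shows the substitution in Python. The card pattern (13 to 16 consecutive digits) is an assumption made for illustration, not the actual pattern of the redact-credit-card policy, and redact_cards is a hypothetical helper.

```python
import re

# Assumed pattern for illustration only: 13 to 16 consecutive digits.
CARD = re.compile(r"\b\d{13,16}\b")


def redact_cards(text, placeholder="[CARD-REDACTED]"):
    """Replace every card-like digit run with the placeholder."""
    return CARD.sub(placeholder, text)


redacted = redact_cards("Card number: 4111111111111111")
print(redacted)
```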
Test Log Action
test_cases:
  - name: Test logging
    input: "Email: user@example.com"
    expected:
      matched: true
      action: log
      blocked: false
      logged: true
Performance Testing
Benchmark Policies
Measure policy performance:
axonctl policy benchmark --policy block-ssn --iterations 10000
# Output:
# Policy Benchmark: block-ssn
# ==========================
# Iterations: 10,000
# Total time: 125ms
# Average: 0.0125ms per evaluation
# P50: 0.010ms
# P95: 0.018ms
# P99: 0.025ms
# Max: 0.045ms
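For a rough feel of how such figures are derived, the sketch below times a regex search in Python and reports the same statistics. It measures Python's re module on the local machine, not the AxonFlow engine, so the absolute numbers are not comparable with axonctl output. The benchmark function is a hypothetical helper.

```python
import re
import statistics
import time

SSN = re.compile(r"\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b")


def benchmark(pattern, text, iterations=10_000):
    """Time repeated searches and summarise per-evaluation latency in ms."""
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        pattern.search(text)
        samples_ms.append((time.perf_counter() - start) * 1000)
    # quantiles(n=100) returns 99 cut points; index 49 is the median.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {
        "iterations": iterations,
        "total_ms": sum(samples_ms),
        "avg_ms": statistics.fmean(samples_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        "max_ms": max(samples_ms),
    }


stats = benchmark(SSN, "My SSN is 123-45-6789")
for key, value in stats.items():
    print(f"{key}: {value}")
```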
Load Testing
Test policies under load:
axonctl policy load-test \
  --policy block-ssn \
  --concurrency 100 \
  --duration 60s \
  --input-file sample-inputs.txt
# Output:
# Load Test Results
# ================
# Duration: 60s
# Concurrency: 100
# Total evaluations: 245,000
# Throughput: 4,083/sec
# Avg latency: 0.24ms
# P99 latency: 0.85ms
# Errors: 0
Integration Testing
Test with Full Pipeline
Test policies in the complete request flow:
# Send test request through the agent
curl -X POST https://your-axonflow.com/api/v1/chat \
  -H "X-Test-Mode: true" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "message": "My SSN is 123-45-6789",
    "model": "claude-3-sonnet"
  }'
# Response includes policy evaluation details
{
  "response": null,
  "blocked": true,
  "policy_evaluations": [
    {
      "policy_id": "block-ssn",
      "matched": true,
      "action": "block",
      "processing_time_ms": 0.5
    }
  ],
  "message": "Request blocked by policy: block-ssn"
}
Dry Run Mode
Test without actually blocking:
POST /api/v1/chat
{
  "message": "Test message with SSN 123-45-6789",
  "options": {
    "dry_run": true
  }
}
# Response shows what would happen
{
  "dry_run": true,
  "would_block": true,
  "policy_evaluations": [...],
  "response": "The actual LLM response..." # Still provided in dry run
}
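When comparing dry-run results across many inputs, a small helper can summarise the evaluations. The sketch below assumes that would_block is true whenever at least one evaluation both matched and carries the block action; that rule is an assumption, not documented behaviour, and the evaluation entries reuse the shape from the full-pipeline example. The function name is hypothetical.

```python
def would_block(policy_evaluations):
    """Assumed rule: any matched evaluation with action 'block' blocks the request."""
    return any(
        evaluation.get("matched") and evaluation.get("action") == "block"
        for evaluation in policy_evaluations
    )


# Entry shaped like the policy_evaluations items in the full-pipeline example.
blocking = [
    {"policy_id": "block-ssn", "matched": True, "action": "block",
     "processing_time_ms": 0.5},
]
# Hypothetical entry: a log-only policy that matched.
log_only = [
    {"policy_id": "log-email", "matched": True, "action": "log"},
]

print(would_block(blocking), would_block(log_only))
```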
CI/CD Integration
GitHub Actions
# .github/workflows/policy-tests.yml
name: Policy Tests
on:
  push:
    paths:
      - 'policies/**'
      - 'test-cases/**'
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install axonctl
        run: |
          curl -sSL https://get.axonflow.com/axonctl | bash
      - name: Run policy tests
        run: |
          axonctl policy test-suite --dir test-cases/ --output junit.xml
      - name: Check coverage
        run: |
          axonctl policy coverage --min 80
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: test-results
          path: junit.xml
Pre-commit Hook
#!/bin/bash
# .git/hooks/pre-commit
# Run policy tests before commit
if git diff --cached --name-only | grep -q "^policies/"; then
  echo "Running policy tests..."
  if ! axonctl policy test-suite --dir test-cases/; then
    echo "Policy tests failed. Commit aborted."
    exit 1
  fi
fi
Test Fixtures
Common Test Data
Create reusable test fixtures:
# test-fixtures/pii-data.yaml
fixtures:
  valid_ssns:
    - "123-45-6789"
    - "987-65-4321"
    - "111 22 3333"
  valid_credit_cards:
    - "4111111111111111" # Visa
    - "5500000000000004" # Mastercard
    - "340000000000009" # Amex
  valid_pans:
    - "ABCPD1234E"
    - "XYZCA1234B"
  invalid_pans:
    - "ABC1234567" # Wrong format
    - "123PD1234E" # Starts with numbers
Using Fixtures
test_suite: Credit Card Detection
policy_id: redact-credit-card
fixtures_file: test-fixtures/pii-data.yaml
test_cases:
  - name: Test all valid credit cards
    inputs: ${fixtures.valid_credit_cards}
    expected:
      all_matched: true
  - name: Test mixed content
    input: "Card: ${fixtures.valid_credit_cards[0]}"
    expected:
      matched: true
      matches_count: 1
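To make the placeholder syntax concrete, here is a sketch of how such references could be expanded. It is an interpretation of the ${fixtures.name} and ${fixtures.name[index]} forms used above, not the actual axonctl resolver, and the names PLACEHOLDER, resolve and expand_inputs are hypothetical.

```python
import re

# Subset of the fixture file, inlined so the example is self-contained.
FIXTURES = {
    "valid_credit_cards": ["4111111111111111", "5500000000000004", "340000000000009"],
}

# Matches ${fixtures.<name>} with an optional [<index>] suffix.
PLACEHOLDER = re.compile(r"\$\{fixtures\.(\w+)(?:\[(\d+)\])?\}")


def resolve(template, fixtures):
    """Expand indexed references such as ${fixtures.name[0]} inside a string."""
    def substitute(match):
        name, index = match.group(1), match.group(2)
        if index is None:
            raise ValueError(f"{name} is a list; an index is needed inside a string")
        return fixtures[name][int(index)]
    return PLACEHOLDER.sub(substitute, template)


def expand_inputs(reference, fixtures):
    """Resolve a whole-list reference such as ${fixtures.name} to its values."""
    match = PLACEHOLDER.fullmatch(reference)
    if match is None or match.group(2) is not None:
        raise ValueError(f"not a whole-list reference: {reference!r}")
    return list(fixtures[match.group(1)])


single = resolve("Card: ${fixtures.valid_credit_cards[0]}", FIXTURES)
many = expand_inputs("${fixtures.valid_credit_cards}", FIXTURES)
print(single, many)
```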
Troubleshooting
Pattern Not Matching
- Check for case sensitivity
- Verify escape sequences
- Test pattern in isolation
- Use the --debug flag
False Positives
- Make pattern more specific
- Add negative lookahead/lookbehind
- Add context conditions
- Test with more samples
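As an illustration of the lookahead/lookbehind advice, the sketch below compares the SSN pattern with a stricter variant that refuses to match when the digits are part of a longer hyphenated identifier. The stricter pattern and the part-number sample are hypothetical, and Python's re module stands in for the AxonFlow engine.

```python
import re

# Pattern from the examples in this guide (capture groups dropped for brevity).
BASE = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")

# Hypothetical stricter variant: no word character or hyphen directly
# before or after the candidate SSN.
STRICT = re.compile(r"(?<![\w-])\d{3}[- ]?\d{2}[- ]?\d{4}(?![\w-])")

samples = [
    "My SSN is 123-45-6789",        # genuine SSN
    "Part no. 55-123-45-6789-01",   # SSN-shaped run inside a longer identifier
]

base_hits = [bool(BASE.search(s)) for s in samples]
strict_hits = [bool(STRICT.search(s)) for s in samples]
print(base_hits, strict_hits)
```

The base pattern matches both samples, while the stricter variant only matches the genuine SSN. Tightening a pattern this way can also introduce false negatives, so it is worth re-running the full test suite afterwards.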
Performance Issues
- Optimize regex patterns
- Avoid backtracking
- Use non-capturing groups
- Consider pattern compilation
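Two of these tips, non-capturing groups and pattern compilation, can be shown side by side. The sketch below uses Python's re module; whether the AxonFlow engine benefits in the same way is an assumption, and any gain should be confirmed with a benchmark rather than taken for granted. contains_ssn is a hypothetical helper.

```python
import re

# Compiled once at import time and reused for every evaluation.
CAPTURING = re.compile(r"\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b")

# Same matching behaviour, but (?:...) groups do not store submatches.
NON_CAPTURING = re.compile(r"\b(?:\d{3})[- ]?(?:\d{2})[- ]?(?:\d{4})\b")


def contains_ssn(text):
    """Yes/no check that does not need the individual groups."""
    return NON_CAPTURING.search(text) is not None


text = "My SSN is 123-45-6789"
same_match = CAPTURING.search(text).group(0) == NON_CAPTURING.search(text).group(0)
print(CAPTURING.groups, NON_CAPTURING.groups, same_match, contains_ssn(text))
```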
Best Practices
- Test early and often - Write tests alongside policies
- Cover edge cases - Test boundary conditions
- Use realistic data - Test with production-like inputs
- Automate testing - Include in CI/CD pipeline
- Monitor in production - Track false positive rates
- Version control tests - Keep tests with policies