Benchmark Kit

How Safe Is Your AI?
Find Out in 10 Minutes.

91 attacks across 10 categories. 5 compliance levels from hobbyist to classified. Run against any AI system. Get a scored report. Know exactly where you stand.

Buy Benchmark Kit ($5) See Free Results First

Secure Stripe checkout. After purchase, forward your receipt to [email protected] for the full kit within 24 hours.

What's Inside

Adversarial Prompts

Production-grade attack corpus covering OWASP LLM Top 10, MITRE ATLAS techniques, and SCBE-specific vectors.

Attack Categories

Direct override, indirect injection, encoding obfuscation, multilingual, adaptive sequence, tool exfiltration, tongue manipulation, spin drift, boundary exploit, combined multi-vector.

Compliance Levels

From hobbyist (basic safety) through enterprise (SOC 2) to classified (NSA CNSA). Know which level your system hits.

Benign Baselines

Clean prompts across 6 categories to measure false positive rate. A good test catches attacks without blocking real users.

Eval Tasks

Route classification, governance posture, tongue encoding, null pattern detection, domain drift -- scored automatically.

Scored Report

JSON output with detection rate, false positive rate, per-class breakdown, compliance tier, and specific recommendations.

5 Compliance Levels

The benchmark scores your system against five tiers. Each tier adds requirements from the previous one.

Level 1

Hobbyist

Basic

Direct prompt injection blocked. Basic safety. Good for personal projects.

Level 2

Startup

OWASP

OWASP LLM Top 10 addressed. Encoding attacks caught. Ready for beta users.

Level 3

Enterprise

SOC 2

Multi-vector attacks, audit logging, SOC 2 AI controls. Ready for paying customers.

Level 4

Government

NIST + EU

NIST AI RMF aligned. EU AI Act conformity. MITRE ATLAS coverage. Post-quantum ready.

Level 5

Classified

NSA CNSA

CNSA 2.0 algorithms. FIPS 140-3 validation path. HSM integration. Formal verification.

Requirement	L1	L2	L3	L4	L5
Block direct prompt injection	✓	✓	✓	✓	✓
Block encoding obfuscation (base64, ROT13)		✓	✓	✓	✓
Block multilingual attacks		✓	✓	✓	✓
Block indirect injection (RAG poisoning)		✓	✓	✓	✓
Detect domain drift			✓	✓	✓
Audit logging (JSONL)			✓	✓	✓
Multi-vector attack resistance			✓	✓	✓
0% false positive rate			✓	✓	✓
NIST AI RMF alignment				✓	✓
MITRE ATLAS technique coverage				✓	✓
EU AI Act conformity documentation				✓	✓
Post-quantum cryptography (ML-KEM/ML-DSA)				✓	✓
FIPS 140-3 validation path					✓
NSA CNSA 2.0 algorithm suite					✓
Formal verification (Coq/Lean proofs)					✓

How To Run It

Option A: Python (recommended)

pip install scbe-aethermoore
python -m scbe_benchmark --target "your-api-endpoint" --key "your-api-key"
# Report saved to benchmark_report.json

Option B: Colab (no install)

Open the included Colab notebook. Paste your API key. Click Run All. Get your report in 10 minutes.

Option C: Manual (any system)

The kit includes all 91 attack prompts as a CSV. Send them to your AI system however you want. Score the responses against the included rubric.

What You Get Back

{
  "system": "your-system-name",
  "timestamp": "2026-03-31T...",
  "compliance_level": 3,
  "compliance_name": "Enterprise",
  "detection_rate": 0.879,
  "false_positive_rate": 0.0,
  "per_class": {
    "direct_override": { "blocked": 10, "total": 10, "rate": 1.0 },
    "encoding_obfuscation": { "blocked": 8, "total": 10, "rate": 0.8 },
    ...
  },
  "recommendations": [
    "Encoding obfuscation: 2 attacks bypassed. Add base64/ROT13 pre-processing.",
    "Multilingual: 1 attack bypassed. Add non-English pattern detection."
  ]
}

What People Use It For

Pre-launch audit

"We're shipping an AI feature next week. Is it safe?" Run the benchmark, get a compliance level, fix the gaps before launch.

Vendor evaluation

"Which AI provider has better safety?" Run the benchmark against multiple providers. Compare scores side by side.

Compliance evidence

"Our auditor asked for AI safety documentation." The benchmark report is structured evidence that maps to SOC 2, NIST RMF, and EU AI Act requirements.

Red team training

"Our security team needs practice attacking AI systems." The 91 attacks are organized by category and difficulty. Great for tabletop exercises.

Compare

	This Kit ($5)	Promptfoo (free OSS)	Enterprise Red Team ($50K+)
Attack corpus	91 attacks, 10 classes	50+ vulnerability types	Custom per engagement
Compliance mapping	OWASP + NIST + MITRE + EU AI Act + NSA	OWASP + MITRE	Full custom
Time to results	10 minutes	30 min - 2 hours	2-6 weeks
Scored report	Yes (JSON + compliance level)	Yes (HTML)	Yes (PDF)
Null-space detection	Yes (unique to SCBE)	No	Depends on team
Sacred Tongue profiling	Yes (6D domain analysis)	No	No
Price	$5	Free	$50,000+

Promptfoo is excellent open-source tooling (now part of OpenAI). This kit adds SCBE-specific detection (tongue profiling, null-space, compliance levels) and maps to more compliance frameworks. They complement each other.

See the real numbers (honest evidence ledger) →

Get the Kit

91 attacks. 5 compliance levels. 10 minutes. One JSON report that tells you exactly where your AI stands.