GAUNTLET - Classifier Robustness Under Attack
A Bittensor subnet for provable classifier robustness under attack
Video
Tech Stack
Description
Repository: https://github.com/JacobKohav/gauntlet/tree/main
Demo slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Demo.pdf
Pitch slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Pitch.pdf
🏁 GAUNTLET Subnet
🛡️ Classifier Adversarial Robustness Subnet
A Bittensor Subnet for Provable Classifier Robustness Under Attack
0. 🧾 Executive Summary
GAUNTLET (models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.
⛏️ Miners: Train and serve classifiers (image, tabular, signal).
🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).
Scoring: Accuracy under adaptive adversarial attack.
Reward: Emissions flow to models that remain accurate under stress.
This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.
Instead of rewarding raw accuracy, we reward resilience under attack.
1. 🧩 Subnet Design Proposal
1.1 🧭 System Overview
```
+------------------+            +----------------------+
|      Miner       |            |      Validator       |
|------------------|            |----------------------|
| Robust Classifier|  <------>  | Adversarial Engine   |
| API Endpoint     |            | (PGD, FGSM, AutoAtk) |
+------------------+            +----------------------+
         |                                 |
         +---------> Scoring <-------------+
                        |
                        v
               Emission Allocation
```
The system creates a continuous adversarial game:
Validators try to break models.
Miners try to withstand attacks.
Emissions reflect robustness.
1.2 ⚖️ Incentive & Mechanism Design
1.2.1 🪙 Emission and Reward Logic
Each epoch:
Validators sample a hidden dataset batch.
Generate adversarial perturbations.
Evaluate:
Clean Accuracy (A_clean)
Robust Accuracy (A_adv)
Compute final robustness score:
$$Score_i = \alpha \cdot A_{adv} + \beta \cdot A_{clean} - \gamma \cdot LatencyPenalty$$
Where:
α > β (robustness weighted higher)
γ discourages slow inference
Emission distribution:
$$Emission_i = \frac{Score_i^\tau}{\sum_j Score_j^\tau}$$
τ = temperature parameter to sharpen competition.
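The score and emission formulas above can be sketched as follows; the weight values (α, β, γ) and the temperature τ used here are illustrative assumptions, not fixed subnet parameters:

```python
import numpy as np

def robustness_score(a_adv, a_clean, latency_penalty,
                     alpha=0.7, beta=0.2, gamma=0.1):
    """Score_i = alpha * A_adv + beta * A_clean - gamma * LatencyPenalty.

    alpha > beta weights robustness above clean accuracy; gamma discourages
    slow inference. The default weights are illustrative only.
    """
    return alpha * a_adv + beta * a_clean - gamma * latency_penalty

def emission_shares(scores, tau=3.0):
    """Emission_i = Score_i^tau / sum_j Score_j^tau.

    A higher tau sharpens competition toward the top-scoring miners.
    Scores are clipped at zero so a negative score earns nothing.
    """
    s = np.clip(np.asarray(scores, dtype=float), 0.0, None) ** tau
    return s / s.sum()
```

A miner with higher robust accuracy receives a super-proportional emission share once τ > 1, which is the intended sharpening effect.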
1.2.2 🎯 Incentive Alignment
Miners
Incentivized to:
Train adversarially robust models
Reduce gradient masking
Provide fast inference
Avoid overfitting to validator patterns
They are punished if:
Attacks reduce performance drastically
Latency is excessive
Outputs are inconsistent
Validators
Validators are incentivized to:
Generate strong, valid attacks
Discover weaknesses
Avoid false negatives
Validator reward depends on:
$$ValidatorScore = \Delta Accuracy + NoveltyFactor$$
Where:
Δ Accuracy = drop caused by attack
NoveltyFactor = encourages new perturbation types
Validators are penalized for:
Invalid perturbations (exceeding epsilon bounds)
Trivial or duplicate attacks
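A minimal sketch of the validator scoring above, assuming novelty is detected by hashing each attack configuration against those already seen (the hashing scheme and bonus size are illustrative assumptions):

```python
import hashlib

def novelty_factor(attack_params: dict, seen_hashes: set, bonus: float = 0.1) -> float:
    """Grant a bonus only for attack configurations not seen before.

    Duplicate attacks earn no novelty bonus, which penalizes trivial
    resubmission. Hashing sorted params is an illustrative dedup scheme.
    """
    h = hashlib.sha256(repr(sorted(attack_params.items())).encode()).hexdigest()
    if h in seen_hashes:
        return 0.0
    seen_hashes.add(h)
    return bonus

def validator_score(clean_acc, robust_acc, attack_params, seen_hashes):
    """ValidatorScore = DeltaAccuracy + NoveltyFactor, where DeltaAccuracy
    is the accuracy drop the attack caused."""
    return (clean_acc - robust_acc) + novelty_factor(attack_params, seen_hashes)
```

Resubmitting the same attack parameters yields only the raw accuracy drop, so validators are pushed toward new perturbation types.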
1.2.3 🧹 Discouraging Low-Quality or Adversarial Behavior
Against Miners:
Randomized attack strategies
Hidden test sets
Ensemble validators
Transfer attacks
Against Validators:
Attack validity checks
Bounded perturbation norms
Multi-validator consensus
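The bounded-perturbation check could look like this minimal sketch (the floating-point tolerance and the set of supported norms are assumptions):

```python
import numpy as np

def perturbation_valid(x_clean, x_adv, epsilon, norm="linf"):
    """Reject perturbations that leave the agreed epsilon ball around the
    clean input. A tiny tolerance absorbs floating-point round-off."""
    delta = np.asarray(x_adv, dtype=float) - np.asarray(x_clean, dtype=float)
    if norm == "linf":
        return float(np.abs(delta).max()) <= epsilon + 1e-8
    if norm == "l2":
        return float(np.linalg.norm(delta)) <= epsilon + 1e-8
    raise ValueError(f"unsupported norm: {norm}")
```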
1.2.4 🧠 Proof of Intelligence
This subnet qualifies as a Proof of Intelligence because:
Robustness under adversarial attack is computationally non-trivial.
It requires:
Adversarial training
Regularization strategies
Model architecture sophistication
Validators must compute gradient-based adversarial examples.
The system proves:
Model generalization
Defense capability
Computational effort
Unlike raw inference subnets, this one measures resilience against strategic adversaries.
1.2.5 🧪 High-Level Algorithm
Epoch Loop
For each epoch:
1. Validators sample hidden dataset batch
2. Validators query miner model
3. Generate adversarial samples (PGD/FGSM/etc)
4. Evaluate clean and adversarial accuracy
5. Compute score
6. Normalize emissions
7. Distribute rewards
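The epoch loop above can be sketched with the validator's steps as callbacks; all function names, signatures, and weight values here are illustrative assumptions:

```python
import numpy as np

def run_epoch(miners, sample_batch, attack, evaluate,
              alpha=0.7, beta=0.2, gamma=0.1, tau=3.0):
    """One epoch of the loop above.

    `miners` maps miner id -> model handle; `sample_batch` draws the hidden
    dataset batch; `attack` perturbs a batch against a model; `evaluate`
    returns (accuracy, latency_penalty) for a batch.
    """
    x, y = sample_batch()                              # 1. hidden batch
    scores = {}
    for mid, model in miners.items():
        a_clean, penalty = evaluate(model, x, y)       # 2, 4. query + clean eval
        x_adv = attack(model, x, y)                    # 3. adversarial samples
        a_adv, _ = evaluate(model, x_adv, y)           # 4. robust eval
        scores[mid] = alpha * a_adv + beta * a_clean - gamma * penalty  # 5.
    raw = np.clip(np.array(list(scores.values()), dtype=float), 0.0, None) ** tau
    shares = raw / raw.sum()                           # 6. normalize emissions
    return dict(zip(scores, shares))                   # 7. reward weights
```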
2. ⛏️ Miner Design
2.1 🗂️ Miner Tasks
Miners must:
Host a classifier API
Accept batch inputs
Return:
Predicted class
Confidence score (optional)
Respond within latency constraints
Supported domains:
Image classification (e.g., CIFAR-style)
Tabular fraud detection
Signal classification
2.2 🔁 Expected Input → Output Format
Input (JSON)
{"task_id": "image_cifar","batch": [
{ "input": <base64_encoded_tensor> }]}Output
{"predictions": [
{ "label": 3, "confidence": 0.92 }],"latency_ms": 38}2.3 📊 Performance Dimensions
| Dimension | Weight |
|---|---|
| Robust Accuracy | High |
| Clean Accuracy | Medium |
| Latency | Medium |
| Consistency | Medium |
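A minimal sketch of a miner-side handler for the request/response contract above; the `classify` callback, the float32 tensor encoding, and all helper names are assumptions, and HTTP transport, batching limits, and auth are left out:

```python
import base64
import json
import time

import numpy as np

def handle_request(payload: str, classify) -> str:
    """Decode a request in the JSON format above, run `classify` on each
    decoded tensor, and serialize the response. `classify` maps a float32
    array to a (label, confidence) pair."""
    req = json.loads(payload)
    start = time.perf_counter()
    predictions = []
    for item in req["batch"]:
        x = np.frombuffer(base64.b64decode(item["input"]), dtype=np.float32)
        label, confidence = classify(x)
        predictions.append({"label": int(label), "confidence": float(confidence)})
    latency_ms = int((time.perf_counter() - start) * 1000)
    return json.dumps({"predictions": predictions, "latency_ms": latency_ms})
```

Reporting `latency_ms` from inside the handler lets validators cross-check it against their own round-trip timing.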
3. 🛡️ Validator Design
3.1 🧪 Scoring Methodology
Validators:
Perform gradient estimation.
Run:
FGSM
PGD (multi-step)
AutoAttack (optional advanced phase)
Measure:
$$RobustAccuracy = \frac{\text{Correct under attack}}{\text{Total}}$$
Validators submit:
Perturbed samples
Attack parameters
Result logs
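FGSM, PGD, and the robust-accuracy measurement can be sketched in a framework-agnostic way; `grad_fn`, which returns the loss gradient with respect to the input, is an assumption standing in for a real autograd backend:

```python
import numpy as np

def fgsm(x, y, grad_fn, epsilon):
    """Fast Gradient Sign Method: one step of size epsilon along the sign
    of the loss gradient with respect to the input."""
    return x + epsilon * np.sign(grad_fn(x, y))

def pgd(x, y, grad_fn, epsilon, step, iters=10):
    """Projected Gradient Descent: iterated signed gradient steps, each
    projected back into the L-infinity epsilon ball around the clean input."""
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv, y))
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

def robust_accuracy(model, x_adv, y):
    """RobustAccuracy = correct under attack / total."""
    return float(np.mean(model(x_adv) == y))
```

On a toy linear classifier, a single FGSM step with a large enough epsilon already flips a correctly classified point, which is the drop in accuracy that validators report.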
3.2 ⏱️ Evaluation Cadence
Epoch-based scoring (e.g., every 100 blocks)
Rolling average to reduce variance
Randomized attack selection
3.3 🤝 Validator Incentive Alignment
Validators earn more if:
They discover new vulnerabilities.
They reduce miner robustness significantly.
Their attack validity is confirmed by peers.
Validator staking is required to discourage spam attacks.
4. 💼 Business Logic & Market Rationale
4.1 ❗ The Problem
Adversarial attacks threaten:
Autonomous vehicles
Fraud detection systems
AI medical diagnostics
Financial AI systems
Most deployed AI models are:
Not adversarially tested
Easily manipulated
Vulnerable to gradient-based attacks
Robustness testing today is:
Centralized
Expensive
Static
We create:
A decentralized, continuous robustness benchmark.
4.2 🆚 Competing Solutions
Outside Bittensor:
RobustBench
Academic benchmarks
Internal red-teaming
Limitations:
Static datasets
No economic incentives
No adversarial evolution
Within Bittensor:
General inference subnets
LLM scoring subnets
None focus on adversarial ML robustness.
4.3 🧬 Why Bittensor?
Bittensor is uniquely suited because:
It supports adversarial competition.
Emissions reward measurable performance.
Validators can evolve attacks.
Miners continuously improve.
It creates:
A live adversarial ecosystem.
4.4 🛣️ Path to Sustainable Business
Possible monetization:
Enterprise robustness certification
API access to robustness leaderboard
Insurance underwriting input
White-label adversarial testing
Long term:
Robustness Score as on-chain primitive
Security oracle for AI systems
5. 🚀 Go-To-Market Strategy
5.1 🎯 Initial Target Users
AI startups deploying classifiers
Web3 AI protocols
Security-focused AI labs
Research institutions
Early dataset domains:
Fraud detection
Crypto transaction anomaly detection
Image moderation systems
5.2 📣 Distribution Channels
Crypto AI Twitter
Bittensor ecosystem partners
Research publications
Hackathon demos
Open leaderboard website
5.3 🎁 Incentives for Early Participation
Bootstrapping Miners
Bonus emission multiplier for first N epochs
Early adopter NFT badge
Bootstrapping Validators
Higher reward multiplier for novel attacks
Bounty pool for breaking top miner
Bootstrapping Users
Free robustness evaluation for first 100 external models
6. 🏗️ Extended Architecture Diagram
```
            +------------------+
            |  Hidden Dataset  |
            +------------------+
                     |
                     v
+----------+   +------------------+   +------------+
| Miner A  |<->|    Validators    |<->|  Miner B   |
+----------+   | (Attack Engine)  |   +------------+
     |         +------------------+
     v                  |
Robust Model            v
                Score Aggregation
                        |
                        v
                 Emission Split
```
7. 🔭 Long-Term Vision
Phase 1:
Image & tabular classification robustness
Phase 2:
LLM jailbreak resistance
Multimodal robustness
Phase 3:
On-chain AI security oracle
AI robustness insurance market
8. 🌙 Why This Is a Moonshot
Although it may feel academic at first, this subnet can become:
The security layer of AI
The robustness oracle for autonomous systems
The on-chain benchmark for trustworthy intelligence
As AI integrates into finance, robotics, and defense:
Robustness becomes more valuable than raw intelligence.
We are building:
A decentralized adversarial intelligence arms race.
9. ✅ Closing Statement
The Adversarial Robustness Subnet transforms adversarial ML from an academic benchmark into a live economic competition.
It aligns:
Cryptoeconomics
Security engineering
Machine learning research
Into a single measurable signal:
Accuracy under attack.
This is not just proof of inference.
This is proof of resilience.
Hackathon Progress
Design proposal submitted
Funding Status
In progress