A Bittensor subnet for provable classifier robustness under attack
Repository: https://github.com/JacobKohav/gauntlet/tree/main
Demo slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Demo.pdf
Pitch slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Pitch.pdf
GAUNTLET (models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.
⚔️ Miners: Train and serve classifiers (image, tabular, signal).
🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).
Scoring: Accuracy under adaptive adversarial attack.
Reward: Emissions flow to models that remain accurate under stress.
This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.
Instead of rewarding raw accuracy, we reward resilience under attack.
```
+------------------+              +----------------------+
|      Miner       |              |      Validator       |
|------------------|              |----------------------|
| Robust Classifier| <--------->  | Adversarial Engine   |
| API Endpoint     |              | (PGD, FGSM, AutoAtk) |
+------------------+              +----------------------+
         |                                   |
         +---------> Scoring <---------------+
                        |
                        v
               Emission Allocation
```
The system creates a continuous adversarial game:
Validators try to break models.
Miners try to withstand attacks.
Emissions reflect robustness.
Each epoch:
Validators sample a hidden dataset batch.
Generate adversarial perturbations.
Evaluate:
Clean Accuracy (A_clean)
Robust Accuracy (A_adv)
Compute final robustness score:
[ Score_i = \alpha \cdot A_{adv} + \beta \cdot A_{clean} - \gamma \cdot LatencyPenalty ]
Where:
α > β (robustness weighted higher)
γ discourages slow inference
Emission distribution:
[ Emission_i = \frac{Score_i^\tau}{\sum_j Score_j^\tau} ]
τ = temperature parameter to sharpen competition.
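A minimal sketch of this scoring and emission rule (the weights `alpha`, `beta`, `gamma` and the temperature `tau` are illustrative placeholders, not finalized subnet parameters):

```python
import numpy as np

def miner_score(a_adv, a_clean, latency_penalty,
                alpha=0.7, beta=0.3, gamma=0.1):
    """Score_i = alpha*A_adv + beta*A_clean - gamma*LatencyPenalty (alpha > beta)."""
    return alpha * a_adv + beta * a_clean - gamma * latency_penalty

def emissions(scores, tau=2.0):
    """Emission_i = Score_i^tau / sum_j Score_j^tau; higher tau sharpens competition."""
    s = np.clip(np.asarray(scores, dtype=float), 0.0, None)  # scores assumed >= 0
    powered = s ** tau
    return powered / powered.sum()

# Example: the most robust (if slower) miner earns the largest share
scores = [miner_score(0.62, 0.91, 0.05),
          miner_score(0.48, 0.95, 0.02),
          miner_score(0.70, 0.85, 0.20)]
print(emissions(scores))
```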
Miners are incentivized to:
Train adversarially robust models
Reduce gradient masking
Provide fast inference
Avoid overfitting to validator patterns
They are punished if:
Attacks reduce performance drastically
Latency is excessive
Outputs are inconsistent
Validators are incentivized to:
Generate strong, valid attacks
Discover weaknesses
Avoid false negatives
Validator reward depends on:
[ ValidatorScore = \Delta Accuracy + NoveltyFactor ]
Where:
ΔAccuracy = accuracy drop caused by the attack
NoveltyFactor = bonus that encourages new perturbation types
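A one-line sketch of this rule; `novelty_factor` stands in for however attack novelty ends up being quantified:

```python
def validator_score(acc_clean, acc_adv, novelty_factor=0.0):
    """ValidatorScore = (clean accuracy - robust accuracy) + novelty bonus."""
    delta_accuracy = max(acc_clean - acc_adv, 0.0)  # accuracy drop caused by the attack
    return delta_accuracy + novelty_factor
```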
Validators are penalized for:
Invalid perturbations (exceeding epsilon bounds)
Trivial or duplicate attacks
Gaming by miners is deterred through:
Randomized attack strategies
Hidden test sets
Ensemble validators
Transfer attacks
Validator misbehavior is constrained through:
Attack validity checks
Bounded perturbation norms
Multi-validator consensus
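For example, an attack validity check might reject any perturbation that leaves the agreed L∞ epsilon ball (a sketch; the epsilon value, [0, 1] pixel range, and tensor format are assumptions):

```python
import torch

def is_valid_perturbation(x_clean: torch.Tensor,
                          x_adv: torch.Tensor,
                          epsilon: float = 8 / 255) -> bool:
    """Reject attacks that exceed the L-infinity bound or leave the valid input range."""
    within_ball = (x_adv - x_clean).abs().max().item() <= epsilon + 1e-6
    within_range = bool(((x_adv >= 0) & (x_adv <= 1)).all())
    return within_ball and within_range
```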
This subnet qualifies as a Proof of Intelligence because:
Robustness under adversarial attack is computationally non-trivial.
It requires:
Adversarial training
Regularization strategies
Model architecture sophistication
Validators must compute gradient-based adversarial examples.
The system proves:
Model generalization
Defense capability
Computational effort
Unlike raw inference subnets, this one measures resilience against strategic adversaries.
For each epoch:
1. Validators sample hidden dataset batch
2. Validators query miner model
3. Generate adversarial samples (PGD/FGSM/etc)
4. Evaluate clean and adversarial accuracy
5. Compute score
6. Normalize emissions
7. Distribute rewards
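Sketched as code, reusing `miner_score` and `emissions` from above (the object interfaces `dataset.sample_hidden_batch`, `miner.query`, `validators.attack`, and `miner.latency_penalty` are illustrative, not the actual subnet API):

```python
def run_epoch(validators, miners, dataset, epsilon=8 / 255):
    """One epoch of the adversarial game, following steps 1-7 above."""
    scores = []
    for miner in miners:
        x, y = dataset.sample_hidden_batch()             # 1. hidden batch
        preds_clean = miner.query(x)                     # 2. query miner model
        x_adv = validators.attack(miner, x, y, epsilon)  # 3. adversarial samples
        preds_adv = miner.query(x_adv)                   # 4. evaluate both
        a_clean = (preds_clean == y).float().mean().item()
        a_adv = (preds_adv == y).float().mean().item()
        scores.append(miner_score(a_adv, a_clean,        # 5. compute score
                                  miner.latency_penalty()))
    return emissions(scores)                             # 6-7. normalized emissions
```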
Miners must:
Host a classifier API
Accept batch inputs
Return:
Predicted class
Confidence score (optional)
Respond within latency constraints
Supported domains:
Image classification (e.g. CIFAR-style)
Tabular fraud detection
Signal classification
{"task_id": "image_cifar","batch": [
{ "input": <base64_encoded_tensor> }]}{"predictions": [
{ "label": 3, "confidence": 0.92 }],"latency_ms": 38}Dimension | Weight |
Miners are scored along these dimensions:

| Dimension | Weight |
|---|---|
| Robust Accuracy | High |
| Clean Accuracy | Medium |
| Latency | Medium |
| Consistency | Medium |
Validators:
Perform gradient estimation.
Run:
FGSM
PGD (multi-step)
AutoAttack (optional advanced phase)
Measure:
[ RobustAccuracy = \frac{Correct\ under\ attack}{Total} ]
Validators submit:
Perturbed samples
Attack parameters
Result logs
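A compact version of the multi-step PGD a validator might run (standard L∞ PGD with random start; the step size, step count, and epsilon are illustrative, and FGSM is the one-step special case):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD: repeatedly ascend the loss, project back into the epsilon ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                            # ascent step
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project
            x_adv = x_adv.clamp(0, 1)                                      # valid range
    return x_adv.detach()
```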
Epoch-based scoring (e.g., every 100 blocks)
Rolling average to reduce variance
Randomized attack selection
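One way to implement the rolling average is an exponential moving average (the decay value is an assumption):

```python
def update_rolling_score(prev_score, epoch_score, decay=0.9):
    """Smooth per-epoch scores to reduce variance in emissions."""
    return decay * prev_score + (1 - decay) * epoch_score
```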
Validators earn more if:
They discover new vulnerabilities.
They reduce miner robustness significantly.
Their attack validity is confirmed by peers.
Validator staking required to discourage spam attacks.
Adversarial attacks threaten:
Autonomous vehicles
Fraud detection systems
AI medical diagnostics
Financial AI systems
Most deployed AI models are:
Not adversarially tested
Easily manipulated
Vulnerable to gradient-based attacks
Robustness testing today is:
Centralized
Expensive
Static
We create:
A decentralized, continuous robustness benchmark.
Existing approaches:
RobustBench
Academic benchmarks
Internal red-teaming
Limitations:
Static datasets
No economic incentives
No adversarial evolution
Existing subnets:
General inference subnets
LLM scoring subnets
None focus on adversarial ML robustness.
Bittensor is uniquely suited because:
It supports adversarial competition.
Emissions reward measurable performance.
Validators can evolve attacks.
Miners continuously improve.
It creates:
A live adversarial ecosystem.
Possible monetization:
Enterprise robustness certification
API access to robustness leaderboard
Insurance underwriting input
White-label adversarial testing
Long term:
Robustness Score as on-chain primitive
Security oracle for AI systems
AI startups deploying classifiers
Web3 AI protocols
Security-focused AI labs
Research institutions
Early dataset domains:
Fraud detection
Crypto transaction anomaly detection
Image moderation systems
Crypto AI Twitter
Bittensor ecosystem partners
Research publications
Hackathon demos
Open leaderboard website
Bonus emission multiplier for first N epochs
Early adopter NFT badge
Higher reward multiplier for novel attacks
Bounty pool for breaking top miner
Free robustness evaluation for first 100 external models
```
                 +------------------+
                 |  Hidden Dataset  |
                 +------------------+
                          |
                          v
+----------+      +------------------+      +------------+
| Miner A  | <--> |    Validators    | <--> |  Miner B   |
+----------+      | (Attack Engine)  |      +------------+
      |           +------------------+
      v                    |
 Robust Model              v
                   Score Aggregation
                           |
                           v
                    Emission Split
```
Phase 1:
Image & tabular classification robustness
Phase 2:
LLM jailbreak resistance
Multimodal robustness
Phase 3:
On-chain AI security oracle
AI robustness insurance market
Although it may feel academic at first, this subnet can become:
The security layer of AI
The robustness oracle for autonomous systems
The on-chain benchmark for trustworthy intelligence
As AI integrates into finance, robotics, and defense:
Robustness becomes more valuable than raw intelligence.
We are building:
A decentralized adversarial intelligence arms race.
The Adversarial Robustness Subnet transforms adversarial ML from an academic benchmark into a live economic competition.
It aligns:
Cryptoeconomics
Security engineering
Machine learning research
Into a single measurable signal:
Accuracy under attack.
This is not just proof of inference.
This is proof of resilience.
Status:
Design proposal submitted
Implementation in progress