
GAUNTLET - Classifier Robustness Under Attack

A Bittensor subnet for provable classifier robustness under attack

Videos

Tech Stack

Web3
Bittensor
Adversarial Robustness
Classifiers
Security

Description

Repository: https://github.com/JacobKohav/gauntlet/tree/main

Demo slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Demo.pdf

Demo video: https://github.com/JacobKohav/gauntlet/blob/main/resources/Introducing%20GAUNTLET_Provable%20Classifier%20Robustness%20Under%20Attack.mp4

Pitch slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Pitch.pdf

🏁 GAUNTLET Subnet

🛡️ Classifier Adversarial Robustness Subnet

A Bittensor Subnet for Provable Classifier Robustness Under Attack


0. 🧾 Executive Summary

GAUNTLET (models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.

  • ⛏️ Miners: Train and serve classifiers (image, tabular, signal).

  • 🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).

  • Scoring: Accuracy under adaptive adversarial attack.

  • Reward: Emissions flow to models that remain accurate under stress.

This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.

Instead of rewarding raw accuracy, we reward resilience under attack.


1. 🧩 Subnet Design Proposal


1.1 🧭 System Overview

+------------------+          +----------------------+
|      Miner       |          |      Validator       |
|------------------|          |----------------------|
| Robust Classifier| <------> | Adversarial Engine   |
| API Endpoint     |          | (PGD, FGSM, AutoAtk) |
+------------------+          +----------------------+
         |                                |
         |                                |
         +------------> Scoring <---------+
                          |
                          v
                 Emission Allocation

The system creates a continuous adversarial game:

  • Validators try to break models.

  • Miners try to withstand attacks.

  • Emissions reflect robustness.


1.2 ⚖️ Incentive & Mechanism Design


1.2.1 🪙 Emission and Reward Logic

Each epoch:

  1. Validators sample a hidden dataset batch.

  2. Generate adversarial perturbations.

  3. Evaluate:

    • Clean Accuracy (A_clean)

    • Robust Accuracy (A_adv)

  4. Compute final robustness score:

[ Score_i = \alpha \cdot A_{adv} + \beta \cdot A_{clean} - \gamma \cdot LatencyPenalty ]

Where:

  • α > β (robustness weighted higher)

  • γ discourages slow inference

Emission distribution:

[ Emission_i = \frac{Score_i^\tau}{\sum_j Score_j^\tau} ]

  • τ = temperature parameter to sharpen competition.
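
A minimal Python sketch of this scoring and emission logic; the weight and temperature values here are illustrative, not prescribed by the subnet:

import numpy as np

# Illustrative weights and temperature; real values would be set per subnet configuration.
ALPHA, BETA, GAMMA, TAU = 0.7, 0.2, 0.1, 3.0

def miner_score(a_adv, a_clean, latency_penalty):
    # Score_i = alpha * A_adv + beta * A_clean - gamma * LatencyPenalty
    return ALPHA * a_adv + BETA * a_clean - GAMMA * latency_penalty

def emissions(scores, tau=TAU):
    # Emission_i = Score_i^tau / sum_j(Score_j^tau); negative scores floored at zero
    s = np.clip(np.asarray(scores, dtype=float), 0.0, None) ** tau
    return s / s.sum()

# Example: three miners with different robustness/latency profiles.
shares = emissions([miner_score(0.62, 0.91, 0.05),
                    miner_score(0.40, 0.95, 0.02),
                    miner_score(0.55, 0.88, 0.20)])
print(shares)  # per-miner emission fractions, summing to 1

Raising τ sharpens the distribution, so small robustness gaps translate into larger emission gaps.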


1.2.2 🎯 Incentive Alignment

Miners

Incentivized to:

  • Train adversarially robust models

  • Reduce gradient masking

  • Provide fast inference

  • Avoid overfitting to validator patterns

They are punished if:

  • Attacks reduce performance drastically

  • Latency is excessive

  • Outputs are inconsistent


Validators

Validators are incentivized to:

  • Generate strong, valid attacks

  • Discover weaknesses

  • Avoid false negatives

Validator reward depends on:

[ ValidatorScore = \Delta Accuracy + NoveltyFactor ]

Where:

  • Δ Accuracy = drop caused by attack

  • NoveltyFactor = encourages new perturbation types

Validators are penalized for:

  • Invalid perturbations (exceeding epsilon bounds)

  • Trivial or duplicate attacks
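
A small sketch of how this reward could be computed, combining the formula above with the invalid-perturbation penalty (function and argument names are illustrative):

def validator_score(clean_acc, adv_acc, novelty_factor, attack_valid):
    # ValidatorScore = Delta Accuracy + NoveltyFactor, zeroed for invalid attacks
    if not attack_valid:                   # e.g., perturbation exceeded the epsilon bound
        return 0.0
    delta_accuracy = clean_acc - adv_acc   # accuracy drop caused by the attack
    return delta_accuracy + novelty_factor

# An attack that drops accuracy from 0.91 to 0.55, plus a small novelty bonus:
print(validator_score(0.91, 0.55, novelty_factor=0.05, attack_valid=True))  # ~0.41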


1.2.3 🧹 Discouraging Low-Quality or Adversarial Behavior

Against Miners:

  • Randomized attack strategies

  • Hidden test sets

  • Ensemble validators

  • Transfer attacks

Against Validators:

  • Attack validity checks

  • Bounded perturbation norms

  • Multi-validator consensus
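
The attack-validity check could be as simple as the following PyTorch sketch, which assumes inputs are normalized to [0, 1]; the epsilon default and tolerance are illustrative:

import torch

def is_valid_perturbation(x, x_adv, eps=8 / 255, norm="linf"):
    # Reject attacks whose perturbation leaves the agreed epsilon ball
    delta = (x_adv - x).flatten(start_dim=1)
    if norm == "linf":
        within_budget = delta.abs().max(dim=1).values <= eps + 1e-6
    else:  # L2 budget
        within_budget = delta.norm(p=2, dim=1) <= eps + 1e-6
    in_range = (x_adv >= 0).all() & (x_adv <= 1).all()  # still a valid input
    return bool(within_budget.all() & in_range)

Multi-validator consensus can then be a simple majority vote over each validator's verdict on the same perturbed batch.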


1.2.4 🧠 Proof of Intelligence

This subnet qualifies as a Proof of Intelligence because:

  • Robustness under adversarial attack is computationally non-trivial.

  • It requires:

    • Adversarial training

    • Regularization strategies

    • Model architecture sophistication

  • Validators must compute gradient-based adversarial examples.

The system proves:

  • Model generalization

  • Defense capability

  • Computational effort

Unlike raw inference subnets, this one measures resilience against strategic adversaries.


1.2.5 🧪 High-Level Algorithm

Epoch Loop

For each epoch:
    1. Validators sample hidden dataset batch
    2. Validators query miner model
    3. Generate adversarial samples (PGD/FGSM/etc)
    4. Evaluate clean and adversarial accuracy
    5. Compute score
    6. Normalize emissions
    7. Distribute rewards

2. ⛏️ Miner Design


2.1 🗂️ Miner Tasks

Miners must:

  • Host a classifier API

  • Accept batch inputs

  • Return:

    • Predicted class

    • Confidence score (optional)

  • Respond within latency constraints

Supported domains:

  • Image classification (e.g. CIFAR-style)

  • Tabular fraud detection

  • Signal classification


2.2 🔁 Expected Input → Output Format

Input (JSON)

{"task_id": "image_cifar","batch": [
    { "input": <base64_encoded_tensor> }]}

Output

{"predictions": [
    { "label": 3, "confidence": 0.92 }],"latency_ms": 38}

2.3 📊 Performance Dimensions

| Dimension       | Weight |
|-----------------|--------|
| Robust Accuracy | High   |
| Clean Accuracy  | Medium |
| Latency         | Medium |
| Consistency     | Medium |


3. 🛡️ Validator Design


3.1 🧪 Scoring Methodology

Validators:

  1. Perform gradient estimation.

  2. Run:

    • FGSM

    • PGD (multi-step)

    • AutoAttack (optional advanced phase)

  3. Measure:

[ RobustAccuracy = \frac{Correct\ under\ attack}{Total} ]

Validators submit:

  • Perturbed samples

  • Attack parameters

  • Result logs
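
A sketch of the core PGD loop and the robust-accuracy measurement in PyTorch, assuming image inputs normalized to [0, 1]; epsilon, step size, and step count are illustrative defaults:

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Multi-step projected gradient descent under an L-infinity budget
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()           # ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project into the eps ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def robust_accuracy(model, x, y, **attack_kwargs):
    # RobustAccuracy = correct under attack / total
    x_adv = pgd_attack(model, x, y, **attack_kwargs)
    with torch.no_grad():
        preds = model(x_adv).argmax(dim=1)
    return (preds == y).float().mean().item()

FGSM is roughly the single-step special case (steps=1 with alpha=eps and no random start); AutoAttack would swap pgd_attack for its attack ensemble.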


3.2 ⏱️ Evaluation Cadence

  • Epoch-based scoring (e.g., every 100 blocks)

  • Rolling average to reduce variance

  • Randomized attack selection
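
One way to implement the rolling average, with an illustrative decay factor:

def update_rolling_score(prev_score, new_score, decay=0.9):
    # Exponential moving average smooths per-epoch variance in robustness scores
    if prev_score is None:   # first epoch observed for this miner
        return new_score
    return decay * prev_score + (1 - decay) * new_score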


3.3 🤝 Validator Incentive Alignment

Validators earn more if:

  • They discover new vulnerabilities.

  • They reduce miner robustness significantly.

  • Their attack validity is confirmed by peers.

Validator staking is required to discourage spam attacks.


4. 💼 Business Logic & Market Rationale


4.1 ❗ The Problem

Adversarial attacks threaten:

  • Autonomous vehicles

  • Fraud detection systems

  • AI medical diagnostics

  • Financial AI systems

Most deployed AI models are:

  • Not adversarially tested

  • Easily manipulated

  • Vulnerable to gradient-based attacks

Robustness testing today is:

  • Centralized

  • Expensive

  • Static

We create:

A decentralized, continuous robustness benchmark.


4.2 🆚 Competing Solutions

Outside Bittensor:

  • RobustBench

  • Academic benchmarks

  • Internal red-teaming

Limitations:

  • Static datasets

  • No economic incentives

  • No adversarial evolution

Within Bittensor:

  • General inference subnets

  • LLM scoring subnets

None focus on adversarial ML robustness.


4.3 🧬 Why Bittensor?

Bittensor is uniquely suited because:

  • It supports adversarial competition.

  • Emissions reward measurable performance.

  • Validators can evolve attacks.

  • Miners continuously improve.

It creates:

A live adversarial ecosystem.


4.4 🛣️ Path to Sustainable Business

Possible monetization:

  1. Enterprise robustness certification

  2. API access to robustness leaderboard

  3. Insurance underwriting input

  4. White-label adversarial testing

Long term:

  • Robustness Score as on-chain primitive

  • Security oracle for AI systems


5. 🚀 Go-To-Market Strategy


5.1 🎯 Initial Target Users

  • AI startups deploying classifiers

  • Web3 AI protocols

  • Security-focused AI labs

  • Research institutions

Early dataset domains:

  • Fraud detection

  • Crypto transaction anomaly detection

  • Image moderation systems


5.2 📣 Distribution Channels

  • Crypto AI Twitter

  • Bittensor ecosystem partners

  • Research publications

  • Hackathon demos

  • Open leaderboard website


5.3 🎁 Incentives for Early Participation

Bootstrapping Miners

  • Bonus emission multiplier for first N epochs

  • Early adopter NFT badge

Bootstrapping Validators

  • Higher reward multiplier for novel attacks

  • Bounty pool for breaking top miner

Bootstrapping Users

  • Free robustness evaluation for first 100 external models


6. 🏗️ Extended Architecture Diagram

                    +------------------+
                    |   Hidden Dataset |
                    +------------------+
                             |
                             v
+----------+        +------------------+        +------------+
|  Miner A | <----> |   Validators     | <----> |  Miner B   |
+----------+        | (Attack Engine)  |        +------------+
     |              +------------------+
     v                        |
Robust Model                  v
                      Score Aggregation
                              |
                              v
                        Emission Split

7. 🔭 Long-Term Vision

Phase 1:

  • Image & tabular classification robustness

Phase 2:

  • LLM jailbreak resistance

  • Multimodal robustness

Phase 3:

  • On-chain AI security oracle

  • AI robustness insurance market


8. 🌙 Why This Is a Moonshot

Although it may feel academic at first, this subnet can become:

  • The security layer of AI

  • The robustness oracle for autonomous systems

  • The on-chain benchmark for trustworthy intelligence

As AI integrates into finance, robotics, and defense:

Robustness becomes more valuable than raw intelligence.

We are building:

A decentralized adversarial intelligence arms race.


9. ✅ Closing Statement

The Adversarial Robustness Subnet transforms adversarial ML from an academic benchmark into a live economic competition.

It aligns:

  • Cryptoeconomics

  • Security engineering

  • Machine learning research

Into a single measurable signal:

Accuracy under attack.

This is not just proof of inference.

This is proof of resilience.

Progress During Hackathon

Design proposal submitted

Fundraising Status

In progress

Team Leader
Jacob Kohav
Project Link
Sector
AI Infra