
GAUNTLET - Classifier Robustness Under Attack

A Bittensor subnet for provable classifier robustness under attack


Tech stack

Web3
Bittensor
Adversarial Robustness
AI
Classifiers
Security

Description

Repository: https://github.com/JacobKohav/gauntlet/tree/main

Demo slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Demo.pdf

Demo video: https://github.com/JacobKohav/gauntlet/blob/main/resources/Introducing%20GAUNTLET_Provable%20Classifier%20Robustness%20Under%20Attack.mp4

Pitch slides: https://github.com/JacobKohav/gauntlet/blob/main/resources/GAUNTLET-Subnet_Pitch.pdf

🏁 GAUNTLET Subnet

🛡️ Classifier Adversarial Robustness Subnet

A Bittensor Subnet for Provable Classifier Robustness Under Attack


0. 🧾 Executive Summary

GAUNTLET (models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.

  • ⛏️ Miners: Train and serve classifiers (image, tabular, signal).

  • 🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).

  • Scoring: Accuracy under adaptive adversarial attack.

  • Reward: Emissions flow to models that remain accurate under stress.

This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.

Instead of rewarding raw accuracy, we reward resilience under attack.


1. 🧩 Subnet Design Proposal


1.1 🧭 System Overview

+------------------+          +----------------------+
|      Miner       |          |      Validator       |
|------------------|          |----------------------|
| Robust Classifier| <------> | Adversarial Engine   |
| API Endpoint     |          | (PGD, FGSM, AutoAtk) |
+------------------+          +----------------------+
         |                                |
         |                                |
         +------------> Scoring <---------+
                          |
                          v
                 Emission Allocation

The system creates a continuous adversarial game:

  • Validators try to break models.

  • Miners try to withstand attacks.

  • Emissions reflect robustness.


1.2 ⚖️ Incentive & Mechanism Design


1.2.1 🪙 Emission and Reward Logic

Each epoch:

  1. Validators sample a hidden dataset batch.

  2. Generate adversarial perturbations.

  3. Evaluate:

    • Clean Accuracy (A_clean)

    • Robust Accuracy (A_adv)

  4. Compute final robustness score:

\[ \text{Score}_i = \alpha \cdot A_{\text{adv}} + \beta \cdot A_{\text{clean}} - \gamma \cdot \text{LatencyPenalty} \]

Where:

  • α > β (robustness weighted higher)

  • γ discourages slow inference
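The score formula can be sketched in a few lines. This is a minimal illustration, not the subnet's implementation: the default weights, the `latency_budget_ms` parameter, and the shape of the latency penalty (fractional overshoot capped at 1) are all assumptions, since the proposal does not define them.

```python
def robustness_score(a_adv, a_clean, latency_ms,
                     alpha=0.7, beta=0.3, gamma=0.1,
                     latency_budget_ms=100.0):
    """Score = alpha * A_adv + beta * A_clean - gamma * LatencyPenalty.

    All default values are illustrative assumptions.
    """
    if not alpha > beta:
        raise ValueError("alpha must exceed beta so robustness is weighted higher")
    # Hypothetical penalty: fraction by which latency exceeds the budget, capped at 1.
    overshoot = max(0.0, latency_ms - latency_budget_ms) / latency_budget_ms
    latency_penalty = min(1.0, overshoot)
    return alpha * a_adv + beta * a_clean - gamma * latency_penalty
```

With these assumed defaults, a miner that answers within the budget is scored purely on its two accuracies, and a slow miner loses at most `gamma` from its score.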

Emission distribution:

\[ \text{Emission}_i = \frac{\text{Score}_i^{\tau}}{\sum_j \text{Score}_j^{\tau}} \]

  • τ = sharpening exponent (acting as an inverse temperature): larger τ concentrates emissions on the top-scoring miners.
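A short sketch of the emission split shows how τ changes the outcome. Two details are assumptions not covered by the proposal: negative scores are clipped to zero so the power is well defined, and an all-zero round falls back to a uniform split.

```python
def emission_shares(scores, tau=2.0):
    """Return each miner's share: Score_i^tau / sum_j Score_j^tau."""
    if not scores:
        return []
    # Assumption: a negative score (heavy latency penalty) earns nothing.
    powered = [max(0.0, s) ** tau for s in scores]
    total = sum(powered)
    if total == 0.0:
        # Assumption: if nobody scored, split evenly.
        return [1.0 / len(scores)] * len(scores)
    return [p / total for p in powered]
```

For two miners scoring 0.6 and 0.3, `tau=1.0` gives a 2:1 split, while `tau=2.0` gives a 4:1 split, which is the sense in which a larger τ sharpens competition.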


1.2.2 🎯 Incentive Alignment

Miners

Incentivized to:

  • Train adversarially robust models

  • Reduce gradient masking

  • Provide fast inference

  • Avoid overfitting to validator patterns

They are punished if:

  • Attacks reduce performance drastically

  • Latency is excessive

  • Outputs are inconsistent


Validators

Validators are incentivized to:

  • Generate strong, valid attacks

  • Discover weaknesses

  • Avoid false negatives

Validator reward depends on:

\[ \text{ValidatorScore} = \Delta\text{Accuracy} + \text{NoveltyFactor} \]

Where:

  • Δ Accuracy = drop caused by attack

  • NoveltyFactor = encourages new perturbation types

Validators are penalized for:

  • Invalid perturbations (exceeding epsilon bounds)

  • Trivial or duplicate attacks
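The validator reward and the epsilon-bound penalty can be combined in one small function. This is a sketch under stated assumptions: the L-infinity norm is used as the perturbation bound, an out-of-bounds attack earns zero, and `novelty` is passed in as a number because the proposal does not say how NoveltyFactor is computed.

```python
def linf_distance(clean, perturbed):
    """Largest absolute per-feature change between two inputs."""
    return max(abs(p - c) for c, p in zip(clean, perturbed))

def validator_score(acc_clean, acc_adv, clean, perturbed, epsilon, novelty=0.0):
    """ValidatorScore = delta accuracy + novelty factor, for valid attacks only."""
    # Assumption: a perturbation outside the epsilon ball is invalid and earns nothing.
    if linf_distance(clean, perturbed) > epsilon + 1e-9:
        return 0.0
    # An attack that does not lower accuracy contributes no delta.
    delta_accuracy = max(0.0, acc_clean - acc_adv)
    return delta_accuracy + novelty
```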


1.2.3 🧹 Discouraging Low-Quality or Adversarial Behavior

Against Miners:

  • Randomized attack strategies

  • Hidden test sets

  • Ensemble validators

  • Transfer attacks

Against Validators:

  • Attack validity checks

  • Bounded perturbation norms

  • Multi-validator consensus
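One simple way to picture multi-validator consensus is a median over the robust accuracies that different validators report for the same miner. The sketch below is an illustration only; it is not Bittensor's on-chain consensus, and the `tolerance` threshold for flagging outliers is a hypothetical parameter.

```python
from statistics import median

def consensus_robust_accuracy(reports, tolerance=0.1):
    """Aggregate per-validator reports for one miner.

    reports: dict mapping validator id -> reported robust accuracy.
    Returns the median and the validators that deviate from it by more
    than `tolerance` (an assumed threshold).
    """
    agreed = median(reports.values())
    outliers = sorted(v for v, acc in reports.items()
                      if abs(acc - agreed) > tolerance)
    return agreed, outliers
```

A median is used here because a single validator reporting an extreme value cannot move it, which matches the goal of checking validators against each other.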


1.2.4 🧠 Proof of Intelligence

This subnet qualifies as a Proof of Intelligence because:

  • Robustness under adversarial attack is computationally non-trivial.

  • It requires:

    • Adversarial training

    • Regularization strategies

    • Model architecture sophistication

  • Validators must compute gradient-based adversarial examples.

The system proves:

  • Model generalization

  • Defense capability

  • Computational effort

Unlike raw inference subnets, this one measures resilience against strategic adversaries.


1.2.5 🧪 High-Level Algorithm

Epoch Loop

For each epoch:
    1. Validators sample hidden dataset batch
    2. Validators query miner model
    3. Generate adversarial samples (PGD/FGSM/etc)
    4. Evaluate clean and adversarial accuracy
    5. Compute score
    6. Normalize emissions
    7. Distribute rewards
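The epoch loop above can be sketched as a toy simulation. Everything concrete here is an assumption made for illustration: miners are plain Python callables rather than network endpoints, the task is a one-dimensional threshold problem, the attack shifts each input toward the decision boundary, and the latency penalty is omitted.

```python
import random

def run_epoch(miners, dataset, attack, batch_size,
              alpha=0.7, beta=0.3, tau=2.0, seed=0):
    """Run one scoring epoch and return each miner's emission share."""
    rng = random.Random(seed)
    batch = rng.sample(dataset, batch_size)          # 1. sample hidden batch
    scores = {}
    for name, predict in miners.items():
        # 2. query the miner on clean inputs
        clean_hits = sum(predict(x) == y for x, y in batch)
        # 3. perturb each input, then query again
        adv_hits = sum(predict(attack(x, y)) == y for x, y in batch)
        # 4. clean and adversarial accuracy
        a_clean = clean_hits / len(batch)
        a_adv = adv_hits / len(batch)
        # 5. score (latency penalty omitted in this sketch)
        scores[name] = alpha * a_adv + beta * a_clean
    # 6. normalize with the sharpening exponent tau
    powered = {n: s ** tau for n, s in scores.items()}
    total = sum(powered.values())
    if total == 0:
        return {n: 1.0 / len(powered) for n in powered}
    # 7. shares to distribute as rewards
    return {n: p / total for n, p in powered.items()}

# Toy task: the true label is 1 when x > 0.
def shift_toward_boundary(x, y, epsilon=0.3):
    """Hypothetical attack: move x by epsilon toward the wrong side."""
    return x - epsilon if y == 1 else x + epsilon

toy_miners = {
    "robust": lambda x: int(x > 0.0),    # boundary far from the data
    "brittle": lambda x: int(x > 0.8),   # boundary close to positive samples
}
toy_dataset = [(1.0, 1), (-1.0, 0), (1.0, 1), (-1.0, 0)]
shares = run_epoch(toy_miners, toy_dataset, shift_toward_boundary, batch_size=4)
```

Both toy miners are perfectly accurate on clean data, but only the brittle one misclassifies the shifted positive samples, so the robust miner ends up with the larger share.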

2. ⛏️ Miner Design


2.1 🗂️ Miner Tasks

Miners must:

  • Host a classifier API

  • Accept batch inputs

  • Return:

    • Predicted class

    • Confidence score (optional)

  • Respond within latency constraints

Supported domains:

  • Image classification (e.g. CIFAR-style)

  • Tabular fraud detection

  • Signal classification


2.2 🔁 Expected Input → Output Format

Input (JSON)

{
  "task_id": "image_cifar",
  "batch": [
    { "input": <base64_encoded_tensor> }
  ]
}

Output

{
  "predictions": [
    { "label": 3, "confidence": 0.92 }
  ],
  "latency_ms": 38
}
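A miner-side handler for this request and response shape might look like the sketch below. The tensor encoding is an assumption (a flat list of little-endian float32 values, base64-encoded), since the proposal only names the field as a base64-encoded tensor; `classify` is a hypothetical callable standing in for the miner's model.

```python
import base64
import json
import struct
import time

def encode_tensor(values):
    """Pack floats as little-endian float32 and base64-encode them (assumed format)."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")

def decode_tensor(payload):
    """Inverse of encode_tensor."""
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

def handle_request(request_json, classify):
    """Decode a batch, classify each item, and report measured latency."""
    request = json.loads(request_json)
    start = time.perf_counter()
    predictions = []
    for item in request["batch"]:
        label, confidence = classify(decode_tensor(item["input"]))
        predictions.append({"label": label, "confidence": confidence})
    latency_ms = round((time.perf_counter() - start) * 1000)
    return json.dumps({"predictions": predictions, "latency_ms": latency_ms})
```

In this sketch the miner reports its own latency. A validator would more plausibly time the round trip itself, since a self-reported value can be understated.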

2.3 📊 Performance Dimensions

| Dimension       | Weight |
|-----------------|--------|
| Robust Accuracy | High   |
| Clean Accuracy  | Medium |
| Latency         | Medium |
| Consistency     | Medium |


3. 🛡️ Validator Design


3.1 🧪 Scoring Methodology

Validators:

  1. Perform gradient estimation.

  2. Run:

    • FGSM

    • PGD (multi-step)

    • AutoAttack (optional advanced phase)

  3. Measure:

\[ \text{RobustAccuracy} = \frac{\text{Correct under attack}}{\text{Total}} \]

Validators submit:

  • Perturbed samples

  • Attack parameters

  • Result logs
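The steps above (gradient estimation, FGSM, PGD, robust accuracy) can be sketched without any ML framework. This is an illustration under assumptions: gradients are estimated by central finite differences on a loss the validator can evaluate, perturbations are bounded in the L-infinity norm, and the target is a hypothetical two-feature logistic model. AutoAttack is not shown.

```python
import math

def sign(v):
    return (v > 0) - (v < 0)

def estimate_gradient(loss, x, h=1e-4):
    """Central finite-difference estimate of d(loss)/dx for a list input."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((loss(up) - loss(down)) / (2 * h))
    return grad

def fgsm(loss, x, epsilon):
    """Single step of size epsilon along the sign of the gradient."""
    grad = estimate_gradient(loss, x)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

def pgd(loss, x, epsilon, step, iters):
    """Repeated signed steps, projected back into the epsilon ball around x."""
    adv = list(x)
    for _ in range(iters):
        grad = estimate_gradient(loss, adv)
        adv = [a + step * sign(g) for a, g in zip(adv, grad)]
        adv = [min(max(a, xi - epsilon), xi + epsilon) for a, xi in zip(adv, x)]
    return adv

def robust_accuracy(predict, make_loss, samples, attack):
    """Correct under attack / total."""
    hits = sum(predict(attack(make_loss(y), x)) == y for x, y in samples)
    return hits / len(samples)

# Hypothetical target model: logistic classifier with fixed weights.
WEIGHTS = [2.0, -1.0]

def logit(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x))

def predict(x):
    return int(logit(x) > 0)

def make_loss(y):
    """Cross-entropy loss of the toy model for true label y."""
    s = 1.0 if y == 1 else -1.0
    return lambda x: math.log1p(math.exp(-s * logit(x)))

samples = [([1.0, 0.0], 1), ([0.2, 0.0], 1), ([-1.0, 0.5], 0), ([-0.1, 0.1], 0)]
clean_acc = robust_accuracy(predict, make_loss, samples, lambda loss, x: x)
fgsm_acc = robust_accuracy(predict, make_loss, samples,
                           lambda loss, x: fgsm(loss, x, epsilon=0.5))
```

On this toy data the model classifies every clean sample correctly, while the FGSM perturbation flips the two samples that sit close to the decision boundary.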


3.2 ⏱️ Evaluation Cadence

  • Epoch-based scoring (e.g., every 100 blocks)

  • Rolling average to reduce variance

  • Randomized attack selection
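The rolling average can be sketched with a fixed-size window per miner. The window length is an assumption; the proposal does not specify one, nor whether the average is windowed or exponentially weighted.

```python
from collections import deque

class RollingScore:
    """Mean of the most recent `window` epoch scores for one miner."""

    def __init__(self, window=5):
        # deque with maxlen drops the oldest score automatically.
        self._scores = deque(maxlen=window)

    def update(self, score):
        """Record a new epoch score and return the current rolling mean."""
        self._scores.append(score)
        return sum(self._scores) / len(self._scores)
```

Averaging over several epochs means that one unlucky batch or one unusually strong attack moves a miner's effective score only partially.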


3.3 🤝 Validator Incentive Alignment

Validators earn more if:

  • They discover new vulnerabilities.

  • They reduce miner robustness significantly.

  • Their attack validity is confirmed by peers.

Validator staking is required to discourage spam attacks.


4. 💼 Business Logic & Market Rationale


4.1 ❗ The Problem

Adversarial attacks threaten:

  • Autonomous vehicles

  • Fraud detection systems

  • AI medical diagnostics

  • Financial AI systems

Most deployed AI models are:

  • Not adversarially tested

  • Easily manipulated

  • Vulnerable to gradient-based attacks

Robustness testing today is:

  • Centralized

  • Expensive

  • Static

We create:

A decentralized, continuous robustness benchmark.


4.2 🆚 Competing Solutions

Outside Bittensor:

  • RobustBench

  • Academic benchmarks

  • Internal red-teaming

Limitations:

  • Static datasets

  • No economic incentives

  • No adversarial evolution

Within Bittensor:

  • General inference subnets

  • LLM scoring subnets

None focus on adversarial ML robustness.


4.3 🧬 Why Bittensor?

Bittensor is uniquely suited because:

  • It supports adversarial competition.

  • Emissions reward measurable performance.

  • Validators can evolve attacks.

  • Miners continuously improve.

It creates:

A live adversarial ecosystem.


4.4 🛣️ Path to Sustainable Business

Possible monetization:

  1. Enterprise robustness certification

  2. API access to robustness leaderboard

  3. Insurance underwriting input

  4. White-label adversarial testing

Long term:

  • Robustness Score as on-chain primitive

  • Security oracle for AI systems


5. 🚀 Go-To-Market Strategy


5.1 🎯 Initial Target Users

  • AI startups deploying classifiers

  • Web3 AI protocols

  • Security-focused AI labs

  • Research institutions

Early dataset domains:

  • Fraud detection

  • Crypto transaction anomaly detection

  • Image moderation systems


5.2 📣 Distribution Channels

  • Crypto AI Twitter

  • Bittensor ecosystem partners

  • Research publications

  • Hackathon demos

  • Open leaderboard website


5.3 🎁 Incentives for Early Participation

Bootstrapping Miners

  • Bonus emission multiplier for first N epochs

  • Early adopter NFT badge

Bootstrapping Validators

  • Higher reward multiplier for novel attacks

  • Bounty pool for breaking top miner

Bootstrapping Users

  • Free robustness evaluation for first 100 external models


6. 🏗️ Extended Architecture Diagram

                    +------------------+
                    |   Hidden Dataset |
                    +------------------+
                             |
                             v
+----------+        +------------------+        +------------+
|  Miner A | <----> |   Validators     | <----> |  Miner B   |
+----------+        |  (Attack Engine) |        +------------+
       |            +------------------+
       v                      |
  Robust Model                v
                       Score Aggregation
                              |
                              v
                        Emission Split

7. 🔭 Long-Term Vision

Phase 1:

  • Image & tabular classification robustness

Phase 2:

  • LLM jailbreak resistance

  • Multimodal robustness

Phase 3:

  • On-chain AI security oracle

  • AI robustness insurance market


8. 🌙 Why This Is a Moonshot

Although it may feel academic at first, this subnet can become:

  • The security layer of AI

  • The robustness oracle for autonomous systems

  • The on-chain benchmark for trustworthy intelligence

As AI integrates into finance, robotics, and defense:

Robustness becomes more valuable than raw intelligence.

We are building:

A decentralized adversarial intelligence arms race.


9. ✅ Closing Statement

The Adversarial Robustness Subnet transforms adversarial ML from an academic benchmark into a live economic competition.

It aligns:

  • Cryptoeconomics

  • Security engineering

  • Machine learning research

Into a single measurable signal:

Accuracy under attack.

This is not just proof of inference.

This is proof of resilience.

Progress during this hackathon

Design proposal submitted

Fundraising status

In progress

Team leader
Jacob Kohav
Track
AIInfra