
Test Forge

Forging bulletproof test cases through decentralized AI and mutation-verified unit tests. Proven code quality, rewarded by the network.


Tech Stack

Web3

Description

# TestForge: Project Description

---

## 🎯 One-Liner

TestForge is a Bittensor subnet where AI miners compete to generate battle-tested unit tests, verified through mutation testing to ensure tests actually catch bugs.

---

## 📋 Executive Summary

| | |
|---|---|
| Problem | 70% of open-source code has zero tests. Untested code kills people (Therac-25), crashes markets (Knight Capital $440M), and breaks the internet (Cloudflare, Log4j). |
| Solution | Decentralized AI competition to generate high-quality unit tests with cryptographic proof of usefulness. |
| Innovation | Three-gate verification with mutation testing, the only ungameable way to prove a test actually works. |
| Why Bittensor | Tests are binary verifiable. Perfect fit for incentive-driven competition. Best AI wins, bad AI earns nothing. |

---

## 🔬 How It Works

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   CODE IN   │────▶│  64 MINERS  │────▶│   3-GATE    │────▶│  BEST TEST  │
│             │     │   COMPETE   │     │  VALIDATOR  │     │  WINS TAO   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```

| Gate | Question | Method | Threshold |
|------|----------|--------|-----------|
| Gate 1 | Do tests run? | pytest execution | 0 failures |
| Gate 2 | Do tests cover code? | coverage.py | ≥80% lines |
| Gate 3 | Do tests catch bugs? | Mutation injection | ≥60% kills |

Score = (G1 + G2 + G3) / 3 → Rewards distributed proportionally
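The formula above can be sketched in a few lines of Python. This is an illustration only, not the repository's code: `GateResults`, `gate_flags`, `score`, and `proportional_weights` are hypothetical names, and the sketch assumes each gate contributes 1 when its threshold is met and 0 otherwise. The source does not say whether a single failed gate zeroes the whole score, so the plain average is used here.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the gate table above.
COVERAGE_THRESHOLD = 0.80
KILL_RATE_THRESHOLD = 0.60


@dataclass
class GateResults:
    """Raw measurements for one miner's submission (hypothetical structure)."""
    test_failures: int    # Gate 1: failing tests reported by pytest
    line_coverage: float  # Gate 2: fraction of lines covered, 0.0 to 1.0
    kill_rate: float      # Gate 3: fraction of mutants killed, 0.0 to 1.0


def gate_flags(r: GateResults) -> tuple[int, int, int]:
    """Return (G1, G2, G3), each 1 if the gate's threshold is met, else 0."""
    return (
        1 if r.test_failures == 0 else 0,
        1 if r.line_coverage >= COVERAGE_THRESHOLD else 0,
        1 if r.kill_rate >= KILL_RATE_THRESHOLD else 0,
    )


def score(r: GateResults) -> float:
    """Score = (G1 + G2 + G3) / 3, assuming binary gates."""
    return sum(gate_flags(r)) / 3


def proportional_weights(scores: dict[str, float]) -> dict[str, float]:
    """Normalize scores so they sum to 1; all zeros if nobody scored."""
    total = sum(scores.values())
    if total == 0:
        return {miner: 0.0 for miner in scores}
    return {miner: s / total for miner, s in scores.items()}


# Made-up measurements for two miners.
scores = {
    "miner_a": score(GateResults(test_failures=0, line_coverage=0.92, kill_rate=0.75)),
    "miner_b": score(GateResults(test_failures=0, line_coverage=1.00, kill_rate=0.00)),
}
weights = proportional_weights(scores)
```

With these inputs `miner_a` passes all three gates and `miner_b` passes only the first two, so `miner_a` receives the larger share of the weight.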

---

## 🛡️ Why Mutation Testing is Ungameable

The Cheat (without mutation testing):

```python
def test_fake():
    my_function("x")  # No assertion: always passes, 100% coverage
```

Why It Fails Gate 3:

```python
# Original: return a + b
# Mutant:   return a - b  ← AI injects this bug

test_fake()  # Still passes! Mutation SURVIVES.
# Kill rate: 0% → Gate 3 FAILS → Score: 0
```

You cannot fake catching a bug. The mutation either dies or it doesn't.
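To make the idea concrete, here is a minimal, self-contained sketch of a single mutation operator built on Python's standard `ast` module. It is not TestForge's mutation engine: `AddToSub`, `load_function`, `weak_test`, `strong_test`, and `is_killed` are hypothetical names chosen for illustration, and only one operator (`+` to `-`) is shown.

```python
import ast

SOURCE = """
def add(a, b):
    return a + b
"""


class AddToSub(ast.NodeTransformer):
    """One mutation operator: replace every `+` with `-`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load_function(tree, name):
    """Compile an AST module and return the named function from it."""
    namespace = {}
    exec(compile(tree, "<generated>", "exec"), namespace)
    return namespace[name]


def weak_test(add):
    add(2, 3)  # no assertion, so this test can never fail


def strong_test(add):
    assert add(2, 3) == 5


def is_killed(test, mutant_fn):
    """A mutant is killed when the test fails against it."""
    try:
        test(mutant_fn)
    except AssertionError:
        return True
    return False


original = load_function(ast.parse(SOURCE), "add")
mutant_tree = ast.fix_missing_locations(AddToSub().visit(ast.parse(SOURCE)))
mutant = load_function(mutant_tree, "add")

print(is_killed(weak_test, mutant))    # False: the mutant survives
print(is_killed(strong_test, mutant))  # True: the mutant is killed
```

The assertion-free test executes the mutated line without noticing the change, so its kill rate is 0%. The test with a real assertion fails on the mutant, which is exactly the signal Gate 3 measures.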

---

## 📊 Key Metrics

| Metric | Value |
|--------|-------|
| Prototype Status | ✅ Complete |
| Test Suite | 47/47 tests passing |
| Simulation | 5 miners × 10 epochs working |
| API | FastAPI server ready |
| Benchmarks | SWE-bench, GitBug-Java, Defects4J |

---

## 🏗️ Architecture

```
TESTFORGE SUBNET
├── Task Generator ──────▶ Pulls from real-world benchmarks
├── Miner Pool ──────────▶ LLM agents generate pytest files
├── Validator Pipeline ──▶ 3-gate verification + mutation engine
├── Score Engine ────────▶ Proportional reward distribution
└── Bittensor Chain ─────▶ On-chain weight updates
```

---

## 💰 Economics

| Role | Emission Share | Incentive |
|------|----------------|-----------|
| Miners | 41% | Generate better tests → earn more |
| Validators | 41% | Run honest verification → earn stake |
| Subnet Owner | 18% | Grow the network → grow value |
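As a worked example of the split in the table, the sketch below divides one epoch's emission by role and then shares the miner pool according to normalized weights. The 100 TAO emission figure and the two miner weights are made-up inputs, and `split_emission` and `miner_payouts` are hypothetical helper names, not part of any Bittensor API.

```python
# Emission shares from the Economics table.
EMISSION_SHARES = {"miners": 0.41, "validators": 0.41, "subnet_owner": 0.18}


def split_emission(total_tao):
    """Divide one epoch's emission between the three roles."""
    return {role: total_tao * share for role, share in EMISSION_SHARES.items()}


def miner_payouts(miner_pool, weights):
    """Share the miner pool in proportion to weights that sum to 1."""
    return {miner: miner_pool * w for miner, w in weights.items()}


split = split_emission(100.0)  # hypothetical epoch emission of 100 TAO
payouts = miner_payouts(split["miners"], {"miner_a": 0.6, "miner_b": 0.4})
```

Under these assumed inputs the miner pool is 41 TAO, of which `miner_a` receives 60% and `miner_b` 40%.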

---

## 🚀 Go-To-Market

| Phase | Timeline | Target | Goal |
|-------|----------|--------|------|
| 1 | Months 1-6 | Bittensor devs | 10 miners, 3 validators |
| 2 | Months 6-12 | Open source repos | GitHub Action, 100 repos |
| 3 | Year 2 | Enterprise | Self-hosted, $100k contracts |

---

## 🏆 Competitive Advantage

| vs Copilot/ChatGPT | vs Other Subnets |
|--------------------|------------------|
| ✅ Verified quality | ✅ Binary scoring (not subjective) |
| ✅ Mutation-tested | ✅ Focused on ONE task |
| ✅ Decentralized | ✅ Real benchmarks |
| ✅ Continuously improving | ✅ Ungameable mechanism |

---

## 📦 Deliverables

| Artifact | Status | Link |
|----------|--------|------|
| GitHub Repo | ✅ Live | [Test-Forge](https://github.com/manjeetsharma0796/Test-Forge) |
| Core Engine | ✅ Complete | `testforge` (8 modules) |
| Validation Pipeline | ✅ Complete | 3-gate system |
| Mutation Engine | ✅ Complete | 6 mutation operators |
| Tests | ✅ 47 passing | `/tests/` |
| Simulation | ✅ Working | `python main.py simulate` |
| Documentation | ✅ Complete | `/docs/` (5 guides) |
| API Server | ✅ Ready | `python main.py serve` |

---

## ⚡ Quick Start

```bash
git clone https://github.com/manjeetsharma0796/Test-Forge.git
cd Test-Forge
pip install -r requirements.txt
python -m pytest tests/ -v   # 47 passed ✅
python main.py simulate      # Watch miners compete
```

---

## 🎯 The Ask

Select TestForge for Round 2.

We will deploy a fully functional subnet on Bittensor testnet demonstrating:

- Working miner competition

- Live 3-gate validation

- Real TAO emissions to the best test generator

---

> "The best test is one that catches bugs before users do. TestForge makes AI prove it."

---

GitHub: [github.com/manjeetsharma0796/Test-Forge](https://github.com/manjeetsharma0796/Test-Forge)

Progress During Hackathon

Simulation built; ideation and top-level architecture complete.

Integration is planned for the coming months.

Fundraising Status

0

Team Leader
Manjeet Sharma
Sector
AI, Infra, Other, DeFi