
ProofPen

AI research agents that pay for paywalled sources via x402 micropayments and publish a cryptographic merkle proof of every source on Sepolia using ERC-7710 delegations.

Tech stack

Next
Web3
Node
Solidity
Ethers
React

Description

ProofPen is an autonomous investigative research agent built on MetaMask Smart Accounts. You ask a question and approve one MetaMask signature. A chain of AI agents then researches the question for you, pays for every source it accesses using x402 micropayments, and publishes a cryptographic proof of every source on Sepolia that anyone can verify.

The problem it solves

AI research tools today have no receipts. You get a report with no way to verify which sources were actually accessed, what was paid for, or whether the content was tampered with. ProofPen fixes this by making every source access an on-chain event with a payment trail and a merkle proof.

How it works

When you submit a query, MetaMask Flask prompts you to sign an ERC-7710 delegation granting the orchestrator agent a tUSDC spending budget via wallet_grantPermissions (ERC-7715). You sign once; no further approvals are needed.

The orchestrator uses Groq AI (llama-3.3-70b-versatile) to decompose your query into subtasks, then creates sub-delegations for three specialist agents, each with a tighter Erc20TransferAmount caveat than the root delegation. A sub-agent cannot spend more than its caveat allows, and the entire permission chain is cryptographically enforced.
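The narrowing of budgets down the chain can be modelled as a small pure function. This is only a sketch of the invariant: the `Delegation` type, the `subDelegate` and `canSpend` helpers, and the amounts are hypothetical, and none of it is the MetaMask Delegation Toolkit API. In ProofPen the limit is enforced by the Erc20TransferAmount caveat, not by application code like this, and cumulative spending is not modelled here.

```typescript
// Hypothetical model of a delegation chain; NOT the MetaMask Delegation Toolkit API.
interface Delegation {
  delegate: string;          // agent receiving the permission
  maxAmount: number;         // tUSDC spending cap, in base units (assumed 6 decimals)
  parent: Delegation | null; // null for the root delegation signed by the user
}

// Create a child delegation whose cap must not exceed its parent's cap.
function subDelegate(parent: Delegation, delegate: string, maxAmount: number): Delegation {
  if (maxAmount <= 0 || maxAmount > parent.maxAmount) {
    throw new Error(`cap ${maxAmount} is outside (0, ${parent.maxAmount}]`);
  }
  return { delegate, maxAmount, parent };
}

// A single transfer is allowed only if it fits under every cap up the chain.
function canSpend(d: Delegation, amount: number): boolean {
  for (let cur: Delegation | null = d; cur !== null; cur = cur.parent) {
    if (amount > cur.maxAmount) return false;
  }
  return true;
}

// Illustrative amounts: a 3 tUSDC root budget and a 1 tUSDC cap for one sub-agent.
const rootDelegation: Delegation = { delegate: "orchestrator", maxAmount: 3000000, parent: null };
const scraper = subDelegate(rootDelegation, "scraper", 1000000);

console.log(canSpend(scraper, 500000));  // true
console.log(canSpend(scraper, 2000000)); // false
```

Walking the chain from child to root mirrors the idea that a sub-agent is bound by its own caveat and by every caveat above it.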

Each agent then hits paywalled data APIs using the x402 HTTP micropayment protocol. When a server returns 402 Payment Required, the agent pays the requested tUSDC amount on Sepolia via the 1Shot permissionless relayer (gasless - no EOA gas wallet setup required), then retries the request with an X-Payment header. Three data sources are wired up: an academic paper server (backed by Semantic Scholar and OpenAlex), a fact-checking server (Wikipedia + Groq LLM), and a quote validation server (Groq LLM).
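The request, pay, retry loop described above can be sketched as follows. This is a simplified illustration, not the code in src/lib/x402client.ts: the `PaymentDetails` body shape, the `PayFn` callback, and passing the receipt directly as the X-Payment value are assumptions, and the real x402 wire format is richer than this. The HTTP and payment functions are injected so the flow can be exercised without a network.

```typescript
// Minimal sketch of a 402 -> pay -> retry loop with injected dependencies.
type FetchFn = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{ status: number; json(): Promise<unknown> }>;

// ASSUMPTION: simplified payment details returned in the 402 body.
interface PaymentDetails {
  amount: string; // tUSDC amount in base units
  payTo: string;  // recipient address
}

// Settles the payment (for example via a relayer) and returns a receipt such as a tx hash.
type PayFn = (details: PaymentDetails) => Promise<string>;

async function fetchWithPayment(
  url: string,
  fetchFn: FetchFn,
  pay: PayFn
): Promise<{ body: unknown; receipt: string | null }> {
  const first = await fetchFn(url);
  if (first.status !== 402) {
    // No payment was requested; return the response body as-is.
    return { body: await first.json(), receipt: null };
  }

  // The server asked for payment: read the details, pay, then retry once.
  const details = (await first.json()) as PaymentDetails;
  const receipt = await pay(details);
  const second = await fetchFn(url, { headers: { "X-Payment": receipt } });
  if (second.status !== 200) {
    throw new Error(`payment not accepted: HTTP ${second.status}`);
  }
  return { body: await second.json(), receipt };
}
```

Returning the receipt alongside the body lets the caller record the payment transaction hash next to the content it paid for.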

Once all agents complete, the Synthesizer agent writes a structured report using Groq AI. Every source access across all agents - URL, payment tx hash, keccak256 content hash, timestamp, and agent role - is committed into a merkle tree using merkletreejs. The merkle root and IPFS report CID are then stored on-chain in ResearchRegistry.sol on Sepolia via 1Shot. The full report is uploaded to IPFS via Pinata.
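A rough sketch of how source records could be turned into leaves and a root is shown below. It is hand-rolled for illustration and makes several assumptions that differ from the project: it uses SHA-256 from node:crypto as a stand-in for keccak256, it does not use merkletreejs, and the `SourceAccess` field names and the "|" join used to encode a leaf are hypothetical, since the write-up does not specify the exact encoding.

```typescript
import { createHash } from "node:crypto";

// One record per source access, mirroring the fields listed above (names are assumed).
interface SourceAccess {
  url: string;
  paymentTxHash: string;
  contentHash: string;
  timestamp: number;
  agentRole: string;
}

// ASSUMPTION: SHA-256 stands in for keccak256, which node:crypto does not provide.
function hash(data: Buffer): Buffer {
  return createHash("sha256").update(data).digest();
}

// ASSUMPTION: fields are joined with "|" before hashing; the real encoding may differ.
function leafHash(s: SourceAccess): Buffer {
  const encoded = [s.url, s.paymentTxHash, s.contentHash, String(s.timestamp), s.agentRole].join("|");
  return hash(Buffer.from(encoded, "utf8"));
}

// Hash sorted pairs level by level; an unpaired node is carried up unchanged.
function merkleRoot(leaves: Buffer[]): Buffer {
  if (leaves.length === 0) throw new Error("cannot build a tree with no leaves");
  let level = leaves;
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      if (i + 1 === level.length) {
        next.push(level[i]); // odd node out: promote to the next level
        continue;
      }
      const pair = [level[i], level[i + 1]].sort(Buffer.compare);
      next.push(hash(Buffer.concat(pair)));
    }
    level = next;
  }
  return level[0];
}
```

Because the payment tx hash and the content hash both feed into the leaf, changing either one changes the leaf and therefore the root.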

The result is a research report with a tamper-evident audit trail: anyone can take any source from the report and verify it was actually accessed by checking its leaf proof against the on-chain merkle root.
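Verification of a single source can be sketched as folding the proof's sibling hashes into the leaf and comparing the result with the stored root. As an assumption for illustration, this uses SHA-256 in place of keccak256 and sorted-pair hashing, so the proof needs no left/right position flags; the `hashPair` and `verifyProof` helpers are hypothetical and not part of merkletreejs or ResearchRegistry.sol.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: SHA-256 stand-in for keccak256; the smaller hash always goes first.
function hashPair(a: Buffer, b: Buffer): Buffer {
  const [lo, hi] = Buffer.compare(a, b) <= 0 ? [a, b] : [b, a];
  return createHash("sha256").update(Buffer.concat([lo, hi])).digest();
}

// Fold each sibling into the running hash; the proof holds if we arrive at the root.
function verifyProof(leaf: Buffer, proof: Buffer[], root: Buffer): boolean {
  const computed = proof.reduce((acc, sibling) => hashPair(acc, sibling), leaf);
  return computed.equals(root);
}
```

A verifier would recompute the leaf from the source's published fields, fetch the root from the chain, and call `verifyProof` with the sibling hashes supplied alongside the report.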

What makes it technically interesting

The ERC-7710 sub-delegation chain is the core primitive. Rather than giving all agents the same root permission, each agent gets a child delegation with a strictly smaller spending limit. This is enforced by the MetaMask Delegation Framework's Erc20TransferAmount caveat - not by application logic. The orchestrator cannot delegate more than the user granted it, and no sub-agent can exceed its own caveat.

The x402 payment flow is fully automatic. No human approves individual payments. The x402 client intercepts 402 responses, constructs and submits a tUSDC transfer, waits for confirmation, then retries - all within the same HTTP request lifecycle. Combined with 1Shot's permissionless relay, agents can pay for sources without holding ETH for gas.

The merkle proof is not decorative. Each leaf encodes the payment tx hash alongside the content hash, which means a verifier can independently confirm that a specific on-chain payment corresponds to a specific piece of content. The source list cannot be retroactively altered without invalidating the root stored on Sepolia.

Tech stack: Next.js 16, React 19, Viem 2.52.2, Wagmi 3.6.16, MetaMask Smart Accounts Kit 1.6.0, MetaMask Delegation Toolkit 0.13.0, ERC-7710, ERC-7715, x402 protocol (@x402/express, @x402/evm), 1Shot permissionless relayer, Groq AI (llama-3.3-70b-versatile), merkletreejs, Pinata (pinata-web3), Hardhat 2, OpenZeppelin, Tailwind CSS v4, shadcn/ui.

Contracts deployed on Sepolia:

Note: I originally planned to use Venice AI for inference instead of Groq, but because it has no free or trial plan, I could not implement it.

Hackathon progress

I built ProofPen from scratch over the two weeks of the hackathon. It is a multi-agent autonomous research system that combines MetaMask Smart Accounts, on-chain micropayments, and cryptographic source proofs into a single end-to-end flow.

Week 1: Core infrastructure

I started by architecting the delegation chain. The user signs a single ERC-7710 delegation via wallet_grantPermissions (ERC-7715) in MetaMask Flask, granting a tUSDC budget to the orchestrator agent. I built the orchestrator to sub-delegate to four specialist agents (Scraper, FactChecker, Validator, and Synthesizer), each receiving a tighter Erc20TransferAmount caveat than its parent. No sub-agent can overspend, and the entire permission chain is cryptographically enforced on-chain.

I then deployed two Solidity contracts to Sepolia: ResearchRegistry.sol (stores the merkle root, IPFS CID, and query hash for every research job) and TestUSDC.sol (a mintable ERC-20 for testnet payments). Both were written with Hardhat 2 and OpenZeppelin and deployed via a custom deploy script.

Next I built the x402 payment client (src/lib/x402client.ts). The flow is: make an HTTP request → if 402 is returned, parse the payment details, send a tUSDC transfer on Sepolia via the 1Shot permissionless relayer (gasless, no API key required), then retry with an X-Payment header. I built three mock paywalled API servers using @x402/express with ExactEvmScheme to simulate real paywalled data sources: MockAcademia (academic papers, Semantic Scholar + OpenAlex fallback), MockFactDB (Wikipedia + Groq LLM fact checking), and MockQuoteDB (quote validation via Groq LLM).

Week 2: Agents, synthesis, and UI

I wired Groq AI (llama-3.3-70b-versatile) into the orchestrator for query decomposition and into the synthesizer for final report generation. The orchestrator breaks any research question into structured subtasks (search queries, factual claims to verify, and quotes to validate) and fans them out to the agents in parallel.

I built the merkle proof layer using merkletreejs. Every source access produces a leaf: URL + payment tx hash + keccak256 content hash + timestamp + agent role. These are hashed into a merkle tree and the root is stored in ResearchRegistry.sol on Sepolia via the 1Shot relayer. The full report is uploaded to IPFS via Pinata.

On the frontend I built a real-time agent progress stream using Server-Sent Events, so you can watch each agent work live: query decomposition, delegation creation, x402 payments, synthesis, IPFS upload, and on-chain publish. The results page shows the final report, the full source list with payment receipts, the merkle root, the IPFS link, and an Etherscan link to the on-chain proof tx.
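For the progress stream, each update has to be serialized in the text/event-stream format that Server-Sent Events use. A minimal sketch of that serialization follows; the `AgentProgress` shape and the stage names are assumptions for illustration, not the project's actual event schema.

```typescript
// Hypothetical progress event; stage names loosely follow the pipeline steps above.
interface AgentProgress {
  stage: string;   // must not contain newlines, since it is written on the "event:" line
  message: string;
}

// Serialize one event in the SSE wire format:
// an "event:" line, a "data:" line, then a blank line that terminates the event.
function formatSse(e: AgentProgress): string {
  // JSON.stringify escapes newlines, so the payload always fits on one "data:" line.
  return `event: ${e.stage}\ndata: ${JSON.stringify(e)}\n\n`;
}

console.log(formatSse({ stage: "x402-payment", message: "paid for one source" }));
```

On the client, an EventSource listener registered for a given stage name would receive the JSON payload in `event.data`.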

Tech stack: Next.js 16, React 19, Viem 2.52.2, Wagmi 3.6.16, MetaMask Smart Accounts Kit 1.6.0, MetaMask Delegation Toolkit 0.13.0, x402 Express middleware, 1Shot permissionless relayer, Groq AI (llama-3.3-70b-versatile), merkletreejs, Pinata, Hardhat 2, OpenZeppelin, Tailwind CSS v4, shadcn/ui.

Contracts on Sepolia:


Team leader
Harish K
Project link
Deployment ecosystem
Ethereum Sepolia
Industry
AI, Other