Ethereum aims to stop rogue AI agents from stealing trust with new ERC-8004

Ethereum (ETH) announced that ERC-8004 is heading to mainnet, positioning the network as neutral infrastructure for a problem the AI industry can’t yet solve: how agents prove they’re trustworthy when no single platform controls the reputation layer.

The timing reveals the underlying tension, as AI agents are moving from demos into production systems that trigger real transactions.

Mastercard is drafting commerce standards for agentic checkout, UK banks are piloting customer-facing agent trials slated for early 2026, and Gartner projects 40% of enterprise applications will integrate task-specific agents by year-end.

However, a Camunda report found that while 71% of organizations now deploy AI agents, only 11% of use cases reached production over the past year. The blockers are trust, transparency, and regulatory risk.

Dynatrace surveys show roughly half of agentic projects stalled in pilot, with 52% citing security and compliance issues, and about 70% of AI decisions still requiring human verification.

ERC-8004 tries to productize that trust gap by defining three lightweight registries: identity, reputation, and validation. Those can be deployed on mainnet or layer-2 blockchains as application-layer contracts, not a protocol fork.

Ethereum’s official account framed the standard as enabling “discovery and portable reputation,” so AI services can “interoperate without gatekeepers.” The canonical spec remains in draft status on eips.ethereum.org.

Trust in AI agents: surveys from Camunda and Dynatrace show 71% of organizations deploy AI agents, but only 11% of use cases reach production, held back by security concerns and human verification requirements.

Three registries, three coordination problems

The Identity Registry turns each agent into an ERC-721 NFT with a global identifier and a pointer to a structured registration file.

That file lists capabilities, endpoints (MCP, A2A, ENS, DID, web URLs), and contact methods, essentially serving as a service directory for machine actors.
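
To make the idea of a machine-readable service directory concrete, here is a minimal sketch of what such a registration file could look like, built as a Python dictionary and serialized to JSON. The field names (`name`, `capabilities`, `endpoints`, `contact`), the example URLs, and the `validate_registration` helper are assumptions chosen for illustration; they are not the schema defined in the draft EIP.

```python
import json

# Hypothetical registration file; field names and URLs are illustrative only.
registration = {
    "name": "example-research-agent",
    "capabilities": ["summarize", "translate"],
    "endpoints": [
        {"kind": "MCP", "uri": "https://agent.example/mcp"},
        {"kind": "A2A", "uri": "https://agent.example/.well-known/agent-card.json"},
        {"kind": "ENS", "uri": "agent.example.eth"},
        {"kind": "DID", "uri": "did:web:agent.example"},
        {"kind": "web", "uri": "https://agent.example"},
    ],
    "contact": {"email": "ops@agent.example"},
}

# Endpoint kinds named in the article; the labels themselves are assumptions.
KNOWN_KINDS = {"MCP", "A2A", "ENS", "DID", "web"}


def validate_registration(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the file looks well-formed."""
    problems = []
    for key in ("name", "capabilities", "endpoints"):
        if key not in doc:
            problems.append(f"missing field: {key}")
    for endpoint in doc.get("endpoints", []):
        if endpoint.get("kind") not in KNOWN_KINDS:
            problems.append(f"unknown endpoint kind: {endpoint.get('kind')}")
        if not endpoint.get("uri"):
            problems.append("endpoint without uri")
    return problems


# The JSON text is what an on-chain pointer would reference off-chain.
registration_json = json.dumps(registration, indent=2, sort_keys=True)
```

A check like this only confirms that the file is well-formed. It says nothing about whether the listed endpoints behave as advertised.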

Agents become discoverable and transferable using standard NFT tooling.

The spec includes optional endpoint domain verification to prove domain control, and reserves an “agentWallet” field that requires EIP-712 signature or ERC-1271 verification to change.

The design choice prevents “I’m reputable, pay here” hijacks, where an attacker swaps the payment address while preserving the reputation.

Identity solves composability, as reputations and validations can be indexed to a stable agent ID rather than a platform account. Ethereum is trying to turn agent identity into a public utility, the same way ENS did for names, but for machine actors.

The failure mode is baked in: ERC-8004 proves that the metadata belongs to the agent NFT, not that the endpoints are safe or honest.

The spec warns that advertised capabilities “might be non-functional or malicious,” which is why the other two registries exist.

The Reputation Registry stores minimal, composable feedback data on-chain and pushes rich details off-chain via URIs and hashes. Feedback includes a signed fixed-point value with configurable decimals and optional tags.
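
A signed fixed-point value is simply an integer paired with a count of fractional digits. The sketch below shows one way a client could convert between that pair and a decimal score. The function names `decode_feedback` and `encode_feedback` are hypothetical, and the sample values are made up for illustration.

```python
from decimal import Decimal


def decode_feedback(value: int, decimals: int) -> Decimal:
    """Interpret a signed integer as a fixed-point number with `decimals` fractional digits."""
    if decimals < 0:
        raise ValueError("decimals must be non-negative")
    # scaleb shifts the decimal exponent exactly, with no binary rounding error.
    return Decimal(value).scaleb(-decimals)


def encode_feedback(score: Decimal, decimals: int) -> int:
    """Convert a decimal score to its integer form, rejecting any loss of precision."""
    if decimals < 0:
        raise ValueError("decimals must be non-negative")
    scaled = score.scaleb(decimals)
    if scaled != scaled.to_integral_value():
        raise ValueError("score has more precision than `decimals` allows")
    return int(scaled)
```

For example, the pair (-125, 2) decodes to -1.25, and encoding 4.5 with one decimal gives the integer 45. Using `Decimal` rather than floats keeps the round trip exact.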

The off-chain JSON can include context like MCP tool references, A2A task IDs, and even proof-of-payment references. The spec explicitly names x402-style HTTP payment proofs.

There’s a revokeFeedback path and an appendResponse function for refunds, spam flags, or rebuttals.

ERC-8004 does not promise an on-chain Yelp score. It’s closer to a shared event rail where different marketplaces, insurers, and auditors can compute their own trust models.

The spec explicitly warns that summaries without filtering reviewers are vulnerable to Sybil attacks and spam, requiring clientAddresses filtering for getSummary calls.
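
The effect of reviewer filtering is easy to see in a small off-chain model. The sketch below is not the registry's `getSummary` implementation; the `Feedback` record, the `summarize` function, and the short placeholder addresses are assumptions used to illustrate why an unfiltered average is easy to game.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Feedback:
    client: str            # reviewer address (placeholder strings in this sketch)
    value: int             # signed fixed-point integer
    decimals: int          # number of fractional digits in `value`
    revoked: bool = False  # set when the reviewer withdraws the feedback


def summarize(feedback: list[Feedback], client_addresses: list[str]):
    """Return (count, average) over non-revoked feedback from an allow-list of reviewers."""
    allowed = {address.lower() for address in client_addresses}
    if not allowed:
        # Mirrors the article's point: an unfiltered summary invites Sybil spam.
        raise ValueError("refusing to summarize without a reviewer filter")
    scores = [
        Decimal(item.value).scaleb(-item.decimals)
        for item in feedback
        if item.client.lower() in allowed and not item.revoked
    ]
    if not scores:
        return 0, None
    return len(scores), sum(scores) / len(scores)


events = [
    Feedback("0xaaa", 90, 0),
    Feedback("0xbbb", 80, 0),
    Feedback("0xbbb", 10, 0, revoked=True),
    Feedback("0xsybil1", 100, 0),
    Feedback("0xsybil2", 100, 0),
]
count, average = summarize(events, ["0xAAA", "0xbbb"])
```

With the two trusted reviewers as the filter, the two Sybil entries and the revoked entry drop out, leaving two scores that average 85. Who chooses the allow-list remains the open question.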

Aggregation happens both on-chain through basic composability and off-chain through sophisticated scoring. The design treats reputation gaming, such as bought reviews, collusion, and feedback laundering, as inevitable rather than exceptional.

Economic bias creeps in if proof of payment becomes de facto proof of credibility: big spenders look trustworthy. And because rich feedback is event-based and off-chain, whoever runs the best indexers and filters could become a new gatekeeper.

The Validation Registry implements an on-chain request/response log in which agents submit requests to validator contracts to verify work, and validators post outcomes along with optional evidence URIs and hashes.

Agent owners call validationRequest with a validator address, agent ID, request URI, and a keccak commitment to the payload. Validators respond via validationResponse with a score, a response URI, a hash, and a tag.
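
The request/response flow can be modeled off-chain in a few lines. The sketch below is a toy in-memory log, not the registry contract. The `ValidationLog` class, its method signatures, and the rule that only the named validator may respond are assumptions made for illustration. It also uses the standard library's SHA3-256 as a stand-in hash, which does not produce the same digests as Ethereum's keccak256.

```python
import hashlib
from dataclasses import dataclass, field


def commitment(payload: bytes) -> str:
    # Stand-in hash: NIST SHA3-256 from the standard library. Ethereum's
    # keccak256 uses different padding and yields different digests.
    return hashlib.sha3_256(payload).hexdigest()


@dataclass
class ValidationLog:
    requests: dict = field(default_factory=dict)   # request_hash -> request record
    responses: dict = field(default_factory=dict)  # request_hash -> list of responses

    def validation_request(self, validator, agent_id, request_uri, request_hash):
        """Record a request naming the validator and committing to the payload."""
        self.requests[request_hash] = {
            "validator": validator,
            "agent_id": agent_id,
            "request_uri": request_uri,
        }
        self.responses.setdefault(request_hash, [])

    def validation_response(self, sender, request_hash, score, response_uri,
                            response_hash, tag):
        """Append a validator's outcome; several responses per request are allowed."""
        request = self.requests.get(request_hash)
        if request is None:
            raise KeyError("unknown request")
        if sender != request["validator"]:
            raise PermissionError("only the named validator may respond")
        self.responses[request_hash].append({
            "score": score,
            "response_uri": response_uri,
            "response_hash": response_hash,
            "tag": tag,
        })


def payload_matches(payload: bytes, request_hash: str) -> bool:
    """Check that a payload fetched from the request URI matches the commitment."""
    return commitment(payload) == request_hash
```

The commitment lets anyone who later fetches the payload from the request URI confirm that it is the same data the validator was asked to check.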

The spec allows progressive responses, including soft and hard finality via tags, permits multiple responses, and keeps the design intentionally generic to accommodate crypto-economic re-execution, zkML verifiers, TEE oracles, or trusted judges.

Validation is the trust escalator: reputation works for low-stakes tasks, but validation is what you reach for when money, compliance, or liability are on the line.

The EIP describes tiered trust proportional to value-at-risk: pizza orders versus medical diagnoses.
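
A client could express that tiering as a simple policy function. The sketch below is one hypothetical way to do it. The dollar thresholds, the tier names, and the `required_trust` function are all assumptions; the EIP does not prescribe specific values.

```python
def required_trust(value_at_risk_usd: float) -> str:
    """Map the value at risk to a trust tier; thresholds here are illustrative only."""
    if value_at_risk_usd < 0:
        raise ValueError("value at risk cannot be negative")
    if value_at_risk_usd < 50:
        # Low stakes, such as a pizza order: filtered reputation is enough.
        return "reputation"
    if value_at_risk_usd < 10_000:
        # Medium stakes: also require one validator response.
        return "reputation+validator"
    # High stakes: require several independent validator responses.
    return "multiple-validators"
```

In practice the input would not always be a dollar amount. A medical diagnosis carries liability that a price tag does not capture, so a real policy would need richer inputs than this single number.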

The failure mode: who validates the validators? ERC-8004 records validator outputs but doesn’t solve validator integrity, creating a meta-market for validator reputations, staking, insurance, and audit brands.

| Registry | What it does | On-chain | Off-chain | Key mechanisms | Primary failure mode |
| --- | --- | --- | --- | --- | --- |
| Identity Registry | Discovery + durable agent ID (composable handle others can reference) | ERC-721 agent ID + pointers / key-value metadata | Structured registration file (capabilities, endpoints, contact) | Optional endpoint domain verification; `agentWallet` change requires EIP-712 signature or ERC-1271 verification | Metadata can be truthful-but-malicious (ownership ≠ honesty/safety) |
| Reputation Registry | Portable feedback signals across orgs/markets (shared trust events) | Minimal feedback primitives; event rail | Context URIs/hashes (task IDs, payment proofs, etc.) | `revokeFeedback` + `appendResponse` (refunds/rebuttals); `getSummary` requires reviewer filtering to reduce Sybil | Sybil/collusion + “best indexer wins” gatekeeping |
| Validation Registry | Third-party verification for high-stakes actions (trust escalator) | Request/response log + scores/tags | Evidence URIs/hashes | Commitments via `requestHash`; progressive responses (soft/hard finality tags), multiple responses allowed | “Who validates validators?” → validator corruption / cartelization |

Why Ethereum thinks this is infrastructure

The emerging agent stack looks like this: MCP and A2A handle communication and orchestration, x402 (HTTP 402 plus stablecoin settlement) handles payments, and ERC-8004 handles trust and discovery.

The clean line is that ERC-8004 doesn’t compete with MCP, A2A, or x402. Instead, it composes with them.

The EIP includes fields for MCP and A2A endpoints, as well as payment-proof references, within off-chain feedback payloads.


Risks as part of the design