The Growing Demand for Decentralized Computing Networks

Over the past two years, the focus of artificial intelligence has shifted. In 2022–2023 everything revolved around large models and their training; today the real battle is no longer about building models but about running them continuously, about the ability to answer billions of queries every day. This is the war of inference, and it matters far more than it seems.

Messari states it clearly in its “State of AI 2025” report: by 2030, inference will account for between 50% and 75% of global computing demand. A share that large will completely reshape the geography of AI infrastructure.

Today, every time a user opens ChatGPT, generates an image, asks for advice, analyzes a text, or asks an agent to browse the web or make a decision, they are consuming inference. The same goes for the thousands of AI agents performing continuous operations in the background, with no one watching them.

The result is computational demand that is climbing steeply, far beyond what is needed to train the models themselves.

The New Pressure: Real Users, Real Interactions

The reasons for this growth are manifold, but they all converge in one direction: AI has become a mass service. Users are no longer merely experimenting; they use AI daily, for longer stretches, and for increasingly complex tasks.

ChatGPT sessions have become longer and more complex: according to data reported by Messari, the time users spend with the models has doubled in a year, and the average duration of a single session has increased by 75%. The signal is clear: AI is no longer an “occasional” assistant but a true operational environment where users stay longer and make broader, more token-dense requests.

Adding to this is the explosion of consumer AI. In 2024, artificial intelligence apps surpassed one billion downloads, with growth exceeding 100% year-over-year. ChatGPT now has over 700 million weekly users, while Gemini, Copilot, and Claude are following similar trends. It’s a continuous wave of demand that translates into compute, and therefore into costs.

The “Reasoning” Factor: Intelligent Models Cost Significantly More

Adding to the complexity is the rise of “reasoning”-oriented models, which do not merely complete sentences but attempt to reason, explain, and plan. These models can consume up to ten times more tokens to answer the same question compared to a traditional model.
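To see what that multiplier means in practice, here is a minimal back-of-the-envelope sketch; the per-token price, token counts, and query volume are illustrative assumptions, not figures from the Messari report.

```python
# Back-of-the-envelope inference cost comparison (all numbers are illustrative assumptions).
PRICE_PER_1K_TOKENS = 0.002  # assumed blended USD price per 1,000 generated tokens

def monthly_cost(tokens_per_answer: int, answers_per_day: int, days: int = 30) -> float:
    """Estimated monthly spend for a service answering `answers_per_day` queries."""
    total_tokens = tokens_per_answer * answers_per_day * days
    return total_tokens / 1_000 * PRICE_PER_1K_TOKENS

standard = monthly_cost(tokens_per_answer=500, answers_per_day=1_000_000)
reasoning = monthly_cost(tokens_per_answer=5_000, answers_per_day=1_000_000)  # ~10x tokens

print(f"standard model:  ${standard:,.0f} per month")   # -> $30,000
print(f"reasoning model: ${reasoning:,.0f} per month")   # -> $300,000
```

Under these assumptions, identical traffic becomes ten times more expensive simply because each answer burns ten times the tokens.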

Add to this the fact that the industry has now adopted post-training techniques based on reinforcement learning, as demonstrated by DeepSeek, and the pressure on compute explodes further. Reinforcement learning generates thousands of iterations for each individual problem, multiplying computational consumption on an industrial scale.

In other words: the smarter AI becomes, the more expensive it is to operate.

The Cost Issue: Inference Has Become the New “Core Challenge” of AI

There is a point that is often underestimated: training is a huge cost, but it is a one-time cost. Once trained, the model exists. Inference, on the other hand, is a continuous, open-ended cost, proportional to the number of users, agents, and applications that rely on the model.

For major labs — OpenAI, Anthropic, Google — inference has become the dominant expense. And this dynamic is paving the way for a quiet revolution that closely concerns the crypto ecosystem.

Open-Source Shifts the Balance: Smaller, Faster, and Much More Cost-Effective Models

While proprietary models continue to grow in size and complexity, open-source is rapidly closing the gap. According to the Artificial Analysis benchmark cited by Messari, the gap between the best closed models and the best large-scale open models is surprisingly small today, especially when weighed against the difference in cost.

An open model with 120 billion parameters costs up to 90% less in inference compared to ChatGPT-5, with a relatively marginal loss in capability.

But the real revolution concerns small and mid-size models, ranging from 4 to 40 billion parameters. Today, many of these models are capable of solving complex tasks while running on a single consumer GPU like an RTX 4090 or 5090. This means that inference no longer needs to be centralized in gigantic data centers: it can be distributed.
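A quick memory calculation shows why these models fit on consumer hardware; the bytes-per-parameter values are standard approximations for each precision, and the sketch counts weights only (the KV cache and activations add more on top).

```python
# Rough VRAM footprint of model weights at different precisions (weights only;
# the KV cache and activations require additional memory on top of this).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weights_gb(params_billion: float, precision: str) -> float:
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1024**3

for size in (4, 12, 40):
    for precision in ("fp16", "int8", "int4"):
        print(f"{size:>3}B @ {precision}: ~{weights_gb(size, precision):5.1f} GB")

# Against the 24 GB of an RTX 4090: a 4B model fits even at fp16, a 12B model fits
# at 8-bit or below, and a 40B model fits only once quantized to roughly 4-bit.
```

That arithmetic, more than anything else, is what makes inference on scattered consumer GPUs realistic.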

And it is here that the world of decentralized AI finds its natural ground.

The Rise of Decentralized Computing Networks (DCN): A New Computing Economy

Decentralized computing networks (DCNs), such as Render, Akash, io.net, Aethir, Hyperbolic, EigenCloud, and Exabits, aggregate millions of GPUs distributed worldwide. For years, these networks struggled to find a real market: training large models on them was simply too complex because of latency and the constant exchange of information between GPUs.

But inference is another story.

Inference requires far less horizontal communication, can be executed in a highly parallel fashion, and can leverage heterogeneous hardware. It does not need perfectly synchronized clusters. It is an ideal task for thousands of scattered nodes, especially now that smaller models are becoming surprisingly powerful.
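Because each request is independent, a scheduler only has to fan work out to whichever nodes are available. The sketch below is a toy illustration of that shape; `Node.run_inference` is a hypothetical stand-in for whatever remote call a real network exposes.

```python
# Toy sketch: dispatching independent inference requests across heterogeneous nodes.
# Node.run_inference is a hypothetical stand-in for a real network's remote call.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    gpu: str  # heterogeneous hardware is fine: no gradient sync, no all-reduce

    def run_inference(self, prompt: str) -> str:
        # In a real network this would be a signed remote call to the node.
        return f"[{self.node_id}/{self.gpu}] answer to: {prompt}"

nodes = [Node("n1", "RTX 4090"), Node("n2", "RTX 5090"), Node("n3", "A100")]
prompts = [f"request #{i}" for i in range(9)]

# Round-robin assignment: each request runs in isolation, with no shared state between nodes.
with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
    futures = [pool.submit(nodes[i % len(nodes)].run_inference, prompt)
               for i, prompt in enumerate(prompts)]
    for future in futures:
        print(future.result())
```

Training, by contrast, would require constant synchronization of weights across every node, which is exactly what a network of scattered GPUs cannot provide.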

This time, the market is truly there. And Messari describes it as the first true “product-market fit” of the entire deAI sector.

The Fortytwo Case: Swarm Intelligence as a Practical Demonstration

Among the most interesting innovations, the report mentions Fortytwo Network, which coordinates small models running on users’ laptops. They work together like a swarm: each model answers the same question, then evaluates the responses of the others, and finally the network produces an optimized answer based on consensus.
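A toy version of that loop, not Fortytwo’s actual protocol, could look like the sketch below: every node answers, every node scores each peer’s answer, and the answer with the best average score is returned.

```python
# Toy swarm-consensus loop (illustrative only, not Fortytwo's actual protocol).
from dataclasses import dataclass
from statistics import mean
from typing import Callable

@dataclass
class SwarmNode:
    answer: Callable[[str], str]        # a small local model generating a reply
    judge: Callable[[str, str], float]  # the same model scoring a peer's reply, 0..1

def swarm_answer(question: str, nodes: list[SwarmNode]) -> str:
    """Every node answers, every node scores every answer, the best average score wins."""
    candidates = [node.answer(question) for node in nodes]
    def consensus_score(candidate: str) -> float:
        return mean(node.judge(question, candidate) for node in nodes)
    return max(candidates, key=consensus_score)

# Trivial stand-in "models" just to show the flow:
nodes = [
    SwarmNode(answer=lambda q: f"concise answer to {q}",  judge=lambda q, a: 1 / len(a)),
    SwarmNode(answer=lambda q: f"detailed answer to {q}", judge=lambda q, a: len(a) / 100),
    SwarmNode(answer=lambda q: f"balanced answer to {q}", judge=lambda q, a: 0.5),
]
print(swarm_answer("what is 42?", nodes))
```

In a production network, each of these peer scores would also feed the reputation and reward accounting described below.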

The mechanism generates onchain credit, reputation, and rewards. It is efficient enough that Fortytwo has even managed to produce datasets generated entirely by the swarm and to fine-tune a model specialized in Rust code, achieving results superior to those of much larger models.

It is a concrete example of how decentralization is not only desirable but already competitive.

The Verification Issue: The Essential Piece for Decentralized Inference

Every time a request is executed on a distributed node, a crucial question arises: how can one be certain that the result is correct? This is where crypto plays a decisive role.

Messari analyzes three currently dominant approaches:

  • zero-knowledge proofs (zkML), slow but extremely secure;
  • optimistic systems, where the result is considered valid unless challenged (a minimal sketch of this approach follows the list);
  • hardware enclaves (TEE), faster but based on hardware trust.
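To make the optimistic pattern concrete, here is a minimal sketch under assumed data structures, using a simple hash commitment rather than any specific network’s scheme: a result is accepted once its challenge window closes, unless a challenger re-runs the job and produces a mismatching output.

```python
# Minimal optimistic-verification sketch (assumed structures, not a specific protocol).
import hashlib
import time
from dataclasses import dataclass

CHALLENGE_WINDOW_S = 60.0  # assumed dispute window

def commitment(output: str) -> str:
    """Hash commitment to an inference output; a real system would post this onchain."""
    return hashlib.sha256(output.encode()).hexdigest()

@dataclass
class Claim:
    job_id: str
    output_hash: str
    posted_at: float
    challenged: bool = False

def challenge(claim: Claim, recomputed_output: str) -> None:
    """A challenger re-executes the job; a hash mismatch marks the claim as disputed."""
    if commitment(recomputed_output) != claim.output_hash:
        claim.challenged = True

def is_final(claim: Claim, now: float) -> bool:
    """Valid unless challenged: the claim becomes final once the window closes cleanly."""
    return not claim.challenged and (now - claim.posted_at) >= CHALLENGE_WINDOW_S

claim = Claim("job-1", commitment("the model's answer"), posted_at=time.time())
challenge(claim, "a different answer")          # an honest re-execution disagrees
print(is_final(claim, now=time.time() + 120))   # -> False: the result was disputed
```

The appeal of the optimistic pattern is that the happy path is cheap: heavy recomputation only happens when someone actually disputes a result.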

Among the pioneers in the sector is EigenCloud, which is bringing to market deterministic, verifiable inference that is compatible with OpenAI’s APIs and already used in agent frameworks from Coinbase and Google.
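Since the report describes this inference as compatible with OpenAI’s APIs, calling it should look roughly like pointing any OpenAI-style client at a different base URL; the endpoint and model name below are placeholders, not EigenCloud’s actual values.

```python
# Sketch of calling an OpenAI-compatible inference endpoint.
# The base_url and model name are placeholders, not EigenCloud's real values.
from openai import OpenAI

client = OpenAI(
    base_url="https://verifiable-inference.example/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="example-open-model",  # placeholder model identifier
    messages=[{"role": "user", "content": "Why is inference demand booming?"}],
)
print(response.choices[0].message.content)
```

API compatibility matters because, in principle, it lets existing applications and agent frameworks switch to verifiable inference by changing a URL rather than rewriting their integration.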

Verification is not a technical detail: it is what makes AI suitable for finance, health, governance, and autonomous transactions. It is the bridge between AI and Web3.

The Future: An Economy of Agents Continuously Consuming Compute

The conclusion of the report is clear: the future of AI will not be dominated by the largest models, but by those who can deliver inference in the most scalable, cost-effective, and verifiable manner possible. If today human users generate millions of requests, tomorrow autonomous agents will generate billions. And each of these requests will have a computational cost.

At that point, decentralized compute networks will no longer be an experimental alternative: they will become an economic necessity.

Conclusion

We are entering the era of inference, not training.
An era where demand grows without limits, where computation is no longer an isolated investment but a continuous flow, and where millions of models — large or small — will need to be served every second.

And it is precisely here, in this vast economic space, that the crypto world is finding its most natural role: coordinating, verifying, distributing, and economizing the computational power necessary to support an increasingly intelligent society.

Source: https://en.cryptonomist.ch/2025/12/03/why-the-demand-for-inference-is-booming-and-what-it-means-for-decentralized-computing-networks/