The post The New Gold Rush of AI appeared on BitcoinEthereumNews.com.

The New Gold Rush of AI


In the past decade, artificial intelligence has grown by primarily feeding on the same resource: public web data. Texts, images, documents, forums, news, blogs, repositories… an enormous amount of material that models have absorbed to build their language and cognitive abilities. But this phase is about to end.

According to projections cited by Messari, the total amount of public text available for model training—approximately 300 trillion tokens—could be completely exhausted between 2026 and 2032. This means that large models have “eaten the internet,” and now they need something else. The next frontier for AI will no longer be the web: it will be the real world.

This is where frontier data comes into play: the resource that will define the competitiveness of future models. It includes video, audio, sensor, motor, and robotic data, as well as action data generated by interacting with the physical world or with complex digital interfaces. Such data cannot simply be downloaded: it must be collected, coordinated, verified, and, above all, incentivized.

For this reason, blockchain is not a detail or a marginal addition: it is the infrastructure that makes it possible to orchestrate this new data economy.


The End of “Web Scraping” and the Beginning of High-Value Data

The most advanced models of 2025—not only linguistic but also multimodal, agentic, and reasoning-oriented—no longer improve with the mere addition of generic textual datasets. They require something much more specific and much more expensive to collect: data that reflects actions, intentions, movement, interaction, manipulation, context.

Computer-use agents, AI systems that interact directly with a computer as a human would, are one example. Textual descriptions are not enough to train them: they need “trajectories,” actual recordings of people performing tasks on screen.

A protocol like Chakra, mentioned in the report, has developed an extension that allows users to record their screen while performing daily tasks: navigating a management system, preparing an Excel document, editing images, using professional software. These recordings become invaluable material for training models like GLADOS-1, the first computer-use model built almost entirely on crowdsourced data.
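
To make the idea of a trajectory concrete, here is a minimal sketch of how one recorded task might be represented. The field names, file names, and the spreadsheet task are assumptions made for illustration; the article does not describe the actual format used by Chakra or GLADOS-1.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One recorded moment: what was on screen and what the user did."""
    t_ms: int          # milliseconds since the recording started
    screenshot: str    # hypothetical file name of the captured frame
    action: str        # e.g. "click", "type", "scroll"
    target: str        # UI element the action was applied to
    value: str = ""    # typed text, scroll amount, and so on


@dataclass
class Trajectory:
    """A full task recording: a goal plus the ordered steps that achieve it."""
    task: str
    app: str
    steps: list[Step] = field(default_factory=list)

    def add(self, step: Step) -> None:
        # Steps must stay in chronological order so they can be replayed.
        if self.steps and step.t_ms < self.steps[-1].t_ms:
            raise ValueError("steps must be appended in time order")
        self.steps.append(step)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses into plain dicts and lists.
        return json.dumps(asdict(self))


traj = Trajectory(task="Sum column B", app="spreadsheet")
traj.add(Step(0, "frame_0000.png", "click", "cell:B11"))
traj.add(Step(850, "frame_0001.png", "type", "cell:B11", "=SUM(B1:B10)"))
print(traj.to_json())
```

The point of the structure is that each screen state is stored together with the action taken on it, which is exactly what a textual description of the task lacks.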

And this is precisely the point: this data does not exist until someone produces it. And it must be paid for, just as energy and inference are paid for.


The Increasing Value of Gameplay-Action Pairs

Another striking example comes from the gaming world. A platform like Shaga, born as a decentralized cloud gaming network, produces an extremely valuable byproduct: the so-called Gameplay-Action Pairs (GAP), which are synchronized pairs of what happens on screen and the commands the player issues.

This data cannot be retrieved simply by watching videos on YouTube: it has to be captured at the source, on the player’s device. According to estimates reported by Messari, this type of dataset can be worth $50 to $100 per hour of gameplay.
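
A rough sketch of what "synchronized pairs" could mean in practice: each video frame is matched with the commands the player issued since the previous frame. The timestamps, command strings, and pairing rule below are assumptions for illustration, not Shaga's actual capture format.

```python
from bisect import bisect_right


def pair_frames_with_inputs(frame_times, inputs):
    """Attach to each frame the inputs issued since the previous frame.

    frame_times: sorted list of frame timestamps in milliseconds.
    inputs: list of (timestamp_ms, command) tuples, sorted by timestamp.
    Returns a list of (frame_time, [commands]) pairs.
    """
    input_times = [t for t, _ in inputs]
    pairs = []
    start = 0
    for frame_time in frame_times:
        # Index just past the last input at or before this frame.
        end = bisect_right(input_times, frame_time)
        pairs.append((frame_time, [cmd for _, cmd in inputs[start:end]]))
        start = end
    return pairs


frames = [0, 33, 66, 100]
inputs = [(10, "W down"), (30, "mouse dx=4"), (70, "W up"), (70, "SPACE")]
for frame_time, commands in pair_frames_with_inputs(frames, inputs):
    print(frame_time, commands)
```

A video uploaded to a streaming site only contains the left-hand side of each pair; the input events exist only on the player's device, which is why they have to be captured there.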

To put this into context: Shaga has already accumulated over 259,000 hours of gameplay, with an estimated value of more than 26 million dollars, a figure that corresponds to the top of that range (259,000 hours at $100 per hour is $25.9 million). And it’s no coincidence that OpenAI, a year earlier, offered half a billion dollars to acquire Medal, a similar platform specializing precisely in gameplay recording.
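
The valuation can be checked with simple arithmetic. The sketch below uses only the figures quoted in the article (259,000 hours and $50 to $100 per hour); the variable names are illustrative.

```python
hours_recorded = 259_000      # gameplay hours cited for Shaga
price_low, price_high = 50, 100   # estimated USD per hour of gameplay

value_low = hours_recorded * price_low
value_high = hours_recorded * price_high

print(f"Low estimate:  ${value_low:,}")    # Low estimate:  $12,950,000
print(f"High estimate: ${value_high:,}")   # High estimate: $25,900,000
```

Since the article says "over" 259,000 hours, the upper bound lands at roughly 26 million dollars, while the lower bound of the same range is about half that.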

These data are used to train world models, models that do not merely interpret language but simulate physics, causality, and agent-environment interaction. These are the models that will enable more intelligent robots, autonomous agents, advanced forecasting systems, and AI capable of “moving” in complex environments.


Physical AI: Intelligence Entering the Physical World

And this is precisely where we arrive at the second major wave of frontier data: robotic data.

The AI of the future will not only reside in data centers. It will live in robots, drones, autonomous cars, distributed sensors, and smart home devices. Each robot will need data to learn how to move, identify objects, make decisions, and manipulate environments. And this data collection is incredibly costly: it requires physical hardware, human operators for teleoperation, continuous maintenance, and coordination.

Projects like PrismaX, BitRobot, GEODNET, and NATIX are beginning to use incentive mechanisms typical of Web3 to distribute this cost across a global network of contributors. Instead of a single company collecting robotic data, thousands of users can do so in a coordinated manner and receive direct compensation.

It’s the same logic as mining, but here the contribution is real-world data instead of computational power.
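
One simple way to picture such an incentive mechanism is a pro-rata split: a fixed reward pool is divided among contributors in proportion to the data they supplied. This is only a sketch under that assumption; the article does not say how PrismaX, BitRobot, GEODNET, or NATIX actually compute rewards, and the names and numbers are made up.

```python
def split_rewards(pool, contributions):
    """Split `pool` among contributors in proportion to accepted data units."""
    total = sum(contributions.values())
    if total <= 0:
        # Nothing was contributed, so nobody is paid.
        return {name: 0.0 for name in contributions}
    return {name: pool * units / total for name, units in contributions.items()}


# Hypothetical epoch: accepted hours of robot teleoperation data per contributor.
contributions = {"alice": 120, "bob": 60, "carol": 20}
rewards = split_rewards(1000.0, contributions)
print(rewards)  # {'alice': 600.0, 'bob': 300.0, 'carol': 100.0}
```

Real networks would also need a quality check before counting a contribution, otherwise the cheapest strategy is to submit junk data.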


Machine-to-Machine Coordination: When AI Acts in the Real World

If robots and AI agents truly begin to interact with the physical world, a completely new level of coordination is required. Robots will need to:

  • identify each other,
  • transact payments,
  • purchase services,
  • consume data,
  • execute tasks in a verifiable manner,
  • demonstrate having performed an action,
  • rely on shared ledgers of identity and reputation.

This is where initiatives like OpenMind and Peaq emerge, attempting to build an onchain infrastructure dedicated to the communication and identity of robots. An equivalent of DNS, but for machines. A system where drones, autonomous cars, robotic arms, or industrial systems can signal their presence, certify their actions, pay other systems, and exchange services.
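
The "DNS for machines" analogy can be sketched as a registry that maps a machine identifier to its details and reputation. The class below is an in-memory stand-in for a shared ledger; the identifier format, fields, and reputation rule are assumptions for illustration and do not reflect how OpenMind or Peaq work.

```python
from dataclasses import dataclass


@dataclass
class MachineRecord:
    machine_id: str       # hypothetical identifier format
    kind: str             # "drone", "robotic_arm", ...
    endpoint: str         # where the machine can be reached
    reputation: int = 0   # count of verified task completions


class MachineRegistry:
    """In-memory stand-in for a shared ledger of machine identities."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        if record.machine_id in self._records:
            raise ValueError(f"{record.machine_id} is already registered")
        self._records[record.machine_id] = record

    def resolve(self, machine_id):
        # Like a DNS lookup: a name goes in, connection details come out.
        return self._records[machine_id]

    def report_task(self, machine_id, verified):
        # Only verified task completions improve reputation.
        if verified:
            self._records[machine_id].reputation += 1


registry = MachineRegistry()
registry.register(MachineRecord("machine:drone-17", "drone", "10.0.0.17:9000"))
registry.report_task("machine:drone-17", verified=True)
registry.report_task("machine:drone-17", verified=False)
print(registry.resolve("machine:drone-17"))
```

On an actual decentralized network the registry would live on a ledger that no single party controls, and registrations would be tied to cryptographic keys rather than plain strings.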

It is the beginning of the machine economy, an economy populated by non-human entities that interact autonomously on decentralized networks.


Certified Real Data: The Role of IoTeX and DePIN Networks

The report also places significant focus on IoTeX, a protocol that in recent years has transformed its infrastructure into a comprehensive platform for the collection, certification, and orchestration of real-world data.

IoTeX enables the connection of sensors, IoT devices, home systems, and industrial equipment, providing:

  • a verified onchain identity for each device,
  • a data aggregation system,
  • a layer of cryptographic attestation via zero-knowledge (ZK) proofs,
  • APIs that allow AI agents to use that data in real time.

Today, IoTeX coordinates over 16,000 devices and dozens of vertical projects, giving AI agents access to verified data from the real world. That is a significant difference from simple scraping.
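
To illustrate what "verified" adds over scraped data, the sketch below has each device tag its readings with a keyed hash, and the aggregator averages only readings whose tag checks out. It uses HMAC from Python's standard library as a deliberately simple stand-in: it is not a zero-knowledge proof and not IoTeX's API, and the device names, keys, and readings are invented.

```python
import hashlib
import hmac
import json


def attest(device_key: bytes, reading: dict) -> str:
    """Return a hex tag binding the reading to the device's secret key."""
    payload = json.dumps(reading, sort_keys=True).encode()
    return hmac.new(device_key, payload, hashlib.sha256).hexdigest()


def verify(device_key: bytes, reading: dict, tag: str) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(attest(device_key, reading), tag)


def mean_of_verified(keys, submissions):
    """Average the 'value' field over submissions whose tag checks out."""
    good = [reading["value"] for device, reading, tag in submissions
            if device in keys and verify(keys[device], reading, tag)]
    return sum(good) / len(good) if good else None


keys = {"sensor-1": b"key-one", "sensor-2": b"key-two"}  # hypothetical devices
r1 = {"value": 20.0, "unit": "C"}
r2 = {"value": 22.0, "unit": "C"}
tampered = {"value": 99.0, "unit": "C"}
submissions = [
    ("sensor-1", r1, attest(keys["sensor-1"], r1)),
    ("sensor-2", r2, attest(keys["sensor-2"], r2)),
    ("sensor-2", tampered, attest(keys["sensor-2"], r2)),  # tag does not match
    ("unknown", r1, "00"),                                 # unregistered device
]
print(mean_of_verified(keys, submissions))  # 21.0
```

The tampered reading and the unregistered device are both dropped, so the aggregate reflects only data that can be traced to a known device.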


The Endpoint: Data as a Financial Asset

According to Messari, the trajectory is clear: data is becoming a financial asset in every respect. Just as today one can invest in compute, GPUs, and colocation, in the future it will be possible to invest in “data streams,” purchase usage rights, and support networks that collect frontier data, receiving economic returns in exchange.

It’s an almost inevitable evolution: if data becomes scarce, valuable, and difficult to produce, it will then have a market, a price, demand, and supply.

Blockchain, once again, is the ideal layer for:

  • coordinating this economy,
  • verifying its integrity,
  • tracing provenance,
  • distributing compensation,
  • protecting users,
  • supporting global scalability.
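
The integrity and provenance points rest on one basic idea: each record commits to the hash of the one before it, so any later change is detectable. Below is a minimal hash-chain sketch of that idea using only the standard library. It is a toy, not a blockchain (there is no consensus or networking), and the record contents are invented.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, data):
    body = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(chain, data):
    """Append a record that commits to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "data": data,
                  "hash": _entry_hash(prev_hash, data)})


def is_intact(chain):
    """Recompute every hash; any edit to earlier data breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["data"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_entry(chain, {"contributor": "alice", "dataset": "kitchen-arm-001"})
append_entry(chain, {"contributor": "bob", "dataset": "street-cam-042"})
print(is_intact(chain))   # True

chain[0]["data"]["contributor"] = "mallory"  # rewrite history
print(is_intact(chain))   # False
```

This is what lets a buyer of a dataset check who contributed what, and in which order, without trusting whoever stores the log.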

Conclusion

AI will not advance through ever-larger models, but through richer data, sourced from the real world and collected via global networks of contributors. It is the greatest gold rush of the next decade: not that of chips, but that of data.

Web3 protocols are not a mere detail: they are the natural platform for collecting, verifying, distributing, and compensating those who provide this data. If the web was the raw material of the first AI wave, the real world will be the raw material of the second.

And this time, for the first time, the collection will not be controlled by a few giants, but by the networks.

Open, incentivized, decentralized networks: the new infrastructure of frontier data.

Source: https://en.cryptonomist.ch/2025/12/03/frontier-data-and-physical-ai-the-new-gold-rush-of-artificial-intelligence-and-why-blockchain-becomes-indispensable/
