Tether releases 41-billion-token dataset to democratize AI training

Tether is challenging the foundational advantage of Big Tech’s AI models with QVAC Genesis I, a colossal synthetic dataset built to train models in scientific reasoning and complex problem-solving.

Summary

  • Tether’s QVAC division launches Genesis I, a 41-billion-token synthetic dataset for STEM-focused AI training.
  • The initiative aims to challenge centralized control of AI intelligence held by Big Tech.
  • Release includes QVAC Workbench, enabling users to run AI models locally across devices.

According to a press release dated Oct. 24, Tether Data’s AI division, QVAC, has publicly launched the “Genesis I” dataset, a massive collection of 41 billion synthetically generated tokens.

The resource is specifically engineered and validated to bolster AI performance in rigorous STEM fields where open-source models typically lag, including mathematics, physics, and medicine. Tether CEO Paolo Ardoino framed the release as a move to break the corporate stranglehold on high-quality AI training data.

“Intelligence shouldn’t be centralized,” Ardoino said. “With QVAC Workbench and Genesis I, we’re opening the door to infinite intelligence, AI that lives, learns, and evolves locally on your own device. We believe that intelligence, like information, should be free, accessible, and owned by everyone, not locked behind corporate firewalls or sold as a service.”

Tether’s new model for AI access and control

The QVAC Genesis I dataset was built to address a specific weakness in today’s open-source AI landscape: a lack of deep, logical reasoning. Tether’s researchers employed a multi-stage generation and validation process, transforming high-quality scientific and educational content into structured learning data.

The result is a corpus of 41 billion tokens that teaches models to grasp the relationships and logic connecting concepts, moving beyond simple pattern recognition to foster genuine critical thinking. Ardoino emphasized this distinction, noting, “Most AI today sounds smart, but doesn’t truly think. We designed this dataset to help models understand cause and effect.”

Simultaneously, Tether is addressing where this intelligence operates with the release of QVAC Workbench, its first consumer application for local AI. The Workbench allows users to run a wide array of AI models, including popular open-source options like Llama, MedGemma, and Qwen, directly on their own devices.

Targeted at researchers and AI enthusiasts, the app is available now for Android, Windows, macOS, and Linux, with an iOS version expected shortly. Workbench includes a “Delegated Inference” feature, which enables a user’s mobile device to connect peer-to-peer with a more powerful desktop version, harnessing the full computational power of a home or office workstation from a smartphone.

The two releases form the core of Tether’s ambitious plan to reshape the AI industry. The strategy mirrors its foundational work in digital assets: creating open, peer-to-peer systems that reduce reliance on centralized intermediaries.

Source: https://crypto.news/tether-releases-41-billion-token-dataset-to-democratize-ai-training/