Synthetic Dataset for AI STEM

Tether Data has launched QVAC Genesis I, which it presents as the largest and most advanced synthetic dataset created to date for training language models focused on STEM disciplines.
The initiative, presented by Tether’s AI research division, QVAC, aims to democratize access to quality data for AI model training, challenging the centralization of large tech companies.
QVAC Genesis I: 41 billion tokens for a new generation of AI
The heart of the announcement is QVAC Genesis I, a monumental collection of 41 billion textual tokens. Each token represents a fragment of language, the raw material with which AI models learn to understand and generate text.
The dataset has been validated on educational and scientific benchmarks, and it stands out for the reasoning and problem-solving performance it supports in fields such as mathematics, physics, biology, and medicine.
Unlike existing public datasets, often lacking in STEM content, QVAC Genesis I offers comprehensive and validated coverage for scientific education.
It is the first synthetic dataset of this kind made publicly available, designed to support the construction of more intelligent, precise, and critically thinking language models.
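To make the notion of a token more concrete, here is a minimal sketch that splits a sentence into fragments and counts them. The function name `naive_tokenize` and the splitting rule are illustrative assumptions; this is not the tokenizer used for QVAC Genesis I, and production language models typically rely on subword tokenizers that yield different counts.

```python
import re


def naive_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation fragments.

    Illustrative assumption only: real tokenizers usually work on
    subword units, so their token counts will differ from this one.
    """
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


sample = "Force equals mass times acceleration: F = m * a."
tokens = naive_tokenize(sample)
print(tokens)
print(len(tokens))  # 12 fragments under this simple rule
```

Under this simple rule, a single short sentence already yields a dozen tokens, which gives a rough sense of scale for a corpus measured in tens of billions.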
A dataset designed for the community, not for corporations
Beyond the technical value, the release of QVAC Genesis I represents a stance on who should control the future of intelligence.
In a landscape dominated by a few giants that centralize the training and management of AI models, Tether Data aims to return power and autonomy to users and researchers.
The goal is to promote an open and community-driven intelligence, providing high-quality data for scientific research and innovation beyond the confines of the large platforms.
According to Paolo Ardoino, CEO of Tether,
“Intelligence should not be centralized. With QVAC Workbench and Genesis I, we open the door to infinite intelligence, which lives, learns, and evolves locally on its own device. Intelligence, like information, must be free, accessible, and owned by everyone, not locked behind corporate firewalls or sold as a service.”
QVAC Workbench: AI directly on the device
Alongside the dataset, Tether Data is also launching QVAC Workbench, the first consumer app that brings artificial intelligence directly to users’ devices.
It is a complete workspace for local AI, designed for enthusiasts, advanced users, and researchers. The app supports a wide range of language and AI models, including Llama, MedGemma, Qwen, SmolVLM, Whisper, and many others.
QVAC Workbench is already available for smartphones (initially on Android, soon also on iOS) and for desktop (Windows, macOS, Linux), offering broader on-device compatibility than current solutions.
All interactions and chats with AI models stay local and private, and the data remains the exclusive property of the user. An innovative feature called “Delegated Inference” also lets the mobile app connect peer-to-peer to the desktop app, drawing on the computing power of home or business workstations.
A new paradigm: local and decentralized intelligence
The approach of Tether Data and QVAC is based on a vision of decentralized and adaptive AI, which lives and learns on any device, returning control and autonomy to individuals and communities.
The mission of QVAC is clear: “Local AI. Infinite Intelligence. No Compromise.” Intelligence should not be the prerogative of institutions; it should return to the hands of the people, ensuring the freedom to build, learn, and share.
The QVAC Genesis I dataset was created through a multi-stage generation and validation process, transforming high-quality scientific and educational materials into structured learning data.
The result is a training resource that helps models to reason, solve problems, and think critically, surpassing mere linguistic imitation.
“Many AI models today seem intelligent, but they don’t really think,” emphasizes Ardoino. “We designed this dataset to help models understand cause and effect, make connections, draw conclusions, and reason within complexity. And we make it open to everyone.”
An invitation to the scientific community and developers
By making QVAC Genesis I public, Tether Data invites researchers and developers to build and use models capable of competing with, and even surpassing, proprietary systems.
The complete technical documentation of the dataset is available on the dedicated research blog, offering transparency and tools for anyone who wants to contribute to the evolution of artificial intelligence.
The QVAC Workbench apps are downloadable from the official site, ready to be tested and adopted by anyone wishing to experience the power of local AI.
Tether Data: innovation, privacy, and decentralization
Tether Data fits into the broader vision of Tether, aimed at promoting freedom, transparency, and innovation through technology.
The company’s mission is to enable individuals and organizations to connect and share information directly, without unnecessary intermediaries.
Thanks to secure and peer-to-peer systems, Tether Data offers users greater control over data, communications, and digital interactions, redefining information flows with decentralized infrastructures designed for privacy, efficiency, and resilience.
QVAC: towards a future of distributed intelligence
QVAC represents the forefront of Tether Data’s AI research, engaged in building open, decentralized, and adaptive intelligence systems.
The goal is a world where AI lives and learns on every device, empowering individuals and communities instead of concentrating power in corporate data centers.
With the release of QVAC Genesis I and QVAC Workbench, Tether Data paves the way for a new era of free, accessible artificial intelligence that is truly in the hands of everyone: a revolution that promises to redefine the relationship between technology, knowledge, and society.