Basecamp Research Launches Trillion Gene Atlas to Revolutionize AI Drug Discovery

Michael Ouroumis, 2 min read
Basecamp Research announced the launch of the Trillion Gene Atlas this week at both the SXSW Health Track and NVIDIA's GTC 2026 conference in San Jose. The initiative aims to expand humanity's map of evolutionary genetic diversity by a factor of 100 and to use that data to train the next generation of AI models for drug discovery.

The Scale of the Ambition

The Atlas will collect novel genomic data from more than 100 million new species across thousands of sampling sites worldwide. The goal is to provide the vast, diverse training data that AI systems need to learn from billions of years of evolution and, ultimately, design new medicines on demand.

Basecamp Research estimates that processing the quadrillions of DNA base pairs involved would have taken more than 20 years using conventional methods. With the Atlas infrastructure, the company expects to compress that timeline to under two years.

A Powerhouse Partnership

The initiative brings together an unusual coalition. Anthropic, the AI safety company behind the Claude model family, is contributing AI capabilities. Ultima Genomics and PacBio are providing next-generation sequencing technology — with PacBio's HiFi sequencing selected specifically for its accuracy on long-read genomic data. NVIDIA's AI infrastructure, including the Parabricks genomic analysis toolkit, delivers a reported 10x speedup in data processing.

Building on EDEN

The Trillion Gene Atlas builds on Basecamp Research's EDEN foundation models, which launched earlier this year after training on more than 10 billion novel genes collected from over one million previously uncharacterized species. The Atlas represents a 100x expansion of the underlying dataset, which would take it from roughly 10 billion genes toward the trillion in the Atlas's name. The company argues that this scale is necessary for AI models to capture the full breadth of nature's molecular toolkit.

Global Scientific Network

Alongside the Atlas launch, Basecamp Research announced new biodiversity partnerships in Chile and Argentina, as well as an expanded collaboration in Antarctica. These additions extend the company's global network of scientific collaborators to 31 countries, with the aim that the genomic data captured reflects the planet's biological diversity rather than only well-studied ecosystems.

Why It Matters for Drug Discovery

Traditional drug discovery relies on screening known compounds against known targets — a process that is expensive, slow, and limited by the chemical space researchers have explored. Foundation models trained on evolutionary data offer a fundamentally different approach: they can predict protein structures, suggest novel molecular candidates, and identify therapeutic targets that conventional methods might miss entirely.

By dramatically expanding the training data available to these models, the Trillion Gene Atlas could unlock new classes of treatments for diseases where current drug pipelines have stalled. The initiative also raises the bar for what constitutes a competitive biological AI dataset, putting pressure on rivals to match Basecamp's data breadth or risk falling behind in the race to build effective biomedical foundation models.
