Basecamp Research Launches Trillion Gene Atlas to Revolutionize AI Drug Discovery

Michael Ouroumis · 2 min read
Basecamp Research announced the Trillion Gene Atlas this week at both the SXSW Health Track and NVIDIA's GTC 2026 conference in San Jose. The initiative aims to expand humanity's map of evolutionary genetic diversity by a factor of 100 and to use that data to train the next generation of AI models for drug discovery.

The Scale of the Ambition

The Atlas will collect novel genomic data from more than 100 million new species across thousands of sampling sites worldwide. The goal is to provide the vast, diverse training data that AI systems need to learn from billions of years of evolution and, ultimately, design new medicines on demand.

Basecamp Research estimates that processing the quadrillions of DNA base pairs involved would have taken more than 20 years using conventional methods. With the Atlas infrastructure, the company expects to compress that timeline to under two years.

A Powerhouse Partnership

The initiative brings together an unusual coalition. Anthropic, the AI safety company behind the Claude model family, is contributing AI capabilities. Ultima Genomics and PacBio are providing next-generation sequencing technology — with PacBio's HiFi sequencing selected specifically for its accuracy on long-read genomic data. NVIDIA's AI infrastructure, including the Parabricks genomic analysis toolkit, delivers a reported 10x speedup in data processing.

Building on EDEN

The Trillion Gene Atlas builds on Basecamp Research's EDEN foundation models, which launched earlier this year after training on more than 10 billion novel genes collected from over one million previously uncharacterized species. The Atlas represents a 100x expansion of the underlying dataset, which would take it from roughly 10 billion genes to on the order of one trillion. The company argues that this scale is necessary for AI models to capture the full breadth of nature's molecular toolkit.

Global Scientific Network

Alongside the Atlas launch, Basecamp Research announced new biodiversity partnerships in Chile and Argentina, as well as an expanded collaboration in Antarctica. These additions extend the company's global network of scientific collaborators to 31 countries, with the aim of capturing genomic data that reflects the planet's biological diversity rather than only its well-studied ecosystems.

Why It Matters for Drug Discovery

Traditional drug discovery relies on screening known compounds against known targets — a process that is expensive, slow, and limited by the chemical space researchers have explored. Foundation models trained on evolutionary data offer a fundamentally different approach: they can predict protein structures, suggest novel molecular candidates, and identify therapeutic targets that conventional methods might miss entirely.

By dramatically expanding the training data available to these models, the Trillion Gene Atlas could unlock new classes of treatments for diseases where current drug pipelines have stalled. The initiative also raises the bar for what constitutes a competitive biological AI dataset, putting pressure on rivals to match Basecamp's data breadth or risk falling behind in the race to build effective biomedical foundation models.
