Databricks has acquired Tabular, the commercial company behind the Apache Iceberg open table format, for approximately $2 billion. The deal brings the most widely adopted open lakehouse format under Databricks' roof and signals the company's intent to own the full stack from data storage to AI model training.
Why Iceberg Matters
Apache Iceberg has largely won the data lakehouse format war. Over the past two years, AWS, Google Cloud, Snowflake, and dozens of other platforms have adopted it as the standard way to store and manage large-scale analytical data. Iceberg brings warehouse-like features to object storage such as S3: ACID transactions, schema evolution, time travel queries, and efficient partition pruning.
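For readers unfamiliar with how a table format can offer time travel on plain object storage, the idea can be sketched in a few lines of Python. This is a toy model written for illustration only, not Iceberg's actual implementation or API: the names `ToyTable`, `Snapshot`, `commit`, and `scan` are hypothetical. It assumes only the general design in which every commit records a new immutable snapshot listing the data files visible at that point, so a reader can query the latest snapshot or any earlier one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Immutable record of which data files a table contained at one commit."""
    snapshot_id: int
    files: tuple


@dataclass
class ToyTable:
    """Hypothetical stand-in for a snapshot-based table (not the Iceberg API)."""
    snapshots: list = field(default_factory=list)

    def commit(self, add=(), remove=()):
        # Build the new file list from the current snapshot, then append it
        # as a new snapshot. Older snapshots are never modified.
        current = self.snapshots[-1].files if self.snapshots else ()
        new_files = tuple(f for f in current if f not in remove) + tuple(add)
        snapshot = Snapshot(len(self.snapshots) + 1, new_files)
        self.snapshots.append(snapshot)
        return snapshot.snapshot_id

    def scan(self, snapshot_id=None):
        # With no snapshot_id, read the latest state; otherwise "time travel"
        # to the requested snapshot.
        if not self.snapshots:
            return []
        if snapshot_id is None:
            return list(self.snapshots[-1].files)
        for snapshot in self.snapshots:
            if snapshot.snapshot_id == snapshot_id:
                return list(snapshot.files)
        raise KeyError(f"unknown snapshot {snapshot_id}")


table = ToyTable()
first = table.commit(add=["day1.parquet"])
second = table.commit(add=["day2.parquet"], remove=["day1.parquet"])

latest_files = table.scan()          # state after the second commit
earlier_files = table.scan(first)    # state as of the first commit
```

Because each commit only appends a new snapshot, a reader always sees either the old state or the new one, never a half-applied change. That property is what the simplified model shares with the transactional guarantees described above.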
Tabular was founded by Ryan Blue and Dan Weeks, who created Iceberg while at Netflix, together with Jason Reid. The company built a managed Iceberg service that simplifies catalog management, access control, and cross-engine compatibility. It had raised $62 million and had approximately 200 enterprise customers.
Strategic Logic
For Databricks, the acquisition fills a critical gap. The company's own Delta Lake format has competed with Iceberg for years, and while both formats have strong adoption, Iceberg has become the industry-preferred open standard. Rather than continue fighting a format war, Databricks is embracing Iceberg.
"The format debate is over," said Ali Ghodsi, Databricks CEO. "Iceberg won the open format standard. We're going all-in on making Databricks the best platform for Iceberg data."
Databricks announced that its platform will support Iceberg as a first-class citizen alongside Delta Lake. Existing Delta Lake users will have a migration path to Iceberg, and new customers will be able to choose either format. The company expects most new deployments to use Iceberg.
AI Integration
The deeper strategic play is about AI. Training large language models and building AI applications requires access to large, well-organized datasets. Iceberg's ability to manage petabyte-scale data with efficient versioning and access patterns makes it an ideal foundation for AI data pipelines.
Databricks plans to integrate Tabular's catalog technology with its Mosaic AI platform, allowing data scientists to point AI training jobs directly at Iceberg tables without complex ETL pipelines. The goal is a workflow in which data engineers prepare data in Iceberg and AI engineers consume it for model training, all within the same platform.
Snowflake's Response
The acquisition puts pressure on Snowflake, which had partnered with Tabular and adopted Iceberg as a supported format. Snowflake now faces the prospect of relying on Iceberg infrastructure controlled by its primary competitor. Snowflake said in a statement that it remains committed to Iceberg and that the open-source nature of the project protects its independence.
Open Source Concerns
The data engineering community has raised concerns about whether Databricks will continue to invest in Iceberg as a truly open project. Databricks preemptively addressed this, committing to maintaining Iceberg as an Apache Software Foundation project with open governance. The company pledged to increase its contributions to the project and not to create proprietary extensions that fragment the standard.
Whether this commitment holds will be tested over time. The history of corporate stewardship of open-source projects is mixed, and the community will be watching closely.



