Back to stories
Industry

Fireworks AI in Talks to Raise at $15B — 4x Its October Price in Seven Months

Michael Ouroumis2 min read
Fireworks AI in Talks to Raise at $15B — 4x Its October Price in Seven Months

Fireworks AI is in talks to raise a new funding round that would value the inference startup at roughly $15 billion, according to a Bloomberg report dated May 27 — nearly four times the $4 billion valuation it carried just seven months ago. Existing backer Index Ventures is set to co-lead the round, which has not closed and whose terms could still change.

4x in seven months

The jump is steep even by 2026 standards. Fireworks set that $4 billion mark in its October 2025 Series C, a $250 million raise co-led by Lightspeed Venture Partners, Index Ventures and Evantic Capital, with Sequoia Capital participating. Its July 2024 Series B valued the company at just $552 million. A $15 billion price would mark roughly a 27x step-up in under two years.

The revenue trajectory is doing the talking. Research firm Sacra estimates Fireworks hit about $315 million in annualized revenue in February 2026, up 416% year over year — the kind of curve that explains why an investor already on the cap table would lead the next round rather than sit it out.

What the company sells

Founded in 2022 and based in Redwood City, Fireworks was started by CEO Lin Qiao, a former Meta engineer. Its pitch is narrow and increasingly valuable: run inference for open-weight LLMs and generative models faster and cheaper than teams can self-host. The platform exposes OpenAI-compatible endpoints, offers both serverless and dedicated-GPU deployments, and layers optimizations on serving stacks like vLLM, SGLang and TensorRT. Fireworks reportedly processes around 15 trillion tokens per day.

That positions it squarely in the serverless-inference tier alongside Together AI, DeepInfra and Replicate, with Baseten — fresh off a $300 million Series E at a $5 billion valuation in January 2026 — pushing hardest on the enterprise engineering angle. Custom-silicon vendors Groq, Cerebras and SambaNova attack the same workloads from the hardware side, competing on raw throughput.

Why inference is the trade

The round is a bet that serving models, not training them, is where durable margin sits. As frontier-lab capex balloons into the hundreds of billions, the practical question for enterprises is who runs their open-weight checkpoints at production scale without per-token costs spiraling. NVIDIA notes leading providers — Baseten, DeepInfra, Fireworks and Together among them — are cutting cost per token by up to 10x on Blackwell-class hardware, which is exactly what makes open-weight deployment viable against closed-API incumbents.

What it means for builders

For teams choosing an inference layer, a $15 billion Fireworks is a vendor with the balance sheet to commit GPU capacity and hold pricing — but also one under pressure to monetize. The recurring reasons teams migrate off any single provider remain the same: per-token cost at scale, dedicated-GPU control, model-catalog breadth for newer checkpoints, and fine-tune portability. The smart play is to keep deployments on OpenAI-compatible endpoints so the inference layer stays swappable, no matter whose valuation is climbing this quarter.

Learn AI for Free — FreeAcademy.ai

Take "AI for Business: Practical Implementation" — a free course with certificate to master the skills behind this story.

More in Industry

Robinhood Opens Its Brokerage to AI Agents Over MCP, Adds 3% Cash-Back Agent Card
Industry

Robinhood Opens Its Brokerage to AI Agents Over MCP, Adds 3% Cash-Back Agent Card

Robinhood launched Agentic Trading, an MCP-based product that connects third-party AI agents like Claude and Cursor to a ring-fenced brokerage account, plus a virtual agentic credit card paying 3% cash back on agent purchases.

1 hours ago2 min read
Ex-Palantir AIP Engineers Raise $12M for Perceptic, an AI Operating System for Drug Development
Industry

Ex-Palantir AIP Engineers Raise $12M for Perceptic, an AI Operating System for Drug Development

Perceptic exited stealth with a $12M seed led by Accel, building a model-agnostic intelligence layer that wires AI tools into pharma's siloed asset, evidence, and clinical-trial data. CSL is its first named customer.

3 hours ago2 min read
Nvidia to Pour $150 Billion a Year Into Taiwan as Vera Rubin Ramp Strains the Supply Chain
Industry

Nvidia to Pour $150 Billion a Year Into Taiwan as Vera Rubin Ramp Strains the Supply Chain

Jensen Huang says Nvidia is spending $100 billion a year in Taiwan and will lift that to $150 billion annually, anchored by a new 4,000-person Taipei campus and a Vera Rubin production push with TSMC.

4 hours ago2 min read