Alibaba Reveals It Built 'Happy Horse,' the Mystery AI Video Model That Topped Global Rankings

Michael Ouroumis · 2 min read
Alibaba Group has officially claimed ownership of HappyHorse-1.0, a video AI model that first appeared anonymously on the Artificial Analysis Video Arena and quickly rose to the top of global leaderboards — surprising the AI community and sending Alibaba shares up over 2% on the news.

The reveal ends weeks of speculation about who was behind the "mystery model" that dethroned ByteDance's Seedance 2 and other well-funded competitors in blind user voting.

From Anonymous Submission to Global #1

HappyHorse-1.0 debuted without any company branding on the Artificial Analysis Video Arena, where users evaluate AI-generated videos side by side without knowing which model produced them. It reached #1 globally in both Text-to-Video (Elo 1333) and Image-to-Video (Elo 1392) categories, an unprecedented achievement for an anonymous submission.
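To give a sense of what an Elo gap means in blind voting, here is a minimal sketch of the standard Elo expected-score formula. The 400-point logistic scale is the conventional chess setting and is an assumption here, since the article does not say which scale the Arena uses. The rival rating of 1300 is purely hypothetical and not a figure from the leaderboard.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score of A against B.

    `scale` = 400 is the conventional value; the Arena's actual
    setting is not stated in the article (assumption).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


HAPPYHORSE_T2V = 1333      # Text-to-Video Elo reported in the article
HYPOTHETICAL_RIVAL = 1300  # illustrative value only, not from the leaderboard

p = expected_win_rate(HAPPYHORSE_T2V, HYPOTHETICAL_RIVAL)
print(f"Expected share of head-to-head votes: {p:.1%}")
```

Under these assumptions, a lead of a few dozen points corresponds to winning only modestly more than half of side-by-side comparisons, so leaderboard positions can be close even when the ranking looks decisive.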

The model was developed by a team formerly operating under Alibaba's Taotian Group Future Life Laboratory, led by Zhang Di — the former Vice President of Kuaishou and technical architect behind Kling AI, one of China's most prominent video generation platforms.

What Makes HappyHorse Different

At 15 billion parameters, HappyHorse-1.0 is notably compact compared to many frontier models, yet it introduces several technical firsts. It is reportedly one of the first open-weight models that natively generates synchronized dialogue, ambient sounds, and effects alongside video — eliminating the need for separate audio pipelines.

Key capabilities include:

Architecture

The model is built on a 40-layer self-attention Transformer that breaks from the popular DiT (Diffusion Transformer) approach. Rather than using cross-attention for text conditioning, HappyHorse places text, image, video, and audio tokens into a single unified sequence, with attention handling all modality fusion natively. The first and last four layers manage modality-specific embedding and decoding, while the middle 32 layers share parameters across all modalities.
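The layout described above can be sketched in a few lines of plain Python. This is an illustration of the reported structure only, not Alibaba's implementation: the `Token` type, the function names, and the per-modality token counts are all hypothetical, and no actual attention computation is performed.

```python
from dataclasses import dataclass

NUM_LAYERS = 40    # total depth reported in the article
EDGE_LAYERS = 4    # modality-specific layers at each end, per the article
MODALITIES = ("text", "image", "video", "audio")


@dataclass(frozen=True)
class Token:
    modality: str  # which modality produced this token
    index: int     # position within its own modality stream


def layer_plan(num_layers: int = NUM_LAYERS, edge: int = EDGE_LAYERS) -> list[str]:
    """Label each layer as modality-specific (first/last `edge`) or shared."""
    return [
        "modality_specific" if i < edge or i >= num_layers - edge else "shared"
        for i in range(num_layers)
    ]


def unified_sequence(lengths: dict[str, int]) -> list[Token]:
    """Concatenate per-modality token streams into one sequence.

    With a single sequence, plain self-attention can mix modalities,
    so no separate cross-attention path for text is needed.
    `lengths` holds hypothetical token counts per modality.
    """
    return [
        Token(modality, i)
        for modality in MODALITIES
        for i in range(lengths.get(modality, 0))
    ]


plan = layer_plan()
seq = unified_sequence({"text": 3, "image": 2, "video": 4, "audio": 2})
print(plan.count("modality_specific"), plan.count("shared"))  # 8 32
print(len(seq), [t.modality for t in seq[:4]])
```

The point of the sketch is the split: eight edge layers handle per-modality embedding and decoding, while the 32 middle layers see one mixed sequence and apply the same parameters to every token regardless of its modality.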

Open Source and Commercially Licensed

In a move that distinguishes it from many competitors, Alibaba has announced that HappyHorse-1.0 will be fully open source under a license that permits commercial use. Model weights, distilled variants, super-resolution modules, and inference code are expected to be released on GitHub, though as of early April 2026 the weights have not yet been made publicly available. The API is reportedly scheduled to open for access on April 30.

Implications for the AI Video Market

The release intensifies an already crowded AI video generation landscape where OpenAI's Sora has struggled commercially, Google's Veo continues to iterate, and Chinese labs like ByteDance and Kuaishou have been rapidly gaining ground. HappyHorse's combination of top-tier quality, open weights, and efficient inference could reshape pricing expectations across the sector — particularly for startups building on proprietary video APIs that now face a powerful open-source alternative.
