Alibaba Group has officially claimed ownership of HappyHorse-1.0, a video AI model that first appeared anonymously on the Artificial Analysis Video Arena and quickly rose to the top of global leaderboards — surprising the AI community and sending Alibaba shares up over 2% on the news.
The reveal ends weeks of speculation about who was behind the "mystery model" that dethroned ByteDance's Seedance 2 and other well-funded competitors in blind user voting.
From Anonymous Submission to Global #1
HappyHorse-1.0 debuted without any company branding on the Artificial Analysis Video Arena, where users evaluate AI-generated videos side by side without knowing which model produced them. It reached #1 globally in both Text-to-Video (Elo 1333) and Image-to-Video (Elo 1392) categories, an unprecedented achievement for an anonymous submission.
The model was developed by a team formerly operating under Alibaba's Taotian Group Future Life Laboratory, led by Zhang Di — the former Vice President of Kuaishou and technical architect behind Kling AI, one of China's most prominent video generation platforms.
What Makes HappyHorse Different
At 15 billion parameters, HappyHorse-1.0 is notably compact compared to many frontier models, yet it introduces several technical firsts. It is reportedly one of the first open-weight models that natively generates synchronized dialogue, ambient sounds, and effects alongside video — eliminating the need for separate audio pipelines.
Key capabilities include:
- 1080p cinematic output generated in approximately 38 seconds on a single H100 GPU
- Native lip-sync across seven languages: Mandarin, Cantonese, English, Japanese, Korean, German, and French
- 8-step denoising inference requiring no classifier-free guidance (CFG), significantly reducing compute costs
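The CFG claim matters because standard classifier-free guidance runs two model forward passes per denoising step (conditional and unconditional) and mixes the results; a model that needs no CFG runs one pass per step, roughly halving per-step compute. The toy sketch below illustrates the difference; the denoiser and all names are illustrative stand-ins, not HappyHorse code.

```python
import numpy as np

NUM_STEPS = 8  # matches the reported 8-step schedule

def toy_denoiser(x, t, cond):
    """Stand-in for the video model: predicts a cleaner sample.
    A real model would be a large neural network."""
    return x + 0.5 * (cond - x)  # pull the sample toward the conditioning

def sample_no_cfg(shape, cond, rng):
    """CFG-free sampling: ONE forward pass per step."""
    x = rng.standard_normal(shape)              # start from pure noise
    for t in np.linspace(1.0, 0.0, NUM_STEPS):
        x = toy_denoiser(x, t, cond)            # single pass, no guidance mix
    return x

def sample_with_cfg(shape, cond, rng, scale=5.0):
    """Classic CFG for comparison: TWO forward passes per step."""
    x = rng.standard_normal(shape)
    null = np.zeros_like(cond)                  # "unconditional" input
    for t in np.linspace(1.0, 0.0, NUM_STEPS):
        pred_c = toy_denoiser(x, t, cond)       # conditional pass
        pred_u = toy_denoiser(x, t, null)       # unconditional pass
        x = pred_u + scale * (pred_c - pred_u)  # guidance mix
    return x

rng = np.random.default_rng(0)
cond = np.ones((4, 4))
out = sample_no_cfg((4, 4), cond, rng)
print(out.shape)  # (4, 4)
```

Counting `toy_denoiser` calls makes the cost difference concrete: 8 for the CFG-free loop versus 16 with guidance, before any savings from the small step count itself.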
Architecture
The model is built on a 40-layer Transformer that departs from the cross-attention conditioning common in DiT-style (Diffusion Transformer) video models. Rather than attending from video tokens to a separate text stream, HappyHorse places text, image, video, and audio tokens into a single unified sequence, with ordinary self-attention handling all modality fusion natively. The first and last four layers manage modality-specific embedding and decoding, while the middle 32 layers share parameters across all modalities.
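The reported layout can be sketched abstractly. In the toy code below, the dimensions, layer counts, and random weights are arbitrary stand-ins for the described 4 + 32 + 4 = 40-layer model, not the actual architecture; it only shows the key idea of embedding each modality into a shared width, concatenating everything into one sequence, and letting plain self-attention do the fusion.

```python
import numpy as np

D = 16                          # shared model width (toy-sized)
rng = np.random.default_rng(0)

def linear(d_in, d_out):
    return rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)

Wq, Wk, Wv = linear(D, D), linear(D, D), linear(D, D)

def self_attention(x):
    """Minimal single-head self-attention over one unified sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(D)
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return x + attn @ v                            # residual connection

# Modality-specific input embedders (stand-in for the "first four layers"):
embed = {m: linear(dim, D) for m, dim in
         [("text", 8), ("image", 12), ("video", 12), ("audio", 6)]}

# Toy token batches per modality: (num_tokens, raw_dim)
tokens = {"text": rng.standard_normal((5, 8)),
          "image": rng.standard_normal((4, 12)),
          "video": rng.standard_normal((9, 12)),
          "audio": rng.standard_normal((3, 6))}

# 1) Embed each modality into the shared width, 2) concatenate into ONE
# sequence, 3) run shared self-attention layers over all tokens at once,
# so audio and video tokens attend to each other and to the text prompt.
seq = np.concatenate([tokens[m] @ embed[m] for m in tokens], axis=0)
for _ in range(4):              # stand-in for the 32 shared middle layers
    seq = self_attention(seq)

print(seq.shape)  # (21, 16): 5 + 4 + 9 + 3 tokens, fused jointly
```

Because every token sits in the same sequence, cross-modal alignment such as lip-sync falls out of ordinary attention rather than requiring a separate cross-attention path or audio pipeline.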
Open Source and Commercially Licensed
In a move that distinguishes it from many competitors, Alibaba has announced that HappyHorse-1.0 will be fully open source with complete commercial licensing. Model weights, distilled variants, super-resolution modules, and inference code are expected to be released on GitHub, though as of early April 2026 the weights have not yet been made publicly available. The API is reportedly scheduled to open for access on April 30.
Implications for the AI Video Market
The release intensifies an already crowded AI video generation landscape where OpenAI's Sora has struggled commercially, Google's Veo continues to iterate, and Chinese labs like ByteDance and Kuaishou have been rapidly gaining ground. HappyHorse's combination of top-tier quality, open weights, and efficient inference could reshape pricing expectations across the sector — particularly for startups building on proprietary video APIs that now face a powerful open-source alternative.