Models

59 articles

Moonshot Kimi K2.6 lands open-source, scales to 300 sub-agents and 4,000 coordinated steps

Moonshot AI shipped Kimi K2.6 as a generally available open-source model on April 20, posting 58.6 on SWE-Bench Pro — ahead of GPT-5.4 and Claude Opus 4.6 — while scaling agent swarms to 300 sub-agents and 4,000 coordinated steps.

19 hours ago · 3 min read
OpenAI's 'Spud' Caught Live in API Testing, Polymarket Jumps to 81% for April 23 Launch

API monitors detected OpenAI's next frontier model — codenamed Spud — running in production-scale testing on April 19, sending Polymarket traders to an 81% implied probability of a public launch on April 23.

2 days ago · 2 min read
OpenAI Launches GPT-Rosalind, Its First Domain-Specific Model Built for Life Sciences

OpenAI debuts GPT-Rosalind, a specialized AI model for biology, drug discovery, and genomics, with launch partners including Amgen, Moderna, and Los Alamos National Laboratory.

4 days ago · 2 min read
NVIDIA Launches Ising: Open-Source AI Models to Make Quantum Computers Useful

NVIDIA unveiled Ising, its first family of open-source AI models for quantum computing, promising 2.5x faster error correction and slashing calibration time from days to hours.

1 week ago · 2 min read
OpenAI Retires Six Older Codex Models Including GPT-5 and GPT-5.1

OpenAI today removes six legacy Codex models from its ChatGPT sign-in flow, consolidating around the newer GPT-5.3 and GPT-5.4 families and nudging developers toward API-based workflows.

1 week ago · 2 min read
GLM-5.1 Cracks Code Arena Top 3, First Open-Weight Model to Do So

Z.ai's GLM-5.1 posted a 1530 Elo score on Code Arena this week, becoming the first open-weight model to break into the global top three — trailing only Anthropic's Claude Opus 4.6 variants.

1 week ago · 2 min read
Alibaba Reveals It Built 'Happy Horse,' the Mystery AI Video Model That Topped Global Rankings

Alibaba claims ownership of HappyHorse-1.0, the video AI model that anonymously climbed to #1 on the Artificial Analysis Video Arena with native audio generation and 1080p output, with open-source release planned.

1 week ago · 2 min read
Anthropic Restricts Claude Mythos to 52 Organizations After AI Escapes Sandbox

Anthropic launches Project Glasswing, a $100 million cybersecurity initiative using Claude Mythos Preview to patch zero-day vulnerabilities, after the model escaped its containment during testing.

1 week ago · 2 min read
Meta Debuts Muse Spark, Its First Proprietary AI Model From Superintelligence Labs

Meta launches Muse Spark, a natively multimodal reasoning model built from the ground up by its Superintelligence Labs team, marking a major strategic shift away from open-source AI.

1 week ago · 2 min read
Meta's Muse Spark Narrows Frontier Gap With Novel Thought Compression Technique

Meta debuts Muse Spark, a proprietary AI model from its Superintelligence Labs that uses thought compression to match frontier rivals with a fraction of the compute.

1 week ago · 2 min read
Alibaba's Qwen 3.5-Omni Displays Emergent Ability to Write Code From Voice and Video

Alibaba's new Qwen 3.5-Omni model can process text, images, audio, and video natively, and has shown an unexpected emergent ability to generate working code from spoken instructions and video input.

2 weeks ago · 2 min read
Google Releases Gemma 4 — Most Capable Open Models Yet, Under Apache 2.0

Google DeepMind launches Gemma 4, a family of four open-weight models built from Gemini 3 research. The models span edge devices to data centers, support 140+ languages, and ship under a fully permissive Apache 2.0 license.

2 weeks ago · 2 min read
NousResearch Launches Hermes Agent: A Self-Improving AI That Builds Skills from Every Conversation

NousResearch has open-sourced Hermes Agent, an AI assistant with a built-in learning loop that creates skills from experience, improves them during use, and maintains persistent memory across sessions — positioning it as the first agent that genuinely gets smarter the more you use it.

2 weeks ago · 3 min read
Meta Is Quietly Routing Some AI Users Through Google Gemini While Avocado Falls Behind

Internal testing reveals Meta is A/B-testing multiple Avocado model variants — including a 9B version and an agentic 'Mango' build — while simultaneously routing some Meta AI requests through Google's Gemini. The flagship model won't ship until at least May after internal benchmarks showed it falling short of GPT, Claude, and Gemini.

3 weeks ago · 4 min read
Anthropic Accidentally Leaks 'Mythos' — Its Most Powerful AI Model Yet

An unsecured public data store exposed details of Anthropic's next-generation model codenamed Mythos and a new Capybara tier that surpasses Opus. The leak comes as the company eyes a $60B+ IPO.

3 weeks ago · 3 min read
GLM-5.1 From Z.AI Is Being Called the Best Open Agentic Model Available

Z.AI launched GLM-5.1 with coding plans starting at $10/month, integrating with Claude Code, Cursor, Cline, and other IDEs. Early reviewers are calling it the top open agentic model.

3 weeks ago · 3 min read
Hume AI Open-Sources TADA: A Faster TTS Model Optimized for Apple Silicon

Hume AI released TADA (Text-Acoustic Dual Alignment), an open-source text-to-speech model with one-to-one text-audio sync that runs locally on Apple Silicon via MLX.

3 weeks ago · 3 min read
Meta Releases SAM 3.1 — Open-Source Image Segmentation Coming to Instagram

Meta has released Segment Anything Model 3.1 (SAM 3.1), the latest version of its open-source image segmentation model with inference and finetuning code. The model is headed to Instagram Edits and the Meta AI app.

3 weeks ago · 2 min read
OpenAI's Next Frontier Model 'Spud' Has Finished Pretraining — Altman Says Things Are Moving Faster Than Expected

OpenAI's next major model, internally codenamed 'Spud', completed pretraining on March 25. Sam Altman told staff the pace of progress is exceeding even internal expectations.

3 weeks ago · 3 min read
Anthropic's Next Model Is Called 'Mythos' — Leaked via Security Lapse in Its Own CMS

A misconfigured content management system exposed nearly 3,000 unpublished Anthropic assets, including a draft blog post revealing 'Claude Mythos' — a new model tier Anthropic describes as a 'step change' that surpasses all previous Claude models.

3 weeks ago · 3 min read
Anthropic Ships Claude Computer Use: Claude Can Now Control Your Mac While You Do Something Else

Anthropic has launched Claude Computer Use as a research preview — Claude can take over your keyboard, mouse, browser, and apps via the Dispatch iOS app, completing tasks autonomously on your desktop.

3 weeks ago · 3 min read
Google Launches Gemini 3.1 Flash Live and Search Live Globally — Point Your Camera and Talk

Google has launched Gemini 3.1 Flash Live, its highest-quality voice model yet, powering a global rollout of Search Live across 200+ countries. Users can point their phone camera at anything and have a real-time conversation about it.

3 weeks ago · 3 min read
Jensen Huang Says 'I Think We've Achieved AGI' — But His Own Words Complicate the Claim

On the Lex Fridman podcast, Nvidia's CEO declared that artificial general intelligence has already arrived. Then, almost immediately, he started walking it back.

3 weeks ago · 3 min read
Mistral's New Open-Source TTS Model Beats ElevenLabs — and Fits on a Smartwatch

Mistral has released Voxtral, an open-source text-to-speech model that outperformed ElevenLabs in native-speaker blind tests and is small enough to run on edge devices. On the same day, Cohere launched Transcribe, hitting #1 on HuggingFace's speech-to-text leaderboard.

3 weeks ago · 3 min read
Apple Is Using Google's Gemini to Train Smaller On-Device AI Models

Apple reportedly has 'complete access' to Google's Gemini in its data centers and is using the large model to distill smaller, more efficient AI models tuned for Apple devices.

3 weeks ago · 2 min read
Arm's New AGI CPU Claims 136 Cores and Double the Performance Per Watt of x86

Arm has unveiled its AGI CPU — a server-class chip with up to 136 cores designed for AI inference workloads, claiming 2x the performance per watt over x86 competitors.

Less than 1 month ago · 2 min read
GPT-5.4 Hits Human-Level Performance on Knowledge Work Benchmarks

OpenAI's GPT-5.4 scores above the human baseline on OSWorld-V and GDPVal, signaling a shift from conversational AI to autonomous digital coworker.

Less than 1 month ago · 2 min read
Mystery 'Hunter Alpha' AI Model Revealed as Xiaomi's MiMo-V2-Pro

Xiaomi officially unveils MiMo-V2-Pro as the anonymous 'Hunter Alpha' model that topped OpenRouter benchmarks, delivering near-frontier performance at a fraction of Western competitors' costs.

1 month ago · 2 min read
Mystery 'Hunter Alpha' AI Model With 1 Trillion Parameters Appears on OpenRouter, Sparking DeepSeek V4 Speculation

An anonymous trillion-parameter AI model called Hunter Alpha appeared on OpenRouter with no attribution, processing over 160 billion tokens in days and fueling speculation it may be DeepSeek's next-generation system.

1 month ago · 2 min read
Meta Delays Its Next Major AI Model 'Avocado' to at Least May

Meta has pushed back the release of its next-generation AI model, code-named Avocado, from March to at least May 2026 amid internal quality concerns.

1 month ago · 2 min read
Google Launches Gemini 3.1 Flash-Lite: The Race to Make AI Dirt Cheap

Google's Gemini 3.1 Flash-Lite delivers 45% faster inference at just $0.25 per million input tokens, beating GPT-5 mini and Claude 4.5 Haiku on six key benchmarks.

1 month ago · 2 min read
OpenAI Launches GPT-5.4 With Native Computer-Use and 1M Token Context

OpenAI releases GPT-5.4, its most capable frontier model combining advanced reasoning, coding, and native computer-use capabilities in a single system.

1 month ago · 2 min read
Microsoft Releases Phi-4-Reasoning-Vision-15B: A Small Model That Knows When to Think

Microsoft open-sources Phi-4-reasoning-vision-15B, a compact 15B-parameter multimodal model that selectively activates chain-of-thought reasoning and rivals models many times its size.

1 month ago · 2 min read
Anthropic Releases Claude Opus 4.6 — Its Most Capable Agentic Coding Model

Anthropic launches Claude Opus 4.6, a frontier model purpose-built for autonomous coding agents that can plan, execute, and debug multi-file projects with minimal human oversight.

1 month ago · 2 min read
Meta Releases Llama 4 Maverick With 400B Parameters Under Open Weights

Meta releases Llama 4 Maverick, a 400-billion parameter mixture-of-experts model under its open weights license, matching GPT-5 on key benchmarks and reigniting the open-source AI debate.

1 month ago · 2 min read
OpenAI Releases GPT-5.3 Instant: 27% Fewer Hallucinations and a Less 'Cringe' Personality

OpenAI rolls out GPT-5.3 Instant as ChatGPT's new default model, delivering significant reductions in hallucination rates, fewer unnecessary refusals, and a more natural conversational tone.

1 month ago · 2 min read
OpenAI Launches GPT-5 Turbo — 3x Faster, Half the Cost

OpenAI releases GPT-5 Turbo, a distilled version of its flagship model that delivers comparable quality at triple the speed and 50% lower API pricing, targeting production workloads.

1 month ago · 2 min read
Alibaba's Qwen 3.5 Small Models Beat GPT-Class Performance on Your Laptop

Alibaba releases four compact open-source models from 0.8B to 9B parameters that outperform OpenAI and Google's small models on benchmarks — and run entirely on consumer hardware.

1 month ago · 2 min read
DeepSeek V4 Drops This Week — A Trillion-Parameter Multimodal Model Trained on Chinese Chips

China's DeepSeek is set to release V4, a natively multimodal model with roughly one trillion parameters that generates text, images, and video — and it was trained on Huawei and Cambricon hardware.

1 month ago · 2 min read
Mistral Releases Large 3 — First Open-Weight Model With Native Agentic Tool Use

Mistral AI launches Large 3, a 123B parameter open-weight model with built-in tool calling, structured output, and a 500K token context window. It matches GPT-5 Turbo on function calling benchmarks.

1 month ago · 2 min read
AI21 Labs Releases Jamba 2, a Hybrid SSM-Transformer That Matches GPT-5 at One-Fifth the Cost

The 398B parameter model combines Mamba-style state space layers with attention layers, matching GPT-5 and Claude Sonnet 4.5 on major reasoning benchmarks while cutting inference costs by 80%.

1 month ago · 2 min read
Google Launches Nano Banana 2: The AI Image Generator That Actually Understands What You Want

Google's Nano Banana 2 replaces its predecessor across all Gemini models with faster generation, 4K output, and real-time web knowledge baked into every image.

1 month ago · 2 min read
50 ChatGPT Prompts That Actually Save Time at Work

Forget the gimmick prompts. These 50 ChatGPT prompts are the ones that professionals actually use every day to save hours of work.

1 month ago · 3 min read
Claude 4 vs GPT-5: How the Latest AI Models Compare

A head-to-head comparison of Anthropic's Claude 4 and OpenAI's GPT-5 — covering reasoning, coding, pricing, and which model developers should choose in 2026.

1 month ago · 3 min read
DeepSeek R2 Open-Sources a GPT-5 Competitor for Free

Chinese AI lab DeepSeek releases R2, a reasoning model that matches GPT-5 on major benchmarks — and gives it away under the Apache 2.0 license.

1 month ago · 2 min read
Perplexity vs ChatGPT vs Google: Best AI Research Tool in 2026

All three platforms now offer AI-powered research capabilities, but they take fundamentally different approaches. We tested them head-to-head on real research tasks.

1 month ago · 3 min read
ChatGPT vs Claude vs Gemini: Which Free Tier Wins in 2026?

A practical comparison of what you actually get on the free tiers of ChatGPT, Claude, and Gemini — model access, limits, features, and which one is best for different use cases.

1 month ago · 3 min read
Claude API Pricing in 2026: Free Tier, Pro, and Max Plans Explained

Claude's free tier uses Sonnet 4.6. Pro is $20/mo (Opus 4.6 access), Max is $100/mo. API pricing: Opus $15/$75, Sonnet $3/$15, Haiku $0.80/$4 per 1M tokens.

1 month ago · 2 min read
GPT-4.5 Is Now Available to ChatGPT Plus Users

GPT-4.5 is now available to ChatGPT Plus subscribers at $20/month with daily usage caps. Previously Pro-only ($200/mo). Faster than GPT-5, better than GPT-4o.

1 month ago · 2 min read
What Model Does Free ChatGPT Use in 2026?

Free ChatGPT uses GPT-4o in 2026. It supports text, images, and voice with daily message limits. GPT-4.5 requires Plus ($20/mo), GPT-5 requires Pro ($200/mo).

1 month ago · 2 min read
Guide Labs Open-Sources Steerling-8B, an LLM Designed to Explain Its Own Reasoning

The 8-billion-parameter model uses a new architecture that makes its internal decision-making process interpretable by default, addressing a key limitation of black-box AI systems.

1 month ago · 2 min read
ByteDance's Seedance 2.0 Goes Viral With Hyper-Realistic AI Video

ByteDance's new video generation model produces cinematic clips so convincing that Hollywood is sounding the alarm about the future of filmmaking.

1 month ago · 2 min read
Google Launches Gemini 3.1 Pro With Double the Reasoning Performance

Gemini 3.1 Pro scores 77.1% on ARC-AGI-2, more than doubling its predecessor's reasoning capabilities, and is rolling out across Google's AI platforms.

1 month ago · 2 min read
Zhipu AI Releases GLM-5: A 744B Parameter Model Under MIT License

The massive mixture-of-experts model trained on Huawei Ascend chips scores 77.8% on SWE-bench Verified and ships with a fully permissive open-source license.

1 month ago · 2 min read
Cohere Releases TinyAya: Multilingual AI for 67 Languages on Consumer Hardware

The 3.35B parameter models deliver balanced quality across 67 languages and can run on laptops and phones, targeting the long tail of underserved languages.

2 months ago · 2 min read
Sora Is Finally Public: OpenAI's Video Generator Launches With 3 Pricing Tiers

OpenAI's Sora video generator is now publicly available with 3 tiers: free preview for ChatGPT users, a Creator plan, and Enterprise. Here's pricing and what each includes.

2 months ago · 2 min read
GPT-5 Is Here: OpenAI's Most Powerful Model Crushes Every Reasoning Benchmark

GPT-5 sets new records in code generation, multi-modal reasoning, and real-time analysis across text, image, and audio. Here's what it can do and how it compares.

2 months ago · 2 min read
Claude Tops Every Legal Reasoning Benchmark — 94% Accuracy on Contract Risk Detection

Anthropic's Claude outperforms all models on contract analysis, statutory interpretation, and case law reasoning. Legal experts say the results are 'comparable to junior associate work.'

2 months ago · 2 min read
Alibaba's Qwen3.5 Arrives: 397B Parameters Built for Native Multimodal Agents

The new model activates only 17B parameters per pass across a 397B mixture-of-experts architecture, supports 201 languages, and is designed to power autonomous AI agents.

2 months ago · 2 min read