
OpenAI Launches GPT-5.4 With Native Computer-Use and 1M Token Context

Michael Ouroumis · 2 min read

OpenAI dropped its most ambitious model yet on March 5, and the AI community is still digesting what GPT-5.4 means for the industry. The new release merges the coding prowess of GPT-5.3-Codex with breakthrough computer-use capabilities, creating a single system that can reason, write code, and directly operate software interfaces.

What Makes GPT-5.4 Different

The headline feature is native computer-use. GPT-5.4 can view screenshots, move a cursor, click buttons, and type keystrokes — effectively operating any desktop or web application the way a human would. Previous models required third-party tooling or wrappers to achieve similar functionality. Now it is built into the model itself.

OpenAI also expanded the API context window to one million tokens, the largest the company has ever offered. For enterprise customers working with sprawling codebases or lengthy legal documents, this removes a persistent bottleneck.
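For a rough sense of what a million tokens holds, the back-of-the-envelope sketch below uses the common heuristic of roughly four characters per token for English text. This is only an estimate; actual counts depend on the model's tokenizer, which OpenAI has not detailed here.

```python
# Coarse check: does a text corpus fit in a 1M-token context window?
# Assumes ~4 characters per token, a rough English-text heuristic;
# real token counts come from the model's actual tokenizer.

CONTEXT_WINDOW = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic, not the real tokenizer

def estimated_tokens(text: str) -> int:
    """Return a coarse token estimate for `text`."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """True if the estimated token count fits within the window."""
    return estimated_tokens(text) <= window

# A ~2 MB codebase dump is roughly 500,000 tokens -- well within 1M.
corpus = "x" * 2_000_000
print(estimated_tokens(corpus))  # 500000
print(fits_in_context(corpus))   # True
```

By this heuristic, a million tokens is on the order of four million characters — roughly several thousand pages of legal text or a mid-sized codebase in a single request.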

Benchmark Performance

The numbers back up the marketing. GPT-5.4 set new records on OSWorld-Verified and WebArena-Verified, two benchmarks that measure a model's ability to complete real software tasks. It also scored 83 percent on OpenAI's internal GDPval test for knowledge work — tasks like drafting reports, analyzing spreadsheets, and managing project workflows.

Factual accuracy improved as well. Compared to GPT-5.2, individual claims are 33 percent less likely to be false, and full responses are 18 percent less likely to contain any factual errors.

Two Flavors at Launch

OpenAI released two variants simultaneously. GPT-5.4 Thinking is the default in ChatGPT, optimized for interactive conversations with visible chain-of-thought reasoning. GPT-5.4 Pro targets power users and enterprise customers who need maximum performance on complex, multi-step tasks.

Both versions are available through the API, where developers can access the full million-token context and integrate computer-use into agentic workflows.
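The article doesn't document the API's shape, but agentic computer-use integrations generally follow an observe-act loop: capture the screen, let the model choose an action, execute it, and repeat. The sketch below illustrates that pattern only — `Action`, `ask_model`, `capture_screenshot`, and `perform` are hypothetical stand-ins, not calls from OpenAI's SDK.

```python
# Illustrative observe-act loop for a computer-use agent.
# Every name here is a hypothetical stand-in; a real integration
# would call the vendor's API for model inference and OS automation.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    payload: str = ""  # e.g. coordinates to click or text to type

def run_agent(ask_model, capture_screenshot, perform, max_steps=10):
    """Loop: screenshot -> model picks an action -> execute -> repeat."""
    for _ in range(max_steps):
        screenshot = capture_screenshot()
        action = ask_model(screenshot)   # model reasons over the pixels
        if action.kind == "done":
            return "task complete"
        perform(action)                  # click/type on the real UI
    return "step budget exhausted"

# Stubbed demo: the "model" clicks once, then reports done.
script = iter([Action("click", "120,340"), Action("done")])
result = run_agent(
    ask_model=lambda img: next(script),
    capture_screenshot=lambda: b"fake-pixels",
    perform=lambda a: None,
)
print(result)  # task complete
```

The step budget matters in practice: an agent operating real software needs a hard cap (and ideally human checkpoints) so a confused model can't click indefinitely.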

Industry Implications

The release intensifies an already crowded March. Google shipped Gemini 3.1 Pro in late February, Anthropic updated Claude Sonnet 4.6, and DeepSeek V4 arrived days earlier with its own trillion-parameter multimodal architecture. The pace of releases has shifted from quarterly cadences to what industry trackers now describe as weekly waves.

More significantly, GPT-5.4 signals that the frontier is moving beyond chat and code generation toward autonomous task execution. Models that can see and operate software blur the line between assistant and agent — a shift that will reshape how businesses think about automation, workforce planning, and software design in the months ahead.

