OpenAI Unveils GPT-5.4 With Advanced Pro & Thinking Versions

OpenAI on Thursday introduced GPT-5.4, a new frontier AI model the company describes as its most capable and efficient system for professional work to date.

Alongside the standard version, the company released two specialized variants: GPT-5.4 Thinking, designed for advanced reasoning tasks, and GPT-5.4 Pro, optimized for high-performance workloads and demanding applications.

Larger Context Window for Developers

The API version of GPT-5.4 has a context window of 1 million tokens, the largest OpenAI currently offers. The expanded window lets the model process much more information in a single request, which makes GPT-5.4 useful for applications such as analyzing large volumes of unstructured text, generating reports from raw time-series data, or managing very long coding projects.
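For developers wondering whether a batch of documents would fit in a window of that size, a rough pre-flight check can be sketched as follows. The four-characters-per-token ratio is a common rule of thumb for English text, not a real tokenizer, and the function names are hypothetical; only the 1 million token limit comes from the announcement.

```python
# Rough check of whether a batch of documents fits in a 1M-token context window.
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the article
CHARS_PER_TOKEN = 4  # assumption: rule-of-thumb ratio, not an actual tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_output: int = 0) -> bool:
    """Return True if all documents plus the reserved output budget fit."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS


# Two placeholder documents of 2.0M and 1.5M characters (about 875k tokens).
docs = ["x" * 2_000_000, "y" * 1_500_000]
print(fits_in_context(docs))                               # True
print(fits_in_context(docs, reserved_for_output=200_000))  # False
```

A production check would count tokens with the tokenizer that matches the model, since the real ratio varies with language and content.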

OpenAI also touted improvements in token efficiency, saying that GPT-5.4 solves many tasks in fewer tokens than its predecessors. This could lower operating costs for developers and companies building applications on the model.

Strong Benchmark Performance

GPT-5.4 delivered notable improvements across several AI performance benchmarks. According to OpenAI, the model achieved record scores in computer-use evaluations, including OSWorld-Verified and WebArena Verified, two benchmarks that measure how effectively AI systems interact with software environments and web tools.

The model also achieved an 83% score on OpenAI’s GDPval benchmark, which evaluates AI performance on knowledge-based professional tasks such as research, analysis, and business problem-solving.

Top Results in Professional Skills Tests

In addition to internal tests, GPT-5.4 performed strongly in Mercor’s APEX-Agents benchmark, which evaluates AI systems on professional tasks in fields like law and finance.

According to Brendan Foody, CEO of Mercor, the model demonstrated strong capabilities in producing complex professional deliverables.

“GPT-5.4 excels at producing long-horizon outputs such as slide presentations, financial models, and legal analysis,” Foody said. “It delivers top performance while running faster and at a lower cost than other frontier models.”

Reducing Hallucinations and Errors

OpenAI says the latest model also makes progress in addressing one of the biggest challenges in AI systems: hallucinations, or incorrect factual statements generated by models.

According to the company, GPT-5.4 is 33% less likely to produce incorrect individual claims compared with GPT-5.2. Overall responses are also 18% less likely to contain factual errors, representing a notable improvement in reliability.
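Both figures are relative reductions, so their practical effect depends on the starting error rate, which the announcement does not give. The short sketch below works through the arithmetic with hypothetical baseline rates chosen purely for illustration.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Scale a baseline error rate down by a relative reduction (0.33 = 33%)."""
    return baseline_rate * (1.0 - reduction)


# Hypothetical baselines: these are NOT published figures.
baseline_claim_error = 0.10     # assume 10% of individual claims were wrong
baseline_response_error = 0.20  # assume 20% of responses contained an error

new_claim_error = apply_relative_reduction(baseline_claim_error, 0.33)
new_response_error = apply_relative_reduction(baseline_response_error, 0.18)

print(f"{new_claim_error:.1%}")     # 6.7%
print(f"{new_response_error:.1%}")  # 16.4%
```

Under these assumed baselines, a 33% relative reduction is a drop of about 3.3 percentage points, not 33 points.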

New Tool Search System for Developers

As part of the release, OpenAI introduced an updated tool-calling system for the API version of GPT-5.4. The new system, called Tool Search, allows the model to locate tool definitions only when needed.

Previously, developers had to include descriptions for all available tools in the system prompt—a process that could use a large number of tokens when many tools were available. With Tool Search, the model retrieves tool information dynamically, making requests faster, more efficient, and cheaper.
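The general idea of fetching tool definitions on demand can be illustrated with a small, self-contained sketch. This is not OpenAI's Tool Search API: the catalog, the tool names, and the keyword-matching logic are all hypothetical, and the real system's retrieval method is not described in the announcement. The sketch only shows why sending matched definitions produces a smaller payload than sending every definition up front.

```python
import json
import re

# Hypothetical tool catalog; names and schemas are invented for illustration.
TOOL_CATALOG = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    },
    "send_email": {
        "description": "Send an email message to a recipient.",
        "parameters": {"to": "string", "subject": "string", "body": "string"},
    },
    "query_sales_db": {
        "description": "Run a read-only query against the sales database.",
        "parameters": {"sql": "string"},
    },
}

STOPWORDS = {"a", "an", "the", "in", "for", "to", "of", "up"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its alphabetic words minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def search_tools(query: str, catalog: dict) -> dict:
    """Return only the tools whose name or description shares a word with the query."""
    query_words = tokenize(query)
    return {
        name: spec
        for name, spec in catalog.items()
        if query_words & (tokenize(name) | tokenize(spec["description"]))
    }


def payload_size(tools: dict) -> int:
    """Size in characters of the serialized tool definitions."""
    return len(json.dumps(tools))


selected = search_tools("weather in Lisbon", TOOL_CATALOG)
print(list(selected))  # ['get_weather']
print(payload_size(selected) < payload_size(TOOL_CATALOG))  # True
```

With three tools the saving is small, but the gap widens as the catalog grows, which is the scenario the article describes.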

New Safety Testing for Reasoning Models

OpenAI also introduced a new safety evaluation framework designed to monitor how AI models present their chain-of-thought reasoning—the step-by-step explanation models provide when solving complex problems.

AI safety researchers have previously raised concerns that reasoning models could misrepresent their thought processes under certain conditions. OpenAI’s testing suggests that the GPT-5.4 Thinking variant is less likely to behave deceptively, indicating that monitoring chain-of-thought outputs remains an effective safety measure.

Written by Hajra Naz
