OpenAI Releases GPT-5.4 — The First AI Model That Actually Does Your Job: 83% GDPval, Native Computer Use, 1M Context, Tool Search, and the Professional Workflow Era Officially Begins

OpenAI has released GPT-5.4, which is both its most capable frontier model and its most decisive statement yet that AI has crossed from "answering questions" into "doing the work." The model scores 83% on GDPval's 44-occupation knowledge work benchmark and 75% on OSWorld computer-control tasks, surpassing the human benchmark of 72.4%. It also cuts hallucinations by 33%, introduces native computer use for the first time in a general-purpose model, and launches Tool Search, which cuts token costs by 47% in large agent ecosystems. GPT-5.4 is not an incremental update; it is OpenAI's declaration that professional workflows now have a new default executor.

Anthropic Releases Claude Opus 4.6 — Frontier Model Targets Financial Research, Deep Document Analysis, and “Days-to-Minutes” Workflow Automation

Anthropic launched Claude Opus 4.6 on February 5, 2026, positioning it as its most capable model to date for enterprise agents and “professional work,” including financial analysis that surfaces insights that would otherwise take days of manual compilation. Opus 4.6 adds major agentic upgrades (notably Agent Teams, in research preview) and introduces a context window of up to 1M tokens (in preview) for processing large collections of company filings, regulatory documents, spreadsheets, and research notes in a single workflow. The model is available on Claude paid tiers (Pro/Max/Team/Enterprise) and via the API, as well as through Amazon Bedrock and other cloud platforms.
