Friday, May 8, 2026
聚赢方舟

OpenAI Unveils GPT-5.2 to Counter Google's Gemini 3 Dominance, Claims Strongest Agent Coding Capabilities

by 聚赢方舟
5 months ago
in Business News

OpenAI on Thursday launched GPT-5.2, its most advanced artificial intelligence (AI) model to date, in a direct response to Google's Gemini 3. The company is positioning the new model to reclaim leadership in AI development after weeks of losing ground to rivals.

AI Generated Image

The company released GPT-5.2 in three tiers across ChatGPT and its API platform. Paid ChatGPT users on the Plus, Pro, Go, Business and Enterprise plans gain immediate access and can continue using GPT-5.1 for three more months before the legacy model is retired. GPT-5.2 Thinking was immediately available to API developers at $1.75 per million input tokens and $14 per million output tokens, with a 90% discount on cached inputs.
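To make the quoted API prices concrete, the sketch below estimates the cost of a single request. It is only an illustration: the function name `estimate_cost` is hypothetical, and it assumes that cached tokens are a subset of the input tokens and that the 90% discount applies to the quoted input price. Actual billing rules may differ.

```python
# Illustrative cost estimate based on the prices quoted in the article.
# Assumptions (not confirmed by the article): cached tokens are counted
# as part of the input tokens, and the discount applies to the input price.

INPUT_PRICE_PER_M = 1.75    # USD per million input tokens
OUTPUT_PRICE_PER_M = 14.00  # USD per million output tokens
CACHED_DISCOUNT = 0.90      # 90% off cached input tokens


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    cached_price_per_m = INPUT_PRICE_PER_M * (1 - CACHED_DISCOUNT)

    total = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * cached_price_per_m
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200k input tokens (half of them cached) and 20k output tokens.
    print(f"${estimate_cost(200_000, 20_000, cached_input_tokens=100_000):.4f}")
```

Under these assumptions, output tokens dominate the bill for long responses, since each costs eight times as much as an uncached input token.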

The launch comes weeks after CEO Sam Altman declared a "code red" in an internal memo, redirecting resources to ChatGPT improvements as the company faced declining traffic and market share losses to Google. OpenAI says GPT-5.2 surpasses human experts at professional tasks, setting new benchmarks in coding, mathematical reasoning and scientific research.

The competitive stakes have intensified as Google's Gemini 3 topped LMArena leaderboards and earned widespread praise for reasoning capabilities, threatening OpenAI's first-mover advantage at a time when the startup has committed over $1 trillion to AI infrastructure development.

Professional Performance Reaches Expert Level

GPT-5.2 is OpenAI's first model to match or exceed human expert performance on GDPval, a benchmark measuring well-specified knowledge work across 44 occupations. The model beats or ties top industry professionals on 70.9% of comparisons, according to expert judges evaluating tasks spanning presentations, spreadsheets and other professional deliverables.

The model delivers these results at 11 times the speed and less than 1% of the cost of expert professionals, OpenAI said. On internal benchmarks testing junior investment banking analyst tasks, GPT-5.2 Thinking scored 68.4%, a 9.3 percentage point improvement over GPT-5.1's 59.1%.

One GDPval judge reviewing outputs commented that the work "appears to have been done by a professional company with staff, and has a surprisingly well designed layout and advice."

Coding Capabilities Target Developer Market

GPT-5.2 Thinking achieved 55.6% on SWE-Bench Pro, a rigorous evaluation testing real-world software engineering across four programming languages. The model also reached 80% on SWE-bench Verified, a new high for OpenAI.

Coding platforms reported measurable improvements. Jeff Wang, CEO of Windsurf, said GPT-5.2 "represents the biggest leap for GPT models in agentic coding since GPT-5" and enabled his company to collapse fragile multi-agent systems into single mega-agents with 20-plus tools. Cognition, Warp, Charlie Labs, JetBrains and Augment Code reported state-of-the-art agentic coding performance.

Research lead Aidan Clark told reporters that stronger mathematical reasoning translates across workloads. "These are all properties that really matter across a wide range of different workloads," Clark said, citing financial modeling, forecasting and data analysis as key applications.

Product lead Max Schwarzer said GPT-5.2 Thinking responses contain 38% fewer errors than those of its predecessor, making the model more dependable for daily decision-making and research.

Scientific Research and Mathematical Breakthroughs

OpenAI positions GPT-5.2 Pro and Thinking as the world's best models for accelerating scientific research. GPT-5.2 Pro scored 93.2% on GPQA Diamond, a graduate-level benchmark testing science knowledge, while GPT-5.2 Thinking achieved 92.4%.

On FrontierMath expert-level mathematics problems, GPT-5.2 Thinking solved 40.3% of Tier 1-3 challenges, setting a new state of the art. GPT-5.2 Pro became the first model to cross 90% on ARC-AGI-1, improving on o3-preview's 87% while reducing costs by a factor of approximately 390.

In recent research, GPT-5.2 Pro helped researchers explore an open question in statistical learning theory, proposing a proof that was subsequently verified by the authors and external experts. The company said this demonstrates how frontier models can assist mathematical research under human oversight.

Strategic Response to Competitive Pressure

Fidji Simo, CEO of applications at OpenAI, told CNBC that GPT-5.2 development spanned many months, predating the recent code red directive. "While we are proud that we are able to have a cadence of releasing models fast, this particular integration has been in the works for a while," Simo said.

Altman told CNBC on Thursday that "Gemini 3 has had less of an impact on our metrics than maybe we feared." He said he expects OpenAI to exit code red mode by January "in a very strong position."

The company has committed more than $1 trillion to AI infrastructure alongside partners NVIDIA and Microsoft. Azure data centers and NVIDIA GPUs, including H100, H200 and GB200-NVL72, underpin OpenAI's training infrastructure.

However, the focus on compute-intensive reasoning models presents financial challenges. GPT-5.2's Thinking and Pro modes consume significantly more computing resources than standard chatbots, potentially creating pressure as OpenAI already spends more on inference compute than previously disclosed, according to recent reports.

New Safety Features and Product Roadmap

OpenAI announced it has begun rolling out age prediction software to apply content protections for users under 18. Simo said the company plans to launch "adult mode" in the first quarter of 2026, allowing uses such as "erotica for verified adults."

The company strengthened responses to sensitive conversations, with improvements in handling prompts indicating suicide risk, self-harm, mental health distress or emotional reliance on the model. Details appear in the updated GPT-5.2 System Card.

OpenAI has no current plans to deprecate GPT-5.1, GPT-5 or GPT-4.1 in the API and will provide advance notice of any future deprecations. The company expects to release a Codex-optimized version of GPT-5.2 in coming weeks.

Enterprise partners including Notion, Box, Shopify, Harvey, Zoom, Databricks, Hex and Triple Whale reported state-of-the-art performance for long-horizon reasoning, tool-calling, data science and document analysis tasks.

© 2025 长沙聚赢方舟文化传媒有限公司 by 聚赢方舟 - 湘 ICP 备 2025135270 号-1