OpenAI Declares SWE-bench Verified Obsolete for Measuring Frontier Coding Capabilities

OpenAI has published a blog post declaring that SWE-bench Verified is saturated and can no longer effectively differentiate the coding abilities of frontier AI models.

2026-04-27 08:00 · 🤖 AI and Technology · goodinfo.net