Jumpstart your fitness journey with 33% off the Renpho Smart Scale

February 19, 2026 · Zhao Min (赵敏) · Source: tutorial新闻网

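On FrontierScience-Olympiad, the tool-enabled UniScientist scores 71.0, matching Claude Opus 4.5 and surpassing several other frontier models. On several out-of-distribution benchmarks (DeepResearch Bench, DeepResearch Bench II, and ResearchRubrics), the model performs on par with a range of top closed-source systems.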



AI girlfriend Clawra. Image credit: David Im

METR’s randomized controlled trial (July 2025; updated February 24, 2026) with 16 experienced open-source developers found that participants using AI were 19% slower, not faster. Developers expected AI to speed them up, and even after the measured slowdown had occurred, they still believed AI had sped them up by 20%. These were not junior developers but experienced open-source maintainers. If even they could not tell in this setup, subjective impressions alone are probably not a reliable measure of performance.


About the author