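When an AI coding agent works through a complex, multi-step problem, it generates far more context than any model can hold in memory at once. The usual fixes, truncating older context or having a separate model summarize it, cause the agent to lose critical information and make cascading errors. Cursor's approach is to train the model itself, during reinforcement learning, to compress its own working memory mid-task. When Composer 2 approaches its context limit, it pauses, compresses everything down to roughly 1,000 tokens, and then continues. These summaries are rewarded or penalized according to whether they help complete the overall task, so that over thousands of training runs the model learns what to keep and what to discard.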
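The pause-compress-continue loop described above can be illustrated with a minimal sketch. It is not Cursor's implementation: the `AgentContext` class, the `headroom` parameter, the word-count tokenizer, and the `keep_tail_summarizer` placeholder are all assumptions made for illustration, and the reinforcement-learning reward on summaries is not modeled at all. Only the roughly 1,000-token summary budget comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SUMMARY_BUDGET = 1000  # approximate summary size in tokens, as described in the text


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class AgentContext:
    """Hypothetical working memory that compacts itself near the context limit."""
    limit: int                             # assumed context limit, in tokens
    summarize: Callable[[str, int], str]   # assumed hook: (full context, budget) -> summary
    headroom: int = 500                    # assumed margin: compact when this close to the limit
    messages: List[str] = field(default_factory=list)
    compactions: int = 0

    def used(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def append(self, message: str) -> None:
        """Add a message, then compact if the context is within `headroom` of the limit."""
        self.messages.append(message)
        if self.used() >= self.limit - self.headroom:
            self.compact()

    def compact(self) -> None:
        """Replace the whole context with a single summary of at most SUMMARY_BUDGET tokens."""
        summary = self.summarize("\n".join(self.messages), SUMMARY_BUDGET)
        self.messages = [summary]
        self.compactions += 1


def keep_tail_summarizer(text: str, budget: int) -> str:
    """Placeholder summarizer: keeps only the last `budget` words.

    In the approach described above, the model itself writes the summary and is
    trained on whether that summary helps finish the task; this stub does neither.
    """
    words = text.split()
    return " ".join(words[-budget:])


if __name__ == "__main__":
    ctx = AgentContext(limit=5000, summarize=keep_tail_summarizer)
    for _ in range(100):
        ctx.append("step " * 100)  # each simulated tool result is 100 tokens
    print(ctx.compactions, ctx.used())
```

The design point the sketch tries to capture is that compaction happens inside the agent's own loop rather than in a separate truncation or summarization service, which is what would let a training signal from the final task outcome reach the summarization step.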
Perhaps the most revealing case involves financial institutions. Salamanca described developing models for proprietary quantitative languages—closely protected intellectual property never exposed to cloud AI services. Using Forge’s reinforcement learning features, Mistral helped a hedge fund establish custom benchmarks and train models to exceed them, creating “a unique model delivering essential competitive advantage.”