🤖 AI Agent Research
【arXiv】Reward models (RMs) provide critical feedback signals for LLM post-training, notably in reinforced fine-tuning (RFT) and reinforcement learning (RL) pipelines. However, current reward evaluation relies on heterogeneous criteria such as rule-based ver
【arXiv】We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and
【arXiv】Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and offer little inference-time control. Existing efficient reasoning methods control thinking length by shortening,
【arXiv】Deep reinforcement learning has shown strong potential for enabling autonomous robots to learn complex navigation tasks. However, its practical use still depends heavily on human-designed reward functions and repeated manual fine-tuning, which is t
⭐ GitHub Trending
【GitHub】An open-source Claude Code / Codex skill for guided performance reviews, work reports, git commit summaries, and Excel template filling. (⭐ 0)
【GitHub】AGGREGATION: 1005 Claude Code subagents from 10 open-source projects. Full credit to original authors (see ATTRIBUTION.md). (⭐ 0)
【GitHub】Open-source agent skills + Claude Code plugin for Tolstoy: create marketing videos/images and remix your library via the Tolstoy MCP. (⭐ 0)
🚀 Models & Industry
The cybersecurity company is nearing a $300 million round led by Evolution Equity Partners.
Uber's cutback came after the company reportedly encouraged staff to use AI as much as possible.
Microsoft on Tuesday took the wraps off Adaptive Spec-driven Scoring for Evaluation and Regression Testing, an open source framework for spinning up AI evaluations.
The caveat is that one of the world's most famous living directors is using the tech solely for storyboarding.
Launched at Build, Microsoft Scout is a new AI assistant meant to bring the power and flexibility of OpenClaw into the Microsoft 365 system.
🔥 Community
【Lobsters】Score: 4↑ | 4 comments | Tags: ai, windows
【Lobsters】Score: 28↑ | 1 comment | Tags: linux, ml
【HN】Score: 52 points | 42 comments