🤖 AI Agent Research
【arXiv】In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt learning under real-world task streams. Existing methods are largely designed for single-dataset settings, while re
【arXiv】Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature takes a newsroom team weeks: hunting for context, running statistics, choosing an angle, and d
【arXiv】Large Language Models (LLMs) are increasingly described as performing at the level of human experts on knowledge economy tasks. These claims are primarily based on how LLMs perform on benchmarking tasks that measure average performance across standar
【arXiv】Large language models (LLMs) are rapidly acquiring capabilities relevant to biological research, from literature synthesis to interpretation of experimental data. Increasingly, LLM agents can also perform in silico biology tasks that previously requi
⭐ GitHub Trending
【GitHub】CLI calculator that accepts natural-language input and unit conversions. Can be used as a CLI, a Python library, or an MCP server for AI agents. (⭐ 0)
【GitHub】A parallel-native agent runtime in Go: an open-source, Claude Code-style terminal coding agent that works with any model and includes heartbeat and automatic dreaming. (⭐ 1)
【GitHub】Our open-source project is SurfaceProxy (or SentrySurface Agent Gateway). It is a lightweight, high-performance Semantic Proxy and Security Firewall designed specifically for AI-native development to (⭐ 2)
【GitHub】DeepSeek Code Agent Framework - Open-source Claude Code competitor (⭐ 2)
🚀 Models & Industry
Google just made it significantly cheaper to enjoy its budget AI subscription tier.
Instead of spending a year raising a formal venture fund, the Sabertooth VC founder used a captive network of LPs to invest in startups like Anthropic, Anduril, and SpaceX.
I'm desperate for a personal AI assistant, but do I really want to become the kind of person who can't function without the friendly robot voice in my phone?
Anthropic's Claude Fable 5 is going to be a big hit with the web's vibe coders.
If those same AI workloads can be handled by cheaper models without affecting quality, it would mean a massive shift in the economics of AI.
🔥 Community
【Lobsters】Score: 61↑ | 24 comments | Tags: compilers, debugging
【Lobsters】Score: 81↑ | 49 comments | Tags: vibecoding
【Lobsters】Score: 5↑ | 0 comments | Tags: vcs, vibecoding
【HN】Score: 436 points | 177 comments
【HN】Score: 16 points | 2 comments