2025.08.12 | ReasonRank Improves Reasoning for Passage Ranking; WideSearch Evaluates Agentic Broad Information Seeking

7 minutes

This episode covers the following 15 papers:

00:18 🧠 ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

00:41 🔍 WideSearch: Benchmarking Agentic Broad Info-Seeking

01:01 ✨ Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

01:26 🧠 Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

01:59 💬 UserBench: An Interactive Gym Environment for User-Centric Agents

02:22 💡 SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

02:50 🌱 A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

03:15 🔬 BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent

03:45 🤖 MolmoAct: Action Reasoning Models that can Reason in Space

04:11 🤖 OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks

04:38 💡 Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts

05:05 ⏳ Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future

05:29 🗺 Reinforcement Learning in Vision: A Survey

05:59 🔍 Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

06:23 🖌 Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast episodes:

Xiaohongshu: AI速递