The 15 papers in this episode are as follows:
00:20 🔍 Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning
01:01 👶 BabyVision: Visual Reasoning Beyond Language
01:45 🚀 PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
02:24 🧠 X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
03:03 ⚡ MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
03:41 ⚡ GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts
04:17 🤖 OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent
05:20 📉 Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
06:00 🚀 Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
06:30 🧠 Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction
07:10 🚗 DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving
07:58 🤖 MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
08:26 🎨 Boosting Latent Diffusion Models via Disentangled Representation Alignment
09:08 🤔 What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models
09:45 🔧 ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration

[Follow Us]
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu: AI速递
