2026.06.09 | Code Exploration Shortcomings Exposed; Geometry of On-Policy Distillation Revealed - HuggingFace Daily AI Paper Digest

【Contents】
The 15 papers in this episode are:

[00:32] 🔍 SWE-Explore: Benchmarking How Coding Agents Explore Repositories
[01:34] 🔍 On the Geometry of On-Policy Distillation
[02:26] 🧠 Latent Spatial Memory for Video World Models
[03:20] 🎬 CoVEBench: Can Video Editing Models Handle Complex Instructions?
[04:20] 🧠 LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents
[05:10] ⚡ FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention
[06:06] 🌍 SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks
[07:10] 🧠 Human Psychometric Questionnaires Mischaracterize LLM Behavior
[08:19] 🧠 Echo-Memory: A Controlled Study of Memory in Action World Models
[09:08] 🎮 OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics
[10:03] 🤖 AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing
[11:08] 🎥 SwiftVR: Real-Time One-Step Generative Video Restoration
[12:12] 🧠 Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses
[13:02] 🎬 OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning
[14:14] 🎯 Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

【Follow Us】
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递

【Sponsor】
OpenClaw快报 (OpenClaw Briefing)
Five minutes a day with OpenClaw快报 brings you the latest developments and industry discussion.
Link: www.xiaoyuzhoufm.com