2026.06.23 | Tool ecosystems expose planning weaknesses; session-centered design builds auditable systems - HuggingFace Daily AI Paper Digest

[Contents]
The 15 papers in this episode are:

[00:32] 🧩 PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
[01:31] 🧠 OpenRath: Session-Centered Runtime State for Agent Systems
[02:28] 🧩 DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams
[03:19] 💼 EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions
[04:10] 🧠 Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention
[05:09] ⚡ KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking
[06:05] 🌍 World Action Models: A Survey
[07:10] 🧪 CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents
[08:07] 🧬 EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory
[09:06] 🧬 BioMatrix: Towards a Comprehensive Biological Foundation Model Spanning the Modality Matrix of Sequences, Structures, and Language
[10:07] 🧠 HydraHead: From Head-Level Functional Heterogeneity to Specialized Attention Hybridization
[10:57] 🎯 Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation
[11:53] 🛡 SkillHarness: Harnessing Safe Skills for Computer-Use Agents
[12:42] 🧠 Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding
[13:41] 🔬 Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark
