2026.04.02 | ClawKeeper: Three-Layer Protection for Agent Safety; Terminal Agents with Lightweight APIs Come Out on Top - HuggingFace Daily AI Paper Digest

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week it recaps the previous week's major AI news.

【Contents】

The 15 papers in this episode are:

00:27 🛡 ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

01:20 💻 Terminal Agents Suffice for Enterprise Automation

02:03 📊 MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

02:54 🧠 ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

03:40 🔬 Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

04:26 📊 QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

05:12 🧠 Reasoning Shift: How Context Silently Shortens LLM Reasoning

05:59 📊 HippoCamp: Benchmarking Contextual Agents on Personal Computers

06:52 🧠 PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning

07:34 ⚡ Universal YOCO for Efficient Depth Scaling

08:12 🔄 Brevity Constraints Reverse Performance Hierarchies in Language Models

08:48 🧠 GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

09:25 📝 Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers

10:11 🚀 Embarrassingly Simple Self-Distillation Improves Code Generation

10:54 🤖 Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递