2026.06.12 | EvoArena Tracks Memory Evolution, Testing AI's Insight into Preference Changes

15 minutes

【Sponsor】
OpenClaw快报 (OpenClaw Briefing)
Five minutes a day with OpenClaw快报 to catch up on the latest developments and industry discussion.
Link: www.xiaoyuzhoufm.com

【Contents】
The 15 papers covered in this episode:

[00:31] 🧠 EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments
[01:32] 🧠 SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning
[02:33] 🔍 FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents
[03:31] 🛠 Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
[04:27] 🔄 InterleaveThinker: Reinforcing Agentic Interleaved Generation
[05:14] 🧮 MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
[06:11] 🧠 MiniMax Sparse Attention
[06:50] 🖥 WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces
[07:50] 🔬 LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
[08:55] 🦾 HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers
[09:45] 🧩 N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization
[10:44] 🔬 EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
[11:38] 🏃 VideoMDM: Towards 3D Human Motion Generation From 2D Supervision
[12:35] 🔍 Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback
[13:26] 🔀 Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

【Follow Us】
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu: AI速递