2026.03.26 | CUA-Suite collects 6 million frames of operation video; EVA's three-stage training cuts tokens by 70%

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week, it walks you through the previous week's major AI news.

Link 🔗www.xiaoyuzhoufm.com

【Contents】

The 15 papers in this episode are:

00:27 🎬 CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

01:24 🎬 EVA: Efficient Reinforcement Learning for End-to-End Video Agent

02:05 🛡 T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search

02:50 🤖 UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

03:33 🤔 Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

04:20 🎮 GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

05:13 🧠 When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

06:11 🤖 CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare

07:13 🌀 4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video

07:54 🎬 OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning

08:38 🚗 Toward Physically Consistent Driving Video World Models under Challenging Trajectories

09:18 📊 Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments

10:10 🧠 Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

10:53 🤖 StreamingClaw Technical Report

11:30 🔍 LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis

【Follow Us】

You can also find us on the following platform for more beyond the podcast:

Xiaohongshu: AI速递