2026.02.19 | Learnable routing and quantization speed up video diffusion; residual tracking gets humanoids to 90% grasping

11 minutes

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.

Link: 🔗 www.xiaoyuzhoufm.com

【Contents】

The 14 papers in this episode:

00:30 ⚡ SLA2: Sparse-Linear Attention with Learnable Routing and QAT

01:16 🤖 Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation

02:02 🧠 RynnBrain: Open Embodied Foundation Models

02:46 🔑 Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

03:33 🕺 SAM 3D Body: Robust Full-Body Human Mesh Recovery

04:41 🤝 Multi-agent cooperation through in-context co-player inference

05:28 📊 MAEB: Massive Audio Embedding Benchmark

06:04 🤖 World Action Models are Zero-shot Policies

06:44 🔬 Towards a Science of AI Agent Reliability

07:20 🧠 MMA: Multimodal Memory Agent

08:09 🚀 Optimizing Few-Step Generation with Adaptive Matching Distillation

08:56 🧭 Learning Situated Awareness in the Real World

09:28 ⚠ Visual Memory Injection Attacks for Multi-Turn Conversations

10:10 🤖 BiManiBench: A Hierarchical Benchmark for Evaluating Bimanual Coordination of Multimodal Large Language Models

【Follow Us】

You can also find us on the following platforms for more beyond the podcast:

Xiaohongshu (小红书): AI速递