2026.03.23 | Multi-hop synthesis improves reasoning; forward-process reinforcement speeds up video

13 min · 93 plays · 0 comments

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.

Link 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers covered in this episode:

00:31 🔗 HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

01:28 🎬 Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

02:06 🛰 TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation

02:56 🔍 ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models

03:45 🎬 LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation

04:50 🏠 FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow

05:35 🧠 The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $\lambda$-Calculus

06:20 🎯 A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

07:02 🔍 How Well Does Generative Recommendation Generalize?

07:48 🌍 WorldAgents: Can Foundation Image Models be Agents for 3D World Models?

08:24 ⚡ BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

09:05 🚀 Hyperagents (self-editing, metacognitive, self-improving agents)

09:54 🎬 HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering

10:37 🎬 EgoForge: Goal-Directed Egocentric World Simulator

11:50 🎬 Versatile Editing of Video Content, Actions, and Dynamics without Training

【Follow Us】

You can also find us on the following platforms for more beyond the podcast:

Xiaohongshu: AI速递