【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 www.xiaoyuzhoufm.com
【Contents】
The 15 papers in this episode are:
00:31 🔗 HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning
01:28 🎬 Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
02:06 🛰 TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation
02:56 🔍 ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models
03:45 🎬 LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation
04:50 🏠 FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow
05:35 🧠 The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $λ$-Calculus
06:20 🎯 A Subgoal-driven Framework for Improving Long-Horizon LLM Agents
07:02 🔍 How Well Does Generative Recommendation Generalize?
07:48 🌍 WorldAgents: Can Foundation Image Models be Agents for 3D World Models?
08:24 ⚡ BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection
09:05 🚀 Hyperagents (self-editing, metacognitive, self-improving agents)
09:54 🎬 HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering
10:37 🎬 EgoForge: Goal-Directed Egocentric World Simulator
11:50 🎬 Versatile Editing of Video Content, Actions, and Dynamics without Training

【Follow Us】
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
