2026.02.10 | ReAlign closes the image-text gap without training; MOVA generates video and audio in sync

13 minutes

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.

Link: 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers covered in this episode are:

00:34 🔀 Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

01:23 🎬 MOVA: Towards Scalable and Synchronized Video-Audio Generation

02:03 📈 QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

02:51 🤖 Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

03:24 🎯 Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

04:22 ⚡ LLaDA2.1: Speeding Up Text Diffusion via Token Editing

05:02 📱 GEBench: Benchmarking Image Generation Models as GUI Environments

05:52 🎬 Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

06:42 🧠 Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

07:20 📈 Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

08:12 📊 LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

08:59 🔍 GISA: A Benchmark for General Information-Seeking Assistant

09:56 🧭 WorldCompass: Reinforcement Learning for Long-Horizon World Models

10:35 🧪 LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

11:20 🧭 Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递