2026.03.24 | Gaps in interaction evaluation for world models; ultra-fast generation with a single-stream architecture

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week it recaps the previous week's major AI news.

Link 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers in this episode are:

00:32 🧪 Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

01:13 🚀 Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

01:55 🧠 LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

02:42 🔍 VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

03:30 🧠 SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning

04:10 🎯 F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting

05:03 🎬 Manifold-Aware Exploration for Reinforcement Learning in Video Generation

05:56 ⚖ mSFT: Addressing Dataset Mixtures Overfiting Heterogeneously in Multi-task SFT

06:46 🧠 Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection

07:35 🔄 Repurposing Geometric Foundation Models for Multi-view Diffusion

08:21 🤖 RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models

09:15 🔍 OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

10:02 💭 BubbleRAG: Evidence-Driven Retrieval-Augmented Generation for Black-Box Knowledge Graphs

10:54 ⚖ SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models

11:43 🧭 On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

【Follow Us】

You can also find us on the following platforms for more than what the podcast covers.

Xiaohongshu (小红书): AI速递