2026.04.09 | Template Syndrome in RL Agents; Step-by-Step Image Generation Is More Controllable - HuggingFace Daily AI Paper Digest

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's biggest AI news.

【Contents】

The 15 papers in this episode:

00:31 🧠 RAGEN-2: Reasoning Collapse in Agentic RL

01:21 🎨 Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

02:00 ⚡ MARS: Enabling Autoregressive Models Multi-Token Generation

02:51 🌍 INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

03:48 🔬 SEVerA: Verified Synthesis of Self-Evolving Agents

04:41 🔍 TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

05:26 ⚡ FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling

06:17 🔄 FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching

07:00 🧠 Neural Computers

07:37 🎯 Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization

08:22 💡 Learning to Hint for Reinforcement Learning

09:11 🧠 Fast Spatial Memory with Elastic Test-Time Training

09:44 🎬 MoRight: Motion Control Done Right

10:21 🌐 Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment

11:02 📊 Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递