2026.04.23 | LLaDA2.0 Unifies Multimodal Understanding and Generation; Near-Future Experience as a Plug-in for RL

12 minutes

[Contents]
The 15 papers in this episode are:
00:28 🔮 LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
01:17 🔮 Near-Future Policy Optimization
02:07 🤖 DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data
02:53 🤖 DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation
03:42 🎭 Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
04:36 🧠 Exploring Spatial Intelligence from a Generative Perspective
05:21 🤖 A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression
06:18 🎤 WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training
07:06 🤖 SWE-chat: Coding Agent Interactions From Real Users in the Wild
07:53 🤖 Cortex 2.0: Grounding World Models in Real-World Industrial Deployment
08:36 🧠 Convergent Evolution: How Different Language Models Learn Similar Number Representations
09:21 🤝 SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution
09:57 🎬 ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis
10:34 🔧 Visual Reasoning through Tool-supervised Reinforcement Learning
11:09 🤖 AI scientists produce results without reasoning scientifically

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递