2026.05.26 | DVAO Dynamically Balances Multiple Objectives; WBench Fills the Interactive Evaluation Gap

14 minutes

[Contents]
The 15 papers in this episode are:
[00:25] 🎯 DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning
[01:15] 🎬 WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
[02:13] 🖥 Macaron-A2UI: A Model for Generative UI in Personal Agents
[02:56] 🤝 Foundation Protocol: A Coordination Layer for Agentic Society
[04:02] 🔺 TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
[05:05] 🎬 ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
[06:00] 🧠 Toward Native Multimodal Modeling: A Roadmap
[06:50] 🔍 QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks
[07:43] 🎯 ThriftAttention: Selective Mixed Precision for Long-Context FP4 Attention
[08:50] 🔬 AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery
[09:46] 🧠 Your Embedding Model is SMARTer Than You Think
[10:25] 💡 ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement
[11:22] 🌐 Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
[12:09] 🤖 CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents
[13:01] 🤖 Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递