2026.05.13 | Native unified seeing and drawing; edge privacy memory management - HuggingFace Daily AI Paper Digest

【Contents】
The 15 papers in this episode are as follows:
[00:23] 🧠 SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
[01:10] 🔒 MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents
[01:59] 🧠 $δ$-mem: Efficient Online Memory for Large Language Models
[02:43] 🤖 RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
[03:33] 🤖 World Action Models: The Next Frontier in Embodied AI
[04:22] 🤖 AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward
[05:09] 🧩 Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization
[06:12] 🛠 ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
[06:51] 🏭 Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics
[07:52] 🎨 L2P: Unlocking Latent Potential for Pixel Generation
[08:33] 🎬 CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
[09:18] 🔍 Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents
[10:18] 💻 Teaching Language Models to Think in Code
[10:58] 🛡 On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
[11:52] 🌌 MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

【Follow Us】
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递