2026.05.13 | Native Unified Understanding and Generation; Privacy-Preserving Memory Management at the Edge

13 minutes

[Contents]
The 15 papers in this episode are:
[00:23] 🧠 SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
[01:10] 🔒 MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents
[01:59] 🧠 δ-mem: Efficient Online Memory for Large Language Models
[02:43] 🤖 RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
[03:33] 🤖 World Action Models: The Next Frontier in Embodied AI
[04:22] 🤖 AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward
[05:09] 🧩 Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization
[06:12] 🛠 ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
[06:51] 🏭 Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics
[07:52] 🎨 L2P: Unlocking Latent Potential for Pixel Generation
[08:33] 🎬 CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
[09:18] 🔍 Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents
[10:18] 💻 Teaching Language Models to Think in Code
[10:58] 🛡 On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
[11:52] 🌌 MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

[Follow Us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递