2026.02.13 | Self-evolving AI struggles to stay safe; unified tokens for large audio models - HuggingFace Daily AI Paper Digest

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week, AI每周谈 recaps the previous week's major AI news.

【Contents】

The 15 papers in this episode are:

00:31 ⚠ The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies

01:24 🎵 MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

02:28 🧠 Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

03:05 🤖 GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

03:56 ⚖ LawThinker: A Deep Research Legal Agent in Dynamic Environments

04:33 🔍 Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

05:16 🎨 Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching

06:01 🚀 DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

06:55 🧩 Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

07:38 🧠 Thinking with Drafting: Optical Decompression via Logical Reconstruction

08:17 🗳 dVoting: Fast Voting for dLLMs (a fast voting inference method for diffusion large language models)

09:09 🤖 RISE: Self-Improving Robot Policy with Compositional World Model

09:54 🤖 χ₀: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies

10:48 🤖 EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration

11:45 🔍 Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation

【Follow us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递