【Contents】
The 15 papers in this episode are:
🎯 Trust Region On-Policy Distillation
🤖 Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking
🧠 A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL
🧠 World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning
🏥 AutoMedBench: Towards Medical AutoResearch with Agentic AI Models
🖼 Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation
😴 Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories
🧩 TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL
💬 $Ψ$-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues
🧩 Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging
🎯 Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling
📄 PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training
🗺 PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps
🔍 Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces
🎵 MERIT: Learning Disentangled Music Representations for Audio Similarity

【Follow Us】
You can also find us on the following platform for more than what the podcast covers:
Xiaohongshu (小红书): AI速递
【Sponsor】
OpenClaw快报
Spend five minutes a day with OpenClaw快报 to catch up on the latest developments and industry discussion.
Link: www.xiaoyuzhoufm.com
