2025.11.18 | RL Wins Olympiad Gold; Uni-MoE 2.0 Makes an All-Around Leap - HuggingFace Daily AI Paper Digest

The 14 papers in this episode are:

00:17 🏅 P1: Mastering Physics Olympiads with Reinforcement Learning

00:56 🌐 Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

01:42 🧩 Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

02:22 🧠 TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

03:08 🚀 GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning

03:49 🧩 PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

04:28 🌌 UFO³: Weaving the Digital Agent Galaxy

04:59 🍲 Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

05:38 🌍 OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation

06:19 🔄 Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

06:51 🚀 MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

07:36 🎯 Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models

08:19 🧠 WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

09:10 🧬 Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递