The 15 papers in this episode are:
00:22 🧪 NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
01:05 🤔 Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models
01:50 🤖 Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
02:30 🖼 KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
03:16 🖼 Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
04:03 ⏱ QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
04:55 🖼 GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
05:39 🖼 LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
06:15 📉 Risk-Averse Reinforcement Learning with Itakura-Saito Loss
06:54 🚀 Scaling Diffusion Transformers Efficiently via μP
07:33 🖼 Understanding Generative AI Capabilities in Everyday Image Editing Tasks
08:19 🧠 Let LLMs Break Free from Overthinking via Self-Braking Tuning
08:56 🧠 Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning
09:37 🎮 VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
10:23 💡 Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

[Follow Us]
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu: AI速递