The 15 papers covered in this episode:
00:23 🔗 Chain-of-Model Learning for Language Model
00:58 🤔 AdaptThink: Reasoning Models Can Learn When to Think
01:45 🧠 AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning
02:21 🚀 Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
03:04 🖥 Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
03:43 🤔 Thinkless: LLM Learns When to Think
04:23 💡 Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
05:00 🧮 MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision
05:39 ✨ Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation
06:15 🛡 FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA
07:00 🧩 Model Merging in Pre-training of Large Language Models
07:53 🤖 CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models
08:36 🎬 Faster Video Diffusion with Trainable Sparse Attention
09:23 🧠 Fractured Chain-of-Thought Reasoning
10:03 🧠 Neuro-Symbolic Query Compiler

【Follow Us】
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu (小红书): AI速递