2025.05.20 | Chain-of-Model Learning improves efficiency; AdaptThink optimizes reasoning speed.

11 minutes

The 15 papers in this episode are:

00:23 🔗 Chain-of-Model Learning for Language Model

00:58 🤔 AdaptThink: Reasoning Models Can Learn When to Think

01:45 🧠 AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

02:21 🚀 Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

03:04 🖥 Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

03:43 🤔 Thinkless: LLM Learns When to Think

04:23 💡 Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

05:00 🧮 MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

05:39 ✨ Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation

06:15 🛡 FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA

07:00 🧩 Model Merging in Pre-training of Large Language Models

07:53 🤖 CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

08:36 🎬 Faster Video Diffusion with Trainable Sparse Attention

09:23 🧠 Fractured Chain-of-Thought Reasoning

10:03 🧠 Neuro-Symbolic Query Compiler

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content.

Xiaohongshu: AI速递