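[Month-End Special] January's Hottest AI Papers | DeepSeek-R1 Boosts LLM Reasoning with Reinforcement Learning; A Breakthrough in Long-Context Processing - HuggingFace Daily AI Paper Digest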

The 10 papers in this episode are:

00:40 TOP1(🔥281) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

03:13 TOP2(🔥271) | ⚡ MiniMax-01: Scaling Foundation Models with Lightning Attention

05:36 TOP3(🔥249) | 🧠 rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

08:13 TOP4(🔥103) | 🧠 Evolving Deeper LLM Thinking

10:28 TOP5(🔥99) | 📚 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

12:51 TOP6(🔥90) | 🚀 REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

15:15 TOP7(🔥90) | 🧠 Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

17:14 TOP8(🔥89) | 📊 The Lessons of Developing Process Reward Models in Mathematical Reasoning

19:33 TOP9(🔥88) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

21:35 TOP10(🔥87) | 🧠 The GAN is dead; long live the GAN! A Modern GAN Baseline

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递