The 21 papers in this episode are:
00:25 🤖 SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
01:10 🧠 Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
01:55 🤔 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
02:38 ⚡ Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
03:19 🚀 Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation
03:57 🤖 Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
04:38 🧠 ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
05:28 🌐 EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
06:11 🧠 LM2: Large Memory Models
06:57 🧠 The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering
07:50 🪆 Matryoshka Quantization
08:35 🎥 Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT
09:22 🎥 History-Guided Video Diffusion
10:12 🎥 CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers
10:59 ⚡ APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
11:38 ⏱ Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
12:21 🤖 MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents
13:03 🚀 Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
13:47 🧠 The Curse of Depth in Large Language Models
14:24 🎨 DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
15:14 🎨 Dual Caption Preference Optimization for Diffusion Models

【Follow Us】
You can also find us on the following platform for more information beyond the podcast content.
Xiaohongshu (小红书): AI速递
