2026.02.16 | Feature Activations Fill In Missing Data; Region Distillation Hides the Zoom

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week, it recaps the previous week's major AI news.

Link: 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers covered in this episode:

00:30 🧠 Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

01:19 🔍 Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

02:03 🏥 MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

02:43 🎯 OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

03:29 🔬 What does RL improve for Visual Reasoning? A Frankenstein-Style Analysis

04:18 🤖 RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models

05:05 🤖 ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning

05:53 🎬 Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

06:55 🤝 Intelligent AI Delegation

07:49 📍 GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

08:39 ⚙ BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

09:37 ⚡ FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching

10:14 🔍 On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs

11:03 ⚡ DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels

11:48 ⚡ CoPE-VideoLM: Codec Primitives For Efficient Video Language Models

【Follow Us】

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递