2026.02.05 | ERNIE 5.0统一模态;FASA稀疏注意力省内存

2026.02.05 | ERNIE 5.0统一模态;FASA稀疏注意力省内存

13分钟 ·
播放数113
·
评论数0

【赞助商】

通勤路上就听AI每周谈。AI每周谈,每周带你回顾上周AI大事

传送门 🔗www.xiaoyuzhoufm.com

【目录】

本期的 15 篇论文如下:

00:29 🧠 ERNIE 5.0 Technical Report(ERNIE 5.0 技术报告)

01:11 ⚡ FASA: Frequency-aware Sparse Attention(FASA:基于频率感知的稀疏注意力机制)

02:01 📊 Training Data Efficiency in Multimodal Process Reward Models(多模态过程奖励模型中的训练数据效率研究)

02:44 🤖 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning(WideSeek-R1:通过多智能体强化学习探索宽度扩展以实现广泛信息检索)

03:28 ⚡ OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models(OmniSIFT:面向高效全模态大语言模型的模态非对称令牌压缩)

04:21 ⚡ HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing(HySparse:一种具有预言机令牌选择和KV缓存共享的混合稀疏注意力架构)

05:02 🤖 EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models(EgoActor:通过视觉语言模型将任务规划落地为空间感知的具身动作)

06:05 🎬 Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization(Quant VideoGen:通过2位KV缓存量化实现自回归长视频生成)

06:59 🤖 SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation(SoMA:面向机器人软体操作的真实到仿真神经模拟器)

07:44 🔍 TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents(TIDE:基于轨迹的LLM智能体测试时改进诊断评估)

08:21 🧠 Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers(语义路由:探索扩散变换器中多层LLM特征加权的融合框架)

09:12 🤖 Rethinking the Trust Region in LLM Reinforcement Learning(重新思考大语言模型强化学习中的信任区域)

09:54 ♻ Residual Context Diffusion Language Models(残差上下文扩散语言模型)

10:40 🧱 HY3D-Bench: Generation of 3D Assets(HY3D-Bench:3D资产的生成)

11:34 🎨 AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations(AutoFigure:生成与优化可直接用于发表的科学插图)

【关注我们】

您还可以在以下平台找到我们,获得播客内容以外更多信息

小红书: AI速递