2025.02.13 | A multilingual evaluation suite fills a gap; a dense-text image dataset challenges generative models. - HuggingFace Daily AI Paper Digest (每日AI论文速递)

The 20 papers in this episode are:

00:23 🌍 BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

01:08 📄 TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

01:48 🎥 Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

02:36 🎥 CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation

03:16 🖥 WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

04:06 ⚡ LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid

04:45 🧠 TransMLA: Multi-head Latent Attention Is All You Need

05:31 💼 Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance

06:23 📏 Distillation Scaling Laws

07:02 🚀 Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

07:52 🌍 SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation

08:25 🧠 LLM Pretraining with Continuous Concepts

09:09 🎭 Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance

09:52 🔍 NoLiMa: Long-Context Evaluation Beyond Literal Matching

10:39 🧠 Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing

11:15 📚 Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

11:58 🎥 Next Block Prediction: Video Generation via Semi-Autoregressive Modeling

12:43 🔄 DPO-Shift: Shifting the Distribution of Direct Preference Optimization

13:28 🧠 LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention

14:15 🛡 MetaSC: Test-Time Safety Specification Optimization for Language Models

[Follow Us]

You can also find us on the following platform for more beyond the podcast:

Xiaohongshu: AI速递