2024.10.08 Daily AI Papers | Differential Transformer refines attention; LLM hallucination study reveals error patterns. - HuggingFace Daily AI Paper Digest

The 21 papers in this episode:

00:26 🔍 Differential Transformer

01:04 🧠 LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

01:50 📹 VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

02:28 📈 FAN: Fourier Analysis Networks

03:05 🏥 Named Clinical Entity Recognition Benchmark

03:37 🔬 ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

04:19 🎶 UniMuMo: Unified Text, Music and Motion Generation

04:55 🔍 TLDR: Token-Level Detective Reward Model for Large Vision Language Models

05:35 🎵 Presto! Distilling Steps and Layers for Accelerating Music Generation

06:08 🖥 Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

06:49 🖼 OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

07:29 🌀 MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

08:09 🧠 LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

08:50 📊 MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

09:39 📊 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

10:34 🤖 Autonomous Character-Scene Interaction Synthesis from Text Instruction

11:12 🧩 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles

12:00 🤖 Grounding Language in Multi-Perspective Referential Communication

12:48 🎯 SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

13:25 🧩 What Matters for Model Merging at Scale?

14:02 📊 SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递