The 20 papers covered in this episode:
00:23 📚 SurveyX: Academic Survey Automation via Large Language Models
01:10 🔍 LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
01:50 🚗 MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
02:28 🧬 Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model
03:12 🎨 PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data
03:55 🔗 VLM$^2$-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues
04:42 📌 SIFT: Grounding LLM Reasoning in Contexts via Stickers
05:27 🧠 LightThinker: Thinking Step-by-Step Compression
05:59 🗂 StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following
06:48 🛡 Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models
07:40 📚 KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
08:30 🧬 ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation
09:11 🧠 MoBA: Mixture of Block Attention for Long-Context LLMs
09:49 🤖 InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback
10:37 🧠 The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
11:20 📚 Evaluating Multimodal Generative AI with Korean Educational Standards
11:54 ⚠ Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
12:29 ⚡ One-step Diffusion Models with $f$-Divergence Distribution Matching
13:09 🧠 Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence
13:52 🧠 MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

【Follow Us】
You can also find us on the following platform for more content beyond the podcast:
Xiaohongshu: AI速递
