2024.09.30 Daily AI Papers | Emu3 simplifies multimodal design, MIO improves video understanding performance. - HuggingFace Daily AI Papers Express

The 9 papers in this episode are as follows:

00:24 🧠 Emu3: Next-Token Prediction is All You Need

00:53 🌐 MIO: A Foundation Model on Multimodal Tokens

01:26 🔍 VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

02:21 🎥 PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

03:05 🔄 Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult

03:46 📄 MinerU: An Open-Source Solution for Precise Document Content Extraction

04:24 🤖 MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

05:01 🤖 A Survey on the Honesty of Large Language Models

05:45 📊 LML: Language Model Learning a Dataset for Data-Augmented Prediction

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu: AI速递