2025.06.19 | Sekai dataset advances video generation; prototype reasoning improves LLM generalization. - HuggingFace Daily AI Paper Digest

The 15 papers in this episode are:

00:22 🌍 Sekai: A Video Dataset towards World Exploration

01:02 💡 ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs

01:43 💡 GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

02:24 🗣 BUT System for the MLC-SLM Challenge

03:10 🤖 Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

03:57 💡 Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation

04:43 🔬 SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

05:26 🚀 Truncated Proximal Policy Optimization

06:04 🖼 PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers

06:37 🖼 CoMemo: LVLMs Need Image Context with Image Memory

07:21 🤖 SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence

08:01 🧠 MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models

08:45 🛡 OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents

09:34 🏞 ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

10:09 🤝 FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递