2025.08.27 | Physics Benchmark Exposes Model Shortcomings; Tree-Based Optimization Boosts Efficiency and Cuts Cost - HuggingFace Daily AI Paper Digest

The 15 papers in this episode are as follows:

00:23 🔬 CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

00:57 🌳 TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

01:21 🗣 VibeVoice Technical Report

01:45 🔨 VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

02:13 💡 Spacer: Towards Engineered Scientific Inspiration

02:45 🧠 OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

03:10 🧠 UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning

03:36 ⚡ Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

04:04 🎥 Autoregressive Universal Video Segmentation Model

04:30 🎬 Wan-S2V: Audio-Driven Cinematic Video Generation

04:56 🎬 CineScale: Free Lunch in High-Resolution Cinematic Visual Generation

05:22 🔷 FastMesh: Efficient Artistic Mesh Generation via Component Decoupling

05:45 📊 ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks

06:13 🧠 ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models

06:42 🧠 MovieCORE: COgnitive REasoning in Movies

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递