2026.06.30 | Editing video streams in real time; scaling the horizon beats scaling parameters.


15 minutes

【Sponsor】
OpenClaw Briefing
Five minutes a day with OpenClaw Briefing to catch up on the latest developments and industry discussion.
Link: www.xiaoyuzhoufm.com

【Contents】
The 15 papers covered in this episode:

[00:32] 🎬 LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing
[01:20] 🧠 Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
[02:15] 🤔 Agentic Abstention: Do Agents Know When to Stop Instead of Act?
[03:09] 💻 TUA-Bench: A Benchmark for General-Purpose Terminal-Use Agents
[04:12] 🗿 Trimming the Long-Tail of Visual World Modeling Evaluation
[05:06] 🧠 Video-MME-Logical: A Controlled Diagnostic Benchmark for Video Temporal-Logical Reasoning
[05:57] 📊 Beyond IID: How General Are Tabular Foundation Models, Really?
[06:40] 🏭 AsyncOPD: How Stale Can On-Policy Distillation Be?
[07:38] 🧠 ReFreeKV: Towards Threshold-Free KV Cache Compression
[08:38] 📱 Monte Carlo Energy Aggregation for Mobile 3D Gaussian Splatting
[09:39] 🔧 TACO: Tool-Augmented Credit Optimization for Agentic Tool Use (a GRPO variant for agentic tool use)
[10:33] 🎥 Bridging VideoQA and Video-Guided Agentic Tasks via Generalized Keyframe Extraction
[11:36] 🔍 Interleaved Speech Language Models Latently Work In Text
[12:27] 🤖 OSWorld2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
[13:15] 🌍 DreamForge-World 0.1 Preview: A Low-Compute Real-Time Controllable World Model

【Follow us】
You can also find us on the following platforms for more than what the podcast covers:
Xiaohongshu (小红书): AI速递