2026.06.15 | Precise Camera Control for AI Video; Fine-Grained Optimization of Agent Reasoning

16 minutes · 112 plays · 0 comments

【Sponsor】
OpenClaw Brief
Five minutes a day with OpenClaw Brief to catch up on the latest developments and industry discussion.
Link: www.xiaoyuzhoufm.com

【Contents】
The 15 papers in this episode are:

[00:30] 🎥 OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data
[01:25] 🤖 APPO: Agentic Procedural Policy Optimization
[02:23] 🧠 Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents
[03:16] 🤖 From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI
[04:12] 🎼 Orchestra-o1: Omnimodal Agent Orchestration
[05:12] 🔧 HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry
[06:16] 🎥 Rethinking RAG in Long Videos: What to Retrieve and How to Use It?
[07:19] 🎬 OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains
[08:36] 🤖 From AGI to ASI
[09:34] 🧠 Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO
[10:32] 🛡 RedAct: Redacting Agent Capability Traces for Procedural Skill Protection
[11:21] 👁 LLM Agents Can See Code Repositories
[12:14] 🩺 Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
[13:15] 🔄 Skip a Layer or Loop It? Learning Program-of-Layers in LLMs
[14:12] 🎨 RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space

【Follow Us】
You can also find us on the following platforms for more beyond the podcast itself:
Xiaohongshu: AI速递