2025.12.24 | Semantic Blueprints Speed Up Video Generation; Layer-by-Layer Analysis Forges Stronger Policies

11 minutes

The 15 papers covered in this episode:

00:19 🎬 SemanticGen: Video Generation in Semantic Space

01:01 🔍 Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

01:48 🧠 SpatialTree: How Spatial Abilities Branch Out in MLLMs

02:23 🤖 LongVideoAgent: Multi-Agent Reasoning with Long Videos

03:06 🧠 MemEvolve: Meta-Evolution of Agent Memory Systems

03:46 🔍 Step-DeepResearch Technical Report

04:22 🎧 SAM Audio: Segment Anything in Audio

05:00 🚀 INTELLECT-3: Technical Report

05:30 🔍 FaithLens: Detecting and Explaining Faithfulness Hallucination

06:07 🧠 Reinforcement Learning for Self-Improving Agent with Skill Library

06:53 📊 QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models

07:38 🔊 Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems

08:18 🧠 Active Intelligence in Video Avatars via Closed-loop World Modeling

08:55 🔬 Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

09:32 ⚠ Toxicity Ahead: Forecasting Conversational Derailment on GitHub

【Follow Us】

You can also find us on the following platforms for more than what the podcast covers:

Xiaohongshu (小红书): AI速递