【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗www.xiaoyuzhoufm.com
【Contents】
The 10 papers in this episode are:
00:28 🎬 ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
01:07 🎬 PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference
01:54 🧠 Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills
02:43 📊 RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation
03:53 🚗 LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset
04:42 🧠 Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models
05:25 🛠 Natural-Language Agent Harnesses
06:10 🎤 Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models
06:59 🔬 MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies
07:46 🚀 Diffutron: A Masked Diffusion Language Model for Turkish Language

【Follow Us】
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu (小红书): AI速递
