【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link: 🔗 www.xiaoyuzhoufm.com
【Contents】
The 15 papers in this episode are:
00:32 🔍 WildDet3D: Scaling Promptable 3D Detection in the Wild
01:39 🔧 FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios
02:25 🔍 RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
02:58 🔍 EXAONE 4.5 Technical Report
03:56 🎮 Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
04:47 ⚡ ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion
05:40 ♻ ELT: Elastic Looped Transformers for Visual Generation
06:25 🔍 VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images
07:19 🧠 Structured Causal Video Reasoning via Multi-Objective Alignment
08:11 ⚠ Backdoor Attacks on Decentralised Post-Training
08:58 🧠 AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents
09:55 ⚠ Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism
10:49 🎭 Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video
11:28 🔍 ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery
12:17 🔍 $p1$: Better Prompt Optimization with Fewer Prompts

【Follow Us】
You can also find us on the following platform for more than what is covered in the podcast:
Xiaohongshu: AI速递
