2026.04.13 | WildDet3D Trained on a Million Images; FORGE Distills Industrial Data into a Small but Mighty Model

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week, AI每周谈 recaps the previous week's major AI news.

Link: 🔗www.xiaoyuzhoufm.com

【Contents】

The 15 papers in this episode are:

00:32 🔍 WildDet3D: Scaling Promptable 3D Detection in the Wild

01:39 🔧 FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

02:25 🔍 RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

02:58 🔍 EXAONE 4.5 Technical Report

03:56 🎮 Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

04:47 ⚡ ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion

05:40 ♻ ELT: Elastic Looped Transformers for Visual Generation

06:25 🔍 VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

07:19 🧠 Structured Causal Video Reasoning via Multi-Objective Alignment

08:11 ⚠ Backdoor Attacks on Decentralised Post-Training

08:58 🧠 AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

09:55 ⚠ Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

10:49 🎭 Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

11:28 🔍 ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery

12:17 🔍 $p1$: Better Prompt Optimization with Fewer Prompts

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递