【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's biggest AI news.
Link: 🔗 www.xiaoyuzhoufm.com
【Contents】
This episode covers the following 13 papers:
00:33 🧠 Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
01:21 🌍 Advancing Open-source World Models
01:55 🧠 DeepSeek-OCR 2: Visual Causal Flow
02:58 🚀 Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning
03:49 🔬 Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
04:34 🔄 Linear representations in language models can change dramatically over a conversation
05:26 🚀 SERA: Soft-Verified Efficient Repository Agents
06:01 🤖 OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
06:46 🤖 GDCNet: Generative Discrepancy Comparison Network for Multimodal Sarcasm Detection
07:37 🗣 SE-DiCoW: Self-Enrolled Diarization-Conditioned Whisper
08:27 📊 RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation
09:16 ✏ SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation
10:07 🚀 UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders

【Follow Us】
You can also find us on the following platform for more than what the podcast covers:
Xiaohongshu: AI速递
