【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's biggest AI news.
Link: 🔗 www.xiaoyuzhoufm.com
【Contents】
The 15 papers in this episode:
00:33 🤖 ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
01:22 🛡 THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
02:18 🧠 TTCS: Test-Time Curriculum Synthesis for Self-Evolving
03:09 🍌 PaperBanana: Automating Academic Illustration for AI Scientists
03:51 🔬 FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation
04:40 🧠 ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought
05:22 🎯 SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization
06:02 🎯 DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
07:08 🧠 Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification
07:55 📄 PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
08:45 🎬 DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning
09:42 🧠 MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
10:24 🦢 Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
11:13 📊 Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling
12:00 ⚡ RM-RF: Reward Model for Run-Free Unit Test Evaluation

【Follow Us】
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递
