2025.08.29 | Stable Text-to-Image Generation; Efficient Mathematical Reasoning

8 minutes

The 15 papers in this episode are as follows:

00:24 ⚖ Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

00:57 🧠 rStar2-Agent: Agentic Reasoning Technical Report

01:28 🎨 USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

01:56 🚀 AWorld: Orchestrating the Training Recipe for Agentic AI

02:26 🎯 TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

02:54 🧠 Mixture of Contexts for Long Video Generation

03:17 🧠 CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification

03:51 🔍 MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

04:23 🎨 OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning

04:54 🛡 Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection

05:21 🧠 Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD

05:56 💃 Dress&Dance: Dress up and Dance as You Like It - Technical Preview

06:18 🎯 OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models

06:42 📷 Multi-View 3D Point Tracking

07:10 🎭 FakeParts: a New Family of AI-Generated DeepFakes

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast content.

Xiaohongshu: AI速递