【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's biggest AI news.
Link 🔗www.xiaoyuzhoufm.com
【Contents】
The 15 papers in this episode are:
00:31 ⚗ MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models
01:08 🎭 DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
01:49 🧪 ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning
02:40 ⚡ HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation
03:22 🎬 SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model
04:10 🎮 Solaris: Building a Multiplayer Video World Model in Minecraft
05:20 🤖 GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
06:19 🎬 JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation
07:11 🌐 Image Generation with a Sphere Encoder
07:51 🧭 World Guidance: World Modeling in Condition Space for Action Generation
08:31 🔍 NanoKnow: How to Know What Your Language Model Knows
09:10 ⚡ DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
10:11 🧠 The Design Space of Tri-Modal Masked Diffusion Models
10:46 🔤 VecGlypher: Unified Vector Glyph Generation with Language Models
11:20 ⚡ SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

【Follow Us】
You can also find us on the following platforms for more than what the podcast covers:
Xiaohongshu: AI速递
