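2026.02.26 | Molecular graph generation tops 99% chemical validity for the first time; DreamID-Omni cuts the face-voice mismatch rate in multi-person clips to 8%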

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week it recaps the previous week's major AI news.

Link: 🔗 www.xiaoyuzhoufm.com

【Table of Contents】

The 15 papers in this episode are:

00:31 ⚗ MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models

01:08 🎭 DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

01:49 🧪 ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

02:40 ⚡ HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation

03:22 🎬 SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model

04:10 🎮 Solaris: Building a Multiplayer Video World Model in Minecraft

05:20 🤖 GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

06:19 🎬 JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

07:11 🌐 Image Generation with a Sphere Encoder

07:51 🧭 World Guidance: World Modeling in Condition Space for Action Generation

08:31 🔍 NanoKnow: How to Know What Your Language Model Knows

09:10 ⚡ DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

10:11 🧠 The Design Space of Tri-Modal Masked Diffusion Models

10:46 🔤 VecGlypher: Unified Vector Glyph Generation with Language Models

11:20 ⚡ SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递