The 14 papers in this episode:
00:22 🤖 Qwen2.5 Technical Report
01:00 🧠 Progressive Multimodal Reasoning via Active Retrieval
01:39 🌐 MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
02:26 🧠 LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
03:15 📊 How to Synthesize Text Data without Model Collapse?
03:56 🌊 Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
04:37 🎥 LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis
05:20 🖼 Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
06:05 🌐 DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation
06:46 🧠 AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
07:33 🧠 Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception
08:14 🖼 UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
08:54 🧪 TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation
09:36 🕺 Move-in-2D: 2D-Conditioned Human Motion Generation

【Follow Us】
You can also find us on the following platform for more than what the podcast covers
Xiaohongshu: AI速递
