2025.11.26 | LLM "breeding" evolution framework open-sourced; MedSAM-3 understands clinical language for precise segmentation

11 minutes

The 15 papers in this episode:

00:17 🧬 GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

00:57 🔬 MedSAM3: Delving into Segment Anything with Medical Concepts

01:34 🔍 Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

02:03 🎨 iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation

02:38 🕺 SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

03:18 🔍 Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward

04:04 🤖 GigaWorld-0: World Models as Data Engine to Empower Embodied AI

04:44 🎯 Soft Adaptive Policy Optimization

05:14 🎬 UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers

05:55 🎯 SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

06:51 🎨 OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation

07:41 🎬 ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding

08:13 🖼 VQ-VA World: Towards High-Quality Visual Question-Visual Answering

09:06 🔍 HunyuanOCR Technical Report

09:48 🏙 MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts

[Follow Us]

You can also find us on the following platform for more information beyond the podcast content:

Xiaohongshu: AI速递