2025.11.27 | Russian-language multimodal evaluation fills a gap; latent collaboration brings a 14% speedup

11 minutes

The 15 papers in this episode:

00:22 🔍 Multimodal Evaluation of Russian-language Architectures

01:15 🧠 Latent Collaboration in Multi-Agent Systems

01:47 🌍 Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation

02:18 🎭 Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy

03:10 📄 NVIDIA Nemotron Parse 1.1

03:46 🧠 Monet: Reasoning in Latent Visual Space Beyond Images and Language

04:25 ⚡ Terminal Velocity Matching

05:03 📊 Revisiting Generalization Across Difficulty Levels: It's Not So Easy

05:42 🤖 MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots

06:25 ⚡ Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs

06:59 🎮 UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

07:47 🧩 SPHINX: A Synthetic Environment for Visual Perception and Reasoning

08:33 ⚡ Block Cascading: Training Free Acceleration of Block-Causal Video Models

09:12 🏙 RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale

09:58 📊 I-GLIDE: Input Groups for Latent Health Indicators in Degradation Estimation

【Follow Us】

You can also find us on the following platform for more beyond the podcast itself:

Xiaohongshu: AI速递