2025.04.08 | Minute-long AI video generation; small models outperform large models

11 minutes

The 15 papers in this episode are:

00:21 🎬 One-Minute Video Generation with Test-Time Training

01:03 💡 SmolVLM: Redefining small and efficient multimodal models

01:39 🖼 URECA: Unique Region Caption Anything

02:17 🧰 T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

03:02 🖼 Concept Lancet: Image Editing with Compositional Representation Transplant

03:41 🤔 Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models

04:26 📰 LiveVQA: Live Visual Knowledge Seeking

05:08 🎨 Gaussian Mixture Flow Matching Models

05:47 💡 VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

06:26 🕵 Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs

07:17 🧰 DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models

07:54 ⚕ Clinical ModernBERT: An efficient and long context encoder for biomedical text

08:28 🐍 Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation

09:12 🤖 BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation

09:48 🛡 JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model

[Follow Us]

You can also find us on the following platform for more beyond the podcast:

Xiaohongshu: AI速递