2025.05.14 | A New Zero-Shot Speech Synthesis Model; Multi-Dimensional Evaluation of LLM Instruction Following - HuggingFace Daily AI Paper Digest

The 8 papers in this episode are:

00:25 🗣 MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

01:00 🤖 A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

01:47 🎮 Measuring General Intelligence with Generated Games

02:29 🎦 SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation

03:14 🤖 NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

03:51 🔍 Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency

04:28 🇻🇳 ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation

05:04 📖 Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递