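Hello everyone, and welcome to "Hugging Face Daily AI Paper Express". Today is July 30, 2024, and we will take you on a quick tour of today's 19 trending AI papers, spanning multilingual large language models, robot learning, video generation, and other frontier areas. Let's dive right in.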
00:25 🌏 SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages
01:11 🤖 Theia: Distilling Diverse Vision Foundation Models for Robot Learning
01:51 🎥 FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
02:32 📜 SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
03:10 🧠 Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
03:51 🌐 Mixture of Nested Experts: Adaptive Processing of Visual Tokens
04:34 🧠 MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
05:13 🔍 Diffusion Feedback Helps CLIP See Better
05:53 📊 MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains
06:28 🧩 Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models
07:09 🔄 Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle
07:50 🏙 3D Question Answering for City Scene Understanding
08:31 📊 VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks
09:10 🤖 Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
09:53 📚 ATHAR: A High-Quality and Diverse Dataset for Classical Arabic to English Translation
10:40 📷 Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
11:16 🐕 WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds
12:02 🔍 TAPTRv2: Attention-based Position Update Improves Tracking Any Point
12:52 📊 Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models

【Follow Us】
You can also find us on the following platform for more information beyond the podcast.
Xiaohongshu: AI速递