2025.04.23 | Improved Arabic performance; significantly better performance on reasoning tasks. - HuggingFace Daily AI Paper Digest

The 15 papers in this episode are as follows:

00:22 💡 Kuwain 1.5B: An Arabic SLM via Language Injection

00:58 🤖 TTRL: Test-Time Reinforcement Learning

01:40 🌍 The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

02:23 🖼 Describe Anything: Detailed Localized Image and Video Captioning

03:00 💡 Learning Adaptive Parallel Reasoning with Language Models

03:34 🖼 IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

04:19 📖 BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

05:10 🚀 Efficient Pretraining Length Scaling

05:49 🩻 CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

06:26 🖼 Personalized Text-to-Image Generation with Auto-Regressive Models

07:08 🗣 LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

07:47 🎬 Vidi: Large Multimodal Models for Video Understanding and Editing

08:27 🖼 From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

09:03 🤖 LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

09:44 🤖 WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递