2025.04.23 | Improved Arabic performance; significantly better performance on reasoning tasks.

11 minutes

The 15 papers in this episode are:

00:22 💡 Kuwain 1.5B: An Arabic SLM via Language Injection

00:58 🤖 TTRL: Test-Time Reinforcement Learning

01:40 🌍 The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

02:23 🖼 Describe Anything: Detailed Localized Image and Video Captioning

03:00 💡 Learning Adaptive Parallel Reasoning with Language Models

03:34 🖼 IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

04:19 📖 BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

05:10 🚀 Efficient Pretraining Length Scaling

05:49 🩻 CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

06:26 🖼 Personalized Text-to-Image Generation with Auto-Regressive Models

07:08 🗣 LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

07:47 🎬 Vidi: Large Multimodal Models for Video Understanding and Editing

08:27 🖼 From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

09:03 🤖 LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

09:44 🤖 WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递