本期的 17 篇论文如下:
00:26 🤖 AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents(AndroidLab:Android自主代理的训练与系统基准测试)
01:15 🌐 WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning(WebRL:通过自进化在线课程强化学习训练LLM网络代理)
01:55 🌐 Training-free Regional Prompting for Diffusion Transformers(无需训练的扩散变换器区域提示)
02:36 🌍 Survey of Cultural Awareness in Language Models: Text and Beyond(语言模型中的文化意识调查:文本与超越)
03:15 🤖 Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent(混元-大:腾讯开源的520亿激活参数模型)
03:52 📊 DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models(DynaMath:评估视觉语言模型数学推理鲁棒性的动态视觉基准)
04:29 🎥 How Far is Video Generation from World Model: A Physical Law Perspective(视频生成与世界模型有多远:物理定律视角)
05:08 ⚡ Adaptive Caching for Faster Video Generation with Diffusion Transformers(基于扩散变换器的自适应缓存加速视频生成)
05:48 🦖 DynaSaur: Large Language Agents Beyond Predefined Actions(DynaSaur:超越预定义动作的大型语言模型代理)
06:26 🎥 GenXD: Generating Any 3D and 4D Scenes(GenXD:生成任意3D和4D场景)
07:01 📊 Sparsing Law: Towards Large Language Models with Greater Activation Sparsity(稀疏化定律:迈向更大激活稀疏性的大语言模型)
07:45 📚 LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models(LIBMoE:大型语言模型中混合专家的综合基准库)
08:26 🎥 PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance(提示引导下的多样化视频序列理解)
09:08 ⚖ "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization(给我BF16还是给我死亡?LLM量化中的精度-性能权衡)
09:48 🌌 Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models(解码暗物质:用于解释基础模型中罕见概念的专用稀疏自编码器)
10:36 🎨 MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D(MVPaint:同步多视角扩散用于3D绘画)
11:14 🌍 Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks(天鹅与阿拉伯MTEB:方言感知、以阿拉伯语为中心、跨语言和跨文化的嵌入模型与基准)

【关注我们】
您还可以在以下平台找到我们,获得播客内容以外更多信息
小红书: AI速递
