2026.02.25 | Data Engineering Empowers Small Models; Lightweight Reranking Sets a New Long-Context SOTA

13 minutes

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the major AI news of the previous week.

Link: 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers covered in this episode:

00:29 🖥 On Data Engineering for Scaling LLM Terminal Capabilities

01:20 🧠 Query-focused and Memory-aware Reranker for Long Context Processing

02:12 🔗 From Perception to Action: An Interactive Benchmark for Vision Reasoning

03:04 🤖 PyVision-RL: Forging Open Agentic Vision Models via RL

03:52 📊 LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces

04:41 🔍 DREAM: Deep Research Evaluation with Agentic Metrics

05:39 📈 Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation

06:49 ⚙ QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models

07:35 🤖 Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

08:20 🚀 The Diffusion Duality, Chapter II: $Ψ$-Samplers and Efficient Curriculum

09:05 🧩 Communication-Inspired Tokenization for Structured Image Representations

10:02 🤖 Aletheia tackles FirstProof autonomously

10:42 ⚡ Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

11:34 ⚡ The Art of Efficient Reasoning: Data, Reward, and Optimization

12:13 🔒 Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递