0611 MLSYS Paper Briefing: KV Cache Budget, Agentic RL

## Content Timestamps

- 00:00 Opening: 0611 MLSYS Paper Briefing

- Based on the arXiv paper retrieval and screening completed on the evening of 2026-06-10; links are not read aloud in the audio.

- 00:29 ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

- Institutions: Tsinghua University; City University of Hong Kong

- Tier rating (夯-to-拉 scale): 夯, top tier (Jeff champion)

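- Highlight: Right, the System Champion paper I picked is exactly what the field needs right now. Its title is ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models, and it comes from Tsinghua University and City University of Hong Kong.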

- 03:15 TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

- Institutions: Tsinghua University; Tencent

- Tier rating (夯-to-拉 scale): 夯, top tier (Ada champion)

- Highlight: The paper I picked is TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning, published jointly by Tsinghua University and Tencent.

- 06:00 EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

- Institutions: Shanghai Jiao Tong University; Princeton University

- Tier rating (夯-to-拉 scale): 顶级, "top-class" (Ada champion)

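- Highlight: The title is EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents. It is a collaboration between Shanghai Jiao Tong University and Princeton University, and the authors include Professor Mengdi Wang, whom we know well.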

- 08:47 Wrap-up

- Summary of this episode's key papers and follow-up reading priorities.

## Production Metadata

- Paper retrieval: 225 raw JSONL records; 225 new papers; 10 carried over from the backlog.

- Screening pipeline: 168 new candidates; 10 backlog candidates; 178 in coarse ranking; 20 in LLM fine-grained review; 3 featured in this episode; 2 mentioned briefly.

- LLM: gemini-3.5-flash; input 5169 tokens, output 1860 tokens, 7029 tokens in total.

- TTS: seed-tts-2.0; Jeff voice zh_male_m191_uranus_bigtts, Ada voice zh_female_yingyujiaoxue_uranus_bigtts; 34 turns, 3195 input characters, billed text 3195 words.