This is the AI Agent Paper Briefing for 2026-06-02. This episode was generated automatically by AI; comments and discussion are welcome.
Highlights of this episode
- SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence
- Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams
- SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents
Today's main thread: the agent stack is shifting from piling up capabilities toward diagnosis, governance, and runtime evolution.
