[AI Frontiers Everyone Can Understand] From Metacognition and Introspective Coupling to Multidimensional Feedback

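Have you ever wondered how we can truly trust an AI? In this episode, we start from several recent papers to see how an AI can learn to humbly admit "I'm not sure," and how we can see through its explanations to the real motives behind them. We will also talk about how to give AI a more powerful "zoomable" memory, and how to tune its behavior as precisely as a conductor leads an orchestra. Get ready to uncover the deeper secrets of AI together!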

00:00:27 How is a more "honest" AI made?

00:05:47 Fitting a transparent window into AI's black box

00:11:35 AI "mind reading": can we really understand what it is thinking?

00:17:14 AI's memory problem and the "zoomable" library

00:22:22 How to "find fault" the right way: a communication method that makes robots smarter

Papers covered in this episode:

[CL] Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

[Yale University & Google Research]

---

[CL] Introspective Coupling: Self-Explanation Training Tracks Behavioral Change Despite Fixed Supervision

[MIT]

---

[LG] Surrogate Fidelity: When Can Open LLMs Explain Closed Ones?

[Meta]

---

[CL] SeKV: Resolution-Adaptive KV Cache with Hierarchical Semantic Memory for Long-Context LLM Inference

[University of British Columbia & Microsoft Research]

---

[RO] Freeform Preference Learning for Robotic Manipulation

[Stanford University]