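Have you ever wondered how we can truly trust an AI? In this episode, we start from several recent papers to see how an AI can learn to humbly admit "I'm not sure," and how we can see through its explanations to the real "motives" behind them. We'll also talk about how to give AI a more powerful "zoomable" memory, and how to tune its behavior as precisely as a conductor. Get ready to uncover AI's deeper secrets together!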
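How is a more "honest" AI made?
Fitting a transparent window into AI's black box
AI "mind reading": can we really understand what it is thinking?
AI's memory problem and the "zoomable" library
How to "nitpick" the right way: a communication method that makes robots smarter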
Papers covered in this episode:
[CL] Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs
[Yale University & Google Research]
---
[CL] Introspective Coupling: Self-Explanation Training Tracks Behavioral Change Despite Fixed Supervision
[MIT]
---
[LG] Surrogate Fidelity: When Can Open LLMs Explain Closed Ones?
[Meta]
---
[CL] SeKV: Resolution-Adaptive KV Cache with Hierarchical Semantic Memory for Long-Context LLM Inference
[University of British Columbia & Microsoft Research]
---
[RO] Freeform Preference Learning for Robotic Manipulation
[Stanford University]
![[AI Frontiers for Everyone] From Metacognition and Introspective Coupling to Multi-Dimensional Feedback](https://image.xyzcdn.net/FqWpK8fpivLboaqBbRHUe_BCOvxu.png@small)