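00:01:30 Can AI's seat belt really be fastened securely?
00:06:39 Doing the math on AI: where are the limits of its capabilities?
00:11:34 Click "like" and the AI behaves? It's not that simple
00:16:07 What separates experts from everyone else is not the answer but the "checklist"
00:20:26 Want to get things done? Stop fixating on details and try "chunking" your actions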
The five papers covered in this episode:
[LG] On the Impossibility of Separating Intelligence from Judgment: The Computational Intractability of Filtering for AI Alignment
[Ludwig-Maximilians-Universität in Munich & UC Berkeley]
---
[CL] Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models
[Stanford University & VianAI Systems]
---
[LG] Principled Foundations for Preference Optimization
[Google DeepMind]
---
[LG] PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving
[Google]
---
[LG] Reinforcement Learning with Action Chunking
[UC Berkeley]