2024.08.16 Daily AI Papers | Reinforcement Learning for Theorem Proving, a New Method for LLM Self-Alignment

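Hello everyone, and welcome to "Hugging Face 每日AI论文速递" (Hugging Face Daily AI Paper Express). Today is August 16, 2024, and we will take you on a quick tour of 12 trending AI papers, spanning frontier areas from LLM self-alignment and dataset distillation to knowledge graph training and video generation. Now, let's dive straight into the papers.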

00:25 🔍 DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

01:05 🔄 I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

01:49 🔍 Heavy Labels Out! Dataset Distillation with Label Space Lightening

02:31 🧠 Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

03:05 🧠 Towards flexible perception with visual memory

03:43 🧠 FuseChat: Knowledge Fusion of Chat Models

04:26 🌉 MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing

05:02 🎥 FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

05:47 🔊 Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization

06:31 🤝 The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community

07:15 🔄 BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

07:56 🤖 Can Large Language Models Understand Symbolic Graphics Programs?

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递