2026.01.14 | Synthetic Data Raises a Low-Resource Star Student; AI Self-Played Multi-Turn Dialogue Is More Reliable

10 minutes

The 15 papers in this episode:

00:20 🌍 Solar Open Technical Report

00:54 🤖 User-Oriented Multi-Turn Dialogue Generation with Tool Use at Scale

01:39 🧠 MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

02:11 🖱 ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands

02:44 🧠 KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions

03:15 🏆 ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

04:07 🧠 Ministral 3 (the Ministral 3 model series)

04:51 ⚖ The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

05:31 🧭 VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory

06:24 🎬 End-to-End Video Character Replacement without Structural Guidance

07:06 🎬 Motion Attribution for Video Generation

07:36 🚀 SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

08:12 ⚖ JudgeRLVR: Judge First, Generate Second for Efficient Reasoning

08:46 📊 Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

09:25 🔍 Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递