【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's biggest AI news.
Link: 🔗www.xiaoyuzhoufm.com
【Contents】
The 15 papers covered in this episode:
00:29 🧠 Query as Anchor: Scenario-Adaptive User Representation via Large Language Model
01:14 ⚛ Qute: Towards Quantum-Native Database
01:59 🧠 InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem
03:05 🔍 REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents
03:56 🚀 BitDance: Scaling Autoregressive Generative Models with Binary Tokens
04:38 🧠 Experiential Reinforcement Learning
05:24 🧠 Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings
06:21 🧩 UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model
07:13 🔍 BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents
08:18 🧠 LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models
09:02 🗣 Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision
10:00 🧠 Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
10:49 🎨 FireRed-Image-Edit-1.0 Techinical Report
11:26 🧬 Data Darwinism Part I: Unlocking the Value of Scientific Data for Pre-training
12:04 🌐 WebWorld: A Large-Scale World Model for Web Agent Training

【Follow Us】
You can also find us on the following platform for more beyond the podcast:
Xiaohongshu (小红书): AI速递
