2026.01.30 | Spatial intelligence benchmarks fall short; Idea2Story drafts a paper in one click

13 minutes

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week it recaps the previous week's major AI news.

Link 🔗 www.xiaoyuzhoufm.com

【Contents】

The 15 papers in this episode:

00:29 🧭 Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models

01:21 🧠 Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

02:19 ⚡ Scaling Embeddings Outperforms Scaling Experts in Language Models

02:58 🔍 OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

03:39 🤖 DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

04:33 🧠 MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

05:20 🔺 PLANING: A Loosely Coupled Triangle-Gaussian Framework for Streaming 3D Reconstruction

06:08 🧠 ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

07:01 🧩 AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts

07:43 🧠 Exploring Reasoning Reward Model for Agents

08:39 🎤 Qwen3-ASR Technical Report

09:27 🚀 Language-based Trial and Error Falls Behind in the Era of Experience

10:16 🌐 Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models

11:02 ⚡ Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening

11:59 🧠 MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models

【Follow Us】

You can also find us on the following platforms for more beyond the podcast:

Xiaohongshu: AI速递