The 8 papers in this episode are:
00:24 🤖 EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
00:58 🤖 VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
01:33 🤔 Virgo: A Preliminary Exploration on Reproducing o1-like MLLM
02:11 🤖 SDPO: Segment-Level Direct Preference Optimization for Social Agents
02:51 🎨 VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
03:31 🧬 Graph Generative Pre-trained Transformer
04:04 🌍 LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
04:44 🔬 BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery

[Follow us]
You can also find us on the following platforms for more information beyond the podcast itself:
Xiaohongshu: AI速递