AI Frontiers: From Mathematical Reasoning to Model Optimization

AI可可AI生活

8 minutes

[CL] OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

[Shanghai Jiao Tong University]

arxiv.org

---

[LG] Overtuning in Hyperparameter Optimization

[LMU Munich]

arxiv.org

---

[LG] Distilling Normalizing Flows

[University of Oregon & HSE University & Picsart AI Research]

arxiv.org

---

[LG] Gaussian Invariant Markov Chain Monte Carlo

[Google DeepMind & UCL]

arxiv.org