A podcast discussion about GShard, a module for scaling neural networks using conditional computation and automatic sharding, focusing on its application to multilingual machine translation.

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
6 minutes