Orhan Firat
Papers
2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding