Noam Shazeer
Papers
2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
2022
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
ST-MoE: Designing Stable and Transferable Sparse Expert Models