William Fedus

Papers

2022

  • Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
  • ST-MoE: Designing Stable and Transferable Sparse Expert Models

