Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
2022
Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zheng, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro