[P] Scaling PyTorch Transformer-XL training to 128 GPUs
code: https://github.com/cybertronai/transformer-xl
experiments: https://medium.com/south-park-commons/scaling-transformer-xl-to-128-gpus-85849508ec35
submitted by /u/yaroslavvb