Model Parallelism using Transformers and PyTorch
Taking advantage of multiple GPUs to train larger models such as RoBERTa-Large on NLP datasets
Published in msakthiganesh, Jan 26, 2021