Aleksey Bilogur

PyTorch training guide

04/27/2022

The PyTorch Training Performance Guide is an introduction to, and a reference on, PyTorch training performance optimization.

Many of the large-scale deep learning models used in production today have training times measured in days. As a result, techniques for training these models quickly and effectively are an important area of optimization in applied settings.

Techniques discussed in this guide include mixed-precision training, distributed training, model pruning, gradient checkpointing, and just-in-time compilation.

This is a practitioner-oriented guide that grew out of a sequence of blog posts I wrote at Spell. It includes ample code samples and benchmarks demonstrating the concepts discussed.