Efficient neural network training on Intel Xeon-based supercomputers
Vikram Saletore and Luke Wilson discuss a collaboration between SURFsara and Intel to advance the state of large-scale neural network training on Intel Xeon CPU-based servers, highlighting improved time to solution for extended training of pretrained models and exploring how various storage and interconnect options lead to more efficient scaling.
| Talk Title | Efficient neural network training on Intel Xeon-based supercomputers |
| Speakers | Vikram Saletore (Intel), Lucas Wilson (Dell EMC) |
| Conference | Artificial Intelligence Conference |
| Conf Tag | Put AI to Work |
| Location | San Francisco, California |
| Date | September 5-7, 2018 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Vikram Saletore and Luke Wilson discuss a collaboration between SURFsara and Intel, part of the Intel Parallel Computing Center initiative, to advance the state of large-scale neural network training on Intel Xeon CPU-based servers. SURFsara and Intel evaluated a number of data-parallel and model-parallel approaches, as well as synchronous versus asynchronous SGD, with popular neural networks such as ResNet50, using large datasets on the TACC (Texas Advanced Computing Center) and Dell HPC supercomputers. Vikram and Luke share insights on several best-known methods, including CPU core pinning, memory pinning, and hyperparameter tuning, that were developed to demonstrate state-of-the-art top-1/top-5 accuracy at scale. They then detail real-world problems that can be solved by models efficiently trained at large scale, presenting tests performed at Dell EMC on CheXNet, a Stanford University project that extends a DenseNet model pretrained on the large-scale ImageNet dataset to detect pathologies, including pneumonia, in chest X-ray images. Vikram and Luke highlight improved time to solution for extended training of this pretrained model and the various storage and interconnect options that lead to more efficient scaling.
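To make the synchronous, data-parallel SGD approach concrete, below is a minimal sketch using Horovod with tf.keras. The talk does not specify the speakers' exact software stack, so Horovod, the learning-rate scaling factor, and the optimizer settings here are illustrative assumptions: each worker computes gradients on its own data shard, and an allreduce averages them before every update, keeping all replicas in lockstep.

```python
# Sketch: synchronous data-parallel SGD with Horovod and tf.keras.
# Horovod and all hyperparameters here are illustrative assumptions,
# not the configuration used by the speakers.
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()

model = tf.keras.applications.ResNet50(weights=None, classes=1000)

# Linear learning-rate scaling: base LR multiplied by the worker count,
# a common heuristic for keeping large-batch synchronous SGD stable.
opt = tf.keras.optimizers.SGD(learning_rate=0.1 * hvd.size(), momentum=0.9)

# Wrap the optimizer so gradients are allreduce-averaged across workers
# before every weight update (synchronous SGD).
opt = hvd.DistributedOptimizer(opt)

model.compile(optimizer=opt,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [
    # Broadcast initial weights from rank 0 so all replicas start identical.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# `train_ds` is a hypothetical tf.data pipeline sharded per worker, e.g.
# dataset.shard(hvd.size(), hvd.rank()); launch with `horovodrun -np N ...`.
# model.fit(train_ds, epochs=90, callbacks=callbacks)
```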
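The CPU core and memory pinning best-known methods mentioned above typically take the form of OpenMP affinity settings plus a matching TensorFlow thread-pool configuration. The sketch below shows the general pattern on an Intel Xeon node; the thread counts are workload-dependent assumptions, not the speakers' numbers.

```python
# Sketch: CPU core pinning and thread-pool settings for TensorFlow on an
# Intel Xeon node. Values are assumptions; tune per core count and model.
import os

# Bind OpenMP threads to physical cores so they do not migrate across
# sockets, preserving cache locality and NUMA-local memory access.
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"
os.environ["KMP_BLOCKTIME"] = "1"     # let threads sleep quickly between ops
os.environ["OMP_NUM_THREADS"] = "24"  # e.g., one thread per physical core

import tensorflow as tf  # import after the env vars so they take effect

# Align TensorFlow's thread pools with the OpenMP configuration above.
tf.config.threading.set_intra_op_parallelism_threads(24)
tf.config.threading.set_inter_op_parallelism_threads(2)
```

NUMA-local memory placement is usually enforced at launch time as well, for example with `numactl --cpunodebind`/`--membind` and one process per socket.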
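For the CheXNet use case, the published project extends a DenseNet-121 pretrained on ImageNet with a 14-way multi-label sigmoid head for the ChestX-ray14 pathology labels. The sketch below illustrates that setup in tf.keras; the input size and optimizer settings are assumptions based on the CheXNet paper rather than details from this talk.

```python
# Sketch: CheXNet-style transfer learning in tf.keras. The 14-label
# multi-label head follows the ChestX-ray14 dataset used by CheXNet;
# input size and optimizer settings are assumptions from the paper.
import tensorflow as tf

NUM_PATHOLOGIES = 14  # ChestX-ray14 findings, pneumonia among them

# DenseNet-121 backbone pretrained on ImageNet, without its classifier.
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Independent sigmoid per pathology: one X-ray can show several findings.
pooled = tf.keras.layers.GlobalAveragePooling2D()(base.output)
outputs = tf.keras.layers.Dense(NUM_PATHOLOGIES, activation="sigmoid")(pooled)
model = tf.keras.Model(base.input, outputs)

# Binary cross-entropy per label, as in multi-label classification.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy")
```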