A19: Performance Analysis of a Parallelized Restricted Boltzmann Machine Artificial Neural Network Using OpenACC Framework and TAU Profiling System
Student: Abhishek Kumar (Brookhaven National Laboratory)
Supervisor: Abid Malik (Brookhaven National Laboratory)
Abstract: Restricted Boltzmann Machines are stochastic neural networks that model a probability distribution based on the connection weights between nodes of the hidden and visible layers. This makes them well suited to classifying large amounts of data, which could be useful in work settings such as a research lab. Parallelizing these neural networks would allow data to be classified much faster than before. Using a high-performance computer, it was determined that parallelizing the network could decrease the runtime of the algorithm by over 35% when offloading the work to a GPU through OpenACC. Using the Tuning and Analysis Utilities (TAU) profiling system, it was found that scheduling the program is effective only if the data size is large enough, and that increasing the number of thread blocks used for scheduling yields greater performance gains than increasing the number of threads in each thread block.
ACM-SRC Semi-Finalist: no
Poster: pdf
Two-page extended abstract: pdf