DescriptionThe emergence of leadership-class systems with GPU-equipped nodes has the potential to vastly increase the performance of existing distributed applications. An increasing number of applications that are converted to run on these systems are reliant on algorithms that perform computations on spatial data. Algorithms that operate on spatial data, such as density-based clustering algorithms, present unique challenges in data partitioning and result aggregation when porting to extreme scale environments. These algorithms require that spatially-near data points are partitioned to the same node and that the output from individual nodes needs to be aggregated to ensure that relationships between partition boundaries are discovered. In the development of an extreme scale density-based clustering algorithm, called Mr. Scan, we leveraged the increased computational power provided by GPUs to overcome these challenges. Three main techniques allowed Mr. Scan to cluster 6.5 billion points from the Twitter dataset using 8,192 GPU nodes on Cray Titan in 7.5 minutes: (1) a data partitioner scheme that ensures correctness by using shadow regions to selectively duplicate computation between nodes to detect clusters that lie in-between partition boundaries, (2) a dense box algorithm that reduces the computational complexity of clustering dense spatial regions, and (3) a new tree based method for merging results with spatial dependencies that requires only a fraction of the intermediate results to produce an accurate final result. The pure computational power of the GPUs allowed us to effectively summarize the initial data, making the scalable use of these techniques possible.