P19: MPI-GIS: An MPI System for Big Spatial Data
Session: Poster Reception
Author
Event Type: ACM Student Research Competition, Poster, Reception

Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: In recent times, geospatial datasets have been growing in size, complexity, and heterogeneity. High-performance systems are needed to analyze such data efficiently and produce actionable insights. For polygonal (a.k.a. vector) datasets, operations such as I/O, data partitioning, and communication become challenging in a cluster environment.
In this work, we present MPI-GIS equipped with MPI-Vector-IO, a parallel I/O library that we have designed using MPI-IO specifically for irregular polygonal (vector) data formats such as Well-Known Text (WKT), XML, etc. Our system can efficiently perform in-memory spatial indexing and join on datasets an order of magnitude larger than those handled in our previous work. It makes MPI aware of spatial data and spatial primitives, and provides support for spatial data types embedded within collective computation and communication using the MPI message-passing library. It takes less than 2 minutes to scan through 2.7 billion geometries in a 96GB file using 160 processes.
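To illustrate why irregular vector formats make parallel I/O awkward, here is a minimal sketch of one common way to split a newline-delimited WKT file among processes when records have variable length. It is not the MPI-Vector-IO implementation, which the abstract does not describe in detail. The names `read_partition` and `SAMPLE` are hypothetical, and ranks are simulated in a single process on an in-memory byte string. In an MPI program, each rank would typically fetch its byte range with an MPI-IO call such as `MPI_File_read_at`.

```python
import io

# Hypothetical sample data: one WKT geometry per line, with variable record lengths.
SAMPLE = (
    b"POINT (30 10)\n"
    b"LINESTRING (30 10, 10 30, 40 40)\n"
    b"POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))\n"
    b"POINT (1 2)\n"
)


def read_partition(data: bytes, rank: int, nprocs: int) -> list[bytes]:
    """Return the complete records owned by `rank` (assumed ownership rule:
    a record belongs to the rank whose byte range contains its first byte)."""
    size = len(data)
    chunk = size // nprocs
    start = rank * chunk
    end = size if rank == nprocs - 1 else start + chunk

    buf = io.BytesIO(data)
    if start > 0:
        # Step back one byte and consume through the next newline. If the
        # previous byte is a newline, a record begins exactly at `start`
        # and is kept; otherwise the partial record is skipped, because
        # the previous rank reads it in full.
        buf.seek(start - 1)
        buf.readline()

    records = []
    # Read every record that starts before `end`, even if it finishes
    # beyond the nominal range of this rank.
    while buf.tell() < end:
        line = buf.readline()
        if not line:
            break
        records.append(line.rstrip(b"\n"))
    return records


if __name__ == "__main__":
    for rank in range(3):
        print(rank, read_partition(SAMPLE, rank, 3))
```

Under this ownership rule every record is read by exactly one rank, regardless of where the byte boundaries fall. The cost is that a rank may read slightly past its nominal range to finish its last record.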