A08: Virtualized Big Data: Reproducing Simulation Output on Demand
Student: Salvatore Di Girolamo (ETH Zurich)
Supervisor: Torsten Hoefler (ETH Zurich)
Abstract:
Scientific simulations are being pushed to the extreme in terms of the size and complexity of the problems they address, producing astonishing amounts of data. If the data is stored on disk, analysis applications can randomly access the simulation output. Yet storing massive amounts of simulation data is challenging, primarily because of high storage costs and because compute capabilities grow faster than storage capacities and bandwidths. In-situ analysis removes the storage costs, but applications lose random access.
We propose not to store the full simulation output but to produce it on demand. Our system, SDaVi, intercepts the I/O requests of both analysis tools and simulators, enabling data virtualization. This new paradigm allows us to explore the computation-storage tradeoff by trading computation power for storage space. Overall, SDaVi offers a viable path towards exa-scale scientific simulations by exploiting the growing computing power and relaxing the storage capacity requirements.
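The on-demand idea in the paragraph above can be illustrated with a toy sketch. This is not the system's actual implementation, which the abstract does not describe in detail. It assumes a deterministic simulator that can restart from a stored checkpoint; the names `simulate_step` and `VirtualizedOutput`, the checkpoint interval, and the LRU cache are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


def simulate_step(state):
    """Hypothetical deterministic simulator: advance the state by one timestep."""
    return (state * 31 + 7) % 1009


class VirtualizedOutput:
    """Toy model of virtualized output: keep sparse checkpoints, re-simulate the rest."""

    def __init__(self, initial_state, total_steps, checkpoint_interval, cache_size):
        self.total_steps = total_steps
        self.cache_size = cache_size
        self.cache = OrderedDict()   # small LRU cache of recently produced timesteps
        self.steps_computed = 0      # re-simulation work spent serving reads

        # Stand-in for intercepting the simulator's writes (an assumption):
        # run once and keep only every checkpoint_interval-th state.
        self.checkpoints = {0: initial_state}
        state = initial_state
        for t in range(1, total_steps + 1):
            state = simulate_step(state)
            if t % checkpoint_interval == 0:
                self.checkpoints[t] = state

    def read(self, timestep):
        """Stand-in for an intercepted analysis read: return the output of `timestep`."""
        if not 0 <= timestep <= self.total_steps:
            raise ValueError("timestep out of range")
        if timestep in self.cache:
            self.cache.move_to_end(timestep)   # cache hit: no recomputation
            return self.cache[timestep]

        # Miss: restart from the nearest stored checkpoint at or before the request.
        start = max(t for t in self.checkpoints if t <= timestep)
        state = self.checkpoints[start]
        for _ in range(start, timestep):
            state = simulate_step(state)
            self.steps_computed += 1

        self.cache[timestep] = state
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)     # evict the least recently used entry
        return state
```

In this sketch, a larger checkpoint interval or a smaller cache reduces what is stored but increases `steps_computed`, which is the computation-storage tradeoff in miniature.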
ACM-SRC Semi-Finalist: no
Poster: pdf
Two-page extended abstract: pdf