SC17 Denver, CO

Big Data and Exascale Computing (BDEC) Community Report

Authors: Prof. Jack Dongarra (University of Tennessee)

Abstract: The emergence of large-scale data analytics in a wide variety of scientific fields, and the explosive growth of data generated in edge environments by new instruments and IoT, have disrupted the landscape on which plans for exascale computing are developing. The international Big Data and Extreme-scale Computing (BDEC) workshop series is systematically mapping out the ways that the major issues associated with data-intensive science interact with plans for exascale systems. This meeting will present the BDEC report on this effort and elicit community input on plans for convergence on a common software ecosystem for science.

Long Description: Responding to the National Strategic Computing Initiative (NSCI) and the Smart Cities and Communities Strategic Plan in the US, and to the H2020 European Cloud Initiative, the Big Data and Exascale Computing (BDEC) workshop series is building an international effort (funded by the NSF, the DOE, the EU, and Japan) to develop a plan for transnational cooperation in the design and development of a new generation of software infrastructure for extreme-scale science and engineering. Building on earlier efforts of the International Exascale Software Project (IESP) and the ongoing European EXtreme Data and Computing Initiative (EXDCI), the BDEC community is working on a plan for a common, high-quality computing environment that focuses on two main objectives: 1) to systematically map out and analyze the major issues associated with integrating cloud-based data analytics, extreme-scale modeling, and the new network computing paradigm required to manage the explosive growth of data at the edge of the network from new instruments and IoT; and 2) to use this analysis to define one or more pathways from the current state of infrastructure balkanization to a common software ecosystem in which next-generation science and engineering applications can combine and use data and resources for discovery and innovation.

The proposed BoF will offer an overview of the BDEC advanced roadmapping and planning effort. It will include the presentation and discussion of the “Pathways to Convergence” report that will be finalized this fall. BoF leaders will seek feedback from attendees on the emerging plan for a software infrastructure that can integrate the currently bifurcated software stacks of big data analytics and HPC-driven modeling and simulation, while also interoperating with the new distributed services platform that will be required to cope with massive and rapidly growing flows of data from instruments, sensors, and devices operating at the network edge. With contributions from the US, the EU, and Asia, the themes of the BDEC effort intersect with an extremely broad cross section of the interests of the SC’17 community. Carrying forward the work of previous international planning efforts, the report reframes the problems involved to take full account of the varied (and evolving) workflows and software ecosystems (including machine learning for big data) that different communities have created to work with data flows and computing resources of unprecedented scale. The systems that need this new extreme-scale software stack must now be viewed as nodes in the very large distributed services platform that will be required to collect, manipulate, transform, analyze, and explore the enormous volumes of data generated both in the cloud and at the network’s edge.

The BDEC report will show that the workshops have made substantial progress toward this necessary change in perspective. Its three leading topic areas (application-level convergence, centralized facility convergence, and edge computing convergence) framed the discussions that took place. At the BoF, a panel of experts from the BDEC community will raise issues and solicit input in each of these areas.

