Presenter Index - Mark Duffield, Kees Vissers, Oliver Gunasekara - Amazon Web Services, Xilinx Inc, NGCodec Keynote: FPGAs in AWS and First Use Cases (joint talk by AWS, NGcodec, and Xilinx) A Omar Aaziz New Mexico State University P73: HPC Production Job Quality Assessment Moustafa AbdelBaky Rutgers University Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure David Abdurachmanov CERN The ARM Software Ecosystem: Are We There Yet? David Abramson University of Queensland Scalable Distributed Infrastructure for Data Intensive Science Bilge Acun University of Illinois Mitigating Variability in HPC Systems and Applications for Performance and Power Efficiency Ross N. Adelman US Army Research Laboratory P39: Extremely Large, Wide-Area Power-Line Models Ferrol Aderholdt Oak Ridge National Laboratory P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Vikram Adve University of Illinois Developing an OpenMP Runtime for UVM-Capable GPUs Heterogeneous Parallel Virtual Machine and Parallelism in LLVM Ilya Afanasyev Lomonosov Moscow State University Five-minute presentations by young researchers from around the world - part 2 Hoda Aghaei Khouzani University of Delaware Runtime Solutions to Apply Non-Volatile Memories in Future Computer Systems Danial Aghajarian Georgia State University A Heterogeneous HPC Platform for Ill-Structured Spatial Join Processing Abhinav Agrawal North Carolina State University Leveraging Near Data Processing for High-Performance Checkpoint/Restart Kunal Agrawal Washington University in St. Louis Keynote: Teaching Sound Principles and Good Practices for Parallel Algorithms. Khalid Ahmad University of Utah Automatic Testing of OpenACC Applications Dong H.
Ahn Lawrence Livermore National Laboratory P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Behzad R. Ahrabi University of Wyoming P28: High-Fidelity Blade-Resolved Wind Plant Modeling James Ahrens Los Alamos National Laboratory Cosmological Particle Data Compression in Practice P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Alex Aiken Stanford University Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions The Legion Programming Model Mark Ainsworth Brown University Introduction - The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) MGARD: A Multilevel Technique for Compression of Floating-Point Data The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) Asma H. Al-rawi Intel Corporation P95: GEOPM: A Scalable Open Runtime Framework for Power Management Sadaf Alam Swiss National Supercomputing Centre How Serious Are We About the Convergence Between HPC and Big Data? 
Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Interactivity in Supercomputing Jay Alameda University of Illinois Fourth SC Workshop on Best Practices for HPC Training Carl Albing US Naval Academy Fourth SC Workshop on Best Practices for HPC Training Ben Albrecht Cray Inc Cosmological Particle-Mesh Simulations in Chapel Nia Alexandrov Hartree Centre Fourth SC Workshop on Best Practices for HPC Training Vassil Alexandrov Barcelona Supercomputing Center Fourth SC Workshop on Best Practices for HPC Training Introduction - 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Invited Talk - On Improved Monte Carlo Hybrid Methods for Preconditioner Computations 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Yuri Alexeev Argonne National Laboratory An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable Inference Processor for Multi-Layer Perceptrons P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Momme Allalen Leibniz Supercomputing Centre P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II Graham Allan University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data William Allcock Argonne National Laboratory P32: Exploring the Performance of Electron Correlation Method Implementations on Kove XPDs Randy Allen Mentor Graphics The Challenges Faced by OpenACC Compilers Tyler Allen Clemson University Performance and Energy Usage of Workloads on KNL and Haswell Architectures Amani Alonazi King Abdullah University of Science and Technology Five-minute presentations by young researchers from around the world - part 2 Ilkay Altintas San Diego 
Supercomputer Center A Machine Learning Approach for Modular Workflow Performance Prediction Alper Altuntas National Center for Atmospheric Research Verifying Concurrency in an Adaptive Ocean Circulation Model Rommie Amaro University of California, San Diego Molecular Simulation at the Mesoscale Marcos Amarís University of Sao Paulo Performance Prediction Modeling of GPU Applications Abdelhalim Amer Argonne National Laboratory Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 MPICH: A High-Performance Open-Source MPI Implementation Sadika Amreen University of Tennessee Position Paper: Experiences on Clustering High-Dimensional Data Using pbdR Amit Amritkar University of Houston Vistas in Advanced Computing Jefferson Amstutz Intel Corporation Flexible In Situ Visualization of LAMMPS Simulations Jooneun An Korea Institute of Science and Technology Information Visualization of Decision-Making Support (DMS) Information for Responding to a Typhoon-Induced Disaster James Ang Sandia National Laboratories Exascale Challenges and Opportunities Rushil Anirudh Lawrence Livermore National Laboratory Performance Modeling under Resource Constraints Using Deep Transfer Learning P75: Model-Agnostic Influence Analysis for Performance Data Katie Antypas National Energy Research Scientific Computing Center How Serious Are We About the Convergence Between HPC and Big Data? 
Parallel I/O in Practice Hartwig Anzt Karlsruhe Institute of Technology University of Tennessee Overcoming Load Imbalance for Irregular Sparse Matrices Hartwig Anzt University of Tennessee Karlsruhe Institute of Technology Flexible Batched Sparse Matrix-Vector Product on GPUs Yulong Ao Chinese Academy of Sciences Five-minute presentations by young researchers from around the world - part 1 Takayuki Aoki Tokyo Institute of Technology Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model David Appelhans IBM Leveraging NVLINK and Asynchronous Data Transfer to Scale Beyond the Memory Capacity of GPUs Charles Archer Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Manuel Arenaz University of A Coruña and Appentra Solutions Parallware Trainer: Interactive Tool for Experiential Learning of Parallel Programming Using OpenMP and OpenACC Dorian Arnold Emory University Silent Errors in HPC Systems Experiencing HPC for Undergraduates: Careers in HPC Forming Strong Networks and Collaborations Connections II: Connecting with Mentors Forming Connections I: Connecting Sideways, with Ourselves and Our Peers Yuuichi Asahi French Alternative Energies and Atomic Energy Commission Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Mitsuteru Asai Kyushu University P21: The First Real-Scale DEM Simulation of a Sandbox Experiment Using 2.4 Billion Particles Mark Asch University of Picardie Total SA Big Data and Exascale Computing (BDEC) Community Report Samar Aseeri King Abdullah University of Science and Technology A Comparison of Distributed Memory Fast Fourier Transform (FFT) Library Packages Rafael Asenjo University of Malaga Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks Thomas Ashby IMEC P62: How To Do Machine Learning on Big Clusters Joshua Asplund Lawrence
Livermore National Laboratory DataRaceBench: A Benchmark Suite for Systematic Evaluation of Data Race Detection Tools Danny Auble SchedMD LLC Slurm User Group Meeting Guillaume Aupy French Institute for Research in Computer Science and Automation (INRIA) Periodic I/O Scheduling for Supercomputers Brian Austin Lawrence Berkeley National Laboratory Performance and Energy Usage of Workloads on KNL and Haswell Architectures Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Jeff Autor Hewlett Packard PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Ammar Ahmad Awan Ohio State University An In-Depth Performance Characterization of CPU- and GPU-Based DNN Training on Modern Architectures A26: Co-Designing MPI Runtimes and Deep Learning Frameworks for Scalable Distributed Training on GPU Clusters Abdulrahman Azab University of Oslo Partnership for Advanced Computing in Europe (PRACE) Containers in HPC B Abdel-Hameed Badawy New Mexico State University A Scalable Analytical Memory Model for CPU Performance Prediction David Bader Georgia Institute of Technology 15th Graph500 List Michael Bader Technical University Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Materials and Chemistry Frank Baetke Hewlett Packard Enterprise BeeGFS - Architecture, Implementation Examples, and Future Development Lustre Community BoF: Lustre Deployments for the Next 5 Years Saurabh Bagchi Purdue University Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction Anna Maria Bailey Lawrence Livermore National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Seonmyeong Bak University of Illinois Integrating OpenMP into the Charm++ Programming Model Allison H.
Baker National Center for Atmospheric Research Quality Assurance and Error Identification for the Community Earth System Model Brandon Baker Intel Corporation P95: GEOPM: A Scalable Open Runtime Framework for Power Management Jason Bakos University of South Carolina Introduction - H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic Pavan Balaji Argonne National Laboratory Workshop on Exascale MPI (ExaMPI) Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Advanced MPI Programming MPICH: A High-Performance Open-Source MPI Implementation Prasanna Balaprakash Argonne National Laboratory Contemporary Design of Supercomputer Experiments Rahul Bale RIKEN P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer Gabor Daniel Balogh Pazmany Peter Catholic University Comparison of Parallelization Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs Daniel Balouek-Thomert Rutgers University Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure Fabio Banchelli Barcelona Supercomputing Center P71: Is ARM Software Ecosystem Ready for HPC? Kunal Banerjee Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Purushotham Bangalore University of Alabama, Birmingham Workshop on Exascale MPI (ExaMPI) P45: Campaign Storage: Erasure Coding with GPUs Neelofer Banglawala University of Edinburgh Women in HPC: Non-Traditional Paths to HPC and How They Can and Do Enrich the Field Lorena Barba George Washington University HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? 
Deborah Bard Lawrence Berkeley National Laboratory Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Getting Started with the Burst Buffer: Using DataWarp Technology Ashley Barker Oak Ridge National Laboratory Small Business and the Exascale Computing Project Kevin Barker Pacific Northwest National Laboratory Energy Efficient Supercomputing (E2SC) Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Martina Barnas Indiana University Introduction - Workshop on Education for High Performance Computing (EduHPC) Panel: Attracting Women and Underrepresented Minorities to HPC and Data Science Thomas Barr Research Institute at Nationwide Children's Hospital Computational Approaches for Cancer Carlos Jaime Barrios Hernandez Advanced Computing Service for Latin America and the Caribbean Industrial University of Santander Americas HPC Collaboration Andrea Bartolini ETH Zurich P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Chaitanya Baru National Science Foundation Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Alexey Bataev IBM Implementing Implicit OpenMP Data Sharing on GPUs Natalie Bates Energy Efficient HPC Working Group Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) The Green500: Trends in Energy-Efficient Supercomputing Total Cost of Ownership and HPC System Procurement State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Andrew Bauer Kitware Inc In Situ Summarization with VTK-m In Situ Analysis and Visualization with SENSEI Michael Bauer Nvidia Corporation Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions 
John Baugh North Carolina State University Verifying Concurrency in an Adaptive Ocean Circulation Model Mohammadreza Bayatpour Ohio State University Scalable Reduction Collectives with Data Partitioning-Based Multi-Leader Design Alexandre Bayen Lawrence Berkeley National Laboratory University of California, Berkeley Inference and Control in Routing Games Neelima Bayyapu Argonne National Laboratory MPICH: A High-Performance Open-Source MPI Implementation Daniel Beall Naval Research Laboratory P18: A Parallel Python Implementation of BLAST+ (PPIB) for Characterization of Complex Microbial Consortia Scott Beamer Lawrence Berkeley National Laboratory Research Execution Lee Beausoleil US Department of Defense Panel Discussion: Diversifying the HPC workforce Identifying the Roadblocks Facing Women in your Workforce Fabian Beck University of Duisburg-Essen Introduction - 4th International Workshop on Visual Performance Analytics – VPA 2017 Fourth International Workshop on Visual Performance Analysis – VPA 2017 Gregory Becker Lawrence Livermore National Laboratory Managing HPC Software Complexity with Spack David Beckingsale Lawrence Livermore National Laboratory P76: A Compiler Agnostic and Architecture Aware Predictive Modeling Framework for Kernels Pete Beckman Director, Exascale Technology & Computing Institute Argonne National Laboratory Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences HPC Connects Plenary: The Century of the City Cross-Layer Allocation and Management of Hardware Resources in Shared Memory Nodes Big Data and Exascale Computing (BDEC) Community Report The Internet of Things and HPC: Are They Teaming Up to Work Together? Bradford M. 
Beckmann Advanced Micro Devices Inc Gravel: Fine-Grain GPU-Initiated Network Messages Izaak Beekman ParaTools P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Oceane Bel University of California, Santa Cruz CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning P65: CAPES: Unsupervised System Performance Tuning Using Neural Network-Based Deep Reinforcement Learning Kellon Belfon Stony Brook University Experiencing HPC for Undergraduates: Graduate Student Perspective Maxim Belkin University of Illinois Fourth SC Workshop on Best Practices for HPC Training Software Engineering and Reuse in Computational Science and Engineering Gordon Bell Microsoft Thirty Years of the Gordon Bell Prize Francis Belot Atomic Energy and Alternative Energies Commission State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Anouar Benali Argonne National Laboratory Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations Siegfried Benkner University of Vienna Extending the Open Community Runtime with External Application Support John Bent Seagate Government Solutions The Virtual Institute of I/O and the IO-500 Michael Bentley University of Utah A15: Quantifying Compiler Effects on Code Performance and Reproducibility Using FLiT Brad Benton Advanced Micro Devices Inc GPU Triggered Networking for Intra-Kernel Communications Pavel Benáček CESNET Case Study: Usage of High Level Synthesis in HPC Networking Gheorghe-Teodor Bercea IBM Implementing Implicit OpenMP Data Sharing on GPUs Ben Bergen Los Alamos National Laboratory P63: FleCSPH: a Parallel and Distributed Smoothed Particle Hydrodynamics Framework Based on FleCSI Karen Bergman Columbia University Post Moore Supercomputing Francine Berman Rensselaer Polytechnic Institute Blurring the Lines: High-End Computing and Data Science David Bernholdt Oak Ridge National Laboratory OpenMP 4.5 Validation and 
Verification Suite Better Scientific Software Software Engineering and Reuse in Computational Science and Engineering Carlo Bertolli IBM Implementing Implicit OpenMP Data Sharing on GPUs Colleen Bertoni Argonne National Laboratory P32: Exploring the Performance of Electron Correlation Method Implementations on Kove XPDs Martin Berzins University of Utah Scientific Computing and Imaging Institute Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs Maciej Besta ETH Zurich Scaling Betweenness Centrality Using Communication-Efficient Sparse Matrix Multiplication E. Wes Bethel Lawrence Berkeley National Laboratory Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization In Situ Analysis and Visualization with SENSEI ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Blair Bethwaite Monash University OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Eugen Betke German Climate Computing Center P57: Adaptive Tier Selection for NetCDF and HDF5 P15: Toward Decoupling the Selection of Compression Algorithms from Quality Constraints Joshua Bevan University of Illinois P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Sridutt Bhalachandra University of North Carolina Using Runtime Energy Optimizations to Improve Energy Efficiency in High Performance Computing Siddharth Bhat International Institute of Information Technology, Hyderabad Optimizing Geometric Multigrid Method Computation Using a DSL Approach Abhinav Bhatele Lawrence Livermore National Laboratory Introduction - 4th International Workshop on Visual Performance Analytics – VPA 2017 ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Performance Modeling under Resource Constraints Using Deep Transfer Learning Predicting the Performance Impact of Different Fat-Tree Configurations Fourth International Workshop on Visual Performance 
Analysis – VPA 2017 Wahid Bhimji Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Paolo Bientinesi RWTH Aachen University A01: GEMM-Like Tensor-Tensor Contraction (GETT) Amanda J. Bienz University of Illinois Reducing Communication Costs in the Parallel Algebraic Multigrid Jay Jay Billings Oak Ridge National Laboratory Software Engineers: Careers in Research Robert Bird Los Alamos National Laboratory A Scalable Analytical Memory Model for CPU Performance Prediction George Biros University of Texas Geometry-Oblivious FMM for Compressing Dense SPD Matrices A Framework for Scalable Biophysics-Based Image Analysis Sean Blanchard Los Alamos National Laboratory Experimental and Analytical Study of Xeon Phi Reliability Wesley Bland Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Gil Bloch Mellanox Technologies Accelerating Big Data Processing and Machine/Deep Learning Middleware on Modern HPC Clusters Michael Blocksome Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Michaela Blott Xilinx Inc Introduction - H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic David Bock National Center for Supercomputing Applications, University of Illinois Simulation and Visual Representation of Tropical Cyclone-Ocean Interactions Christian Bodenstein Research Center Juelich Supporting Software Engineering Practices in the Development of Data-Intensive HPC Applications with the JuML Framework François Bodin University of Rennes European Exascale Projects and Their Global Contributions David Boehme Lawrence Livermore National Laboratory Predicting the Performance Impact of Different Fat-Tree Configurations Stanislav Bohm Technical University of Ostrava P62: How To Do Machine Learning on Big Clusters Taisuke Boku University of Tsukuba Runtime Correctness Checking for Emerging Programming Paradigms Barry Bolding Cray Inc How Serious Are We About the Convergence Between HPC and Big Data? Evan F. Bollig University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Rosie Bolton Square Kilometre Array Life, the Universe and Computing: The Story of the SKA Telescope Uday Bondhugula Indian Institute of Science Optimizing Geometric Multigrid Method Computation Using a DSL Approach Matthias Book University of Iceland Supporting Software Engineering Practices in the Development of Data-Intensive HPC Applications with the JuML Framework Utpal Bora International Institute of Information Technology, Hyderabad Improved Loop Distribution in LLVM Using Polyhedral Dependences Ralph C. 
Bording Pawsey Supercomputing Centre 4th International Workshop on HPC User Support Tools (HUST-17) HPC Carpentry - Practical, Hands-On HPC Training Andrea Borghesi University of Bologna State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Kalina M. Borkiewicz National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe George Bosilca University of Tennessee Dynamic Task Discovery in PaRSEC- A Data-Flow Task-Based Runtime Fault-Tolerance for High Performance and Distributed Computing: Theory and Practice Resilient Programming Environments Open MPI State of the Union XI Charles A. Bouman Purdue University Massively Parallel 3D Image Reconstruction Aurélien Bouteiller University of Tennessee Fault-Tolerance for High Performance and Distributed Computing: Theory and Practice Anne Dara Bowen Texas Advanced Computing Center, University of Texas Physical Signatures of Cancer Metastasis Geoffrey C. Bower Academia Sinica Institute of Astronomy and Astrophysics realfast@VLA Eric Boyer GENCI Total Cost of Ownership and HPC System Procurement Andrew M. Bradley Sandia National Laboratories Designing Vector-Friendly Compact BLAS and LAPACK Kernels Jim Brandt Sandia National Laboratories HPC Systems Monitoring Data in Action Steven R. Brandt Louisiana State University Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code HPC via HTTP: Portable, Scalable Computing Using App Containers and the Agave API David Brayford Leibniz Supercomputing Centre OpenHPC Community BoF Michael J.
Brazell University of Wyoming P28: High-Fidelity Blade-Resolved Wind Plant Modeling Marisa Brazil Purdue University Building a Community: Outreach Strategies for Coordinating a Local WHPC Program Panel Discussion: Diversifying the HPC workforce Peer-Timo Bremer Lawrence Livermore National Laboratory ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Ronny Brendel Oak Ridge National Laboratory An LLVM Instrumentation Plug-In for Score-P Mauricio Breternitz University Institute of Lisbon GPU Triggered Networking for Intra-Kernel Communications Alys Brett Culham Centre for Fusion Energy Software Engineering and Reuse in Computational Science and Engineering Software Engineers: Careers in Research Sven Breuner ThinkParQ GmbH BeeGFS - Architecture, Implementation Examples, and Future Development John Brevik California State University, Long Beach Probabilistic Guarantees of Execution Duration for Amazon Spot Instances Patrick Bridges University of New Mexico Workshop on Exascale MPI (ExaMPI) Ian Briggs University of Utah P84: PRESAGE: Selective Low Overhead Error Amplification for Easy Detection Ron Brightwell Sandia National Laboratories Workshop on Exascale MPI (ExaMPI) Opening Remarks: MCHPC'17: Workshop on Memory Centric Programming for HPC sPIN: High-Performance Streaming Processing in the Network MCHPC2017: Workshop on Memory Centric Programming for HPC André Brinkmann Johannes Gutenberg University Mainz A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Jed Brown University of Colorado, Boulder Contemporary Design of Supercomputer Experiments Maxine Brown University of Illinois, Chicago SAGE2 9th Annual International SC BOF: Scalable Amplified Group Environment for Global Collaboration Nick Brown University of Edinburgh From Outreach to Education to Researcher: Innovative Ways of Expanding the HPC Community Panel Discussion: Diversifying the HPC workforce From Outreach to Education to 
Researcher - Innovative Ways of Expanding the HPC Community P81: Offloading Python Kernels to Micro-Core Architectures Dana Brunson Oklahoma State University Fourth SC Workshop on Best Practices for HPC Training Kris Bubendorfer Victoria University of Wellington Heuristic Dynamic Workflow Scheduling Ronak Buch University of Illinois Migratable Objects and Task-Based Parallel Programming with Charm++ Robert Budden Pittsburgh Supercomputing Center OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Reuben Budiardja Oak Ridge National Laboratory Regression Testing and Monitoring Tools Zoran Budimlic Rice University Graph500 on OpenSHMEM: Using a Practical Survey of Past Work to Motivate Novel Algorithmic Developments Gina Bullock North Carolina Agricultural and Technical State University Teaching, Learning and Collaborating through Cloud Computing Online Classes Aydin Buluc Lawrence Berkeley National Laboratory University of California, Berkeley Scaling Deep Learning on GPU and Knights Landing Clusters HPC Graph Toolkits and the GraphBLAS Forum Communication Efficient Methods David Bunde Knox College "Peachy Assignments:" A New Edu* Conference Component Hans-Joachim Bungartz Technical University Munich A Highly Scalable, Algorithm-Based Fault-Tolerant Solver for Gyrokinetic Plasma Simulations Citius, Altius, Fortius! Sarah Burke-Spolaor West Virginia University realfast@VLA Anastasiia Butko Lawrence Berkeley National Laboratory Workshop for Open Source Supercomputing Bryan J. Butler National Radio Astronomy Observatory realfast@VLA Ali R. 
Butt Virginia Tech TagIt: An Integrated Indexing and Search Service for File Systems Suren Byna Lawrence Berkeley National Laboratory In-System Processing for Performance Vetria Byrd Clemson University Scientific Visualization & Data Analytics Showcase Posters Scientific Visualization and Data Analytics Showcase Posters Scientific Visualization & Data Analytics Showcase Posters Scientific Visualization & Data Analytics Showcase Reception C Katharine Cahill Ohio Supercomputer Center A Proposed Model for Teaching Advanced Parallel Computing and Related Topics Blake Caldwell University of Colorado, Boulder P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Rebecca Caldwell Winston-Salem State University Teaching, Learning and Collaborating through Cloud Computing Online Classes Patrice Calegari Bull From HPC-as-a-Service to Deep Learning-as-a-Service Gruia Calinescu Illinois Institute of Technology P12: Multi-Size Optional Offline Caching Algorithms Martin Callaghan University of Leeds HPC Carpentry - Practical, Hands-On HPC Training Scott Callaghan University of Southern California Panel Discussion: Diversifying the HPC workforce The Benefits of Mentoring: Why and How to Set Up a Program rvGAHP – Push-Based Job Submission Using Reverse SSH Connections From Outreach to Education to Researcher - Innovative Ways of Expanding the HPC Community Spencer Callicott Mississippi State University A14: Analysis of Synthetic Graph Generation Methods for Directed Network Graphs Kirk Cameron Virginia Tech Energy Efficient Supercomputing (E2SC) Funding Agencies HPC Impact Showcase: Computational Modeling Andrew Canning Lawrence Berkeley National Laboratory P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Shane Canon Lawrence Berkeley National Laboratory Container Computing for HPC and Scientific Workflows Containers in HPC Christopher M.
Cantalupo Intel Corporation P95: GEOPM: A Scalable Open Runtime Framework for Power Management Franck Cappello Argonne National Laboratory Introduction - H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic Compression of Scientific Data Reconfigurable Computing in Exascale P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Emerging Technologies Showcase (Day 3) H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic Emerging Technologies Showcase (Day 1) Emerging Technologies Showcase (Day 2) Danilo Carastan-Santos Federal University of ABC, Santo André, Brazil University of Grenoble Obtaining Dynamic Scheduling Policies with Simulation and Machine Learning Experiencing HPC for Undergraduates: Graduate Student Perspective Lawrence Carin Duke University Introduction - Machine Learning in HPC Environments Richard Carlson US Department of Energy Small Business and the Exascale Computing Project William Carlson Institute for Defense Analyses Keynote: Shared Memory HPC Programming: Past, Present and Future PGAS Applications Workshop Panel Marcelo Amaral Barcelona Supercomputing Center Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments Philip Carns Argonne National Laboratory Analyzing Parallel I/O Jeffrey D. Carpenter National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe David Carrera Barcelona Supercomputing Center Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments Jeffrey C. 
Carver University of Alabama Introduction - The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Software Engineering and Reuse in Computational Science and Engineering Software Engineers: Careers in Research The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Dan Cassidy Los Alamos National Laboratory P47: Understanding Congestion on Omni-Path Fabrics Ralph Castain Intel Corporation Charting the PMIx Roadmap Vito Giovanni Castellana Pacific Northwest National Laboratory Introduction - IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms Charlie Catlett Director, Urban Center for Computation & Data Argonne National Laboratory HPC Connects Plenary: The Century of the City John Cavazos University of Delaware P76: A Compiler Agnostic and Architecture Aware Predictive Modeling Framework for Kernels Carlo Cavazzoni CINECA State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Aurelien Cavelan University of Basel Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis Cris Cecka Nvidia Corporation Low Communication FMM-Accelerated FFT on GPUs Batched, Reproducible, and Reduced Precision BLAS Milind Chabbi Independent Path-Synchronous Performance Monitoring in HPC Interconnection Networks with Source-Code Attribution Sourav Chakraborty Ohio State University Scalable Reduction Collectives with Data Partitioning-Based Multi-Leader Design Bradford L. 
Chamberlain Cray Inc Introduction - PAW 2017: The 2nd Annual PGAS Applications Workshop PGAS Applications Workshop Panel Henry Chan Argonne National Laboratory Visualizing Silicene Growth Through Island Migration and Coalescence Sunita Chandrasekaran University of Delaware Introduction - Fourth Workshop on Accelerator Programming Using Directives (WACCPD) Introduction - Women in HPC: Diversifying the HPC Community OpenMP 4.5 Validation and Verification Suite An Efficient Data Layout Transformation Algorithm for Locality-Aware Parallel Sparse FFT The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Experiencing HPC for Undergraduates: Careers in HPC OpenACC API User Experience, Vendor Reaction, Relevance, and Roadmap Fourth Workshop on Accelerator Programming Using Directives (WACCPD) Choongseok Chang Princeton University Facing the Big Data Challenge in the Fusion Code XGC Kenneth Chang University of California, Santa Cruz CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning P65: CAPES: Unsupervised System Performance Tuning Using Neural Network-Based Deep Reinforcement Learning Barbara Chapman Stony Brook University Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading OpenMP Common Core: A “Hands-On” Exploration OpenSHMEM in the Era of Exascale Dylan Chapp University of Delaware A17: Toward Capturing Nondeterminism Motifs in HPC Applications Kyle Chard University of Chicago Probabilistic Guarantees of Execution Duration for Amazon Spot Instances Ryan Chard Argonne National Laboratory Probabilistic Guarantees of Execution Duration for Amazon Spot Instances Niladrish Chatterjee Nvidia Corporation Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs Bhaskar Chaudhury Dhirubhai Ambani Institute of Information and Communication Technology P27: Parallelization of the Particle-In-Cell Monte Carlo Collision (PIC-MCC) Algorithm 
for Plasma Simulation on Intel MIC Xeon Phi Architecture Abhishek Chaurasia FWDNXT Inc Snowflake: Efficient Accelerator for Deep Neural Networks Shuai Che Advanced Micro Devices Inc Gravel: Fine-Grain GPU-Initiated Network Messages Bingwei Chen Tsinghua University 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Changsheng Chen University of Massachusetts, Dartmouth Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Cheng Chen Data Storage Institute National University of Singapore Transactional NVM Cache with High Performance and Crash Consistency Feng Chen University of Texas Advanced Manycore Programming (KNL) Hsing-bung Chen Los Alamos National Laboratory P55: Incorporating Proactive Data Rescue into ZFS Disk Recovery for Enhanced Storage Reliability Jieyang Chen University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform Tong Chen IBM Implementing Implicit OpenMP Data Sharing on GPUs Xiaofei Chen Southern University of Science and Technology, China 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Xinyu Chen University of New Mexico P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Yen Chen Chen National Taiwan University A03: A High-Speed Algorithm for Genome-Wide Association Studies on Multi-GPU Systems Zizhong Chen University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform ParaStack: Efficient Hang Detection for MPI Programs at Large Scale Sai P. Chenna University of Florida A FPGA-Pipelined Approach for Accelerated Discrete-Event Simulation of HPC Systems Gopinath Chennupati Los Alamos National Laboratory A Scalable Analytical Memory Model for CPU Performance Prediction Mathew J. 
Cherukara Argonne National Laboratory Visualizing Silicene Growth Through Island Migration and Coalescence Naveen Cherukuri Intel Corporation Run-to-Run Variability on Xeon Phi Based Cray XC Systems Kazem Cheshmi Rutgers University Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis Weng Cho Chew University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions Andrew Chien University of Chicago Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis Bruce Childers University of Pittsburgh Reproducibility and Uncertainty in High Performance Computing Wendy K. Cho National Center for Supercomputing Applications, University of Illinois P33: Massively Parallel Evolutionary Computation for Empowering Electoral Reform: Quantifying Gerrymandering via Multi-objective Optimization and Statistical Analysis Jaemin Choi University of Illinois at Urbana-Champaign Migratable Objects and Task-Based Parallel Programming with Charm++ A21: Runtime Support for Concurrent Execution of Overdecomposed Heterogeneous Tasks Andrew Y. 
Choliy Rutgers University P12: Multi-Size Optional Offline Caching Algorithms Fred Chong University of Chicago Quantum Computing and Irregular Applications Jerry Chou National Tsing Hua University, Taiwan Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling Edmond Chow Georgia Institute of Technology Distributed Southwell: An Iterative Method with Low Communication Costs Invited Talks 3 Invited Talks 4 AJ Christensen National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Ching-Hsiang Chu Ohio State University A27: High-Performance and Scalable Broadcast Schemes for Deep Learning on GPU Clusters Pi-Yueh Chuang George Washington University An Example of Porting PETSc Applications to Heterogeneous Platforms with OpenACC Neil Chue Hong University of Edinburgh Introduction - The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Software Engineering and Reuse in Computational Science and Engineering Software Engineers: Careers in Research The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Ryan Chui National Center for Supercomputing Applications, University of Illinois P38: Benchmarking Parallelized File Aggregation Tools for Large Scale Data Management Sudheer Chunduri Argonne National Laboratory Run-to-Run Variability on Xeon Phi Based Cray XC Systems IHsin Chung IBM Towards a Composable Computer System Vladimir Chupakhin Janssen Global Services LLC P62: How To Do Machine Learning on Big Clusters Michael Chuvelev Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Vojtech Cima Technical University of Ostrava P62: How To Do Machine Learning on Big Clusters Florina M. Ciorba University of Basel P74: A Methodology for Bridging the Native and Simulated Executions of Parallel Applications Selim Ciraci Microsoft Introduction - The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Raymond C. Clay III Sandia National Laboratories Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations David Clifton ANSYS Inc HPC Systems Professionals Workshop Thomas Clune NASA Goddard Space Flight Center pFlogger: The Parallel Fortran Logging Framework for HPC Applications Richard Coffey Argonne National Laboratory Fourth SC Workshop on Best Practices for HPC Training HPC Education: Meeting of the SIGHPC Education Chapter Paul Coffman Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Susan Coghlan Argonne National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Albert Cohen French Institute for Research in Computer Science and Automation (INRIA) Implementation of a Cache Miss Calculator in LLVM/Polly Gary Collins University of Tennessee Flexible Batched Sparse Matrix-Vector Product on GPUs Toni Collis Appentra Solutions, Women in High Performance Computing Embracing Diversity: the Benefits Panel Discussion: Diversifying the HPC workforce Introduction - Women in HPC: Diversifying the HPC Community Career Panel Discussion: Hints and Tips to Progress Your Career Workshop Outcomes and Closing Speed Networking Women in HPC: Non-Traditional Paths to HPC and How They Can and Do Enrich the Field Recruitment: How to Build Diverse Teams Women in HPC: Diversifying the HPC Workforce Guojing Cong IBM Accelerating Deep Neural Network Learning for Speech Recognition on a Cluster of GPUs Paul Constantine University of Colorado, Boulder Contemporary Design of Supercomputer Experiments Mike Conway Renaissance Computing Institute Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Steve Conway Hyperion Research A Taxonomy of HPDA Algorithms Blurring the Lines: High-End Computing and Data Science Jeanine Cook Sandia National Laboratories Time Management Jonathan Cook New Mexico State University P73: HPC Production Job Quality Assessment James Coomer DataDirect Networks Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Burst Buffers: Flash in the Pan? 
Marcin Copik RWTH Aachen University A05: Parallel Prefix Algorithms for the Registration of Arbitrarily Long Electron Micrograph Series Thomas Corcoran Lawrence Berkeley National Laboratory P36: A Novel Feature-Preserving Spatial Mapping for Deep Learning Classification of Ras Structures Anthony Costa Icahn School of Medicine at Mount Sinai Medical Image Analysis and Visualization Timothy B. Costa Intel Corporation Designing Vector-Friendly Compact BLAS and LAPACK Kernels Batched, Reproducible, and Reduced Precision BLAS Jim Cownie Intel Corporation LLVM in HPC: Uses and Desires OpenMP® is Twenty. Where Is It Going? David Cox Harvard University Input-Aware Auto-Tuning of Compute-Bound HPC Kernels Donna J. Cox National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Silvia Crivelli Lawrence Berkeley National Laboratory P36: A Novel Feature-Preserving Spatial Mapping for Deep Learning Classification of Ras Structures Peter D. Crossman Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Carlos A. Cruz NASA Goddard Space Flight Center pFlogger: The Parallel Fortran Logging Framework for HPC Applications Xuewen Cui Virginia Tech P82: Performance Evaluation of the NVIDIA Tesla P100: Our Directive-Based Partitioning and Pipelining vs. NVIDIA’s Unified Memory Massimiliano Culpo Swiss Federal Institute of Technology in Lausanne Managing HPC Software Complexity with Spack Eugenio Culurciello FWDNXT Inc Snowflake: Efficient Accelerator for Deep Neural Networks Matthew L. 
Curry Sandia National Laboratories P45: Campaign Storage: Erasure Coding with GPUs Tony Curtis Stony Brook University OpenSHMEM in the Era of Exascale D John D'Ambrosia Ethernet Alliance Huawei The Ethernet Portfolio for HPC Nicholas D'Imperio Brookhaven National Laboratory P34: GPU Acceleration for the Impurity Solver in GW+DMFT Packages Michael D'mello Intel Corporation P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi Felipe H. da Jornada University of California, Berkeley P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Tamara Dahlgren Lawrence Livermore National Laboratory P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Christopher S. Daley Lawrence Berkeley National Laboratory Performance and Energy Usage of Workloads on KNL and Haswell Architectures Patricia Damkroger Intel Corporation Introduction - Women in HPC: Diversifying the HPC Community Early Career Coaching Anthony Danalis University of Tennessee P72: New Developments for PAPI 5.6+ Tharun Kumar Dangeti International Institute of Information Technology, Hyderabad Improved Loop Distribution in LLVM Using Polyhedral Dependences Anwesha Das North Carolina State University P89: Desh: Deep Learning for HPC System Health Resilience Arnab Das University of Utah P84: PRESAGE: Selective Low Overhead Error Amplification for Easy Detection Santanu Das International Institute of Information Technology, Hyderabad Improved Loop Distribution in LLVM Using Polyhedral Dependences Christos Davatzikos University of Pennsylvania A Framework for Scalable Biophysics-Based Image Analysis James Davis University of Warwick An Efficient Task-Based All-Reduce for Machine Learning Applications Miyuru Dayarathna WSO2 Inc Multiple Stream Job Performance Optimization with Source Operator Graph Transformations Andreas de Blanche University West Sweden Tetra Pak P44: Increasing Throughput of Multiprogram HPC Workloads: Evaluating a 
SMT Co-Scheduling Approach Raphael Y. de Camargo Federal University of ABC, Santo André, Brazil Obtaining Dynamic Scheduling Policies with Simulation and Machine Learning Cees de Laat University of Amsterdam Innovating the Network for Data Intensive Science (INDIS) Gustavo De Leon Los Alamos National Laboratory University of California, Berkeley P54: Investigating Hardware Offloading for Reed-Solomon Encoding Daniel Oliveira Fluminense Federal University Toward Preserving Results Confidentiality in Cloud-Based Scientific Workflows Daniele De Sensi University of Pisa Nornir: A Power-Aware Runtime Support for Parallel Applications Bronis R. de Supinski Lawrence Livermore National Laboratory Advanced OpenMP: Performance and 4.5 Features Mastering Tasking with OpenMP P82: Performance Evaluation of the NVIDIA Tesla P100: Our Directive-Based Partitioning and Pipelining vs. NVIDIA’s Unified Memory Cutting Edge File Systems Tom Deakin University of Bristol P69: Portable Methods for Measuring Cache Hierarchy Performance Diptorup Deb University of North Carolina QUARC: An Optimized DSL Framework Using LLVM Nathan Debardeleben Los Alamos National Laboratory Experimental and Analytical Study of Xeon Phi Reliability P92: Characterization and Comparison of Application Resilience for Serial and Parallel Executions Ewa Deelman Information Sciences Institute, University of Southern California rvGAHP – Push-Based Job Submission Using Reverse SSH Connections Mauro Del Ben Lawrence Berkeley National Laboratory P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Robert DeLeon University at Buffalo Tracking and Analyzing Job-level Activity Using Open XDMoD, XALT and OGRT Robert L. Deleon University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Phil Demar Fermi National Laboratory P43: Deep Packet/Flow Analysis Using GPUs David E. 
DeMarle Kitware Inc Large Scale Visualization with ParaView James Demmel University of California, Berkeley Scaling Deep Learning on GPU and Knights Landing Clusters Linear Algebra Libraries for High-Performance Computing: Scientific Computing with Multicore and Accelerators Paul Demorest National Radio Astronomy Observatory realfast@VLA Nicolas Denoyelle French Institute for Research in Computer Science and Automation (INRIA) Modeling Large Compute Nodes with Heterogeneous Memories with the Cache-Aware Roofline Model John W. Dermer Los Alamos National Laboratory P54: Investigating Hardware Offloading for Reed-Solomon Encoding Jack Deslippe Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Hariharan Devarajan Illinois Institute of Technology Evaluating GPGPU Memory Performance Through the C-AMAT Model Open Ethernet Drive: Evolution of Energy-Efficient Storage Technology Mehmet Deveci Sandia National Laboratories Designing Vector-Friendly Compact BLAS and LAPACK Kernels Mike Dewar Numerical Algorithms Group HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? 
Salvatore Di Girolamo ETH Zurich sPIN: High-Performance Streaming Processing in the Network A08: Virtualized Big Data: Reproducing Simulation Output on Demand Sheng Di Argonne National Laboratory An Efficient Approach to Lossy Compression with Pointwise Relative Error Bound P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Lori Diachin Lawrence Livermore National Laboratory Using HPC to Impact US Manufacturing through the HPC4Mfg Program Gerrett Diamond Rensselaer Polytechnic Institute Dynamic Load Balancing of Massively Parallel Unstructured Meshes Philip Diamond Square Kilometre Array Life, the Universe and Computing: The Story of the SKA Telescope Mattias Diener University of Illinois Visualizing, Measuring, and Tuning Adaptive MPI Parameters Integrating OpenMP into the Charm++ Programming Model Mark Dietrich Compute Canada Supercomputing in the Shadow of Giants: Perspectives and Insights from Supercomputing Leaders Outside the “Big 5” Regions and Organizations Americas HPC Collaboration Gary A. 
Dilts Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Nan Ding Tsinghua University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Minh Dinh University of Queensland Five-minute presentations by young researchers from around the world - part 1 Sebastian Doebel Technical University Dresden An LLVM Instrumentation Plug-In for Score-P Douglas Doerfler Lawrence Berkeley National Laboratory Performance and Energy Usage of Workloads on KNL and Haswell Architectures Usability, Scalability and Productivity on Many-Core Processors: Intel Xeon Phi Jiri Dokulil University of Vienna Extending the Open Community Runtime with External Application Support David Domyancic Lawrence Livermore National Laboratory P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Bin Dong Lawrence Berkeley National Laboratory Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling Wenjie Dong Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Jack Dongarra University of Tennessee Introduction - 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Investigating Half-Precision Arithmetic to Accelerate Dense Linear System Solvers Flexible Batched Sparse Matrix-Vector Product on GPUs Dynamic Task Discovery in PaRSEC- A Data-Flow Task-Based Runtime Keynote - An Overview of High Performance Computing and Challenges for the Future TOP500 - Past, Present, Future Linear Algebra Libraries for High-Performance Computing: Scientific Computing with Multicore and Accelerators Big Data and Exascale Computing (BDEC) Community Report TOP500 Supercomputers Batched, Reproducible, and Reduced Precision BLAS P72: New Developments for PAPI 5.6+ David Donofrio Lawrence Berkeley National Laboratory Workshop for Open Source Supercomputing PARADISE: A ToolFlow to Model 
Emerging Technologies for the Post-CMOS Era in HPC Reconfigurable Computing in Exascale Rion Dooley University of Texas HPC via HTTP: Portable, Scalable Computing Using App Containers and the Agave API Matthieu Dorier Argonne National Laboratory Supporting Task-level Fault-Tolerance in HPC Workflows by Launching MPI Jobs inside MPI Jobs Matthieu Dreher Argonne National Laboratory In Situ Workflows at Exascale: System Software to the Rescue Nikoli Dryden University of Illinois Lawrence Livermore National Laboratory Toward Scalable Parallel Training of Deep Neural Networks David H.C. Du University of Minnesota P56: ZoneTier: A Zone-Based Storage Tiering and Caching Co-Design to Integrate SSDs with Host-Aware SMR Drives Xiaohui Duan Shandong University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Nicolas Dube Hewlett Packard PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Anshu Dubey Argonne National Laboratory University of Chicago Proposal for a Scientific Software Lifecycle Model Better Scientific Software Multiphysics Pradeep Dubey Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Artificial Intelligence and The Virtuous Cycle of Compute Nicolas Dubé Hewlett Packard Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Dmitry Duplyakin University of Utah Contemporary Design of Supercomputer Experiments Earl Duque Intelligent Light Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization HPC Powers Wind Energy ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Thomas Durbin Durbin Engineering Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Dmitry Durnov Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Matthew Dwyer University of Nebraska Towards Self-Verification in Finite Difference Code Generation Michael D’mello Intel Corporation An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor E Jonathan Eastep Intel Corporation PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control P95: GEOPM: A Scalable Open Runtime Framework for Power Management Joe Eaton Nvidia Corporation Parallel Jaccard and Related Graph Clustering Techniques Jerry Ebalunode University of Houston Vistas in Advanced Computing H. Carter Edwards Sandia National Laboratories Kokkos: Enabling Manycore Performance Portability for C++ Applications and Domain Specific Libraries/Languages Stratos Efstathiadis New York University Second Annual Meeting of the SIGHPC - Big Data Chapter Alexandre Eichenberger IBM Implementing Implicit OpenMP Data Sharing on GPUs Stephan Eidenbenz Los Alamos National Laboratory A Scalable Analytical Memory Model for CPU Performance Prediction Victor Eijkhout University of Texas Advanced Manycore Programming (KNL) Greg Eisenhauer Georgia Institute of Technology Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout Daniel Eisenstein Harvard University Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Mohamed El-Hadedy University of Illinois RE-HASE: Regular-Expressions Hardware Synthesis Engine Izzat El-Hajj University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions Nosayba El-Sayed Massachusetts Institute of Technology Qatar Computing Research Institute Understanding Object-Level Memory Access Patterns Across the Spectrum Ahmed Eleliemy University of Basel P74: A Methodology for Bridging the Native and Simulated Executions of Parallel Applications Sally Ellingson University of Kentucky Deep Learning 
Michael J. Ellsworth, Jr. IBM Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Nahid Emad University of Versailles Maison de la Simulation Parallel Jaccard and Related Graph Clustering Techniques Runtime Correctness Checking for Emerging Programming Paradigms Joel Emer Nvidia Corporation Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Toshio Endo Tokyo Institute of Technology TSUBAME3.0: A Green, Accelerated, Big-Data Supercomputer Applying Temporal Blocking with a Directive-Based Approach State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P05: ooc_cuDNN : A Deep Learning Library Supporting CNNs over GPU Memory Capacity Christian Engelmann Oak Ridge National Laboratory Introduction - 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Failures in Large Scale Systems: Long-Term Measurement, Analysis, and Implications Characterizing Faults, Errors, and Failures in Extreme-Scale Systems Nicolás Erdödy Open Parallel Ltd SKA: The Ultimate Big Data Project The Internet of Things and HPC: Are They Teaming Up to Work Together? Mattan Erez University of Texas Silent Errors in HPC Systems Rajeev S. 
Erramilli Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Oscar Esquivel-Flores Monterrey Institute of Technology Invited Talk - On Improved Monte Carlo Hybrid Methods for Preconditioner Computations Trilce Estrada University of New Mexico Panel: Attracting Women and Underrepresented Minorities to HPC and Data Science Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Jean-Matthieu ETANCELIN University of Reims Champagne-Ardenne P64: romeoLAB : HPC Training Platform on HPC facility F Peyman Faizian Florida State University Modeling UGAL on the Dragonfly Topology A Comparative Study of SDN and Adaptive Routing on Dragonfly Networks Alessandro Fanfarillo National Center for Atmospheric Research Performance Portability of an Intermediate-Complexity Atmospheric Research Model in Coarray Fortran Aiman Fang University of Chicago Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis Jian Fang Delft University of Technology Adopting OpenCAPI for High Bandwidth Database Accelerators Massimiliano Fatica Nvidia Corporation A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Farzad Fatollahi-Fard Lawrence Berkeley National Laboratory Workshop for Open Source Supercomputing Christian Feld Juelich Supercomputing Center Hands-On Practical Hybrid Parallel Application Performance Engineering Alexandre Fender Nvidia Corporation University of Versailles Parallel Jaccard and Related Graph Clustering Techniques Wu Feng Virginia Tech The Green500: Trends in Energy-Efficient Supercomputing Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) P82: Performance Evaluation of the NVIDIA Tesla P100: 
Our Directive-Based Partitioning and Pipelining vs. NVIDIA’s Unified Memory John Feo Pacific Northwest National Laboratory Introduction - IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms Charles R. Ferenbaugh Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Mark Fernandez Hewlett Packard Enterprise HPC in Space: Supercomputing at 17,500 MPH Rafael Ferreira da Silva University of Southern California On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows Kurt B. Ferreira Sandia National Laboratories P93: Spacehog: Evaluating the Costs of Dedicating Resources to In Situ Analysis Nicola Ferrier Argonne National Laboratory Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Daniel A. Feshbach Haverford College A20: Correctness Verification and Boundary Conditions for Chapel Iterator-Based Loop Optimization Adam Fidel Texas A&M University Bounded Asynchrony and Nested Parallelism for Scalable Graph Processing Steve Fields IBM OpenCAPI: High Performance, Host-Agnostic, Coherent Accelerator Interface Weronika Filinger University of Edinburgh From Outreach to Education to Researcher - Innovative Ways of Expanding the HPC Community Salvatore Filippone Cranfield University Introduction - PAW 2017: The 2nd Annual PGAS Applications Workshop Hal Finkel Argonne National Laboratory FPGAs for Supercomputing? 
Progress and Challenges Introduction - LLVM-HPC2017: Fourth Workshop on the LLVM Compiler Infrastructure in HPC Developing an OpenMP Runtime for UVM-Capable GPUs Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading OpenMP 4.5 Validation and Verification Suite Concluding Remarks – LLVM-HPC2017 Distributed and Heterogeneous Programming in C++ for HPC LLVM in HPC: Uses and Desires LLVM-HPC2017: Fourth Workshop on the LLVM Compiler Infrastructure in HPC Paul Fischer University of Illinois Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Goran Flegar Jaume I University Overcoming Load Imbalance for Irregular Sparse Matrices Flexible Batched Sparse Matrix-Vector Product on GPUs Kermin Fleming Intel Corporation LESS: Loop Nest Execution Strategies for Spatial Architectures Fernanda Foertter Oak Ridge National Laboratory Fourth SC Workshop on Best Practices for HPC Training Career Panel Discussion: Hints and Tips to Progress Your Career Overcoming the Confidence Gap Parallware Trainer: Interactive Tool for Experiential Learning of Parallel Programming Using OpenMP and OpenACC The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact An Example of Porting PETSc Applications to Heterogeneous Platforms with OpenACC Kokkos: Enabling Manycore Performance Portability for C++ Applications and Domain Specific Libraries/Languages OpenACC API User Experience, Vendor Reaction, Relevance, and Roadmap HPC Education: Meeting of the SIGHPC Education Chapter Interactivity in Supercomputing Mike Folk HDF Group Software Engineering and Reuse in Computational Science and Engineering John Fonner Texas Advanced Computing Center, University of Texas HPC via HTTP: Portable, Scalable Computing Using App Containers and the Agave API John C. 
Forbes Harvard University Harvard-Smithsonian Center for Astrophysics Milky Way Analogue Isolated Disk Galaxy Andrea Formisano University of Perugia Accelerating Energy Games Solvers on Modern Architectures Ian Foster Argonne National Laboratory Introduction - The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) Cloud Computing for Science and Engineering The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) Pouya Fotouhi University of California, Davis P50: Energy-Efficient and Scalable Bio-Inspired Nanophotonic Computing P49: Toward Exascale HPC Systems: Exploiting Advances in High Bandwidth Memory (HBM2) through Scalable All-to-All Optical Interconnect Architectures Yvan Fournier EDF France Melissa: Large Scale In Transit Global Sensitivity Analysis Avoiding Intermediate Files Robert J. Fowler University of North Carolina QUARC: An Optimized DSL Framework Using LLVM William Fox Georgia Institute of Technology University of California, San Francisco E-HPC: A Library for Elastic Resource Management in HPC Environments Franz Franchetti Carnegie Mellon University P06: Large Scale FFT-Based Stress-Strain Simulations with Irregular Domain Decomposition IA^3 Debate Tommy Franczak Northern Illinois University A Path from Serial Execution to Hybrid Parallelization for Learning HPC Robert Freeman Jr Harvard University HPC Carpentry - Practical, Hands-On HPC Training Bernhard Friebe Intel Corporation Enabling FPGAs for the Software Developers Brian Friesen National Energy Research Scientific Computing Center Performance Portability of an Intermediate-Complexity Atmospheric Research Model in Coarray Fortran Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Haohan Fu Tsinghua University National Supercomputing Center, Wuxi Redesigning CAM-SE for Peta-Flops Performance on Sunway TaihuLight Lessons on Integrating and Utilizing 10 Million Cores: Experience of Sunway TaihuLight Redesigning 
CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Song Fu University of North Texas P55: Incorporating Proactive Data Rescue into ZFS Disk Recovery for Enhanced Storage Reliability Akihiro Fujii Kogakuin University P14: Robust SA-AMG Solver by Extraction of Near-Kernel Vectors Katsuki Fujisawa Kyushu University National Institute of Advanced Industrial Science and Technology Cyber-Physical System and Industrial Applications of Large-Scale Graph Analysis and Optimization Problems P78: Performance Evaluation of Graph500 Considering CPU-DRAM Power Shifting Hajime Fujita Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Kohei Fujita University of Tokyo RIKEN Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC P09: Adaptive Multistep Predictor for Accelerating Dynamic Implicit Finite-Element Simulations P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Douglas Fuller Red Hat Inc Ceph Applications in HPC Environments Student/Postdoc Job Fair Thomas R. 
Furlani University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Tracking and Analyzing Job-level Activity Using Open XDMoD, XALT and OGRT Mikito Furuichi Japan Agency for Marine-Earth Science and Technology P21: The First Real-Scale DEM Simulation of a Sandbox Experiment Using 2.4 Billion Particles Yasunori Futamura University of Tsukuba Efficient and Scalable Calculation of Complex Band Structure Using Sakurai-Sugiura Method G Abhinav Gaba Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Alice-Agnes Gabriel Ludwig Maximilian University of Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Niall Gaffney University of Texas Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Ana Gainaru Vanderbilt University Periodic I/O Scheduling for Supercomputers Kelly Gaither University of Texas Panel Discussion: Diversifying the HPC workforce Introduction - Women in HPC: Diversifying the HPC Community Career Panel Discussion: Hints and Tips to Progress Your Career Hints and Tips for Public Speaking High Performance Computing Education in US Data Science Scientific Visualization & Data Analytics Showcase James Galarowicz Krell Institute How To Analyze the Performance of Parallel Codes 101 Steven M. 
Gallo University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Todd Gamblin Lawrence Livermore National Laboratory 4th International Workshop on HPC User Support Tools (HUST-17) Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Performance Modeling under Resource Constraints Using Deep Transfer Learning Predicting the Performance Impact of Different Fat-Tree Configurations Managing HPC Software Complexity with Spack P75: Model-Agnostic Influence Analysis for Performance Data Lin Gan Tsinghua University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Aryya Gangopadhyay University of Maryland, Baltimore County Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences Sangram Ganguly NASA Ames Research Center Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Dennis Gannon Indiana University Cloud Computing for Science and Engineering Guang Gao University of Delaware Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Tao Gao University of Delaware A23: Evaluation of Data-Intensive Applications on Intel Knights Landing Cluster Eric Garcia Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Rao V. Garimella Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Michael Garland Nvidia Corporation Parallel Depth-First Search for Directed Acyclic Graphs Nitin A. 
Gawande Pacific Northwest National Laboratory Evaluating On-Node GPU Interconnects for Deep Learning Workloads Markus Geimer Juelich Supercomputing Center Hands-On Practical Hybrid Parallel Application Performance Engineering Al Geist Oak Ridge National Laboratory Introduction - 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Brad Geltz Intel Corporation P95: GEOPM: A Scalable Open Runtime Framework for Power Management Ann Gentile Sandia National Laboratories HPC Systems Monitoring Data in Action Raffaella Gentilini University of Perugia Accelerating Energy Games Solvers on Modern Architectures Giorgis Georgakoudis Queen's University Belfast REFINE: Realistic Fault Injection via Compiler-Based Instrumentation for Accuracy, Portability and Speed Evangelos Georganas Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Alan George University of Pittsburgh Reconfigurable Supercomputing (RSC) Daniel George National Center for Supercomputing Applications, University of Illinois A13: Deep Learning with HPC Simulations for Extracting Hidden Signals: Detecting Gravitational Waves Richard Gerber Lawrence Berkeley National Laboratory Fourth SC Workshop on Best Practices for HPC Training Lisa Gerhardt Lawrence Berkeley National Laboratory Container Computing for HPC and Scientific Workflows Sandra Gesing University of Notre Dame Introduction - WORKS 2017 (12th Workshop on Workflows in Support of Large-Scale Science) Berk Geveci Kitware Inc In Situ Summarization with VTK-m Sheikh K. 
Ghafoor Tennessee Technological University Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Amir Gholami University of Texas A Framework for Scalable Biophysics-Based Image Analysis Experiencing HPC for Undergraduates: Graduate Student Perspective Devarshi Ghoshal Lawrence Berkeley National Laboratory E-HPC: A Library for Elastic Resource Management in HPC Environments Paolo Giannozzi University of Udine A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Paul Gibbon Forschungszentrum Juelich P87: EoCoE Performance Benchmarking Methodology for Renewable Energy Applications Mike Giles University of Oxford Beyond 16GB: Out-of-Core Stencil Computations P01: Cache-Blocking Tiling of Large Stencil Codes at Runtime Lauren Gillespie Southwestern University P47: Understanding Congestion on Omni-Path Fabrics Ladina Gilly Swiss National Supercomputing Centre Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Alfredo Gimenez Lawrence Livermore National Laboratory University of California, Davis ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine Judit Gimenez Barcelona Supercomputing Center Introduction - 4th International Workshop on Visual Performance Analytics – VPA 2017 Fourth International Workshop on Visual Performance Analysis – VPA 2017 Benjamin H. 
Glick Lewis & Clark College A07: Scalable Parallel Scripting in the Cloud Matthias Gobbert University of Maryland, Baltimore County Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences Jens Henrik Goebbert Forschungszentrum Juelich Comprehensive Visualization of Large-Scale Simulation Data Linked to Respiratory Flow Computations on HPC Systems Brice Goglin French Institute for Research in Computer Science and Automation (INRIA) Modeling Large Compute Nodes with Heterogeneous Memories with the Cache-Aware Roofline Model Cross-Layer Allocation and Management of Hardware Resources in Shared Memory Nodes Eng Lim Goh Hewlett Packard Enterprise HPC in Space: Supercomputing at 17,500 MPH Ali Murat Gok Argonne National Laboratory Northwestern University P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Nathan J. Goldbaum National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy Deb Goldfarb Intel Corporation Negotiation Skills Career Panel Discussion: Hints and Tips to Progress Your Career Sally Goldman Google Panel: Attracting Women and Underrepresented Minorities to HPC and Data Science Antonio Tedu A. 
Gomes National Laboratory for Scientific Computing, Brazil Supercomputing in the Shadow of Giants: Perspectives and Insights from Supercomputing Leaders Outside the “Big 5” Regions and Organizations Constantino Gomez Barcelona Supercomputing Center Five-minute presentations by young researchers from around the world - part 2 Rosalia Gomez Texas Advanced Computing Center, University of Texas High Performance Computing Education in US Data Science Qian Gong Fermi National Laboratory P43: Deep Packet/Flow Analysis Using GPUs Yifan Gong TuSimple Efficient Process Mapping in Geo-Distributed Cloud Data Centers Elsa Gonsiorowski Lawrence Livermore National Laboratory Career Panel Discussion: Hints and Tips to Progress Your Career How to Take the Next Step in Your Career Ganesh Gopalakrishnan University of Utah P84: PRESAGE: Selective Low Overhead Error Amplification for Easy Detection Mark Gordon Iowa State University An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor P32: Exploring the Performance of Electron Correlation Method Implementations on Kove XPDs Mark S. Gordon Iowa State University Porting a GAMESS Computational Chemistry Kernel to FPGAs P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi Steven Gordon Ohio Supercomputer Center A Proposed Model for Teaching Advanced Parallel Computing and Related Topics HPC Education: Meeting of the SIGHPC Education Chapter Sergei Gorlatch University of Munster PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Gerard Gorman Imperial College, London Towards Self-Verification in Finite Difference Code Generation Software Engineering and Reuse in Computational Science and Engineering R. 
Govindarajan Indian Institute of Science HPC Initiatives in India Paolo Grani University of California, Davis P49: Toward Exascale HPC Systems: Exploiting Advances in High Bandwidth Memory (HBM2) through Scalable All-to-All Optical Interconnect Architectures David Grant Oak Ridge National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Ryan Grant Sandia National Laboratories Workshop on Exascale MPI (ExaMPI) sPIN: High-Performance Streaming Processing in the Network PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Catherine Graves Hewlett Packard Computing with Physics: Analog Computation and Neural Network Classification with a Dot Product Engine Jennifer Green Los Alamos National Laboratory How To Analyze the Performance of Parallel Codes 101 Kevin Griffin Lawrence Livermore National Laboratory Scalable HPC Visualization and Data Analysis Using VisIt Leopold Grinberg IBM P79: Porting the Opacity Client Library to a CPU-GPU Cluster Using OpenMP 4.5 William Gropp University of Illinois Challenges in Programming Extreme Scale Systems Energy Efficiency Gains From Software: Retrospectives and Perspectives Advanced MPI Programming Software Engineering and Reuse in Computational Science and Engineering P70: FFT, FMM, and Multigrid on the Road to Exascale: Performance Challenges and Opportunities Tobias Grosser ETH Zurich Improved Loop Distribution in LLVM Using Polyhedral Dependences Max Grossman Rice University Graph500 on OpenSHMEM: Using a Practical Survey of Past Work to Motivate Novel Algorithmic Developments Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages Robert Grossman University of Chicago Blurring the Lines: High-End Computing and Data Science Sharing Research Data: Data Commons, Distributed Clouds, and Distributed Data Services Paola Grosso University of Amsterdam Innovating the Network for Data Intensive Science (INDIS) Kenny Gruchalla National Renewable Energy 
Laboratory Contextual Compression of Large-Scale Wind Turbine Array Simulations Hui Guan North Carolina State University Egeria: A Framework for Auto-Construction of HPC Advising Tools through Multi-Layered Natural Language Processing Qiang Guan Los Alamos National Laboratory Ultrascale Systems Research Center P92: Characterization and Comparison of Application Resilience for Serial and Parallel Executions P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Ernesto Guerrero University of Malaga Parallware Trainer: Interactive Tool for Experiential Learning of Parallel Programming Using OpenMP and OpenACC Shashank Gugnani Ohio State University A06: Accelerating Big Data Processing in the Cloud with Scalable Communication and I/O Schemes Pablo Guillen University of Houston Vistas in Advanced Computing Raghul Gunasekaran Oak Ridge National Laboratory Scientific User Behavior and Data-Sharing Trends in a Petascale File System GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Murat E. Guney Intel Corporation Designing Vector-Friendly Compact BLAS and LAPACK Kernels Peng Guo Chinese Academy of Sciences Tessellating Stencils Xinfei Guo University of Virginia RE-HASE: Regular-Expressions Hardware Synthesis Engine Xuan Guo Oak Ridge National Laboratory Introduction - The Eighth International Workshop on Data-Intensive Computing in the Clouds Yanfei Guo Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 MPICH: A High-Performance Open-Source MPI Implementation Anshul Gupta IBM Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Rajiv Gupta University of California, Riverside ParaStack: Efficient Hang Detection for MPI Programs at Large Scale Ravi Gupta Intel Corporation Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction Saurabh Gupta Intel Corporation Failures in Large Scale Systems: Long-Term Measurement, Analysis, and Implications BlazingText: Scaling and Accelerating Word2Vec using Multiple GPUs Levent Gurel University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions John L. Gustafson National University of Singapore Posit Research Posit Math Unit (PMU) – A New Approach Toward Exascale Computing Improving Numerical Computation with Practical Tools and Novel Computer Arithmetic Ethan Gutmann National Center for Atmospheric Research Performance Portability of an Intermediate-Complexity Atmospheric Research Model in Coarray Fortran Markus Götz Research Center Juelich Supporting Software Engineering Practices in the Development of Data-Intensive HPC Applications with the JuML Framework H Roland Haas National Center for Supercomputing Applications, University of Illinois P38: Benchmarking Parallelized File Aggregation Tools for Large Scale Data Management Sonja Habbinga Forschungszentrum Juelich Comprehensive Visualization of Large-Scale Simulation Data Linked to Respiratory Flow Computations on HPC Systems Salman Habib Argonne National Laboratory Cosmological Particle Data Compression in Practice Elie Hachem Mines ParisTech Supercomputing for Everyone: Meeting the Growing Needs of Businesses 
Sebastian Hack Saarland University PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Daniel Hackenberg Technical University Dresden Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Walker Haddock University of Alabama, Birmingham P45: Campaign Storage: Erasure Coding with GPUs Bilel Hadri King Abdullah University of Science and Technology Fourth SC Workshop on Best Practices for HPC Training Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Regression Testing and Monitoring Tools State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Matthieu Haefele French Alternative Energies and Atomic Energy Commission P87: EoCoE Performance Benchmarking Methodology for Renewable Energy Applications Raphael Tuvia Haftka University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Hans Hagen University of Kaiserslautern Cosmological Particle Data Compression in Practice Georg Hager University of Erlangen-Nuremberg Node-Level Performance Engineering Christoph Hagleitner IBM Application Porting and Optimization on GPU-Accelerated POWER Architectures Gabriel Hahn Baylor University P35: Using HPC to Model Quantum-Dot Cellular Automata Azzam Haidar University of Tennessee Investigating Half-Precision Arithmetic to Accelerate Dense Linear System Solvers Batched, Reproducible, and Reduced Precision BLAS Michael Haidl University of Munster PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Mahantesh Halappanavar Pacific Northwest National Laboratory HPC Graph Toolkits and the GraphBLAS Forum Mary Hall University of Utah Writing Effective Proposals Bernd Hamann University of California, Davis ScrubJay: Deriving 
Knowledge from the Disarray of HPC Performance Data Khaled Hamidouche Advanced Micro Devices Inc GPU Triggered Networking for Intra-Kernel Communications Dorit M. Hammerling National Center for Atmospheric Research Quality Assurance and Error Identification for the Community Earth System Model Simon D. Hammond Sandia National Laboratories Introduction - The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) Designing Vector-Friendly Compact BLAS and LAPACK Kernels The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) Sunggeun Han Korea Institute of Science and Technology Information P51: TuPiX-Flow: Workflow-Based Large-Scale Scientific Data Analysis System Toshihiro Hanawa University of Tokyo State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) David Hancock Indiana University Future Trends in HPC Matthew R. Hanlon Texas Advanced Computing Center, University of Texas Securing HPC: Development of a Low Cost, Open Source, Multi-Factor Authentication Infrastructure Sean Hanlon National Cancer Institute Impacting Cancer with HPC: Opportunities and Challenges Riyaz Haque Lawrence Livermore National Laboratory P79: Porting the Opacity Client Library to a CPU-GPU Cluster Using OpenMP 4.5 Guénolé Harel Atomic Energy and Alternative Energies Commission Lean Visualization of Large Scale Tree-Based AMR Meshes Siva Hari Nvidia Corporation Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Kevin Harms Argonne National Laboratory Run-to-Run Variability on Xeon Phi Based Cray XC Systems Peter Z. Harrington University of California, Santa Cruz A16: Diagnosing Parallel I/O Bottlenecks in HPC Applications J. 
Austin Harris Oak Ridge National Laboratory P26: Optimizing Gravity and Nuclear Physics in FLASH for Exascale Cyrus Harrison Lawrence Livermore National Laboratory Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine Scalable HPC Visualization and Data Analysis Using VisIt William Harrod US Department of Energy National Strategic Computing Initiative Post Moore Supercomputing National Strategic Computing Initiative Update Rebecca Hartman-Baker Lawrence Berkeley National Laboratory Fourth SC Workshop on Best Practices for HPC Training Introduction - Women in HPC: Diversifying the HPC Community Career Panel Discussion: Hints and Tips to Progress Your Career Effective Workplace Communication HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? Women in HPC: Non-Traditional Paths to HPC and How They Can and Do Enrich the Field Masayuki Hatanaka RIKEN Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 P48: Prototyping of Offloaded Persistent Broadcast on Tofu2 Interconnect Akihiro Hayashi Rice University Exploration of Supervised Machine Learning Techniques for Runtime Selection of CPU vs GPU Execution in Java Programs Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages Linda Hayden Elizabeth City State University Teaching, Learning and Collaborating through Cloud Computing Online Classes Bingsheng He National University of Singapore Efficient Process Mapping in Geo-Distributed Cloud Data Centers Conghui He Tsinghua University 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Yun (Helen) He Lawrence Berkeley National Laboratory OpenMP Common Core: A “Hands-On” Exploration Mario Heene University of Stuttgart A Highly Scalable, Algorithm-Based Fault-Tolerant Solver for Gyrokinetic Plasma Simulations Sean Hefty Intel Corporation Fabric APIs - libfabric User Perspective and C++ Standardization Alexander Heinecke Intel Corporation P31: 
Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Alan Heirich Stanford University SLAC National Accelerator Laboratory In Situ Visualization with Task-Based Parallelism Katrin Heitmann Argonne National Laboratory Cosmological Particle Data Compression in Practice Stijn Heldens University of Twente P86: HyGraph: High Performance Graph Processing on Hybrid CPU+GPUs platforms Barbara Helland US Department of Energy Small Business and the Exascale Computing Project Greg Henry Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Thomas Herault University of Tennessee Dynamic Task Discovery in PaRSEC- A Data-Flow Task-Based Runtime Reliability, Fault Tolerance, and Resilience Randy Herban Cycle Computing HPC Systems Professionals Workshop Stephen Herbein University of Delaware P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Martin Herbordt Boston University OpenCL for FPGAs/HPC: Case Study in 3D FFT Reconfigurable Supercomputing (RSC) P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable Inference Processor for Multi-Layer Perceptrons Oscar Hernandez Oak Ridge National Laboratory OpenMP 4.5 Validation and Verification Suite OpenSHMEM in the Era of Exascale Christian Herold Technical University Dresden An LLVM Instrumentation Plug-In for Score-P P67: Measuring I/O Behavior on Upcoming Systems with NVRAM P66: Analyzing Multi-Layer I/O Behavior of HPC Applications Michael A. Heroux Sandia National Laboratories St. 
John’s University Software Engineering for Computational Science and Engineering: What Can Work and What Will Not Keynote - A Holistic Approach to Advancing Science and Engineering through Extreme-Scale Computing Research Methods Linear Algebra Libraries for High-Performance Computing: Scientific Computing with Multicore and Accelerators Better Scientific Software Software Engineering and Reuse in Computational Science and Engineering Practical Reproducibility by Managing Experiments Like Software Angela M. Herring Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Emily Herron Mercer University A12: Applying Image Feature Extraction to Cluttered Scientific Repositories Andreas Herten Forschungszentrum Juelich Application Porting and Optimization on GPU-Accelerated POWER Architectures William Judson Hervey Naval Research Laboratory P18: A Parallel Python Implementation of BLAST+ (PPIB) for Characterization of Complex Microbial Consortia Mary Hester SURFnet Innovating the Network for Data Intensive Science (INDIS) W. 
Terry Hewitt WTH Associates Ltd HPC Acquisition and Commissioning Jason Hick Los Alamos National Laboratory Total Cost of Ownership and HPC System Procurement Susan Hicks Oak Ridge National Laboratory P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Mert Hidayetoglu University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions Jan Hidders Vrije Universiteit Brussel Adopting OpenCAPI for High Bandwidth Database Accelerators Joshua Higgins University of Huddersfield Teaching Parallel Computing with Container Virtualization Dean Hildebrand Google LLC Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW-DISCS) P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Use and Management of Non-Volatile Memories Pamela Hill National Center for Atmospheric Research Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Torsten Hoefler ETH Zurich Introduction - H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic Scaling Betweenness Centrality Using Communication-Efficient Sparse Matrix Multiplication sPIN: High-Performance Streaming Processing in the Network Publishing Advanced MPI Programming H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic Henry Hoffmann University of Chicago P98: Energy Efficiency in HPC with Machine Learning and Control Theory Steven Hofmeyr Lawrence Berkeley National Laboratory P98: Energy Efficiency in HPC with Machine Learning and Control Theory Peter Hofstee IBM Adopting OpenCAPI for High Bandwidth Database Accelerators Adolfy Hoisie Pacific Northwest National Laboratory Energy Efficient Supercomputing (E2SC) Representative Paths Analysis Evaluating On-Node GPU Interconnects for Deep Learning Workloads Jeffrey Hokanson Colorado School of Mines 
Contemporary Design of Supercomputer Experiments Jeffrey K. Hollingsworth University of Maryland Awards Ceremony Daniel Holmes University of Edinburgh Introduction - Women in HPC: Diversifying the HPC Community Effective Programming Models for Deep Learning at Scale Violeta Holmes University of Huddersfield Teaching Parallel Computing with Container Virtualization Hans-Christian Hoppe Intel Corporation Reconfigurable Computing in Exascale Machine Learning for Parallel Performance Analytics Reazul Hoque University of Tennessee Dynamic Task Discovery in PaRSEC- A Data-Flow Task-Based Runtime Atsushi Hori RIKEN P48: Prototyping of Offloaded Persistent Broadcast on Tofu2 Interconnect Muneo Hori University of Tokyo RIKEN Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC P09: Adaptive Multistep Predictor for Accelerating Dynamic Implicit Finite-Element Simulations P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Takane Hori Japan Agency for Marine-Earth Science and Technology P21: The First Real-Scale DEM Simulation of a Sandbox Experiment Using 2.4 Billion Particles Masashi Horikoshi Intel Corporation P09: Adaptive Multistep Predictor for Accelerating Dynamic Implicit Finite-Element Simulations William Connor Horne Naval Research Laboratory P18: A Parallel Python Implementation of BLAST+ (PPIB) for Characterization of Complex Microbial Consortia Naomi Hospodarsky University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Kaixi Hou Virginia Tech Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Mike Houston Nvidia Corporation Production Deep Learning and Scale - Keynote by Mike Houston - Senior Distinguished Engineer - Deep Learning - Nvidia Effective Programming Models for Deep Learning at Scale Paul Hovland Argonne National Laboratory Towards Self-Verification in Finite 
Difference Code Generation Awards Ceremony Louis Howell Lawrence Livermore National Laboratory Predicting the Performance Impact of Different Fat-Tree Configurations Kevin Hsieh Carnegie Mellon University Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs Tony Hsu Inventec Corporation Towards a Composable Computer System Yang Hu University of Texas, Dallas LocoFS: A Loosely-Coupled Metadata Service for Distributed File Systems Jianyu Huang University of Texas Lowering Barriers into HPC through Open Education P02: Strassen's Algorithm for Tensor Contraction Kangli Huang Delft University of Technology Adopting OpenCAPI for High Bandwidth Database Accelerators Shan Huang Chinese Academy of Sciences Tessellating Stencils Xiaoping Huang Northwestern Polytechnical University RE-HASE: Regular-Expressions Hardware Synthesis Engine Yingchao Huang University of California, Merced Unimem: Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Main Memory Martin Huarte-Espinosa University of Houston Vistas in Advanced Computing Kevin A. 
Huck University of Oregon Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code Jan Huckelheim Imperial College, London Verifying the Floating-Point Computation Equivalence of Manually and Automatically Differentiated Code Towards Self-Verification in Finite Difference Code Generation Yectli Huerta University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Ron Huizen BittWare Inc Cooling Hot FPGAs: A Thermals First Approach Saurabh Hukerikar Oak Ridge National Laboratory Five-minute presentations by young researchers from around the world - part 2 Alan Humphrey University of Utah Scientific Computing and Imaging Institute Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs Wen-Mei Hwu University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions Thomas Häner ETH Zurich 0.5 Petabyte Simulation of a 45-Qubit Quantum Circuit Thomas Hérault University of Tennessee Fault-Tolerance for High Performance and Distributed Computing: Theory and Practice Markus Höhnerbach RWTH Aachen University A04: Optimization of the AIREBO Many-Body Potential for KNL I Costin Iancu Lawrence Berkeley National Laboratory Introduction - PAW 2017: The 2nd Annual PGAS Applications Workshop Huda Ibeid University of Illinois P70: FFT, FMM, and Multigrid on the Road to Exascale: Performance Challenges and Opportunities Tsuyoshi Ichimura University of Tokyo RIKEN Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC P09: Adaptive Multistep Predictor for Accelerating Dynamic Implicit Finite-Element Simulations P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Akihiro Ida University of Tokyo Keynote - Application Development Framework for Manycore Architectures on Post-Peta/Exascale Systems Yasuhiro Idomura Japan Atomic 
Energy Agency Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Shuichi Ihara DataDirect Networks A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Aleksandar Ilic INESC-ID Modeling Large Compute Nodes with Heterogeneous Memories with the Cache-Aware Roofline Model Performance Tuning of Scientific Codes with the Roofline Model Akira Imakura University of Tsukuba Efficient and Scalable Calculation of Complex Band Structure Using Sakurai-Sugiura Method Toshiyuki Imamura RIKEN Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Connor Imes University of Chicago P98: Energy Efficiency in HPC with Machine Learning and Control Theory Takuya Ina Japan Atomic Energy Agency Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Martins D. 
Innus University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Koji Inoue Kyushu University P78: Performance Evaluation of Graph500 Considering CPU-DRAM Power Shifting Joseph Insley Argonne National Laboratory Flexible In Situ Visualization of LAMMPS Simulations Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout Large Scale Visualization with ParaView Visualizing Silicene Growth Through Island Migration and Coalescence Bertrand Iooss EDF France Melissa: Large Scale In Transit Global Sensitivity Analysis Avoiding Intermediate Files Alexandru Iosup Vrije University Amsterdam P86: HyGraph: High Performance Graph Processing on Hybrid CPU+GPUs platforms Yutaka Ishikawa RIKEN P48: Prototyping of Offloaded Persistent Broadcast on Tofu2 Interconnect HPC Impact Showcase: Healthcare and Manufacturing Yuki Ito Tokyo Institute of Technology P05: ooc_cuDNN : A Deep Learning Library Supporting CNNs over GPU Memory Capacity Shigeru Iwase University of Tsukuba Efficient and Scalable Calculation of Complex Band Structure Using Sakurai-Sugiura Method Hidetoshi Iwashita RIKEN Preliminary Performance Evaluation of Coarray-based Implementation of Fiber Miniapp Suite Using XcalableMP PGAS Language J Christiane Jablonowski University of Michigan Parallel Computing 101 Adrian Jackson University of Edinburgh Introduction - Women in HPC: Diversifying the HPC Community Amina Jackson Naval Research Laboratory P18: A Parallel Python Implementation of BLAST+ (PPIB) for Characterization of Complex Microbial Consortia Arpith Jacob IBM Implementing Implicit OpenMP Data Sharing on GPUs Sam Ade Jacobs Lawrence Livermore National Laboratory Toward Scalable Parallel Training of Deep Neural Networks Heike Jagode University of Tennessee P72: New Developments for PAPI 5.6+ Magnus Jahre Norwegian University of Science and Technology Toward Aggregated Grain Graphs Nikhil Jain Lawrence Livermore National Laboratory 
Performance Modeling under Resource Constraints Using Deep Transfer Learning Predicting the Performance Impact of Different Fat-Tree Configurations Modeling and Simulation of Communication in HPC Systems P75: Model-Agnostic Influence Analysis for Performance Data William JALBY Versailles Saint-Quentin-en-Yvelines University Workshop on Extreme-Scale Programming Tools (ESPT) Siddhartha Jana Intel Corporation State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers P95: GEOPM: A Scalable Open Runtime Framework for Power Management Matthias Janetschek University of Innsbruck A Compiler Transformation-Based Approach to Scientific Workflow Enactment Dongmin Jang Korea Institute of Science and Technology Information Visualization of Decision-Making Support (DMS) Information for Responding to a Typhoon-Induced Disaster Tomislav Janjusic Mellanox Technologies Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Niclas Jansson KTH Royal Institute of Technology P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer Jiri Jaros Brno University of Technology P40: Running Large-Scale Ultrasound Simulations on Piz Daint with 512 Pascal GPUs Stephen A. 
Jarvis University of Warwick Introduction - The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) An Efficient Task-Based All-Reduce for Machine Learning Applications The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) Emmanuel Jeannot French Institute for Research in Computer Science and Automation (INRIA) Modeling Large Compute Nodes with Heterogeneous Memories with the Cache-Aware Roofline Model Cross-Layer Allocation and Management of Hardware Resources in Shared Memory Nodes Elizabeth Jessup University of Colorado, Boulder Careers in HPC Test of Time Award Special Lecture Morris Jette SchedMD LLC Slurm User Group Meeting Xu Ji Tsinghua University Qatar Computing Research Institute Understanding Object-Level Memory Access Patterns Across the Spectrum Kenneth Jiang Naval Research Laboratory P18: A Parallel Python Implementation of BLAST+ (PPIB) for Characterization of Complex Microbial Consortia Ivo Jimenez University of California, Santa Cruz Practical Reproducibility by Managing Experiments Like Software Judit Jimenez Barcelona Supercomputing Center Workshop on Extreme-Scale Programming Tools (ESPT) Zheming Jin Argonne National Laboratory P46: Understanding How OpenCL Parameters Impact on Off-Chip Memory Performance of FPGA Platforms Minsu Joh Korea Institute of Science and Technology Information Visualization of Decision-Making Support (DMS) Information for Responding to a Typhoon-Induced Disaster Lizy K. 
John University of Texas GPU Triggered Networking for Intra-Kernel Communications Chris Johnson University of Utah Medical Image Analysis and Visualization Experiencing HPC for Undergraduates: Introduction to HPC Research Travis Johnston Oak Ridge National Laboratory Optimizing Convolutional Neural Networks for Cloud Detection Andrew Jones Numerical Algorithms Group Essential HPC Finance Practice: Total Cost of Ownership (TCO), Internal Funding, and Cost-Recovery Models Extracting Value from HPC: Business Cases, Planning, and Investment HPC Acquisition and Commissioning Catherine Jones Science and Technology Facilities Council Software Engineers: Careers in Research Matthew D. Jones University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Kirk Jordan IBM Hartree Centre Making HPC Consumable: Helping Wet-Lab Chemists Access the Power of Computational Methods Thomas H. Jordan University of Southern California rvGAHP – Push-Based Job Submission Using Reverse SSH Connections Jithin Jose Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Keyur Joshi University of Illinois Implementation of a Cache Miss Calculator in LLVM/Polly Sebastien Jourdain Kitware Inc In Situ Summarization with VTK-m Guido Juckeland Helmholtz-Zentrum Dresden-Rossendorf Introduction - Fourth Workshop on Accelerator Programming Using Directives (WACCPD) The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact OpenACC API User Experience, Vendor Reaction, Relevance, and Roadmap Fourth Workshop on Accelerator Programming Using Directives (WACCPD) Christoph Junghans Los Alamos National Laboratory P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Gideon Juve Information Sciences Institute, University of Southern California rvGAHP – Push-Based Job Submission Using Reverse SSH Connections Thierry Jéron French Institute for Research in Computer Science and Automation (INRIA) Verifying MPI Applications with SimGridMC K Humayun Kabir Pennsylvania State University Hierarchical Sparse Graph Computations on Multicore Platforms David Kahaner Asian Technology Information Program Welcome and Introduction ATIP Workshop on International Exascale and Next-Generation Computing Programs Bhavya Kailkhura Lawrence Livermore National Laboratory Performance Modeling under Resource Constraints Using Deep Transfer Learning Hartmut Kaiser Louisiana State University HPX Smart Executors Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code Jürgen Kaiser Johannes Gutenberg University Mainz A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Yuta Kakibuka Kyushu University P78: Performance Evaluation of Graph500 Considering CPU-DRAM Power Shifting Laxmikant Kale University of Illinois Visualizing, Measuring, and Tuning Adaptive MPI Parameters Integrating OpenMP into the Charm++ Programming Model Migratable Objects and 
Task-Based Parallel Programming with Charm++ Vivek Kale University of Southern California P80: Adaptive Loop Scheduling with Charm++ to Improve Performance of Scientific Applications Dhiraj Kalmakar Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Sharan Kalwani DataSwing Data Center Design and Planning for HPC Folks Laxmikant Kalé University of Illinois Charmworks Inc Charm++ and AMPI: Adaptive and Asynchronous Parallel Programming Yasushi Kamata Railway Technical Research Institute P22: Numerical Simulation of Snow Accretion by Airflow Simulator and Particle Simulator Supun Kamburugamuve Indiana University Teaching, Learning and Collaborating through Cloud Computing Online Classes Shoaib Kamil Adobe Research Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis Krishna Kant Temple University Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Silent Errors in HPC Systems Larry Kaplan Cray Inc The ARM Software Ecosystem: Are We There Yet? 
Karen Karavanic Portland State University Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Ian Karlin Lawrence Livermore National Laboratory Predicting the Performance Impact of Different Fat-Tree Configurations DataRaceBench: A Benchmark Suite for Systematic Evaluation of Data Race Detection Tools P76: A Compiler Agnostic and Architecture Aware Predictive Modeling Framework for Kernels Sven Karlsson Technical University of Denmark Reconfigurable Computing in Exascale Yoshihiro Kasai Fujitsu Ltd P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Julian Kates-Harbeck Harvard University Training Distributed Deep Recurrent Neural Networks with Mixed Precision on GPU Clusters Daniel S. Katz University of Illinois Introduction - The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Promoting Scientific Workflows Experiencing HPC for Undergraduates: Careers in HPC Software Engineering and Reuse in Computational Science and Engineering Software Engineers: Careers in Research High Performance Computing Education in US Data Science State of the Practice: Operations The 2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CoDeSE 2017) Masatoshi Kawai University of Tokyo Keynote - Application Development Framework for Manycore Architectures on Post-Peta/Exascale Systems Petr Kaštovský Netcope Technologies Case Study: Usage of High Level Synthesis in HPC Networking Kate Keahey Argonne National Laboratory Practical Reproducibility by Managing Experiments Like Software Distributed Computing and Clouds Stephen Keckler Nvidia Corporation Understanding Error Propagation in Deep Learning Neural Network (DNN) 
Accelerators and Applications Kristopher Keipert Iowa State University Porting a GAMESS Computational Chemistry Kernel to FPGAs An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi Anna Blue Keleher University of Maryland A11: Finding a Needle in a Field of Haystacks: Lightweight Metadata Search for Large-Scale Distributed Research Repositories Alison Kennedy Hartree Centre Introduction - Women in HPC: Diversifying the HPC Community Garrett Kenyon Los Alamos National Laboratory New Mexico Consortium P88: PetaVision Neural Simulation Toolbox on Intel KNLs Darren Kerbyson Pacific Northwest National Laboratory Representative Paths Analysis Dan Kerns Quantum Corporation Rook Distributed Storage System Janis Keuper Fraunhofer Institute for Industrial Mathematics TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization Introduction - Machine Learning in HPC Environments David Keyes King Abdullah University of Science and Technology KAUST’s HiCMA Library: Hierarchical Computations on Manycore Architectures How Serious Are We About the Convergence Between HPC and Big Data? Software Engineering and Reuse in Computational Science and Engineering Walid Keyrouz National Institute for Standards and Technology Computational Reproducibility at Exascale 2017 (CRE2017) Zahra Khatami Louisiana State University HPX Smart Executors Gul rukh Khattak CERN P29: A Deep Learning Tool for Fast Simulation Farzad Khorasani Georgia Institute of Technology Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs S. E. 
Khudikyan Jet Propulsion Laboratory realfast@VLA Samuel Khuvis ParaTools P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures John Kichury Hewlett Packard Enterprise HPC in Space: Supercomputing at 17,500 MPH Ron Kikinis Harvard University Medical Image Analysis and Visualization Eugene Kikinzon Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures William Killian Millersville University of Pennsylvania University of Delaware The Design and Implementation of OpenMP 4.5 and OpenACC Backends for the RAJA C++ Performance Portability Layer P76: A Compiler Agnostic and Architecture Aware Predictive Modeling Framework for Kernels Gwangsun Kim ARM Ltd Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs Hyunwoo Kim Korea Institute of Science and Technology Information P51: TuPiX-Flow: Workflow-Based Large-Scale Scientific Data Analysis System Jeongnim Kim Intel Corporation Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations Jungwon Kim Oak Ridge National Laboratory PapyrusKV: A High-Performance Parallel Key-Value Store for Distributed NVM Architectures Kyungjoo Kim Sandia National Laboratories Designing Vector-Friendly Compact BLAS and LAPACK Kernels Mark Kim Oak Ridge National Laboratory In Situ Visualization of Radiation Transport Geometry Nam Ho Kim University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Youngjae Kim Sogang University Understanding Object-Level Memory Access Patterns Across the Spectrum TagIt: An Integrated Indexing and Search Service for File Systems Jason S. Kimko College of William and Mary P79: Porting the Opacity Client Library to a CPU-GPU Cluster Using OpenMP 4.5 Tom King Queen Mary University of London OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Anantha P. 
Kinnal Calligo Technologies Posit Research Posit Math Unit (PMU) – A New Approach Toward Exascale Computing Andrew C. Kirby University of Wyoming P28: High-Fidelity Blade-Resolved Wind Plant Modeling Christine Kirkpatrick San Diego Supercomputer Center Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Sherman J. Kisner High Performance Imaging LLC Massively Parallel 3D Image Reconstruction Kevin D. Kissell Google Deep Learning for Science in the Cloud Per Gunnar Kjeldsberg Norwegian University of Science and Technology Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Umayanganie Klaassen University of Texas, El Paso Porting a GAMESS Computational Chemistry Kernel to FPGAs Scott Klasky Oak Ridge National Laboratory Introduction - The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) MGARD: A Multilevel Technique for Compression of Floating-Point Data Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) Kerstin Kleese van Dam Brookhaven National Laboratory HPC Application Development Tools Lars Klein University of Munster PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Romain Klein Transvalor SA Aeromines Supercomputing for Everyone: Meeting the Growing Needs of Businesses Michael Klemm Intel Corporation Advanced OpenMP: Performance and 4.5 Features Mastering Tasking with OpenMP OpenMP: Enabling HPC for Twenty Years OpenMP® is Twenty. Where Is It Going? 
Alicia Klinvex Sandia National Laboratories Better Scientific Software Hannah Klion University of California, Berkeley Oak Ridge National Laboratory P26: Optimizing Gravity and Nuclear Physics in FLASH for Exascale Sarah Knepper Intel Corporation Designing Vector-Friendly Compact BLAS and LAPACK Kernels Christopher Knight Argonne National Laboratory Scalable In Situ Analysis of Molecular Dynamics Simulations Matthew Knight Metamako LP A Networked-FPGA Platform Offering Flexible Ethernet Switching from Layer 1 All the Way to Full SDN via P4 Aaron Knoll University of Utah Flexible In Situ Visualization of LAMMPS Simulations Andreas Knuepfer Technical University Dresden Performance Evaluation Tools Christina Koch University of Wisconsin HPC Carpentry - Practical, Hands-On HPC Training Greg Koeing Energy Efficient HPC Working Group State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Greg Koenig KPMG Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Peter Kogge University of Notre Dame 15th Graph500 List Memory-Centric Architectures for the Cloud and HPC A Case for Migrating Execution for Irregular Applications Hidetaka Koie National Institute of Advanced Industrial Science and Technology P97: Profile Guided Kernel Optimization for Individual Container Execution on Bare-Metal Container Bastian Koller High Performance Computing Center Stuttgart How Serious Are We About the Convergence Between HPC and Big Data? 
Martin Kong Brookhaven National Laboratory Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading Alice Koniges Lawrence Berkeley National Laboratory OpenMP Common Core: A “Hands-On” Exploration Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code Rob Kooper University of Illinois Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Israel Koren University of Massachusetts Experimental and Analytical Study of Xeon Phi Reliability Tuomas S. Koskela Lawrence Berkeley National Laboratory Performance Tuning of Scientific Codes with the Roofline Model Douglas Kothe Oak Ridge National Laboratory Exascale Challenges and Opportunities Anthony Kougkas Illinois Institute of Technology Enosis: Bridging the Semantic Gap between File-Based and Object-Based Data Models Spiros Koulouzis University of Amsterdam Seamless Infrastructure Customization and Performance Optimization for Time-Critical Services in Data Infrastructures Jelena Kovacevic Carnegie Mellon University P06: Large Scale FFT-Based Stress-Strain Simulations with Irregular Domain Decomposition Patricia Kovatch Icahn School of Medicine at Mount Sinai Medical Image Analysis and Visualization Computational Approaches for Cancer Impacting Cancer with HPC: Opportunities and Challenges Quincey Koziol Lawrence Berkeley National Laboratory The HDF5 Dataverse Cassie Kozyrkov Google Deep Learning for Science in the Cloud Matthew Krafczyk National Center for Supercomputing Applications, University of Illinois P91: Assessing the Availability of Source Code in Computational Physics Reproducibility and Uncertainty in High Performance Computing Dieter Kranzlmueller Ludwig Maximilian University of Munich Power-Aware High Performance Computing: Challenges and Opportunities for Application and System Developers Jiri Kraus Nvidia Corporation Application Porting and Optimization on GPU-Accelerated POWER Architectures Michael Krause Gen-Z Consortium 
Understanding Gen-Z Technology – A High Performance Interconnect for the Data-Centric Future Michal Kravcenko Technical University of Ostrava P03: BEM4I: A Massively Parallel Boundary Element Solver Rebecca Kreitinger University of New Mexico P93: Spacehog: Evaluating the Costs of Dedicating Resources to In Situ Analysis Hamid Krim North Carolina State University Egeria: A Framework for Auto-Construction of HPC Advising Tools through Multi-Layered Natural Language Processing Sriram Krishnamoorthy Pacific Northwest National Laboratory WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for High-Performance Computing Automatic Risk-Based Selective Redundancy for Fault-Tolerant Task-Parallel HPC Applications Silent Errors in HPC Systems P84: PRESAGE: Selective Low Overhead Error Amplification for Easy Detection Mads R. B. Kristensen University of Copenhagen Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Martin Kronbichler Technical University Munich P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II Mark R. 
Krumholz Australian National University University of California, Santa Cruz Milky Way Analogue Isolated Disk Galaxy Jeff Kuehn Los Alamos National Laboratory OpenSHMEM in the Era of Exascale Michael Kuhn University of Hamburg P57: Adaptive Tier Selection for NetCDF and HDF5 Navjot Kukreja Imperial College, London Towards Self-Verification in Finite Difference Code Generation Anuva Kulkarni Carnegie Mellon University P06: Large Scale FFT-Based Stress-Strain Simulations with Irregular Domain Decomposition Abhishek Kumar Brookhaven National Laboratory A19: Performance Analysis of a Parallelized Restricted Boltzmann Machine Artificial Neural Network Using OpenACC Framework and TAU Profiling System Nalini Kumar University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Rick Kumar Sanmina Corporation Building End-to-End NVMe over Fabric Infrastructure for HPC Vipin Kumar University of Minnesota Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Kalyan Kumaran Argonne National Laboratory Run-to-Run Variability on Xeon Phi Based Cray XC Systems Julian Kunkel German Climate Computing Center Analyzing Parallel I/O The Virtual Institute of I/O and the IO-500 P57: Adaptive Tier Selection for NetCDF and HDF5 P15: Toward Decoupling the Selection of Compression Algorithms from Quality Constraints Shannon Kuntz Emu Solutions Inc. 
A Case for Migrating Execution for Irregular Applications Thorsten Kurth Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Gregory Kurtzer SingularityWare LLC Containers in HPC Jakub Kurzak University of Tennessee Linear Algebra Libraries for High-Performance Computing: Scientific Computing with Multicore and Accelerators L Jesus Labarta Barcelona Supercomputing Center Kennedy Award Presentation: The Real Revolution … from the Latency to the Throughput Age Ignacio Laguna Lawrence Livermore National Laboratory Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction Introduction - 1st International Workshop on Software Correctness for HPC Applications (Correctness 2017) REFINE: Realistic Fault Injection via Compiler-Based Instrumentation for Accuracy, Portability and Speed Correctness 2017: First International Workshop on Software Correctness for HPC Applications Kartik Lakhotia University of Southern California Five-minute presentations by young researchers from around the world - part 1 Maddegedara Lalith University of Tokyo RIKEN P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Debra Lam Managing Director for Smart Cities & Inclusive Innovation Georgia Institute of Technology HPC Connects Plenary: The Century of the City Herman Lam University of Florida A FPGA-Pipelined Approach for Accelerated Discrete-Event Simulation of HPC Systems Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Reconfigurable Supercomputing (RSC) Michael Lam James Madison University Lawrence Livermore National Laboratory Improving Numerical Computation with Practical Tools and Novel Computer Arithmetic Sandy Landsberg US Department of Defense HPC Modernization Program Blurring the Lines: High-End Computing and Data Science Jonas L. 
Landsgesell University of Stuttgart P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Joshua Landwehr Pacific Northwest National Laboratory Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Michael Lang Los Alamos National Laboratory NUMA Distance for Heterogeneous Memory Modeling UGAL on the Dragonfly Topology A Comparative Study of SDN and Adaptive Routing on Dragonfly Networks P55: Incorporating Proactive Data Rescue into ZFS Disk Recovery for Enhanced Storage Reliability Michael Lange Imperial College, London Towards Self-Verification in Finite Difference Code Generation Ulrich Langenbach Beuth University of Applied Sciences Berlin Heterogeneous Multi-Processing in Software-Defined Cloud Storage Nodes Akhil Langer Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Johannes Langguth Simula Research Laboratory Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures James H. 
Laros Sandia National Laboratories Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Total Cost of Ownership and HPC System Procurement PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Matthew Larsen Lawrence Livermore National Laboratory Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine The ALPINE In Situ Infrastructure: Ascending from the Ashes of Strawman Rob Latham Argonne National Laboratory Parallel I/O in Practice Scott Lathrop University of Illinois Fourth SC Workshop on Best Practices for HPC Training Promoting Scientific Workflows A Proposed Model for Teaching Advanced Parallel Computing and Related Topics Software Engineering and Reuse in Computational Science and Engineering Dairsie Latimer Red Oak Consulting HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? Casey J. Law University of California, Berkeley realfast@VLA T. Joseph W. Lazio Jet Propulsion Laboratory realfast@VLA Valentin Le Fèvre ENS Lyon Periodic I/O Scheduling for Supercomputers Elizabeth Leake STEM-Trek Special Interest Group on HPC in Resource Constrained Environments (SIGHPC-RCE) Christopher Leap University of New Mexico P47: Understanding Congestion on Omni-Path Fabrics Michael LeBeane University of Texas Advanced Micro Devices Inc GPU Triggered Networking for Intra-Kernel Communications Anton Lebedev University of Tubingen Invited Talk - On Improved Monte Carlo Hybrid Methods for Preconditioner Computations Youenn Lebras University of Versailles Five-minute presentations by young researchers from around the world - part 2 Gregory L. 
Lee Lawrence Livermore National Laboratory Managing HPC Software Complexity with Spack Hyungro Lee Indiana University Teaching, Learning and Collaborating through Cloud Computing Online Classes Jinho Lee IBM Adopting OpenCAPI for High Bandwidth Database Accelerators JunKyu Lee Queen's University Belfast P11: Energy-Efficient Transprecision Techniques for Iterative Refinement Matthew Lee Carnegie Mellon University A Family of Provably Correct Algorithms for Exact Triangle Counting Seyong Lee Oak Ridge National Laboratory Porting a GAMESS Computational Chemistry Kernel to FPGAs PapyrusKV: A High-Performance Parallel Key-Value Store for Distributed NVM Architectures Wonchan Lee Stanford University Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Miriam Leeser Northeastern University Computational Reproducibility at Exascale 2017 (CRE2017) Reproducibility and Uncertainty in High Performance Computing Matthew Legendre Lawrence Livermore National Laboratory Managing HPC Software Complexity with Spack Joshua Leibfried University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data John D. Leidel Tactical Computing Laboratories Workshop for Open Source Supercomputing Pressure-Driven Hardware Managed Thread Concurrency for Irregular Applications Bit Contiguous Memory Allocation for Processing In Memory Reconfigurable Computing in Exascale Jason Leigh University of Hawaii, Manoa SAGE2 9th Annual International SC BOF: Scalable Amplified Group Environment for Global Collaboration Matthew Leininger Lawrence Livermore National Laboratory Predicting the Performance Impact of Different Fat-Tree Configurations Jacques Bernard LEKIEN Atomic Energy and Alternative Energies Commission Lean Visualization of Large Scale Tree-Based AMR Meshes Bryce A. Lelbach Lawrence Berkeley National Laboratory Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code Edgar A. 
Leon Lawrence Livermore National Laboratory Predicting the Performance Impact of Different Fat-Tree Configurations Siew Hoon Leong National Supercomputing Center Singapore Posit Research Posit Math Unit (PMU) – A New Approach Toward Exascale Computing Richard Lethin Reservoir Labs Inc Yale University Small Business and the Exascale Computing Project Mary Ann Leung Sustainable Horizons Institute Forming Connections I: Connecting Sideways, with Ourselves and Our Peers Randall LeVeque University of Washington Software Engineering and Reuse in Computational Science and Engineering John Levesque Cray Inc Fortran Is 60 Years Old - Has It Changed for the Better? Joshua A. Levine University of Arizona Introduction - 4th International Workshop on Visual Performance Analytics – VPA 2017 Panel Discussion: Challenges and the Future of HPC Performance Visualization Fourth International Workshop on Visual Performance Analysis – VPA 2017 James Levitt University of Texas Geometry-Oblivious FMM for Compressing Dense SPD Matrices Scott Levy Sandia National Laboratories P93: Spacehog: Evaluating the Costs of Dedicating Resources to In Situ Analysis Stuart A. 
Levy National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Ang Li Pacific Northwest National Laboratory Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Chung-Gang Li Kobe University RIKEN P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer Dong Li University of California, Merced Unimem: Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Main Memory P92: Characterization and Comparison of Application Resilience for Serial and Parallel Executions Guanpeng Li University of British Columbia Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Hongbo Li University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform ParaStack: Efficient Hang Detection for MPI Programs at Large Scale Jiajia Li Georgia Institute of Technology Five-minute presentations by young researchers from around the world - part 1 P10: HiCOO: A Hierarchical Sparse Tensor Format for Tensor Decompositions Jiang Li Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Lingda Li Brookhaven National Laboratory Developing an OpenMP Runtime for UVM-Capable GPUs Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading Shaomeng Li National Center for Atmospheric Research University of Oregon Performance Impacts of In Situ Wavelet Compression on Scientific Simulations Sihuan Li University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform Siyang Li Tsinghua University LocoFS: A Loosely-Coupled Metadata Service for Distributed File Systems Tao Li University of Florida LocoFS: A Loosely-Coupled Metadata Service for Distributed File Systems 
Tiffany Li National Center for Supercomputing Applications, University of Illinois P38: Benchmarking Parallelized File Aggregation Tools for Large Scale Data Management Tonglin Li Oak Ridge National Laboratory Introduction - The Eighth International Workshop on Data-Intensive Computing in the Clouds Weijun Li Shenzhen DAPU Microelectronics Company Introducing DPU - Data-Storage Processing Unit – Placing Intelligence in Storage Xi Li DataDirect Networks A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Yan Li University of California, Santa Cruz CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning P65: CAPES: Unsupervised System Performance Tuning Using Neural Network-Based Deep Reinforcement Learning Zhenyu Li University of Warwick An Efficient Task-Based All-Reduce for Machine Learning Applications Xin Liang University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Yishuang Liang Beijing Normal University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Chunhua Liao Lawrence Livermore National Laboratory DataRaceBench: A Benchmark Suite for Systematic Evaluation of Data Race Detection Tools Junfeng Liao Tsinghua University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Hyun Lim Brigham Young University P63: FleCSPH: a Parallel and Distributed Smoothed Particle Hydrodynamics Framework Based on FleCSI Seung-Hwan Lim Oak Ridge National Laboratory Scientific User Behavior and Data-Sharing Trends in a Petascale File System TagIt: An Integrated Indexing and Search Service for File Systems James Lin Shanghai Jiao Tong University Software Engineering and Reuse in Computational Science and Engineering Jin Lin Intel Corporation LLVM Compiler 
Implementation for Explicit Parallelization and SIMD Vectorization Lan Lin Ball State University Designing a Synchronization-Reducing Clustering Method on Manycores: Some Issues and Improvements Meifeng Lin Brookhaven National Laboratory The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Pei-Hung Lin Lawrence Livermore National Laboratory Verifying the Floating-Point Computation Equivalence of Manually and Automatically Differentiated Code DataRaceBench: A Benchmark Suite for Systematic Evaluation of Data Race Detection Tools Iris Linck University of Colorado, Denver P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Peggy Lindner University of Houston Vistas in Advanced Computing Peter Lindstrom Lawrence Livermore National Laboratory Compression of Scientific Data John C. Linford ParaTools P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Hierarchical Memory Usage Andreas Lintermann RWTH Aachen University Juelich Aachen Research Alliance Comprehensive Visualization of Large-Scale Simulation Data Linked to Respiratory Flow Computations on HPC Systems Don Lipari Lawrence Livermore National Laboratory P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Gengchen Liu University of California, Davis Silicon Photonic LIONS: All-to-All Interconnects for Energy-Efficient, Scalable, and Modular HPC Systems Qing Gary Liu New Jersey Institute of Technology Introduction - The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) The 2nd International Workshop on Data Reduction for Big Scientific Data (DRBSD-2) Si Liu University of Texas Advanced Manycore Programming (KNL) Weifeng Liu University of Copenhagen Norwegian University of Science and Technology Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Weiguo Liu Shandong University Redesigning CAM-SE for Petascale Climate Modeling Performance on 
Sunway TaihuLight 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Xinlian Liu Lawrence Berkeley National Laboratory P36: A Novel Feature-Preserving Spatial Mapping for Deep Learning Classification of Ras Structures Yan Liu University of Maine P72: New Developments for PAPI 5.6+ P33: Massively Parallel Evolutionary Computation for Empowering Electoral Reform: Quantifying Gerrymandering via Multi-objective Optimization and Statistical Analysis Yuanlai Liu University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform Karl Ljungkvist Uppsala University P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II Li-Ta Lo Los Alamos National Laboratory P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Jay Lofstead Sandia National Laboratories Supercomputing in the Shadow of Giants: Perspectives and Insights from Supercomputing Leaders Outside the “Big 5” Regions and Organizations The Virtual Institute of I/O and the IO-500 Practical Reproducibility by Managing Experiments Like Software Bruce Loftis Independent Students@SC17 Welcome and Opening Session Gabriel H. Loh Advanced Micro Devices Inc Leveraging Near Data Processing for High-Performance Checkpoint/Restart Julien Loiseau University of Reims Champagne-Ardenne P63: FleCSPH: a Parallel and Distributed Smoothed Particle Hydrodynamics Framework Based on FleCSI Josip Loncaric Los Alamos National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Bill Long Cray Inc Introduction - PAW 2017: The 2nd Annual PGAS Applications Workshop Darrell D. E. 
Long University of California, Santa Cruz CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning P65: CAPES: Unsupervised System Performance Tuning Using Neural Network-Based Deep Reinforcement Learning Guy Lonsdale Scapos HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? Patty Lopez Intel Corporation Building Your Professional Persona Building Your Professional Persona Francesc-Josep Lordan Barcelona Supercomputing Center Polytechnic University of Catalonia Enabling GPU Support for the COMPSs-Mobile Framework Burlen Loring Lawrence Berkeley National Laboratory In Situ Analysis and Visualization with SENSEI Dominik Marek Loroch Fraunhofer Institute for Industrial Mathematics TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization Gerald Lotto Mellanox Technologies Interconnect Your Future with Mellanox “Smart” Interconnect Steven G. Louie Lawrence Berkeley National Laboratory University of California, Berkeley P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Tze Meng Low Carnegie Mellon University A Family of Provably Correct Algorithms for Exact Triangle Counting Mike Lowe Indiana University OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure David Lowenthal University of Arizona Energy Efficient Supercomputing (E2SC) Power-Aware High Performance Computing: Challenges and Opportunities for Application and System Developers Hatem Ltaief King Abdullah University of Science and Technology How Serious Are We About the Convergence Between HPC and Big Data? 
Hao Lu Oak Ridge National Laboratory Spherical Region Queries on Multicore Architectures Xiaoyi Lu Ohio State University Big Data Meets HPC: Exploiting HPC Technologies for Accelerating Big Data Processing and Management Building Efficient Clouds for HPC, Big Data, and Deep Learning Middleware and Applications Accelerating Big Data Processing and Machine/Deep Learning Middleware on Modern HPC Clusters Scalable Reduction Collectives with Data Partitioning-Based Multi-Leader Design Youyou Lu Tsinghua University LocoFS: A Loosely-Coupled Metadata Service for Distributed File Systems Yutong Lu Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Robert F. Lucas University of Southern California Invited Talks 1 Juan Lucio-Vega University of Delaware The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Sebastian Luehrs Forschungszentrum Juelich Juelich Supercomputing Center P87: EoCoE Performance Benchmarking Methodology for Renewable Energy Applications Jakob Luettgau German Climate Computing Center P57: Adaptive Tier Selection for NetCDF and HDF5 Hui Lui University of Illinois Simulation and Visual Representation of Tropical Cyclone-Ocean Interactions Ronald Peter Luijten IBM DOME Hot-Water Cooled MicroDataCenter Justin P. 
Luitjens Nvidia Corporation P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Andrew Lumsdaine Pacific Northwest National Laboratory University of Washington Thinking Strategically 15th Graph500 List Elias Lundmark University West Sweden P44: Increasing Throughput of Multiprogram HPC Workloads: Evaluating a SMT Co-Scheduling Approach Thomas Lundqvist University West Sweden P44: Increasing Throughput of Multiprogram HPC Workloads: Evaluating a SMT Co-Scheduling Approach Xi Luo University of Tennessee Data Analysis of Earth System Simulation within an In Situ Infrastructure Ye Luo Argonne National Laboratory Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations Yingyi Luo Northwestern University P46: Understanding How OpenCL Parameters Impact on Off-Chip Memory Performance of FPGA Platforms Ziqing Luo University of Delaware Towards Self-Verification in Finite Difference Code Generation P83: Contracts for Message-Passing Programs Fabio Luporini Imperial College, London Towards Self-Verification in Finite Difference Code Generation Piotr Luszczek University of Tennessee Batched, Reproducible, and Reduced Precision BLAS Benjamin Lynch University of Minnesota OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Ceph Applications in HPC Environments P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Geoff Lyon CoolIT Systems Inc Chip-to-Atmosphere: Providing Safe and Effective Cooling for High-Density, High-Performance Data Center Environments Michael Lysaght Irish Centre for High End Computing Introduction - H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic H2RC: Third International Workshop on Heterogeneous Computing with Reconfigurable Logic
M
Prabhat M Lawrence Berkeley National Laboratory Effective Programming Models for Deep Learning at Scale Xiaosong Ma Qatar Computing Research Institute Understanding Object-Level 
Memory Access Patterns Across the Spectrum Barney Maccabe Oak Ridge National Laboratory Performance, Advancement, and Promotions Lalith Maddegedara University of Tokyo RIKEN Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC P09: Adaptive Multistep Predictor for Accelerating Dynamic Implicit Finite-Element Simulations Elizabeth H. Madden Ludwig Maximilian University of Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Kavitha Madhu Argonne National Laboratory MPICH: A High-Performance Open-Source MPI Implementation Philip J. Maechling University of Southern California rvGAHP – Push-Based Job Submission Using Reverse SSH Connections Shinya Maeyama Nagoya University P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Donald Maghrak Krell Institute How To Analyze the Performance of Parallel Codes 101 Tom Maiden Pittsburgh Supercomputing Center From Outreach to Education to Researcher - Innovative Ways of Expanding the HPC Community Liudmila S. Mainzer National Center for Supercomputing Applications, University of Illinois P38: Benchmarking Parallelized File Aggregation Tools for Large Scale Data Management Matthias Maiterth Intel Corporation State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Preeti Malakar Argonne National Laboratory Scalable In Situ Analysis of Molecular Dynamics Simulations Tareq Malas Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Abdul N. Malmi-Kakkada University of Texas Physical Signatures of Cancer Metastasis Chris M. Malone Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Allen D. 
Malony University of Oregon Workshop on Extreme-Scale Programming Tools (ESPT) Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine Performance Tuning Carlos Maltzahn University of California, Santa Cruz Practical Reproducibility by Managing Experiments Like Software Lukas Maly Technical University of Ostrava P03: BEM4I: A Massively Parallel Boundary Element Solver Vani Mandava Microsoft Introduction - MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers Cloud Computing for Science and Engineering Keynote: Cloud based systems and challenges for data rich research workloads MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers Andreas Mang University of Texas A Framework for Scalable Biophysics-Based Image Analysis Pavlos Maniotis Aristotle University of Thessaloniki Computing Architectures Exploiting Optical Interconnect and Optical Memory Technologies Filippo Mantovani Barcelona Supercomputing Center The ARM User Experience: Testbeds and Deployment at HPC Centers P71: Is ARM Software Ecosystem Ready for HPC? Joseph Manzano Pacific Northwest National Laboratory Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Aniruddha Marathe Lawrence Livermore National Laboratory ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Performance Modeling under Resource Constraints Using Deep Transfer Learning Martin Margala University of Massachusetts, Lowell RE-HASE: Regular-Expressions Hardware Synthesis Engine George S. 
Markomanolis King Abdullah University of Science and Technology Getting Started with the Burst Buffer: Using DataWarp Technology Andres Marquez Pacific Northwest National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Thomas Marrinan University of St. Thomas A Path from Serial Execution to Hybrid Parallelization for Learning HPC Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout David Martin Argonne National Laboratory HPC Impact Showcase: Defense Systems Steven Martin Cray Inc Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Total Cost of Ownership and HPC System Procurement PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control David Martinez Sandia National Laboratories Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Jan Martinovic Technical University of Ostrava P62: How To Do Machine Learning on Big Clusters Naoya Maruyama RIKEN P41: OpenCL-Based High-Performance 3D Stencil Computation on FPGAs Michael Mascagni Florida State University Computational Reproducibility at Exascale 2017 (CRE2017) Matt Masten Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Sergi Mateo Bellido Barcelona Supercomputing Center Mastering Tasking with OpenMP Amrita Mathuriya Intel Corporation Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations Ryo Matsumiya Tokyo Institute of Technology P05: ooc_cuDNN : A Deep Learning Library Supporting CNNs over GPU Memory Capacity Kazuya Matsumoto University of Aizu Application of a Communication-Avoiding Generalized Minimal Residual 
Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Satoshi Matsuoka Tokyo Institute of Technology 2nd International Workshop on Post Moore's Era Supercomputing (PMES) Applying Temporal Blocking with a Directive-Based Approach Blurring the Lines: High-End Computing and Data Science Energy Efficiency Gains From Software: Retrospectives and Perspectives P41: OpenCL-Based High-Performance 3D Stencil Computation on FPGAs P52: A Simulation-Based Analysis on the Configuration of Burst Buffer Devin A. Matthews University of Texas P02: Strassen's Algorithm for Tensor Contraction Greg Matthews NASA Ames Research Center PBS Pro Open Source Project Community BoF Michael Mattmiller Chief Technology Officer City of Seattle HPC Connects Plenary: The Century of the City Marta Mattoso Federal University of Rio de Janeiro Tracking of Online Parameter Fine-Tuning in Scientific Workflows Timothy Mattson Intel Corporation OpenMP Common Core: A “Hands-On” Exploration Programming Your GPU with OpenMP: A Hands-On Introduction HPC Graph Toolkits and the GraphBLAS Forum Zakhar A. Matveev Intel Corporation Performance Tuning of Scientific Codes with the Roofline Model Alexander Matz University of Heidelberg P85: GPU Mekong: Simplified Multi-GPU Programming Using Automated Partitioning Dimitri J. Mavriplis University of Wyoming P28: High-Fidelity Blade-Resolved Wind Plant Modeling Yury F. 
Maydanik Institute of Thermal Physics Ural Branch Thercon-LHP Future of the Thermal Management – Thercon-LHP Water-Free Solutions for HPC Cooling Robert Maynard Kitware Inc In Situ Summarization with VTK-m Akie Mayumi Japan Atomic Energy Agency Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Patrick McCormick Los Alamos National Laboratory OpenMPIR Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Kenton McHenry National Center for Supercomputing Applications, University of Illinois Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Lois Curfman McInnes Argonne National Laboratory Software Engineering and Reuse in Computational Science and Engineering Suzanne McIntosh New York University Second Annual Meeting of the SIGHPC - Big Data Chapter Simon McIntosh-Smith University of Bristol A Survey of Application Memory Usage on a National Supercomputer: An Analysis of Memory Requirements on ARCHER The ARM Software Ecosystem: Are We There Yet? Programming Your GPU with OpenMP: A Hands-On Introduction The ARM User Experience: Testbeds and Deployment at HPC Centers P69: Portable Methods for Measuring Cache Hierarchy Performance P96: Correcting Detectable Uncorrectable Errors in Memory Robert McLay University of Texas Texas Advanced Computing Center, University of Texas Tracking and Analyzing Job-level Activity Using Open XDMoD, XALT and OGRT Matt McLean University of Michigan The ARM Software Ecosystem: Are We There Yet? Kim McMahon McMahon Consulting Introduction - Women in HPC: Diversifying the HPC Community Stephen McNally Oak Ridge National Laboratory Characterizing Faults, Errors, and Failures in Extreme-Scale Systems David Meadows Stulz Air Technology Chip-to-Atmosphere: Providing Safe and Effective Cooling for High-Density, High-Performance Data Center Environments Robert L. 
Meakin US Department of Defense HPC Modernization Program Accelerating Defense Innovation of Military Aircraft with Computational Prototypes and High Performance Computing Miriam Mehl University of Stuttgart A Framework for Scalable Biophysics-Based Image Analysis Maryam Mehri Dehnavi Rutgers University Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis Susan Mehringer Cornell University Fourth SC Workshop on Best Practices for HPC Training Lars Mejsner Grundfos Benefits of Having Sensors in Your Water Cooled HPC Mario Melara National Energy Research Scientific Computing Center Managing HPC Software Complexity with Spack Mads Melchiors Grundfos Benefits of Having Sensors in Your Water Cooled HPC Nathaniel Mendoza Texas Advanced Computing Center, University of Texas Securing HPC: Development of a Low Cost, Open Source, Multi-Factor Authentication Infrastructure Harshitha Menon Lawrence Livermore National Laboratory Verifying the Floating-Point Computation Equivalence of Manually and Automatically Differentiated Code Integrating OpenMP into the Charm++ Programming Model P80: Adaptive Loop Scheduling with Charm++ to Improve Performance of Scientific Applications Michal Merta Technical University of Ostrava P03: BEM4I: A Massively Parallel Boundary Element Solver Bronson Messer Oak Ridge National Laboratory Application Porting and Optimization on GPU-Accelerated POWER Architectures P26: Optimizing Gravity and Nuclear Physics in FLASH for Exascale Paul Messina Argonne National Laboratory The U.S. D.O.E. 
Exascale Computing Project – Goals and Challenges Peter Messmer Nvidia Corporation Interactivity in Supercomputing Martin Meuer ISC Events TOP500 Supercomputers TOP500 - Past, Present, Future Lauren Michael University of Wisconsin Software Engineers: Careers in Research Scott Michael Indiana University Student Résumé Workshop Marek Michalewicz University of Warsaw Supercomputing in the Shadow of Giants: Perspectives and Insights from Supercomputing Leaders Outside the “Big 5” Regions and Organizations Martial Michel Data Machines Corp OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure George Michelogiannakis Lawrence Berkeley National Laboratory Post Moore Supercomputing PARADISE: A ToolFlow to Model Emerging Technologies for the Post-CMOS Era in HPC Samuel P. Midkiff Purdue University Massively Parallel 3D Image Reconstruction Ethan L. Miller University of California, Santa Cruz CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning P65: CAPES: Unsupervised System Performance Tuning Using Neural Network-Based Deep Reinforcement Learning Phil Miller Charmworks Inc Charm++ and AMPI: Adaptive and Asynchronous Parallel Programming Ross Miller Oak Ridge National Laboratory GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Michelle Strout University of Arizona Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis Jeff Milrod BittWare Inc Cooling Hot FPGAs: A Thermals First Approach Daniel J. Milroy National Center for Atmospheric Research Quality Assurance and Error Identification for the Community Earth System Model Misun Min Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Vladimir Mironov Lomonosov Moscow State University An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Alok Mishra Stony Brook University Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading Satyajayant Misra New Mexico State University A Scalable Analytical Memory Model for CPU Performance Prediction Jerome Mitchell Indiana University Teaching, Learning and Collaborating through Cloud Computing Online Classes Ioannis Mitliagkas Stanford University Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Katsunori Miura Kitami Institute of Technology P61: Cloud Resource Selection Based on PLS Method for Deploying Optimal Infrastructures for Genomic Analytics Application Mathew Mix University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Audris Mockus University of Tennessee Position Paper: Experiences on Clustering High-Dimensional Data Using pbdR Mohamed Mohamed IBM P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Ali Mohammed University of Basel P74: A Methodology for Bridging the Native and Simulated Executions of Parallel Applications Ayat Mohammed Texas Advanced Computing Center, University of Texas Physical Signatures of Cancer Metastasis Kathryn Mohror Lawrence Livermore National Laboratory Optimizing MPI Simon Moll Saarland University PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Md Atiqul Mollah Florida State University Modeling UGAL on the Dragonfly Topology A Comparative Study of SDN and Adaptive Routing on Dragonfly Networks Modeling and Comparison of Large-Scale Interconnect Designs Shintaro 
Momose NEC Corporation Project Aurora – Unveiling NEC’s Brand New Vector Supercomputer Jose Monsalve Diaz University of Delaware OpenMP 4.5 Validation and Verification Suite Raffaele Montella Parthenope University of Naples Processing of Crowd-Sourced Data from an Internet of Floating Things David Montoya Los Alamos National Laboratory How To Analyze the Performance of Parallel Codes 101 State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Shirley V. Moore Oak Ridge National Laboratory Porting a GAMESS Computational Chemistry Kernel to FPGAs Kenneth Moreland Sandia National Laboratories Large Scale Visualization with ParaView Yoshiyuki Morie RIKEN P48: Prototyping of Offloaded Persistent Broadcast on Tofu2 Interconnect Vitali Morozov Argonne National Laboratory Run-to-Run Variability on Xeon Phi Based Cray XC Systems PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Karla Morris Sandia National Laboratories Introduction - PAW 2017: The 2nd Annual PGAS Applications Workshop PAW 2017: The 2nd Annual PGAS Applications Workshop William Moses Massachusetts Institute of Technology OpenMPIR Alexander Moskovsky RSC Technologies An Efficient MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation of Intel Xeon Phi Processor P30: MPI/OpenMP Parallelization of the Hartree-Fock Method for the Second Generation Intel Xeon Phi Nicholas Moss Los Alamos National Laboratory P63: FleCSPH: a Parallel and Distributed Smoothed Particle Hydrodynamics Framework Based on FleCSI Misbah Mubarak Argonne National Laboratory Introduction - Women in HPC: Diversifying the HPC Community Career Panel Discussion: Hints and Tips to Progress Your Career How to Find the Help You Need – Identifying Mentors and Those Who Can Help You in Your Career Early Career Lightning Talks Virtual Poster Networking and Mixer Predicting the Performance Impact of Different Fat-Tree Configurations Modeling and 
Simulation of Communication in HPC Systems Gihan Mudalige University of Warwick Comparison of Parallelization Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs Beyond 16GB: Out-of-Core Stencil Computations P01: Cache-Blocking Tiling of Large Stencil Codes at Runtime Ananya Muddukrishna Norwegian University of Science and Technology Toward Aggregated Grain Graphs Frank Mueller North Carolina State University P89: Desh: Deep Learning for HPC System Health Resilience Michel Mueller Tokyo Institute of Technology Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model Benson Muite University of Tartu A Comparison of Distributed Memory Fast Fourier Transform (FFT) Library Packages Yvo Mulder Delft University of Technology Adopting OpenCAPI for High Bandwidth Database Accelerators Julia Mullen Massachusetts Institute of Technology Fourth SC Workshop on Best Practices for HPC Training From Outreach to Education to Researcher - Innovative Ways of Expanding the HPC Community Masaharu Munetomo Hokkaido University P61: Cloud Resource Selection Based on PLS Method for Deploying Optimal Infrastructures for Genomic Analytics Application Edward A. Munsell University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Todd Munson Argonne National Laboratory Scalable In Situ Analysis of Molecular Dynamics Simulations Hitoshi Murai RIKEN Runtime Correctness Checking for Emerging Programming Paradigms Preliminary Performance Evaluation of Coarray-based Implementation of Fiber Miniapp Suite Using XcalableMP PGAS Language Kohei Murotani Railway Technical Research Institute P22: Numerical Simulation of Snow Accretion by Airflow Simulator and Particle Simulator Philip Murphy Intel Corporation Omni-Path User Group (OPUG) Meeting Richard Murphy Micron Technology Inc 15th Graph500 List Margaret E. Myers University of Texas Lowering Barriers into HPC through Open Education Matthias S. 
Müller RWTH Aachen University Runtime Correctness Checking for Emerging Programming Paradigms
N
Jarek Nabrzyski University of Notre Dame Promoting Scientific Workflows Jaroslaw Nabrzyski University of Notre Dame Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Yasodhadevi Nachimuthu Portland State University Experiencing HPC for Undergraduates: Graduate Student Perspective Ramkumar Nagappan Intel Corporation Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Masato Naito Sumitomo Rubber Industries Ltd Development of High Performance Rubber Materials for Tires Using K Computer Koji Nakade Railway Technical Research Institute P22: Numerical Simulation of Snow Accretion by Airflow Simulator and Particle Simulator Kengo Nakajima University of Tokyo Keynote - Application Development Framework for Manycore Architectures on Post-Peta/Exascale Systems Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC Software Engineering and Reuse in Computational Science and Engineering P14: Robust SA-AMG Solver by Extraction of Near-Kernel Vectors ACM Student Research Competition: Presentations by Semi-Finalists Poster Reception ACM Student Research Competition Masahiro Nakao RIKEN Preliminary Performance Evaluation of Coarray-based Implementation of Fiber Miniapp Suite Using XcalableMP PGAS Language Motoki Nakata National Institute for Fusion Science P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Hai Ah Nam Los Alamos National Laboratory Usability, Scalability and Productivity on Many-Core Processors: Intel Xeon Phi Kumudha Narasimhan Indian Institute of Science Optimizing Geometric Multigrid Method Computation Using a DSL Approach Badri Narayanan Argonne National Laboratory Visualizing Silicene Growth Through 
Island Migration and Coalescence Revathi Narayanan Micron Technology Inc The Silver Lining of the Cloud is the EDGE Rupesh Nasre Indian Institutes of Technology, Five-minute presentations by young researchers from around the world - part 2 HPC Initiatives in India Thomas Naughton Oak Ridge National Laboratory P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Maxim Naumov Nvidia Corporation Parallel Jaccard and Related Graph Clustering Techniques Parallel Depth-First Search for Directed Acyclic Graphs Philippe Navaux Federal University of Rio Grande do Sul Experimental and Analytical Study of Xeon Phi Reliability Americas HPC Collaboration Mohammadamin Nazirzadeh University of California, Davis P50: Energy-Efficient and Scalable Bio-Inspired Nanophotonic Computing P49: Toward Exascale HPC Systems: Exploiting Advances in High Bandwidth Memory (HBM2) through Scalable All-to-All Optical Interconnect Architectures Aravind Neelakantan University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Henry Neeman University of Oklahoma Fourth SC Workshop on Best Practices for HPC Training Chris J. Newburn Nvidia Corporation The ARM Software Ecosystem: Are We There Yet? Esmond G. Ng Lawrence Berkeley National Laboratory Plenary Invited Talk Bao Nguyen Washington State University, Vancouver Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory P25: Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory Hoang Nguyen University of Queensland Five-minute presentations by young researchers from around the world - part 1 Eric J. 
Nielsen NASA Langley Research Center P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Dimitrios Nikolopoulos Queen's University Belfast Energy Efficient Supercomputing (E2SC) REFINE: Realistic Fault Injection via Compiler-Based Instrumentation for Accuracy, Portability and Speed P11: Energy-Efficient Transprecision Techniques for Iterative Refinement Teodor Nikolov Marie Skłodowska Curie Initial Training Networks Five-minute presentations by young researchers from around the world - part 2 Daisuke Nishiura Japan Agency for Marine-Earth Science and Technology P21: The First Real-Scale DEM Simulation of a Sandbox Experiment Using 2.4 Billion Particles Bill Nitzberg Altair Engineering PBS Pro Open Source Project Community BoF Asare Nkansah University of Kentucky A Path from Serial Execution to Hybrid Parallelization for Learning HPC Kelly Nolan Talent Strategy Self Branding and Advocacy: How to Get Known in Your Organization and Push Your Ideas Forward Jean-Phillippe Nomine French Alternative Energies and Atomic Energy Commission French HPC Ecosystem and Strategy and the Role of CEA Jean-Philippe Nominé European Technology Platform for High Performance Computing French Alternative Energies and Atomic Energy Commission European Exascale Projects and Their Global Contributions Naoya Nomura University of Tokyo P14: Robust SA-AMG Solver by Extraction of Near-Kernel Vectors Michael L. 
Norman San Diego Supercomputer Center University of California, San Diego First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Boyana Norris University of Oregon Compilation Techniques Anastasiia Novikova University of Hamburg P15: Toward Decoupling the Selection of Compression Algorithms from Quality Constraints Lucy Nowell US Department of Energy VPA Keynote: Visual Performance Analysis for Extremely Heterogeneous Systems Small Business and the Exascale Computing Project Masanori Nunami National Institute for Fusion Science P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing
O
Kevin O'Brien IBM Implementing Implicit OpenMP Data Sharing on GPUs Mike O'Connor Nvidia Corporation Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs Patrick O'Leary Kitware Inc Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization In Situ Summarization with VTK-m ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Brian W. O'Shea Michigan State University First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Steve Oberlin Nvidia Corporation How Serious Are We About the Convergence Between HPC and Big Data? Michael Obersteiner Technical University Munich A Highly Scalable, Algorithm-Based Fault-Tolerant Solver for Gyrokinetic Plasma Simulations Sergey Oblomov Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Lena Oden Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Seda Ogrenci-Memik Northwestern University P46: Understanding How OpenCL Parameters Impact on Off-Chip Memory Performance of FPGA Platforms Martin Ohlerich Leibniz Supercomputing Centre P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II Daniel Oliveira Federal University of Rio Grande do Sul Experimental and Analytical Study of Xeon Phi Reliability Stephen Oliver Sandia National Laboratories PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Stephen Olivier Sandia National Laboratories OpenMPIR Luke Olson University of Illinois P70: FFT, FMM, and Multigrid on the Road to Exascale: Performance Challenges and Opportunities Hensley Omorodion University of Benin Special Interest Group on HPC in Resource Constrained Environments (SIGHPC-RCE) Theodore Omtzigt Stillwater Supercomputing Inc Posit Research Posit Math Unit (PMU) – A New Approach Toward Exascale Computing Keiji Onishi RIKEN P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer Takatsugu Ono Kyushu University P78: Performance Evaluation of Graph500 Considering CPU-DRAM Power Shifting Tomoya Ono University of Tsukuba Efficient and Scalable Calculation of Complex Band Structure Using Sakurai-Sugiura Method Gopalan Oppiliappan Intel Corporation High Performance Computing Education in US Data Science Sarp Oral Oak Ridge National Laboratory GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Lustre Community BoF: Lustre Deployments for the Next 5 Years Jason Orender Old Dominion University P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Marc S. 
Orr University of Wisconsin Gravel: Fine-Grain GPU-Initiated Network Messages Sergio Ortega University of Malaga Parallware Trainer: Interactive Tool for Experiential Learning of Parallel Programming Using OpenMP and OpenACC Samuel Oshin Intel Corporation Run-to-Run Variability on Xeon Phi Based Cray XC Systems Mark Oskin Advanced Micro Devices Inc University of Washington Gravel: Fine-Grain GPU-Initiated Network Messages Paul Osmialowski ARM Ltd How The Flang Frontend Works - Introduction to the Interior of the Open-Source Fortran Frontend for LLVM Marcin Ostasz European Technology Platform for High Performance Computing Barcelona Supercomputing Center European Exascale Projects and Their Global Contributions Matthew Otten Cornell University The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Kaiming Ouyang University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform Kalin Ovtcharov Microsoft Accelerating Deep Neural Networks at Datacenter Scale with the BrainWave Architecture
P
Hans Pabst Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Nikhil Padmanabhan Yale University Cosmological Particle-Mesh Simulations in Chapel Glenn Page SustainaMetrix Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences Scott Pakin Los Alamos National Laboratory Modeling UGAL on the Dragonfly Topology Brian Pan H3 Platform Inc Towards a Composable Computer System Dhabaleswar Panda Ohio State University ESPM2'17: Opening Remarks ESPM2'17: Closing Remarks An In-Depth Performance Characterization of CPU- and GPU-Based DNN Training on Modern Architectures Scalable Reduction Collectives with Data Partitioning-Based Multi-Leader Design InfiniBand, Omni-Path, and High-Speed Ethernet: Advanced Features, Challenges in Designing, HEC Systems 
and Usage InfiniBand, Omni-Path, and High-Speed Ethernet for Dummies Big Data Meets HPC: Exploiting HPC Technologies for Accelerating Big Data Processing and Management Building Efficient Clouds for HPC, Big Data, and Deep Learning Middleware and Applications Accelerating Big Data Processing and Machine/Deep Learning Middleware on Modern HPC Clusters ESPM2 2017: Third International Workshop on Extreme Scale Programming Models and Middleware Jean-Pierre Panziera European Technology Platform for High Performance Computing Atos European Exascale Projects and Their Global Contributions Thomas Papatheodore Oak Ridge National Laboratory P26: Optimizing Gravity and Nuclear Physics in FLASH for Exascale Michael E. Papka Argonne National Laboratory Flexible In Situ Visualization of LAMMPS Simulations A Path from Serial Execution to Hybrid Parallelization for Learning HPC Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout Scalable In Situ Analysis of Molecular Dynamics Simulations Manish Parashar Rutgers University Extreme Scale Data Management for In-Situ Scientific Workflows Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure Experiencing HPC for Undergraduates: Introduction to HPC Research Devangi N. 
Parikh University of Texas Lowering Barriers into HPC through Open Education Chanyoung Park University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Junghyun Park Korea Institute of Science and Technology Information Visualization of Decision-Making Support (DMS) Information for Responding to a Typhoon-Induced Disaster Kyongseok Park Korea Institute of Science and Technology Information P51: TuPiX-Flow: Workflow-Based Large-Scale Scientific Data Analysis System Scott Parker Argonne National Laboratory Run-to-Run Variability on Xeon Phi Based Cray XC Systems Alfredo Parra Hinojosa Technical University Munich A Highly Scalable, Algorithm-Based Fault-Tolerant Solver for Gyrokinetic Plasma Simulations Mark Parsons University of Edinburgh HPC Impact Showcase: Energy and Climate Carlo Pascoe University of Florida A FPGA-Pipelined Approach for Accelerated Discrete-Event Simulation of HPC Systems Valerio Pascucci University of Utah Flexible In Situ Visualization of LAMMPS Simulations Igor Pasichnyk IBM P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II John Patchett Los Alamos National Laboratory Large Scale Visualization with ParaView Tirthak Patel Northeastern University Failures in Large Scale Systems: Long-Term Measurement, Analysis, and Implications Onkar Patil North Carolina State University A28: Exploring Use Cases for Non-Volatile Memories in Support of HPC Resilience Tapasya Patki Lawrence Livermore National Laboratory P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Abani K. Patra University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Karthik Pattabiraman University of British Columbia Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Michael Patterson Intel Corporation Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Robert M. 
Patterson National Center for Supercomputing Applications, University of Illinois Milky Way Analogue Isolated Disk Galaxy First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Robert M. Patton Oak Ridge National Laboratory Introduction - Machine Learning in HPC Environments Md Mostofa Patwary Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Sri Raj Paul Rice University Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages Robert S. Pavel Los Alamos National Laboratory P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Grzegorz Pawelczak University of Bristol P96: Correcting Detectable Uncorrectable Errors in Memory David Pearah HDF Group The HDF5 Dataverse Roger Pearce Lawrence Livermore National Laboratory Toward Scalable Parallel Training of Deep Neural Networks Carl Pearson University of Illinois P16: Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions Kevin Pedretti Sandia National Laboratories State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) P90: Global Survey of Energy and Power-Aware Job Scheduling and Resource Management in Supercomputing Centers Jim Peek Missing Link Electronics Heterogeneous Multi-Processing in Software-Defined Cloud Storage Nodes Paul Peltz Jr Los Alamos National Laboratory HPC Systems Professionals Workshop Zong Peng Indiana University Reliable Access to Massive Restricted Texts: Experience-Based Evaluation Swann Perarneau Argonne National Laboratory Experiencing HPC for Undergraduates: Careers in HPC Guilherme Peretti-Pezzi Swiss National Supercomputing Centre Regression Testing and Monitoring Tools Olga Perevalova University of Hamburg P57: Adaptive Tier Selection for NetCDF and HDF5 Danny Perez Los Alamos National Laboratory Gaining 
Insights into the Properties of Materials Using Atomistic Simulations on Large-Scale HPC Platforms P20: Facilitating the Scalability of ParSplice for Exascale Testbeds David Perez-Suarez University College London Software Engineers: Careers in Research Chris Persson University West Sweden P44: Increasing Throughput of Multiprogram HPC Workloads: Evaluating a SMT Co-Scheduling Approach Bradley Peterson University of Utah Scientific Computing and Imaging Institute Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs Serge Petiton University of Lille Maison de la Simulation Parallel Jaccard and Related Graph Clustering Techniques Runtime Correctness Checking for Emerging Programming Paradigms Fabrizio Petrini Intel Corporation HPC Graph Toolkits and the GraphBLAS Forum Antonio J. Peña Barcelona Supercomputing Center, Polytechnic University of Catalonia GPUs and Communication David Pfander University of Stuttgart P77: AutoTuneTMP: Auto Tuning in C++ With Runtime Template Metaprogramming Dirk Pflüger University of Stuttgart A Highly Scalable, Algorithm-Based Fault-Tolerant Solver for Gyrokinetic Plasma Simulations P77: AutoTuneTMP: Auto Tuning in C++ With Runtime Template Metaprogramming Franz-Josef Pfreundt Fraunhofer Institute for Industrial Mathematics TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization Vinanti Phadke Hewlett Packard Enterprise PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Anusha Phadnis Dhirubhai Ambani Institute of Information and Communication Technology P27: Parallelization of the Particle-In-Cell Monte Carlo Collision (PIC-MCC) Algorithm for Plasma Simulation on Intel MIC Xeon Phi Architecture The Anh Pham French Institute for Research in Computer Science and Automation (INRIA) ENS Rennes Verifying MPI Applications with SimGridMC Cynthia A. 
Phillips Sandia National Laboratories Introduction - Workshop on Education for High Performance Computing (EduHPC) Everett Phillips Nvidia Corporation A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Laercio Pilla Federal University of Santa Catarina Experimental and Analytical Study of Xeon Phi Reliability Sergio Pino Gallardo University of Delaware OpenMP 4.5 Validation and Verification Suite Beth Plale Indiana University Reliable Access to Massive Restricted Texts: Experience-Based Evaluation Dirk Pleiter Forschungszentrum Juelich The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Thoughts on the Path Toward Exascale from a JSC Perspective Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Application Porting and Optimization on GPU-Accelerated POWER Architectures Steve Plimpton Sandia National Laboratories Fernbach Award Presentation: Particles, HPC, and the Ukulele Syndrome Norbert Podhorszki Oak Ridge National Laboratory Data Analysis of Earth System Simulation within an In Situ Infrastructure Artur Podobas Tokyo Institute of Technology P41: OpenCL-Based High-Performance 3D Stencil Computation on FPGAs James Pogge Tennessee Technological University P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Martin Pokorny National Radio Astronomy Observatory realfast@VLA Jorda Polo Barcelona Supercomputing Center Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments Artem Polyakov Mellanox Technologies Charting the PMIx Roadmap Duncan Poole Nvidia Corporation OpenACC API User Experience, Vendor Reaction, Relevance, and Roadmap Steve Poole Los Alamos National Laboratory OpenSHMEM in the Era of Exascale Experiencing HPC for Undergraduates: Careers in HPC Swaroop Pophale Oak Ridge National Laboratory OpenMP 4.5 Validation and Verification Suite Vasileios Porpodas Intel 
Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Allan Porterfield University of North Carolina QUARC: An Optimized DSL Framework Using LLVM Douglass E. Post US Department of Defense HPC Modernization Program Accelerating Innovation of Defense Systems with Computational Prototypes and High Performance Computing Alex Pothen Purdue University HPC Graph Toolkits and the GraphBLAS Forum Courtney Powell Hokkaido University P61: Cloud Resource Selection Based on PLS Method for Deploying Optimal Infrastructures for Genomic Analytics Application Michael M. Pozulp Lawrence Livermore National Laboratory P79: Porting the Opacity Client Library to a CPU-GPU Cluster Using OpenMP 4.5 Mr Prabhat Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Sushil K. Prasad Georgia State University Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences James Price University of Bristol P69: Portable Methods for Measuring Cache Hierarchy Performance Reid Priedhorsky Los Alamos National Laboratory Charliecloud: Unprivileged Containers for User-Defined Software Stacks in HPC Containers in HPC Howard Pritchard Los Alamos National Laboratory Graph500 on OpenSHMEM: Using a Practical Survey of Past Work to Motivate Novel Algorithmic Developments W. 
Cyrus Proctor Texas Advanced Computing Center, University of Texas Securing HPC: Development of a Low Cost, Open Source, Multi-Factor Authentication Infrastructure Roberto Proietti University of California, Davis Silicon Photonic LIONS: All-to-All Interconnects for Energy-Efficient, Scalable, and Modular HPC Systems P50: Energy-Efficient and Scalable Bio-Inspired Nanophotonic Computing P49: Toward Exascale HPC Systems: Exploiting Advances in High Bandwidth Memory (HBM2) through Scalable All-to-All Optical Interconnect Architectures Andrea Prosperetti University of Houston Vistas in Advanced Computing Joachim Protze RWTH Aachen University Runtime Correctness Checking for Emerging Programming Paradigms Spencer R. Pruitt Worcester Polytechnic Institute P32: Exploring the Performance of Electron Correlation Method Implementations on Kove XPDs David Pugmire Oak Ridge National Laboratory Scalable HPC Visualization and Data Analysis Using VisIt Shweta Purawat San Diego Supercomputer Center A Machine Learning Approach for Modular Workflow Performance Prediction Satish Puri Marquette University P19: MPI-GIS: An MPI System for Big Spatial Data Milos Puzovic Hartree Centre State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Philippe P. 
Pébaÿ Sandia National Laboratories A Novel Shard-Based Approach for Asynchronous Many-Task Models for In Situ Analysis Lean Visualization of Large Scale Tree-Based AMR Meshes
Q
Depei Qian Beihang University Sun Yat-Sen University China’s New HPC Key Project Yingjin Qian DataDirect Networks A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Yang Qiao Delft University of Technology Adopting OpenCAPI for High Bandwidth Database Accelerators Zhi Qiao University of North Texas P55: Incorporating Proactive Data Rescue into ZFS Disk Recovery for Enhanced Storage Reliability Judy Qiu Indiana University Teaching, Learning and Collaborating through Cloud Computing Online Classes Harp-DAAL: A Next Generation Platform for High Performance Machine Learning on HPC-Cloud Irene Qualters National Science Foundation National Strategic Computing Initiative Update Heather Quinn Los Alamos National Laboratory Experimental and Analytical Study of Xeon Phi Reliability Martin Quinson ENS Rennes Verifying MPI Applications with SimGridMC Enrique S. Quintana-Orti Jaume I University Flexible Batched Sparse Matrix-Vector Product on GPUs
R
Carolyn Raab Corsa Technology Protecting against Hyper Scale Network Attacks with Bump-in-the-Wire 100G filtering Evan Racah Lawrence Berkeley National Laboratory Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Ken Raffenetti Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 MPICH: A High-Performance Open-Source MPI Implementation Bruno Raffin French Institute for Research in Computer Science and Automation (INRIA) Melissa: Large Scale In Transit Global Sensitivity Analysis Avoiding Intermediate Files Padma Raghavan Vanderbilt University Invited Talks 2 Md Shafayat Rahman Florida State University Modeling UGAL on the Dragonfly Topology Ioan Raicu Illinois Institute of Technology, Argonne National Laboratory Software for HPC Facilities Swapna Raj Intel Corporation LESS: Loop Nest Execution Strategies for Spatial Architectures Sivasankaran Rajamanickam Sandia National Laboratories Designing Vector-Friendly Compact BLAS and LAPACK Kernels Batched, Reproducible, and Reduced Precision BLAS Espen Birger Raknes Aker BP ASA Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Vinay B. Ramakrishnaiah University of Wyoming P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Karthik Raman Intel Corporation P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems J. 
Ramanujam Louisiana State University WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for High-Performance Computing HPX Smart Executors Mouad Ramil National School of Bridges and Roads - ParisTech P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Tim Randles Los Alamos National Laboratory Charliecloud: Unprivileged Containers for User-Defined Software Stacks in HPC OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Containers in HPC Arvind Rao San Diego Supercomputer Center A Machine Learning Approach for Modular Workflow Performance Prediction Gil Rapaport Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Georg Rath Lawrence Berkeley National Laboratory Tracking and Analyzing Job-level Activity Using Open XDMoD, XALT and OGRT Thilina Rathnayake University of Illinois Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Tyler S. Rau Los Alamos National Laboratory P54: Investigating Hardware Offloading for Reed-Solomon Encoding Archana Ravindar IBM Application Porting and Optimization on GPU-Accelerated POWER Architectures Navamita Ray Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Pablo Reble Intel Corporation Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks Paolo Rech Federal University of Rio Grande do Sul Analyzing the Criticality of Transient Faults-Induced SDCs on GPU Applications Experimental and Analytical Study of Xeon Phi Reliability Daniel Reed University of Iowa Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Energy Efficiency Gains From Software: Retrospectives and Perspectives Istvan Zoltan Reguly Pazmany Peter Catholic University Comparison of Parallelization Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs Beyond 16GB: Out-of-Core Stencil Computations 
P01: Cache-Blocking Tiling of Large Stencil Codes at Runtime James Reinders James Reinders Consulting LLC Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks Steven K. Reinhardt Microsoft Gravel: Fine-Grain GPU-Initiated Network Messages GPU Triggered Networking for Intra-Kernel Communications Nico Reissmann Norwegian University of Science and Technology Toward Aggregated Grain Graphs Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Severin Reiz Technical University Munich Geometry-Oblivious FMM for Compressing Dense SPD Matrices Luc Renambot University of Illinois, Chicago SAGE2 9th Annual International SC BOF: Scalable Amplified Group Environment for Global Collaboration Arnaud Renard University of Reims Champagne-Ardenne P64: romeoLAB : HPC Training Platform on HPC facility Vasudevan Rengasamy Pennsylvania State University Optimizing Word2Vec Performance on Multicore Systems Sebastian Rettenberger Technical University Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Alejandro Ribes Electricity of France (EDF) Keynote: Computing Ubiquitous Statistics: Computational Challenges Melissa: Large Scale In Transit Global Sensitivity Analysis Avoiding Intermediate Files Michael Rice Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Morris Riedel Research Center Juelich Supporting Software Engineering Practices in the Development of Data-Intensive HPC Applications with the JuML Framework Lorna Rivera Georgia Institute of Technology Why Subtle Bias is Often Worse than Blatant Discrimination Panel Discussion: Diversifying the HPC workforce Introduction - Women in HPC: Diversifying the HPC Community From Outreach to Education to Researcher - Innovative Ways of Expanding the HPC Community Recruitment: How to Build Diverse Teams Silvio Rizzi Argonne National Laboratory Flexible In Situ Visualization of 
LAMMPS Simulations Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout In Situ Analysis and Visualization with SENSEI Yves Robert French Institute for Research in Computer Science and Automation (INRIA) Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis Budget-Aware Scheduling Algorithms for Scientific Workflows on IaaS Cloud Platforms Fault-Tolerance for High Performance and Distributed Computing: Theory and Practice James Robnett National Radio Astronomy Observatory realfast@VLA Michael Robson University of Illinois Migratable Objects and Task-Based Parallel Programming with Charm++ Ivan Rodero Rutgers University Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure James Rogers Oak Ridge National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Total Cost of Ownership and HPC System Procurement Mike L. Rogers Los Alamos National Laboratory P07: PORTAGE - A Flexible Conservative Remapping Framework for Modern HPC Architectures Georgios Rokos IBM Implementing Implicit OpenMP Data Sharing on GPUs Joshua Romero Nvidia Corporation A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Todd Rosedahl IBM PowerAPI, GEOPM and Redfish: Open Interfaces for Power/Energy Measurement and Control Arnold L. 
Rosenberg Northeastern University Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Rob Ross Argonne National Laboratory Parallel I/O in Practice Barry Rountree Lawrence Livermore National Laboratory Performance Modeling under Resource Constraints Using Deep Transfer Learning Power-Aware High Performance Computing: Challenges and Opportunities for Application and System Developers Damian Rouson Sourcery Institute PGAS Applications Workshop Panel Performance Portability of an Intermediate-Complexity Atmospheric Research Model in Coarray Fortran Cindy Rubio González University of California, Davis Introduction - 1st International Workshop on Software Correctness for HPC Applications (Correctness 2017) Cindy Rubio-Gonzalez University of California, Davis Correctness 2017: First International Workshop on Software Correctness for HPC Applications Andy Rudoff Intel Corporation Invited Talk: Persistent Memory: The Value to HPC and the Challenges Gregory Ruetsch Nvidia Corporation A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Daniel Ruiz Barcelona Supercomputing Center Five-minute presentations by young researchers from around the world - part 2 P71: Is ARM Software Ecosystem Ready for HPC? Michael Rupen National Research Council of Canada Dominion Radio Astrophysical Observatory realfast@VLA Lukas Rupprecht IBM P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Songhui Ryu Purdue University A24: Comparison of Machine Learning Algorithms and Their Ensembles for Botnet Detection
S
Amit Sabne Microsoft Massively Parallel 3D Image Reconstruction Vipin Sachdeva Silicon Therapeutics OpenCL for FPGAs/HPC: Case Study in 3D FFT P. 
Sadayappan Ohio State University WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for High-Performance Computing Subhash Saini NASA Ames Research Center ACM Gordon Bell Finalists Hideki Saito Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Putt Sakdhnagool Purdue University Massively Parallel 3D Image Reconstruction Rizos Sakellariou University of Manchester Introduction - WORKS 2017 (12th Workshop on Workflows in Support of Large-Scale Science) Tetsuya Sakurai University of Tsukuba Efficient and Scalable Calculation of Complex Band Structure Using Sakurai-Sugiura Method Soulmaz Salehian Oakland University Evaluation of Knight Landing High Bandwidth Memory for HPC Workloads Joel Saltz Stony Brook University Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Ahmed Sanaullah Boston University OpenCL for FPGAs/HPC: Case Study in 3D FFT P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable Inference Processor for Multi-Layer Perceptrons Daniel Sanchez Massachusetts Institute of Technology Understanding Object-Level Memory Access Patterns Across the Spectrum Maria-Ribera Sancho Barcelona Supercomputing Center Fourth SC Workshop on Best Practices for HPC Training Subramanian Sankaranarayanan Argonne National Laboratory Visualizing Silicene Growth Through Island Migration and Coalescence Alexander Sannikov Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Kentaro Sano RIKEN Tohoku University Reconfigurable Computing in Exascale Nandakishore Santhi Los Alamos National Laboratory A Scalable Analytical Memory Model for CPU Performance Prediction Fernando Fernandes do Santos Federal University of Rio Grande do Sul Analyzing the Criticality of Transient Faults-Induced SDCs on GPU Applications Vivek Sarkar Rice University Graph500 on OpenSHMEM: Using a Practical Survey of Past Work to Motivate Novel Algorithmic Developments Keynote: Compiler and Runtime Challenges for Memory Centric Programming Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages OpenSHMEM in the Era of Exascale Dale Sartor Lawrence Berkeley National Laboratory Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Ruchira Sasanka Intel Corporation P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Shinsuke Satake National Institute for Fusion Science P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Nadathur Satish Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Kento Sato Lawrence Livermore National Laboratory P52: A Simulation-Based Analysis on the Configuration of Burst Buffer Mitsuhisa Sato RIKEN PGAS Applications Workshop Panel Keynote: How Does PGAS Collaborate with MPI+X? 
Preliminary Performance Evaluation of Coarray-based Implementation of Fiber Miniapp Suite Using XcalableMP PGAS Language The ARM User Experience: Testbeds and Deployment at HPC Centers Invited Talks 5 Erik Saule University of North Carolina, Charlotte Experiencing HPC for Undergraduates Orientation Marie-Christine Sawley Intel Corporation Reconfigurable Computing in Exascale Machine Learning for Big Data: Integrated, Collaborative, Multi-Technological Solutions to Multi-Objective Problems Klaudius Scheufele University of Stuttgart Experiencing HPC for Undergraduates: Graduate Student Perspective A Framework for Scalable Biophysics-Based Image Analysis John Schmidt University of Utah Scientific Computing and Imaging Institute Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs Markus Schordan Lawrence Livermore National Laboratory Verifying the Floating-Point Computation Equivalence of Manually and Automatically Differentiated Code DataRaceBench: A Benchmark Suite for Systematic Evaluation of Data Race Detection Tools Andreas Schreiber German Aerospace Center 7th Workshop on Python for High-Performance and Scientific Computing Reproducibility and Uncertainty in High Performance Computing Endric Schubert Missing Link Electronics Heterogeneous Multi-Processing in Software-Defined Cloud Storage Nodes Thomas Schulthess Swiss National Supercomputing Centre Swiss National Programs Energy Efficiency Gains From Software: Retrospectives and Perspectives Pete Schultz Los Alamos National Laboratory New Mexico Consortium P88: PetaVision Neural Simulation Toolbox on Intel KNLs Karl Schulz Intel Corporation OpenHPC Community BoF ESPM2'17: Opening Remarks ESPM2'17: Closing Remarks ACM Student Research Competition: Presentations by Semi-Finalists ESPM2 2017: Third International Workshop on Extreme Scale Programming Models and Middleware Martin Schulz Technical University Munich Experiencing HPC for Undergraduates: Introduction 
to HPC Research Power-Aware High Performance Computing: Challenges and Opportunities for Application and System Developers How To Analyze the Performance of Parallel Codes 101 The Message Passing Interface: On the Road to MPI 4.0 and Beyond Workshop on Extreme-Scale Programming Tools (ESPT) ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data REFINE: Realistic Fault Injection via Compiler-Based Instrumentation for Accuracy, Portability and Speed Robert Schöne Technical University Dresden Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Thomas Scogland Lawrence Livermore National Laboratory The Green500: Trends in Energy-Efficient Supercomputing P82: Performance Evaluation of the NVIDIA Tesla P100: Our Directive-Based Partitioning and Pipelining vs. NVIDIA’s Unified Memory J. Ray Scott Pittsburgh Supercomputing Center Omni-Path User Group (OPUG) Meeting Stephen L. Scott Tennessee Technological University P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments W. Alan Scott Sandia National Laboratories Large Scale Visualization with ParaView William Scullin Argonne National Laboratory HPC Systems Professionals Workshop 7th Workshop on Python for High-Performance and Scientific Computing Mark Seager Intel Corporation How Serious Are We About the Convergence Between HPC and Big Data? Seetharami Seelam IBM Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments Career Management Satoshi Sekiguchi National Institute of Advanced Industrial Science and Technology ABCI - AI Bridging Cloud Infrastructure for Everyone Paul Selwood Met Office, UK HPC Software: Is “Cool Stuff” Really Incompatible with Sustainability? 
Patrick Semon Brookhaven National Laboratory P34: GPU Acceleration for the Impurity Solver in GW+DMFT Packages Karthik Senthil University of Illinois P80: Adaptive Loop Scheduling with Charm++ to Improve Performance of Scientific Applications Sangmin Seo Argonne National Laboratory Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Adrian Serio Louisiana State University HPX Smart Executors Keval Shah Dhirubhai Ambani Institute of Information and Communication Technology P27: Parallelization of the Particle-In-Cell Monte Carlo Collision (PIC-MCC) Algorithm for Plasma Simulation on Intel MIC Xeon Phi Architecture Miral Shah Dhirubhai Ambani Institute of Information and Communication Technology P27: Parallelization of the Particle-In-Cell Monte Carlo Collision (PIC-MCC) Algorithm for Plasma Simulation on Intel MIC Xeon Phi Architecture Gilad Shainer Mellanox Technologies Accelerating Big Data Processing and Machine/Deep Learning Middleware on Modern HPC Clusters John Shalf Lawrence Berkeley National Laboratory Energy Efficiency Gains From Software: Retrospectives and Perspectives Experiencing HPC for Undergraduates: Introduction to HPC Research PARADISE: A ToolFlow to Model Emerging Technologies for the Post-CMOS Era in HPC Pavel Shamis ARM Ltd OpenSHMEM in the Era of Exascale Mohammadsadegh Shamsabardeh University of California, Davis P50: Energy-Efficient and Scalable Bio-Inspired Nanophotonic Computing Lalitha Shankar National Cancer Institute Medical Image Analysis and Visualization Gowtham S Michigan Technological University Fourth SC Workshop on Best Practices for HPC Training Shruti Sharan CERN P29: A Deep Learning Tool for Fast Simulation Hashim Sharif University of Illinois Developing an OpenMP Runtime for UVM-Capable GPUs Vishal Chandra Sharma University of Utah P84: PRESAGE: Selective Low Overhead Error Amplification for Easy Detection Xipeng Shen North Carolina State University Egeria: A Framework for Auto-Construction of HPC
Advising Tools through Multi-Layered Natural Language Processing Sameer Shende University of Oregon Hands-On Practical Hybrid Parallel Application Performance Engineering OpenSHMEM in the Era of Exascale P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Mark Shephard Rensselaer Polytechnic Institute Dynamic Load Balancing of Massively Parallel Unstructured Meshes Toshiyuki Shimizu Fujitsu Ltd Post-K Supercomputer with Fujitsu's Original CPU, Powered by ARM ISA Galen Shipman Los Alamos National Laboratory Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Jun Shirako Rice University Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages Mikhail Shiryaev Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Kathleen Shoga Lawrence Livermore National Laboratory ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data Fumiyoshi Shoji RIKEN Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) Mike Showerman National Center for Supercomputing Applications, University of Illinois HPC Systems Monitoring Data in Action Anton Shterenlikht University of Bristol PGAS Applications Workshop Panel Jiwu Shu Tsinghua University LocoFS: A Loosely-Coupled Metadata Service for Distributed File Systems Max Shulaker Massachusetts Institute of Technology Post Moore Supercomputing Luke Shulenburger Sandia National Laboratories Embracing a New Era of Highly Efficient and Productive Quantum Monte Carlo Simulations Hao Shyng National Tsing Hua University, Taiwan Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling Min Si Argonne National Laboratory Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Jeffrey Sica University of Michigan Containers in HPC Charles Siegel Pacific Northwest National Laboratory Evaluating On-Node GPU Interconnects for Deep Learning Workloads P89: Desh: Deep Learning for HPC System Health Resilience Stephen Siegel University of Delaware Towards Self-Verification in Finite Difference Code Generation A Verification Language for High Performance Computing P83: Contracts for Message-Passing Programs Bruno Silva Francis Crick Institute OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Hyogi Sim Virginia Tech Oak Ridge National Laboratory TagIt: An Integrated Indexing and Search Service for File Systems Scientific User Behavior and Data-Sharing Trends in a Petascale File System Nikolay A. Simakov University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Derek Simmel Pittsburgh Supercomputing Center OpenHPC Community BoF Horst Simon Lawrence Berkeley National Laboratory TOP500 - Past, Present, Future TOP500 Supercomputers Mark Sims US Department of Defense National Strategic Computing Initiative Update Alok Singh San Diego Supercomputer Center A Machine Learning Approach for Modular Workflow Performance Prediction Jasmit Singh Rogue Wave Software Approaches to Debugging Mixed-Language HPC Apps Ranvijay Singh Purdue University Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction Swati Singhal University of Maryland Adaptive Compression to Improve I/O Performance for Climate Simulations Robert Sisneros National Center for Supercomputing Applications, University of Illinois Scalable HPC Visualization and Data Analysis Using VisIt Jay Sitaraman Parallel Geometric Algorithms LLC P28: High-Fidelity Blade-Resolved Wind Plant Modeling Happy Sithole Center for High Performance Computing in South Africa Supercomputing in the Shadow of Giants: Perspectives and Insights from Supercomputing Leaders 
Outside the “Big 5” Regions and Organizations SKA: The Ultimate Big Data Project Tor Skeie Fabriscale Technologies AS Simula Research Laboratory Efficient Managing and Monitoring of InfiniBand HPC Clusters Anthony Skjellum University of Tennessee, Chattanooga P45: Campaign Storage: Erasure Coding with GPUs Workshop on Exascale MPI (ExaMPI) Dimitrios Skourtis IBM P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Elliott Slaughter Stanford University SLAC National Accelerator Laboratory Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Zachary Slepian Lawrence Berkeley National Laboratory Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies Cameron Smith Rensselaer Polytechnic Institute Dynamic Load Balancing of Massively Parallel Unstructured Meshes Lauren Smith US Department of Defense PGAS Applications Workshop Panel Tyler M. Smith University of Texas Lowering Barriers into HPC through Open Education Mikhail E. 
Smorkalov Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Addison Snell Intersect360 Research Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Edgar Solomonik University of Illinois Scaling Betweenness Centrality Using Communication-Efficient Sparse Matrix Multiplication David Solt IBM Charting the PMIx Roadmap Fengguang Song Indiana University-Purdue University Indianapolis Designing a Synchronization-Reducing Clustering Method on Manycores: Some Issues and Improvements Correcting Soft Errors Online in Fast Fourier Transform Shuaiwen Leon Song Pacific Northwest National Laboratory Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Lawrence Sorrillo Oak Ridge National Laboratory P59: Secure Enclaves: An Isolation-Centric Approach for Creating Secure High-Performance Computing Environments Mohammed Sourouri Norwegian University of Science and Technology Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures Leonel Sousa INESC-ID Modeling Large Compute Nodes with Heterogeneous Memories with the Cache-Aware Roofline Model Mark Speck Chaminade University of Honolulu High Performance Computing Education in US Data Science Filippo Spiga University of Cambridge A Performance Study of Quantum ESPRESSO's PWscf Code on Multi-Core and GPU Systems Bill Spotz Sandia National Laboratories 7th Workshop on Python for High-Performance and Scientific Computing Paul Springer RWTH Aachen University A01: GEMM-Like Tensor-Tensor Contraction (GETT) Jeffrey M. 
Squyres Cisco Systems Effective Programming Models for Deep Learning at Scale Fabric APIs - libfabric User Perspective and C++ Standardization Open MPI State of the Union XI Rahul Sridhar University of California, Irvine Lawrence Livermore National Laboratory P75: Model-Agnostic Influence Analysis for Performance Data Srinivas Sridharan Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Ryan L. Sriver University of Illinois Simulation and Visual Representation of Tropical Cyclone-Ocean Interactions Eric Stahlberg Frederick National Laboratory Computational Approaches for Cancer Impacting Cancer with HPC: Opportunities and Challenges Ann Stapleton University of North Carolina, Wilmington High Performance Computing Education in US Data Science Matt Starr Spectra Logic Corporation Spectra Logic Delivers a New Paradigm in Tape Library Deployment Craig Steffen National Center for Supercomputing Applications, University of Illinois P38: Benchmarking Parallelized File Aggregation Tools for Large Scale Data Management Damian S. Steiger ETH Zurich 0.5 Petabyte Simulation of a 45-Qubit Quantum Circuit Malgorzata Steinder IBM Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments Deryl Steinert Oak Ridge National Laboratory GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Thomas Steinke Zuse Institute Berlin Impacting Cancer with HPC: Opportunities and Challenges George Stelle Los Alamos National Laboratory OpenMPIR Thomas Sterling Indiana University OpenHPC Community BoF Rick Stevens Argonne National Laboratory Effective Programming Models for Deep Learning at Scale Blurring the Lines: High-End Computing and Data Science Impact of the DOE and NCI Partnership on Precision Oncology Adam J. 
Stewart Argonne National Laboratory University of Illinois Managing HPC Software Complexity with Spack Greg Stitt University of Florida A FPGA-Pipelined Approach for Accelerated Discrete-Event Simulation of HPC Systems Victoria Stodden University of Illinois Reproducibility and Uncertainty in High Performance Computing P91: Assessing the Availability of Source Code in Computational Physics Christopher Stone Computational Science and Engineering LLC Concurrent parallel processing on Graphics and Multicore Processors with OpenACC and OpenMP John E. Stone University of Illinois The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Using Accelerator Directives to Adapt Science Applications for State-of-the-Art HPC Architectures Patrick Storm Texas Advanced Computing Center, University of Texas Securing HPC: Development of a Low Cost, Open Source, Multi-Factor Authentication Infrastructure Shane Story Intel Corporation Designing Vector-Friendly Compact BLAS and LAPACK Kernels Eric J. Stotzer Texas Instruments Advanced OpenMP: Performance and 4.5 Features Quentin F. 
Stout University of Michigan Parallel Computing 101 Michael Strickland Intel Corporation Efficiently Accelerating HPC Workloads with FPGAs Erich Strohmaier Lawrence Berkeley National Laboratory TOP500 - Past, Present, Future The Green500: Trends in Energy-Efficient Supercomputing TOP500 Supercomputers Jeff Stuecheli IBM OpenCAPI: High Performance, Host-Agnostic, Coherent Accelerator Interface Craig Stunkel IBM Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Ernesto Su Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Simon Su US Army Research Laboratory P53: TensorViz: Visualizing the Training of Convolutional Neural Network Using ParaView Alejandro Suarez National Science Foundation The Future of NSF Advanced Cyberinfrastructure Estela Suarez Forschungszentrum Juelich Usability, Scalability and Productivity on Many-Core Processors: Intel Xeon Phi Omer Subasi Pacific Northwest National Laboratory Automatic Risk-Based Selective Redundancy for Fault-Tolerant Task-Parallel HPC Applications Hari Subramoni Ohio State University ESPM2'17: Opening Remarks ESPM2'17: Closing Remarks An In-Depth Performance Characterization of CPU- and GPU-Based DNN Training on Modern Architectures Scalable Reduction Collectives with Data Partitioning-Based Multi-Leader Design InfiniBand, Omni-Path, and High-Speed Ethernet: Advanced Features, Challenges in Designing, HEC Systems and Usage InfiniBand, Omni-Path, and High-Speed Ethernet for Dummies ESPM2 2017: Third International Workshop on Extreme Scale Programming Models and Middleware Scott Suchyta Altair Engineering OpenHPC Community BoF Joshua Suetterlein University of Delaware Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes P99: The Intersection of Big Data and HPC: Using Asynchronous Many Task Runtime Systems for HPC and Big Data Nitin 
Sukhija Slippery Rock University of Pennsylvania Fourth SC Workshop on Best Practices for HPC Training Dalal Sukkari King Abdullah University of Science and Technology Five-minute presentations by young researchers from around the world - part 2 Rangan Sukumar Cray Inc Future Trends in HPC Exploiting HPC for Big Data, Analytics and AI Michael Sullivan Nvidia Corporation Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Shinji Sumimoto Fujitsu Ltd The ARM Software Ecosystem: Are We There Yet? Huihui Sun University of Munster PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model Jimeng Sun Georgia Institute of Technology P10: HiCOO: A Hierarchical Sparse Tensor Format for Tensor Decompositions Xian-He Sun Illinois Institute of Technology Opening Remarks: MCHPC'17: Workshop on Memory Centric Programming for HPC Narayanan Sundaram Intel Corporation Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Galactos: Computing the 3-pt Anisotropic Correlation for 2 Billion Galaxies P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Nirmala Sundararajan Dell Inc OpenHPC Community BoF Hyojin Sung IBM Implementing Implicit OpenMP Data Sharing on GPUs Sayantan Sur Intel Corporation Why Is MPI So Slow? 
Analyzing the Fundamental Limits in Implementing MPI-3.1 Alan Sussman University of Maryland Adaptive Compression to Improve I/O Performance for Climate Simulations Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Kuniyasu Suzaki National Institute of Advanced Industrial Science and Technology P97: Profile Guided Kernel Optimization for Individual Container Execution on Bare-Metal Container Dave Suzuki Quantum Corporation Rook Distributed Storage System Gert Svensson KTH Royal Institute of Technology Total Cost of Ownership and HPC System Procurement Alexey Svyatkovskiy Princeton University Training Distributed Deep Recurrent Neural Networks with Mixed Precision on GPU Clusters Brent Swartz University of Minnesota P60: Managing dbGaP Data with Stratus, a Research Cloud for Protected Data Kyle Sweeney University of Notre Dame Lightweight Container Integration into Workflow Systems: A First Look at Singularity and Makeflow Thomas D. Swinburne Los Alamos National Laboratory P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Steve S. Sylvester Intel Corporation P95: GEOPM: A Scalable Open Runtime Framework for Power Management Tim Süß Johannes Gutenberg University Mainz A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System T Bassam Tabbara Quantum Corporation Rook Distributed Storage System Philip Taffet Rice University A18: Understanding the Impact of Fat-Tree Network Locality on Application Performance Masamichi Takagi RIKEN Workshop on Exascale MPI (ExaMPI) Why Is MPI So Slow?
Analyzing the Fundamental Limits in Implementing MPI-3.1 P48: Prototyping of Offloaded Persistent Broadcast on Tofu2 Interconnect Ryousei Takano National Institute of Advanced Industrial Science and Technology P97: Profile Guided Kernel Optimization for Individual Container Execution on Bare-Metal Container Nathan Tallent Pacific Northwest National Laboratory Evaluating On-Node GPU Interconnects for Deep Learning Workloads Representative Paths Analysis Hua Tan Washington State University, Vancouver Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory P25: Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory William Tang Princeton University Training Distributed Deep Recurrent Neural Networks with Mixed Precision on GPU Clusters Alan Tannenbaum Stony Brook University Medical Image Analysis and Visualization Dingwen Tao University of California, Riverside Correcting Soft Errors Online in Fast Fourier Transform P37: PaSTRI: A Novel Data Compression Algorithm for Two-Electron Integrals in Quantum Chemistry Konstantin Taranov ETH Zurich sPIN: High-Performance Streaming Processing in the Network A09: Ring: Unifying Replication and Erasure Coding to Rule Resilience in KV-Stores Vasily Tarasov IBM P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Osamu Tatebe University of Tsukuba ACM Student Research Competition: Presentations by Semi-Finalists Mahidhar Tatineni San Diego Supercomputer Center Virtualization Ecosystems – Supporting Increasingly Complex Scientific Applications Michela Taufer University of Delaware Blurring the Lines: High-End Computing and Data Science Reproducibility and Uncertainty in High Performance Computing P94: Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads Chris Taylor US Department of Defense Fabric APIs - libfabric User Perspective and C++ Standardization Tracy Teal Michigan State University HPC Carpentry - Practical, Hands-On HPC Training 
Stig Telfer StackHPC Ltd OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Keita Teranishi Sandia National Laboratories Resilient Programming Environments Christian Terboven RWTH Aachen University Runtime Correctness Checking for Emerging Programming Paradigms Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices Advanced OpenMP: Performance and 4.5 Features Mastering Tasking with OpenMP Théophile Terraz French Institute for Research in Computer Science and Automation (INRIA) Melissa: Large Scale In Transit Global Sensitivity Analysis Avoiding Intermediate Files Andy Terrel NumFOCUS Software Engineering and Reuse in Computational Science and Engineering Xavier Teruel Barcelona Supercomputing Center Mastering Tasking with OpenMP Douglas Thain University of Notre Dame P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Rajeev Thakur Argonne National Laboratory Advanced MPI Programming Jayaraman J. Thiagarajan Lawrence Livermore National Laboratory Performance Modeling under Resource Constraints Using Deep Transfer Learning P75: Model-Agnostic Influence Analysis for Performance Data Philippe Thierry Intel Corporation Performance Tuning of Scientific Codes with the Roofline Model Murray Thom D-Wave Systems Inc Post Moore Supercomputing Michael Thomas Environmental Systems Design Inc Data Center Design and Planning for HPC Folks Owen G. M.
Thomas Red Oak Consulting Essential HPC Finance Practice: Total Cost of Ownership (TCO), Internal Funding, and Cost-Recovery Models Extracting Value from HPC: Business Cases, Planning, and Investment HPC Acquisition and Commissioning Rollin Thomas Lawrence Berkeley National Laboratory 7th Workshop on Python for High-Performance and Scientific Computing David Thompson Kitware Inc In Situ Summarization with VTK-m Simon Thompson University of Birmingham OpenStack For HPC: Best Practices for Optimizing Software-Defined Infrastructure Sunil Thulasidasan Los Alamos National Laboratory A Scalable Analytical Memory Model for CPU Performance Prediction Xinmin Tian Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Philippe Tillet Harvard University Input-Aware Auto-Tuning of Compute-Bound HPC Kernels Jenett Tillotson Indiana University HPC Systems Professionals Workshop Devesh Tiwari Northeastern University Failures in Large Scale Systems: Long-Term Measurement, Analysis, and Implications GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Silent Errors in HPC Systems Stan Tomov University of Tennessee Investigating Half-Precision Arithmetic to Accelerate Dense Linear System Solvers Zhou Tong Florida State University A Comparative Study of SDN and Adaptive Routing on Dragonfly Networks Brian Toonen Argonne National Laboratory P32: Exploring the Performance of Electron Correlation Method Implementations on Kove XPDs Isaac Traxler Louisiana State University HPC Systems Professionals Workshop Kathy Traxler Louisiana State University HPC via HTTP: Portable, Scalable Computing Using App Containers and the Agave API Bradley E.
Treeby University College London P40: Running Large-Scale Ultrasound Simulations on Piz Daint with 512 Pascal GPUs Sean Treichler Stanford University Nvidia Corporation Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Lukas Troska Louisiana State University HPX Smart Executors Christian Trott Sandia National Laboratories Kokkos: Enabling Manycore Performance Portability for C++ Applications and Domain Specific Libraries/Languages Timothy Tsai Nvidia Corporation Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications Yuhsiang M. Tsai National Taiwan University A02: Accelerating the Higher Order Singular Value Decomposition Algorithm for Big Data with GPUs Ronny Tschueter Technical University Dresden An LLVM Instrumentation Plug-In for Score-P Ronny Tschüter Technical University Dresden Hands-On Practical Hybrid Parallel Application Performance Engineering P67: Measuring I/O Behavior on Upcoming Systems with NVRAM P66: Analyzing Multi-Layer I/O Behavior of HPC Applications Makoto Tsubokura Kobe University RIKEN P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer James Tuck North Carolina State University Leveraging Near Data Processing for High-Performance Checkpoint/Restart Steve Tuecke University of Chicago Automating Research Data Management with Globus Ozan Tugluk Brown University MGARD: A Multilevel Technique for Compression of Floating-Point Data Antonino Tumeo Pacific Northwest National Laboratory Introduction - IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms HPC Graph Toolkits and the GraphBLAS Forum IA^3 2017 - 7th Workshop on Irregular Applications: Architectures and Algorithms Andrew Turner University of Edinburgh A Survey of Application Memory Usage on a National Supercomputer: An Analysis of Memory Requirements on ARCHER Software Engineers: Careers in Research HPC Carpentry - Practical, Hands-On 
HPC Training U Naonori Ueda RIKEN P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Thomas Ulrich Ludwig Maximilian University of Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Jung-Ho UM Korea Institute of Science and Technology Information P51: TuPiX-Flow: Workflow-Based Large-Scale Scientific Data Analysis System Osman Unsal Barcelona Supercomputing Center Automatic Risk-Based Selective Redundancy for Fault-Tolerant Task-Parallel HPC Applications Ramakrishna Upadrasta International Institute of Information Technology, Hyderabad Implementation of a Cache Miss Calculator in LLVM/Polly Improved Loop Distribution in LLVM Using Polyhedral Dependences Carsten Uphoff Technical University Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake E. Lynn Usery US Geological Survey Common Big Data Challenges in Bio, Geo, Climate, and Social Sciences Will Usher University of Utah Flexible In Situ Visualization of LAMMPS Simulations V Karan Vahi Information Sciences Institute, University of Southern California rvGAHP – Push-Based Job Submission Using Reverse SSH Connections Ramachandran Vaidyanathan Louisiana State University Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Mateo Valero Barcelona European Initiative on HPC Sofia Vallecorsa CERN Machine Learning for Big Data: Integrated, Collaborative, Multi-Technological Solutions to Multi-Objective Problems P29: A Deep Learning Tool for Fast Simulation Geoffroy Vallee Oak Ridge National Laboratory Workshop on Exascale MPI (ExaMPI) TagIt: An Integrated Indexing and Search Service for File Systems Robert A.
van de Geijn University of Texas Lowering Barriers into HPC through Open Education P02: Strassen's Algorithm for Tensor Contraction Ruud van der Pas Oracle IA^3 Debate Advanced OpenMP: Performance and 4.5 Features Brian Van Essen Lawrence Livermore National Laboratory Introduction - Machine Learning in HPC Environments Toward Scalable Parallel Training of Deep Neural Networks Eric Van Hensbergen ARM Ltd IA^3 Debate The ARM Software Ecosystem: Are We There Yet? ARM's Road to Exascale Tom Aa IMEC P62: How To Do Machine Learning on Big Clusters Hans Vandierendonck Queen's University Belfast P11: Energy-Efficient Transprecision Techniques for Iterative Refinement Ana Lucia Varbanescu University of Amsterdam P86: HyGraph: High Performance Graph Processing on Hybrid CPU+GPUs platforms Shah Varun Cavium Inc Innovative Alternate Architectures for Exascale Computing: ThunderX2 and Beyond Vas Vasiliadis University of Chicago Automating Research Data Management with Globus Vinay Vasista Indian Institute of Science Optimizing Geometric Multigrid Method Computation Using a DSL Approach Andrei Vassiliev Scientific Concepts International Corporation Scalable Heterogeneous Programmable Reconfigurable Computing Dilip Vasudevan Lawrence Berkeley National Laboratory PARADISE: A ToolFlow to Model Emerging Technologies for the Post-CMOS Era in HPC Ranga Raju Vatsavai North Carolina State University In Situ Summarization with VTK-m Filip Vaverka Brno University of Technology P40: Running Large-Scale Ultrasound Simulations on Piz Daint with 512 Pascal GPUs Sudharshan S. 
Vazhkudai Oak Ridge National Laboratory Scientific User Behavior and Data-Sharing Trends in a Petascale File System Understanding Object-Level Memory Access Patterns Across the Spectrum GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility TagIt: An Integrated Indexing and Search Service for File Systems Malathi Veeraghavan University of Virginia Innovating the Network for Data Intensive Science (INDIS) Flavio Vella Sapienza University of Rome Scaling Betweenness Centrality Using Communication-Efficient Sparse Matrix Multiplication Accelerating Energy Games Solvers on Modern Architectures Anand Venkat Intel Corporation P31: Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture Manjunath Venkata Oak Ridge National Laboratory OpenSHMEM in the Era of Exascale Verónica Vergara Larrea Oak Ridge National Laboratory The ARM Software Ecosystem: Are We There Yet? Verónica Melesse Vergara Oak Ridge National Laboratory Careers in HPC Louis J. Vernon Los Alamos National Laboratory P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Merijn Elwin Verstraaten University of Amsterdam Speeding Up GPU Graph Processing Using Structural Graph Properties Jeffrey S. Vetter Oak Ridge National Laboratory 2nd International Workshop on Post Moore's Era Supercomputing (PMES) Porting a GAMESS Computational Chemistry Kernel to FPGAs PapyrusKV: A High-Performance Parallel Key-Value Store for Distributed NVM Architectures Post Moore Supercomputing Jerome Vienne University of Texas Advanced Manycore Programming (KNL) J. J. 
Villalobos Rutgers University Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure Brian Vinter University of Copenhagen Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Abhinav Vishnu Pacific Northwest National Laboratory P89: Desh: Deep Learning for HPC System Health Resilience Evaluating On-Node GPU Interconnects for Deep Learning Workloads Venkatram Vishwanath Argonne National Laboratory Flexible In Situ Visualization of LAMMPS Simulations Scalable In Situ Analysis of Molecular Dynamics Simulations Vladimir Voevodin Lomonosov Moscow State University The AlgoWiki Project: an Algorithmic Pillar of Exascale Computing Michael Voss Intel Corporation Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks Alysson Vrielink Nvidia Corporation Parallel Depth-First Search for Directed Acyclic Graphs Richard Vuduc Georgia Institute of Technology IA^3 Debate P10: HiCOO: A Hierarchical Sparse Tensor Format for Tensor Decompositions W Aaron Walden NASA Langley Research Center P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Edward Walker National Science Foundation The Future of NSF Advanced Cyberinfrastructure Bob Walkup IBM Leveraging NVLINK and Asynchronous Data Transfer to Scale Beyond the Memory Capacity of GPUs Wolfgang A.
Wall Technical University Munich P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II Chao Wang Oak Ridge National Laboratory Understanding Object-Level Memory Access Patterns Across the Spectrum Chundong Wang Data Storage Institute Transactional NVM Cache with High Performance and Crash Consistency Dali Wang Oak Ridge National Laboratory Data Analysis of Earth System Simulation within an In Situ Infrastructure David Wang Samsung IA^3 Debate Feiyi Wang Oak Ridge National Laboratory GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Hao Wang Virginia Tech Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels Hong Wang Stony Brook University Five-minute presentations by young researchers from around the world - part 1 Jianwu Wang University of Maryland, Baltimore County Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences Ke Wang Microsoft Introduction - MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers Lanning Wang Beijing Normal University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Wei-Hsiang Wang RIKEN P24: A Deployment of HPC Algorithm into Pre/Post-Processing for Industrial CFD on K-Computer Xiao Wang Purdue University Massively Parallel 3D Image Reconstruction Xinliang Wang Tsinghua University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Amit S. 
Warke IBM P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Tomo-Hiko Watanabe Nagoya University P17: Fully Non-Blocking Communication-Computation Overlap Using Assistant Cores toward Exascale Computing Vince Weaver University of Maine P72: New Developments for PAPI 5.6+ Gunther Weber Lawrence Berkeley National Laboratory Discussion/Open Mic Session/Closing Remarks Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Matthias Weber Technical University Dresden An LLVM Instrumentation Plug-In for Score-P P68: Continuous Clock Synchronization for Accurate Performance Measurements P66: Analyzing Multi-Layer I/O Behavior of HPC Applications Charles Weems University of Massachusetts Introduction - Workshop on Education for High Performance Computing (EduHPC) Revisions to NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing (PDC) for Undergraduate Education - Updates on the Curriculum Revision and Audience Comments Norbert Wehn University of Kaiserslautern TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization Qingsong Wei Data Storage Institute Transactional NVM Cache with High Performance and Crash Consistency Michèle Weiland University of Edinburgh Progressive Load Balancing of Asynchronous Algorithms Performance Analysis Brent Welch Google Parallel I/O in Practice Gerhard Wellein University of Erlangen-Nuremberg Node-Level Performance Engineering Andrew Wellington Australian National University PBS Pro Open Source Project Community BoF Jack Wells Oak Ridge National Laboratory Leadership AI - Keynote by Jack Wells - Director of Science - Oak Ridge Leadership Computing Facility The ARM User Experience: Testbeds and Deployment at HPC Centers Small Business and the Exascale Computing Project Benjamin Welton University of Wisconsin Data Reduction and Partitioning in an Extreme 
Scale GPU-Based Clustering Algorithm Bert Wesarg Technical University Dresden An LLVM Instrumentation Plug-In for Score-P P66: Analyzing Multi-Layer I/O Behavior of HPC Applications Ross Whitaker University of Utah Medical Image Analysis and Visualization Joseph P. White University at Buffalo A Slurm Simulator: Implementation and Parametric Analysis Sam White University of Illinois Visualizing, Measuring, and Tuning Adaptive MPI Parameters Integrating OpenMP into the Charm++ Programming Model Charm++ and AMPI: Adaptive and Asynchronous Parallel Programming Brad Whitlock Intelligent Light In Situ Analysis and Visualization with SENSEI Max D. Whitmore Brandeis University P12: Multi-Size Optional Offline Caching Algorithms Ben Whitney Brown University MGARD: A Multilevel Technique for Compression of Floating-Point Data Robert Whitten University of Tennessee Fourth SC Workshop on Best Practices for HPC Training Andreas Wicenec University of Western Australia International Centre for Radio Astronomy Research SKA: The Data Domino Enabled by DALiuGE Nathan Wichmann Cray Inc P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Patrick M. Widener Sandia National Laboratories Faodail: Enabling In Situ Analytics for Next-Generation Systems P93: Spacehog: Evaluating the Costs of Dedicating Resources to In Situ Analysis Sandra Wienke RWTH Aachen University Productivity and Software Development Effort Estimation in HPC Stefan Wild Argonne National Laboratory Contemporary Design of Supercomputer Experiments Machine Learning and Quantum Computing Torsten Wilde Leibniz Supercomputing Centre Eighth Annual Workshop for the Energy Efficient HPC Working Group (EE HPC WG) State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Jeremiah Wilke Sandia National Laboratories Modeling and Simulation of Communication in HPC Systems Nancy Wilkins-Diehr San Diego Supercomputer Center Americas HPC Collaboration Samuel W. 
Williams Lawrence Berkeley National Laboratory Performance Tuning of Scientific Codes with the Roofline Model Amalee Wilson University of Alabama, Birmingham LESS: Loop Nest Execution Strategies for Spatial Architectures Experiencing HPC for Undergraduates: Graduate Student Perspective Wesley M. Wilson US Naval Surface Warfare Center Accelerating Defense Innovation of US Naval Vessels with Computational Prototypes and High Performance Computing Theresa Lynn Windus Iowa State University Ames Laboratory Taking the Nanoscale to the Exascale John H. Wise Georgia Institute of Technology First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe John G. Wohlbier Engility Corporation P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures Felix Wolf Technical University Darmstadt Machine Learning for Parallel Performance Analytics Contemporary Design of Supercomputer Experiments Emerging Technologies Showcase (Day 3) Emerging Technologies Showcase (Day 2) Emerging Technologies Showcase (Day 1) Matthew Wolf Oak Ridge National Laboratory Discussion/Open Mic Session/Closing Remarks Introduction - ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Parallel Streaming for In Transit Analysis with Heterogeneous Data Layout In Situ Analysis and Visualization with SENSEI ISAV 2017: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization Michael Wolf Sandia National Laboratories HPC Graph Toolkits and the GraphBLAS Forum Michael Wolfe Portland Group Why Iteration Space Tiling? 
Experiencing HPC for Undergraduates: Careers in HPC Scalable Parallel Programming Using OpenACC for Multicore, GPUs, and Manycore Automatic Testing of OpenACC Applications Compiler-Assisted Software Testing for Heterogeneous Computing Systems Noah Wolfe Rensselaer Polytechnic Institute Predicting the Performance Impact of Different Fat-Tree Configurations Jordi Wolfson-Pou Georgia Institute of Technology Distributed Southwell: An Iterative Method with Low Communication Costs Stephanie Wollherr Ludwig Maximilian University of Munich Extreme Scale Multi-Physics Simulations of the Tsunamigenic 2004 Sumatra Megathrust Earthquake Rich Wolski University of California, Santa Barbara Probabilistic Guarantees of Execution Duration for Amazon Spot Instances David Wong University of Illinois P91: Assessing the Availability of Source Code in Computational Physics Michael Wong Khronos Group Inc. Codeplay Software Ltd Khronos SYCL: Tomorrow’s Heterogeneous C++ and C Today Distributed and Heterogeneous Programming in C++ for HPC Chad Wood University of Oregon Projecting Performance Data Over Simulation Geometry Using SOSflow and Alpine ScrubJay: Deriving Knowledge from the Disarray of HPC Performance Data David A. Wood University of Wisconsin Advanced Micro Devices Inc Gravel: Fine-Grain GPU-Initiated Network Messages Paul Wood Purdue University Snowpack: Efficient Parameter Choice for GPU Kernels via Static Analysis and Statistical Prediction Simon Woodman Newcastle University e-Science Central Workflows for Stream Processing: Adding Streaming Support to an Existing Workflow System Jon Woodring Los Alamos National Laboratory Large Scale Visualization with ParaView Bernie Woytek Gensler Inc Data Center Design and Planning for HPC Folks Nicholas Wright Lawrence Berkeley National Laboratory Performance and Energy Usage of Workloads on KNL and Haswell Architectures Steven A. 
Wright University of York Introduction - The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) The 8th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS17) Kai Wu University of California, Merced Unimem: Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Main Memory P92: Characterization and Comparison of Application Resilience for Serial and Parallel Executions Kesheng Wu Lawrence Berkeley National Laboratory Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling Panruo Wu University of Tennessee Investigating Half-Precision Arithmetic to Accelerate Dense Linear System Solvers Correcting Soft Errors Online in Fast Fourier Transform Tzuhsien Wu National Tsing Hua University, Taiwan Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling Wenhao Wu University of Delaware A22: Verifying Functional Equivalence Between C and Fortran Programs Wenji Wu Fermi National Laboratory P43: Deep Packet/Flow Analysis Using GPUs Nathan Wukie University of Cincinnati Position Paper: Software Engineering for Efficient Development of Flexible Numerical Software Michael R.
Wyatt University of Delaware A10: Revealing the Power of Neural Networks to Capture Accurate Job Resource Usage from Unparsed Job Scripts and Application Inputs Brian Wylie Juelich Supercomputing Center Hands-On Practical Hybrid Parallel Application Performance Engineering Return to Top X Andre Xian Ming Chang FWDNXT Inc Snowflake: Efficient Accelerator for Deep Neural Networks Liquan Xiao National University of Defense Technology P56: ZoneTier: A Zone-Based Storage Tiering and Caching Co-Design to Integrate SSDs with Host-Aware SMR Drives Xian Xiao University of California, Davis Silicon Photonic LIONS: All-to-All Interconnects for Energy-Efficient, Scalable, and Modular HPC Systems Xuchao Xie National University of Defense Technology P56: ZoneTier: A Zone-Based Storage Tiering and Caching Co-Design to Integrate SSDs with Host-Aware SMR Drives Ying Hao Xu Lin Barcelona Supercomputing Center P71: Is ARM Software Ecosystem Ready for HPC? Danya Xu Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Haiying Xu National Center for Atmospheric Research Quality Assurance and Error Identification for the Community Earth System Model Hao Xu University of California, San Diego First Light in the Renaissance Simulation Visualization: Formation of the Very First Galaxies in the Universe Tianqi Xu Tokyo Institute of Technology P52: A Simulation-Based Analysis on the Configuration of Burst Buffer Weijia Xu Texas Advanced Computing Center, University of Texas High Performance Computing Education in US Data Science Mingdi Xue Data Storage Institute Transactional NVM Cache with High Performance and Crash Consistency Wei Xue Tsinghua University Understanding Object-Level Memory Access Patterns Across the Spectrum 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Return to Top Y Sudhakar Yalamanchili Georgia Institute of Technology Energy Efficient 
Supercomputing (E2SC) Memory-Centric Architectures for the Cloud and HPC Susumu Yamada Japan Atomic Energy Agency Application of a Communication-Avoiding Generalized Minimal Residual Method to a Gyrokinetic Five Dimensional Eulerian Code on ManyCore Platforms Takuma Yamaguchi University of Tokyo P23: AI with Super-Computed Data for Monte Carlo Earthquake Hazard Classification Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation Using OpenACC Keiji Yamomoto RIKEN State of the Practice: Energy and Power Aware Job Scheduling and Resource Management (EPA-JSRM) Yonghong Yan University of South Carolina Evaluation of Knight Landing High Bandwidth Memory for HPC Workloads Opening Remarks: MCHPC'17: Workshop on Memory Centric Programming for HPC MCHPC2017: Workshop on Memory Centric Programming for HPC Chao Yang Lawrence Berkeley National Laboratory P13: Large-Scale GW Calculations on Pre-Exascale HPC Systems Chen Yang Boston University P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable Inference Processor for Multi-Layer Perceptrons Guangwen Yang Tsinghua University Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Hailong Yang Beihang University Five-minute presentations by young researchers from around the world - part 1 Jinzhe Yang Imperial College, London Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Jun Yang Data Storage Institute Transactional NVM Cache with High Performance and Crash Consistency Max Yang Georgia Institute of Technology A25: Investigating Performance of Serialization Methods for Networked Data Transfer in HPC Applications Qing Yang Shenzhen DAPU Microelectronics Company University of Rhode Island Introducing DPU - Data-Storage Processing Unit – Placing Intelligence in Storage Yafei Yang Shenzhen DAPU Microelectronics 
Company Introducing DPU - Data-Storage Processing Unit – Placing Intelligence in Storage Yechao Yang Data Storage Institute Transactional NVM Cache with High Performance and Crash Consistency Zhi Yang University of Wyoming P28: High-Fidelity Blade-Resolved Wind Plant Modeling Yuichiro Yasui Kyushu University P78: Performance Evaluation of Graph500 Considering CPU-DRAM Power Shifting Katherine Yelick Lawrence Berkeley National Laboratory PGAS Applications Workshop Panel Keynote - Breakthrough Science at the Exascale Jae-Seung Yeom Lawrence Livermore National Laboratory Performance Modeling under Resource Constraints Using Deep Transfer Learning Wanwang Yin National Research Center of Parallel Computer Engineering and Technology, China 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Zekun Yin Shandong University 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Adarsh Yoga Rutgers University Hewlett Packard Enterprise Path-Synchronous Performance Monitoring in HPC Interconnection Networks with Source-Code Attribution S.J. 
Ben Yoo University of California, Davis Silicon Photonic LIONS: All-to-All Interconnects for Energy-Efficient, Scalable, and Modular HPC Systems P50: Energy-Efficient and Scalable Bio-Inspired Nanophotonic Computing P49: Toward Exascale HPC Systems: Exploiting Advances in High Bandwidth Memory (HBM2) through Scalable All-to-All Optical Interconnect Architectures Boram Yoon Los Alamos National Laboratory P88: PetaVision Neural Simulation Toolbox on Intel KNLs Kazutomo Yoshii Argonne National Laboratory Reconfigurable Computing in Exascale P46: Understanding How OpenCL Parameters Impact on Off-Chip Memory Performance of FPGA Platforms P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable Inference Processor for Multi-Layer Perceptrons Yang You University of California, Berkeley Scaling Deep Learning on GPU and Knights Landing Clusters Steven R. Young Oak Ridge National Laboratory Introduction - Machine Learning in HPC Environments Evolving Deep Networks Using HPC Machine Learning in HPC Environments Andrew Younge Sandia National Laboratories Containers in HPC Alex Younts Purdue University HPC Systems Professionals Workshop Chenhan D. 
Yu University of Texas Geometry-Oblivious FMM for Compressing Dense SPD Matrices Kwangmin Yu Brookhaven National Laboratory P34: GPU Acceleration for the Impurity Solver in GW+DMFT Packages Fengming Yuan Oak Ridge National Laboratory Data Analysis of Earth System Simulation within an In Situ Infrastructure Liang Yuan Chinese Academy of Sciences Tessellating Stencils Xin Yuan Florida State University Modeling UGAL on the Dragonfly Topology A Comparative Study of SDN and Adaptive Routing on Dragonfly Networks Jin-Hee Yuk Korea Institute of Science and Technology Information Visualization of Decision-Making Support (DMS) Information for Responding to a Typhoon-Induced Disaster Return to Top Z Aliasger Zaidy FWDNXT Inc Snowflake: Efficient Accelerator for Deep Neural Networks Ayal Zaks Intel Corporation LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization Ali Reza Zamani Zadeh Najari Rutgers University Submarine: A Subscription-Based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure Rafael Zamora-Resendiz Lawrence Berkeley National Laboratory P36: A Novel Feature-Preserving Spatial Mapping for Deep Learning Classification of Ras Structures Jan Zapletal Technical University of Ostrava P03: BEM4I: A Massively Parallel Boundary Element Solver Justs Zarins University of Edinburgh Progressive Load Balancing of Asynchronous Algorithms Lingfang Zeng Johannes Gutenberg University Mainz A Configurable Rule-Based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System Xianwei Zeng Delft University of Technology Adopting OpenCAPI for High Bandwidth Database Accelerators Max Zeyen Los Alamos National Laboratory University of Kaiserslautern Cosmological Particle Data Compression in Practice Jidong Zhai Tsinghua University Efficient Process Mapping in Geo-Distributed Cloud Data Centers Boyu Zhang Microsoft Introduction - The Eighth International Workshop on Data-Intensive Computing in the 
Clouds Hui Zhang University of Louisville High Performance Computing Education in US Data Science Jian Zhang Stanford University Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data Jie Zhang Ohio State University Designing and Building Efficient HPC Cloud with Modern Networking Technologies on Heterogeneous HPC Clusters Lihao Zhang Stony Brook University Five-minute presentations by young researchers from around the world - part 1 Tingjian Zhang Shandong University 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Wen Zhang Stanford University Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions Wenqiang Zhang University of Science and Technology of China 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Xuechen Zhang Washington State University, Vancouver Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory P25: Large-Scale Adaptive Mesh Simulations Through Non-Volatile Byte-Addressable Memory Yiming Zhang University of Florida Multi-Fidelity Surrogate Modeling for Application/Architecture Co-Design Yu Zhang University of California, Davis Silicon Photonic LIONS: All-to-All Interconnects for Energy-Efficient, Scalable, and Modular HPC Systems Yunquan Zhang Chinese Academy of Sciences Tessellating Stencils Zhenguo Zhang Southern University of Science and Technology, China 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10 Hz Scenarios Zhibo Zhang University of Maryland, Baltimore County Multidisciplinary Education on Big Data + HPC + Atmospheric Sciences Dongfang Zhao University of Nevada, Reno Introduction - MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers Xin Zhao Mellanox 
Technologies Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Zhiming Zhao University of Amsterdam Seamless Infrastructure Customization and Performance Optimization for Time-Critical Services in Data Infrastructures Chao Zheng University of Notre Dame P58: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem Gengbin Zheng Intel Corporation Why Is MPI So Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1 Weijian Zheng Indiana University-Purdue University Indianapolis Designing a Synchronization-Reducing Clustering Method on Manycores: Some Issues and Improvements Yan Zheng National Research Center of Parallel Computer Engineering and Technology, China Redesigning CAM-SE for Petascale Climate Modeling Performance on Sunway TaihuLight Yantong Zheng University of Illinois P91: Assessing the Availability of Source Code in Computational Physics Kangyou Zhong Sun Yat-Sen University Visualizations of a High-Resolution Global-Regional Nested, Ice-Sea-Wave Coupled Ocean Model System Amelie Chi Zhou Shenzhen University Efficient Process Mapping in Geo-Distributed Cloud Data Centers Ying Zhou Loughborough University P20: Facilitating the Scalability of ParSplice for Exascale Testbeds Johannes Ziegenbalg Technical University Dresden An LLVM Instrumentation Plug-In for Score-P P68: Continuous Clock Synchronization for Accurate Performance Measurements Dmitri Zimine Brocade Communications Systems StackStorm Inc Genomic Computations at Scale with Serverless, and Docker Swarm Christopher Zimmer Oak Ridge National Laboratory GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility Best Practices for Architecting Performance and Capacity in the Burst Buffer Era Michael Zingale Stony Brook University The OLCF GPU Hackathon Series: The Story Behind Advancing Scientific Applications with a Sustained Impact Taieb Znati University of Pittsburgh 
Silent Errors in HPC Systems Hamid Reza Zohouri Tokyo Institute of Technology P41: OpenCL-Based High-Performance 3D Stencil Computation on FPGAs Mawussi Zounon University of Manchester Batched, Reproducible, and Reduced Precision BLAS Mohammad Zubair Old Dominion University P04: Unstructured-Grid CFD Algorithms on Many-Core Architectures