Towards efficient data movement at extreme scale
Abstract
High Performance Computing (HPC) systems are equipped with a massive number of processors, high-bandwidth networks, large-capacity fast storage, and specialized parallel software to provide answers to challenging scientific and engineering questions. Over the past decades, data generation capabilities in these domains have grown rapidly due to the emergence of large-scale instruments such as telescopes and colliders, as well as the proliferation of sensors and high-throughput analysis devices. The explosion in data volume, variety, and complexity has required modern supercomputers to continuously improve sustained performance and bring HPC to the extreme scale.
The most significant changes at the extreme scale come from architectural changes in the underlying hardware platforms. On-chip parallelism has increased dramatically to improve performance, as power and cooling constraints have limited increases in microprocessor clock speeds. The number of cores per chip is expected to reach hundreds or even thousands, and total concurrency must rise by up to five orders of magnitude on extreme-scale systems. Meanwhile, available memory and memory bandwidth will not scale by the same order of magnitude. This disparity between the growth of computing and that of memory means that memory capacity and bandwidth per core will be even lower than on today's petascale machines. It has led to deeper memory and storage hierarchies that keep most of the data relevant to a program close to the processing logic.
The phenomenal increase in dataset sizes and the rapid advance of computing architectures on extreme-scale high performance computing systems pose critical challenges to data organization and management. Software solutions that address these challenges, however, have lagged significantly behind. This dissertation research focuses on designing and developing an innovative coordinated data movement methodology that can substantially improve the performance and energy efficiency of high performance computing. This new methodology provides a coordinated architecture that organizes data movement to achieve the desired parallelism, locality, and scalability for extreme-scale systems. It adds features to the programming environment and develops an execution model that offers effective methods to explicitly arrange data movement. This dissertation advances the understanding of scientific applications' I/O activities and further unleashes the power of high performance computing.