Applications of high performance computing to the analysis of large eddy simulation of the convective boundary layer
Sliwinski, Timothy S.
In atmospheric science, the notion of what is considered a "large dataset" has changed. As computing and data storage resources have become more affordable, researchers working with terabytes (10^12 bytes) of data have become commonplace, centralized stores of atmospheric data now contain petabytes (10^15 bytes) of archived observations and model output, and exabytes (10^18 bytes) are now referred to as the next "grand challenge". As these datasets continue to grow, analysis methods will need to evolve with them so that meaning can be extracted in a timely fashion that does not impede scientific progress and operational effectiveness. To do so, technologies that have the potential to accelerate the way datasets are accessed and processed must be better harnessed. In this dissertation, practical methods are presented by which an analysis may be executed in parallel using the Message Passing Interface (MPI) and Python. These methods first consider the inherent spatial dependencies of a particular data analysis process. By identifying these dependencies, the dataset can be distributed horizontally or vertically across processes with minimal interprocess communication. In addition, an analysis method is classified as either data-transfer-limited or computationally-limited. In data-transfer-limited problems, data transfer time outweighs processing time; in computationally-limited problems, processing time outweighs data transfer time. The results show that the execution time of computationally-limited problems improves as processor count increases, whereas data-transfer-limited problems benefit most from an increase in node count. To further improve the performance of computationally-limited problems, a Graphics Processing Unit (GPU) and the Compute Unified Device Architecture (CUDA) framework are used.
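As an illustration of distributing a gridded dataset along one axis so that each process holds everything its analysis depends on, the following is a minimal sketch and not the dissertation's implementation. The function names `slab_bounds` and `decompose`, the (z, y, x) axis order, and the process counts are hypothetical; the grid shape is only what the stated domain and grid spacing imply (4 km / 12.5 m = 320 levels, 9 km / 12.5 m = 720 points per horizontal direction). No MPI calls are made; the sketch only computes which index ranges each rank would own.

```python
def slab_bounds(n_points, n_procs, rank):
    """Return the half-open index range [start, stop) owned by `rank`."""
    base, extra = divmod(n_points, n_procs)
    # The first `extra` ranks each take one additional point, so the
    # slab sizes differ by at most one.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def decompose(shape, n_procs, axis):
    """Split a (nz, ny, nx) grid into contiguous slabs along `axis`.

    Returns one tuple of (start, stop) pairs per rank, one pair per axis.
    """
    slabs = []
    for rank in range(n_procs):
        index = [(0, n) for n in shape]          # full extent on every axis
        index[axis] = slab_bounds(shape[axis], n_procs, rank)
        slabs.append(tuple(index))
    return slabs


# Grid implied by the stated domain and spacing (assumed z, y, x order).
GRID = (320, 720, 720)

# An analysis that needs whole horizontal planes (e.g. a level mean) can be
# split along z: every rank holds complete levels and needs no neighbours.
by_level = decompose(GRID, n_procs=6, axis=0)

# An analysis that needs whole vertical columns (e.g. a column integral) can
# be split along a horizontal axis instead, here y.
by_column = decompose(GRID, n_procs=7, axis=1)

for rank, slab in enumerate(by_level):
    print(rank, slab)
```

Choosing the split axis from the dependency structure is what keeps interprocess communication low: a rank never has to request data that its own slab already contains in full.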
The GPU implementation is shown to offer further improvement over the MPI version of the analysis methods tested. To further showcase the usefulness of these methods, research is performed using output from fine-scale Weather Research and Forecasting (WRF) model runs configured as Large Eddy Simulations (LES). The model output simulates the homogeneous convective boundary layer at a grid spacing of 12.5 m in both the horizontal and vertical over a domain of 9 km x 9 km x 4 km. This fine mesh resolution results in a total output dataset size of just over 4 TB for the 4-hour simulated period from 11:00 to 15:00 local time. The goal of this research is to investigate the influence of the diurnal cycle on the interscale character of boundary layer moisture fluxes first noted by Jonker et al. (1999), and the role of the diurnally evolving entrainment zone in these fluxes. The simulations are driven by prescribed homogeneous surface flux forcing that varies sinusoidally in time to reproduce the change during the day. With this dataset, a significant portion of the smallest eddies in the entrainment zone can be resolved, providing insight into the entrainment processes occurring in response to this forcing. Through entrainment flux quadrant analysis, it will be shown how the entrainment zone of the diurnally forced convective boundary layer transitions to one influenced more heavily by downward (negative) vertical motions, in line with the "scouring" entrainment process noted by Sullivan et al. (1998) under more stable capping inversions. Additionally, through scale-dependency and 2-D multi-resolution spectral analysis techniques, these downward motions will be shown to grow in scale over time and to play a role in increasing the scale of the variance of boundary layer moisture through the entrainment of drier free-atmosphere air.
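The flux quadrant analysis mentioned above can be illustrated with a generic sketch. It assumes the common definition in which the mean flux at one level is partitioned by the signs of the vertical velocity perturbation w' and the moisture perturbation q'; the abstract does not give the dissertation's exact quadrant definitions, so the function name `quadrant_fluxes`, the quadrant labels, and the sample values are hypothetical.

```python
def quadrant_fluxes(w, q):
    """Partition the mean flux <w'q'> at one level into four quadrants.

    `w` and `q` are equal-length sequences of vertical velocity and moisture
    sampled over a horizontal plane. Perturbations are taken about the plane
    mean, and each sample's w'q' is added to the quadrant given by the signs
    of w' and q'. The four returned values sum to the total mean flux.
    """
    n = len(w)
    w_mean = sum(w) / n
    q_mean = sum(q) / n
    totals = {"w+q+": 0.0, "w+q-": 0.0, "w-q+": 0.0, "w-q-": 0.0}
    for wi, qi in zip(w, q):
        wp = wi - w_mean
        qp = qi - q_mean
        # Assumed convention: zero perturbations are counted as positive.
        key = ("w+" if wp >= 0 else "w-") + ("q+" if qp >= 0 else "q-")
        totals[key] += wp * qp
    return {key: value / n for key, value in totals.items()}


# Hypothetical sample values for a single level (not model output).
w = [1.0, -1.0, 2.0, -2.0]
q = [3.0, 1.0, 1.0, 3.0]

fluxes = quadrant_fluxes(w, q)
total_flux = sum(fluxes.values())
print(fluxes, total_flux)
```

In this convention, the two `w-` quadrants together give the share of the flux carried by downward motions, which is the kind of quantity that can be tracked through the simulated day.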