Data engineering application to develop a comprehensive hydro-climatological database for the High Plains region
The Ogallala Aquifer, which underlies the High Plains region of the US, meets thirty percent of the United States’ irrigation demand and supports one-sixth of the world’s total grain production. The aquifer, however, is rapidly depleting because the resource is over-utilized. Prolonging the useful life of the aquifer requires analysis of future water use scenarios, with solution strategies guided by scientific modeling procedures. Any model is only as good as its underlying data, and making informed decisions during data compilation helps avoid problems later. To facilitate hydrological modeling, it is beneficial to have a comprehensive, multi-scale, multi-resolution, and scalable dataset with historical data on the region’s climatic and meteorological parameters. This thesis outlines the design and preparation of a comprehensive relational database that archives hydrometeorological data of use in hydrological modeling. The tasks performed include, but are not limited to, extracting data from individual sources, imputing missing data, arranging the data in a homogeneous format, storing it in a relational database, and estimating parameters that are not readily available. Data were extracted from primary and secondary data sources, ensuring the required data quality and maintaining the parameters’ temporal continuity over 39 years (1981–2019) across all sources and stations. Kalman filtering was used to impute the missing data. The resulting database covers thirteen hydro-climatological parameters and contains a total of 1.46 billion data points in 58.4 million records.
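As an illustration of Kalman-filter-based imputation, the following minimal sketch fills gaps in a single daily series. It is not the implementation used in the thesis: it assumes a local-level (random walk plus noise) state-space model, and the function name `kalman_impute` and the fixed variances `obs_var` and `state_var` are hypothetical choices made for this example. In practice the variances would be estimated from the data.

```python
import numpy as np

def kalman_impute(y, obs_var=1.0, state_var=0.1):
    """Fill NaNs in a 1-D series with a local-level Kalman filter
    followed by a Rauch-Tung-Striebel (RTS) smoother.

    Assumed model (illustrative only):
        state_t = state_{t-1} + w_t,  w_t ~ N(0, state_var)
        y_t     = state_t + v_t,      v_t ~ N(0, obs_var)
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    observed = ~np.isnan(y)
    if not observed.any():
        raise ValueError("series has no observed values")

    pred_mean = np.empty(n)   # state mean before seeing y[t]
    pred_var = np.empty(n)
    filt_mean = np.empty(n)   # state mean after seeing y[t]
    filt_var = np.empty(n)

    # Start at the first observed value with a very uncertain prior.
    mean, var = y[observed][0], 1e6

    # Forward pass: predict, then update only where data exist.
    for t in range(n):
        if t > 0:
            var = var + state_var          # random-walk prediction
        pred_mean[t], pred_var[t] = mean, var
        if observed[t]:
            gain = var / (var + obs_var)   # Kalman gain
            mean = mean + gain * (y[t] - mean)
            var = (1.0 - gain) * var
        filt_mean[t], filt_var[t] = mean, var

    # Backward pass: let later observations inform earlier states.
    smooth = filt_mean.copy()
    for t in range(n - 2, -1, -1):
        c = filt_var[t] / pred_var[t + 1]
        smooth[t] = filt_mean[t] + c * (smooth[t + 1] - pred_mean[t + 1])

    # Keep observed values untouched; replace only the gaps.
    filled = y.copy()
    filled[~observed] = smooth[~observed]
    return filled

# Made-up example series with a two-day gap.
series = [10.0, 11.0, np.nan, np.nan, 14.0, 15.0]
filled = kalman_impute(series)
```

The smoother is included because a forward-only filter would simply carry the last estimate across a gap, whereas smoothing also uses the observations that follow the gap.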
As a second objective, gridded (interpolated) data from two popular sources, Daymet and PRISM, were compared with primary data (USHCN Daily Data) in estimating a suite of metrics relevant to hydro-agricultural applications. While these secondary datasets captured the trends in the parameters, they did not accurately estimate the parameters’ values.
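The distinction between capturing a trend and matching values can be made concrete with a short sketch. The statistics below (Pearson correlation, mean bias, and root-mean-square error) are common agreement measures chosen for illustration; they are an assumption and not necessarily the metrics evaluated in the thesis. The function name `agreement_stats` and the sample numbers are hypothetical.

```python
import numpy as np

def agreement_stats(station, gridded):
    """Compare a gridded series against a station series.

    A high correlation means the gridded product follows the
    station's ups and downs (trend); bias and RMSE measure how far
    the actual values are from the station values.
    """
    station = np.asarray(station, dtype=float)
    gridded = np.asarray(gridded, dtype=float)

    # Use only time steps where both series have a value.
    valid = ~(np.isnan(station) | np.isnan(gridded))
    s, g = station[valid], gridded[valid]
    diff = g - s

    return {
        "pearson_r": float(np.corrcoef(s, g)[0, 1]),
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
    }

# Made-up values: the gridded series tracks the station perfectly
# in shape but is offset by a constant 2 units.
station = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
gridded = station + 2.0
stats = agreement_stats(station, gridded)
```

In this constructed case the correlation is essentially 1 while the bias and RMSE are both 2, which is the pattern of a dataset that reproduces the trend but not the magnitudes.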
Embargo status: Restricted until 06/2022.