Tree-based modeling methods to predict nitrate exceedances in the Ogallala aquifer in Texas


The performance of four tree-based classification techniques, namely classification and regression trees (CART), multivariate adaptive regression splines (MARS), random forests (RF) and gradient boosting trees (GBT), was compared against the commonly used logistic regression (LR) analysis to assess aquifer vulnerability in the Ogallala Aquifer of Texas. The results indicate that the tree-based models performed better than the logistic regression model, as they were able to locally refine nitrate exceedance probabilities. RF exhibited the best generalization capability. The CART model did better in predicting non-exceedances. Nitrate exceedances were sensitive to well depth, an indicator of aquifer redox conditions, which, in turn, were controlled by alkalinity increases brought about by the dissolution of calcium carbonate. The clay content of soils and soil organic matter, which serve as indicators of agricultural activities, were also noted to have significant influences on nitrate exceedances. Likely nitrogen releases from confined animal feedlot operations in the northeast portions of the study area also appeared to be locally important. Integrated soil, hydrogeological and geochemical datasets, in conjunction with tree-based methods, help elucidate the processes controlling nitrate exceedances. Overall, tree-based models offer flexible, transparent approaches for mapping nitrate exceedances, identifying underlying mechanisms and prioritizing monitoring activities.
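As a rough illustration of how a tree-based classifier can "locally refine" exceedance predictions, the following minimal sketch picks a single CART-style split on well depth by minimizing the weighted Gini impurity. It is not the authors' implementation: the function names `gini` and `best_split`, the depth values, and the exceedance flags are all hypothetical and chosen only to make the example easy to follow.

```python
# Illustrative sketch only: synthetic values, not data from the study.

def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = nitrate exceedance)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of exceedances in this node
    return 2.0 * p * (1.0 - p)


def best_split(depths, exceeds):
    """Return (threshold, weighted_gini) for the best single split on depth.

    Candidate thresholds are midpoints between consecutive distinct depths,
    as in a standard CART split search on one numeric predictor.
    """
    pairs = sorted(zip(depths, exceeds))
    n = len(pairs)
    best = (None, gini(exceeds))  # start from the unsplit node
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical depths
        threshold = (lo + hi) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical well depths and exceedance flags (assumed for illustration).
depths = [20, 35, 40, 55, 90, 120, 150, 200]
exceeds = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, impurity = best_split(depths, exceeds)
print(threshold, impurity)
```

A full tree repeats this search recursively within each resulting subset and across all predictors, which is what lets such models adapt their predictions to local conditions; random forests and gradient boosting combine many such trees.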


© 2020 by the authors. cc-by


Aquifer vulnerability, CART, Gradient boosting algorithms, Machine learning, MARS, Nitrate, Ogallala aquifer, Random forests, Water quality


Uddameri, V., Silva, A.L.B., Singaraju, S., Mohammadi, G., & Hernandez, E.A. 2020. Tree-based modeling methods to predict nitrate exceedances in the Ogallala aquifer in Texas. Water (Switzerland), 12(4).