Date of Award


Document Type


Degree Name

Master of Science (MS)


Computer Science

First Advisor

Kristen L. Underwood

Second Advisor

Donna M. Rizzo


Surface water turbidity is a key water quality parameter and a significant concern for public health and water treatment. Turbidity can range from a few to several hundred nephelometric turbidity units (NTU) following events such as heavy rainfall, flooding, and agricultural runoff, so effective prediction methods are needed. Current forecasting methods depend on historical data and meteorological forecasts, and their predictive accuracy is limited by the complexity of the hydrological system. Operators of drinking water reservoirs, particularly in large unfiltered water supply systems such as the Catskill and Delaware river systems serving New York City, face significant challenges in managing reservoir operations under fluctuating turbidity levels.

The complexity and variability inherent in hydrological systems make accurate turbidity prediction difficult. Factors affecting prediction accuracy include meteorological variables, catchment attributes, management practices, memory effects, and feedbacks within river systems. Traditional prediction methods such as regression models, while widely used, often fail to account for these complexities, which limits their predictive accuracy. In contrast, advanced machine learning techniques such as Recurrent Neural Networks (RNN), Gated Recurrent Units (GRU), Long Short-Term Memory networks (LSTM), RNN combined with Bi-directional LSTM (RNN+Bi-LSTM), and Self-Attention based LSTM (SA-LSTM) have shown promise in time-series forecasting because they can capture non-linear patterns and temporal dependencies. This research builds predictive models with these recurrent architectures and compares their performance. Using daily data spanning October 1, 2013, to September 30, 2022, we created optimized models in which we tuned batch size, forecast horizon, input window length, error metric, number of epochs, and regularization method. Specifically, we used a batch size of 32, forecast horizons of 1 and 7 days, an input window length of 14 days, error metrics including mean squared error (MSE) and Nash-Sutcliffe efficiency (NSE), 50 epochs with early stopping, and 50% dropout to avoid overfitting. By building and analyzing these forecast models, this research aims to improve the accuracy and reliability of turbidity forecasts and thereby support river and reservoir management.
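
As an illustration of the 14-day input window, the 1- and 7-day forecasts, and the MSE and NSE metrics described above, the following is a minimal sketch in plain Python. It is not the thesis implementation: the function names (`make_windows`, `mse`, `nse`) are hypothetical, the series is univariate and synthetic, and the actual study also used meteorological inputs that are not shown here.

```python
WINDOW_DAYS = 14         # input window length stated in the abstract
HORIZONS_DAYS = (1, 7)   # forecast lead times stated in the abstract


def make_windows(series, window, horizon):
    """Split a daily series into (inputs, targets) pairs.

    inputs[i] holds `window` consecutive days; targets[i] is the value
    observed `horizon` days after the last day of inputs[i].
    """
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this window and horizon")
    inputs = [list(series[i:i + window]) for i in range(n)]
    targets = [series[i + window + horizon - 1] for i in range(n)]
    return inputs, targets


def mse(observed, simulated):
    """Mean squared error between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


if __name__ == "__main__":
    # Synthetic placeholder values, not measured turbidity data.
    daily_turbidity = [5.0 + (day % 10) for day in range(60)]
    for horizon in HORIZONS_DAYS:
        inputs, targets = make_windows(daily_turbidity, WINDOW_DAYS, horizon)
        # Persistence baseline: predict that the last observed day repeats.
        baseline = [window[-1] for window in inputs]
        print(horizon, len(inputs), mse(targets, baseline), nse(targets, baseline))
```

A persistence baseline like the one in the usage example is a common reference point: a recurrent model is only useful at a given horizon if its NSE exceeds that of simply repeating the most recent observation.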

This study develops a framework for turbidity prediction based on these machine learning models. The framework incorporates historical and real-time turbidity data as well as meteorological data to capture the complexity of hydrological systems. Daily data from the Stony Clove watershed in the Catskill system, which includes the Ashokan Reservoir, are used to test and validate the prediction models. The models are compared on their ability to predict turbidity levels across seasons over a period of 9 water years. These prediction models are expected to provide early warnings and decision support for reservoir operations, thereby contributing to improved water quality management. Accurate turbidity predictions can help optimize these operations, especially in water bodies such as the Ashokan Reservoir, where turbidity often exceeds recommended limits.



Number of Pages

80 p.

Available for download on Saturday, October 05, 2024