Feature Engineering and Supervised Machine Learning to Forecast Biogas Production during Municipal Anaerobic Co-Digestion

ACS ES T Eng. 2023 Dec 28;4(3):660-672. doi: 10.1021/acsestengg.3c00435. eCollection 2024 Mar 8.

Abstract

Municipalities with excess anaerobic digestion capacity accept offsite wastes for co-digestion to meet sustainability goals and create more biogas. Despite the benefits inherent to co-digestion, the temporal and compositional heterogeneity of external waste streams creates operational challenges that lead to upsets or conservative co-digestion. Given the complex microbial bioprocesses occurring during anaerobic digestion, prediction and modeling of the outcomes can be challenging, and machine learning has the potential to improve understanding and control of co-digestion processes. Biogas flows are a surrogate for process health, and here, we predicted biogas production from historical data collected by a water resource recovery facility (WRRF) during normal operation. We tested a daily lab and operational data set (n = 1089 after cleaning) and a minute-by-minute supervisory control and data acquisition (SCADA) operational data set (n = 491,761 after cleaning) to determine if forecasting biogas flow for a 24 h time horizon is feasible without collecting additional data. We found that a multilayer perceptron (MLP) neural network model outperformed tree-based and multiple linear regression models. Using a high-resolution SCADA data set for the first time, we showed that MLP neural networks could predict biogas production with an adjusted coefficient of determination (R²) of 0.78 and a mean absolute percentage error of 13.4% on a holdout test set. Adding daily laboratory analyses to the model did not appreciably improve the prediction of biogas flows. Feature engineering was essential to an accurate prediction, and 11 of the 15 most important features in the SCADA model were calculated from raw SCADA outputs. In summary, this paper demonstrates that minute-scale SCADA information collected at a municipal co-digestion facility can forecast biogas production, as a first step toward a digital twin model, without additional data collection.
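
The two holdout metrics quoted above, adjusted R² and mean absolute percentage error (MAPE), can be illustrated with a minimal sketch. It uses the standard textbook definitions; the function names, the toy flow values, and the feature count are assumptions made for illustration only and are not taken from the paper's code or data.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (division by the observation).
    """
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted coefficient of determination.

    Penalizes plain R^2 for the number of predictors:
    1 - (1 - R^2) * (n - 1) / (n - n_features - 1).
    Requires n > n_features + 1.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


# Hypothetical observed and forecast biogas flows (arbitrary units).
observed = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 330.0, 380.0]

print(mape(observed, forecast))
print(adjusted_r2(observed, forecast, n_features=1))
```

For these toy values the MAPE is 7.5% and the adjusted R² is about 0.955. With a large sample, such as the minute-scale SCADA set, the adjustment term is close to 1 and adjusted R² differs little from plain R².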