TY - JOUR
T1 - Improvement of performance of in-situ virtual monitoring system of the occurrence probability for high concentrations of naturally occurring radioactive materials in groundwater through the solution of the data imbalance problem
AU - Lee, Hyeongmok
AU - Jeong, Jina
AU - Choung, Sungwook
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/4
Y1 - 2024/4
N2 - This paper presents two data-driven virtual sensors that estimate time series of the probability of high-concentration occurrence of naturally occurring radioactive materials (NORMs; ²³⁸U and ²²²Rn) in groundwater, based on in-situ groundwater quality monitoring data and geological information. A random forest was applied to estimate the NORM concentration from the in-situ groundwater quality data, rock type, and aquifer depth. Additionally, the study proposes three data sampling techniques (under-sampling, synthetic minority over-sampling, and complex sampling) to improve model applicability and accuracy. The developed models were validated using data acquired from 201 locations in South Korea. The models for ²³⁸U and ²²²Rn showed estimation accuracies of 85% and 80%, respectively; the models with over-sampling showed better performance. The results verified the usefulness of the developed models as virtual sensors that provide immediate information on the in-situ presence of NORMs in groundwater.
KW - Data augmentation
KW - Groundwater quality virtual sensor
KW - Major factor analysis
KW - Random forest
KW - Sampling technique
KW - Sensitivity analysis
UR - http://www.scopus.com/inward/record.url?scp=85185608726&partnerID=8YFLogxK
U2 - 10.1016/j.envsoft.2024.105978
DO - 10.1016/j.envsoft.2024.105978
M3 - Article
AN - SCOPUS:85185608726
SN - 1364-8152
VL - 175
JO - Environmental Modelling and Software
JF - Environmental Modelling and Software
M1 - 105978
ER -