Authors :
Valine Atieno Okeyo; Idah Orowe; Nicholas Otienoh Oguge
Volume/Issue :
Volume 9 - 2024, Issue 7 - July
Google Scholar :
https://tinyurl.com/mr2bp7ya
Scribd :
https://tinyurl.com/mr234rpf
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24JUL1521
Abstract :
This study evaluates how well a Random Forest model identifies respiratory diseases attributed to PM2.5 exposure in Nairobi County. Using a dataset of demographic and air quality variables, the model achieved an accuracy of 79.97% and an area under the curve (AUC) of 0.872, indicating that it distinguishes respiratory from cardiovascular conditions well. Its sensitivity and specificity were 81.88% and 73.27%, respectively, so it identified true positives somewhat more reliably than true negatives. Feature importance analysis showed that age and PM2.5 concentration were the most influential predictors of health outcomes, underlining the contribution of air pollution and demographic factors to respiratory and cardiovascular health. Train and test error rates remained consistent across training set sizes, which suggests that the model is stable and generalizes well. The study underscores the importance of addressing air quality to mitigate the health impacts of PM2.5 exposure in urban settings.
Keywords :
Respiratory Diseases, PM2.5, Random Forest, Accuracy, Feature Importance.
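To make the reported metrics concrete, the following minimal sketch shows how accuracy, sensitivity, specificity, and AUC are typically computed for a binary classifier. It is not the authors' code: the labels and scores are hypothetical toy values, the function names are illustrative, and treating the respiratory class as the positive label (1) is an assumption.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = respiratory (assumed positive class), 0 = cardiovascular.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On this toy data, three of four positives and three of four negatives are classified correctly, so accuracy, sensitivity, and specificity are each 0.75, while the AUC is 0.875. Unlike the other three metrics, AUC does not depend on the 0.5 threshold, which is why it is usually reported alongside them.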