Authors :
M V V Krishna; G Sri Jaya Sairam; P Karthik; M Shakeer; G Arjun; SD Basheer Babu
Volume/Issue :
Volume 10 - 2025, Issue 4 - April
Google Scholar :
https://tinyurl.com/ybxakyu6
Scribd :
https://tinyurl.com/2vjx9npm
DOI :
https://doi.org/10.38124/ijisrt/25apr760
Abstract :
The project "Gen AI for Disease Prediction" uses machine learning to predict
diseases such as diabetes, heart disease, and cancer from user-selected symptoms. It employs the Random Forest algorithm,
an ensemble of decision trees that is less prone to overfitting than a single decision tree.
To improve prediction reliability, the system applies data preprocessing steps such as feature selection, data
cleaning, and encoding. Built with Python, Scikit-learn, and Django, the project pairs the trained
model with a web interface. Users select symptoms from dropdown menus, and the selections
are processed by the backend. The model, trained on a structured dataset covering
various medical conditions and their symptoms, analyzes the input to generate predictions. The result is
a scalable, efficient disease prediction system intended to support the early detection of potential health issues.
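
The pipeline described in the abstract (encode the selected symptoms, then classify them with a Random Forest) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy records, the symptom names, and the helper functions `train` and `predict` are all hypothetical, and the use of `MultiLabelBinarizer` for the encoding step is an assumption. In a Django deployment, the `selected` list would presumably come from the submitted dropdown form.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy records: (selected symptoms, diagnosed condition).
# A real system would load a much larger, cleaned dataset instead.
RECORDS = [
    (["increased thirst", "frequent urination", "fatigue"], "diabetes"),
    (["increased thirst", "blurred vision"], "diabetes"),
    (["frequent urination", "blurred vision", "fatigue"], "diabetes"),
    (["chest pain", "shortness of breath"], "heart disease"),
    (["chest pain", "fatigue", "irregular heartbeat"], "heart disease"),
    (["shortness of breath", "irregular heartbeat"], "heart disease"),
]


def train(records, seed=0):
    """Encode symptom lists as 0/1 columns and fit a Random Forest."""
    symptoms = [s for s, _ in records]
    labels = [d for _, d in records]
    encoder = MultiLabelBinarizer()          # one binary column per symptom
    features = encoder.fit_transform(symptoms)
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(features, labels)
    return encoder, model


def predict(encoder, model, selected):
    """Return a {condition: probability} dict for the selected symptoms."""
    known = set(encoder.classes_)
    # Ignore symptoms the encoder never saw during training.
    cleaned = [s for s in selected if s in known]
    features = encoder.transform([cleaned])
    probabilities = model.predict_proba(features)[0]
    return dict(zip(model.classes_, probabilities))


if __name__ == "__main__":
    encoder, model = train(RECORDS)
    print(predict(encoder, model, ["chest pain", "shortness of breath"]))
```

Returning class probabilities rather than a single label lets the interface show how strongly the model favors each condition, which seems appropriate for a tool meant to prompt early follow-up rather than deliver a diagnosis.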
Keywords :
Random Forest Algorithm, Medical Diagnosis, Scikit-Learn, Symptom Analysis, Early Disease Detection.