Predictive Modeling of Hepatitis using Machine Learning Algorithms and Mathematical Formulations

Victor C. Chiawa; Kingsley I. Chibueze

doi:10.62054/ijdm/0202.16

Authors

Victor C. Chiawa, Department of Education Mathematics, Abia State University Uturu, Nigeria
Kingsley I. Chibueze, Department of Computer Science & Mathematics, Godfrey Okoye University, Nigeria

Keywords:

Hepatitis Prediction, Healthcare AI, Early Diagnosis, SMOTE, Machine Learning

Abstract

Hepatitis remains a major global health concern, with the greatest impact on the most vulnerable groups, such as infants and pregnant women. This research aims to develop a machine learning model for early hepatitis detection, with the goal of reducing the number of people affected and killed by the disease. Hepatitis B (HBV), Hepatitis C (HCV), and Hepatitis E (HEV) were characterized using data covering a range of demographics, drawn from healthcare facilities and online repositories. Data preprocessing techniques, including feature selection, imputation, and normalization, were employed to prepare a robust dataset. Three ensemble models were trained and tested: Random Forest and two boosting algorithms, AdaBoost and XGBoost. The models' performance was assessed using Accuracy, Precision, Recall, F1 Score, and the Confusion Matrix. XGBoost performed best on all metrics, with an accuracy of 0.96, precision of 0.95, recall of 0.94, and F1 score of 0.94. AdaBoost achieved an accuracy of 0.90, precision of 0.90, recall of 0.88, and F1 score of 0.89. Random Forest performed worst of the three, with an accuracy of 0.85, precision of 0.85, recall of 0.84, and F1 score of 0.84. XGBoost was further checked with ten-fold cross-validation, which gave an average accuracy of 0.8532, precision of 0.8503, recall of 0.8395, and F1 score of 0.8450. Compared with the other models, the proposed model outperformed them and can be applied in a clinical setting.
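
The evaluation described in the abstract rests on two standard procedures: deriving Accuracy, Precision, Recall, and F1 Score from a confusion matrix, and splitting the data into ten folds for cross-validation. The following is a minimal sketch of both in plain Python, not the authors' implementation. The function names `classification_metrics` and `k_fold_indices` and the example counts are hypothetical, and the folds here are contiguous rather than shuffled or stratified, which the abstract does not specify.

```python
from typing import Dict, Iterator, List, Tuple


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

In a ten-fold run, a model would be fitted on each `train` split and scored on the matching `test` split, and the per-fold metrics would then be averaged to give figures of the kind reported for XGBoost.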

