
International Conference on Innovative Computing and Communications, pp. 785–800

Disease Detection and Prediction Using the Liver Function Test Data: A Review of Machine Learning Algorithms
- Ifra Altaf,
- Muheet Ahmed Butt &
- Majid Zaman
- Conference paper
- First Online: 01 September 2021
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1388)
In the last decade, the classification accuracy of machine learning techniques used for disease diagnosis has improved considerably. These techniques also help uncover associations and patterns in the data, which aids the construction of prediction models. Diagnosing illness by considering the features with the greatest impact on recognition is important for controlling the disease. The main objective of this paper is to provide a summarized, comparative review of the literature on the detection and prediction of liver diseases with various machine learning algorithms applied to liver function test data, and to draw analytical conclusions from it. From this study, it is observed that CMAC, RBF, PSO-LS-SVM, and ADTree improve the accuracy of liver disease detection and prediction. Past findings on LFT data and their association with diabetes prediction are also reviewed.
- Liver function tests
- Diabetes mellitus
- Disease diagnosis
- Deep learning
- Artificial neural networks

P. Sharma, et al., Diagnosis of Parkinson’s disease using modified grey wolf optimization. Cogn. Syst. Res. 54 , 100–115 (2019)
M. Ashraf, et al., Prediction of cardiovascular disease through cutting-edge deep learning technologies: an empirical study based on TENSORFLOW, PYTORCH and KERAS, in International Conference on Innovative Computing and Communications (Springer, Singapore, 2020)
J.A. Alzubi, et al., Efficient approaches for prediction of brain tumor using machine learning techniques. Indian J. Public Health Res. Dev. 10 (2), 267–272 (2019)
M. Ashraf, M. Zaman, M. Ahmed, An intelligent prediction system for educational data mining based on ensemble and filtering approaches. Procedia Comput. Sci. 167 , 1471–1483 (2020)
M. Ashraf, Z. Majid, A. Muheet, To ameliorate classification accuracy using ensemble vote approach and base classifiers, in Emerging Technologies in Data Mining and Information Security (Springer, Singapore, 2019), pp. 321–334
M. Ashraf, Z. Majid, A. Muheet, Performance analysis and different subject combinations: An empirical and analytical discourse of educational data mining, in 2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence) (IEEE, 2018)
M. Ashraf, M. Zaman, M. Ahmed, Using Ensemble StackingC method and base classifiers to ameliorate prediction accuracy of pedagogical data. Procedia Comput. Sci. 132 , 1021–1040 (2018)
R. Mohd, A.B. Muheet, Z.B. Majid, GWLM–NARX. Data Technol. Appl. (2020)
R. Mohd, A.B. Muheet, Z.B. Majid, SALM-NARX: Self adaptive LM-based NARX model for the prediction of rainfall, in 2018 2nd International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (IEEE, 2018)
Z. Majid, K. Sameer, A. Muheet, Analytical comparison between the information gain and Gini index using historical geographical data. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 11 (5), 429–440 (2020)
M. Zaman, S.M.K. Quadri, A.B. Muheet, Information translation: a practitioner's approach. Proc. World Congr. Eng. Comput. Sci. 1 (2012)
A. Omar, Deep learning-based intrusion detection model for industrial wireless sensor networks. J. Intell. Fuzzy Syst. (2020), In press
N.M. Mir, et al., An experimental evaluation of bayesian classifiers applied to intrusion detection. Indian J. Sci. Technol. 9 (12), 1–7 (2016)
Y. Zhao, X. Huichun, A different perspective for management of diabetes mellitus: controlling viral liver diseases. J. Diabetes Res. 2017 (2017)
D.J. McLernon, et al., The utility of liver function tests for mortality prediction within one year in primary care using the algorithm for liver function investigations (ALFI). PLoS One 7 (12), e50965 (2012)
C. Kalaiselvi, G.M. Nasira, A new approach for diagnosis of diabetes and prediction of cancer using ANFIS, in 2014 World Congress on Computing and Communication Technologies (IEEE, 2014)
S. Sontakke, L. Jay, D. Reshul, Diagnosis of liver diseases using machine learning, in 2017 International Conference on Emerging Trends & Innovation in ICT (ICEI) (IEEE, 2017)
M. Jain, et al., Incidence and risk factors for mortality in patients with cirrhosis awaiting liver transplantation. Indian J. Transplant. 13 (3), 210 (2019)
A. Ifra, A.B. Muheet, Z. Majid, S. Jahangir Sidiq, A comparative study of various data mining algorithms for effective liver disease diagnosis: a decade review from 2010 to 2019. 6 (1), 980–995 (2019)
Diseases and Conditions, Apollo Hospitals, https://www.apollohospitals.com/patient-care/health-and-lifestyle/diseases-and-conditions
A. Koch, Schiff’s diseases of the liver—10th edition. J. Am. Coll. Surg. (2007)
MedlinePlus, U.S. National Library of Medicine, https://medlineplus.gov/lab-tests/liver-function-tests/
Liver Function Test, https://www.webmd.com/hepatitis/liver-function-test-lft
Y. Zhao, et al., Management of diabetes mellitus in patients with chronic liver diseases. J. Diabetes Res. 2019 (2019)
The Hidden Risk of Liver Disease From Diabetes, WebMD, https://www.webmd.com/diabetes/diabetes-liver-disease-hidden-risk
S. Wild, et al., Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care 27 (5), 1047–1053 (2004)
G. Melli, A Lazy Model-Based Approach to On-Line Classification (Simon Fraser University, 1998)
P.D. Turney, Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm. J. Artif. Intell. Res. 2 , 369–409 (1994)
N. Ye, X. Li, A scalable, incremental learning algorithm for classification problems. Comput. Ind. Eng. 43 (4), 677–692 (2002)
L. Ozyilmaz, Y. Tulay, Artificial neural networks for diagnosis of hepatitis disease, in Proceedings of the International Joint Conference on Neural Networks , vol. 1 ( IEEE, 2003)
Z.-H. Zhou, Y. Jiang, NeC4.5: neural ensemble based C4.5. IEEE Trans. Knowl. Data Eng. 16 (6), 770–773 (2004)
K. Revett, et al., Mining a primary biliary cirrhosis dataset using rough sets and a probabilistic neural network, in 2006 3rd International IEEE Conference Intelligent Systems (IEEE, 2006)
E. Comak, et al., A new medical decision making system: least square support vector machine (LSSVM) with fuzzy weighting pre-processing. Expert. Syst. Appl. 32 (2), 409–414 (2007)
M. Neshat, et al., Fuzzy expert system design for diagnosis of liver disorders, in 2008 International Symposium on Knowledge Acquisition and Modeling (IEEE, 2008)
M. Rouhani, M. Motavalli Haghighi, The diagnosis of hepatitis diseases by support vector machines and artificial neural networks, in 2009 International Association of Computer Science and Information Technology-Spring Conference (IEEE, 2009)
İÖ Bucak, S. Baki, Diagnosis of liver disease by using CMAC neural network approach, Expert. Syst. Appl. 37 (9), 6157–6164 (2010)
L.M. Ming, L. Chu Kiong, L.W. Soong, Autonomous and deterministic supervised fuzzy clustering with data imputation capabilities. Appl. Soft Comput. 11 (1), 1117–1125 (2011)
B.V. Ramana, M.S. Prasad Babu, N.B. Venkateswarlu, Liver classification using modified rotation forest. Int. J. Eng. Res. Dev. 6 (1), 17–24 (2012)
S.N.N. Alfisahrin, T. Mantoro, Data mining techniques for optimization of liver disease classification, in 2013 International Conference on Advanced Computer Science Applications and Technologies (IEEE, 2013)
O.S. Soliman, E.A. Elhamd, Classification of hepatitis C virus using modified particle swarm optimization and least squares support vector machine. Int. J. Sci. Eng. Res. 5 (3), 122 (2014)
H. Ayeldeen, et al., Prediction of liver fibrosis stages by machine learning model: A decision tree approach, in 2015 Third World Conference on Complex Systems (WCCS) (IEEE, 2015)
M. Birjandi, et al., Prediction and diagnosis of non-alcoholic fatty liver disease (NAFLD) and identification of its associated factors using the classification tree method. Iran. Red Crescent Med. J. 18 (11) (2016)
M. Hassoon, et al., Rule optimization of boosted C5.0 classification using genetic algorithm for liver disease prediction, in 2017 International Conference on Computer and Applications (ICCA) (IEEE, 2017)
M.M. Islam, et al., Applications of machine learning in fatty liver disease prediction. MIE (2018)
M. Sato, et al., Machine-learning approach for the development of a novel predictive model for the diagnosis of hepatocellular carcinoma. Sci. Rep. 9 (1), 1–7 (2019)
S. Hashem, et al., Machine learning prediction models for diagnosing hepatocellular carcinoma with HCV-related chronic liver disease. Comput. Methods Programs Biomed. 105551 (2020)
R. Philip, M. Mathias, K.M. Damodara Gowda, Evaluation of relationship between markers of liver function and the onset of type 2 diabetes. J. Health Allied Sci. 4 (2), 90–93 (2014)
Q.M. Nguyen, et al., Elevated liver function enzymes are related to the development of prediabetes and type 2 diabetes in younger adults: the Bogalusa Heart Study. Diabetes Care 34 (12), 2603–2607 (2011)
H. Ni, H.H.K. Soe, A. Htet, Determinants of abnormal liver function tests in diabetes patients in Myanmar. Int J Diabetes Res 1 (3), 36–41 (2012)
D.H. Salih, Study of liver function tests and renal function Tests in diabetic type II patients. IOSR J. Appl. Chem 3 (3), 42–44 (2013)
K. Bora, et al., Presence of concurrent derangements of liver function tests in type 2 diabetes and their relationship with glycemic status: a retrospective observational study from Meghalaya. J. Lab. Physicians 8 (1), 30 (2016)
S. Ghimire, et al., Abnormal liver parameters among individuals with type 2 diabetes mellitus Nepalese population. Biochem Pharmacol (Los Angel) 7 (1), 2167-0501 (2018)
A. Singh, et al., Deranged liver function tests in type 2 diabetes: a retrospective study
G. Teshome, et al., Prevalence of liver function test abnormality and associated factors in type 2 diabetes mellitus: a comparative cross-sectional study. EJIFCC 30 (3), 303 (2019)
N. Alampally, D.S. Jaipuriar, A study on liver function impairment in type-2 diabetes mellitus. IJRAR-Int. J. Res. Anal. Rev. (IJRAR) 7 (1), 939–943 (2020)
Author information
Authors and Affiliations
Department of Computer Sciences, University of Kashmir, Srinagar, J&K, India
Ifra Altaf & Muheet Ahmed Butt
Directorate of IT&SS, University of Kashmir, Srinagar, J&K, India
Majid Zaman
Editor information
Editors and Affiliations
Maharaja Agrasen Institute of Technology, Delhi, India
Dr. Ashish Khanna
Department of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Rohini, Delhi, India
Dr. Deepak Gupta
Rajnagar Mahavidyalaya, Birbhum, India
Prof. Dr. Siddhartha Bhattacharyya
Faculty of Computers and Information, Cairo University, Giza, Egypt
Prof. Aboul Ella Hassanien
Department of Computer Science, Shaheed Sukhdev College of Business Studies, Rohini, India
Dr. Sameer Anand
Department of Computer Science, Shaheed Sukhdev College of Business Studies, Rohini, Delhi, India
Dr. Ajay Jaiswal
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper.
Altaf, I., Butt, M.A., Zaman, M. (2022). Disease Detection and Prediction Using the Liver Function Test Data: A Review of Machine Learning Algorithms. In: Khanna, A., Gupta, D., Bhattacharyya, S., Hassanien, A.E., Anand, S., Jaiswal, A. (eds) International Conference on Innovative Computing and Communications. Advances in Intelligent Systems and Computing, vol 1388. Springer, Singapore. https://doi.org/10.1007/978-981-16-2597-8_68
DOI : https://doi.org/10.1007/978-981-16-2597-8_68
Published : 01 September 2021
Publisher Name : Springer, Singapore
Print ISBN : 978-981-16-2596-1
Online ISBN : 978-981-16-2597-8

Journal of Healthcare Engineering, vol. 2022 (2022)

Prediction Model of Adverse Effects on Liver Functions of COVID-19 ICU Patients
Aisha Mashraqi
1 College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
Hanan Halawani
Turki Alelyani, Mutaib Mashraqi
2 Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Najran University, Najran, Saudi Arabia
Mohammed Makkawi
3 Faculty of Applied Medical Sciences, King Khalid University, Abha, Saudi Arabia
Sultan Alasmari
Asadullah Shaikh, Ahmad Alshehri
Associated Data
For the privacy of individuals (patients' laboratory results involved in the study), data cannot be made available publicly.
SARS-CoV-2 is a recently discovered virus that poses an urgent threat to global health. The disease caused by this virus is termed COVID-19. Death tolls in many countries continue to rise, leading to continued social distancing and lockdowns. Patients of all ages are susceptible to severe disease, in particular those admitted to an ICU. Machine learning (ML) predictive models based on medical data patterns are an emerging topic in areas such as the prediction of liver diseases. Prediction models that combine several variables or features to estimate the risk of infection or of a poor outcome from infection could assist medical staff in treating patients, especially those who develop organ failure such as liver failure. In this paper, we propose a model called the detecting model for liver damage (DMLD) that predicts the risk of liver damage in COVID-19 ICU patients. The DMLD model applies machine learning algorithms to assess the risk of liver failure based on patient data. To assess the DMLD model, the collected data were preprocessed and used as input for several classifiers: SVM, decision tree (DT), Naïve Bayes (NB), KNN, and ANN were tested for performance. SVM and DT performed best at predicting illness severity based on laboratory testing.
1. Introduction
The COVID-19 pandemic was declared a health emergency in 2020. Many people have died during the pandemic, particularly in the early stages, due to a lack of understanding of the virus. COVID-19 has led to over 3.5 million deaths worldwide [ 1 – 3 ]. Patients infected with COVID-19 may experience no symptoms or severe illness that can lead to death [ 4 ]. The virus continues to evolve, with concerning mutants emerging all over the world [ 5 ]. This is an alarming situation and requires a better understanding of the disease in order to save more lives. Critical cases of COVID-19 could result in organ failure and death. Lung failure is the most common complication, but other organs can also be affected by the virus. In fact, multiorgan failure involving the lungs, kidneys, liver, cardiovascular system, and gastrointestinal tract (GIT) can also occur [ 6 ]. Additionally, people who already suffer from liver diseases, such as cirrhosis, are at a higher risk of decompensation and death during COVID-19 infection [ 7 ]. Organ failure is serious; therefore, managing infection is of interest.
The liver is a vital organ, and its failure could be fatal. COVID-19 patients can have mild to severe symptoms and may develop acute hepatic failure [ 6 ]. According to the proposed mechanism, hepatic failure occurs due to multiple factors. These include angiotensin-converting enzyme 2 (ACE2), a SARS-CoV-2 receptor found in multiple organs including the liver, and cytokine storm, which occurs as a result of inflammatory mediators, endothelial dysfunction, coagulation abnormalities, and inflammatory cell infiltration into the organs [ 6 ]. Direct cytotoxicity caused by active virus replication in the liver could result in liver cell damage. Furthermore, hypoxic liver damage is exacerbated by severe lung failure and disease. Cardiac congestion as a result of SARS-CoV-2 disease-induced right-sided heart failure can also result in liver damage. Furthermore, people with preexisting liver disease, as well as drug-induced liver injury, experience exacerbation [ 8 ]. To avoid COVID-19 disease complications, it is critical to detect liver damage early and understand its extent.
The exact molecular mechanism of the above-mentioned hepatic injury is unknown. However, SARS-CoV-2 viral RNA has been detected in liver tissue using qRT-PCR, indicating that the virus can affect liver cells [ 9 ]. It is still unclear where virus replication occurs in the liver, but an intact virus was found in the cytoplasm of COVID-19 patients with abnormal liver function tests [ 10 ]. Viral receptors have been found on the surface of host cells, which could explain the viral tropism towards the specific tissue. SARS-CoV-2 enters the cell via the virus's S protein, which binds to host cell receptors such as ACE2 and TMPRSS2 [ 11 ]. The expression of ACE2 and TMPRSS2 receptors is low but still presents in the hepatic cells [ 12 ]. Moreover, it is a noteworthy finding that the expression of ACE2 receptors is increased in both humans and mice with liver fibrosis [ 13 ]. Interestingly, hypoxic cases were found to be associated with increased expression of ACE2 receptors, which could explain the mechanism of ACE2 receptor upregulation in COVID-19 patients due to lung damage [ 13 ].
A variety of factors in SARS-CoV-2 infection can result in hypoxia-induced liver damage. Heart failure, lung failure, and sepsis are the three most serious of these. These factors account for 90% of all cases of hypoxic damage in COVID-19 cases. Moreover, right-sided heart failure causes liver congestion due to raised central venous pressure (CVP). Hypoxia and liver congestion cause centrilobular necrosis over time [ 14 ]. Many known hepatotoxic agents have been used to treat COVID-19 disease. These drugs include corticosteroids and antivirals. Corticosteroids have been found to cause steatosis, and hepatotoxicity is caused by antivirals such as ritonavir and remdesivir [ 8 ].
Liver enzymes, which were found to be elevated in a number of COVID-19 cases, can be used to detect liver damage. Although the incidence of liver involvement has been reported in several COVID-19 cases, the extent of the prevalence of hepatic damage remains unknown [ 15 ]. Elevated liver enzymes, particularly alanine aminotransferase (ALT) and aspartate aminotransferase (AST), have been reported in 14% to 53% of patients [ 16 ]. There is a strong correlation between the severity of the disease and the extent of liver involvement [ 16 ]. According to research, mild COVID-19 disease causes a mild elevation of liver enzymes, whereas severe disease causes a significantly higher level of liver enzymes [ 16 , 17 ]. In a study of 222 COVID-19 patients, 28.2% had elevated liver enzymes. The reason for this elevation, however, was not specified, and it could have been preexisting [ 18 ]. Furthermore, a study of 417 COVID-19 patients discovered that 76.3% of the total sample had abnormal liver function tests. During their hospital stay, 21.5% suffered a liver injury. Their levels of liver enzymes significantly increased within two weeks of hospitalization. According to the findings of the study, patients with significantly elevated liver enzymes are at a higher risk of developing severe disease [ 19 , 20 ].
Machine learning (ML) is being introduced to medicine and used as artificial intelligence (AI) to create predictive models based on data patterns. Machine learning can also be used to create a predictive model of liver involvement [ 21 ]. Machine learning (ML) is currently being used to predict the possibility of fatty liver disease [ 22 ], the success of liver transplants [ 23 ], and other hepatic conditions. However, there is still no firm agreement on which machine learning algorithm is best to use as an illness-prediction method. The outcome of patients with raised liver enzymes admitted to the ICU with COVID-19 disease should be predicted using machine learning (ML), which could be useful in disease management.
Millions of people have died as a result of the SARS-CoV-2 virus, and more people are becoming infected every day. Elevated liver enzymes are linked to the severity of the illness, which can be fatal. Early detection of disease warning signs, on the other hand, can be beneficial. During COVID-19 disease, elevated liver enzymes are seen, and their level is related to the severity of the disease and the extent of liver damage. Therefore, monitoring of liver enzymes in ICU SARS-CoV-2 patients can be used to improve their health. Moreover, with the progress of machine learning toward improved screening methods for the severity of COVID-19 infection, the numbers of infected individuals have decreased significantly, motivating artificial intelligence (AI) scientists and medical physicians to employ this subject more thoroughly in the health sector. Algorithms in machine learning are developed to allow computers to learn. ML algorithms can be used for classification problems, which have been applied in the medical field to help in the early diagnosis of several diseases. However, there are specific difficulties with these computational methods, including the feature-selection step in prediction models. Other studies have used a different methodology for feature selection, such as a pivot table in [ 24 ] and a P-value in [ 25 ].
In this paper, we propose a model that predicts liver damage from data patterns using supervised learning techniques. The model, named the detecting model for liver damage (DMLD), employs machine learning algorithms to assist in the early detection of the risk of liver damage, supporting healthcare professionals in diagnosing the disease at an early stage. First, data from blood tests of COVID-19 patients admitted to the ICU were collected, cleaned, and prepared as input for the model. Second, we designed the DMLD model, which prepares the data set in the preprocessing phase by addressing missing values and applying normalization. The DMLD model then identifies the most relevant features in the feature-selection phase using a filtering method. Finally, five machine learning classifiers were examined to find the best-performing algorithms: support vector machine (SVM), decision tree (DT), Naïve Bayes (NB), K-nearest neighbors (KNN), and artificial neural network (ANN). Each method has certain drawbacks; for example, NB is simple and suitable for large data sets but assumes that numeric attributes follow a normal distribution. Data preparation is easier with the DT, but the result depends on the order of the attributes. KNN, SVM, and ANN are computationally expensive [26]. In our study, the performance of the DMLD model was evaluated on the collected data set, and the results show that the SVM and DT classifiers achieve better accuracy, precision, and recall than the others. We therefore consider SVM and DT the most suitable algorithms for detecting the risk of liver damage. Figure 1 illustrates the study framework.

Study framework.
The rest of the paper is structured as follows. First, we present the related work. Then, we explain the DMLD prediction model in detail, describing the data set details and the DMLD stages with the classification algorithms. Then, we present the results and discuss the performance of the DMLD model, including the measurement of classification techniques. Finally, we provide the conclusions and identify the future directions.
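The introduction compares classifiers by accuracy, precision, and recall. As a minimal sketch (not the authors' code), these three metrics can be computed directly from the binary confusion counts:

```python
def evaluation_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary classifier."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Example: 2 true positives, 1 false positive, 1 false negative, 1 true negative.
print(evaluation_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Precision penalizes false alarms, while recall penalizes missed at-risk patients; in a clinical screening setting the latter is usually the more costly error.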
2. Related Work
Machine learning approaches have attracted the attention of many researchers and have been applied in different disciplines such as medicine, the economy, and education. Moreover, machine learning plays an essential role in the medical field, contributing to various health sectors such as the early diagnosis of disease and treatment. Liver disease is a common health issue. Therefore, early diagnosis of the risk factors will help medical physicians predict the development of the disease [ 27 ].
Ayeldeen et al. [ 28 ] highlighted that the positive prediction of different stages of liver fibrosis can be predicted by biochemical markers. The decision tree algorithm has been considered to predict the risk of liver fibrosis, and the model has been tested using a data set that includes laboratory tests and fibrosis markers. Another study [ 29 ] compared the performance of different algorithms (logistic regression, KNN, ANN, and SVM) to assess liver disease detection. Additionally, Sontakke et al. [ 30 ] utilized backpropagation and SVM algorithms to predict liver disease. Thirunavukkarasu et al. [ 24 ] applied logistic regression, SVM, and KNN for predicting liver disease based on the evaluation of accuracy, sensitivity, and specificity (recall). Moreover, Venkata Ramana et al. [ 31 ] studied the performance of various machine learning algorithms using different metrics (accuracy, precision, sensitivity, and specificity).
A support vector machine (SVM) is considered a promising machine learning algorithm for classification problems. In addition, there are many studies that apply the SVM algorithm to text classification, face recognition, and bioinformatics. The performance of the SVM algorithm is often good compared to other techniques [ 32 – 34 ]. Another machine learning algorithm is the Naïve Bayes classifier, which is a simple probabilistic classifier applying Bayes' theorem. In addition, the Naïve Bayes classifier estimates the means and variances of the variables for classification using a small amount of training data [ 35 ]. Moreover, decision tree (DT) and K-nearest neighbors (KNN) are supervised learning algorithms considered suitable for addressing both classification and regression problems [ 36 – 38 ]. Another popular machine learning method is the artificial neural networks (ANN) that are inspired by the neural networks of the human brain [ 39 ].
Deep learning has grown rapidly in scientific computing, and its techniques are used across many fields to solve complicated problems. Deep learning algorithms employ various forms of neural networks to perform complex computations on massive amounts of data; it is a form of machine learning inspired by the structure and function of the human brain. Classification performance often improves the most when a machine learning pipeline is upgraded with a deep learning algorithm. Over the last few years, there has been considerable progress in the use of neural networks for feature extraction in object-identification problems. For example, Zhang et al. created Deep-IRTarget, a backbone network composed of a frequency feature extractor, a spatial feature extractor, and a dual-domain feature resource allocation model, to cope with challenges in feature extraction [40]. Deep learning has also been employed in burnt-area mapping using Sentinel-1/2 data [41]: Zhang et al. present a Siamese self-attention (SSA) classification approach for multisensor burnt-area mapping, with a multisource data set created at the object level for training and testing. Zhang et al. also implement a robust multicamera, multiplayer tracking framework, using a deep learning algorithm to study the impact of player identification and the most distinguishing data [42]. Furthermore, deep learning algorithms have been used to identify COVID-19 via X-ray processing; several studies [43–45] present rapid, robust, and practical methods for detecting COVID-19 from chest X-ray images. According to experiments by Mahajan et al. [43], DenseNet is the best classifier to use as a base network with SSD512, especially for identifying COVID-19 infection in chest X-ray images. Mahajan et al. [44] developed a model for detecting COVID-19 from chest X-ray images, using ResNet101 as the base network and implementing transposed convolution, prediction modules, and information injection into the DSSD network. Such artificial intelligence-based detection models can contribute significantly to large-scale, high-performing screening programs in various medical sectors.
3. Proposed Method
The main contribution of this study is the design of a prediction model to detect the risk of liver damage, called the detecting model for liver damage (DMLD).
3.1. Detecting Model for Liver Damage (DMLD)
In this study, we design a prediction model for adverse effects on liver functionality of COVID-19 ICU patients called detecting model for liver damage (DMLD). The methodology of this study involves five stages, which are data collection, data preprocessing, feature selection, classifiers, and evaluation and then result collection. Figure 2 illustrates the system architecture of the DMLD prediction model. Moreover, a detailed explanation of the DMLD model will be presented in the following subsections.

System architecture of DMLD prediction model.
3.1.1. Material
The data set used in this research was obtained from two main hospitals in the southern region of Saudi Arabia (Asir Central Hospital (ACH) in Asir and King Khalid Hospital in Najran). A total of 140 patients were included in the data set. The study was limited to patients with positive COVID-19 infection who were admitted to the intensive care unit (ICU). Ethical approval (REC No.: REC-11-1O-2020) for this study was obtained from the Regional Committee for Research Ethics, Directorate of Health Affairs, Asir Region, Ministry of Health, Saudi Arabia, and ethical approval (IRB Log Number: 2020-24E) for this study was obtained from the Regional Committee for Research Ethics, Directorate of Health Affairs Najran, Ministry of Health, Saudi Arabia.
The data set contains recent laboratory results, and missing values are very rare. The laboratory results contain 20 numeric attributes, including creatinine, glucose, sodium, potassium, calcium, phosphorus, magnesium, chloride, uric acid, urea, total protein, TG, AST, ALT, cholesterol-VLDL, cholesterol-LDL, cholesterol-HDL, and LDH. The class in this data set is binary, indicating whether or not a patient has liver function damage based on abnormal liver enzymes. Liver damage is very likely given elevated liver enzymes, which are released from the liver as a result of liver injury. SARS-CoV-2 has been reported to infect the liver by binding to angiotensin-converting enzyme 2 (ACE2) on cholangiocytes, a population of liver cells [46]. The binding of SARS-CoV-2 to ACE2 facilitates viral entry into the liver, damaging liver cells (hepatocytes) [46, 47]. Levels of ALT and AST, which are liver-specific enzymes, were significantly increased in our data, indicating liver injury. We identified liver damage based on the normal ranges of liver enzymes. Table 1 shows the liver enzymes along with their normal and abnormal values. Any patient with elevated liver enzyme levels is considered at risk of liver damage. In the study data set, the proportion of possible liver damage is 50%. Table 2 shows the data set attributes and the laboratory results used to examine the DMLD prediction model.
Specific liver enzymes with reference ranges.
Data set attributes.
3.1.2. Data Preprocessing
The aim of the data preprocessing phase is to clean the data set so that it can be used as input for the classifier algorithms and yield more accurate observations. One significant issue in collected real-world data is missing values. Here, missing values are very rare, at 4%; therefore, they were excluded from the data set. Another important aspect of data preprocessing is normalization, which gives all attributes equal weight; in simple words, it places them on a common scale or range. A popular and widely used normalization technique is min-max normalization, which is applied in this study. The min-max normalization technique transforms and rescales the data into the range [0, 1] by the following equation: x ′ = ( x − min F ) / (max F − min F ),
where min F and max F are the minimum and the maximum values of the feature F , respectively. The original and the normalized value of the attributes, F , are represented by x and x ′, respectively [ 48 ].
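As a concrete illustration, the rescaling described above can be sketched in a few lines of NumPy; the enzyme readings below are hypothetical values, not taken from the study data:

```python
import numpy as np

def min_max_normalize(x):
    """Rescale a 1-D feature vector to [0, 1]: x' = (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical enzyme readings (U/L) standing in for one attribute column.
alt = np.array([90.0, 34.0, 12.0, 499.0])
print(min_max_normalize(alt))  # all values now lie in [0, 1]
```

In practice the same transform is applied column by column so that large-range attributes such as LDH do not dominate small-range ones.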
3.1.3. Features Selection
The data collected from a blood test contain many different features carrying different information. Therefore, a feature-selection step is applied to reduce the feature set to the most relevant features; consequently, the size of the problem is reduced, and a better prediction of the risk of liver damage can be obtained. In this research, the filter method was followed to rank the importance of the k features in the data set based on the relationship between each feature and the target variable [ 49 ]. In addition, the correlation between the selected features was examined in order to understand the data set and the relationships between the features.
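A filter-style ranking of this kind can be sketched with scikit-learn's `SelectKBest`; the synthetic data below merely stands in for the 140-patient blood-test table and is not the study data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Hypothetical stand-in for the blood-test table: 140 samples, 20 numeric features.
X, y = make_classification(n_samples=140, n_features=20, n_informative=3,
                           random_state=0)

# Filter method: score each feature against the target, keep the top k.
selector = SelectKBest(score_func=f_classif, k=3).fit(X, y)
top = np.argsort(selector.scores_)[::-1][:3]
print("top-ranked feature indices:", top)
```

The scores depend only on the feature-target relationship, not on any classifier, which is what distinguishes filter methods from wrapper methods.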
3.1.4. Classifiers
In the DMLD model, five machine learning classifiers have been used, which are support vector machine (SVM), decision tree (DT), Naïve Bayes (NB), K-nearest neighbors (KNN), and artificial neural network (ANN). These classifiers were used to determine the risk of liver damage and the selection of these classifiers is based on the following characteristics.

Classification of data by support vector machine (SVM).
- Support vector machine (SVM): SVM represents the training data as pairs ( x i , y i ), where x i is a real vector and y i is the class to which x i belongs, either 1 or −1. The distance between the two classes y = 1 and y = −1 can be maximized by constructing a hyperplane, which is defined as follows: w · x − b = 0, (3)
- where w is the normal vector to the hyperplane and b /‖ w ‖ determines the hyperplane's offset from the origin along w .
- In an SVM model, tuning parameters help optimize the classification results for the specific data points provided [ 54 ]. One of these parameters is the kernel, a mathematical function that accepts data as input and transforms it into the required form. Kernel functions return the inner product between two points in a suitable feature space and may be linear, nonlinear, radial basis function (RBF), polynomial, or sigmoid.
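An SVM of this kind can be sketched with scikit-learn; the synthetic data and the 80/20 split below are illustrative assumptions, not the study's actual data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Hypothetical data standing in for the 140-patient table.
X, y = make_classification(n_samples=140, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# The kernel is a tuning parameter; 'rbf' is one common choice.
# Min-max scaling is applied first, matching the preprocessing step.
clf = make_pipeline(MinMaxScaler(), SVC(kernel="rbf", C=1.0)).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```

Swapping `kernel="rbf"` for `"linear"`, `"poly"`, or `"sigmoid"` exercises the other kernel choices listed above.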
- Decision Tree (DT): The decision tree classifier is considered a supervised learning algorithm [ 36 ]. Compared with other supervised learning algorithms, a decision tree algorithm can be used for dealing with both classification and regression problems. The overall perspective of using a DT is to create a preparation model that can predict class or assessment of target factors by taking decision standards derived from training data. The decision tree classifier can be a fast learner when constructing a decision/regression tree utilizing acquired information as the splitting criterion, and it prunes the tree by minimizing error pruning [ 37 ].
- Naïve Bayes (NB): A Naïve Bayes classifier is a classical probabilistic classifier based on applying Bayes' theorem under a strong independence assumption [ 35 ]. The underlying probability model is best described as an independent feature model. The basic assumption in the Naïve Bayes classifier is that the presence of a specific feature of a class is unassociated with the presence of other features [ 55 ]. Even when this assumption is not accurate, the Naïve Bayes classifier performs reasonably well. Another advantage of the Naïve Bayes classifier is that it requires only a small data set for the training stage in order to compute the means and variances of the variables essential for classification. For each label, only the variances of the variables need to be computed, not the whole covariance matrix, because the variables are assumed to be unassociated. The kernel Naïve Bayes operator can be formulated on numerical attributes. This is achieved by applying Bayes' theorem together with kernel density estimation: P̂( y = j | x 0 ) = π̂ j f̂ j ( x 0 ) / ∑ k=1 K π̂ k f̂ k ( x 0 ), (4)
- where π̂ j is an estimate of the prior probability of class j ; normally, π̂ j is the sample proportion falling into the j th class. f̂ j is the estimated density at x 0 based on a kernel density fit that includes only observations from the j th class. This is essentially similar to discriminant analysis, except that instead of assuming normality, it estimates the probability density of the classes using a nonparametric method.
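The per-class mean/variance bookkeeping described above can be sketched with scikit-learn's Gaussian variant; the tiny data set is hypothetical:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Tiny hypothetical example: two numeric features, binary class.
X = np.array([[1.0, 2.0], [1.2, 1.8], [8.0, 9.0], [7.5, 9.5]])
y = np.array([0, 0, 1, 1])

# GaussianNB stores one mean and one variance per feature per class --
# no covariance matrix, reflecting the independence assumption in the text.
nb = GaussianNB().fit(X, y)
print(nb.theta_)                 # per-class feature means
print(nb.predict([[1.1, 2.1]]))  # prints [0]
```

A kernel-density variant replaces the per-class Gaussian with a nonparametric density fit, but the Bayes-rule combination of priors and densities is the same.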
- K-nearest neighbors (KNN): In machine learning, KNN is one of the most fundamental classification algorithms, and it produces excellent results [ 36 ]. KNN is a nonparametric, instance-based learning algorithm and can be used to solve both classification and regression problems. In classification, KNN is used to determine which class a new unlabeled item belongs to. KNN operates on the assumption that similar samples lie close to each other [ 38 ]. KNN assigns a sample to the majority class among its K neighbors; K is usually chosen to be odd, which avoids tied votes in binary classification [ 56 ]. Classification is achieved by computing the distance between the new sample and the nearest data points using measures such as the Euclidean, Manhattan, Hamming, or Minkowski distance. In this study, the Euclidean distance metric was used in the final model for calculating the distance between data points. After the distances are calculated, the K closest neighbors are chosen, and the resulting class of the new object is determined by the votes of the neighbors [ 51 , 57 ].
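The neighbor-voting procedure can be sketched as follows, using the same Euclidean metric and an odd K; the points are hypothetical:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two hypothetical clusters of labeled points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])

# k = 3 (odd, to avoid ties); metric='euclidean' matches the study's choice.
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean").fit(X, y)
print(knn.predict([[0.15, 0.1], [5.1, 5.0]]))  # prints [0 1]
```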
- Artificial neural network (ANN): The functionality of an artificial neural network (ANN) is similar to that of the human brain [ 39 ]. It resembles a network of nodes known as artificial neurons. All of these nodes communicate with each other to transmit information. The neurons in the ANN can be represented by a state (0 or 1), and each node might have a weight attached to it that determines its relevance or strength in the system. The ANN structure is separated into layers with many nodes; data flow from the first layer (input layer) to the output layer after passing through intermediary levels (hidden layers). Every layer turns the data into relevant information before delivering the target output [ 58 ]. The processes of transfer and activation are crucial in the functioning of neurons. The sum of all the weighted inputs is calculated using the transfer function:
z = ∑ i w i x i + b , (5)

where b is the bias value, which in most cases is 1. Furthermore, the activation function essentially flattens the transfer function's output into a specified range. The activation function can be linear or nonlinear; in the simplest linear case it is expressed as follows: f ( z ) = z , (6)
Since the linear activation function places no restrictions on the output, the sigmoid function is employed [ 51 ], which is written as follows: f ( z ) = 1 / (1 + e^(− z )). (7)
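The transfer step (weighted sum plus bias) followed by the sigmoid activation can be sketched for a single neuron; the weights and inputs below are arbitrary illustrative values:

```python
import numpy as np

def neuron(x, w, b=1.0):
    """One artificial neuron: weighted-sum transfer, then sigmoid activation."""
    z = np.dot(w, x) + b             # transfer: sum of weighted inputs plus bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid squashes z into (0, 1)

x = np.array([0.5, -0.2, 0.1])  # hypothetical normalized inputs
w = np.array([0.4, 0.3, -0.5])  # hypothetical weights
print(neuron(x, w))             # a value strictly between 0 and 1
```

A full ANN chains layers of such neurons, with the output of one layer feeding the next.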
3.1.5. Evaluation
The proposed DMLD model's performance was evaluated by measuring the performance of several classification algorithms. Various evaluation metrics, such as accuracy, precision, and recall, are used. Their definitions are listed below.
Accuracy: The percentage of correct and valid classifications is known as the accuracy [ 59 ]. To calculate the accuracy, the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts are required: Accuracy = (TP + TN) / (TP + TN + FP + FN).
Precision: Positive predictive value is another term for precision. It shows the percentage of positive outcomes correctly predicted by the classifier: Precision = TP / (TP + FP).
Recall: Recall is also referred to as sensitivity or the true positive rate because it reflects how completely the method captures the positive outcomes [ 60 ]: Recall = TP / (TP + FN). The sensitivity evaluation determines the ability to correctly identify patients with the liver condition.
The evaluation variables that are used in the performance measurement, which is the confusion matrix, are determined as follows. True positive (TP): The outcome of the prediction properly identifies the presence of the risk of liver damage in a patient. False positive (FP): The outcome of the prediction mistakenly identifies a patient as having the risk of liver damage. True negative (TN): The outcome of the prediction properly rejects the possibility of a patient being at risk of liver damage. False negative (FN): The outcome of the prediction mistakenly rejects the possibility of a patient being at risk of liver damage.
Tenfold cross-validation is used to avoid the problems of over- and underfitting [ 61 ]. Then, the previous measurement performance is used to evaluate the classification systems' performance. Accuracy reflects how accurate our classifier is in determining whether or not a patient is at risk of liver damage. Precision also has been applied to measure the classifier's ability to make an accurate, positive prediction of the risk of liver damage. Additionally, sensitivity or recall is employed in our research to determine the percentage of actual positive cases of risk of liver damage that the classifier properly detects.
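The tenfold cross-validation with these three metrics can be sketched in scikit-learn; the synthetic data below is a hypothetical stand-in, and SVM is used only as an example classifier:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Hypothetical 140-sample, 20-feature data set.
X, y = make_classification(n_samples=140, n_features=20, random_state=0)

# Tenfold cross-validation scoring accuracy, precision, and recall at once.
scores = cross_validate(SVC(kernel="rbf"), X, y, cv=10,
                        scoring=("accuracy", "precision", "recall"))
for m in ("test_accuracy", "test_precision", "test_recall"):
    print(m, scores[m].mean().round(3))
```

Averaging over the ten folds gives a single figure per metric while every sample is used for both training and testing across folds.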
4. Results and Discussion
In this study, the DMLD model is proposed to contribute to the prediction of the risk of liver damage using laboratory blood tests. The DMLD model was implemented and examined in the Python 3.8 programming language via Anaconda Navigator [ 62 ]. In addition, different measurement metrics (accuracy, precision, and recall) were considered to assess the performance of the DMLD model. This was conducted using different machine learning classifiers to predict the risk of liver damage. Tenfold cross-validation was considered in order to validate the results. The data set in this study includes 140 COVID-19 ICU patients with 20 features, as shown in Table 1 . Normalization is used for scaling the data because the data set variables (e.g., ALT, AST, and LDH) have different ranges of values. For example, LDH for a single patient is 499 U/L, and ALT and AST are 90 U/L and 34 U/L, respectively. Therefore, we applied different normalization algorithms such as min-max and mean, but the results did not show any difference. After applying the feature-selection step in the DMLD model, the results revealed that the three highest-scoring features were AST, ALT, and LDH, as shown in Figure 4 . These selected features agreed with clinically reported features related to liver injury. ALT and AST are specific liver enzymes, and hence, they are considered markers for liver injury and failure [ 63 , 64 ]. Moreover, increased LDH levels have been reported in patients with acute liver failure [ 65 , 66 ]. Correlation coefficients of selected features were applied to screen for possible correlation. The linear relationship among selected features was defined as follows: positive correlation for r = 0.01 to 1.0 (where 1.0 was considered strong). As illustrated in Figure 5 , a heat map was used to present our results, in which ALT and AST showed a significant positive correlation with r = 0.96. 
This correlation between ALT and AST is not surprising, since they are already approved scientifically as liver function markers. However, in agreement with our selection of LDH as an important feature, the heat map results interestingly revealed a very strong correlation between LDH and both specific liver enzymes ALT and AST, with r = 0.94 and r = 0.97, respectively.
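The Pearson correlation matrix behind such a heat map can be sketched with pandas; the enzyme columns below are simulated to mimic the strong ALT-AST collinearity reported above, not drawn from the study data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical enzyme columns; AST and LDH are made strongly collinear
# with ALT to imitate the reported correlations.
alt = rng.normal(60, 20, 140)
df = pd.DataFrame({"ALT": alt,
                   "AST": alt * 0.9 + rng.normal(0, 5, 140),
                   "LDH": alt * 4.0 + rng.normal(0, 30, 140)})

corr = df.corr(method="pearson")  # the matrix a heat map would display
print(corr.round(2))
```

Plotting `corr` with any heat-map routine reproduces the kind of figure shown.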

Top selected features.

Heat map for checking the correlation between selected features.
Figure 6 describes the performance of the different classifiers used in the DMLD model: support vector machine (SVM), DT, Naïve Bayes (NB), K-nearest neighbors (KNN), and artificial neural network (ANN). In the validation phase, the model was tested using two different methods, namely the train-test split and tenfold cross-validation. In the train-test split approach, the data set was divided into two parts, training and testing. The DMLD model was trained with 80% of the data set, and the remaining data were used for testing, by which preliminary results were obtained. In addition, tenfold cross-validation was applied in order to avoid overfitting, as shown in Table 3 and Figure 6 .

Results of the classifier's performance on the DMLD model.
Evaluation parameters of different classifiers in the DMLD model.
Table 3 and Figure 6 show that the accuracy of SVM is 0.87 and that of DT is 0.85, while for the Naïve Bayes, KNN, and ANN, it is 0.71. Therefore, SVM and DT achieved higher accuracy than other classifiers (Naïve Bayes, KNN, and ANN). In addition, we tried to study the impact of different layers on the ANN performance by measuring the accuracy of the ANN algorithm, but the results showed no effect on the algorithm performance, as presented in Table 4 . Regarding precision, SVM achieved the highest score, with 0.95, and the score was 0.93 for DT. For Naïve Bayes, KNN, and ANN classifiers, the precision values were found to be 0.5, 0.5, and 0.49, respectively. The recall score of SVM was the highest, at 0.95, and this score was 0.93 for DT. For Naïve Bayes, KNN, and ANN classifiers, recall scores were 0.5, 0.5, and 0.49, respectively.
The impact of different layers on the ANN performance.
The performances of five classifiers in the DMLD model have been examined. From the above results, it can be noted that SVM and DT are the most suitable classifiers in the DMLD model for predicting the risk of liver damage in COVID-19 patients. In agreement with our study, SVM [ 30 ] and DT [ 28 ] have previously been utilized to predict liver disease. SVM showed the best performance, perhaps due to its ability to separate classes by generating a hyperplane that segregates them after data transformation. Therefore, early diagnosis of risk factors by machine learning models such as SVM could assist in planning medical decisions and treatment.
5. Conclusions and Future Work
The effects of COVID-19 on the body are widespread. The early diagnosis of liver damage due to COVID-19 can contribute to making medical decisions. Therefore, this study suggests that the DMLD model can help in the prediction of the risk of liver damage during SARS-CoV-2 infection. To evaluate the DMLD model, data on COVID-19 and ICU patients were collected, preprocessed, and then used as an input for different classifiers. The performances of SVM, DT, Naïve Bayes, KNN, and ANN classifiers were evaluated. SVM and DT showed the best performance for predicting the diagnosis of disease severity based on laboratory tests. Therefore, this model could be applied for the prediction of other diseases. The further study of our work can be considered from two directions. Firstly, the prediction of different risk levels of liver diseases could be extended, as the current work is limited to the DMLD model. Secondly, our data were limited to laboratory tests, and therefore future work could consider CT scan images.
Acknowledgments
The authors would like to express their gratitude to the Ministry of Education and the Deanship of Scientific Research, Najran University, Kingdom of Saudi Arabia, for their financial and technical support under code number NU/-/SERC/10/623.
Data Availability
Ethical Approval
Ethical approval (REC No.: REC-11-1O-2020) for this study was obtained from the Regional Committee for Research Ethics, Directorate of Health Affairs, Asir Region, Ministry of Health, Saudi Arabia, and ethical approval (IRB Log Number: 2020-24E) for this study was obtained from the Regional Committee for Research Ethics, Directorate of Health Affairs Najran, Ministry of Health, Saudi Arabia.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
- Research article
- Open Access
- Published: 16 November 2020
Survival prediction models since liver transplantation - comparisons between Cox models and machine learning techniques
- Georgios Kantidakis 1 , 2 , 3 ,
- Hein Putter 2 ,
- Carlo Lancia 1 ,
- Jacob de Boer 4 ,
- Andries E. Braat 4 &
- Marta Fiocco 1 , 2 , 5
BMC Medical Research Methodology volume 20 , Article number: 277 ( 2020 ) Cite this article
Predicting the survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine. Hence, improving on current prediction models is of great interest. Nowadays, there is a strong discussion in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data. Criticism of ML is related to unsuitable performance measures and a lack of interpretability, which is important for clinicians.
In this paper, ML techniques such as random forests and neural networks are applied to a large data set of 62294 patients from the United States, with 97 predictors selected on clinical/statistical grounds out of more than 600, to predict survival from transplantation. Of particular interest is also the identification of potential risk factors. A comparison is performed between 3 different Cox models (with all variables, backward selection and LASSO) and 3 machine learning techniques: a random survival forest and 2 partial logistic artificial neural networks (PLANNs). For PLANNs, novel extensions to their original specification are tested. Emphasis is given to the advantages and pitfalls of each method and to the interpretability of the ML techniques.
Well-established predictive measures are employed from the survival field (C-index, Brier score and Integrated Brier Score) and the strongest prognostic factors are identified for each model. Clinical endpoint is overall graft-survival defined as the time between transplantation and the date of graft-failure or death. The random survival forest shows slightly better predictive performance than Cox models based on the C-index. Neural networks show better performance than both Cox models and random survival forest based on the Integrated Brier Score at 10 years.
In this work, it is shown that machine learning techniques can be a useful tool for both prediction and interpretation in the survival context. From the ML techniques examined here, PLANN with 1 hidden layer predicts survival probabilities the most accurately, being as calibrated as the Cox model with all variables.
Trial registration
Retrospective data were provided by the Scientific Registry of Transplant Recipients under Data Use Agreement number 9477 for analysis of risk factors after liver transplantation.
Peer Review reports
Liver transplantation (LT) is the second most common type of transplant surgery in the United States after kidney [ 1 ]. Over the last decades, the success of liver transplants has improved survival outcome for a large number of patients suffering from chronic liver disease everywhere on earth [ 2 ]. Availability of donor organs is a major limitation especially when compared with the growing demand of liver candidates due to the enlargement of age limits. Therefore, improvement on current prediction models for survival since LT is important.
There is an open discussion about the value of machine learning (ML) versus statistical models (SM) within clinical and healthcare practice [ 3 – 7 ]. For survival data, the most commonly applied statistical model is the Cox proportional hazards regression model [ 8 ]. This model allows a straightforward interpretation, but is at the same time restricted to the proportional hazards assumption. On the other hand, ML techniques are assumption-free and data adaptive which means that they can be effectively employed for modelling complex data. In this article, the results between SM and ML techniques are assessed based on a 3-stage comparison: predictive performance for large sample size/large number of covariates, calibration (absolute accuracy) which is often neglected, and interpretability in terms of the most prognostic factors identified. Advantages and disadvantages for each method are detailed.
ML techniques need a precise set of operating conditions to perform well. It is important that a) the data have been adequately processed so that the inputs allow for good learning, b) modern method is applied using state-of-the-art programming software and c) proper tuning of the parameters is performed to avoid sub-optimal or default choices for parameters which downgrade the algorithm’s performance. Danger of overfitting is associated with ML approaches (as they employ complex algorithms). A note of caution is required during model training to prevent from overfitting, e.g. the selection of suitable hyper-parameters. Needless to say, overfitting might also occur with a traditional model if it is too complex (estimation of too many parameters) thus limiting generalizability outside training instances.
Neural networks have been commonly applied in healthcare. Consequently, different approaches for time-to-event endpoints are present in the literature. Biganzoli et al. proposed a partial logistic regression approach of feed-forward neural networks (PLANN) for flexible modelling of survival data [ 9 ]. By using the time interval as an input in a longitudinally transformed feed-forward network with logistic activation and an entropy error function, they estimated smoothed discrete hazards at each time interval in the output layer. This is a well-known approach for modelling survival neural networks [ 10 ]. In 2000, Xiang et al. [ 11 ] compared the performance of 3 existing neural network methods for right-censored data (the Faraggi-Simon [ 12 ], the Liestol-Andersen-Andersen [ 13 ] and a modification of the Buckley-James method [ 14 ]) with Cox models in a Monte Carlo simulation study. None of the networks outperformed the Cox models, and they performed only as well as Cox for some scenarios. Lisboa et al. extended the PLANN approach by introducing a Bayesian framework which can perform Automatic Relevance Determination for survival data (PLANN-ARD) [ 15 ]. Several applications of the PLANN and PLANN-ARD methods can be found in the literature [ 16 – 19 ]. They show potential for neural networks in systems with non-linearity and complex interactions between factors. Here, extensions of the PLANN approach for big LT data are examined.
The clinical endpoint of interest for this study is overall graft-survival defined as the time between LT and graft-failure or death. Predicting survival after LT is hard as it depends on many factors and is associated with donor, transplant and recipient characteristics whose importance changes over time and per outcome measure [ 20 ]. Models that combine donor and recipient characteristics have usually better performance for predicting overall graft-survival and particularly those that include sufficient donor risk factors have better performance for long-term graft survival [ 21 ]. The aims of this manuscript can be summarised as: i) potential role of ML as a competitor of traditional methods when complexity of the data is high (large sample size, high dimensional setting), ii) identification of potential risk factors using 2 ML methods (random survival forest, survival neural networks) complementary to the Cox model, iii) use of variable selection methods to compare their predictive ability with the models including the non-reduced set of variables, iv) evaluation of predictions and goodness of fit, and v) clinical relevance of the findings (potential for medical applications).
The paper is organized as follows. “ Methods ” section presents details about data collection and the imputation technique, SMs and ML. Further sections discuss model training, predictive performance assessment on test data, and details about interpretability of the models. Comparisons between models based on global performance measures, prediction error curves, variable importance and calibration plots are discussed in the “ Results ” section. The article is concluded by the “ Discussion ” section about findings, limitations of this work and future perspectives. All analyses were performed in R programming language version 3.5.3 [ 22 ]. Preliminary results were presented at 40th Annual Conference of the International Society for Clinical Biostatistics [ 23 ].
An analysis is presented on survival data after LT based on 62294 patients from the United States. Information was collected from the United Network of Organ Sharing (UNOS). After extensive pre-processing of a set of more than 600 covariates, 97 variables were included in the final dataset based on clinical and statistical considerations (see Additional file 1 ): 52 donor and 45 liver recipient characteristics (missing values were imputed). As the UNOS data is large in both number of observations and covariates, it is of interest to see how ML algorithms - which can naturally capture multi-way interactions between variables and can deal with big datasets - will perform compared to Cox models. The clinical endpoint is overall graft-survival (OGS), the time between LT and graft-failure or death. This endpoint was chosen for two reasons: 1) it is of primary interest for clinicians, and 2) it is the most appropriate outcome measure to evaluate the efficacy of LT, because it incorporates both patient mortality and survival of the graft [ 21 ].
This section is divided into different subsections including the necessary components of analyses for OGS (provided in “ Results ” section). We discuss in detail both Cox models and ML techniques (Random Survival Forest, Survival Neural Networks). Elements of how the models were trained and how the predictive performance was assessed on the test data are presented. More technical details are provided in the supplementary material. We conclude this extensive section with a focus on methods to extract interpretation for the ML approaches.
Data collection and imputation technique
UNOS manages the Organ Procurement and Transplantation Network (OPTN), and together they collect, organise and maintain statistical information regarding organ transplants in the Scientific Registry of Transplant Recipients (SRTR) database. SRTR gathers data from local Organ Procurement Organisations (OPO) and from OPTN (the primary source). It includes data from transplantations performed in the United States from 1988 onwards. This information is used to set priorities and seek improvements in the organ donation process.
The data provided by UNOS included 62294 patients who underwent LT surgery from 2005 to 2015 (project under DUA number 9477). Standard analysis files contained 657 variables for both donors and patients (candidates and recipients). Among these, 97 candidate risk factors - 52 donor and 45 patient characteristics - were pre-selected before carrying out the analysis. This resulted in a final dataset with 76 categorical and 21 continuous variables amounting to 2.2% missing data overall. The percentage of missing values for each covariate varied from 0 to 26.61% (no missing values for 26 covariates, up to 1% missingness for 51 covariates, 1 to 10% for 11 variables, 10 to 25% for 7 variables and 25 to 26.61% for only 2 variables). A complete-case analysis would reduce the available sample size from 62294 to 33394 patients, leading to a huge waste of data. Furthermore, this could lead to invalid results (underestimation or overestimation of survival) if the excluded group of patients represents a subgroup of the entire sample [ 24 ]. To reconstruct the missing values, the missForest algorithm [ 25 ] was applied for both continuous and categorical variables. This is a non-parametric imputation method that does not make explicit assumptions about the functional form of the data and builds a random forest model for each variable (500 trees were used). It predicts the missing values of each variable using the information in the observed values. It is the most exhaustive and accurate of the random forest algorithms used for missing-data imputation, because all possible variable combinations are checked as responses.
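missForest itself is an R package; a rough Python analogue of the same idea - fitting a random forest per variable and iteratively predicting the missing entries - can be sketched with scikit-learn's `IterativeImputer`. The data below is synthetic and the missingness rate is illustrative:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

# Synthetic numeric data with a few percent of entries set missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.05] = np.nan

# Round-robin imputation: each variable is modelled from the others
# with a random forest, analogous to missForest's per-variable forests.
imp = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=20, random_state=0),
    max_iter=5, random_state=0)
X_filled = imp.fit_transform(X)
print("remaining NaNs:", int(np.isnan(X_filled).sum()))  # prints 0
```

Unlike missForest, this sketch handles only numeric columns; categorical variables would need encoding or a classifier-based estimator.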
Cox proportional Hazard regression models
In survival analysis, the focus is on the time till the occurrence of the event of interest (here graft-failure or death). The Cox proportional hazards model is usually employed to estimate the effect of risk factors on the outcome of interest [ 8 ].
Data with sample size n consist of independent observations of the triple ( T , D , X ), i.e. ( t 1 , d 1 , x 1 ), ⋯ ,( t n , d n , x n ). For the i -th individual, t i is the survival time, d i the event indicator ( d i = 1 if the event occurred and d i = 0 if the observation is right censored) and x i is the vector of predictors ( x 1 , ⋯ , x p ). The hazard function of the Cox model with time-fixed covariates is as follows: h ( t | X ) = h 0 ( t ) exp( X β ), (1)
where h ( t | X ) is the hazard at time t given predictor values X, h 0 ( t ) is an arbitrary baseline hazard and β =( β 1 , ⋯ , β p ) is a parameter vector.
The corresponding partial likelihood can be written as: L ( β ) = ∏ i ∈ D [ exp( x i β ) / ∑ j ∈ R ( t i ) exp( x j β ) ], (2)
where D is the set of failures, and R ( t i ) is the risk set at time t i of all individuals who are still in the study at the time just before time t i . This function is then maximised over β to estimate the model parameters.
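The negative log of this partial likelihood can be sketched directly in NumPy (assuming no tied event times); the five-subject data set below is purely hypothetical:

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, time, event):
    """-log L(beta): for each event i, subtract
    x_i.beta - log(sum over the risk set R(t_i) of exp(x_j.beta))."""
    eta = X @ beta
    nll = 0.0
    for i in np.where(event == 1)[0]:
        at_risk = time >= time[i]  # individuals still in the study at t_i
        nll -= eta[i] - np.logaddexp.reduce(eta[at_risk])
    return nll

# Tiny hypothetical dataset: 5 subjects, 2 covariates.
X = np.array([[0.5, 1.0], [1.0, 0.0], [0.0, 0.5], [1.5, 1.0], [0.2, 0.3]])
time = np.array([2.0, 5.0, 3.0, 1.0, 4.0])
event = np.array([1, 0, 1, 1, 0])
print(cox_neg_log_partial_likelihood(np.array([0.1, -0.2]), X, time, event))
```

Maximising the partial likelihood corresponds to minimising this function over β, which a routine such as `scipy.optimize.minimize` can do; at β = 0 each event simply contributes the log of its risk-set size.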
Two other Cox models were employed 1) a Cox model with a backward elimination and 2) a penalised Cox regression with the Least Angle and Selection Operator (LASSO). Both models have been widely used for variable selection. We aim to compare these more parsimonious models versus a Cox model with all variables in terms of predictive performance. For the first, a numerically stable version of the backward elimination on factors was applied using a method based on Lawless and Singhal (1978) [ 26 ]. This method estimates the full model and computes approximate Wald statistics by computing conditional maximum likelihood estimates - assuming multivariate normality of estimates. Factors that require multiple degrees of freedom are dropped or retained as a group.
The latter approach uses a combination of selection and regularisation [ 27 ]. Denote the log-partial likelihood by ℓ ( β )= l o g L ( β ). The vector β is estimated via the criterion:

\(\hat {\beta } = \underset {\beta }{\arg \max }\; \ell (\beta) \quad \text {subject to} \quad \sum _{j = 1}^{p} |\beta _{j}| \leq s\)     (3)

with s a user-specified positive parameter. Equation ( 3 ) can also be rewritten in the equivalent Lagrangian form

\(\hat {\beta } = \underset {\beta }{\arg \max }\; \left \{ \ell (\beta) - \lambda _{LASSO} \sum _{j = 1}^{p} |\beta _{j}| \right \}\)     (4)
The quantity \(\sum _{j = 1}^{p} |\beta _{j}| \) is known as the L 1 -norm and regularises the log-partial likelihood. The term λ LASSO is a non-negative constant that determines the amount of penalisation: larger values of the parameter mean a larger penalty on the β j coefficients and greater shrinkage towards zero.
The tuning parameter s in Eq. ( 3 ), or equivalently λ LASSO in Eq. ( 4 ), is the controlling mechanism for the variance of the model. Higher values reduce the variance further but at the same time introduce more bias (bias-variance trade-off). To find a suitable value for this parameter, 5-fold cross-validation was performed to minimise the prediction error, here in terms of the cross-validated log-partial likelihood (CVPL) [ 28 ]

\(\text {CVPL}(\lambda) = \sum _{i = 1}^{n} \left [ \ell \left (\hat {\beta }_{(-i)}\right) - \ell _{(-i)}\left (\hat {\beta }_{(-i)}\right) \right ]\)     (5)

where ℓ (− i ) ( β ) is the partial log-likelihood of Eq. ( 2 ) when individual i is excluded and \(\hat {\beta }_{(-i)}\) is the value that maximises ℓ (− i ) ( β ). The term \(\ell (\hat {\beta }_{(-i)}) - \ell _{{(-i)}}\left (\hat {\beta }_{(-i)}\right)\) therefore represents the contribution of observation i .
Random forests for survival analysis
Random Survival Forests (RSFs) are an ensemble tree method for the survival analysis of right-censored data [ 29 ], adapted from random forests [ 30 ]. The main idea of random forests is to grow a series of decision trees - which can capture complex interactions but are notorious for their high variance - and average over the collection. In this way weak learners (the individual trees) are turned into a strong learner (the ensemble) [ 31 ].
For RSFs, randomness is introduced in two ways: by drawing \(\mathcal {B}\) bootstrap samples of patients (one per tree) and by selecting a random subset of variables for growing each node. While growing each survival tree, binary splitting is applied recursively per region (called a node) on a specific predictor, in such a way that the survival difference between daughter nodes is maximised and the difference within them is minimised. Splitting is terminated when a certain criterion is reached (these nodes are called terminal). The most commonly used splitting criteria are the log-rank test by Segal [ 32 ] and the log-rank score test by Hothorn and Lausen [ 33 ]. Each terminal node should have at least a pre-specified number of unique events. Combining information from the \(\mathcal {B}\) trees, survival probabilities and the ensemble cumulative hazard estimate can be calculated using the Kaplan-Meier and Nelson-Aalen methodology, respectively.
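The Nelson-Aalen cumulative hazard used for each terminal node can be sketched in a few lines (illustrative numpy implementation; a forest averages these node-level estimates over its \(\mathcal {B}\) trees):

```python
import numpy as np

def nelson_aalen(time, event):
    """Nelson-Aalen cumulative hazard H(t) at each unique event time."""
    event_times = np.unique(time[event == 1])
    H, cum = [], 0.0
    for u in event_times:
        at_risk = np.sum(time >= u)                   # still under observation at u
        deaths = np.sum((time == u) & (event == 1))   # events exactly at u
        cum += deaths / at_risk
        H.append(cum)
    return event_times, np.array(H)
```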
The fundamental principle behind each survival tree is the conservation of events. It is used to define ensemble mortality, a new type of predicted outcome for survival data derived from the ensemble cumulative hazard function (comparable to the prognostic index based on the Cox model). This principle asserts that the sum of estimated cumulative hazard estimate over time is equal to the total number of deaths, therefore the total number of deaths is conserved within each terminal node \(\mathcal {H}\) [ 29 ]. RSFs can handle both data with large sample size and vast number of predictors. Moreover, they can reach remarkable stability combining the results of many trees. However, combining an ensemble of trees downgrades significantly the intuitive interpretation of a single tree.
Survival neural networks
Artificial neural networks (ANNs) are a machine learning method able to model non-linear relationships between prognostic factors with great flexibility. These systems are inspired by biological neural networks and aim at imitating human brain activity [ 34 ]. An ANN has a layered structure and is based on a collection of connected units called nodes or neurons, which together comprise a layer. The input layer picks up the signals and passes them through transformation functions to the next layer, which is called “hidden”. A network may have more than one hidden layer, each connecting with the previous one and transmitting signals towards the output layer. Connections between artificial neurons are called edges. Artificial neurons and edges have a weight (connection strength) which is adjusted as learning proceeds and which increases or decreases the strength of the signal of each connection according to its sign. For the purpose of training, a target is defined, which is the observed outcome. The simplest form of a NN is the single-layer feed-forward perceptron with the input layer, one hidden layer and the output layer [ 35 ].
The application of NNs has been extended to survival analysis over the years [ 13 ]. Different approaches have been considered; some model the survival probability \(\mathcal {S}(t)\) directly or the unconditional probability of death \(\mathcal {F}(t)\) whereas other approaches estimate the conditional hazard h ( t ) [ 10 ]. They can be distinguished according to the method used to deal with the censoring mechanism. Some networks have k output nodes [ 36 ] - where k denotes k separate time intervals - while others have a single output node.
In this research, the method of Biganzoli was applied, which specifies a partial logistic feed-forward artificial neural network (PLANN) with a single output node [ 9 ]. This method uses the prognostic factors and the survival times as inputs to increase the predictive ability of the model. The data have to be transformed into a longitudinal format with the survival times divided into a set of k non-overlapping intervals (months or years) I k =( τ k −1 , τ k ], with 0= τ 0 < τ 1 < ⋯ < τ k a set of pre-defined time points. In this way, the time component of the survival data is taken into consideration. In the training data, each individual is repeated for the number of intervals he/she was observed in the study, and in the test data for all time intervals. PLANN provides the discrete conditional probability of dying \(\mathcal {P}\left (T \in I_{k} \mid T>\tau _{k-1}\right)\) using as transformation function of both input and output layers the logistic (sigmoid) function:

\(\sigma (\eta) = \frac {1}{1 + \exp (-\eta)}\)     (6)
where \(\eta = \sum _{i = 1}^{p}w_{i} X_{i}\) is the summed linear combination of the weights w i of input-hidden layer and the input variables X i ( i =1,2, ⋯ , p ).
The contribution to the log-likelihood for each individual is calculated all over the intervals one is at risk. The output node is a large target vector with 0 if the event did not occur and 1 if the event occurred in a specific time interval. Therefore, such a network first estimates the hazard for each interval h k = P ( τ k −1 < T ≤ τ k | T > τ k −1 ) and then \(S(t) = \prod _{k: t_{k} \leq t} (1 - h_{k})\) .
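The person-period transformation and the conversion of discrete hazards back to survival probabilities can be sketched as follows (yearly intervals with k_max = 10 as in this study; note that how a censored subject's last, partially observed interval is handled is a modelling choice, and this sketch simply includes it):

```python
import numpy as np

def to_long_format(time, event, k_max=10):
    """Transform (time, event) pairs into the person-period long format used
    by PLANN: one (interval, target) row per yearly interval I_k = (k-1, k]
    the subject was at risk in; target is 1 only where the event occurred."""
    rows = []
    for t, d in zip(time, event):
        last = min(int(np.ceil(t)), k_max)    # number of intervals at risk
        for k in range(1, last + 1):
            rows.append((k, 1 if (d == 1 and k == last) else 0))
    return rows

def survival_from_hazards(h):
    """S(t_k) = prod_{j <= k} (1 - h_j) from discrete conditional hazards."""
    return np.cumprod(1.0 - np.asarray(h))
```

In practice each row would also carry the subject's covariates and the interval index as an extra input variable, exactly as the PLANN specification requires.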
In this work, novel extensions in the specification of the PLANN are tested. Two new transformation functions were investigated for the input-hidden layer: the rectified linear unit (ReLU)

\(\text {ReLU}(\eta) = \max (0, \eta)\)     (7)

which is the most widely used activation function for NNs, and the hyperbolic tangent (tanh)

\(\tanh (\eta) = \frac {e^{\eta } - e^{-\eta }}{e^{\eta } + e^{-\eta }}\)     (8)
These functions can be seen as different modulators of the degree of non-linearity implied by the input and the hidden layer.
The PLANN was also expanded to 2 hidden layers with the same node size and identical activation functions for the input-hidden 1 and hidden 1-hidden 2 layers. The k non-overlapping intervals of the survival times were treated as k separate variables. In this way, the contribution of each interval to the predictions of the model can be obtained using the relative importance method by Garson [ 37 ] and its extension to 2 hidden layers (see “ Interpretability of the models ” section below and Additional file 1 ).

Model training
The split sample approach was employed; data was split randomly into two complementary parts, a training set (2/3) and a test set (1/3) under the same event/censoring proportions. To tune a model, 5-fold cross validation was performed in the training set for the machine learning techniques (and for Cox LASSO). Training data was divided into 5 folds. Each time 4 folds were used to train a model and the remaining fold was used to validate its performance and the procedure was repeated for all combination of folds. Tuning of the hyper-parameters was done using grid search and performance of final models was assessed on the test set. Analyses were performed in R programming language version 3.5.3 [ 22 ]. Package of implementation for RSFs and NNs as well as technical details regarding the choice of tuning parameters and the cross-validation procedure for each method are provided in Additional File 2 .
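The split-sample step can be sketched as a minimal Python illustration (the grouping preserves the event/censoring proportions as described; seed and fraction values are illustrative):

```python
import numpy as np

def stratified_split(event, test_frac=1/3, seed=42):
    """Random train/test split preserving the event/censoring proportions."""
    rng = np.random.default_rng(seed)
    test = []
    for cls in (0, 1):                         # censored / event strata
        idx = np.flatnonzero(event == cls)
        rng.shuffle(idx)
        test.append(idx[: int(round(len(idx) * test_frac))])
    test = np.concatenate(test)
    train = np.setdiff1d(np.arange(len(event)), test)
    return train, test
```

The same idea extends to the 5 cross-validation folds inside the training set (e.g. by splitting each stratum into 5 parts).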
Assessing predictive performance on test data
To assess the final predictive performance of the models the concordance index, the Brier score, and the Integrated Brier Score (IBS) were applied.
The most popular measure of model performance in a survival context is the concordance index (C-index) [ 38 ], which computes the proportion of pairs of observations for which the ordering of the survival times and of the model predictions is concordant, taking censoring into account. It typically takes values in the range 0.5 - 1, with higher values denoting a higher ability of the model to discriminate and 0.5 indicating no discrimination. The C-index cannot be defined for the neural network models, since it relies on ordering individuals according to prognosis and there is no unique ordering between the subjects: at one year individual i may have a better survival probability than individual j, but this could be reversed at a different time point.
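Harrell's C-index can be computed with a direct O(n²) pair count (a sketch; production implementations also handle ties in survival time and use more refined rules for censored pairs):

```python
def c_index(time, event, risk):
    """Harrell's concordance index (sketch). A pair (i, j) is comparable when
    t_i < t_j and subject i had an observed event; it is concordant when the
    shorter survival time received the higher predicted risk."""
    concordant = tied = comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i] == 1:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable
```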
The C-index provides a rank statistic between the observations that is not time-dependent. Following van Houwelingen and le Cessie [ 39 ], a time-dependent prediction error is defined as

\(Err(t_{0} \mid x) = \left (y - \hat {S}(t_{0}|x)\right)^{2}\)     (9)

where \(\hat {S}(t_{0}|x)\) is the model-based probabilistic prediction for the survival of an individual beyond t 0 given the predictor x , and y =1{ t > t 0 } is the actual observation ignoring censoring. The expected value with respect to a new observation Y new under the true model S ( t 0 | x ) can be written as:

\(E\left [Err(t_{0} \mid x)\right ] = S(t_{0}|x)\left (1 - S(t_{0}|x)\right) + \left (S(t_{0}|x) - \hat {S}(t_{0}|x)\right)^{2}\)     (10)
The Brier Score consists of two components: the “true variation” S ( t 0 | x )(1− S ( t 0 | x )) and the error due to the model \((S(t_{0}|x) - \hat {S}(t_{0}|x))^{2}\) . A perfect prediction is only possible if S ( t 0 | x )=0 or S ( t 0 | x )=1. In practice the two components cannot be separated since the true S ( t 0 | x ) is unknown.
To assess the performance of a prediction rule on actual data, observations censored before time t 0 must be taken into account. To calculate the Brier Score when censored observations are present, Graf proposed the use of inverse probability of censoring weighting [ 40 ]. An estimate of the average prediction error of the prediction model \(\hat {S}(t|x)\) at time t = t 0 is then

\(Err_{Score}(\hat {S}, t_{0}) = \frac {1}{n} \sum _{i = 1}^{n} \left (\mathbf {1}\{t_{i} > t_{0}\} - \hat {S}(t_{0}|x_{i})\right)^{2} \, \frac {\mathbf {1}\{d_{i} = 1 \text { or } t_{i} > t_{0}\}}{\hat {C}(\min (t_{i}-, t_{0}) | x_{i})}\)     (11)
In ( 11 ), \(\frac {1}{\hat {C}(\min (t_{i}-, t_{0}) | x_{i})} \) is a weighting scheme known as inverse probability of censoring weighting (IPCW) and \(Err_{Score}\) is the estimated Brier Score of the prediction model. It typically ranges from 0 to 0.25, with lower values meaning smaller prediction error.
Brier score is calculated at different time-points. An overall measure of prediction error is the Integrated Brier Score (IBS) which can be used to summarise the prediction error over the whole range up to the time horizon \(\int _{0}^{t_{hor}}Err_{Score}(\hat {S}, t_{0})dt_{0}\) (here t hor = 10 years) [ 41 ]. IBS provides the cumulative prediction error up to t hor at all available times ( t ∗ = 1, 2, ⋯ , 10 years) and takes values in the same range as the Brier score. In this study, we use IBS as the main criterion to evaluate the predictive ability of all models up to 10 years.
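A minimal sketch of the IPCW Brier score of Eq. (11), assuming an externally estimated censoring survival function `G` (e.g. a Kaplan-Meier estimate of the censoring distribution; with no censoring G ≡ 1):

```python
def brier_ipcw(t0, time, event, surv_pred, G):
    """Graf's IPCW estimate of the Brier score at time t0 (sketch).
    surv_pred[i] = model prediction S-hat(t0 | x_i); G(t) = censoring survival."""
    n = len(time)
    total = 0.0
    for i in range(n):
        if time[i] <= t0 and event[i] == 1:
            # observed event before t0: true status y = 0
            total += (0.0 - surv_pred[i]) ** 2 / G(time[i])
        elif time[i] > t0:
            # still under observation at t0: true status y = 1
            total += (1.0 - surv_pred[i]) ** 2 / G(t0)
        # subjects censored before t0 receive weight 0
    return total / n
```

Integrating this score numerically over a grid of time points up to t_hor (e.g. with the trapezoidal rule) yields the IBS used as the main criterion in this study.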
Interpretability of the models
Interpretation of models is of great importance for the medical community. It is well known that Cox models offer a straightforward interpretation through hazard ratios.
For neural networks with one hidden layer the connection weights algorithm by Garson [ 37 ] – later modified by Goh [ 42 ] – can provide information about the mechanism of the weights. The idea behind this algorithm is that inputs with larger connection weights produce greater intensities of signal transfer. As a result, these inputs will be more important for the model. Garson’s algorithm can be used to determine relative importance of each input variable, partitioning the weights in the network. Their absolute values are used to specify percentage of importance. Note that the algorithm does not provide the direction of relationships, so it remains uncertain whether the relative importance indicates a positive or a negative effect. For details about the algorithm see [ 43 ]. During this work, the algorithm was extended for 2 hidden layers to obtain the relative importance of each variable (for the implementation see algorithm 1 in Additional file 1 ).
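Garson's algorithm for one hidden layer can be sketched as follows (the weight matrices are illustrative; bias terms are ignored for simplicity, and only absolute weight magnitudes are used, which is why the sign of the effect is lost):

```python
import numpy as np

def garson_importance(W_ih, W_ho):
    """Garson's relative importance for a 1-hidden-layer network (sketch).
    W_ih: (p inputs x h hidden) weight matrix; W_ho: (h,) hidden-to-output."""
    # contribution of each input through each hidden node
    C = np.abs(W_ih) * np.abs(W_ho)[np.newaxis, :]
    # share of each input within every hidden node
    C = C / C.sum(axis=0, keepdims=True)
    imp = C.sum(axis=1)
    return imp / imp.sum()          # relative importances, summing to 1
```

The 2-hidden-layer extension used in this study chains the same partitioning through both hidden weight matrices (algorithm 1 in Additional file 1).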
Random survival forest relies on two methods which can provide interpretability: variable importance (VIMP) and minimal depth [ 44 ]. The former is associated with the prediction error before and after the permutation of a prognostic factor. Large importance values indicate variables with strong predictive ability. The latter is related to the forest topology as it assesses the predictive value of a variable by computing its depth compared to the root node of a tree. VIMP is more frequently reported than minimal depth in the literature [ 45 ]. For both methods interpretation is available only for variable entities and not for each variable level.
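Permutation-based VIMP can be sketched generically: permute one covariate at a time and record the increase in prediction error. Here `error_fn` stands in for any prediction-error evaluation of a fitted forest (e.g. 1 − C-index or a Brier score); the usage in the test is a toy illustration.

```python
import numpy as np

def permutation_vimp(error_fn, X, seed=0):
    """Permutation variable importance (sketch): rise in prediction error
    after breaking the link between each covariate and the outcome."""
    rng = np.random.default_rng(seed)
    base = error_fn(X)
    vimp = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])           # permute column j in place
        vimp[j] = error_fn(Xp) - base   # large positive value = important
    return vimp
```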
Results
Administrative censoring was applied to the UNOS data at 10 years. The median follow-up was 5.36 years (95% CI: 5.19 - 5.59 years), estimated with the reverse Kaplan-Meier method [ 46 ]. The clinical endpoint is overall graft-survival (OGS). Of the total number of patients, 69.1% were alive/censored and 30.9% experienced the event of interest (graft-failure or death). Three models from the Cox family were used to predict the survival outcome: a) a model with all 97 prognostic factors, b) a model with backward selection and c) a model based on the LASSO method for variable selection. Furthermore, 3 machine learning methods were employed: a) a random survival forest, b) a NN with one hidden layer and c) a NN with two hidden layers.
Comparisons between models
In this section a direct comparison of the 6 models is illustrated in terms of variable importance on the training set and predictive performance on the test set. Specification of the variables with dummy coding included 119 variable levels from the 97 potentially prognostic factors. For NNs - to apply and extend the methodology of Biganzoli - follow-up time was divided into 10 time intervals (0,1],(1,2], ⋯ , (9,10] denoting years since transplantation. For Cox models and RSF exact time points were used.
The Cox model assumes that each covariate has a multiplicative effect on the hazard function (constant over time). Estimating a model with 97 prognostic factors inevitably leads to a violation of the proportional hazards (PH) assumption for some covariates (17 out of 97 here). This means that the hazard ratios for those risk factors are mean effects on the outcome, which is still valuable information for the clinicians. Considering all possible non-linear effects or interactions leads to a complex model where too many parameters need to be estimated and interpretability becomes very difficult. On the other hand, ML techniques do not make any assumptions about the data structure, and their performance is therefore not affected by a violation of PH. The backward and the LASSO methods selected 28 predictors (out of 97) and 45 (out of 119 dummy coded), respectively. Selection of a smaller set of variables by Cox backward was expected, since it is a greedier (heuristic) method than LASSO penalised regression. The 12 most influential variables for the Cox model with all variables were selected by both methods (see Table 2 ). 5 of these variables: re-transplantation , donor type , log(Total cold ischemic time) , diabetes and pre-treatment status violated the PH assumption.
5-fold cross-validation in the training data resulted in the following optimal hyper-parameters combinations for the machine learning techniques:
For the Random Survival Forest nodesize = 50, mtry = 12, nsplit = 5 and ntree = 300. Stratified bootstrap sub-sampling of half the patients was used per tree (due to the large training time required).
For the neural network with 1 hidden layer activation function = “sigmoid” (for the input-hidden layer), node size = 85, dropout rate = 0.2, learning rate = 0.2, momentum = 0.9 and weak class weight = 1.
For the neural network with 2 hidden layers activation function = “sigmoid” (for the input-hidden 1 and the hidden 1-hidden 2 layers), node size = 110, dropout rate = 0.1, learning rate = 0.2, momentum = 0.9 and weak class weight = 1.
Global performance measures
The global performance measures on the test data are provided in Table 1 . Examining the Integrated Brier Score (IBS), the NNs with 1 and with 2 hidden layers have the lowest value (IBS = 0.180), followed by the RSF (IBS = 0.182). The Cox models have a comparable performance (IBS = 0.183). Therefore, the predictive ability of Cox backward and Cox LASSO is the same as that of the less parsimonious Cox model with all variables in terms of IBS. The best model in terms of C-index is the Random Survival Forest (0.622), while the Cox model with all variables has a slightly worse performance. The C-index for Cox backward and Cox LASSO is 0.615 and 0.614, respectively.
Stability of the networks was investigated by rerunning the same models on the test data; the NN with 1 hidden layer had a stable predictive performance and variable importance. In contrast, the NN with 2 hidden layers was quite unstable regarding variable importance. This behavior might be related to the vast number of weights that had to be trained for this model, which can lead to overfitting (in total 26621 connection weights were estimated for a sample size of 41530 patients in long format, versus 11136 connection weights for the NN with 1 hidden layer). The RSF obtained remarkable stability in terms of prediction error after a particular number of trees ( ntree = 300 was selected).
Prediction error curves
Figure 1 shows the average prediction Brier error over time for all models. Small differences can be observed between Cox models and RSF. The NNs with 1 hidden and with 2 hidden layers have almost identical evolution over time achieving better performance than the Cox models and the RSF.

Fig. 1 Prediction error curves for all models
Variable importance
In this section, the models are compared based on the most prognostic variables identified from the set of 97 predictors - 52 donor and 45 recipient characteristics. Hazard ratios of the 12 most prognostic variables for the Cox models are shown in Table 2 , ranked by the absolute z-score values for the Cox model with all variables. The strongest predictor is re-transplantation . Having been transplanted before increases the hazard of graft-failure or death by more than 55%. The other most detrimental variables are donor age and donor type circulatory dead . A one-unit increase in donor age raises the hazard by around 1%, while having received the graft from a circulatory-dead versus a brain-dead donor increases the hazard by more than 29% for all models. The remaining factors with an adverse effect are: cold ischemic time , diabetes , race , life-support , recipient age , incidental tumour , spontaneous hypertensive bleeding , serology status of HCV and intensive care unit before the operation .
In Table 3 the most prognostic factors for the machine learning techniques are presented. The top predictors are provided in terms of relative importance (Rel-Imp) for the PLANN models and in terms of variable importance (VIMP) for the RSF. For the NNs, the strongest predictor is re-transplantation (Rel-Imp 0.035 for 1 hidden and 0.028 for 2 hidden layers), which is the second strongest for the RSF (VIMP 0.009). According to the tuned RSF, the most prognostic factor for the overall graft-survival of the patient is donor age (VIMP 0.010).
Other strong prognostic variables for the NN with 1 hidden layer are life support (Rel-Imp 0.025), intensive care unit before the operation (Rel-Imp 0.023) and donor type circulatory dead versus brain-dead (Rel-Imp 0.023). For the NN with 2 hidden layers, other very prognostic variables are serology status for HCV (Rel-Imp 0.025), life support (Rel-Imp 0.024) and donor age (Rel-Imp 0.023).
For the RSF, other strong prognostic variables are life support (VIMP 0.007), serology status for HCV (VIMP 0.007) and intensive care unit before the operation (VIMP 0.006). Note that the variable total cold ischemic time , which was identified as the 4th most prognostic for the Cox model with all variables and the 10th most prognostic for the random survival forest, is not in the list of the 12 most prognostic variables for either NN.
Individual predictions
In this section, the predicted survival probabilities are compared for 3 new hypothetical patients and 3 patients from the test data.
In Fig. 2 a the patient with reference characteristics shows the best survival. The highest probabilities are predicted by the RSF and the lowest by the Cox model. The same pattern occurs for the patient that suffers from diabetes (orange lines). The patient with diabetes who has been transplanted before has the worst survival predictions. In this case the NN predicts the highest survival probabilities and the Cox model built using all the prognostic factors the lowest.

Fig. 2 a Predicted survival probabilities for 3 new hypothetical patients using the Cox model with all variables (solid lines), the tuned RSF (short dashed lines) and the tuned NN with 1 hidden layer (long dashed lines). The green lines correspond to a reference patient with the median values for the continuous and the mode value for categorical variables. The patient in the orange line has diabetes (the other covariates as in the reference patient). The patient in the red line has been transplanted before and has diabetes simultaneously (the other covariates as in the reference patient). Values for 10 prognostic variables for the reference patient are provided in Table 2 of Additional file 1 . b Predicted survival probabilities for 3 patients selected from the test data based on the Cox model with all variables (solid lines), the tuned RSF (short dashed lines) and the tuned NN with 1 hidden layer (long dashed lines). Green lines correspond to a patient censored at 1.12 years. The patient in the orange line was censored at 6.86 years. The patient in the red line died at 0.12 years. Values for 10 prognostic variables for the patients are provided in Tables 3-5 of Additional file 1
In Fig. 2 b the estimated survival probabilities are shown for 3 patients from the test set, based on the Cox model with all variables, the tuned RSF and the tuned PLANN with 1 hidden layer. The first patient shows the highest survival predictions from all 3 models; the RSF provides the highest survival probabilities and the NN the lowest. The second patient has lower survival probabilities (orange lines), whereas the third patient shows the lowest survival probabilities overall. For the second patient the NN predicts the lowest survival probabilities over time, and for the third the Cox model does.
In general, the random survival forest provides the most optimistic survival probabilities whereas the most pessimistic survival probabilities are predicted by either the Cox model or the NN (more often by the Cox model). This may be related to the characteristics of the methods as RSF relies on recursive binary partitioning of predictors, whereas Cox models imply linearity, and NNs fit non-linear relationships.
Calibration
Here 4 methods are compared: Cox model with all variables, RSF, PLANN 1 hidden and 2 hidden layers based on the calibration on the test data. For each method, the predicted survival probabilities at each year are estimated and the patient data are split into 10 equally sized groups based on the deciles of the probabilities. Then the survival probabilities along with their 95% confidence intervals are calculated using the Kaplan-Meier methodology [ 47 ].
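The grouping step can be sketched as follows. Note the simplification: the paper compares each group against Kaplan-Meier estimates, which properly account for censoring, whereas the raw observed fraction below does not.

```python
import numpy as np

def calibration_groups(pred_surv, observed_alive, n_groups=10):
    """Split subjects into equal-sized risk groups by predicted survival and
    return (mean predicted survival, observed fraction alive) per group."""
    order = np.argsort(pred_surv)                 # sort by predicted probability
    groups = np.array_split(order, n_groups)      # ~deciles
    return [(float(pred_surv[g].mean()), float(observed_alive[g].mean()))
            for g in groups]
```

A well-calibrated model yields pairs lying close to the diagonal when plotted against each other, as in Fig. 3.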
In Fig. 3 the results are shown at 2 years since LT. The Cox model with all variables and the PLANN with 1 hidden layer are both well calibrated. The RSF and the PLANN with 2 hidden layers tend to overestimate the survival probabilities for the patients at higher risk. The survival neural network with 1 hidden layer seems to be the most reliable ML technique for predictions. Calibration plots at 5 and 10 years can be found in Additional file 3 .

Fig. 3 Calibration plots at 2 years on the test data: a Cox model with all variables, b Random Survival Forest, c Partial Logistic Artificial Neural Network with 1 hidden layer, d Partial Logistic Artificial Neural Network with 2 hidden layers
With the rise of computational power and technology in the 21st century, more and more data have been collected in the medical field to identify trends and patterns, which allows building better allocation systems for patients and providing more accurate prognosis and diagnosis, as well as a more accurate identification of risk factors. During the past few years, machine learning (ML) has received increased attention in the medical area. For instance, in the area of LTs, graft failure or primary non-function might be predicted at decision time with ML methodology [ 48 ]. Briceño et al. created a NN process for donor-recipient matching specifying a binary classification survival output (recipient or graft survival) to predict 3-month graft mortality [ 49 ].
In this study, statistical and ML models were estimated for US patients after liver transplantation. The random survival forest performed better than the Cox models with respect to the C-index, which shows the ability of the model to discriminate between low- and high-risk groups of patients. The C-index was not estimated for the NNs because a natural ordering of the subjects is not feasible. Therefore, the Brier score was measured each year for all methods. The RSF showed similar results to the Cox models, with a slightly smaller total prediction error (in terms of IBS). The NNs generally performed better than the Cox models or the RSF and had very similar performance over time. RSF and survival NN are ML techniques which have a different learning method and model non-linear relationships between variables automatically. Both methods may be used in medical applications, but at present they should be applied as additional analyses for comparison.
Special emphasis was given to the interpretation of the models. An indirect comparison was performed to examine which are the most prognostic variables for the Cox model with all variables, the RSF and the NNs. The results showed that the Cox model with all variables (via absolute z-score values) and the NNs with one/two hidden layer(s) (via relative importance) identified similar predictors. Both methods identified re-transplantation as the strongest predictor, and donor age , diabetes , life support and race as relatively strong predictors. According to the RSF, the most prognostic variables were donor age , re-transplantation , life support and serology status of HCV . Aetiology and last serum creatinine were selected as the 7th and the 8th most prognostic. This raises a known concern about the RSF bias towards continuous variables and categorical variables with multiple levels [ 50 ] ( aetiology has 9 levels: metabolic, acute, alcoholic, cholestatic, HBV, HCV, malignant, other cirrhosis, other unknown). As continuous and multilevel variables incorporate a larger amount of information than categorical ones, they tend to be favoured by the splitting rule of the forest during binary partitioning. This bias was reflected in the variable importance results.
When comparing statistical models with machine learning techniques with respect to interpretability, Cox models offer a straightforward interpretation through the hazard ratios. On the contrary, for both neural networks and random survival forests the sign of the prediction is not provided (if the effect is positive or negative). Additionally, for NNs interpretation is possible for different variable levels (with the method of Garson and its extension), whereas for RSF only the total effect of a variable is shown. There is no common metric to directly compare Cox models with ML techniques in terms of interpretation. Future research in this direction is needed.
ML techniques are inherently based on mechanisms introducing randomisation and therefore very small changes are expected between different iterations of the same algorithm. To evaluate stability of performance, ML models were run several times under the same parametrisation. RSF were consistently stable after a certain number of trees (300 were selected). This was not the case for the NNs where instability is a common problem. It is challenging to tune a NN due to many hyper-parameter combinations available and the lack of a consistent global performance measure for survival data. IBS was used to tune the novel NNs, which may be the reason of instability for the NN with 2 hidden layers together with the large number of weights. Note also that the NN with 1 hidden layer is well calibrated whereas the NN with 2 hidden layers is less calibrated on the test data.
This is, to our knowledge, the first study in which ML techniques are applied to these transplant data and compared with the traditional Cox model. To construct the survival NN, the original problem had to be converted into a classification problem: the exact survival times were transformed into (at most) 10 time intervals denoting years since transplantation. For the Cox models and the RSF, on the other hand, the exact time to event was used. Recently, a new feed-forward NN has been proposed for omics data which calculates a proportional hazards model directly as part of the output node, using exact time information [ 51 ]. A survival NN with exact times may lead to better predictive performance. For the UNOS data, 69.1% of the recipients were alive/censored and 30.9% had the event of interest. The results above are based on these particular censoring and event percentages (for the NNs the percentages varied because of the reformulation of the problem).
It might be useful to investigate how the number of variables affects the performance of the models. Here, 97 variables were pre-selected for clinical and statistical reasons (e.g. availability before or during LT). It might be interesting to repeat the analyses on a smaller group of predictors; implementation time could then be reduced drastically, as the computational complexity depends on the sample size and the number of predictors. At the same time, predictive accuracy might increase, as removing noisy factors from the dataset strengthens the signal of the potentially prognostic variables.
Both traditional Cox models and PLANNs allow for the inclusion of time-dependent covariates. For PLANNs, each patient is replicated multiple times during the transformation of exact times into a set of k non-overlapping intervals in long format. Thus, different values of a covariate can be naturally incorporated to increase the predictive ability of the networks. It would be interesting to apply and compare the predictive ability of time-dependent Cox models and PLANNs to liver transplantation data including explanatory variables whose values change over time. Such extension to more dynamic methods may increase predictive performance and help in decision making.
Conclusions
There is increasing attention to ML techniques beyond SMs in the medical field, with methods and applications becoming more necessary than ever. These algorithmic approaches can uncover patterns in the data, promoting fast and accurate decision making. For time-to-event data, further ML techniques such as support vector machines and Bayesian networks may be applied for prediction. Moreover, deep learning with NNs is attracting more and more attention and will likely be another trend in the future for these complex data.
In this work, two machine-learning alternatives to the Cox model were discussed for medical data with a large sample size (62,294 patients) and many predictors (97 in total). RSF showed better performance than the Cox models with respect to the C-index, so it can be a useful tool for prioritising particular high-risk patients. NNs showed better predictive performance in terms of the Integrated Brier score. However, both ML techniques required a non-trivial implementation time. Cox models are preferable in terms of straightforward interpretation and fast implementation. Our study suggests that some caution is required when ML methods are applied to survival data. Both approaches can be used for exploratory and analysis purposes, as long as the advantages and disadvantages of the methods are presented.
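For reference, the C-index used to rank the models counts concordant pairs among comparable ones. A minimal Harrell's concordance computation on toy data can be sketched as below; this is an illustration only, not the study's implementation, which would rely on standard survival software:

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance index.

    A pair (i, j) is comparable when subject i has an observed event before
    time j. The pair is concordant when the subject who failed earlier has
    the higher predicted risk; ties in risk score count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy example: higher risk scores align with earlier failures
print(c_index([2, 4, 6], [1, 1, 0], [0.9, 0.5, 0.1]))  # -> 1.0
```

The Brier score, by contrast, assesses calibrated probability estimates at a fixed horizon rather than ranking, which is why a model can win on one metric and lose on the other.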
Availability of data and materials
The research data for this project is private. Unauthorized use is a violation of the terms of the Data Use Agreement with the U.S. Department of Health and Human Services. More information and instructions for researchers to request UNOS data can be found at https://unos.org/data/ . R-code developed to perform the analysis is available in Additional file 4 .
UNOS is a non-profit and scientific organisation in the United States which arranges organ donation and transplantation. For more information visit its website https://unos.org .
Dictionary for variables details is provided at: https://www.srtr.org/requesting-srtr-data/saf-data-dictionary/ .
Abbreviations
BS: Brier score
CVPL: cross-validated log-partial likelihood
DCD: Donor Circulatory Dead
HBV: chronic hepatitis B virus
HCV: chronic hepatitis C virus
IBS: Integrated Brier score
IPCW: Inverse Probability of Censoring Weighting
LASSO: least absolute shrinkage and selection operator
LT: liver transplantation
LUMC: Leiden University Medical Center
ML: machine learning
NN(s): artificial neural network(s)
OGS: overall graft-survival
OPO: Organ Procurement Organisations
OPTN: Organ Procurement and Transplantation Network
PLANN: partial logistic artificial neural network
PLANN-ARD: partial logistic artificial neural network with automatic relevance determination
PH: proportional hazards
RI: relative importance
RSF: random survival forest
SM: statistical model
SRTR: Scientific Registry of Transplant Recipients
UNOS: United Network for Organ Sharing
VIMP: variable importance
Grinyó JM. Why is organ transplantation clinically important? Cold Spring Harb Perspect Med. 2013; 3(6). https://doi.org/10.1101/cshperspect.a014985 .
Merion RM, Schaubel DE, Dykstra DM, Freeman RB, Port FK, Wolfe RA. The survival benefit of liver transplantation. Am J Transplant. 2005; 5(2):307–13. https://doi.org/10.1111/j.1600-6143.2004.00703.x .
Song X, Mitnitski A, Cox J, Rockwood K. Comparison of machine learning techniques with classical statistical models in predicting health outcomes. Stud Health Technol Inform. 2004; 107(Pt 1):736–40.
Deo RC. Machine learning in medicine. Circulation. 2015; 132(20):1920–30. https://doi.org/10.1161/CIRCULATIONAHA.115.001593 .
Shailaja K, Seetharamulu B, Jabbar MA. Machine learning in healthcare: A review. In: Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). Coimbatore: 2018. p. 910–4. https://doi.org/10.1109/ICECA.2018.8474918 .
Scott IA, Cook D, Coiera EW, Richards B. Machine learning in clinical practice: prospects and pitfalls. Med J Aust. 2019; 211:203–5. https://doi.org/10.5694/mja2.50294 .
Desai RJ, Wang SV, Vaduganathan M, Evers T, Schneeweiss S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw open. 2020; 3(1):1918962. https://doi.org/10.1001/jamanetworkopen.2019.18962 .
Cox DR. Regression models and life-tables. J Roy Stat Soc Ser B Methodol. 1972; 34(2):187–220.
Biganzoli E, Boracchi P, Mariani L, Marubini E. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Stat Med. 1998; 17(10):1169–86. https://doi.org/10.1002/(sici)1097-0258(19980530)17:10<1169::aid-sim796>3.0.co;2-d
Wang P, Li Y, Reddy CK. Machine learning for survival analysis: A survey. ACM Comput Surv. 2019; 51(6). https://doi.org/10.1145/3214306 .
Xiang A, Lapuerta P, Ryutov A, Buckley J, Azen S. Comparison of the performance of neural network methods and cox regression for censored survival data. Comput Stat Data Anal. 2000; 34(2):243–57. https://doi.org/10.1016/S0167-9473(99)00098-5 .
Faraggi D, Simon R. A neural network model for survival data. Stat Med. 1995; 14(1):73–82. https://doi.org/10.1002/sim.4780140108 .
Liestøl K, Andersen PK, Andersen U. Survival analysis and neural nets. Stat Med. 1994; 13(12):1189–200. https://doi.org/10.1002/sim.4780131202 .
Buckley J, James I. Linear regression with censored data. Biometrika. 1979; 66(3):429–36. https://doi.org/10.1093/biomet/66.3.429 .
Lisboa PJG, Wong H, Harris P, Swindell R. A bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer. Artif Intell Med. 2003; 28(1):1–25. https://doi.org/10.1016/S0933-3657(03)00033-2 .
Biganzoli E, Boracchi P, Marubini E. A general framework for neural network models on censored survival data. Neural Netw. 2002; 15(2):209–18. https://doi.org/10.1016/s0893-6080(01)00131-9 .
Biglarian A, Bakhshi E, Baghestani AR, Gohari MR, Rahgozar M, Karimloo M. Nonlinear survival regression using artificial neural network. J Probab Stat. 2013; 2013. https://doi.org/10.1155/2013/753930 .
Jones AS, Taktak AGF, Helliwell TR, Fenton JE, Birchall MA, Husband DJ, Fisher AC. An artificial neural network improves prediction of observed survival in patients with laryngeal squamous carcinoma. Eur Arch Otorhinolaryngol. 2006; 263(6):541–7. https://doi.org/10.1007/s00405-006-0021-2 .
Taktak A, Antolini L, Aung M, Boracchi P, Campbell I, Damato B, Ifeachor E, Lama N, Lisboa P, Setzkorn C, Stalbovskaya V, Biganzoli E. Double-blind evaluation and benchmarking of survival models in a multi-centre study. Comput Biol Med. 2007; 37(8):1108–20. https://doi.org/10.1016/j.compbiomed.2006.10.001 .
Blok JJ, Putter H, Metselaar HJ, Porte RJ, Gonella F, De Jonge J, Van den Berg AP, Van Der Zande J, De Boer JD, Van Hoek B, Braat AE. Identification and validation of the predictive capacity of risk factors and models in liver transplantation over time. Transplantation Direct. 2018; 4(9). https://doi.org/10.1097/TXD.0000000000000822 .
de Boer JD, Putter H, Blok JJ, Alwayn IPJ, van Hoek B, Braat AE. Predictive capacity of risk models in liver transplantation. Transplantation Direct. 2019; 5(6):457. https://doi.org/10.1097/TXD.0000000000000896 .
R: A Language and Environment for Statistical Computing. http://www.R-project.org/ .
Kantidakis G, Lancia C, Fiocco M. Prediction Models for Liver Transplantation - Comparisons Between Cox Models and Machine Learning Techniques [abstract OC30-4]: 40th Annual Conference of the International Society for Clinical Biostatistics; 2019, pp. 343–4. https://kuleuvencongres.be/iscb40/images/iscb40-2019-e-versie.pdf .
Van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med. 1999; 18(6):681–94. https://doi.org/10.1002/(SICI)1097-0258(19990330)18:6<681::AID-SIM71>3.0.CO;2-R .
Stekhoven DJ, Bühlmann P. Missforest-non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012; 28(1):112–8. https://doi.org/10.1093/bioinformatics/btr597 .
Lawless JF, Singhal K. Efficient screening of nonnormal regression models. Biometrics. 1978; 34(2):318–27. https://doi.org/10.2307/2530022 .
Tibshirani R. The lasso method for variable selection in the cox model. Stat Med. 1997; 16(4):385–95.
Verweij PJM, Van Houwelingen HC. Cross-validation in survival analysis. Stat Med. 1993; 12(24):2305–14. https://doi.org/10.1002/sim.4780122407 .
Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Stat. 2008; 2(3):841–60. https://doi.org/10.1214/08-AOAS169 .
Breiman L. Random forests. Mach Learn. 2001; 45(1):5–32. https://doi.org/10.1023/A:1010933404324 .
Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Springer; 2009. https://doi.org/10.1007/978-0-387-84858-7 .
Segal MR. Regression trees for censored data. Biometrics. 1988; 44(1):35–47.
Hothorn T, Lausen B. On the exact distribution of maximally selected rank statistics. Comput Stat Data Anal. 2003; 43(2):121–37. https://doi.org/10.1016/S0167-9473(02)00225-6 .
van Gerven M, Bohte S. Editorial: Artificial neural networks as models of neural information processing. Front Comput Neurosci. 2017; 11:114. https://doi.org/10.3389/fncom.2017.00114 .
Minsky M, Papert S. Perceptrons; an Introduction to Computational Geometry. (Book edition 1). Cambridge: MIT Press; 1969.
Lapuerta P, Azen SP, LaBree L. Use of neural networks in predicting the risk of coronary artery disease. Comput Biomed Res. 1995; 28(1):38–52. https://doi.org/10.1006/cbmr.1995.1004 .
Garson GD. Interpreting neural network connection weights. AI Expert. 1991; 6(4):46–51.
Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996; 15(4):361–87. https://doi.org/10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4 .
Van Houwelingen JC, Le Cessie S. Predictive value of statistical models. Stat Med. 1990; 9(11):1303–25. https://doi.org/10.1002/sim.4780091109 .
Graf E, Schmoor C, Sauerbrei W, Schumacher M. Assessment and comparison of prognostic classification schemes for survival data. Stat Med. 1999; 18(17-18):2529–45. https://doi.org/10.1002/(sici)1097-0258(19990915/30)18:17/18<2529::aid-sim274>3.0.co;2-5 .
van Houwelingen JC, Putter H. Dynamic Prediction in Clinical Survival Analysis. (Book edition 1). Boca Raton: CRC Press; 2012, p. 234.
Goh ATC. Back-propagation neural networks for modeling complex systems. Artif Intell Eng. 1995; 9(3):143–51. https://doi.org/10.1016/0954-1810(94)00011-S .
Olden JD, Jackson DA. Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks. Ecol Model. 2002; 154(1-2):135–50.
Ishwaran H, Kogalur UB, Gorodeski EZ, Minn AJ, Lauer MS. High-dimensional variable selection for survival data. J Am Stat Assoc. 2010; 105(489):205–17. https://doi.org/10.1198/jasa.2009.tm08622 .
Ishwaran H, Lu M. Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival. Stat Med. 2019; 38(4):558–82. https://doi.org/10.1002/sim.7803 .
Schemper M, Smith TL. A note on quantifying follow-up in studies of failure time. Control Clin Trials. 1996; 17(4):343–6. https://doi.org/10.1016/0197-2456(96)00075-x .
Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958; 53(282):457–81. https://doi.org/10.2307/2281868 .
Lau L, Kankanige Y, Rubinstein B, Jones R, Christophi C, Muralidharan V, Bailey J. Machine-learning algorithms predict graft failure after liver transplantation. Transplant. 2017; 101(4):125–32. https://doi.org/10.1097/TP.0000000000001600 .
Briceño J, Cruz-Ramírez M, Prieto M, Navasa M, De Urbina JO, Orti R, Gómez-Bravo MN, Otero A, Varo E, Tomé S, Clemente G, Bañares R, Bárcena R, Cuervas-Mons V, Solórzano G, Vinaixa C, Rubín N, Colmenero J, Valdivieso A, Ciria R, Hervás-Martínez C, De La Mata M. Use of artificial intelligence as an innovative donor-recipient matching model for liver transplantation: Results from a multicenter spanish study. J Hepatol. 2014; 61(5):1020–8. https://doi.org/10.1016/j.jhep.2014.05.039 .
Loh W-Y, Shih Y-S. Split selection methods for classification trees. Stat Sin. 1997; 7:815–40.
Ching T, Zhu X, Garmire LX. Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput Biol. 2018; 14(4). https://doi.org/10.1371/journal.pcbi.1006076 .
Acknowledgements
The authors would like to thank the United Network for Organ Sharing (UNOS) and the Scientific Registry of Transplant Recipients (SRTR) for providing the data about liver transplantation to Leiden University Medical Center (LUMC) under DUA number 9477.
Georgios Kantidakis’s work as a Fellow at EORTC Headquarters was supported by a grant from the EORTC Soft Tissue and Bone Sarcoma Group and Leiden University as well as from the EORTC Cancer Research Fund (ECRF). The funding sources had no role in the design of the study and collection, analysis, and interpretation of data or preparation of the manuscript.
Author information
Authors and Affiliations
Mathematical Institute (MI) Leiden University, Niels Bohrweg 1, Leiden, 2333 CA, the Netherlands
Georgios Kantidakis, Carlo Lancia & Marta Fiocco
Department of Biomedical Data Sciences, Section Medical Statistics, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, The Netherlands
Georgios Kantidakis, Hein Putter & Marta Fiocco
Department of Statistics, European Organisation for Research and Treatment of Cancer (EORTC) Headquarters, Ave E. Mounier 83/11, Brussels, 1200, Belgium
Georgios Kantidakis
Department of Surgery, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, the Netherlands
Jacob de Boer & Andries E. Braat
Trial and Data Center, Princess Máxima Center for pediatric oncology (PMC), Heidelberglaan 25, Utrecht, 3584 CS, the Netherlands
Marta Fiocco
Contributions
JDB and AEB requested the data to the Scientific Registry of Transplant Recipients (SRTR) and provided clinical input. GK, HP, CL and MF designed the models. GK carried out the statistical analysis. GK wrote the manuscript and HP, MF critically revised it. All authors read and approved the final version.
Corresponding author
Correspondence to Georgios Kantidakis .
Ethics declarations
Ethics approval and consent to participate.
The ethics committee of Leiden University Medical Center (LUMC) approved the study by sending a letter to JDB. For all patients written informed consent was provided to use the data for scientific research.
Consent for publication
The study was submitted to a functioning Institutional Review Board (IRB) for review and approval. Consent was provided for publication.
Competing interests
The authors declare that they have no competing interests. The data reported here have been supplied by the Minneapolis Medical Research Foundation (MMRF) as the contractor for the Scientific Registry of Transplant Recipients (SRTR). The interpretation and reporting of these data are the responsibility of the author(s) and in no way should be seen as an official policy of or interpretation by the SRTR or the U.S. Government.
This study used data from the Scientific Registry of Transplant Recipients (SRTR). The SRTR data system includes data on all donors, wait-listed candidates, and transplant recipients in the US, submitted by the members of the Organ Procurement and Transplantation Network (OPTN). The Health Resources and Services Administration (HRSA), U.S. Department of Health and Human Services, provides oversight of the activities of the OPTN and SRTR contractors.
Additional information
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The majority of this work was done at Leiden University
Supplementary Information
Additional file 1.
Includes Garson's algorithm for 2 hidden layers, a table with the relative importance of the time intervals for the neural networks with 1 and 2 hidden layers, detailed criteria for variable pre-selection, a plot of survival and censoring distributions, and 4 tables with individual patient characteristics.
Additional file 2
Provides information about the package to implement RSFs and NNs as well as technical parts regarding the choice of tuning parameters and the cross-validation procedure for each method. A figure illustrates the cross-validation procedure for RSF on a 3D space. References are provided for further reading.
Additional file 3
Contains calibration plots at 5 and 10 years for a) a Cox model with all prognostic factors, b) a Random Survival Forest with all prognostic factors, c) a Partial Logistic Artificial Neural Network with 1 hidden layer with all prognostic factors and d) a Partial Logistic Artificial Neural Network with 2 hidden layers with all prognostic factors.
Additional file 4
Provides the R code developed for the analyses of this project.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Kantidakis, G., Putter, H., Lancia, C. et al. Survival prediction models since liver transplantation - comparisons between Cox models and machine learning techniques. BMC Med Res Methodol 20 , 277 (2020). https://doi.org/10.1186/s12874-020-01153-1
Received : 11 April 2020
Accepted : 26 October 2020
Published : 16 November 2020
DOI : https://doi.org/10.1186/s12874-020-01153-1
- Random survival forest
- Neural networks
- Predictive performance
- Risk factors
- Post-transplantation
- Survival analysis
BMC Medical Research Methodology
ISSN: 1471-2288
( An International Open Access , Peer-reviewed, Refereed Journal )
( an international open access journal & and issn approved ), call for paper june 2023 last date 25 - june 2023, impact factor 7.376 (year 2021).

- Join as Reviewer
- Peer Review Editorial Policy
- Ph.D. Thesis Publication
- Paper Format
- How to Publish paper?
- Author's Guidelines
- Copyright Form
- Submit Paper
- Processing Charges
- Track Paper
- Impact Factor Details
- Publication Policy
- Ethics Policy
- New Proposal / Terms
- Special Issues
- Upcoming Conferences
- Volume 10 Articles
- Volume 9 Articles
- Volume 8 Articles
- Volume 7 Articles
- Volume 6 Articles
- Volume 5 Articles
- Volume 4 Articles
- Volume 3 Articles
- Volume 2 Articles
- Volume 1 Articles
Conference Alert
AICTE Sponsored National Conference on Smart Systems and Technologies
Last Date: 25th November 2021
SWEC- Management
LATEST INNOVATION’S AND FUTURE TRENDS IN MANAGEMENT
Last Date: 7th November 2021
Latest Publication
- Analysis And Design Of High Rise Building Using ... Paper ID : IJIRT160364
- Utilization of Refractory Casting Cement Waste (R... Paper ID : IJIRT160358
- IOT BASED FOREST SAFTEY ALERT SYSTEM ON DISASTERS ... Paper ID : IJIRT160350
- Formulation And Evaluation Aloe Cold Cream... Paper ID : IJIRT160336
- Deep Learning Instance Segmentation for Estimating... Paper ID : IJIRT160326
Go To Issue
Call for paper, volume 10 issue 1, last date for paper submitting for march issue is 25 june 2023.
IJIRT.org enables door in research by providing high quality research articles in open access market.
Send us any query related to your research on [email protected]
Social Media
Google verified reviews.

Contact Details
Telephone: 6351679790 Email: [email protected] Website: ijirt.org
- Peer Review Policy
- Ethics policy
Important Links
- Current Issue
- Submit Your Paper
Browse Rsearch papers
- Engineering Research Papers
- Pharmacy Research Papers
- literature Research Papers
- Management Research Papers
- Food Science Research Papers

IMAGES
VIDEO
COMMENTS
Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). ... The following section describes fatty liver prediction models that are likely to suit different scenarios. We focus on a basic model (model 1), which includes variables that are widely ...
Volume 167, 2020, Pages 1970-1980 Software-based Prediction of Liver Disease with Feature Selection and Classification Techniques Jagdeep Singh a , Sachin Bagga b , Ranjodh Kaur c Add to Mendeley https://doi.org/10.1016/j.procs.2020.03.226 Get rights and content Under a Creative Commons license open access
First Online: 05 January 2021 402 Accesses 1 Citations Part of the Lecture Notes in Electrical Engineering book series (LNEE,volume 702) Abstract Liver disease (LD) is a common disease in the world.
The main objective of this research paper is to provide a summarized review of literature with comparative results, which has been done for the detection and prediction of liver diseases with various machine learning algorithms using the liver function test data in order to make the analytical conclusions. ... In 2020, Somaya et al. developed ...
Liver Disease Prediction using Machine learning Classification Techniques Authors: Ketan Gupta University of the Cumberlands Nasmin Jiwani University of the Cumberlands Neda Afreen Jamia...
Prediction of Liver Diseases Based on Machine Learning Technique for Big Data Authors: Engy El-Shafeiy University of Sadat City Ali Ibrahim El-Desouky Mansoura University Sally Elghamrawy MISR...
In Human beings, Liver is the most primary part of the body that performs many functions including the production of Bile, excretion of bile and bilirubin, metabolism of proteins and carbohydrates, activation of Enzymes, Storing glycogen, vitamins, and minerals, plasma proteins synthesis and clotting factors. The liver easily gets affected due to intake of alcohol, pain killer tablets, food ...
In this paper, we thus used the Machine Learning method of Logistic Regression to predict liver disease in patients. Keywords - Liver Diseases, Logistic Regression, Machine Learning, Confusion Matrix, Cross-Validation. 1. Introduction The diagnosis of liver diseases in the early stages is a perplexing task as the symptoms are unnoticeable ...
Conclusions and Future Work. The effects of COVID-19 on the body are widespread. The early diagnosis of liver damage due to COVID-19 can contribute to making medical decisions. Therefore, this study suggests that the DMLD model can help in the prediction of the risk of liver damage during SARS-CoV-2 infection.
Nidhi Lal IIIT, Nagpur, Maharashtra Date Written: March 28, 2020 Abstract Liver Diseases are prevalent in India accounting for 2.4% of Indian deaths per year. According to the WHO, liver disease is one of the most common causes of death in India.
Medical diagnoses have important implications for improving patient care, research, and policy. For a medical diagnosis, health professionals use different kinds of pathological methods to make decisions on medical reports in terms of the patients' medical conditions. Recently, clinicians have been actively engaged in improving medical diagnoses. The use of artificial intelligence and ...
A brief overview of the dataset's characteristics is shown in Table 1. Table 1. Dataset Description. 2.2. Liver Disease Risk Prediction. Nowadays, clinicians and health carers exploit machine-learning models to develop efficient tools for the risk assessment of a disease occurrence based on several risk factors.
Abstract: There is a lot of data on patients who undergo medication or medical examinations at the hospital and this is information that must be extracted so that it can provide information for future improvement conditions, meaning that past data can be used as a prediction basis for liver disease in patients. This is very beneficial for medical personnel and also for patients if they ...
This paper is aboutto study the prediction of liver disease to produce better performance accuracy by comparingvarious mining data classi cation algorithms. 1. Introduction Liver is the second largest inside organ in the human body.
Predicting survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine. Hence, improving on current prediction models is of great interest.Nowadays, there is a strong discussion in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data.
I. INTRODUCTION Diagnosis of a disease is based on a doctor's knowledge and experience. However, under some circumstances, the prediction can be wrong, which leads to incorrect treatment to the patient.
IJIRTEXPLORE - Search Thousands of research papers. Call For Paper May 2023 Last Date 25 - May 2023 Impact Factor 7.376 (Year 2021) ISSN: 2349-6002 ESTD Year: 2014. UGC approved journal no 47859. ... prediction and diagnosis of liver disease using machine learning models ; Author(s): Vishnu Teja S Hingoli, Narendra G, Tejas SV, Predeep E ...