Estimation of risk factors associated with colorectal cancer: an application of knowledge discovery in databases

2016 
Colorectal cancer (CRC) is one of the leading causes of cancer death worldwide. The goal of this study is to predict important risk factors of CRC using knowledge discovery in databases (KDD) methods. The study used retrospective data from patients who had been diagnosed with CRC. Records dated between 1 January 2010 and 1 March 2014 were selected randomly from the Turgut Ozal Medical Centre databases. The study included 160 individuals: 80 patients admitted to the Department of Oncology and diagnosed with CRC, and 80 control subjects without CRC. The groups were matched for age and gender. We mined retrospective CRC data from large integrated health systems with electronic health records. Specific demographic and clinical variables, including calcium, hemoglobin, white blood cells, platelets, potassium, sodium, glucose, creatinine and total bilirubin, were used in multilayer perceptron (MLP) artificial neural network (ANN) modeling. In each group, 45 individuals (56.3%) are male and 35 (43.7%) are female. The mean age of the CRC patient and control groups is 58.6±13.0 years. Accuracy was 71.31% on the training dataset (n=122) and 81.82% on the testing dataset. Area under the curve (AUC) values for the training and testing datasets were 0.73 and 0.81, respectively. The suggested MLP ANN model identified calcium, creatinine, potassium, platelets, sodium, hemoglobin and total bilirubin as significant factors. Taken together, the suggested MLP ANN model might be used to estimate risk factors associated with CRC as an application of medical KDD.
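The two performance measures reported above, accuracy and AUC, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the six-subject toy data are hypothetical, and the 0.5 decision threshold is an assumption. The AUC here is computed in its rank form, as the probability that a randomly chosen CRC case receives a higher model score than a randomly chosen control.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve in its rank (Mann-Whitney) form.

    Counts, over all (case, control) pairs, how often the case has the
    higher score; tied scores count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CRC, 0 = control; scores stand in for model outputs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"AUC      = {auc(y_true, scores):.4f}")
```

As a check on scale, the reported training accuracy of 71.31% with n=122 is consistent with 87 correctly classified records (87/122 ≈ 0.7131).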