A new local covariance matrix estimation for the classification of gene expression profiles in RNA-Seq data
2019
Background and Objective: Recent developments in next-generation sequencing (NGS) based on RNA sequencing (RNA-Seq) allow researchers to measure the expression levels of thousands of genes for multiple samples simultaneously. Many classification models have been proposed in the literature to analyze this kind of data. Most existing classifiers assume that genes are independent; however, this assumption is not realistic for real RNA-Seq classification problems. For this reason, other classification methods, which incorporate the dependence structure between genes into the model, have been proposed. qtQDA, proposed by Koçhan et al. [1], is one of those classifiers; it estimates the covariance matrix by maximum likelihood. Methods: In this study, we use another approach, based on the local dependence function, to estimate the covariance matrix used in the qtQDA classification model. We investigate the impact of different covariance estimates on RNA-Seq data classification. Results: The performance of the qtQDA classifier under the two covariance matrix estimates is compared on two real RNA-Seq data sets in terms of classification error rates. The results show that the local dependence function approach yields a better estimate of the covariance matrix and improves the performance of the qtQDA classifier. Conclusion: Incorporating an accurate covariance matrix into the classification model is a crucial step, particularly for cancer prediction. The local covariance matrix estimate allows researchers to classify cancer patients based on gene expression profiles more accurately. R code for the local dependence function is available at https://github.com/Necla/LocalDependence.
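To make the comparison described in the abstract concrete, the following is a minimal Python sketch of a quadratic discriminant classifier whose covariance estimator can be swapped, scored by classification error rate. It is an illustration under stated assumptions, not the authors' method: it omits the quantile transformation used by qtQDA and does not implement the local dependence function estimator (see the R repository for that). The function names (`mle_covariance`, `diagonal_covariance`, `fit_qda`, `predict_qda`, `error_rate`) are hypothetical, and the diagonal estimator merely stands in for the "independent genes" assumption mentioned above.

```python
import numpy as np

def mle_covariance(X):
    """Maximum-likelihood covariance estimate (divides by n, not n - 1)."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]

def diagonal_covariance(X):
    """Covariance under an independence assumption: variances only."""
    return np.diag(X.var(axis=0))

def fit_qda(X, y, cov_estimator):
    """Store (mean, covariance, prior) per class; the estimator is pluggable."""
    n = X.shape[0]
    model = {}
    for label in np.unique(y):
        Xk = X[y == label]
        model[label] = (Xk.mean(axis=0), cov_estimator(Xk), Xk.shape[0] / n)
    return model

def predict_qda(model, X):
    """Assign each row of X to the class with the largest quadratic score."""
    labels = list(model)
    scores = []
    for label in labels:
        mean, cov, prior = model[label]
        diff = X - mean
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every sample to the class mean.
        # Assumes cov is invertible, i.e. more samples than features here.
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        scores.append(-0.5 * logdet - 0.5 * maha + np.log(prior))
    return np.array(labels)[np.argmax(np.vstack(scores), axis=0)]

def error_rate(y_true, y_pred):
    """Fraction of misclassified samples."""
    return float(np.mean(y_true != y_pred))
```

A different covariance estimate is evaluated by passing another function as `cov_estimator` and comparing the resulting `error_rate` values on held-out samples. With thousands of genes and few samples, the plain maximum-likelihood estimate is singular, so this sketch only applies as written after reducing to a small set of selected genes.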