Multiple Sources Classification of Gene Position on Chromosomes Using Statistical Significance of Individual Classification Results

2011 
In data mining applications it is common to have more than one data source available to describe the same record. For example, in the biological sciences, the same genes may be characterized through many types of experiments. Which data source proves most reliable for prediction may depend on the record in question. For some records, pieces of information may be unavailable because an experiment has not yet been done, or certain types of inference may not be applicable, such as when a gene does not have a homologue in some species. We demonstrate how multi-classifier systems can allow classification in cases where any individual source is too scarce or unreliable to provide an accurate prediction model by itself. We propose a method to predict a class label using the statistical significance of individual classification results. We show that the proposed approach increases accuracy compared with conventional techniques in a problem related to gene mapping in wheat.
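
As a rough illustration of combining per-source results by statistical significance, the sketch below assumes each data source yields a predicted label together with a p-value, and that a source with no data for a record yields nothing. The function name `combine_by_significance`, the smallest-p-value rule, and the example labels are assumptions made for illustration; the abstract does not specify the actual combination rule.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (predicted label, p-value of that prediction).
Prediction = Tuple[str, float]


def combine_by_significance(
    predictions: Sequence[Optional[Prediction]],
) -> Optional[str]:
    """Return the label from the source whose prediction has the smallest p-value.

    Sources with no prediction for this record (None) are skipped.
    Returns None when no source produced a prediction.
    """
    available = [p for p in predictions if p is not None]
    if not available:
        return None
    # Assumed rule: the most significant individual result decides the label.
    label, _ = min(available, key=lambda p: p[1])
    return label


# One record described by three sources; the second source has no data.
example = [("1A", 0.20), None, ("3B", 0.01)]
print(combine_by_significance(example))  # prints "3B"
```

Skipping missing sources instead of imputing them mirrors the situation described above, where an experiment has not been done or an inference does not apply to a given gene.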