Genome and proteome annotation using automatically recognized concepts and functional networks.

2013 
Many tools have been developed for prediction of the function or disease association of genes and proteins, and this continues to be a highly active area of bioinformatics research. Typically, these methods predict which concepts should be annotated to genes or proteins, using terms from ontologies such as Gene Ontology (GO), largely overlooking other ontologies that are available. Here, we set out to broadly evaluate novel, automatically retrieved, gene-term annotations and identify those concepts of publicly available ontologies that can be predicted using a generalized tool for prediction of annotations. We identified terms that perform better than expected by chance using randomly generated gene sets and show that both manually curated terms in GO and automatically recognized terms can be used to develop reasonable predictive models. In all, we characterize terms in over 250 ontologies and identify more than 127,000 statistically significant terms that can be predicted on human genes.
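The comparison against randomly generated gene sets can be illustrated with an empirical p-value. The following is a minimal sketch only, not the authors' implementation: the function name `empirical_p_value`, the scoring function, the toy gene identifiers, and the per-gene weights are all hypothetical and stand in for whatever predictive performance measure is actually used for a term.

```python
import random


def empirical_p_value(observed_score, score_fn, all_genes, set_size,
                      n_random=1000, seed=0):
    """Estimate how often a random gene set scores at least as well as a term.

    Assumption: `score_fn` maps a list of genes to a single number where
    higher means better predictive performance. This is a placeholder for
    the real performance measure.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_random):
        # Draw a random gene set of the same size as the term's gene set.
        random_set = rng.sample(all_genes, set_size)
        if score_fn(random_set) >= observed_score:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Toy data (hypothetical): 100 genes with an arbitrary per-gene weight.
toy_genes = [f"G{i}" for i in range(100)]
toy_weights = {gene: i for i, gene in enumerate(toy_genes)}


def toy_score(gene_set):
    # Mean weight of the set; stands in for a model performance score.
    return sum(toy_weights[g] for g in gene_set) / len(gene_set)


p_high = empirical_p_value(1000.0, toy_score, toy_genes, set_size=10, n_random=200)
p_low = empirical_p_value(-1.0, toy_score, toy_genes, set_size=10, n_random=200)
```

In this sketch a term would be called statistically significant when its empirical p-value falls below a chosen threshold; with many terms tested across many ontologies, a multiple-testing correction would normally be applied as well.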