Mining large heterogeneous data sets in drug discovery

2009 
Background: Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. Objective: This paper provides a review of the publicly-available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them and discusses recent work in integrative data mining that looks for associations that pan multiple sources, including the use of Semantic Web techniques. Conclusion: The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    76
    References
    16
    Citations
    NaN
    KQI
    []