Towards a graph-theoretic approach to hybrid performance prediction from large-scale phenotypic data

2015 
High-throughput biological data analysis has attracted considerable interest over the last decade, driven by pioneering technologies that automatically generate large-scale datasets by performing millions of analytical tests daily. Here we present a new network-based approach for analyzing a high-throughput phenomic dataset collected on maize inbreds and hybrids by an automated phenotyping facility. Our dataset consists of 1600 biological samples from 600 different genotypes (200 inbred and 400 hybrid lines). On each sample, 141 phenotypic traits were observed for 33 days. We apply a graph-theoretic approach to address two important problems: (i) discovering meaningful patterns in the dataset and (ii) predicting hybrid performance, measured as biomass, from automatically collected phenotypic traits. We propose a modelling framework in which the prediction problem is transformed into finding the shortest path in a correlation-based network. Preliminary results show small but encouraging correlations between predicted and observed biomass. Extensions of the algorithm and applications of the modelling framework to other types of biological data are discussed.
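As a rough illustration of what "finding the shortest path in a correlation-based network" can look like in practice, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the network is constructed or how a path is turned into a biomass prediction, so everything here is an assumption: nodes are arbitrary profiles (for example traits or genotypes), an edge is drawn when the absolute Pearson correlation reaches a hypothetical `threshold`, the edge weight is taken as `1 - |r|`, and Dijkstra's algorithm is used for the path search. The function names `pearson`, `correlation_network`, and `shortest_path` are made up for this example.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_network(profiles, threshold=0.5):
    """Build an undirected weighted graph from named profiles.

    Assumption: two nodes are linked when |r| >= threshold, and the edge
    weight is 1 - |r|, so strongly correlated nodes are "close".
    """
    names = list(profiles)
    graph = {name: {} for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            r = abs(pearson(profiles[u], profiles[v]))
            if r >= threshold:
                # Clamp at 0 so rounding error cannot produce a negative weight.
                graph[u][v] = graph[v][u] = max(0.0, 1.0 - r)
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (distance, path) or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return math.inf, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

With a network built this way, `shortest_path(net, "trait_a", "biomass")` (hypothetical node names) would return the chain of most strongly correlated intermediates between the two nodes. How such a path is converted into a numeric biomass prediction is left open here, since the abstract does not describe that step.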