The quest for deus ex machina: harnessing the power of machine learning for synthetic biology

2020 
Machine learning is now an ever-present part of modern life and is increasingly used in synthetic biology as well. Studies that have successfully harnessed the pattern recognition abilities of machine learning algorithms range from the molecular level to large-scale production and include the prediction of transcription factor activity, enzyme expression balancing, gene annotation and the prediction of production parameters. However, a closer look reveals the lack of a standard for such published models, with no known guidelines on which metrics and analyses a publication should include. Studies often highlight the community's need for more data in machine-readable format, but how such a study should be conducted and analyzed to obtain a meaningful, robust, high-quality and predictive model has rarely been discussed. Here, we present a guideline specifically aimed at synthetic biologists who wish to use machine learning in their research, in particular on smaller datasets collected in-house. We discuss key aspects of how to evaluate and interpret a model’s performance, with a focus on regression, as well as common problems and pitfalls that arise during the workflow. Together with the increasing availability of vast datasets, the implementation of such guidelines can contribute to the drive for standardization and a stronger application of engineering principles in the synthetic biology community.