bletl - A Python Package for Integrating Microbioreactors in the Design-Build-Test-Learn Cycle

2021 
Microbioreactor (MBR) devices have emerged as powerful cultivation tools for tasks of microbial phenotyping and bioprocess characterization and provide a wealth of online process data in a highly parallelized manner. Such datasets are difficult to interpret in short time by manual workflows. In this study, we present the Python package bletl and show how it enables robust data analyses and the application of machine learning techniques without tedious data parsing and preprocessing. bletl reads raw result files from BioLector I, II and Pro devices to make all the contained information available to Python-based data analysis workflows. Together with standard tooling from the Python scientific computing ecosystem, interactive visualizations and spline-based derivative calculations can be performed. Additionally, we present a new method for unbiased quantification of time-variable specific growth rate [Formula] based on a novel method of unsupervised switchpoint detection with Student-t distributed random walks. With an adequate calibration model, this method enables practitioners to quantify time-variable growth rate with Bayesian uncertainty quantification and automatically detect switch-points that indicate relevant metabolic changes. Finally, we show how time series feature extraction enables the application of machine learning methods to MBR data, resulting in unsupervised phenotype characterization. As an example, t-distributed Stochastic Neighbor Embedding (t-SNE) is performed to visualize datasets comprising a variety of growth/DO/pH phenotypes. Practical ApplicationThe bletl package can be used to analyze microbioreactor datasets in both data analysis and autonomous experimentation workflows. Using the example of BioLector datasets, we show that loading such datasets into commonly used data structures with one line of Python code is a significant improvement over spreadsheet or hand-crafted scripting approaches. On top of established standard data structures, practitioners may continue with their favorite data analysis routines, or make use of the additional analysis functions that we specifically tailored to the analysis of microbioreactor time series. Particularly our function to fit cross-validated smoothing splines can be used for on-line signals from any microbioreactor system and has the potential to improve robustness and objectivity of many data analyses. Likewise, our random walk based [Formula] method for inferring growth rates under uncertainty, but also the time-series feature extraction may be applied to on-line data from other cultivation systems as well.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []