The authors, an artist, a mathematician and a biologist, describe their collaboration examining the potential of drawing to further the understanding of biological processes. As a case study, this article considers C.H. Waddington's powerful visual representation of the “epigenetic landscape,” whose purpose is to unify research in genetics, embryology and evolutionary biology. The authors explore the strengths and limitations of Waddington's landscape and attempt to transcend the latter through a collaborative series of exploratory images. Through careful description of this drawing process, the authors touch on its epistemological consequences for all participants.
Abstract Here we describe a dataset of freely available, readily processed, whole-body μCT-scans of 56 species (116 specimens) of Lake Malawi cichlid fishes that captures a considerable majority of the morphological variation present in this remarkable adaptive radiation. We contextualise the scanned specimens within a discussion of their respective ecomorphological groupings and suggest possible macroevolutionary studies that could be conducted with these data. We also describe a methodology to efficiently μCT-scan (on average) 23 specimens per hour, limiting scanning time and alleviating the financial cost whilst maintaining high resolution. We demonstrate the utility of this method by reconstructing 3D models of multiple bones from multiple specimens within the dataset. We hope this dataset will enable further morphological study of this fascinating system and permit wider-scale comparisons with other cichlid adaptive radiations.
Abstract Insects determine their body segments in two different ways. Short-germband insects, such as the flour beetle Tribolium castaneum, use a molecular clock to establish segments sequentially. In contrast, long-germband insects, such as the vinegar fly Drosophila melanogaster, determine all segments simultaneously through a hierarchical cascade of gene regulation. Gap genes constitute the first layer of the Drosophila segmentation gene hierarchy, downstream of maternal gradients such as that of Caudal (Cad). We use data-driven mathematical modelling and phase space analysis to show that shifting gap domains in the posterior half of the Drosophila embryo are an emergent property of a robust damped oscillator mechanism, suggesting that the regulatory dynamics underlying long- and short-germband segmentation are much more similar than previously thought. In Tribolium, Cad has been proposed to modulate the frequency of the segmentation oscillator. Surprisingly, our simulations and experiments show that the shift rate of posterior gap domains is independent of maternal Cad levels in Drosophila. Our results suggest a novel evolutionary scenario for the short- to long-germband transition, and help explain why this transition occurred convergently multiple times during the radiation of the holometabolan insects.
Author summary Different insect species exhibit one of two distinct modes of determining their body segments during development: they either use a molecular oscillator to position segments sequentially, or they generate segments simultaneously through a hierarchical gene-regulatory cascade. The sequential mode is ancestral, while the simultaneous mode has been derived from it independently several times during evolution. In this paper, we present evidence which suggests that simultaneous segmentation also involves an oscillator in the posterior of the embryo of the vinegar fly, Drosophila melanogaster.
This surprising result indicates that both modes of segment determination are much more similar than previously thought. Such similarity provides an important step towards explaining the frequent evolutionary transitions between sequential and simultaneous segmentation.
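As a purely illustrative aside, the qualitative behaviour of a damped oscillator can be sketched with a generic second-order linear system: the state keeps overshooting its resting value, but each swing is smaller than the last. This is a textbook toy and not the gap gene model analysed in the paper; the function names, the parameters `omega` and `zeta`, and the integration scheme are all assumptions made for the example.

```python
def simulate_damped_oscillator(x0=1.0, v0=0.0, omega=1.0, zeta=0.3,
                               dt=0.001, t_end=20.0):
    """Integrate x'' + 2*zeta*omega*x' + omega**2*x = 0 (semi-implicit Euler).

    Generic textbook oscillator, NOT the gap gene network; all parameter
    values are hypothetical.
    """
    x, v = x0, v0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        a = -2.0 * zeta * omega * v - omega ** 2 * x
        v += a * dt   # update velocity first ...
        x += v * dt   # ... then position, using the new velocity
        xs.append(x)
    return xs


def count_zero_crossings(xs):
    """Number of sign changes in the trajectory (a proxy for oscillation)."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)


if __name__ == "__main__":
    xs = simulate_damped_oscillator()
    print("zero crossings:", count_zero_crossings(xs))
    print("final displacement:", xs[-1])
```

With 0 < `zeta` < 1 the trajectory crosses zero repeatedly while its amplitude shrinks, which is the sense of "damped" used here.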
Pattern formation during development is a highly dynamic process. In spite of this, few experimental and modelling approaches take into account the explicit time-dependence of the rules governing regulatory systems. We address this problem by studying dynamic morphogen interpretation by the gap gene network in Drosophila melanogaster. Gap genes are involved in segment determination during early embryogenesis. They are activated by maternal morphogen gradients encoded by bicoid (bcd) and caudal (cad). These gradients decay on the same time scale as the establishment of the antero-posterior gap gene pattern. We use a reverse-engineering approach, based on data-driven regulatory models called gene circuits, to isolate and characterise the explicitly time-dependent effects of changing morphogen concentrations on gap gene regulation. To achieve this, we simulate the system in the presence and absence of dynamic gradient decay. Comparison between these simulations reveals that maternal morphogen decay controls the timing and limits the rate of gap gene expression. In the anterior of the embryo, it affects peak expression and leads to the establishment of smooth spatial boundaries between gap domains. In the posterior of the embryo, it causes a progressive slow-down in the rate of gap domain shifts, which is necessary to correctly position domain boundaries and to stabilise the spatial gap gene expression pattern. We use a newly developed method for the analysis of transient dynamics in non-autonomous (time-variable) systems to understand the regulatory causes of these effects. By providing a rigorous mechanistic explanation for the role of maternal gradient decay in gap gene regulation, our study demonstrates that such analyses are feasible and reveal important aspects of dynamic gene regulation which would have been missed by a traditional steady-state approach.
More generally, it highlights the importance of transient dynamics for understanding complex regulatory processes in development.
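The comparison between simulations with and without gradient decay can be illustrated with a deliberately minimal sketch: a single target gene driven by one maternal input that either stays constant (autonomous case) or decays exponentially (non-autonomous case). The equation, the sigmoid, and every parameter value below are assumptions chosen for illustration; they are not the fitted gene circuits used in the study.

```python
import math


def sigmoid(u):
    """Smooth saturating regulation function mapping any input to (0, 1)."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)


def simulate_expression(morphogen_decay_rate, m0=1.0, weight=2.0, bias=-1.0,
                        production=1.0, degradation=0.5, dt=0.01, t_end=10.0):
    """Euler-integrate da/dt = production*sigmoid(weight*m(t) + bias) - degradation*a.

    m(t) = m0 * exp(-morphogen_decay_rate * t) is the (hypothetical) maternal
    input; a decay rate of 0 gives the autonomous control case.
    """
    a = 0.0
    trajectory = [a]
    for step in range(int(round(t_end / dt))):
        t = step * dt
        m = m0 * math.exp(-morphogen_decay_rate * t)
        a += dt * (production * sigmoid(weight * m + bias) - degradation * a)
        trajectory.append(a)
    return trajectory


if __name__ == "__main__":
    static = simulate_expression(morphogen_decay_rate=0.0)
    decaying = simulate_expression(morphogen_decay_rate=0.3)
    print("final expression, static input:  ", round(static[-1], 3))
    print("final expression, decaying input:", round(decaying[-1], 3))
```

In this toy the decaying input never produces more expression than the static one. That matches the qualitative point that gradient decay limits expression rate, but it says nothing about the spatial effects described above.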
Abstract The study of pattern formation has greatly benefited from our ability to reverse-engineer gene regulatory network (GRN) structure from spatio-temporal quantitative gene expression data. Traditional approaches omit tissue morphogenesis, and focus on systems where the timescales of pattern formation and morphogenesis can be separated. In such systems, pattern forms as an emergent property of the underlying GRN and mechanistic insight can be obtained from the GRNs alone. However, this is not the case in most animal patterning systems, where patterning and morphogenesis are co-occurring and tightly linked. To address the mechanisms driving pattern formation in such systems we need to adapt our GRN inference methodologies to explicitly accommodate cell movements and tissue shape changes. In this work we present a novel framework to reverse-engineer GRNs underlying pattern formation in tissues undergoing morphogenetic changes and cell rearrangements. By integrating quantitative data from live and fixed embryos, we approximate gene expression trajectories (AGETs) in single cells and use a subset to reverse-engineer candidate GRNs using a Markov Chain Monte Carlo approach. GRN fit is assessed by simulating on cell tracks (live-modelling) and comparing the output to quantitative data-sets. This framework generates candidate GRNs that recapitulate pattern formation at the level of the tissue and the single cell. To our knowledge, this inference methodology is the first to integrate cell movements and gene expression data, making it possible to reverse-engineer GRNs patterning tissues undergoing morphogenetic changes.
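To give a flavour of Markov Chain Monte Carlo fitting in general, the following minimal sketch runs a random-walk Metropolis sampler that recovers a single rate parameter from synthetic data. It is not the framework described above: the one-parameter decay model, the Gaussian likelihood, the flat prior, and all names and values are assumptions made for illustration, and no cell tracks or AGETs are involved.

```python
import math
import random


def model(k, times):
    """Toy 'expression' curve: exponential decay with rate k."""
    return [math.exp(-k * t) for t in times]


def log_likelihood(k, times, data, sigma=0.05):
    """Gaussian log-likelihood (up to a constant) of the data given k."""
    return -sum((y - d) ** 2 for y, d in zip(model(k, times), data)) / (2 * sigma ** 2)


def metropolis(times, data, k_init=1.0, step=0.05, n_steps=5000, seed=0):
    """Random-walk Metropolis sampler over a single positive parameter k."""
    rng = random.Random(seed)
    k, ll = k_init, log_likelihood(k_init, times, data)
    chain = []
    for _ in range(n_steps):
        k_new = k + rng.gauss(0.0, step)
        if k_new > 0:  # flat prior on k > 0; other proposals are rejected
            ll_new = log_likelihood(k_new, times, data)
            # Accept with probability min(1, exp(ll_new - ll)).
            if ll_new >= ll or rng.random() < math.exp(ll_new - ll):
                k, ll = k_new, ll_new
        chain.append(k)
    return chain


if __name__ == "__main__":
    times = [0.5 * i for i in range(20)]
    data = model(0.5, times)          # synthetic, noise-free "measurements"
    chain = metropolis(times, data)
    posterior = chain[1000:]          # discard burn-in
    print("posterior mean of k:", sum(posterior) / len(posterior))
```

A real GRN inference would replace `model` with a simulation of the network and `k` with a vector of regulatory parameters; the accept/reject step stays the same in spirit.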
Abstract Machine learning approaches are becoming increasingly widespread and are now present in most areas of research. Their recent surge can be explained in part by our ability to generate and store enormous amounts of data with which to train these models. The requirement for large training sets is also responsible for limiting further potential applications of machine learning, particularly in fields where data tend to be scarce, such as developmental biology. However, recent research seems to indicate that machine learning and Big Data can sometimes be decoupled to train models with modest amounts of data. In this work we set out to train a CNN-based classifier to stage zebrafish tail buds at four different stages of development using small information-rich data sets. Our results show that two- and three-dimensional convolutional neural networks can be trained to stage developing zebrafish tail buds based on both morphological and gene expression confocal microscopy images, achieving in each case up to 100% test accuracy scores. Importantly, we show that high accuracy can be achieved with data set sizes of under 100 images, much smaller than the typical training set size for a convolutional neural net. Furthermore, our classifier shows that it is possible to stage isolated embryonic structures without the need to refer to classic developmental landmarks in the whole embryo, which will be particularly useful to stage in vitro 3D culture systems such as organoids. We hope that this work will provide a proof of principle that will help dispel the myth that large data set sizes are always required to train CNNs, and encourage researchers in fields where data are scarce to also apply ML approaches.
Author summary The application of machine learning approaches currently hinges on the availability of large data sets with which to train the models. However, recent research has shown that large data sets might not always be required.
In this work we set out to see whether we could use small confocal microscopy image data sets to train a convolutional neural network (CNN) to stage zebrafish tail buds at four different stages in their development. We found that high test accuracies can be achieved with data set sizes of under 100 images, much smaller than the typical training set size for a CNN. This work also shows that we can robustly stage the embryonic development of isolated structures, without the need to refer back to landmarks in the tail bud. This constitutes an important methodological advance for staging organoids and other in vitro 3D culture systems. This work proves that prohibitively large data sets are not always required to train CNNs, and we hope it will encourage others to apply the power of machine learning to their areas of study even if data are scarce.