Abstract When grown on agar surfaces, microbes can produce distinct multicellular spatial structures called colonies, which contain characteristic sizes, shapes, edges, textures, and degrees of opacity and color. For over one hundred years, researchers have used these morphology cues to classify bacteria and guide more targeted treatment of pathogens. Advances in genome sequencing technology have revolutionized our ability to classify bacterial isolates and while genomic methods are in the ascendancy, morphological characterization of bacterial species has made a resurgence due to increased computing capacities and widespread application of machine learning tools. In this paper, we revisit the topic of colony morphotype on the within-species scale and apply concepts from image processing, computer vision, and deep learning to a dataset of 69 environmental and clinical Pseudomonas aeruginosa strains. We find that colony morphology and complexity under common laboratory conditions is a robust, repeatable phenotype on the level of individual strains, and therefore forms a potential basis for strain classification. We then use a deep convolutional neural network approach with a combination of data augmentation and transfer learning to overcome the typical data starvation problem in biological applications of deep learning. Using a train/validation/test split, our results achieve an average validation accuracy of 92.9% and an average test accuracy of 90.7% for the classification of individual strains. These results indicate that bacterial strains have characteristic visual ‘fingerprints’ that can serve as the basis of classification on a sub-species level. Our work illustrates the potential of image-based classification of bacterial pathogens and highlights the potential to use similar approaches to predict medically relevant strain characteristics like antibiotic resistance and virulence from colony data. 
Author Summary Since the birth of microbiology, scientists have looked at the patterns of bacterial growth on agar (colony morphology) as a key tool for identifying bacterial species. We return to this traditional approach with modern tools of computer vision and deep learning and show that we can achieve high levels of classification accuracy on a within-species scale, despite what is considered a ‘data-starved’ dataset. Our results show that strains of the environmental generalist and opportunistic pathogen Pseudomonas aeruginosa have a characteristic morphological ‘fingerprint’ that enables accurate strain classification via a custom deep convolutional neural network. Our work points to extensions towards predicting phenotypes of interest (e.g. antibiotic resistance, virulence), and suggests that sample size limitations may be less restrictive than previously thought for deep learning applications in biology, given appropriate use of data augmentation and transfer-learning tools.
The diversity of multicellular organisms is, in large part, due to the fact that multicellularity has independently evolved many times. Nonetheless, multicellular organisms all share a universal biophysical trait: cells are attached to each other. All mechanisms of cellular attachment belong to one of two broad classes; intercellular bonds are either reformable or they are not. Both classes of multicellular assembly are common in nature, having independently evolved dozens of times. In this review, we detail these varied mechanisms as they exist in multicellular organisms. We also discuss the evolutionary implications of different intercellular attachment mechanisms on nascent multicellular organisms. The type of intercellular bond present during early steps in the transition to multicellularity constrains future evolutionary and biophysical dynamics for the lineage, affecting the origin of multicellular life cycles, cell–cell communication, cellular differentiation, and multicellular morphogenesis. The types of intercellular bonds used by multicellular organisms may thus result in some of the most impactful historical constraints on the evolution of multicellularity.
The ribosomal RNA (rrn) operon is a key suite of genes related to the production of protein synthesis machinery and thus to bacterial growth physiology. Experimental evidence has suggested an intrinsic relationship between the number of copies of this operon and environmental resource availability, especially the availability of phosphorus (P), because bacteria that live in oligotrophic ecosystems usually have few rrn operons and a slow growth rate. The Cuatro Ciénegas Basin (CCB) is a complex aquatic ecosystem that contains an unusually high microbial diversity that is able to persist under highly oligotrophic conditions. These environmental conditions impose a variety of strong selective pressures that shape the genome dynamics of their inhabitants. The genus Bacillus is one of the most abundant cultivable bacterial groups in the CCB and usually possesses a relatively large number of rrn operon copies (6-15 copies). The main goal of this study was to analyze the variation in the number of rrn operon copies of Bacillus in the CCB and to assess their growth-related properties as well as their stoichiometric balance (N and P content). We defined 18 phylogenetic groups within the Bacilli clade and documented a range of six to 14 copies of the rrn operon. The growth dynamics of these Bacilli were heterogeneous and did not show a direct relation to the number of operon copies. Physiologically, our results were not consistent with the Growth Rate Hypothesis, since the number of copies of the rrn operon was decoupled from growth rate. However, we speculate that the diversity of the growth properties of these Bacilli, as well as the low P content of their cells across a wide range of rrn copy numbers, is an adaptive response to the oligotrophy of the CCB and could represent an ecological mechanism that allows these taxa to coexist.
These findings increase the knowledge of the variability in the number of copies of the rrn operon in the genus Bacillus and give insights about the physiology of this bacterial group under extreme oligotrophic conditions.
Reproductive division of labor (e.g., germ-soma specialization) is a hallmark of the evolution of multicellularity, signifying the emergence of a new type of individual and facilitating the evolution of increased organismal complexity. A large body of work from evolutionary biology, economics, and ecology has shown that specialization is beneficial when further division of labor produces an accelerating increase in absolute productivity (i.e., productivity is a convex function of specialization). Here we show that reproductive specialization is qualitatively different from classical models of resource sharing, and can evolve even when the benefits of specialization are saturating (i.e., productivity is a concave function of specialization). Through analytical theory and evolutionary individual based simulations, our work demonstrates that reproductive specialization is strongly favored in sparse networks of cellular interactions, such as trees and filaments, that reflect the morphology of early, simple multicellular organisms, highlighting the importance of restricted social interactions in the evolution of reproductive specialization. More broadly, we find that specialization is strongly favored, despite saturating returns on investment, in a wide range of scenarios in which sharing is asymmetric.
Abstract
Reproductive division of labor (e.g. germ-soma specialization) is a hallmark of the evolution of multicellularity, signifying the emergence of a new type of individual and facilitating the evolution of increased organismal complexity. A large body of work from evolutionary biology, economics, and ecology has shown that specialization is beneficial when further division of labor produces an accelerating increase in absolute productivity (i.e. productivity is a convex function of specialization). Here we show that reproductive specialization is qualitatively different from classical models of resource sharing, and can evolve even when the benefits of specialization are saturating (i.e. productivity is a concave function of specialization). Through analytical theory and evolutionary individual-based simulations, we demonstrate that reproductive specialization is strongly favored in sparse networks of cellular interactions that reflect the morphology of early, simple multicellular organisms, highlighting the importance of restricted social interactions in the evolution of reproductive specialization.
Introduction
The evolution of multicellularity set the stage for unprecedented increases in organismal complexity (Szathmáry and Smith, 1995; Knoll, 2011). A key factor in the remarkable success of multicellular strategies is the ability to take advantage of within-organism specialization through cellular differentiation (Queller and Strassmann, 2009; Brunet and King, 2017; Cavalier-Smith, 2017).
Reproductive specialization, which includes both the creation of a specialized germ line during ontogeny (as in animals and volvocine green algae) and functional differentiation into reproductive and non-reproductive tissues (as in plants, green and red macroalgae, and fungi), may be especially important (Cooper and West, 2018; Michod et al., 2006; Ispolatov et al., 2012; Solari et al., 2013; Michod, 2007; West et al., 2015). Reproductive specialization is an unambiguous indication that biological individuality rests firmly at the level of the multicellular organism (Michod, 1999; Folse and Roughgarden, 2010), and is thought to play an important role in spurring the evolution of further complexity by inhibiting within-organism (cell-level) evolution (Buss, 1988) and limiting reversion to unicellularity (Libby and Ratcliff, 2014). Despite the central importance of reproductive specialization, its origin and further evolution during the transition to multicellularity remain poorly understood (McShea, 2000). The origin of specialization has long been of interest to evolutionary biologists, ecologists, and economists. A large body of theory from these fields shows that specialization pays off only when it increases total productivity, compared to the case where each individual simply produces what they need (Szathmáry and Smith, 1995; Smith and Szathmáry, 1997; Goldsby et al., 2012; Corning and Szathmáry, 2015; Hidalgo and Hausmann, 2009; Boza et al., 2014; Taborsky et al., 2016; Page et al., 2006; Rueffler et al., 2012; Szekely et al., 2013; Findlay, 2008; Amado et al., 2018). Certain types of trading arrangements maximize the benefits of specialization; highly reciprocal interactions, which facilitate exchange between complementary specialists, amplify cooperation (Allen et al., 2017; Pavlogiannis et al., 2018). 
Still, previous work finds that even when groups grow in an ideal spatial arrangement, increased specialization and trade is only favored by natural selection when productivity increases as an accelerating function of the degree of specialization (i.e., productivity is a convex, or super-linear, function of the degree of specialization). Conversely, saturating functional returns (i.e. productivity is a concave, or sub-linear, function of the degree of specialization) should inhibit the evolution of specialization (Cooper and West, 2018; Michod et al., 2006; Ispolatov et al., 2012; Solari et al., 2013; Michod, 2007; West et al., 2015). Reproductive specialization differs from classical models of trade in several key respects. Trade between germ (reproductive) and somatic (non-reproductive) cells is intrinsically asymmetric, because the cooperative action, multicellular replication, is not a product that is shared evenly. Selection acts primarily on the fitness of the multicellular group as a whole (Folse and Roughgarden, 2010). As a result, optimal specialization can result in behaviors that reduce the short-term fitness of some cells within the multicellular group (Michod et al., 2006; Michod, 2007), often manifest as reproductive altruism. Understanding the evolution of cell-cell trade, a classic form of social evolution (Kirk, 2005), requires understanding the extent of between-cell interactions. Network theory has proven to be an exceptionally powerful and versatile technique for analyzing social dynamics (Wey et al., 2008; Lieberman et al., 2005), and indeed, is uniquely well suited to understanding the evolution of early multicellular organisms. When cells adhere through permanent bonds, sparse network-like bodies (i.e. filaments and trees) often result (Amado et al., 2018). 
This mode of group formation is not only common today among simple multicellular organisms (Umen, 2014; Claessen et al., 2014), but is the dominant mode of group formation in the lineages evolving complex multicellularity (i.e. plants, red algae, brown algae, and fungi, but not animals). In this paper, we develop and investigate a model for how the network topology of early multicellular organisms affects the evolution of reproductive specialization. We find that under a broad class of sparse networks, complete functional specialization can be adaptive even when returns from dividing labor are saturating (i.e. concave/sub-linear). Sparse networks impose constraints on who can share with whom, which counterintuitively increases the benefit of specialization (McShea, 2000). By dividing labor, multicellular groups can capitalize on high between-cell variance in behavior, ultimately increasing group-level reproduction. Further, we consider group morphologies that naturally arise from simple biophysical mechanisms and show that these morphologies strongly promote reproductive specialization. Our results show that reproductive specialization can evolve under a far broader set of conditions than previously thought, lowering a key barrier to major evolutionary transitions.
Model
Reproductive specialization can be modeled as the separation of two key fitness parameters, those related to either viability or fecundity, into separate cells within the multicellular organism (Michod, 2006; Folse and Roughgarden, 2010). The dichotomy of viability versus fecundity was originally used by Michod (2006) to partition components of cellular fitness into actions that contribute to keeping a cell alive (viability), and actions that directly contribute to reproduction (fecundity). Multicellular organisms often have evolved to divide labor along these two lines (i.e. reproduction by germ cells and survival provided by somatic cells), while their unicellular ancestors had to do both.
We define viability as activities keeping the cell alive (e.g. investing in cellular homeostasis or behaviors that improve survival), and fecundity as activities involved in cellular reproduction. At the cellular level, there appears to be a fundamental asymmetry in how viability effort and fecundity effort can be shared among cells: while multicellular organisms readily evolve differentiated cells that are completely reliant on helper cells (i.e. glial cells that support neurons in animals or companion cells that support sieve tube cells in plants), no cell can directly share its ability to reproduce. To better understand the intuition behind this, consider a cell that elongates prior to fission. This cell must grow to approximately twice its original length. Two cells cannot elongate by 50% and then combine their efforts; elongation is an intrinsically single cell effort. We thus use a model in which viability can be shared across connected cells, but fecundity cannot be shared (note, in order to test the sensitivity of our predictions to this assumption, in a later section we will consider the more general case in which viability and fecundity can both be shared, but by different amounts). We consider a model of multicellular groups composed of clonal cells that each invest resources into viability and fecundity. Because there is no within-group genetic variation, within-group evolution is not possible, though selection can act on group-level fitness differences. Specifically, we consider the pattern of cellular investment in fecundity and viability, and their sharing of these resources with neighboring cells within the group, to be the result of a heritable developmental program. Thus, selection is able to act on the multicellular fitness consequences of different patterns of cellular behavior within the group. We let v denote each cell’s investment into viability, and b denote each cell’s investment into fecundity. 
Each cell’s total investment is constrained so that $v + b = 1$. However, a cell’s return on its investment is in general nonlinear. Here, we let α represent the ‘return on investment exponent’: by tuning α above and below 1.0, we can simulate conditions with accelerating and saturating (i.e. convex and concave, or super- and sub-linear) returns on investment, respectively. We let $\tilde{v}$ and $\tilde{b}$ represent a cell’s return on viability and fecundity investments, respectively. Following Michod (2006) and Michod and Roze (1997), we calculate a cell’s reproductive output as a multiplicative function of $\tilde{v}$ and $\tilde{b}$ (thus, both functions must be positive for a cell to grow). A single cell’s reproduction rate is $w = \tilde{v}\tilde{b} = v^{\alpha} b^{\alpha}$. At the group level, fitness is the total contribution of all cells in the group toward the production of new groups (i.e. group level reproduction). The group level fitness is thus the sum of $\tilde{v}\tilde{b}$ over all cells. Finally, cells may share the products of their investment in viability with other cells to whom they are connected. For a given group, the details about who may share with whom, and how much, are encoded in a weighted adjacency matrix $\mathbf{c}$. The element $c_{ij}$ defines what proportion of viability returns cell i shares with cell j. Cells cannot give away all of their viability returns, as they would no longer be viable; mathematically, we count a cell among its neighbors and thus ensure that it always ‘shares’ a positive portion of viability returns with itself, so that $c_{ii} > 0$. Furthermore, since a cell cannot share more viability returns than the total it possesses, we have $\sum_{i=1}^{N} c_{ji} = 1$ for a group of N cells. For the networks we consider, each cell takes a fraction β of its viability returns and shares that fraction equally among all of its $n_i$ neighbors (including itself), and keeps the rest of its returns, $1 - \beta$, for itself. Therefore cell i keeps a total fraction of $1 - \beta + \frac{\beta}{n_i}$ of its returns for itself and gives $\frac{\beta}{n_i}$ to each of its non-self neighbors.
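The sharing rule just described can be written out in a few lines. The following is a minimal sketch, not the authors' code: the function names and the choice of a six-cell ring with β = 0.6 are illustrative assumptions.

```python
def ring_neighbors(n_cells):
    """Neighbor lists for a ring (assumes n_cells >= 4); each cell counts itself as a neighbor."""
    return [[(i - 1) % n_cells, i, (i + 1) % n_cells] for i in range(n_cells)]


def sharing_matrix(neighbors, beta):
    """Build c, where c[i][j] is the fraction of cell i's viability returns given to cell j."""
    n_cells = len(neighbors)
    c = [[0.0] * n_cells for _ in range(n_cells)]
    for i, nbrs in enumerate(neighbors):
        n_i = len(nbrs)            # neighbor count, including the cell itself
        for j in nbrs:
            c[i][j] = beta / n_i   # equal split of the shared fraction beta
        c[i][i] += 1.0 - beta      # the unshared remainder stays with cell i
    return c


# Hypothetical example: six cells in a ring, interaction strength 0.6.
c = sharing_matrix(ring_neighbors(6), beta=0.6)
row_sums = [sum(row) for row in c]  # each row should sum to 1
```

Each row of `c` sums to one because a cell distributes exactly the returns it has, and the diagonal entry stays positive for any β between 0 and 1.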
In other words, $c_{ii} = 1 - \beta + \frac{\beta}{n_i}$, $c_{ij} = \frac{\beta}{n_i}$ if cells i and j are connected, and $c_{ij} = 0$ if cells i and j are not connected. This means the total amount of returns kept by cell i depends on both the network topology and β. When β=0 there is no sharing, and when β=1 cells share everything equally among all connections and themselves. We refer to β as interaction strength. A given group topology (unweighted adjacency matrix) and β completely specify $\mathbf{c}$. Within a group of N cells, the overall returns on viability that a given cell enjoys, then, comprise its own returns as well as whatever is shared with it by other members of the group. This can be written as $\tilde{v}_i = v_i^{\alpha} c_{ii} + \sum_{j \neq i}^{N} v_j^{\alpha} c_{ji}$, or equivalently, $\tilde{v}_i = \sum_{j=1}^{N} v_j^{\alpha} c_{ji}$. Note that this is a column sum, since it describes the total incoming viability returns a cell receives as a result of its own effort and trade with neighboring cells. Therefore, we write the group level reproduction rate (i.e. the group fitness) for a group of N cells as
(1) $W = \sum_{i=1}^{N} \tilde{b}_i \tilde{v}_i = \sum_{i=1}^{N} \tilde{b}_i \sum_{j=1}^{N} v_j^{\alpha} c_{ji} = \sum_{i=1}^{N} \sum_{j=1}^{N} b_i^{\alpha} c_{ji} v_j^{\alpha}$,
where all three expressions are equivalent. We investigate evolutionary outcomes under this definition of group level fitness for groups with different topologies (who shares with whom), and in scenarios with various return on investment exponents α.
Results
Fixed resource sharing
We first consider cases wherein cells within a group share across fixed intercellular interactions. In each case we vary the return on investment exponent, α, between 0.5 and 1.5, and the interaction strength, β, between 0.0 and 1.0, both in increments of 0.1. For each combination of topology, α, and β, the group investment strategy ($v_i$ for all i) was allowed to evolve for 1000 generations. We begin with simple topologies: groups with no connections and groups that are maximally connected. They represent, respectively, the case in which all cells within the group are autonomous and the case in which every cell interacts with all others (i.e.
a ‘well-mixed’ group). In the absence of interactions, cells cannot benefit from functions performed by others and therefore must perform both functions v and b; hence specialization is not favored, and does not evolve. In the fully connected case, a high degree of specialization is observed for many values of α and β (Figure 1a). Consistent with classic results (Cooper and West, 2018; Michod et al., 2006; Ispolatov et al., 2012; Solari et al., 2013; Michod, 2007; West et al., 2015), specialization is only achieved in the fully connected case for α>1.
Figure 1. Schematic of topology for a simplified ten cell group (first row), and mean specialization as a function of specialization power α and interaction strength β across the entire population. (A) When each cell in the group is connected to all others, specialization is favored only when α>1. (B) For the nearest neighbor topology, specialization is favorable for a wider range of parameters, including for some values of α<1. Specifically, specialization is advantageous when $\alpha > \frac{3}{4\beta}$. (C) Connecting alternating specialists creates a bipartite graph which maximizes the benefits of specialization and the range of parameters for which it is advantageous. In this case, specialization is favorable wherever $\alpha > \frac{3}{5\beta}$. The red curves represent analytical predictions for α*, the lowest value of α for which complete generalization is disfavored, and the orange vertical lines are at α=1 to guide the eye. While analysis shows that some degree of specialization must occur in the regime upward and to the right of the red curves, simulations reveal that when complete generalization is disfavored complete specialization is favored in these networks.
Next, we consider a simple sparse network in which each cell within a group is connected to only two other cells, forming a complete ring (Figure 1b); we refer to this as the neighbor network.
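Equation 1 can be evaluated directly for the two connected topologies introduced so far. The sketch below is illustrative only: the helper names, the ten-cell group size, the values α = 0.9 and β = 1, and the alternating viability-only/fecundity-only test strategy are assumptions chosen for this example rather than details taken from the simulations.

```python
def neighbor_lists(n_cells, topology):
    """Neighbor lists (self included) for a fully connected group or a ring."""
    if topology == "complete":
        return [list(range(n_cells)) for _ in range(n_cells)]
    if topology == "ring":
        return [[(i - 1) % n_cells, i, (i + 1) % n_cells] for i in range(n_cells)]
    raise ValueError("unknown topology: " + topology)


def sharing_matrix(neighbors, beta):
    """c[i][j]: fraction of cell i's viability returns that goes to cell j."""
    n_cells = len(neighbors)
    c = [[0.0] * n_cells for _ in range(n_cells)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            c[i][j] = beta / len(nbrs)
        c[i][i] += 1.0 - beta
    return c


def group_fitness(v, c, alpha):
    """Equation 1: W = sum_i b_i^alpha * sum_j v_j^alpha * c[j][i], with b_i = 1 - v_i."""
    n_cells = len(v)
    total = 0.0
    for i in range(n_cells):
        fecundity_return = (1.0 - v[i]) ** alpha
        viability_received = sum((v[j] ** alpha) * c[j][i] for j in range(n_cells))
        total += fecundity_return * viability_received
    return total


n_cells, alpha, beta = 10, 0.9, 1.0                  # assumed example values
strategies = {
    "generalist": [0.5] * n_cells,                   # every cell splits effort evenly
    "specialist": [1.0, 0.0] * (n_cells // 2),       # alternating viability / fecundity cells
}
results = {}
for topology in ("complete", "ring"):
    c = sharing_matrix(neighbor_lists(n_cells, topology), beta)
    for name, v in strategies.items():
        results[(topology, name)] = group_fitness(v, c, alpha)

for key in sorted(results):
    print(key, round(results[key], 3))
```

With these assumed values the alternating strategy has lower fitness than the generalist strategy in the fully connected group, but higher fitness on the ring, even though α is below one.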
Surprisingly, preventing trade between most cells encourages division of labor. We find that specialization evolves even when α<1.0, that is, when the returns on investment are saturating or concave. In our simulations, this topology leads to alternating specialists in viability and fecundity (Figure 1b). Analytically, we find that this topology always favors at least some degree of specialization whenever $\alpha > \frac{3}{4\beta}$. We next study a network with cells that can be separated into two disjoint sub-groups, where every edge of the network connects a cell in one sub-group to a cell in the other sub-group and no within sub-group connections exist, that is, a bipartite graph (Figure 1c). We refer to the specific network structure in Figure 1c as the ‘balanced bipartite’ network. We find that specialization evolves even when α<1.0, similar to the neighbor network. However, we find that specialization evolves for a wider range of α and β values for the balanced bipartite network than for the neighbor network. We can analytically determine under what conditions complete generalization is optimal. The complete generalist investment strategy is where every cell in the group invests equally into viability and fecundity, defined as $v_i^* = \frac{1}{2}$ for all i. For these simple topologies, the complete generalist strategy is either a maximum or a saddle point, depending on the values of α and β. Complete generalization is only favored when the Hessian evaluated at the generalist investment strategy, $\left.\frac{\partial^2 W}{\partial v_k \partial v_\ell}\right|_{\vec{v}^*} = H^*$, is negative definite, that is, all of its eigenvalues are negative. The largest eigenvalues of the Hessian for the complete, neighbor, and balanced bipartite networks are $\alpha\left(\frac{1}{2}\right)^{2\alpha-3}(-1+\alpha\beta)$, $\alpha\left(\frac{1}{2}\right)^{2\alpha-3}\left(-1+\frac{4}{3}\alpha\beta\right)$, and $\alpha\left(\frac{1}{2}\right)^{2\alpha-3}\left(-1+\frac{2N}{N+2}\alpha\beta\right)$, respectively. When α and β are chosen so that the largest eigenvalue becomes non-negative, complete generalization cannot maximize group fitness.
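The eigenvalue expression for the neighbor network can be spot-checked numerically. The sketch below rests on an assumption that is not stated in the text: for a ring with an even number of cells, the direction of greatest curvature at the generalist strategy is taken to be the alternating perturbation (+1, −1, +1, −1, …). Under that assumption, the second derivative of W along this unit direction should match the quoted largest eigenvalue. The group size, step size, and parameter values are arbitrary illustrative choices.

```python
import math


def ring_sharing_matrix(n_cells, beta):
    """Sharing matrix for a ring in which each cell has two neighbors plus itself."""
    c = [[0.0] * n_cells for _ in range(n_cells)]
    for i in range(n_cells):
        for j in ((i - 1) % n_cells, i, (i + 1) % n_cells):
            c[i][j] = beta / 3.0
        c[i][i] += 1.0 - beta
    return c


def group_fitness(v, c, alpha):
    """Equation 1 with b_i = 1 - v_i."""
    n_cells = len(v)
    return sum(
        (1.0 - v[i]) ** alpha * sum((v[j] ** alpha) * c[j][i] for j in range(n_cells))
        for i in range(n_cells)
    )


def curvature_along(direction, v, c, alpha, h=1e-4):
    """Central second difference of W at v along a unit-length direction."""
    plus = [x + h * d for x, d in zip(v, direction)]
    minus = [x - h * d for x, d in zip(v, direction)]
    w_plus = group_fitness(plus, c, alpha)
    w_zero = group_fitness(v, c, alpha)
    w_minus = group_fitness(minus, c, alpha)
    return (w_plus - 2.0 * w_zero + w_minus) / (h * h)


def ring_largest_eigenvalue(alpha, beta):
    """Largest Hessian eigenvalue quoted in the text for the neighbor network."""
    return alpha * 0.5 ** (2.0 * alpha - 3.0) * (-1.0 + (4.0 / 3.0) * alpha * beta)


n_cells, alpha, beta = 8, 0.9, 1.0                    # assumed example values
c = ring_sharing_matrix(n_cells, beta)
generalist = [0.5] * n_cells
alternating = [(-1) ** i / math.sqrt(n_cells) for i in range(n_cells)]

numeric = curvature_along(alternating, generalist, c, alpha)
analytic = ring_largest_eigenvalue(alpha, beta)
print(numeric, analytic)
```

A positive value here means the generalist strategy is not a fitness maximum at these parameters, consistent with α = 0.9 lying above the threshold of 3/(4β) = 0.75.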
While we have not analytically shown where the fitness maximum occurs in cases where the generalist strategy becomes a saddle point, evolutionary simulations (Figure 1) suggest that when complete generalization is not a fitness maximum, a high degree of (or even complete) specialization typically does maximize fitness. In all cases in which complete specialization is achieved in evolutionary simulations, $\tilde{v}\tilde{b}$ terms for viability specialists go to zero, as they cannot reproduce on their own. Furthermore, the fecundity specialists are entirely reliant on the viability specialists for their survival; if viability sharing were suddenly prevented, their $\tilde{v}\tilde{b}$ terms would also be zero. This amounts to complete reproductive specialization (Cooper and West, 2018; Kirk, 2005; Michod, 2006).
Evolving resource sharing
Until now, sharing has been included in every intercellular interaction within groups. Here, we consider the case in which there is initially no sharing, and sharing must evolve along with specialization. These simulations begin with no resource sharing (i.e. β=0); during every round, each group in the population has a 2% chance that a mutation will impact its developmental program, and the β value for one of its cells will change. The new β value is chosen from a truncated Gaussian with standard deviation of 10% of the mean, centered on the current value. Whatever is not retained is shared equally across all interactions, including the self term. Evolutionary simulation results are similar to those from the fixed-sharing model (Appendix 1-figure 1). Saturating specialization (i.e. specialization despite a concave return function) still occurs for the neighbor and balanced bipartite topologies. Thus, for both fixed and evolved resource sharing, we observe specialization for the largest range of parameters (including α<1) not when the group is maximally connected, but rather when connections are fairly sparse.
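The mutation step for interaction strength described above might be sketched as follows. Several details here are assumptions rather than the authors' implementation: the function name, the truncation bounds of 0 and 1, the use of rejection sampling, and in particular the fixed standard deviation `sigma=0.1`, since the exact meaning of "10% of the mean" is not specified in this section.

```python
import random


def mutate_beta(betas, rng, mutation_rate=0.02, sigma=0.1):
    """Return a copy of per-cell beta values in which at most one entry is redrawn.

    sigma is an assumed fixed standard deviation; bounds [0, 1] are also assumed.
    """
    new_betas = list(betas)
    if rng.random() < mutation_rate:          # 2% chance per group per round
        k = rng.randrange(len(new_betas))     # one cell's beta changes
        while True:                           # rejection sampling truncates the Gaussian
            candidate = rng.gauss(new_betas[k], sigma)
            if 0.0 <= candidate <= 1.0:
                new_betas[k] = candidate
                break
    return new_betas


def sharing_matrix(neighbors, betas):
    """Per-cell sharing: cell i shares betas[i] equally among its neighbors, self included."""
    n_cells = len(neighbors)
    c = [[0.0] * n_cells for _ in range(n_cells)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            c[i][j] = betas[i] / len(nbrs)
        c[i][i] += 1.0 - betas[i]
    return c


rng = random.Random(0)                        # seeded for repeatability
n_cells = 6
ring = [[(i - 1) % n_cells, i, (i + 1) % n_cells] for i in range(n_cells)]
betas = [0.0] * n_cells                       # simulations start with no sharing
for _ in range(2000):                         # mutation only; selection is not modeled here
    betas = mutate_beta(betas, rng)
c = sharing_matrix(ring, betas)
```

This sketch covers only the mutation of β; selection among groups, which the simulations also require, is deliberately left out.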
Therefore, a sparse group topology constitutes a cooperation-prone physical substrate that can favor the evolution of cellular specialization. As an example of the benefit of evolving sharing, consider that the maximum fitness according to Equation 1 for a group of N disconnected cells scales as $N\left(\frac{1}{2}\right)^{2\alpha}$. On the other hand, for the balanced bipartite network with a complete specialization strategy (i.e. $\vec{v} = \langle 0,1,0,1,\ldots\rangle$), the fitness scales as $\frac{N^2\beta}{2N+2}$. The ratio of these fitnesses is $\left(\frac{N^2\beta}{2N^2+2N}\right)2^{2\alpha} \approx \beta\,2^{2\alpha-1}$, where the approximation is for large N. So for larger groups and when $\alpha > \frac{1}{2} - \frac{\log\beta}{2\log 2}$, if a group can evolve resource sharing (i.e. letting β→1 and adopting the specialist investment strategy) its maximum fitness will increase.
Benefit of specialization
We now consider a simple example to highlight why specialization can be adaptive despite saturating (i.e. concave) returns from trade. Consider groups of four cells, connected via the nearest-neighbor topology (i.e. in a ring). We directly calculate the group-level fitness of generalists and specialists for two scenarios, α=0.9 and α=1, by summing the contributions of each cell within these groups (Figure 2). In this simple scenario, reproductive specialization strongly increases group fitness (33% for α=1 and 16% for α=0.9).
Figure 2. To explore how specialization can be favored by the nearest-neighbor topology, we compare the fitness of a four member system when cells are (A) generalists and (B) specialists. We first consider the case of linear functional returns (α=1). For the case of generalists (A), each cell receives as much viability as it shares, and all nodes contribute equally to the fitness of the group. Therefore, the fitness of the group is $W = 4 \cdot \frac{1}{2} \cdot \frac{1}{2} = 1$. For the case of specialists, however, the viability specialist cells (blue) have $\tilde{v}\tilde{b} = 0$, while the fecundity specialist cells have nonzero $\tilde{v}\tilde{b}$ due to the fact that they receive $\frac{1}{3}$ of each viability specialist’s output.
Thus the fitness of the group is $W = 2\left(2 \cdot \frac{1}{3}\right) = \frac{4}{3}$. Thus, fitness is higher for the group of specialists, so specialization is favored. For α=0.9, the fitness of generalists is 1.15, and the fitness of specialists is 1.33. Thus, even though the returns on investment are saturating (i.e. concave), specialization is favored.
The benefit of specialization in neighbor networks increases with group size. For a ring of size N, fitness under the specialist strategy $\vec{v} = \langle 0,1,0,1,\ldots\rangle$ is $W = \frac{\beta}{3}N$. For a ring of generalists the fitness is $W = N\left(\frac{1}{2}\right)^{2\alpha}$. Therefore, whenever $\alpha > \frac{\log 3 - \log\beta}{2\log 2}$, the ring of complete specialists enjoys a greater fitness than the ring of complete generalists. Again, note that complete generalization becomes disfavored when $\alpha > \frac{3}{4\beta}$, so there is a narrow regime where $\frac{3}{4\beta} < \alpha$ [...] However, when connectivity is small but not zero, specialization arises most readily. We conjecture that the troughs in Figure 3b, where specialization occurs for the lowest values of α, occur when connectivity is just large enough so that the existence of a spanning tree is more likely than not.
Figure 3. Sparsity encourages specialization. Heat maps showing conditions that favor specialists (white) and generalists (black) for nearest neighbor topologies (A, left) and randomly generated graphs with the same connectivity as nearest neighbor topologies (A, right). Specialization is adaptive on a neighbor network for $\alpha > \frac{3}{4\beta}$; random networks with the same mean connectivity as the nearest neighbor topology behave similarly. (B) The sparsity of a random graph affects how likely it is to favor specialization. We numerically maximize fitness for random graphs of size N=10 (left), N=20 (middle), and N=100 (right) at different levels of sparsity, and subsequently measure the specialization 𝒮 of the fitness maximizing investment strategy. The horizontal axis is the fraction of possible connections present ranging from 0 (none) to 1 (all).
The vertical axis is the specialization power $\alpha$, and the colormap shows mean specialization.

Filaments and trees

Sparse topologies like the neighbor network configuration have significant biological relevance, and direct ties to early multicellularity. The first step in the evolution of multicellularity is the formation of groups of cells (Szathmáry and Smith, 1995; Kirk, 2005; Willensdorfer, 2008; Bonner, 1998; Fairclough et al., 2010). Simple groups readily arise through incomplete cell division, forming either simple filaments (Figure 4a) or tree-like morphologies (Figure 4b; Bengtson et al., 2017b; Droser and Gehling, 2008; Berman-Frank et al., 2007; Ratcliff et al., 2012). Filament topologies have been widely observed in independently-evolved simple multicellular organisms, from ancient fossils of early red algae (Butterfield, 2000; Figure 4a) to extant multicellular bacteria (Claessen et al., 2014) and algae (Umen, 2014). Branching multicellular phenotypes have also been observed to readily evolve from baker’s yeast (Ratcliff et al., 2015; Figure 4b), and are reminiscent of ancient fungus-like structures (Bengtson et al., 2017a) and early multicellular fossils of unknown phylogenetic position from the early Ediacaran (Droser and Gehling, 2008).

Figure 4. Simple multicellular organisms with sparse topologies. We show two examples of simple multicellular organisms with linear and branched topologies. The image in (A) is a fossilized rhodophyte specimen of Bangiomorpha pubescens, courtesy of Prof. Nicholas Butterfield (see e.g. Butterfield, 2000); the image in (B) is a confocal image of ‘snowflake yeast’ showing cell volumes in blue and cell-cell connections in green; the image in (C) is an epifluorescence image of individual yeast cells from a planktonic culture, with the same staining technique as in (B). Scale bars in pictures = 10 µm. Panels include cartoons depicting simplified topologies.
Topologically similar to the two-neighbor configuration, these configurations yield similar simulation results. Specialization is plotted as a function of $\alpha$. Solid (A) and blue (B) vertical lines indicate analytical solutions for the transition point where the Hessian evaluated at $\vec{v}=\frac{1}{2}\vec{1}$ stops being negative definite, that is, $\alpha^{*}$; dotted lines indicate roughly where the simulation curves cross specialization of 0.5, that is, the 'true' transition value of $\alpha$ where specialization becomes favored. (C) In contrast, for a well-mixed group with fully connected topology, $\alpha^{*}=0.5$, indicating specialization only occurs when there are accelerating returns on investment. (D) To further explore trees and filaments we analytically solved for $\alpha^{*}$ for various types of trees and filaments of different sizes. $\alpha^{*}$ is plotted versus group size for several topologies. This is a proxy measure of how amenable a network structure is to specialization. Prof. Butterfield has granted permission to distribute the image in panel A under the terms of a Creative Commons Attribution license [https://creativecommons.org/licenses/by/4.0/]; further reproduction of this image should adhere to the terms of the CC BY 4.0 license with an attribution to Prof. Butterfield.

Simulations of populations of groups with filamentous and branched topologies reveal that specialization is indeed favored in the sub-linear regime (Figure 4a and b); conversely, sub-linear specialization is never observed for fully connected topologies (Figure 4c). While the generalist strategy is never a critical point for these networks (which have $\mathbf{c}\neq\mathbf{c}^{T}$, see Materials and methods), we conjecture that there is a nearby critical point which maximizes fitness at small values of $\alpha$ and becomes unstable at larger values of $\alpha$. We introduce a new metric, $\alpha^{*}$, defined as the value of $\alpha$ such that the largest (least negative) eigenvalue of the Hessian evaluated at the complete generalist strategy is zero when $\beta=1$.
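The definition of α* above can be explored numerically. The following is a minimal sketch rather than the authors' implementation. It assumes a fitness of the form W = Σ_i (1 − v_i)^α Σ_j c_ij v_j^α, and a sharing rule for β = 1 in which each cell splits its viability output equally among itself and its neighbors. Both are inferred from the ring example, where each cell receives 1/3 of a neighbor's output; they are not quoted from Equation 1. The function names `ring_adjacency`, `sharing_matrix`, `generalist_hessian`, and `alpha_star` are hypothetical.

```python
import numpy as np


def ring_adjacency(n):
    """Adjacency matrix of a ring of n >= 3 cells (two neighbors per cell)."""
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i + 1) % n] = 1.0
        adj[i, (i - 1) % n] = 1.0
    return adj


def sharing_matrix(adj):
    """ASSUMED sharing rule for beta = 1: cell j splits its viability output
    equally among itself and its n_j neighbors, so c[i, j] = 1 / (n_j + 1)."""
    shares = adj + np.eye(adj.shape[0])
    return shares / shares.sum(axis=0, keepdims=True)


def generalist_hessian(c, alpha):
    """Hessian of the ASSUMED fitness
    W(v) = sum_i (1 - v_i)^alpha * sum_j c[i, j] * v_j^alpha,
    evaluated at the complete generalist strategy v_i = 1/2 for all i."""
    f = 0.5 ** alpha                                # v^alpha at v = 1/2
    d1 = alpha * 0.5 ** (alpha - 1)                 # magnitude of first derivative
    d2 = alpha * (alpha - 1) * 0.5 ** (alpha - 2)   # second derivative
    diag = d2 * f * (c.sum(axis=1) + c.sum(axis=0))
    return np.diag(diag) - d1 ** 2 * (c + c.T)


def alpha_star(adj, lo=0.05, hi=3.0, tol=1e-10):
    """Bisect for the alpha at which the largest Hessian eigenvalue is zero."""
    c = sharing_matrix(adj)

    def top_eigenvalue(alpha):
        return np.linalg.eigvalsh(generalist_hessian(c, alpha)).max()

    if top_eigenvalue(lo) >= 0 or top_eigenvalue(hi) <= 0:
        raise ValueError("largest eigenvalue does not change sign in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if top_eigenvalue(mid) < 0:
            lo = mid   # generalist strategy is still a local maximum
        else:
            hi = mid   # generalist strategy has become unstable
    return 0.5 * (lo + hi)
```

Under these assumptions, `alpha_star(ring_adjacency(8))` returns a value close to 0.75, which agrees with the threshold α > 3/(4β) at β = 1 stated for the neighbor network. Passing the adjacency matrix of a filament or a tree instead would give the corresponding proxy value for that topology.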
For topologies in which each member has the same number of neighbors, $\alpha^{*}$ is a critical value at which generalization is no longer an optimal strategy. However, even for groups where the number of neighbors for each cell varies, we can still use $\alpha^{*}$ as a proxy for how amenable a topology is to saturating specialization. The smaller $\alpha^{*}$ is, the more likely specialization is to be favored. We plot vertical lines where $\alpha=\alpha^{*}$ (solid lines in Figure 4a and b), and dotted lines to indicate roughly where the simulation curves cross specialization of 0.5. These results show that, for these topologies, $\alpha^{*}$ acts as an effective metric for how amenable a network is to saturating specialization. This metric $\alpha^{*}$ depends only on topology and can in principle be calculated analytically given any network. We examined the value of $\alpha^{*}$ as filaments and a variety of tree-like structures grow larger, and found that specialization becomes more strongly favored (Figure 4d). While group size has no effect on specialization for some topologies, like the neighbor network, filaments and trees all see a decrease in $\alpha^{*}$ as group size increases; $\alpha^{*}$ eventually plateaus once groups are larger than a few tens of cells. Simple and easily accessible routes to multicellular group formation can readily evolve in response to selection for organismal size (Ratcliff et al., 2012), and this process may also strongly favor the evolution of cellular differentiation (McCarthy and Enquist,