Morphological Data Sets Fit a Common Mechanism Much More Poorly than DNA Sequences and Call Into Question the Mkv Model
Citations (89)
References (44)
Related Papers (10)
Abstract:
The Mkv evolutionary model, based on minor modifications to models of molecular evolution, is being increasingly used to infer phylogenies from discrete morphological data, often producing different results from parsimony. The critical difference between Mkv and parsimony is the assumption of a "common mechanism" in the Mkv model, with branch lengths determining that probability of change for all characters increases or decreases at the same tree branches by the same exponential factor. We evaluate whether the assumption of a common mechanism applies to morphology, by testing the implicit prediction that branch lengths calculated from different subsets of characters will be significantly correlated. Our analysis shows that DNA (38 data sets tested) is often compatible with a common mechanism, but morphology (86 data sets tested) generally is not, showing very disparate branch lengths for different character partitions. The low levels of branch length correlation demonstrated for morphology (fitting models without a common mechanism) suggest that the Mkv model is too unrealistic and inadequate for the analysis of most morphological data sets. [Bayesian analysis; Mkv model; morphological data; phylogenetics.]
Keywords:
Tree (set theory)
Maximum parsimony
Morphology
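The prediction tested in the abstract above (branch lengths estimated from different character subsets should be correlated under a common mechanism) can be illustrated with a minimal sketch. The abstract does not say which correlation statistic was used, so the Pearson coefficient here is an assumption, and the branch lengths are made-up numbers for a fixed topology, not data from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical branch lengths for the same five branches of one fixed tree,
# each list estimated from a different character partition.
partition_a = [0.10, 0.20, 0.05, 0.40, 0.15]
partition_b = [0.12, 0.18, 0.06, 0.35, 0.20]  # tracks partition_a closely
partition_c = [0.30, 0.05, 0.25, 0.10, 0.02]  # disagrees with partition_a

r_ab = pearson(partition_a, partition_b)
r_ac = pearson(partition_a, partition_c)
print(f"a vs b: r = {r_ab:.3f}")
print(f"a vs c: r = {r_ac:.3f}")
```

A high coefficient, as for partitions a and b, is what a common mechanism would predict; a low or negative one, as for a and c, corresponds to the disparate branch lengths the abstract reports for morphology.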
Phylogenetics is the study of evolutionary relationships among organisms or genes. One purpose of phylogenetic studies is to reconstruct evolutionary ties between organisms; another is to estimate the time of divergence between organisms since they last shared a common ancestor. Phylogenetic trees are constructed to illustrate the evolutionary relationships among a group of organisms. Phylogenetic relationships within the suborder Tricladida are not yet fully understood. Ball’s proposal left some uncertainties open, one of them being the closer similarity in eye structure between the Dugesiidae and the land planarians (Terricola) than between the Dugesiidae and the non-dugesiid members of the Paludicola (Ball 1981). Our analyses of freshwater planarians were based on DNA sequence variation in the mitochondrial gene for cytochrome oxidase I (COI). Phylogenetic analyses were carried out using maximum likelihood, maximum parsimony, and Bayesian methods. Maximum likelihood (ML) is a commonly used method for reconstructing trees. For the maximum likelihood analysis we used two programs, Modeltest 3.7 (Posada and Crandall 1998) and PAUP 4 beta 10 (Swofford 2001). The first step is to find a model of evolution that fits the DNA changes in the aligned sequences being used. In this analysis we used the Akaike information criterion (AIC). The resulting phylogenetic tree was visualized using Treeview 1.6.6 (Page 2001). For the maximum parsimony analysis we used PAUP 4 beta 10 (Swofford 2001) with a NEXUS-format file of COI sequences. Statistical support for the tree was evaluated by bootstrapping. Another commonly used method in phylogenetic analyses is Bayesian analysis, which we performed using MrBayes 3.1.1 (Huelsenbeck and Ronquist 2003). To root our phylogenetic trees we used Bdelloura candida as an outgroup. The results of this phylogenetic research contribute to a better understanding of the relationships within this group of organisms.
Maximum parsimony
Computational phylogenetics
Tree rearrangement
Citations (0)
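The model-selection step mentioned in the planarian abstract above (choosing a substitution model by the Akaike information criterion) can be sketched as follows. This is not how Modeltest is invoked; it only shows the AIC arithmetic, AIC = 2k - 2 ln L. The log-likelihoods are invented for illustration, and the parameter counts include only free substitution-model parameters (an assumption; branch lengths are treated as shared across the candidates).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def best_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))

# Hypothetical fits of three nucleotide substitution models to one alignment.
candidates = {
    "JC69": (-2500.0, 0),   # equal rates, equal base frequencies
    "HKY85": (-2450.0, 4),  # ts/tv ratio + 3 free base frequencies
    "GTR": (-2447.5, 8),    # 5 free exchange rates + 3 free base frequencies
}

for name, (lnl, k) in candidates.items():
    print(f"{name}: lnL = {lnl}, k = {k}, AIC = {aic(lnl, k)}")

chosen = best_model(candidates)
print("lowest AIC:", chosen)
```

In this made-up case the extra parameters of GTR improve the likelihood too little to offset the penalty, so the intermediate model has the lowest AIC.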
Should we build our own phylogenetic trees based on gene sequence data, or can we simply use available synthesis phylogenies? This is a fundamental question that any study involving a phylogenetic framework must face at the beginning of the project. Building a phylogeny from gene sequence data (purpose-built phylogeny) requires more effort, expertise, and cost than subsetting an already available phylogeny (synthesis-based phylogeny). However, we still lack a comparison of how these two approaches to building phylogenetic trees influence common community phylogenetic analyses such as comparing community phylogenetic diversity and estimating trait phylogenetic signal. Here, we generated three purpose-built phylogenies and their corresponding synthesis-based trees (two from Phylomatic and one from the Open Tree of Life, OTL). We simulated 1,000 communities and 12,000 continuous traits along each purpose-built phylogeny. We then compared the effects of different trees on estimates of phylogenetic diversity (alpha and beta) and phylogenetic signal (Pagel's λ and Blomberg's K). Synthesis-based phylogenies generally yielded higher estimates of phylogenetic diversity when compared to purpose-built phylogenies. However, resulting measures of phylogenetic diversity from both types of phylogenies were highly correlated (Spearman's ρ > 0.8 in most cases). Mean pairwise distance (both alpha and beta) is the index that is most robust to the differences in tree construction that we tested. Measures of phylogenetic diversity based on the OTL showed the highest correlation with measures based on the purpose-built phylogenies. Trait phylogenetic signal estimated with synthesis-based phylogenies, especially from the OTL, was also highly correlated with estimates of Blomberg's K or close to Pagel's λ from purpose-built phylogenies when traits were simulated under Brownian motion. 
For commonly employed community phylogenetic analyses, our results justify taking advantage of recently developed and continuously improving synthesis trees, especially the Open Tree of Life.
Community
Citations (96)
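Mean pairwise distance, the index the abstract above reports as most robust to tree choice, can be computed from a table of tip-to-tip (patristic) distances. The sketch below assumes the unweighted, presence/absence form of alpha MPD; the species names and distances are hypothetical and the function name is not taken from any package.

```python
from itertools import combinations

def mean_pairwise_distance(dist, community):
    """Mean patristic distance over all pairs of distinct species present.

    `dist` maps frozenset({sp1, sp2}) to the tip-to-tip distance on the tree;
    `community` is an iterable of species names found at one site.
    """
    species = sorted(set(community))
    pairs = list(combinations(species, 2))
    if not pairs:
        raise ValueError("need at least two distinct species")
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

# Hypothetical patristic distances on an ultrametric four-tip tree.
dist = {
    frozenset({"A", "B"}): 2.0,
    frozenset({"A", "C"}): 6.0,
    frozenset({"B", "C"}): 6.0,
    frozenset({"A", "D"}): 10.0,
    frozenset({"B", "D"}): 10.0,
    frozenset({"C", "D"}): 10.0,
}

mpd_abc = mean_pairwise_distance(dist, ["A", "B", "C"])
mpd_ad = mean_pairwise_distance(dist, ["A", "D"])
print(f"MPD of {{A, B, C}}: {mpd_abc:.3f}")
print(f"MPD of {{A, D}}: {mpd_ad:.3f}")
```

Repeating the calculation with distances taken from a purpose-built tree and from a synthesis tree, then correlating the two sets of values across communities, is the kind of comparison the abstract describes.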
Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogenetic networks, which can display such events, are becoming of more and more interest in phylogenetic research. It is therefore necessary to extend concepts like maximum parsimony from phylogenetic trees to networks. Several suggestions for possible extensions can be found in recent literature, for instance the softwired and the hardwired parsimony concepts. In this paper, we analyze the so-called big parsimony problem under these two concepts, i.e. we investigate maximum parsimonious networks and analyze their properties. In particular, we show that finding a softwired maximum parsimony network is possible in polynomial time. We also show that the set of maximum parsimony networks for the hardwired definition always contains at least one phylogenetic tree. Lastly, we investigate some parallels of parsimony to different likelihood concepts on phylogenetic networks.
Maximum parsimony
Tree rearrangement
Computational phylogenetics
Tree (set theory)
Citations (0)
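The abstract above concerns the big parsimony problem (searching for the best tree or network). As background only, the sketch below shows the simpler building block, scoring one character on one fixed rooted binary tree with Fitch's algorithm. It does not implement the softwired or hardwired network extensions from the paper; the nested-tuple tree encoding and the taxon names are assumptions made for this example.

```python
def fitch(tree, states):
    """Fitch small-parsimony pass for a single character.

    `tree` is a leaf name (str) or a 2-tuple of subtrees;
    `states` maps each leaf name to its observed character state.
    Returns (candidate_state_set, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1

# One binary character observed in four hypothetical taxa.
states = {"A": 0, "B": 0, "C": 1, "D": 1}

tree_1 = (("A", "B"), ("C", "D"))  # groups taxa with equal states
tree_2 = (("A", "C"), ("B", "D"))  # splits them apart

score_1 = fitch(tree_1, states)[1]
score_2 = fitch(tree_2, states)[1]
print("changes on tree_1:", score_1)
print("changes on tree_2:", score_2)
```

Maximum parsimony prefers the tree with the fewer required changes; the big parsimony problem is the search over all candidate trees (or, in the paper, networks) for the one minimizing this score summed over characters.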
Tree (set theory)
Phylogenomics
Citations (28)
Maximum parsimony
Tree rearrangement
Computational phylogenetics
Tree (set theory)
Citations (11)
Should we build our own phylogenetic trees based on gene sequence data, or can we simply use available synthesis phylogenies? This is a fundamental question that any study involving a phylogenetic framework must face at the beginning of the project. Building a phylogeny from gene sequence data (purpose-built phylogeny) requires more effort, expertise, and cost than subsetting an already available phylogeny (synthesis-based phylogeny). However, we still lack a comparison of how these two approaches to building phylogenetic trees influence common community phylogenetic analyses such as comparing community phylogenetic diversity and estimating trait phylogenetic signal. Here, we generated three purpose-built phylogenies and their corresponding synthesis-based trees (two from Phylomatic and one from the Open Tree of Life [OTL]). We simulated 1,000 communities and 12,000 continuous traits along each purpose-built phylogeny. We then compared the effects of different trees on estimates of phylogenetic diversity (alpha and beta) and phylogenetic signal (Pagel’s λ and Blomberg’s K). Synthesis-based phylogenies generally yielded higher estimates of phylogenetic diversity when compared to purpose-built phylogenies. However, resulting measures of phylogenetic diversity from both types of phylogenies were highly correlated (Spearman’s ρ > 0.8 in most cases). Mean pairwise distance (both alpha and beta) is the index that is most robust to the differences in tree construction that we tested. Measures of phylogenetic diversity based on the OTL showed the highest correlation with measures based on the purpose-built phylogenies. Trait phylogenetic signal estimated with synthesis-based phylogenies, especially from the OTL, was also highly correlated with estimates of Blomberg’s K or close to Pagel’s λ from purpose-built phylogenies when traits were simulated under Brownian motion.
For commonly employed community phylogenetic analyses, our results justify taking advantage of recently developed and continuously improving synthesis trees, especially the Open Tree of Life.
Phylogenetic diversity
Trait
Phylogenetic comparative methods
Tree (set theory)
Citations (3)
The determination of the phylogenetic relationships among microorganisms has long relied primarily on gene sequence information. Given that prokaryotic organisms often lack morphological characteristics amenable to phylogenetic analysis, prokaryotic phylogenies, in particular, are often based on sequence data. In this work, we explore a new source of phylogenetic information, the distribution of protein structural domains within fully sequenced prokaryotic genomes. The evolution of the structural domains we use has been studied extensively, allowing us to base our phylogenetic methods on testable theoretical models of structural evolution. We find that the methods that produce reasonable phylogenetic relationships are indeed the methods that are most consistent with theoretical evolutionary models. This work represents, to our knowledge, the first such theoretically motivated phylogeny, as well as the first application of structural information to phylogeny on this scale. Our results have strong implications for the phylogenetic relationships among prokaryotic organisms and for the understanding of protein evolution as a whole.
Computational phylogenetics
Sequence (biology)
Citations (41)
Reconstructing the evolutionary relationships of species is a major goal in biology. Despite the increasing number of completely sequenced genomes, a large number of phylogenetic projects rely on targeted sequencing and analysis of a relatively small sample of marker genes. The selection of these phylogenetic markers should ideally be based on accurate predictions of their combined, rather than individual, potential to accurately resolve the phylogeny of interest. Here we present and validate a new phylogenomics strategy to efficiently select a minimal set of stable markers able to reconstruct the underlying species phylogeny. In contrast to previous approaches, our methodology not only relies on the ability of individual genes to reconstruct a known phylogeny, but also explores the combined power of sets of concatenated genes to accurately infer phylogenetic relationships of species not previously analyzed. We applied our approach to two broad sets of cyanobacterial and ascomycetous fungal species, and provide two minimal sets of six and four genes, respectively, necessary to fully resolve the target phylogenies. This approach paves the way for the informed selection of phylogenetic markers in the effort of reconstructing the tree of life.
Phylogenomics
Tree (set theory)
Citations (52)
Studies examining phylogenetic community structure have become increasingly prevalent, yet little attention has been given to the influence of the input phylogeny on metrics that describe phylogenetic patterns of co-occurrence. Here, we examine the influence of branch length, tree reconstruction method, and amount of sequence data on measures of phylogenetic community structure, as well as the phylogenetic signal (Pagel's λ) in morphological traits, using Trichoptera larval communities from Churchill, Manitoba, Canada. We find that model-based tree reconstruction methods and the use of a backbone family-level phylogeny improve estimations of phylogenetic community structure. In addition, trees built using the barcode region of cytochrome c oxidase subunit I (COI) alone accurately predict metrics of phylogenetic community structure obtained from a multi-gene phylogeny. Input tree did not alter overall conclusions drawn for phylogenetic signal, as significant phylogenetic structure was detected in two body size traits across input trees. As the discipline of community phylogenetics continues to expand, it is important to investigate the best approaches to accurately estimate patterns. Our results suggest that emerging large datasets of DNA barcode sequences provide a vast resource for studying the structure of biological communities.
DNA Barcoding
Tree (set theory)
Barcode
Citations (40)