Gene-tree-inference error can cause species-tree-inference artefacts in summary phylogenomic coalescent analyses. Here we integrate two ways of accommodating these inference errors: collapsing arbitrarily or dubiously resolved gene-tree branches, and subsampling gene trees based on their pairwise congruence. We tested the effect of collapsing gene-tree branches with 0% approximate-likelihood-ratio-test (SH-like aLRT) support in likelihood analyses and strict consensus trees for parsimony, and then subsampled those partially resolved trees based on congruence measures that do not penalize polytomies. For this purpose we developed a new TNT script for congruence sorting (congsort), and used it to calculate topological incongruence for eight phylogenomic datasets using three distance measures: standard Robinson-Foulds (RF) distances; overall success of resolution (OSR), which is based on counting both matching and contradicting clades; and RF contradictions, which only counts contradictory clades. As expected, we found that gene-tree incongruence was often concentrated in clades that are arbitrarily or dubiously resolved and that there was greater congruence between the partially collapsed gene trees and the coalescent and concatenation topologies inferred from those genes. Coalescent branch lengths typically increased as the most incongruent gene trees were excluded, although branch supports typically did not. We investigated two successful and complementary approaches to prioritizing genes for investigation of alignment or homology errors. Coalescent-tree clades that contradicted concatenation-tree clades were generally less robust to gene-tree subsampling than congruent clades. Our preferred approach to collapsing likelihood gene-tree clades (0% SH-like aLRT support) and subsampling those trees (OSR) generally outperformed competing approaches for a large fungal dataset with respect to branch lengths, support and congruence. 
We recommend widespread application of this approach (and strict consensus trees for parsimony-based analyses) for improving quantification of gene-tree congruence/conflict, estimating coalescent branch lengths, testing robustness of coalescent analyses to gene-tree-estimation error, and improving topological robustness of summary coalescent analyses. This approach is quick and easy to implement, even for huge datasets.
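The difference between standard RF distances and RF contradictions can be illustrated with a minimal sketch. It assumes rooted trees written as nested tuples of taxon names, and it takes one plausible reading of "RF contradictions": the number of clades in either tree that are incompatible with at least one clade of the other tree. The function names (`clades`, `compatible`, `rf_distance`, `rf_contradictions`) and the toy trees are hypothetical and are not taken from the congsort script; OSR is not sketched because its exact formula is not given here.

```python
from itertools import chain


def clades(tree):
    """Return the non-trivial clades (frozensets of taxa) of a rooted tree.

    The tree is a nested tuple of taxon names; the root clade is excluded.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset(chain.from_iterable(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)
    return found


def compatible(a, b):
    """Two clades are compatible if they are nested or disjoint."""
    return a <= b or b <= a or not (a & b)


def rf_distance(t1, t2):
    """Standard RF: clades present in exactly one of the two trees."""
    return len(clades(t1) ^ clades(t2))


def rf_contradictions(t1, t2):
    """Count only clades that are contradicted by some clade of the other tree."""
    c1, c2 = clades(t1), clades(t2)
    contradicted_1 = sum(any(not compatible(a, b) for b in c2) for a in c1)
    contradicted_2 = sum(any(not compatible(b, a) for a in c1) for b in c2)
    return contradicted_1 + contradicted_2


# Toy trees (made up for illustration only).
resolved = ((("A", "B"), "C"), ("D", "E"))
collapsed = (("A", "B", "C"), ("D", "E"))    # clade A+B collapsed to a polytomy
conflicting = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(resolved, collapsed), rf_contradictions(resolved, collapsed))
print(rf_distance(resolved, conflicting), rf_contradictions(resolved, conflicting))
```

Under these assumptions the collapsed tree differs from the resolved tree in standard RF distance but carries no contradictions, which is the sense in which a measure "does not penalize polytomies".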
New methods for parsimony analysis of large data sets are presented. The new methods are sectorial searches, tree-drifting, and tree-fusing. For Chase et al.'s 500-taxon data set these methods (on a 266-MHz Pentium II) find a shortest tree in less than 10 min (i.e., over 15,000 times faster than PAUP and 1000 times faster than PAUP*). Making a complete parsimony analysis requires hitting minimum length several times independently, but not necessarily finding all “islands”; for Chase et al.'s data set, this can be done in 4 to 6 h. The new methods also perform well in other cases analyzed (which range from 170 to 854 taxa).
The Mkv evolutionary model, based on minor modifications to models of molecular evolution, is being increasingly used to infer phylogenies from discrete morphological data, often producing different results from parsimony. The critical difference between Mkv and parsimony is the assumption of a "common mechanism" in the Mkv model, with branch lengths determining that the probability of change for all characters increases or decreases at the same tree branches by the same exponential factor. We evaluate whether the assumption of a common mechanism applies to morphology by testing the implicit prediction that branch lengths calculated from different subsets of characters will be significantly correlated. Our analysis shows that DNA (38 data sets tested) is often compatible with a common mechanism, but morphology (86 data sets tested) generally is not, showing very disparate branch lengths for different character partitions. The low levels of branch length correlation demonstrated for morphology (fitting models without a common mechanism) suggest that the Mkv model is too unrealistic and inadequate for the analysis of most morphological data sets. [Bayesian analysis; Mkv model; morphological data; phylogenetics.]
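The prediction being tested can be sketched as a correlation between two vectors of branch lengths, one per character partition, estimated on the same tree. The sketch below uses a plain Pearson coefficient as an assumption; the statistic and significance test actually used are not specified here. The function name `pearson_r` and the branch-length values are hypothetical toy numbers, not results.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of branch lengths."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy branch lengths for the same five branches (made up for illustration).
partition_a = [0.10, 0.20, 0.05, 0.40, 0.30]
# Under a common mechanism, a second partition differs only by a rate factor.
partition_b = [2.0 * length for length in partition_a]

print(pearson_r(partition_a, partition_b))
```

Proportional branch lengths give a coefficient close to 1; partitions with disparate branch lengths, as reported for morphology, would give low values.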
Algorithms to speed up tree searches under Sankoff parsimony are described. For T terminal taxa, an exact algorithm allows calculating length during searches T to 2T times faster than a complete down-pass optimization. An approximate but accurate method is from 3T to 8T times faster than a down-pass. Other algorithms that provide additional increases of speed for simple symmetrical transformation costs are described.
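For reference, the baseline against which these speed-ups are measured, a complete down-pass under Sankoff parsimony, can be sketched for a single character as follows. This is the standard textbook recursion, not the exact or approximate shortcut algorithms described above. The tree encoding (nested tuples of taxon names), the function name `sankoff_downpass`, and the toy data are assumptions made for illustration.

```python
import math


def sankoff_downpass(tree, states, cost):
    """Return the length of one character on a rooted tree.

    tree   : nested tuples of taxon names
    states : dict mapping taxon name -> observed state index
    cost   : cost[i][j] is the cost of transforming state i into state j
    """
    k = len(cost)

    def vector(node):
        # For a terminal: zero cost for the observed state, infinite otherwise.
        if isinstance(node, str):
            return [0 if s == states[node] else math.inf for s in range(k)]
        # For an internal node: sum, over descendants, of the cheapest
        # transformation from each candidate state i to any state j below.
        child_vectors = [vector(child) for child in node]
        return [
            sum(min(cost[i][j] + cv[j] for j in range(k)) for cv in child_vectors)
            for i in range(k)
        ]

    return min(vector(tree))


# Toy example (made up): a three-state additive (ordered) character.
tree = ((("A", "B"), "C"), ("D", "E"))
states = {"A": 0, "B": 0, "C": 1, "D": 2, "E": 2}
ordered_cost = [[abs(i - j) for j in range(3)] for i in range(3)]

print(sankoff_downpass(tree, states, ordered_cost))
```

Each call visits every node and compares every pair of states, which is why recomputing the full down-pass for every rearrangement during a search is expensive and why shortcuts for length calculation matter.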
Grant and Kluge have recently stated that Bremer support and their own REP ("relative explanatory power") are the only objective measures of group support. This paper discusses their claim, showing that their philosophical arguments have no basis, and that their own numerical examples actually serve to illustrate shortcomings of REP.
Areas of endemism characterize geographical regions by their unique biotas, providing the basis for studies on the ecological and historical drivers of these biologically distinct units. Tribe Bignonieae (Bignoniaceae) are a highly diverse clade of lianas distributed throughout the Neotropics, representing an excellent model for studying the drivers of species diversity and distribution patterns in this region. We used a dataset representing 98% of the diversity of Bignonieae and 21 170 unique locality records to perform an analysis of endemicity using NDM/VNDM. We recovered areas of endemism distributed across the Neotropics, including a higher number of areas at coarser spatial scales. Although overlapping and nested patterns of endemism were common and the spatial congruence with the individual units of previous regionalization schemes was low, the patterns of endemism recovered were in general agreement with those documented for other taxa. Our findings are generally consistent with key Neotropical biogeographical hypotheses. These results highlight the importance of studying detailed distribution patterns of selected taxa for an improved understanding of Neotropical biogeography.