Arabia is the largest peninsula in the world, with >3000 species of vascular plants. Not much effort has been made to generate a multi-locus marker barcode library to identify and discriminate the recorded plant species. This study aimed to determine the reliability of the available Arabian plant barcodes (>1500; rbcL and matK) at the public repository (NCBI GenBank) using the unsupervised and supervised methods. Comparative analysis was carried out with the standard dataset (FINBOL) to assess the methods and markers' reliability. Our analysis suggests that from the unsupervised method, TaxonDNA's All Species Barcode criterion (ASB) exhibits the highest accuracy for rbcL barcodes, followed by the matK barcodes using the aligned dataset (FINBOL). However, for the Arabian plant barcode dataset (GBMA), the supervised method performed better than the unsupervised method, where the Random Forest and K-Nearest Neighbor (gappy kernel) classifiers were robust enough. These classifiers successfully recognized true species from both barcode markers belonging to the aligned and alignment-free datasets, respectively. The multi-class classifier showed high species resolution following the two classifiers, though its performance declined when employed to recognize true species. Similar results were observed for the FINBOL dataset through the supervised learning approach; overall, matK marker showed higher accuracy than rbcL. However, the lower rate of species identification in matK in GBMA data could be due to the higher evolutionary rate or gaps and missing data, as observed for the ASB criterion in the FINBOL dataset. Further, a lower number of sequences and singletons could also affect the rate of species resolution, as observed in the GBMA dataset. The GBMA dataset lacks sufficient species membership. We would encourage the taxonomists from the Arabian Peninsula to join our campaign on the Arabian Barcode of Life at the Barcode of Life Data (BOLD) systems. 
Our combined efforts could help improve the rate of species identification for Arabian vascular plants.
Abstract Background Docking the tails of young lambs in long-tailed sheep breeds is a common practice worldwide. This practice is associated with pain, suffering and damage to the affected animals. Breeding for a shorter tail in long-tailed sheep breeds could offer an alternative. This study aimed to analyze the natural tail length variation in the most common German Merino variety, and to identify possible causal alleles for the short tail phenotype segregating within a typical long-tailed breed. Results Haplotype-based mapping in 362 genotyped (Illumina OvineSNP50) and phenotyped Merinolandschaf lambs resulted in genome-wide significant signals at position 37,111,462 bp on sheep chromosome 11 and at position 94,538,115 bp on chromosome 2 (Oar_v4.0). Targeted capture sequencing of these regions in 48 selected sheep and comparative analyses of WGS data of various long and short-tailed sheep breeds as well as wild sheep subspecies identified a SNP and a SINE element as promising candidates. PCR genotyping of these candidates revealed complete linkage between the two candidate variants. The SINE element is located in the promoter region of HOXB13, while the SNP is located in the first exon of HOXB13 and is predicted to result in a nonsynonymous mutation. Conclusions Our approach successfully identified HOXB13 as a candidate gene, together with the likely causal variants for tail length segregating within a typical long-tailed Merino breed. This would enable more precise breeding towards shorter tails, improve animal welfare by amplification of ancestral alleles and contribute to a better understanding of differential embryonic development.
Additional file 5: Table S4. Genetic differentiation index. Pairwise values for DEST, per analyzed breed. Maximum and minimum values within each of the predefined geographical groups are depicted in bold letters in gray rectangle. Maximum and minimum values considering all 115 breeds are in bold letters in green and yellow, respectively.
Both natural and artificial selection are among the main driving forces shaping genetic variation across the genome of livestock species. Selection typically leaves signatures in the genome, which are often characterized by high genetic differentiation across breeds and/or a strong reduction in genetic diversity in regions associated with traits under intense selection pressure. In this study, we evaluated selection signatures and genomic inbreeding coefficients, FROH, based on runs of homozygosity (ROH), in six Ugandan goat breeds: Boer (n = 13), and the indigenous breeds Karamojong (n = 15), Kigezi (n = 29), Mubende (n = 29), Small East African (n = 29), and Sebei (n = 29). After genotyping quality control, 45,294 autosomal single nucleotide polymorphisms (SNPs) remained for further analyses. A total of 394 and 6 breed-specific putative selection signatures were identified across all breeds, based on marker-specific fixation index (FST-values) and haplotype differentiation (hapFLK), respectively. These regions were enriched with genes involved in signaling pathways associated directly or indirectly with environmental adaptation, such as immune response (e.g., IL10RB and IL23A), growth and fatty acid composition (e.g., FGF9 and IGF1), and thermo-tolerance (e.g., MTOR and MAPK3). The study revealed little overlap between breeds in genomic regions under selection and generally did not display the typical classic selection signatures as expected due to the complex nature of the traits. In the Boer breed, candidate genes associated with production traits, such as body size and growth (e.g., GJB2 and GJA3) were also identified. Furthermore, analysis of ROH in indigenous goat breeds showed very low levels of genomic inbreeding (with the mean FROH per breed ranging from 0.8% to 2.4%), as compared to higher inbreeding in Boer (mean FROH = 13.8%). 
Short ROH were more frequent than long ROH, except in Karamojong, providing insight into the developmental history of these goat breeds. This study provides insights into the effects of long-term selection in Boer and indigenous Ugandan goat breeds, which are relevant for the implementation of breeding programs and the conservation of genetic resources, as well as their sustainable use and management.
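As an illustration of the ROH-based inbreeding coefficient reported above, FROH is commonly defined as the total length of an individual's ROH divided by the length of the autosomal genome covered by the markers. The following is a minimal sketch under that assumption; the function name, the segment coordinates and the genome length are hypothetical and are not taken from the study.

```python
def froh(roh_segments_bp, autosome_length_bp):
    """Return the fraction of the autosomal genome covered by ROH.

    roh_segments_bp: iterable of (start, end) positions in base pairs,
        assumed to be non-overlapping ROH segments of one individual.
    autosome_length_bp: total autosomal length covered by the SNP panel.
    """
    if autosome_length_bp <= 0:
        raise ValueError("autosome_length_bp must be positive")
    total_roh = sum(end - start for start, end in roh_segments_bp)
    return total_roh / autosome_length_bp


# Hypothetical individual with two ROH segments on a 100 Mb toy genome.
segments = [(1_000_000, 3_500_000), (10_000_000, 11_500_000)]
value = froh(segments, 100_000_000)
print(f"FROH = {value:.3f}")
```

A breed-level mean such as those quoted in the abstract would then be the average of this value over all individuals of the breed.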
Identifying the relationship between the polymorphisms segregating in a population and the phenotypic differences in a trait observed between individuals of that population is of major biological interest and represents the basis of forward genetics. Many of the traits of interest are influenced by several polymorphic genes and by environmental conditions. The loci associated with such measurable traits are often referred to as quantitative trait loci. These loci are identified using several statistical approaches. One of them is combined linkage disequilibrium and linkage analysis (cLDLA). This approach, first proposed by Meuwissen and colleagues in 2002, has been shown to be robust against population stratification and family structure and requires a smaller sample size than a genome-wide association study design. Previously, we have successfully used this approach to map several important traits in livestock, such as the polled condition in cattle and tail length in sheep. A cLDLA requires several complex computational processing and intermediary file conversion steps; for some of these steps no open-source tools are available. Running this analysis manually can therefore prove challenging, tedious and error-prone. We present cldla, a bioinformatics workflow implemented in Nextflow which takes a VCF file and a phenotype file as inputs and implements all the downstream processing required for cLDLA. Additionally, it has a separate workflow to estimate SNP-based heritability and features for interactive visualization of the results. The workflow is freely available at: https://github.com/Popgen48/cldla.
The folder contains 50K SNP genotypes of 48 Fleckvieh cattle in the form of ped and map files. Note that the marker positions correspond to the ARS-UCD1.2 cattle assembly. This dataset is used in the paper titled "A de novo frameshift mutation in ZEB2 causes polledness, abnormal skull shape, small body stature and subfertility in Fleckvieh cattle". The DOI of the paper is 10.1038/s41598-020-73807-5. Please cite this paper if you decide to use this data.
Summary Uganda is endowed with a large population of goats from predominantly indigenous breeds reared in diverse production systems, whose existence is threatened by crossbreeding with exotic Boer goats. Knowledge about the genetic characteristics and relationships among these Ugandan goat breeds and the potential admixture of the exotic breed Boer is still limited. Using a medium density single nucleotide polymorphism (SNP) panel, we assessed the genetic diversity, population structure and admixture in six Ugandan goat breeds. Samples from five indigenous Ugandan goat breeds including Mubende (n=29), Kigezi (n=29), Small East African (n=29), Sebei (n=29) and Karamojong (n=15), and the exotic breed Boer (n=13) from different agro-ecological regions of Uganda were genotyped using the GoatSNP50 BeadChip. Analysis of genotype data revealed high levels of polymorphism with the proportion of polymorphic SNPs ranging from 0.885 in Kigezi to 0.928 in Sebei. The overall mean genetic diversity indices across breeds for HO and HE were 0.355±0.147 and 0.384±0.143, respectively. Principal components, genetic distances and ADMIXTURE analyses revealed weak population sub-structuring among the breeds. Principal components separate Kigezi and, weakly, Small East African from other indigenous goats. Sebei and Karamojong cluster tightly together, while Mubende occupies a more central position with high admixture from all other local breeds. The Boer breed formed a cluster distinct from the Ugandan indigenous goat breeds. The results reflect common ancestry but also some level of geographical differentiation. ADMIXTURE and four population test analyses further revealed gene flow from Boer to Ugandan indigenous goat breeds and varying levels of admixture among the Ugandan indigenous breeds. Generally, moderate to high levels of genetic variability were observed in the Ugandan goat breeds.
Our findings provide useful insight to devise strategies to maintain genetic diversity in local goat breeds from Uganda and to design appropriate breeding programs to exploit within breed diversity and heterozygote advantage in cross-breeding schemes.
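To make the diversity indices HO and HE mentioned above concrete, the sketch below computes both for a single biallelic SNP. It assumes genotypes coded as counts of one allele (0, 1 or 2) with None for missing calls, and it uses the simple expectation 2p(1-p) without a small-sample correction; the summary does not state which estimator or software was used, so the function name and coding scheme are illustrative assumptions.

```python
def heterozygosity(genotypes):
    """Return (HO, HE) for one biallelic SNP.

    genotypes: list of allele counts per individual (0, 1 or 2),
        with None marking a missing genotype call.
    HO is the observed share of heterozygotes; HE is 2p(1-p), the
    Hardy-Weinberg expectation from the allele frequency p.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes at this SNP")
    ho = sum(1 for g in called if g == 1) / n
    p = sum(called) / (2 * n)  # frequency of the counted allele
    he = 2 * p * (1 - p)
    return ho, he


# Hypothetical SNP typed in six animals, one of them missing.
ho, he = heterozygosity([0, 1, 1, 2, None, 1])
print(f"HO = {ho:.3f}, HE = {he:.3f}")
```

Averaging these per-SNP values over all markers of a breed gives breed-level means of the kind reported in the summary.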
Abstract Background The indigenous cattle populations from Greece and Cyprus have decreased to small numbers and are currently at risk of extinction due to socio-economic reasons, geographic isolation and crossbreeding with commercial breeds. This study represents the first comprehensive genome-wide analysis of 10 indigenous cattle populations from continental Greece and the Greek islands, and one from Cyprus, and compares them with 104 international breeds using more than 46,000 single nucleotide polymorphisms (SNPs). Results We estimated several parameters of genetic diversity (e.g. heterozygosity and allelic diversity) that indicated a severe loss of genetic diversity for the island populations compared to the mainland populations, which is mainly due to the declining size of their population in recent years and subsequent inbreeding. This high inbreeding status also resulted in higher genetic differentiation within the Greek and Cyprus cattle group compared to the remaining geographical breed groups. Supervised and unsupervised cluster analyses revealed that the phylogenetic patterns in the indigenous Greek breeds were consistent with their geographical origin and historical information regarding crosses with breeds of Anatolian or Balkan origin. Cyprus cattle showed a relatively high indicine ancestry. Greek island populations are placed close to the root of the tree as defined by Gir and the outgroup Yak, whereas the mainland breeds share a common historical origin with Buša. Unsupervised clustering and D-statistics analyses provided strong support for Bos indicus introgression in almost all the investigated local cattle breeds along the route from Anatolia up to the southern foothills of the Alps, as well as in most cattle breeds along the Apennine peninsula to the southern foothills of the Alps. 
Conclusions All investigated Cyprus and Greek breeds present complex mosaic genomes as a result of historical and recent admixture events between neighboring and well-separated breeds. While the contribution of some mainland breeds to the genetic diversity pool seems important, some island and fragmented mainland breeds suffer from a severe decline in population size and a loss of alleles due to genetic drift. Conservation programs that are a compromise between what is feasible and what is desirable should focus not only on the still highly diverse mainland breeds but also promote and explore the conservation possibilities for island breeds.
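The D-statistics used above to test for Bos indicus introgression can be illustrated with the frequency-based ABBA-BABA form for four populations (P1, P2, P3 and an outgroup O). The sketch below is a generic illustration of that statistic, not the authors' pipeline: the function name and the toy frequencies are hypothetical, and significance testing (for example by block jackknife) is omitted.

```python
def d_statistic(sites):
    """Return the ABBA-BABA D statistic from derived-allele frequencies.

    sites: iterable of (p1, p2, p3, po) tuples, one per SNP, giving the
        derived-allele frequency in P1, P2, P3 and the outgroup O.
    D > 0 indicates an excess of ABBA sites (P2 shares more derived
    alleles with P3); D < 0 indicates an excess of BABA sites.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, po in sites:
        abba += (1 - p1) * p2 * p3 * (1 - po)
        baba += p1 * (1 - p2) * p3 * (1 - po)
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Toy data: two ABBA-like sites and one BABA-like site.
toy_sites = [(0.0, 1.0, 1.0, 0.0), (1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0)]
print(f"D = {d_statistic(toy_sites):.3f}")
```

In a setting like the one described, P3 would be an indicine reference and P1 and P2 the taurine breeds being compared, but that assignment is an assumption here.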
Abstract Domestic reindeer in Russia are a valuable resource of vital importance to the physical and cultural survival of the Northern indigenous minority. During the last decades, mitochondrial (mt) genetic markers have been widely used as a molecular tool to investigate the genetic structure and diversity of livestock species. Here we aimed at assessing the mtDNA diversity of the domestic reindeer inhabiting the area from the Kola Peninsula in the west to the Chukotka region in the east. Complete cytochrome b (cytb) sequences (1,140 bp) were obtained from representatives of six populations: Nenets (NEN, n = 16), Evenk (EVK, n = 12), Even (EVN, n = 6), Chukotka (CHU, n = 6), Chukotka-Khargin (CHUKH, n = 6) and Tuva (TUVA, n = 6). Sequence alignment was conducted using the MUSCLE algorithm in the R package msa. In total, 34 haplotypes were identified. A median-joining network, constructed in PopART 1.7, revealed three major groups of haplotypes: the first contained samples from all the populations, the second included NEN, EVN and CHUKH, and the third was represented by a single CHU sample. AMOVA, calculated in Arlequin 3.5.2.2, showed that only 9.58% of the molecular variance could be explained by differences between populations and 90.42% by differences within populations. Genetic diversity parameters calculated in DnaSP 6.12.03 demonstrated that the average number of nucleotide differences (K) was highest in CHUKH (28.333) and EVN (27.409) and lowest in TUVA (4.533) and EVK (5.400). Nucleotide diversity (Pi) was 0.01238±0.00559, 0.00474±0.00091, 0.02404±0.00453, 0.01281±0.00464, 0.02485±0.00744, and 0.00398±0.00110 for NEN, EVK, EVN, CHU, CHUKH and TUVA, respectively. Our study demonstrated the lack of a clear genetic structure of the studied reindeer populations in relation to the cytb sequence. The level of genetic diversity was associated with census size and was lowest in the smallest population, Tuva.
This study was supported by RSF-21-16-00071 and Russian Ministry of Science and Higher Education-0445-2019-0024.
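The two diversity measures reported in the reindeer abstract are directly related: K is the mean number of differences between all pairs of sequences, and Pi is K divided by the alignment length. For Tuva, for example, 4.533 / 1,140 ≈ 0.00398, which matches the reported Pi. The sketch below shows this relationship on toy sequences; it is a simplified illustration that assumes equal-length aligned sequences without gaps or missing data, and it does not reproduce how DnaSP handles such sites.

```python
from itertools import combinations


def pairwise_diversity(sequences):
    """Return (K, Pi) for a list of aligned, equal-length sequences.

    K is the average number of nucleotide differences per pair of
    sequences; Pi is K divided by the alignment length.
    """
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    k = total_diffs / len(pairs)
    return k, k / length


# Toy alignment of three 4-bp sequences (hypothetical data).
k, pi = pairwise_diversity(["ACGT", "ACGA", "ACCA"])
print(f"K = {k:.3f}, Pi = {pi:.3f}")
```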