Genome update: the 1000th genome ― a cautionary tale
2010
There are now more than 1000 sequenced prokaryotic genomes deposited in
public databases and available for analysis. Currently, although the sequence
databases GenBank, DNA Data Bank of Japan and EMBL are synchronized continually,
there are slight differences in content at the genome level for a variety
of logistical reasons, including differences in format and loading errors,
such as those caused by file transfer protocol interruptions. This means that
the 1000th genome will be different in the various databases. Some of the
data on highly accessed web pages are inaccurate, leading to false conclusions,
for example about the largest bacterial genome sequenced. Biological diversity
is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene
families — more genes than are recognized in the human genome. Moreover,
of the 1000 genomes available, not a single protein is conserved across all
genomes. Excluding the Archaea, only four genes are conserved in all
bacteria: two protein genes and two RNA genes.
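
The two kinds of count mentioned above (the total number of gene families seen across a set of genomes, and the number conserved in every genome) can be illustrated as a set union and a set intersection. The following is a minimal sketch, not the method used in the paper: the function name `pan_and_core`, the genome names and the gene-family identifiers are all hypothetical toy values, and it assumes genes have already been clustered into families.

```python
def pan_and_core(genomes):
    """Return (pan, core) for a dict mapping genome name -> set of gene-family IDs.

    pan  = families found in at least one genome (set union)
    core = families found in every genome (set intersection)
    """
    family_sets = list(genomes.values())
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Hypothetical toy data: three genomes with made-up family identifiers.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam3", "fam4"},
    "genome_C": {"fam1", "fam5"},
}

pan, core = pan_and_core(toy_genomes)
print(len(pan), "families in total;", len(core), "conserved in all genomes")
```

In this toy case the union grows with each added genome while the intersection shrinks, which is the pattern the abstract describes at the scale of 1000 genomes.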