Closing the gaps on human chromosome 19 revealed genes with a high density of repetitive tandemly arrayed elements.
2004
The International Human Genome Sequencing Consortium recently reported that ∼99% of the gene-rich euchromatic portion of the genome has been sequenced and assembled. Each base pair of this 99% was sequenced five times on average, ensuring an error rate of less than one base in 50,000. The finished sequence now has an N50 contig size of 27 Mb, and the number of gaps has been reduced from 80,000 in the draft to ∼400. These 400 gaps represent genome sequences not found in the screened genomic bacterial artificial chromosome (BAC), P1-derived artificial chromosome (PAC), or other fosmid and cosmid libraries. Most of them are type 3 gaps, which are the most difficult to evaluate because the genome sequence flanking them is often not precisely aligned with the fingerprinted clones. Together the gaps represent ∼30 Mb, or 1.0%, of the human genome (Grimwood and Schmutz 2003). Although almost the whole genome is considered “finished” and offers a wealth of information, cloning and sequencing the tough leftovers of the human genome is essential. Without these sequences, we will not know what we are missing. Each missed sequence can potentially contain a gene, and each missed gene is potentially a missed drug target. Even gene-poor areas might be critical for gene regulation.
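For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the running total first reaches half of all assembled bases. The function name `n50` and the contig lengths are hypothetical and chosen only for illustration; they are not data from the human genome assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in Mb (illustrative only, not real assembly data).
example_contigs_mb = [40, 27, 20, 10, 3]
print(n50(example_contigs_mb))
```

Because the statistic weights contigs by the bases they contain, closing a gap that joins two large contigs can raise the N50 noticeably even when the gap itself is small.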
A traditional method of filling gaps is to screen additional BAC and cosmid libraries. However, this approach is time-consuming and may not be applicable to some gap regions with unusual DNA structures. For example, it is well documented that long inverted repeats, AT-rich sequences, and sequences with structures such as Z-DNA are extremely unstable in Escherichia coli (Hagan and Warren 1982; Schroth and Ho 1995; Kang and Cox 1996; Razin et al. 2001). These sequences may be underrepresented or even lost when cloned in E. coli.
The introduction of alternative cloning systems and hosts, allowing isolation of genomic segments that are poorly clonable in E. coli cells, may assist the effort to close the gaps. One such system is yeast artificial chromosome (YAC) cloning in yeast. Several recent reports demonstrate that genomic segments that are unstable in E. coli vectors can be accurately recovered as YACs in yeast (Bigger et al. 2000; Gardner et al. 2002; Kouprina et al. 2003). In some cases (e.g., Dictyostelium discoideum), YACs may represent the only viable method for the construction of large-insert libraries (Glockner et al. 2002). An additional advantage of using yeast is the opportunity to directly isolate a desired genomic region by transformation-associated recombination (TAR) cloning (Kouprina and Larionov 2003). Two main TAR cloning schemes have been developed and applied to isolate dozens of full-size mammalian genes. When DNA sequence information is available from the 3′- and 5′-flanking regions of the gene of interest, the region is isolated using a vector with two unique targeting sequences (hooks; Larionov et al. 1997). Another version of TAR cloning, called radial TAR cloning, uses a vector in which one hook is a unique sequence from the chromosomal region of interest and the other hook is a repeated sequence that occurs frequently and randomly in the genomic DNA (e.g., Alu repeats; Kouprina et al. 1998a).
For the purpose of gap closure, radial TAR cloning is the most suitable, because sequences of the flanking clones may be deleted or rearranged, making the development of two specific targeting hooks difficult. In the present study, the TAR cloning approach and screening of additional genomic libraries were used to close gaps on human chromosome 19. A subsequent analysis of the gap sequences allowed us to annotate four human genes and shed light on the nature of poorly clonable chromosome segments.