Analysis of 5′ junctions of human LINE-1 and Alu retrotransposons suggests an alternative model for 5′-end attachment requiring microhomology-mediated end-joining
2005
The human LINE-1 (long interspersed nuclear element 1, L1) is one of the best-characterized members of the extensive group of non-LTR retrotransposons (Malik et al. 1999). Roughly 520,000 L1 copies account for ∼17% of the human genome (Lander et al. 2001). Additionally, L1 retrotransposons are indirectly responsible for the generation of ≥13% of the human genome by mobilizing Alu elements (Dewannieux et al. 2003) and by creating processed pseudogenes in trans (Esnault et al. 2000).
A functional full-length L1 element is ∼6 kb long (Fig. 1) and includes a 5′-untranslated region (5′-UTR) bearing an internal promoter, two open reading frames (ORFs) separated by a 63-nt intergenic region, and a 3′-UTR terminating in a poly(A) tail (Dombroski et al. 1991). ORF1 encodes an RNA-binding protein that has nucleic acid chaperone activity in vitro (Kolosha and Martin 1997, 2003; Martin and Bushman 2001), but no known specific role in the L1 replication mechanism. The ORF2-encoded protein (ORF2p) contains three domains critical for L1 propagation: endonuclease (EN) (Feng et al. 1996), reverse transcriptase (RT) (Mathias et al. 1991; Dombroski et al. 1994), and a C-terminal Zn finger-like domain (Fanning and Singer 1987). Genomic L1 elements vary in structure. Only 15% of all intact L1 insertions flanked by target site duplications (TSDs) represent full-length insertions, 85% are 5′-truncated, and 19% are both 5′-truncated and 5′-inverted (Szak et al. 2002).
Figure 1.
Retrotransposition of a functional human L1 element frequently results in 5′-truncated copies with overlapping nucleotides at the 5′ junctions. These microhomologies (yellow box) between the genomic target site duplication (TSD) and the ...
Retrotransposition of a new L1 copy into the degenerate genomic consensus sequence 5′-TTTT/A-3′ (Cost and Boeke 1998; Gilbert et al. 2002; Szak et al. 2002) is initiated by a process termed “target-primed reverse transcription” (TPRT), in which ORF2p nicks the target DNA to generate a free 3′-OH. This hydroxyl acts as a primer for reverse transcription using L1 RNA as template (Luan et al. 1993; Cost et al. 2002). The result is simultaneous reverse transcription and joining of the 5′-end of the first-strand cDNA with the genome. The second strand of the genomic target is nicked at variable distances downstream of the complementary sequence of the degenerate genomic consensus site, preferably within 15–16 bp (Jurka 1997; Szak et al. 2002). The subsequent steps in the integration of elements that are both 5′-truncated and 5′-inverted can be satisfactorily explained by a variation of TPRT called “twin priming” (Ostertag and Kazazian Jr. 2001b), which is corroborated by the existence of short patches of complementary nucleotides at the junction between the 5′-TSD and the inverted L1 sequence. However, for both full-length and 5′-truncated L1 insertions, neither the means by which the 3′-end of the cDNA is attached to chromosomal DNA nor the mechanisms initiating second-strand synthesis have been elucidated. It is also not yet known which mechanism leads to the generation of 5′-truncated L1 copies. L1 truncation has long been explained by an inability of the L1 RT to copy the entire L1 RNA, either because the RT dissociates prematurely from the RNA or because it competes with a cellular RNase that digests the RNA before reverse transcription is complete (Ostertag and Kazazian Jr. 2001a).
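To make the geometry of the first-strand nick concrete, the following minimal Python sketch scans a target sequence for candidate L1 EN sites. It is an illustration only and is not part of the analysis reported here. It assumes exact matches to 5′-TTTT/A-3′, whereas the genomic consensus is degenerate, and the function name `find_candidate_nicks` and the boundary-index convention are hypothetical choices made for this example.

```python
import re


def find_candidate_nicks(top_strand: str):
    """Return (strand, boundary) pairs for exact 5'-TTTT/A-3' matches.

    Assumption: only exact matches are reported; the real consensus is
    degenerate. `boundary` is the number of top-strand nucleotides lying
    to the left of the nick, so the nick falls between top-strand
    indices boundary-1 and boundary (0-based).
    """
    seq = top_strand.upper()
    hits = []
    # Top strand: TTTTA at i..i+4, nick between the last T (i+3) and A (i+4).
    for m in re.finditer(r"(?=TTTTA)", seq):
        hits.append(("+", m.start() + 4))
    # Bottom strand: its 5'-TTTTA-3' appears as TAAAA on the top strand.
    # The bottom-strand A pairs with the top-strand T at i, so the nick
    # lies between top-strand indices i and i+1.
    for m in re.finditer(r"(?=TAAAA)", seq):
        hits.append(("-", m.start() + 1))
    return sorted(hits, key=lambda h: h[1])
```

The lookahead pattern lets overlapping matches be reported, which matters in the T-rich or A-rich tracts where such sites tend to cluster. A degenerate position weight matrix would be a more faithful model than exact string matching.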
Several mechanisms have been suggested to explain the attachment of the 5′-end of non-LTR retrotransposons to the chromosome. All are based on regions of microcomplementarity found at the junctions between the 5′-end of the retrotransposon and the 3′-end of the adjacent TSD (Fig. 1). These overlapping nucleotides were first described for L1 insertions in the mouse genome and for Cin4 elements in maize, and led to replication models that require bridging of chromosomal double-strand breaks (DSBs) by L1 RNA (Voliva et al. 1984; Schwarz-Sommer et al. 1987). However, these gap repair models, which require two hybridization events of the retrotransposon RNA to the target DNA, are not compatible with what is currently known about L1 EN activity and the mechanism of TPRT. Therefore, a different mechanism, called double-TPRT, has been proposed, in which second-strand synthesis is primed by annealing of the newly synthesized cDNA to complementary sequences in the genomic target. The sequential TPRT reactions were suggested to be carried out by the element-encoded protein machinery (Supplemental Fig. 1). This model was originally developed to explain R1Bm replication (Feng et al. 1998), and was subsequently adapted to account for the insertion mechanism of mouse and human L1. As L1 EN is much less sequence-specific than R1 EN, L1 RT was proposed to use fortuitous matches between the L1 cDNA and the target sequence to prime second-strand synthesis (Martin and Bushman 2001; Symer et al. 2002; Martin et al. 2005).
In the course of our efforts to investigate the means by which the L1 5′-end is attached to the chromosomal DNA, we evaluated whether there is a preference for overlapping nucleotides (nt) between the 5′-end of pre-existing L1 insertions and the 3′-end of the adjacent genomic TSDs, as previously reported for a small number of de novo L1 integrants obtained from tissue culture experiments (Symer et al. 2002). To this end, we performed a comprehensive genome-wide analysis of TSDs flanking extant genomic L1 and Alu insertions. Characterization of the junctions between genomic target sequences and the 5′-ends of L1 and Alu insertions revealed that, in contrast to full-length insertions and 5′-truncated Alu elements, 5′-truncated L1s preferentially exhibit features that are observed after DSB repair by alternative, error-prone nonhomologous end-joining (NHEJ). Based on our observations, we propose that there are at least two different mechanisms responsible for attachment of the 5′-end of L1 and the initiation of second-strand synthesis.
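The notion of overlapping nucleotides at a 5′ junction can be illustrated with a short Python sketch. It is a simplified illustration and not the pipeline used in this study. It assumes that the 5′-TSD, a reference sequence for the element, and the 0-based position of the first inserted nucleotide in that reference are already known; the function name `junction_microhomology` and its parameters are hypothetical.

```python
def junction_microhomology(tsd: str, element_ref: str, start: int) -> int:
    """Count overlapping nucleotides at a 5' junction.

    tsd         : 5'-TSD sequence, written 5'->3' (assumed known).
    element_ref : reference sequence of the element (e.g. an L1 consensus).
    start       : 0-based index in element_ref of the first nucleotide
                  that belongs unambiguously to the insertion.

    Returns the length of the longest suffix of `tsd` that is identical
    to the reference sequence immediately upstream of `start`, i.e. the
    nucleotides that could be assigned to either the TSD or the element.
    """
    tsd = tsd.upper()
    element_ref = element_ref.upper()
    k = 0
    # Walk leftward from the junction while both sequences agree.
    while (k < len(tsd) and k < start
           and tsd[-1 - k] == element_ref[start - 1 - k]):
        k += 1
    return k


# Toy sequences, not real genomic data: the last three TSD nucleotides
# (TGC) match the reference just upstream of position 6.
overlap = junction_microhomology("AAGAATGC", "CCCTGCGGATT", 6)
```

For a full-length insertion (`start` equal to 0) the function returns 0 by construction, because no reference sequence lies upstream of the junction. Any assessment of whether observed overlaps exceed chance would additionally need a null expectation, which this sketch does not provide.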