Reading between the LINEs: Human genomic variation induced by LINE-1 retrotransposition

2000 
Long interspersed element-1 (LINE-1) sequences are a large family of transposable elements found in the genomes of all mammals (Burton et al. 1986; Xiong and Eickbush 1990). They belong to the poly(A)-containing (also called the non-long-terminal-repeat) class of retrotransposons. The consensus human LINE-1 (L1Hs) is 6.0 kb long, contains two nonoverlapping reading frames, terminates in an A-rich tail, and is surrounded by a short (4–20 bp) duplication of non-LINE-1 (L1) sequence, the target site duplication (Fanning and Singer 1987). The human genome contains an estimated 10⁵ truncated and 4 × 10³ full-length L1Hs elements (Adams et al. 1980; Grimaldi et al. 1984; Hwu et al. 1986), which together constitute ∼15% of the genome (Smit 1996). The majority of L1Hs elements are not capable of transposition because they are truncated or rearranged or contain other significant mutations. Nevertheless, abundant evidence indicates that L1Hs transposition continues to occur. Several examples of recent de novo transposition events have been identified, largely as the result of mutations caused by the insertion of new L1Hs elements into functional genes (Kazazian et al. 1988; Woods-Samuels et al. 1989; Miki et al. 1992; Narita et al. 1993; Bleyl et al. 1994; Holmes et al. 1994). All but one of the newly transposed L1Hs sequences in the human genome belong to a subfamily of L1 elements called Ta (transcribed, subset a). This subfamily was first recognized as a group of expressed elements with a high degree of sequence identity to one another (Skowronski et al. 1988). The Ta subfamily of L1 elements (L1Hs-Ta) is characterized by the presence of the sequence ACA in the 3′ untranslated region at positions 5930–5932 (numbers refer to the actively transposing element LRE-1 [Dombroski et al. 1991]). Elements with the genomic L1Hs consensus sequence have a GAG sequence at this position (Skowronski et al. 1988). 
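The Ta diagnostic described above can be illustrated with a minimal Python sketch. It is not the method used in this study. It assumes the input sequence has already been placed in LRE-1 coordinates without gaps, which real, often truncated elements would first need an alignment to achieve. The function name `classify_l1` and its return labels are hypothetical.

```python
# Diagnostic 3' UTR trinucleotides named in the text (LRE-1 coordinates, 1-based).
TA_DIAGNOSTIC = "ACA"          # Ta subfamily
CONSENSUS_DIAGNOSTIC = "GAG"   # genomic L1Hs consensus
DIAG_START, DIAG_END = 5930, 5932  # inclusive, 1-based positions


def classify_l1(aligned_seq: str) -> str:
    """Classify an L1 sequence by its diagnostic trinucleotide.

    Assumption: aligned_seq is gap-free and in LRE-1 coordinates, so that
    string index i corresponds to LRE-1 position i + 1.
    """
    if len(aligned_seq) < DIAG_END:
        return "undetermined"  # sequence does not reach the diagnostic site
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    trinucleotide = aligned_seq[DIAG_START - 1:DIAG_END].upper()
    if trinucleotide == TA_DIAGNOSTIC:
        return "Ta"
    if trinucleotide == CONSENSUS_DIAGNOSTIC:
        return "consensus"
    return "other"


if __name__ == "__main__":
    # Toy sequence: filler bases with ACA placed at positions 5930-5932.
    toy = "N" * 5929 + "ACA" + "N" * 100
    print(classify_l1(toy))
```

A sequence that ends before position 5932 is reported as undetermined rather than guessed.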
Recent experiments have suggested that the human genome may contain 30–60 active L1Hs retrotransposons (Sassaman et al. 1997). The de novo insertion of a transposable element into the genome creates a new polymorphic genetic marker with a number of unique properties, as first described for Alu-insertion polymorphisms (Batzer and Deininger 1991; Perna et al. 1992; Deininger and Batzer 1993, 1995, 1999; Batzer et al. 1994, 1996; Stoneking et al. 1997). As with Alu-insertion polymorphisms, each L1Hs insertion represents a unique historic event. This is a result of the large number of potential target sites (theoretically equal to 3 × 10⁹, the number of base pairs in the human genome) for the integration of new mobile elements. Thus, there is an extremely low likelihood that two independent L1Hs insertions would land between the exact same base pairs, and in the unlikely event that this should occur, the two L1 elements would probably differ in length. Accordingly, individual loci bearing the same L1Hs insertion are identical by descent. In addition, the ancestral state of an L1Hs insertion is the absence of the element, because the direction of mutation is the insertion of a new mobile element into the genome. Orthologous loci in nonhuman primates may also be analyzed for the presence of the mobile element insertion to verify the ancestral state. Once inserted, most L1Hs elements are stable over long periods of time (Smit et al. 1995). In rare instances when a transposable element is deleted from the genome, the process is often imperfect, and a “footprint” of the original mobile element is left behind (Edwards and Gibbs 1992). 
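The identity-by-descent argument above rests on simple arithmetic, sketched below in Python. The sketch assumes every one of the 3 × 10⁹ theoretical target sites is equally likely and that insertions are independent. Both are idealizations, not claims about real L1 integration. The function names are hypothetical.

```python
import math

GENOME_SIZE_BP = 3e9  # theoretical number of target sites, as given in the text


def p_same_site(n_sites: float = GENOME_SIZE_BP) -> float:
    """Probability that one independent insertion hits one given site,
    assuming all sites are equally likely."""
    return 1.0 / n_sites


def p_site_reused(n_other_insertions: int, n_sites: float = GENOME_SIZE_BP) -> float:
    """Probability that at least one of n other independent insertions lands
    at the same site as a given insertion: 1 - (1 - 1/N)^n.

    log1p/expm1 keep the result accurate when 1/N is tiny.
    """
    return -math.expm1(n_other_insertions * math.log1p(-1.0 / n_sites))


if __name__ == "__main__":
    print(f"one other insertion:   {p_same_site():.3e}")
    # Illustrative n only: the text's estimate of 10^5 truncated L1Hs elements.
    print(f"10^5 other insertions: {p_site_reused(100_000):.3e}")
```

Even with 10⁵ other insertions, the chance that any of them shares the exact site stays on the order of a few in 100,000 under these assumptions.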
Finally, L1 transposition has been occurring in mammals for millions of years and continues to this day, suggesting that a series of dimorphic L1Hs insertions that have arisen throughout human evolution may be found in the present-day human population; loci that are dimorphic in a population via the presence or absence of an L1 element are called LINE-1 insertion dimorphisms (LIDs). Most other types of genetic markers do not share these properties (Batzer et al. 1994, 1996; Stoneking et al. 1997). Thus, dimorphic transposable elements, such as Alu or L1, have a number of unique, useful properties for the study of human population genetics. Previously, dimorphic Alu elements have been used to provide insights into human genetic diversity and evolution (Perna et al. 1992; Batzer et al. 1994, 1996; Hammer 1994; Novick et al. 1995, 1998; Tishkoff et al. 1996; Stoneking et al. 1997). Dimorphic L1 elements could present another potentially useful class of genetic polymorphisms if a large number of such dimorphisms could be readily identified. Here, we describe the identification of dimorphic LINE elements from the human genome using a method called L1 display to ascertain the LIDs. Using this approach we have identified six LIDs from six individuals of diverse geographic backgrounds. In addition, we developed PCR-based assays to genotype these six individual LIDs in 850 individuals from 14 worldwide populations. Our results show that evolutionarily young LIDs can be readily identified by the L1 display assay and that these elements are a novel source of genomic variation for the study of human population genetics and forensics.