GENCODE: massively expanding the lncRNA catalog through capture long-read RNA sequencing
Gazaldeep Kaur, Tamara Perteghella, Sílvia Carbonell Sala, José M. González, Toby Hunt, Tomasz Mądry, Irwin Jungreis, Carme Arnan, Julien Lagarde, Beatrice Borsari, Cristina Sisu, Yunzhe Jiang, Ruth Bennett, Andrew Berry, Daniel Cerdán-Vélez, Kelly Cochran, Covadonga Vara, Claire Davidson, Sarah Donaldson, Cagatay Dursun, Silvia González-López, Sasti Gopal Das, Matthew P. Hardy, Zoe Hollis, Mike Kay, José Carlos Montañés, Pengyu Ni, Ramil Nurtdinov, Emilio Palumbo, Carlos Pulido-Quetglas, Marie‐Marthe Suner, X. Yu, Dingyao Zhang, Jane Loveland, M. Mar Albà, Mark Diekhans, Andrea Tanzer, Jonathan M. Mudge, Paul Flicek, Fergal J. Martin, Mark Gerstein, M. Kellis, Anshul Kundaje, Benedict Paten, Michael L. Tress, Rory Johnson, Barbara Uszczyńska-Ratajczak, Adam Frankish, Roderic Guigó
Abstract:
Accurate and complete gene annotations are indispensable for understanding how genome sequences encode biological functions. For twenty years, the GENCODE consortium has developed reference annotations for the human and mouse genomes, becoming a foundation for biomedical and genomics communities worldwide. Nevertheless, collections of important yet poorly-understood gene classes like long non-coding RNAs (lncRNAs) remain incomplete and scattered across multiple, uncoordinated catalogs, slowing down progress in the field. To address these issues, GENCODE has undertaken the most comprehensive lncRNA annotation effort to date. This is founded on the manual annotation of full-length targeted long-read sequencing, on matched embryonic and adult tissues, of orthologous regions in human and mouse. Altogether 17,931 novel human genes (140,268 novel transcripts) and 22,784 novel mouse genes (136,169 novel transcripts) have been added to the GENCODE catalog, representing a 2-fold and 6-fold increase in transcripts, respectively: the greatest increase since the sequencing of the human genome. Novel gene annotations display evolutionary constraints, have well-formed promoter regions, and link to phenotype-associated genetic variants. They greatly enhance the functional interpretability of the human genome, as they help explain millions of previously-mapped “orphan” omics measurements corresponding to transcription start sites, chromatin modifications and transcription factor binding sites. Crucially, our targeted design assigned human-mouse orthologs at a rate beyond previous studies, tripling the number of human disease-associated lncRNAs with mouse orthologs. The expanded and enhanced GENCODE lncRNA annotations mark a critical step towards deciphering the human and mouse genomes.
Keywords:
Massive parallel sequencing
Massive parallel sequencing
Personal genomics
Ion semiconductor sequencing
Hybrid genome assembly
Cancer genome sequencing
Illumina dye sequencing
Single cell sequencing
Citations (3)
Massive parallel sequencing
Citations (1)
Our aim is to establish genetic diagnosis of congenital generalized lipodystrophy (CGL) using targeted massively parallel sequencing (MPS), also known as next-generation sequencing (NGS). Nine unrelated individuals with a clinical diagnosis of CGL were recruited. We used a customized panel to capture genes related to genetic lipodystrophies. DNA libraries were generated and sequenced using the Illumina MiSeq, and bioinformatics analysis was performed. An accurate genetic diagnosis was established for all nine patients. Four had pathogenic variants in AGPAT2 and three in BSCL2. Three large homozygous deletions in AGPAT2 were identified by copy-number variant analysis. Although we found allelic variants in only two genes related to CGL, the panel identified different types of variants, including deletions that would have been missed by Sanger sequencing. We believe that MPS is a valuable tool for the genetic diagnosis of multigene diseases, including CGL.
Sanger sequencing
Massive parallel sequencing
Genetic diagnosis
Citations (2)
Next-Generation Sequencing (NGS) originally referred to high-throughput, massively parallel sequencing methods that allow the sequencing of up to billions of small (50-1000 bp), amplified DNA fragments at the same time, but nowadays there are NGS techniques that determine the sequence of long (up to 50 kbp) single molecules. Over the past years, NGS technologies have become widely available, with increasing throughput and decreasing sequencing costs per base making them more cost-effective than the previously used capillary sequencing methods based on Sanger biochemistry. Nowadays, high-throughput DNA sequencing is routinely used in a wide range of important fields of biology and medicine, enabling large-scale sequencing projects such as the analysis of complete genomes, disease association studies, whole transcriptomes and methylomes, and providing new insights into complex biological systems. In addition, more and more NGS-based diagnostic tools are being introduced into clinical practice, for example in the fields of oncology, inherited and infectious diseases, and pre-implantation and prenatal genetic screening.
Massive parallel sequencing
Sanger sequencing
Single cell sequencing
Citations (3)
In this thesis I describe the steps taken to harness the power of massively parallel sequencing in neuroblastoma research. With running costs being prohibitive for experimental design in the early days of this technique, our primary focus was on optimizing the use of the sequencing platforms available. DNA target enrichment is a crucial technique in this process, as it allows researchers to focus their research question on the parts of the genome studied. We have developed a highly efficient, massively parallel, nanowell PCR-based capture platform. An optimal primer design as well as multiple improvements to the cycling process have resulted in industry-leading enrichment uniformity. This platform is now being commercialized by Wafergen Biosystems (USA) as a means of gene panel resequencing for both research and diagnostic applications. As genomics is taking a more prominent role in treatment decisions for cancer, such platforms may transform routine diagnostics.
The advent of massively parallel sequencing technology has shifted the challenge of genomics studies away from data generation and towards data analysis. Our efforts in the management of the huge data flows from massively parallel sequencing experiments have resulted in a cloud-based platform for exome and genome resequencing data analysis. Seqplorer is scalable and flexible and delivers the power of massively parallel sequencing data analysis tools through a user-friendly web interface.
Finally, I have applied exome sequencing to the study of neuroblastoma murine model systems. These model systems are of enormous value in pre-clinical drug testing. These analyses have confirmed that the murine neuroblastoma models mimic the human disease at the genomic level very well and have provided important new insights into globally deregulated miRNA processing in human neuroblastoma cells.
Massive parallel sequencing
Exome
Citations (0)
Massive parallel sequencing
Hybrid genome assembly
Personal genomics
Cancer genome sequencing
Pyrosequencing
Single cell sequencing
Citations (0)
DNA sequencing, starting with Sanger's chain termination method in 1977 and evolving into the next generation sequencing (NGS) techniques of today that employ massively parallel sequencing (MPS), has become essential in application areas such as biotechnology, virology, and medical diagnostics. As reflected by the growing number of articles published over the last 2-3 years, these techniques have also gained attention in the forensic field. This review contains a brief description of first, second, and third generation sequencing techniques, and focuses on the recent developments in human DNA analysis applicable in the forensic field. Their relevance to forensic analysis is that, besides generating standard STR profiles, DNA repeats can also be sequenced to look for polymorphisms. Furthermore, additional SNPs can be sequenced to acquire information on ancestry, paternity or phenotype. The current MPS systems are also very helpful in cases where only a limited amount of DNA or highly degraded DNA has been secured from a crime scene. If insufficient autosomal DNA is present, mitochondrial DNA can be sequenced for maternal lineage analysis. These developments clearly demonstrate that NGS will grow into an indispensable tool for forensic science.
Massive parallel sequencing
Sanger sequencing
Citations (158)
Massive parallel sequencing
Personal genomics
Citations (0)
The full arrival and broader availability of next-generation sequencing (NGS) is transforming the practice of medicine, including neurology. Compared with traditional one-gene-at-a-time Sanger sequencing, NGS, or massively parallel sequencing, is a radically different approach to genetic sequencing. NGS allows a large number of genes to be captured and sequenced in parallel, creating an enormous amount of data in a relatively short period of time at a much lower cost "per gene."
Sanger sequencing
Massive parallel sequencing
Citations (30)
Massive parallel sequencing
Ion semiconductor sequencing
Personal genomics
Hybrid genome assembly
Cancer genome sequencing
Single cell sequencing
Illumina dye sequencing
Citations (12)