Daniel Swan

Ipswich Hospital

Author Statistics

Papers

Citation

H-Index

i-10 index

Co-Authors

Philip Leder

Harvard University

Bastian Greshake Tzovaras

Université Paris Cité

Gustavo Glusman

Institute for Systems Biology

Mike Cariaso

SNPedia

Manuel Corpas

University of Cambridge

David A. Young

Newcastle University

Seymour Packman

University of California, San Francisco

Jong Bhak

Ulsan National Institute of Science and Technology

John C. Mathers

Newcastle University

Heinrich Matthaei

University of Bonn

Cooperative Institutions

Newcastle University

112

Sir Charles Gairdner Hospital

University College London

University of Birmingham

University of Newcastle Australia

National Institute for Health Research

Royal London Hospital

University of Auckland

Monash University

National Institutes of Health

A processed human immunoglobulin epsilon gene has moved to chromosome 9.

Proceedings of the National Academy of Sciences (1982)

Jim Battey, Edward E. Max, Wesley O. McBride, Daniel Swan, Philip Leder

Processed genes--genes that resemble processed RNA transcripts rather than interrupted genomic sequences--have been identified as dispersed members of several gene families. Here we describe a processed gene that is one of the three human IgE-like sequences present in the human genome. The processed IgE gene has precisely lost its three intervening sequences, thereby fusing its four coding domains. The homology of the gene to its functional counterpart ends in an adenine-rich tail followed by an 11-base-pair sequence that is directly repeated 150 base pairs 5' to its first coding domain. In addition, the processed gene is located on human chromosome 9 rather than on chromosome 14, the site of the active immunoglobulin locus. The structure and evident mobility of this sequence support the concept that sequences can move about in the genome via RNA intermediates and that processed genes are a prominent feature of genomic structure.

Homology

Coding region

Gene density

Gene prediction

10.1073/pnas.79.19.5956

Citations (84)

An approach to describing and analysing bulk biological annotation quality: a case study using UniProtKB

Bioinformatics (2012)

M. J. Bell, Colin S. Gillespie, Daniel Swan, Phillip Lord

Motivation: Annotations are a key feature of many biological databases, used to convey our knowledge of a sequence to the reader. Ideally, annotations are curated manually, however manual curation is costly, time consuming and requires expert knowledge and training. Given these issues and the exponential increase of data, many databases implement automated annotation pipelines in an attempt to avoid un-annotated entries. Both manual and automated annotations vary in quality between databases and annotators, making assessment of annotation reliability problematic for users. The community lacks a generic measure for determining annotation quality and correctness, which we look at addressing within this article. Specifically we investigate word reuse within bulk textual annotations and relate this to Zipf's Principle of Least Effort. We use the UniProt Knowledgebase (UniProtKB) as a case study to demonstrate this approach since it allows us to compare annotation change, both over time and between automated and manually curated annotations.

Results: By applying power-law distributions to word reuse in annotation, we show clear trends in UniProtKB over time, which are consistent with existing studies of quality on free text English. Further, we show a clear distinction between manual and automated analysis and investigate cohorts of protein records as they mature. These results suggest that this approach holds distinct promise as a mechanism for judging annotation quality.

Availability: Source code is available at the authors website: http://homepages.cs.ncl.ac.uk/m.j.bell1/annotation.

Contact: phillip.lord@newcastle.ac.uk

UniProt

Bantu languages

10.1093/bioinformatics/bts372

Citations (18)

Chromosomal location of human kappa and lambda immunoglobulin light chain constant region genes

The Journal of Experimental Medicine (1982)

O. Wesley McBride, P. Hieter, GF Hollis, Daniel Swan, MC Otey

The chromosomal location of human constant region light chain immunoglobulin (Ig) genes has been determined by analyzing a group of human fibroblast/rodent somatic cell hybrids with nucleic acid probes prepared from cloned human kappa and lambda constant region genes. Human chromosomes in each cell line were identified by isoenzyme analysis. The DNA from hybrid cells was digested with restriction endonucleases, size fractionated by gel electrophoresis, transferred to nitrocellulose or DBM paper, and hybridized with ³²P-labeled nucleic acid probes. The Cκ gene was assigned to human chromosome 2 and the Cλ genes to chromosome 22, based upon analysis of these hybrid cell lines, and these assignments were confirmed by analysis of subclones. A group of previously unassigned loci can be mapped to chromosome 2 by virtue of their close linkage to Cκ. The lambda and kappa light chain and heavy chain Ig genes have now been assigned to all three human chromosomes that are involved in translocations with chromosome 8 in human B cell neoplasms. These techniques and probes provide a means to study the detailed arrangement of human Ig genes and their pseudogenes.

Pseudogene

10.1084/jem.155.5.1480

Citations (292)

23andMe SNPs for which SNPedia annotations are available

Figshare (2012)

Gustavo Glusman, Mike Cariaso, Rafael C. Jiménez, Daniel Swan, Bastian Greshake Tzovaras

Each file contains all SNPs in the individual that match an annotated SNP in SNPedia. SNPedia annotations contain a magnitude value (a subjective measure of the importance of the potential phenotypical effect) and a phenotype description of the condition that a particular genotype affects.

10.6084/m9.figshare.92757

Citations (1)

23andMe SNP chip genotype data

Figshare (2012)

Gustavo Glusman, Mike Cariaso, Rafael C. Jiménez, Daniel Swan, Bastian Greshake Tzovaras

23andMe genotype data for Mother, Father, Son, Daughter and Aunt. Son is 23andMe version 2 data and the rest of the family are 23andMe version 3 data.

SNP

SNP genotyping

10.6084/m9.figshare.92682

Citations (2)

Organization of Immunoglobulin Genes: Reiteration Frequency of the Mouse κ Chain Constant Region Gene

Proceedings of the National Academy of Sciences (1974)

Tasuku Honjo, Seymour Packman, Daniel Swan, Marion M. Nau, Philip Leder

Hybridization kinetic analyses with synthetic DNA indicate that there are only two to three copies of the κ constant region gene per haploid genome. This result lends weight to the argument that the immunoglobulin light chain is encoded by more than one continuous gene sequence.

Immunoglobulin gene

10.1073/pnas.71.9.3659

Citations (75)

Induction of the mammalian node requires Arkadia function in the extraembryonic lineages

Nature (2001)

Vasso Episkopou, Ruth M. Arkell, Paula M. Timmons, James J. Walsh, Rebecca L. Andrew

Primitive streak

Embryonic Induction

10.1038/35071095

Citations (110)

Effects of Sirt1 on DNA methylation and expression of genes affected by dietary restriction

AGE (2012)

Laura Jane Ions, Luisa Wakeling, Helen Bosomworth, J. E. J. Hardyman, Suzanne M Escolme

10.1007/s11357-012-9485-8

Citations (30)

Identification of the pathogenic pathways in osteoarthritic hip cartilage: commonality and discord between hip and knee OA

Osteoarthritis and Cartilage (2012)

Yuqing Xu, Matt J. Barter, Daniel Swan, Kenneth S. Rankin, A.D. Rowan

Biological pathway

10.1016/j.joca.2012.02.319

Citations (0)

A study of the genetics of cholesteatoma through systematic review and whole exome sequencing

Barbara Jennings, Carl Philpott, Mahmood F. Bhutta, Gavin Willis, Daniel Swan

Introduction: A cholesteatoma is a mass of keratinising epithelium in the middle ear. It is a rare disorder, associated with significant morbidity. Its OMIM entry (#604183) cites minimal evidence for Mendelian inheritance, but we have observed 31 multiply affected families in Norfolk, including individuals with bilateral disease, suggesting a genetic component to its aetiology.

Methods: We conducted a systematic literature review (SR) to identify any published studies about the genetics of cholesteatoma and established a national biobank for subsequent whole exome sequencing (WES) studies of familial disease. We have also completed a pilot sequencing study to identify candidate variants that segregate with the disease phenotype (using NimbleGen exome capture and the Illumina HiSeq4000 platform).

Results: In our SR, we identified 8 case-series with multiply affected families and associations with congenital malformation syndromes. DNA and clinical data have been collected from 42 participants (from 9 multiply affected Norfolk families) to date. In 2018, participants will also be recruited from 10 additional UK centres. Our pilot WES study of 16 participants from 4 families identified 95,437 variants. Variant filtering, using pedigree analysis, has identified 430 candidate genes for further filtering using the Ensembl Variant Effect Predictor.

Conclusion: We have completed our SR (see PROSPERO register CRD42015023579) and established the first biobank to explore the genetics of cholesteatoma. A WES strategy and bioinformatics pipeline have been developed in the pilot study, and preliminary filtering has identified candidate variants that could have an impact on TGF-β signalling and inflammatory processes.

Candidate gene

Exome

Mendelian inheritance

Citations (0)