Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions, defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios. We identified a genome-wide significant 62-kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than among cleft palate (CP) and cleft lip (CL) cases. This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.
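The definition of a de novo deletion used above (missing in the proband, present in both parents) can be illustrated with a minimal interval sketch. This is not the calling pipeline of the study; the half-open coordinate convention, the function names `subtract` and `de_novo_deletions`, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a deletion is a half-open interval [start, end)
# in base pairs on a single chromosome.
Interval = Tuple[int, int]


def subtract(segment: Interval, blockers: List[Interval]) -> List[Interval]:
    """Return the parts of `segment` not covered by any interval in `blockers`."""
    start, end = segment
    pieces: List[Interval] = []
    cursor = start
    for b_start, b_end in sorted(blockers):
        if b_end <= cursor:
            continue  # blocker lies entirely before the uncovered part
        if b_start >= end:
            break  # blocker (and all later ones) lies after the segment
        if b_start > cursor:
            pieces.append((cursor, b_start))
        cursor = max(cursor, b_end)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


def de_novo_deletions(
    child: List[Interval], father: List[Interval], mother: List[Interval]
) -> List[Interval]:
    """Segments deleted in the child but deleted in neither parent."""
    parental = father + mother
    result: List[Interval] = []
    for seg in child:
        result.extend(subtract(seg, parental))
    return result


# Toy coordinates, invented for illustration only.
child = [(100, 200), (500, 600)]
father = [(150, 180)]
mother = [(550, 700)]
candidates = de_novo_deletions(child, father, mother)
```

Any part of a child deletion that overlaps a deletion in either parent is treated as inherited and removed, so only segments present in both parents remain as de novo candidates.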
Multiple studies have suggested that nonsyndromic cleft lip with or without cleft palate (NSCL/P) and lung cancer may share a common genetic etiology. Previous studies have shown that genetic variants in nicotinic cholinergic receptor genes (CHRNs) may influence risk of lung cancer. We aimed to explore the effect of CHRNs on risk of NSCL/P, considering gene-gene (GxG) interaction for these genes. We selected 120 markers in 14 CHRNs to test for GxG interaction using 806 Chinese case-parent trios recruited from an international consortium established for a GWAS of oral clefts. In total, two pairs of SNPs yielded significant GxG interactions after Bonferroni correction (rs935865 and rs2337980 with p = 4.04 × 10⁻⁵; rs2741335 and rs3743077 with p = 4.80 × 10⁻⁴), and these pairwise interactions were confirmed in permutation tests. In addition, the relative risk (RR) of the putative interaction between rs935865 and rs2337980 was 1.10 (95% CI: 0.92–1.31). While the single-SNP association and gene-environment interaction analyses of the 14 CHRN genes yielded no signal, this study did demonstrate the importance of considering potential GxG interaction when exploring the etiology of NSCL/P. This study suggests an important role for particular combinations of SNPs in CHRN genes in influencing risk of NSCL/P, which needs further study.
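As a reading aid for the reported relative risk, the sketch below back-calculates the standard error of log(RR) from the published interval. It assumes a Wald-type 95% interval that is symmetric on the log scale, which the abstract does not state; the function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(rr: float, lower: float, upper: float, z_crit: float = 1.96):
    """Back-calculate SE, z, and a two-sided normal p-value from an RR and its CI.

    Assumption: the interval is exp(log(rr) +/- z_crit * se).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values taken from the abstract: RR = 1.10, 95% CI 0.92 to 1.31.
se, z, p = log_scale_summary(1.10, 0.92, 1.31)
interval_excludes_one = 0.92 > 1.0 or 1.31 < 1.0
```

The geometric midpoint of the interval, `math.sqrt(0.92 * 1.31)`, is close to 1.10, which is consistent with the log-scale symmetry assumed here.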
Abstract Motivation: High-order interactions of single nucleotide polymorphisms (SNPs), rather than individual SNPs, are assumed to be responsible for complex diseases such as cancer. Therefore, one of the major goals of genetic association studies concerned with such genotype data is the identification of these high-order interactions. This search is additionally impeded by the fact that these interactions are often explanatory for only a relatively small subgroup of patients. Most of the feature selection methods proposed in the literature, unfortunately, fail at this task, since they can either identify only individual variables or low-order interactions, or try to find rules that are explanatory for a high percentage of the observations. In this article, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method, called GPAS, can be used not only for feature selection but also for discrimination. Results: In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS is able to identify high-order interactions of SNPs leading to a considerably increased breast cancer risk for different subsets of patients that are not found by other feature selection methods. As an application to a subset of the HapMap data shows, GPAS is not restricted to association studies comprising several tens of SNPs, but can also be employed to analyze whole-genome data. Availability: Software can be downloaded from http://ls2-www.cs.uni-dortmund.de/~nunkesser/#Software Contact: robin.nunkesser@uni-dortmund.de
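To make the idea of a high-order interaction expressed in multi-valued logic concrete, the sketch below evaluates a logic expression over three-valued genotypes (0, 1, 2). It is not the GPAS software and omits the genetic programming search entirely; the encoding of an expression as an OR of AND-ed literals, the type names, and the toy genotypes are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding of a literal: (SNP index, genotype value, must_equal).
# (0, 2, True) reads "SNP0 == 2"; (2, 0, False) reads "SNP2 != 0".
Literal = Tuple[int, int, bool]
Monomial = List[Literal]      # literals joined by AND
Expression = List[Monomial]   # monomials joined by OR


def literal_holds(genotypes: Sequence[int], lit: Literal) -> bool:
    idx, value, must_equal = lit
    return (genotypes[idx] == value) == must_equal


def expression_holds(genotypes: Sequence[int], expr: Expression) -> bool:
    return any(all(literal_holds(genotypes, lit) for lit in mono) for mono in expr)


def coverage(samples: List[Sequence[int]], expr: Expression) -> float:
    """Fraction of samples for which the expression is true."""
    return sum(expression_holds(g, expr) for g in samples) / len(samples)


# (SNP0 == 2 AND SNP1 == 0) OR (SNP1 == 1 AND SNP2 != 0)
expr: Expression = [[(0, 2, True), (1, 0, True)], [(1, 1, True), (2, 0, False)]]

# Toy genotypes, invented for illustration only.
cases = [[2, 0, 1], [2, 0, 0], [0, 1, 1], [1, 1, 2]]
controls = [[0, 0, 0], [1, 0, 1], [2, 1, 0], [0, 2, 2]]

case_coverage = coverage(cases, expr)
control_coverage = coverage(controls, expr)
```

Each monomial can describe a different patient subgroup, which is how an expression of this shape can capture interactions that are explanatory for only part of the cases.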
We performed a systematic analysis of gene expression features in early (10–21 days) development of human vs mouse embryonic cells (hESCs vs mESCs). Many development features were found to be conserved, and a majority of differentially regulated genes show similar expression changes in both organisms. The similarity is especially evident when gene expression profiles are clustered together and properties of the clustered groups of genes are compared. The first 10 days of mESC development match the features of hESC development within 21 days, in accordance with the differences in population doubling time in human and mouse ESCs. At the same time, several important differences are seen. There is a clear difference in the initial expression change of transcription factors and stimulus-responsive genes, which may be caused by the difference in experimental procedures. However, we also found that some biological processes develop differently; this can clearly be shown, for example, for neuron and sensory organ development. Some groups of genes show peaks in expression levels during development, and these peaks cannot be claimed to occur at the same time points in the two organisms, or for the same groups of (orthologous) genes. We also detected a larger number of upregulated genes during development of mESCs as compared to hESCs. The differences were quantified by comparing promoters of related genes. Most gene groups behave similarly and have similar transcription factor (TF) binding sites in their promoters. A few groups of genes have similar promoters but are expressed differently in the two species. Interestingly, there are groups of genes that are expressed similarly although they have different promoters, which can be shown by comparing their TF binding sites. Namely, a large group of similarly expressed cell cycle-related genes is found to have discrepant TF binding properties in mouse vs human.
Recently, genome-wide association studies have identified and validated genetic variations associated with urinary bladder cancer (UBC). However, it is still unknown whether the high-risk alleles of several SNPs interact with one another, leading to an even higher disease risk. Additionally, there is no information available on how the UBC risk due to these SNPs compares to the risk of cigarette smoking and of occupational exposure to urinary bladder carcinogens, or on whether the same or different SNP combinations are relevant in smokers and non-smokers. To address these questions, we analyzed the genotypes of six SNPs previously found to be associated with UBC, together with the GSTM1 deletion, in 1,595 UBC cases and 1,760 controls, stratified for smoking habits. We identified the strongest interactions of different orders and tested the stability of their effect by bootstrapping. We found that different SNP combinations were relevant in smokers and non-smokers. In smokers, polymorphisms involved in detoxification of cigarette smoke carcinogens were most relevant (GSTM1, rs11892031), whereas in non-smokers polymorphisms near MYC and APOBEC3A (rs9642880, rs1014971) were the most influential. Stable combinations of up to three high-risk alleles resulted in higher odds ratios (OR) than the individual SNPs, although the interaction effect was less than additive. The highest stable combination effects resulted in an OR of about 2.0, which is still lower than the ORs of cigarette smoking (here, current smokers' OR: 3.28) and comparable to occupational carcinogen exposure risks which, depending on the workplace, mostly show ORs up to 2.0.
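One way to check the stability of a combination effect by bootstrapping is sketched below. The abstract does not describe the exact resampling scheme, so this is a generic percentile bootstrap of the odds ratio for carrying a given allele combination; the function names, the continuity correction, and the counts are assumptions, and the data are synthetic.

```python
import random
from typing import List, Tuple

Sample = Tuple[bool, bool]  # (is_case, carries_combination)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR from a 2x2 table: a, b = carrier/non-carrier cases; c, d = same for controls.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that resamples
    with an empty cell do not divide by zero.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


def or_from_samples(samples: List[Sample]) -> float:
    a = sum(1 for case, carrier in samples if case and carrier)
    b = sum(1 for case, carrier in samples if case and not carrier)
    c = sum(1 for case, carrier in samples if not case and carrier)
    d = sum(1 for case, carrier in samples if not case and not carrier)
    return odds_ratio(a, b, c, d)


def bootstrap_or(samples: List[Sample], n_boot: int = 1000, seed: int = 0):
    """Point estimate plus a 95% percentile interval from resampling individuals."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = sorted(
        or_from_samples([samples[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return or_from_samples(samples), lower, upper


# Synthetic counts, not taken from the study.
data: List[Sample] = (
    [(True, True)] * 60 + [(True, False)] * 140
    + [(False, True)] * 35 + [(False, False)] * 165
)
point, lower, upper = bootstrap_or(data)
```

A combination whose interval stays clearly above 1 across resamples would be regarded as stable under this scheme; a narrow interval around the point estimate indicates that the effect does not hinge on a few individuals.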
Abstract Identifying associations between interindividual variability in brain structure and behaviour requires large cohorts, multivariate methods, out-of-sample validation and, ideally, out-of-cohort replication. Moreover, the influence of nature vs nurture on brain-behaviour associations should be analysed. We analysed associations between brain structure (grey matter volume, cortical thickness, and surface area) and behaviour (spanning cognition, emotion, and alertness) using regularized canonical correlation analysis and a machine learning framework that tests the generalisability and stability of such associations. The replicability of brain-behaviour associations was assessed in two large, independent cohorts. The load of genetic factors on these associations was analysed with heritability and genetic correlation. We found one heritable and replicable latent dimension linking cognitive-control/executive-functions and positive affect to brain structural variability in areas typically associated with higher cognitive functions, and with areas typically associated with sensorimotor functions. These results revealed a major axis of interindividual behavioural variability linking to a whole-brain structural pattern.