A fundamental principle guiding the publication of scientific results is that the data supporting any scholarly work must be made fully available to the research community, in a form that allows the basic conclusions to be evaluated independently. In the context of molecular biology, this has typically meant that authors of a paper describing a newly sequenced genome, gene, or protein must deposit the primary data in a permanent, public data repository, such as the sequence databases maintained by the DNA Data Bank of Japan (DDBJ), European Bioinformatics Institute (EBI), and National Center for Biotechnology Information (NCBI). Similarly, we, members of the Microarray Gene Expression Data Society (MGED; http://www.mged.org), believe that all scholarly scientific journals should now require the submission of microarray data to public repositories as part of the process of publication. While some journals have already made this a condition of acceptance, we feel that submission requirements should be applied consistently and that journals should recognize ArrayExpress (Brazma et al. 2003), Gene Expression Omnibus (GEO) (Edgar et al. 2002), and the Center for Information Biology Gene Expression Database (CIBEX) (Ikeo et al. 2003) as acceptable public repositories.
To this end, the members of MGED propose the following as a new paradigm for the publication of microarray-based studies. (1) Authors should continue to take primary responsibility for ensuring that all data collected and analyzed in their experiments adhere to the “Minimum Information about a Microarray Experiment” (MIAME) guidelines and should continue to use the MIAME checklist (www.mged.org/Workgroups/MIAME/miame_checklist.html) as a means of achieving this goal. (2) Scientific journals should require that all primary microarray data are submitted to one of the public repositories—ArrayExpress, GEO, or CIBEX—in a format that complies with the MIAME guidelines. (3) Public databases should work with authors and scientific journals to establish data submission and release protocols to assure compliance with MIAME guidelines. (4) To assist with the review process, the databases should continue to work in collaboration with publishers to provide qualified referees with secure means of accessing prepublication data. Authors should be strongly encouraged to submit data to the databases during review.
Naturally, data should be protected from general release prior to either publication or authorization from the data submitters, whichever comes first. At a minimum, journals should require valid accession numbers for microarray data as a condition of publication, and these accession numbers should be included in the text of the manuscript so that members of the community can find and access the underlying data.
Since its inception in 1999, MGED has been working with the broader scientific community to establish standards for the exchange and annotation of microarray data. In December 2001, we proposed the MIAME guidelines (Brazma et al. 2001) and requested that interested parties provide feedback on their relevance and utility. The feedback from both researchers and scientific journals was overwhelmingly positive, yet almost everyone who responded also asked for help in implementing these guidelines.
Subsequently, in the summer of 2002, we submitted an open letter to various journals (e.g., Ball et al. 2002a, 2002b) urging the community to adopt the MIAME requirements for microarray data publication. We provided a checklist so that authors could ensure that sufficient information to allow their data to be re-analyzed by others would be available. Again, the response from the community was extremely positive, and most of the major scientific journals now require publications describing microarray experiments to comply with the MIAME standards. While the adoption of these standards has greatly improved the accessibility of microarray data, much of it remains on individual authors' websites in a variety of formats; consequently, obtaining and comparing datasets remains a significant challenge. Clearly we need additional requirements for publication that include submission of expression data to public data repositories.
Though one might ask why this requirement was not part of the original MIAME recommendation, the answer is quite simple: MIAME was ahead of its time. While NCBI and the EBI had developed nascent microarray data repositories, and work was underway to create a similar database at the DDBJ, submitting data to these databases was a considerable burden for authors. Since that time, however, the barriers to data submission have been lowered by improvements in the data-entry utilities available for GEO (www.ncbi.nlm.nih.gov/geo), ArrayExpress (www.ebi.ac.uk/arrayexpress), and CIBEX (cibex.nig.ac.jp), as well as by a growing number of commercial and academic software packages capable of writing MAGE-ML documents (Spellman et al. 2002) that can be directly submitted to these public databases. The barriers are now low enough that we as a community must reconsider whether submission to one of these databases should be a requirement.
Requiring authors to submit microarray data to a public database will provide a number of distinct advantages to the entire research community. (1) These established repositories have a commitment to continued community service and to providing some level of assurance that published gene expression datasets will continue to be available into the future. (2) Having the data available in these public repositories in a standardized format will not only make them more accessible, but will also allow expression data to be integrated with other relevant data, including the available genome sequences, single nucleotide polymorphism and haplotype mapping information, the literature, and other resources that can aid in further interpretation of expression patterns. Although many authors now provide some or all of this information, the established databases are much more likely to assure that these links are maintained and current. (3) Curation of data submitted to public data repositories will assist authors, reviewers, and publishers in assuring that the data comply with the MIAME requirements, further enhancing their utility. (4) The standardization of microarray data formats will enable the development of additional data analysis and integration tools and make it easier for scientists to access, query, and share data. (5) Finally, submission prior to publication will make it easier for referees to access the data confidentially, facilitating the review and publication process.
In the same way that availability of sequence data had a profound impact on a wide range of disciplines, we believe that requiring that microarray data be deposited in public repositories as a necessity for publication will accelerate the rate of scientific discovery.
What this proposal requires is a change in the way in which we approach the publication of microarray-based studies. Both authors and journals have a responsibility to assure that the requisite data are available, and because submitting MIAME-compliant data can take considerable time and effort, this process should be factored into review and publication timelines. However, while this process may be time consuming and painful at first, we believe that the benefits of building an open repository of microarray data will far outweigh any initial disadvantages. As always, it is our sincere hope that these suggestions stimulate discussion within the community and that together we can arrive at a consensus that ensures that microarray data are widely and easily accessible. Finally, we would like to urge the DDBJ, EBI, and NCBI to work together towards exchanging all MIAME-compliant microarray data.
An essential component of functional genomics studies is the sequence of DNA expressed in tissues of interest. To provide a resource of bovine-specific expressed sequence data and facilitate this powerful approach in cattle research, four normalized cDNA libraries were produced and arrayed for high-throughput sequencing. The libraries were made with RNA pooled from multiple tissues to increase efficiency of normalization and maximize the number of independent genes for which sequence data were obtained. Target tissues included those most likely to affect the production parameters of animal health, growth, reproductive efficiency, and carcass merit. Success of normalization and inter- and intralibrary redundancy were assessed by collecting 6,000–23,000 sequences from each of the libraries (68,520 total sequences deposited in GenBank). Sequence comparison and assembly of these sequences were performed in combination with 56,500 other bovine EST sequences present in the GenBank dbEST database to construct a cattle Gene Index (available from The Institute for Genomic Research at http://www.tigr.org/tdb/tgi.shtml). The 124,381 bovine ESTs present in GenBank at the time of the analysis form 16,740 assemblies that are listed and annotated on the Web site. Analysis of individual library sequence data indicates that the pooled-tissue approach was highly effective in preparing libraries for efficient deep sequencing.
Spinal and bulbar muscular atrophy (SBMA) is caused by polyglutamine (polyQ) expansions in androgen receptor (AR), generating gain-of-function toxicity that may involve phosphorylation. Using cellular and animal models, we investigated which kinases and phosphatases target polyQ-expanded AR, whether polyQ expansions modify AR phosphorylation, and how this contributes to neurodegeneration. Mass spectrometry showed that polyQ expansions preserve native phosphorylation and increase phosphorylation at conserved sites controlling AR stability and transactivation. In small-molecule screening, we found that CDC25/CDK2 signaling could enhance AR phosphorylation, whereas the calcium-sensitive phosphatase calcineurin had the opposite effect. Pharmacologic and genetic manipulation of these kinases and phosphatases modified polyQ-expanded AR function and toxicity in cells, flies, and mice. Ablation of CDK2 reduced AR phosphorylation in the brainstem and restored expression of Myc and other genes involved in DNA damage, senescence, and apoptosis, indicating that the cell cycle–regulated kinase plays more than a bystander role in SBMA-vulnerable postmitotic cells.
It would seem to be an unnecessary and superfluous task to present a paper on this subject before an assembly of specialists in nasal, aural and throat diseases, but as all our knowledge in medicine is the result of the aggregate experience of different observers, I would crave your indulgence whilst I add my mite to the general fund. Much has been written of late years on Luschka's tonsil, adenoid tissue, lymphoid vegetations, etc., and whilst the fact that deafness sometimes results, especially in children, from any excessive accumulation of this formation, no particular stress has been laid on the active influence it exerts in the production and aggravation of naso-aural and naso-pharyngeal troubles, even when present in a slight degree only. Whilst it may be true, as stated by so many observers, that the presence of glandular hypertrophy in the post-nasal space is the frequent result of a previously
Despite substantial progress in sequencing, current strategies can genetically solve only approximately 55–60% of inherited retinal degeneration (IRD) cases. This can be partially attributed to elusive mutations in the known IRD genes, which are not easily identified by targeted next-generation sequencing (NGS) or Sanger sequencing approaches. We hypothesized that copy-number variations (CNVs) are a major contributor to the elusive genetic causality of IRDs. Twenty-eight cases previously unsolved with targeted NGS were investigated with whole-genome single-nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) arrays. Deletions in the IRD genes were detected in 5 of 28 families, including a de novo deletion. We suggest that the de novo deletion occurred through nonallelic homologous recombination (NAHR) and we constructed a genomic map of NAHR-prone regions with overlapping IRD genes. In this article, we also report an unusual case of recessive retinitis pigmentosa due to compound heterozygous mutations in SNRNP200, a gene that is typically associated with the dominant form of this disease. CNV mapping substantially increased the genetic diagnostic rate of IRDs, detecting genetic causality in 18% of previously unsolved cases. Extending the search to other structural variations will probably demonstrate an even higher contribution to genetic causality of IRDs. Genet Med advance online publication 13 October 2016.