The FAIR guiding principles aim to enhance the Findability, Accessibility, Interoperability and Reusability of digital resources such as data, for both humans and machines. The process of making data FAIR (“FAIRification”) can be described in multiple steps. In this paper, we describe a generic step-by-step FAIRification workflow to be performed in a multidisciplinary team guided by FAIR data stewards. The FAIRification workflow should be applicable to any type of data and has been developed and used for “Bring Your Own Data” (BYOD) workshops, as well as for the FAIRification of, e.g., rare disease resources. The steps are: 1) identify the FAIRification objective, 2) analyze data, 3) analyze metadata, 4) define a semantic model for data (4a) and metadata (4b), 5) make data (5a) and metadata (5b) linkable, 6) host FAIR data, and 7) assess FAIR data. For each step we describe how the data are processed, what expertise is required, which procedures and tools can be used, and which FAIR principles they relate to.
Background: The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with that, prior literature suggests organising ontologies in levels, where domain-specific (low-level) ontologies are grounded in domain-independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in bioinformatics, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping that was conducted to understand how foundational ontologies are used in bioinformatics and to find empirical evidence supporting their claimed (dis)advantages. Results: From a set of 79 selected papers, we identified that foundational ontologies are used for several purposes in bioinformatics: ontology construction, repair, mapping, and ontology-based (federated) data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite being used for several purposes, there were hardly any experiments (1 paper) testing the claims for or against the use of foundational ontologies. In the subset of 49 papers that describe the development of an ontology, we observed low adherence to formal methods for ontology construction (16 papers) and ontology evaluation (4 papers). Conclusion: Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need for evaluating the use of such artefacts in bioinformatics.
Second, the low adherence to formal methods illustrates how the bioinformatics field could benefit from a more systematic approach when dealing with the development and evaluation of ontologies. The understanding of how foundational ontologies are used in bioinformatics can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can impact the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.
Teaching communication skills plays a pivotal role in medical curricula. The aim of this article is to describe and evaluate a new communication curriculum developed at the Faculty of Medicine, University of Augsburg (KomCuA), which was conceptualized by an interdisciplinary team based on recommended quality standards (i.e., helical, integrated, longitudinal).
Last year I reported on the results of a first annual questionnaire sent out to all collaborators in the Flora Malesiana network in early 1993, asking for information on progress, possible bottlenecks, and the expected date of completion of the manuscript. The present report deals with the results of the second questionnaire, which was circulated in early 1994. The response was somewhat lower than for the first one (c. 80% versus 85%).
Polycystic kidney disease (PKD) is a major cause of end-stage renal disease. The disease mechanisms are not well understood and the pathogenesis toward renal failure remains elusive. In this study, we present the first RNASeq analysis of a Pkd1-mutant mouse model in a combined meta-analysis with other published PKD expression profiles. We introduce the PKD Signature, a set of 1,515 genes that are commonly dysregulated in PKD studies. We show that the signature genes include many known and novel PKD-related genes and functions. Moreover, genes with a role in injury repair, as evidenced by expression data and/or automated literature analysis, were significantly enriched in the PKD Signature, with 35% of the PKD Signature genes being directly implicated in injury repair. NF-κB signaling, epithelial-mesenchymal transition, inflammatory response, hypoxia, and metabolism were among the most prominent injury or repair-related biological processes with a role in the PKD etiology. Novel PKD genes with a role in PKD and in injury were confirmed in another Pkd1-mutant mouse model as well as in animals treated with a nephrotoxic agent. We propose that compounds that can modulate the injury-repair response could be valuable drug candidates for PKD treatment.
The average cellular positions of the ftsQAZ region (2 min) and the minB region (26.5 min) during the cell cycle were determined by fluorescent in situ hybridization using the position of oriC as a reference point. At the steady‐state growth conditions used, newborn cells had replicated about 50% of the chromosome. By measuring the distances of the labelled oriCs with respect to mid‐cell, we found two well‐separated average oriC positions in cells of newborn length. These average oriC positions moved further apart along with cell elongation. The cellular position of the ftsQAZ gene region resembled the position of oriC, although its average position was closer to mid‐cell. In contrast, a single minB focus was observed at cell birth. Separated minB foci appeared towards the end of DNA replication. The average positions of oriC, ftsQAZ and minB relative to each other fitted a model in which DNA replication takes place in the cell centre and subsequent gene regions pass sequentially through this centre. We have interpreted the polarized orientation of the studied gene regions as a consequence of the mode of DNA segregation.
Annotations of blood modules associated with HD. In this file we include detailed information about the annotations per semantic type of each module in blood that is associated with HD. (XLS 37 kb)
Patient registries are an essential tool to increase current knowledge regarding rare diseases. Understanding these data is a vital step to improve patient treatments and to create the most adequate tools for personalized medicine. However, the growing number of disease-specific patient registries also brings new technical challenges. Usually, these systems are developed as closed data silos, with independent formats and models, lacking comprehensive mechanisms to enable data sharing. To tackle these challenges, we developed a Semantic Web based solution that allows connecting distributed and heterogeneous registries, enabling the federation of knowledge between multiple independent environments. This semantic layer creates a holistic view over a set of anonymised registries, supporting semantic data representation, integrated access, and querying. The implemented system gave us the opportunity to answer challenging questions across dispersed rare disease patient registries. Interconnecting those registries with Semantic Web technologies lets us query single or multiple instances according to our needs. The outcome is a unique semantic layer, connecting miscellaneous registries and delivering a lightweight holistic perspective over the wealth of knowledge stemming from linked rare disease patient registries.
A systematic way of recording data use conditions that are based on consent permissions as found in the datasets of the main public genome archives (NCBI dbGaP and EMBL-EBI/CRG EGA).