Compared with other pulmonary function tests, there is a lack of standardization regarding how a maximum voluntary ventilation (MVV) maneuver is performed. In particular, little is known about the variation in breathing frequency (fR) and its potential impact on the accuracy of test results. This study examines the effect of several preselected values of fR and one self-selected fR (fRself) on MVV. Ten participants performed MVV maneuvers at fR values ranging from 50 to 130 breaths·min−1 in 10 breaths·min−1 increments and at one fRself. Three identical trials with 2-min rest periods were conducted at each fR, and the order in which the fR values were tested was randomized. Ventilation and related parameters were measured directly by gas exchange analysis using a metabolic measurement system. A third-order polynomial regression showed that MVV = −0.0001(fR)³ + 0.0258(fR)² − 1.38(fR) + 96.9 at preselected fR, with MVV increasing up to approximately 100 breaths·min−1 (r² = 0.982, P < 0.001). Paired t-tests indicated that the average MVV values obtained at all preselected fR values, but not at fRself, were significantly lower than the average maximum value across all participants. A linear regression revealed that tidal volume (VT) = −2.63(MVV) + 300.4 at preselected fR (r² = 0.846, P < 0.001); however, this inverse relationship between VT and MVV did not hold for the self-selected fR. The VT obtained at fRself (90.9 ± 19.1% of maximum) was significantly greater than the VT associated with the most similar MVV value (at a preselected fR of 100 breaths·min−1, 62.0 ± 10.4% of maximum; 95% confidence interval of the difference: 17.5–40.4%, P < 0.001). This study demonstrates the shortcomings of the current lack of standardization in MVV testing and establishes data-driven recommendations for optimal fR. The true MVV was obtained with a self-selected fR (mean ± SD: 69.9 ± 22.3 breaths·min−1) or within a preselected fR range of 110–120 breaths·min−1.
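Taking the reported regression coefficients at face value, the two fitted equations can be evaluated directly. This is a minimal sketch: the function names are ours, and (given the percent-of-maximum values quoted for VT) both variables in the linear fit appear to be expressed as a percentage of each participant's maximum, so units should be interpreted cautiously.

```python
def mvv_from_fr(fr):
    """Reported third-order polynomial fit of MVV as a function of the
    preselected breathing frequency fR (breaths/min)."""
    return -0.0001 * fr**3 + 0.0258 * fr**2 - 1.38 * fr + 96.9


def vt_from_mvv(mvv):
    """Reported linear fit of tidal volume (VT) against MVV at the
    preselected frequencies; both apparently in % of maximum."""
    return -2.63 * mvv + 300.4


# Evaluate the fits over the tested frequency range (50-130 breaths/min).
for fr in range(50, 131, 10):
    mvv = mvv_from_fr(fr)
    print(f"fR = {fr:3d} breaths/min -> MVV = {mvv:6.1f}, VT = {vt_from_mvv(mvv):6.1f}")
```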
Until a comprehensive reference equation is established, it is advised that MVV be measured directly using these guidelines. If an individual is unable to perform the maneuver, or performs it poorly, at a self-selected fR, ventilating within a mandated fR range of 110–120 breaths·min−1 may also be acceptable.
Many studies in medical and biological genetics and functional genomics include genome-wide analysis. Because cellular functions are coordinated, the behavior of groups of genes, rather than of single genes, can be more informative in these studies. Experimental and technical developments now allow genome-wide measurement of many molecular components of the cell, including mRNA transcripts (1), DNA sequence (2–5) and structure (6–10), DNA binding by transcriptional regulators (11), microRNAs, proteins, and metabolites (12). For each type of data, analysis software has been developed, much of it available within the R/Bioconductor framework (13).
A major issue remains for data mining or statistical inference on high-throughput data due to the “curse of dimensionality” arising from the tens of thousands of molecular components generally being measured in only tens or hundreds of conditions. A logical approach to this problem is the use of Bayesian statistics (14), where prior information developed from many years of targeted biological studies can be used to reduce the search space during model fitting.
For many analyses, several steps are required for data processing, from image acquisition and processing through normalization to data mining or statistical inference. Often, it is necessary to create a pipeline for the analysis. The ideal pipeline would allow the integration of prior knowledge and potentially the use of measurements in one molecular domain to guide inference in another. For instance, genes known a priori to function in parallel redundant pathways may be more likely to show genetic interactions in a genome-wide association study (GWAS). Alternatively, genes that share transcription factor binding determined by ChIP-seq measurements may be more likely to show correlated expression. The Bayesian framework is quite natural for data exchange in this case, especially for programs that handle different forms of gene-related information and different representations of the data.
XML (Extensible Markup Language) was invented in the late 1990s (15) as a way to represent documents in a machine-readable hypertext form. The represented information is organized as a tree, and a predefined description of the tree allows the data to be validated. The tree nodes are XML elements, and elements can be nested: if a node A is a child of node B, the element corresponding to B contains the one corresponding to A. Each element belongs to a type, and the list of types and their possible relations is the essence of the description (the XML schema) mentioned above. Each schema corresponds to a particular data type, e.g. a book, an image, or a worksheet. Over the past decade, XML has become the most common format for data exchange on the Internet.
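The parent/child nesting described above can be illustrated with a toy document (a book with a title and an author); this example is purely illustrative and is not one of the biological formats discussed below.

```python
import xml.etree.ElementTree as ET

# The <book> element is the parent node; <title> and <author> are its
# children, so the serialized <book> element textually contains them.
book = ET.Element("book")
ET.SubElement(book, "title").text = "Example"
ET.SubElement(book, "author").text = "A. Writer"

xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)
# <book><title>Example</title><author>A. Writer</author></book>
```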
Current bioinformatics practice uses a large variety of XML-based languages that describe different data types (for a review, see (15)). We mention a few that are most applicable to this domain. XEMBL (16) is an XML format for EMBL data. CisML (17) and SmallBisMark (18) are for sequence motif information such as transcription factor binding sites, while MAGE-ML (19, 20) is intended for microarray metadata representation. SBML (21) and CellML (22) capture biological network models, and MFAML (23) describes metabolic fluxes.
In addition, there are XML formats (24, 25) that represent Bayesian information in a very general form. However, we require a format for Bayesian information that is suited to biological systems but is not too specialized, unlike the biological XMLs noted above. Our goal is an XML format that encodes relationships as probabilities of interactions for the purposes of genetics and bioinformatics, with the interpretation of the message depending on the context of the parser. This will permit the interchange of probabilistic information between bioinformatics frameworks that address different aspects of genomics knowledge.
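A hypothetical sketch of the kind of message such a format might carry is a prior probability that two genes interact. All element and attribute names below (`interaction`, `gene`, `probability`) are illustrative only and are not part of any published schema.

```python
import xml.etree.ElementTree as ET

# Illustrative message: two genes and the prior probability that they
# interact; a consuming framework would interpret this in its own context.
doc = """
<interaction>
  <gene id="YFG1"/>
  <gene id="YFG2"/>
  <probability>0.83</probability>
</interaction>
"""

root = ET.fromstring(doc)
genes = [g.get("id") for g in root.findall("gene")]
p = float(root.findtext("probability"))
print(genes, p)
# ['YFG1', 'YFG2'] 0.83
```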
Abstract In the context of epilepsy studies, intracranially recorded interictal high-frequency oscillations (HFOs) in EEG signals are emerging as promising spatial neurophysiological biomarkers for epileptogenic zones. While significant efforts have been made in identifying and understanding these biomarkers, deep learning is carving novel avenues for biomarker detection and analysis. Yet, transitioning such methodologies to clinical environments is difficult due to the rigorous computational needs of processing EEG data via deep learning. This paper presents our development of an advanced end-to-end software platform, PyHFO, aimed at bridging this gap. PyHFO provides an integrated and user-friendly platform that includes time-efficient HFO detection algorithms, such as the short-term energy (STE) and Montreal Neurological Institute and Hospital (MNI) detectors, and deep learning models for artifact and HFO-with-spike classification. The application runs on conventional computer hardware. Our platform has been validated to handle datasets from 10-minute EEG recordings captured via grid/strip electrodes in 19 patients. Through implementation optimization, PyHFO achieves speeds up to 50 times faster than the standard HFO detection method. Users can either employ our pre-trained deep learning model for their analyses or use their own EEG data to train a model. As such, PyHFO holds great promise for facilitating the use of advanced EEG data analysis tools in clinical practice and large-scale research collaborations.
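The short-term energy idea behind the STE detector can be sketched minimally as follows. This toy version is not PyHFO's implementation (which, among other things, operates on a bandpass-filtered signal in the HFO band); the window length, step, and threshold here are all illustrative choices.

```python
import numpy as np


def short_term_energy(signal, win, step):
    """Sliding-window energy: sum of squared samples in each window."""
    starts = range(0, len(signal) - win + 1, step)
    return np.array([np.sum(signal[s:s + win] ** 2) for s in starts])


# Toy trace: a burst of high-amplitude oscillation embedded in low-level noise.
rng = np.random.default_rng(0)
x = rng.normal(0, 0.1, 1000)
x[400:500] += 2 * np.sin(2 * np.pi * np.arange(100) * 0.25)

energy = short_term_energy(x, win=50, step=25)
threshold = energy.mean() + 2 * energy.std()   # illustrative threshold rule
candidates = np.where(energy > threshold)[0]   # windows flagged as candidate events
```

Windows overlapping the burst (samples 400–500) carry far more energy than noise-only windows, so only those cross the threshold.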
Background: Technically successful but futile angiographic reperfusion (AR) has been reported in 20–50% of acute ischemic stroke patients undergoing endovascular treatment. Persistent occlusion of distal arteries and the ischemic capillary bed (the no-reflow phenomenon) has been identified as one culprit behind infarct growth and poor functional outcome despite thrombectomy. We aimed to quantify residual no-reflow perfusion in patients who underwent successful AR and to assess its effect on infarct growth and functional outcome. Methods: In this retrospective single-institution study, patients with anterior circulation large vessel occlusion (LVO), successful AR (TICI ≥ 2b), and pre- and post-treatment MR perfusion (MRP) were included. Hypoperfusion volume in the affected territory was calculated on Tmax maps using thresholds of >2, >4, and >6 sec on both pre- and post-treatment MRP scans. Infarct growth was calculated by subtracting the baseline ischemic core from the final infarct volume. A total of 12 variables, including demographic, clinical, and Tmax threshold variables, were evaluated as predictors of infarct growth and functional outcome (90-day mRS). Results: A total of 50 patients met the inclusion criteria, of whom 19 had infarct growth ≥ 10 mL and 31 had < 10 mL of infarct growth. Functional outcome was poor in 27 patients (90-day mRS > 2). Univariate analysis showed no statistical significance (p > 0.05) for any of the baseline Tmax volumes in predicting infarct growth or functional outcome. Post-treatment hypoperfusion (no-reflow) volumes were significant predictors of infarct growth at all Tmax thresholds (p = 0.048, 0.023, and 0.021 for Tmax > 2, 4, and 6 sec, respectively). The volume of Tmax > 6 sec was significant (p = 0.044) for predicting poor functional outcome. Multivariate logistic regression identified residual Tmax > 6 sec hypoperfusion volume as an independent predictor of infarct growth ≥ 10 mL (p = 0.002) and functional outcome (p = 0.007).
Conclusion: No-reflow zones following successful AR were associated with infarct growth at all Tmax thresholds; however, the volume of residual Tmax > 6 sec hypoperfusion was independently associated with infarct growth ≥ 10 mL and poor functional outcome.
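The two quantities defined in the Methods can be sketched as follows. This is an illustrative computation only, not the study's actual processing pipeline: the voxel volume is an assumed value, and real Tmax maps come from MRP post-processing software.

```python
import numpy as np

# Assumed voxel volume (e.g., 2 x 2 x 2 mm isotropic) in mL; illustrative only.
VOXEL_ML = 0.008


def hypoperfusion_volume_ml(tmax_map, threshold_sec):
    """Volume of tissue whose Tmax exceeds the given threshold (sec)."""
    return float(np.count_nonzero(tmax_map > threshold_sec) * VOXEL_ML)


def infarct_growth_ml(baseline_core_ml, final_infarct_ml):
    """Infarct growth = final infarct volume minus baseline ischemic core."""
    return final_infarct_ml - baseline_core_ml


# Toy post-treatment Tmax map: mostly normal delays (~1 s) with a residual
# no-reflow pocket above 6 s in one corner.
tmax = np.full((10, 10, 10), 1.0)
tmax[:5, :5, :5] = 7.0

vol_gt6 = hypoperfusion_volume_ml(tmax, 6)  # 125 voxels above threshold
growth = infarct_growth_ml(baseline_core_ml=10.0, final_infarct_ml=25.0)
```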
Clinical narrative in the medical record provides perhaps the most detailed account of a patient's history. However, this information is documented as free text, which makes it challenging to analyze. Efforts to index unstructured clinical narrative often focus on identifying predefined concepts from clinical terminologies. Less studied is the problem of analyzing the text as a whole to create temporal indices that capture relationships between learned clinical events. Topic models provide a method for analyzing large corpora of text to discover semantically related clusters of words. This work presents a topic model tailored to the clinical reporting environment that allows for individual patient timelines. Results show the model is able to identify patterns of clinical events in a cohort of brain cancer patients.