Adaptive immune receptor repertoires (AIRRs) are rich with information that can be mined for insights into the workings of the immune system. Gene usage, CDR3 properties, clonal lineage structure, and sequence diversity can all reveal the dynamic immune response to perturbation by disease, vaccination, or other interventions. Here we focus on a conceptual introduction to the many aspects of repertoire analysis and orient the reader toward the uses and advantages of each. Along the way, we note some of the many software tools that have been developed for these investigations and link the ideas discussed to chapters on methods provided elsewhere in this volume.
The identification and application of biomarkers in the clinical and medical fields have an enormous impact on society. The increase in digital devices and the rise in popularity of health-related mobile apps have produced a new trove of biomarkers in large, diverse, and complex data. However, the unclear definition of digital biomarkers, population groups, and their intersection with traditional biomarkers hinders their discovery and validation. We have identified current issues in the field of digital biomarkers and put forth suggestions to address them during the DayOne Workshop with participants from academia and industry. We have identified similarities and differences between traditional and digital biomarkers in order to synchronize semantics, define unique features, review current regulatory procedures, and describe novel applications that enable precision medicine.
Mycobacterium avium subsp. hominissuis is an opportunistic pathogen that is associated with biofilm-related infections of the respiratory tract and is difficult to treat. In recent years, extracellular DNA (eDNA) has been found to be a major component of bacterial biofilms, including those of many pathogens involved in biofilm-associated infections. To date, eDNA has not been described as a component of mycobacterial biofilms. In this study, we identified and characterized eDNA in a high biofilm-producing strain of Mycobacterium avium subsp. hominissuis (MAH). In addition, we surveyed for the presence of eDNA in various MAH strains and other nontuberculous mycobacteria. Biofilms of MAH A5 (high biofilm-producing strain) and MAH 104 (reference strain) were established at 22°C and 37°C on abiotic surfaces. Acellular biofilm matrix and supernatant from 7-day-old MAH A5 biofilms both possessed abundant eDNA; however, very little eDNA was found in MAH 104 biofilms. A survey of MAH clinical isolates and other clinically relevant nontuberculous mycobacterial species revealed many species and strains that also produce eDNA. RAPD analysis demonstrated that eDNA resembles genomic DNA. Treatment with DNase I reduced the biomass of MAH A5 biofilms when added upon biofilm formation or to an already established biofilm, both on abiotic surfaces and on top of human pharyngeal epithelial cells. Furthermore, co-treatment of an established biofilm with DNase I and either moxifloxacin or clarithromycin significantly increased the susceptibility of the bacteria within the biofilm to these clinically used antimicrobials. Collectively, our results describe an additional matrix component of mycobacterial biofilms and a potential new target to help treat biofilm-associated nontuberculous mycobacterial infections.
Johne's disease is caused by Mycobacterium avium subsp. paratuberculosis (MAP), which results in serious economic losses worldwide in farmed livestock such as cattle, sheep, and goats. To control this disease, an effective vaccine with minimal adverse effects is needed. In order to identify a live vaccine for Johne's disease, we evaluated eight attenuated mutant strains of MAP using a C57BL/6 mouse model. The persistence of the vaccine candidates was measured at 6, 12, and 18 weeks post vaccination. Only strains 320, 321, and 329 colonized both the liver and spleen until the 12-week time point. The remaining five mutants showed no survival in those tissues, indicating their complete attenuation in the mouse model. The candidate vaccine strains demonstrated different levels of protection based on colonization of the challenge strain in liver and spleen tissues at 12 and 18 weeks post vaccination. Based on total MAP burden in both tissues at both time points, strain 315 (MAP1566::Tn5370) was the most protective, whereas strain 318 (intergenic Tn5367 insertion between MAP0282c and MAP0283c) had the most colonization. Mice vaccinated with an undiluted commercial vaccine preparation displayed the highest bacterial burden as well as enlarged spleens indicative of a strong infection. Selected vaccine strains that showed promise in the mouse model were moved forward into a goat challenge model. The results suggest that the mouse trial, as conducted, may have a relatively poor predictive value for protection in a ruminant host such as goats.
Digital technologies are transforming the health care system. A large part of the information is generated as real-world data (RWD). Data from electronic health records and digital biomarkers have the potential to reveal associations between the benefits and adverse events of medicines, establish new patient-stratification principles, expose unknown disease correlations, and inform on preventive measures. The impact on health care payers and providers, the biopharmaceutical industry, and governments is massive in terms of health outcomes, quality of care, and cost. However, a framework to assess the preliminary quality of RWD is missing, thus hindering the conduct of population-based observational studies to support regulatory decision-making and real-world evidence. To address the need to qualify RWD, we aimed to build a web application as a tool to translate the characterization of some quality parameters of RWD into a metric and to propose a standard framework for evaluating the quality of RWD. The RWD-Cockpit systematically scores data sets based on proposed quality metrics and customizable variables chosen by the user. Sleep RWD generated de novo and publicly available data sets were used to validate the usability and applicability of the web application. The RWD quality score is based on the evaluation of 7 variables: manageability specifies access and publication status; complexity defines univariate, multivariate, and longitudinal data; sample size indicates the size of the sample or samples; privacy and liability stipulates privacy rules; accessibility specifies how the data set can be accessed and to what granularity; periodicity specifies how often the data set is updated; and standardization specifies whether the data set adheres to any specific technical or metadata standard. These variables are associated with several descriptors that define specific characteristics of the data set. To address the need to qualify RWD, we built the RWD-Cockpit web application, which proposes a framework and applies a common standard for a preliminary evaluation of RWD quality across data sets (molecular, phenotypical, and social) and proposes a standard that can be further personalized by the community while retaining an internal standard. Applied to 2 different case studies (de novo-generated sleep data and publicly available data sets), the RWD-Cockpit could identify and provide researchers with variables that might increase quality. The results from the application of the framework of RWD metrics implemented in the RWD-Cockpit application suggest that multiple data sets can be preliminarily evaluated in terms of quality using the proposed metrics. The output scores (quality identifiers) provide a first quality assessment for the use of RWD. Although extensive challenges remain to be addressed to set RWD quality standards, our proposal can serve as an initial blueprint for community efforts in the characterization of RWD quality for regulated settings.
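As an illustration only, a score built from the 7 variables named in this abstract could be sketched as a weighted mean. The abstract does not state how the RWD-Cockpit combines its variables, so the 0-to-1 rating scale, the equal default weights, and the names `VARIABLES` and `quality_score` are all assumptions made for this sketch, not the application's actual method.

```python
# Hypothetical sketch: the rating scale, weights, and function name are
# assumptions; the RWD-Cockpit's real scoring rule is not given in the abstract.
VARIABLES = (
    "manageability",
    "complexity",
    "sample_size",
    "privacy_and_liability",
    "accessibility",
    "periodicity",
    "standardization",
)


def quality_score(ratings, weights=None):
    """Return the weighted mean of per-variable ratings, each in [0, 1]."""
    missing = [name for name in VARIABLES if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    for name in VARIABLES:
        if not 0.0 <= ratings[name] <= 1.0:
            raise ValueError(f"rating for {name} must lie in [0, 1]")
    if weights is None:
        # Assumed default: every variable counts equally.
        weights = {name: 1.0 for name in VARIABLES}
    total_weight = sum(weights[name] for name in VARIABLES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(ratings[name] * weights[name] for name in VARIABLES)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Made-up ratings for an imaginary data set, for demonstration only.
    example = {name: 0.5 for name in VARIABLES}
    example["standardization"] = 1.0
    print(round(quality_score(example), 3))
```

Passing a custom `weights` dictionary mirrors, in a simplified way, the idea of user-customizable variables mentioned in the abstract.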
Dengue virus poses a serious threat to global health, and no specific therapeutic exists for it. Broadly neutralizing antibodies recognizing all serotypes may be an effective treatment. High-throughput adaptive immune receptor repertoire sequencing (AIRR-seq) and bioinformatic analysis enable in-depth understanding of the B-cell immune response. Here, we investigate the dengue antibody response with these technologies and apply machine learning to identify rare and underrepresented broadly neutralizing antibody sequences. Dengue immunization elicited the following signatures on the antibody repertoire: (i) an increase in CDR3 and germline gene diversity; (ii) a change in the antibody repertoire architecture by eliciting power-law network distributions and CDR3 enrichment in polar amino acids; (iii) an increase in the expression of JNK/Fos transcription factors and ribosomal proteins. Furthermore, we demonstrate the applicability of computational methods and machine learning to AIRR-seq datasets for neutralizing antibody candidate sequence identification. Antibody expression and functional assays validated these results.
High-throughput sequencing of adaptive immune receptor repertoires (AIRR, i.e., IG and TR) has revolutionized the ability to carry out large-scale experiments to study the adaptive immune response. Since the method was first introduced in 2009, AIRR sequencing (AIRR-seq) has been applied to survey the immune state of individuals, identify antigen-specific or immune-state-associated signatures of immune responses, study the development of the antibody immune response, and guide the development of vaccines and antibody therapies. Recent advancements in the technology include sequencing at the single-cell level and in parallel with gene expression, which allows the introduction of multi-omics approaches to understand the adaptive immune response in detail. Analyzing AIRR-seq data can prove challenging even with high-quality sequencing, in part due to the many steps involved and the need to parameterize each step. In this chapter, we outline key factors to consider when preprocessing raw AIRR-seq data and annotating the genetic origins of the rearranged receptors. We also highlight a number of common difficulties in AIRR-seq data processing and provide strategies to address them.
ABSTRACT Inhibition of apoptotic death of macrophages by Mycobacterium tuberculosis represents an important mechanism of virulence that results in pathogen survival both in vitro and in vivo. To identify M. tuberculosis virulence determinants involved in the modulation of apoptosis, we previously screened a transposon bank of mutants in human macrophages, and an M. tuberculosis clone with a nonfunctional Rv3354 gene was identified as incompetent to suppress apoptosis. Here, we show that the Rv3354 gene encodes a protein kinase that is secreted within mononuclear phagocytic cells and is required for M. tuberculosis virulence. The Rv3354 effector targets the metalloprotease (JAMM) domain within subunit 5 of the COP9 signalosome (CSN5), resulting in suppression of apoptosis and in the destabilization of CSN function and regulatory cullin-RING ubiquitin E3 enzymatic activity. Our observation suggests that alteration of the metalloprotease activity of CSN by Rv3354 possibly prevents the ubiquitin-dependent proteolysis of M. tuberculosis-secreted proteins. IMPORTANCE Macrophage protein degradation is regulated by a protein complex called a signalosome. One of the signalosomes associated with activation of ubiquitin and protein labeling for degradation was found to interact with a secreted protein from M. tuberculosis, which binds to the complex and inactivates it. The interference with the ability to inactivate bacterial proteins secreted in the phagocyte cytosol may have crucial importance for bacterial survival within the phagocyte.
Dengue infection is a global threat. As of today, there is neither a universal dengue fever treatment nor a vaccine unreservedly recommended by the World Health Organization. The investigation of the specific immune response to dengue virus would support the discovery of antibodies as therapeutics for passive immunization and would support vaccine design. High-throughput sequencing enables the identification of the multitude of antibodies elicited in response to dengue infection at the sequence level. Artificial intelligence can mine the complex data generated and has the potential to uncover patterns in entire antibody repertoires and detect signatures distinctive of single virus-binding antibodies. However, these machine learning methods have not yet been harnessed to determine the immune response to dengue virus. In order to enable the application of machine learning, we have benchmarked existing methods for encoding biological and chemical knowledge as inputs and have investigated novel encoding techniques. We have applied different machine learning methods such as neural networks, random forests, and support vector machines and have investigated the parameter space to determine the best-performing algorithms for the detection and prediction of antibody patterns at the repertoire and antibody sequence levels in dengue-infected individuals. Our results show that immune response signatures to dengue are detectable at both the antibody repertoire and the antibody sequence levels. By combining machine learning with phylogenies and network analysis, we generated novel sequences that present dengue-binding specific signatures. These results might aid further antibody discovery and support vaccine design.
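To make the idea of encoding sequences as machine learning inputs concrete, the following minimal sketch shows one widely used baseline, one-hot encoding of an amino acid sequence. The abstract does not name the encodings that were benchmarked, so this is offered as a generic example rather than the authors' method; the function name `one_hot_encode`, the fixed-length zero padding, and the example sequence are assumptions.

```python
# Generic baseline encoding, not necessarily one used in the study.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, length):
    """Encode a sequence as a length x 20 matrix of 0/1 values.

    Sequences shorter than `length` are padded with all-zero rows;
    longer sequences are truncated (both are assumptions of this sketch).
    """
    matrix = []
    for position in range(length):
        row = [0] * len(AMINO_ACIDS)
        if position < len(sequence):
            residue = sequence[position]
            if residue not in INDEX:
                raise ValueError(f"unknown residue: {residue!r}")
            row[INDEX[residue]] = 1
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # Hypothetical CDR3-like fragment, used only to show the output shape.
    encoded = one_hot_encode("CARDY", length=8)
    print(len(encoded), len(encoded[0]))
```

A fixed-length matrix of this kind can be flattened and passed to classifiers such as the random forests or support vector machines mentioned in the abstract.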