The Structure-Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure-function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies 'look alike', making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.
Metagenomic next-generation sequencing (NGS) of cerebrospinal fluid (CSF) has the potential to identify a broad range of pathogens in a single test. In a 1-year, multicenter, prospective study, we investigated the usefulness of metagenomic NGS of CSF for the diagnosis of infectious meningitis and encephalitis in hospitalized patients. All positive tests for pathogens on metagenomic NGS were confirmed by orthogonal laboratory testing. Physician feedback was elicited by teleconferences with a clinical microbial sequencing board and by surveys. Clinical effect was evaluated by retrospective chart review. We enrolled 204 pediatric and adult patients at eight hospitals. Patients were severely ill: 48.5% had been admitted to the intensive care unit, and the 30-day mortality among all study patients was 11.3%. A total of 58 infections of the nervous system were diagnosed in 57 patients (27.9%). Among these 58 infections, metagenomic NGS identified 13 (22%) that were not identified by clinical testing at the source hospital. Among the remaining 45 infections (78%), metagenomic NGS made concurrent diagnoses in 19. Of the 26 infections not identified by metagenomic NGS, 11 were diagnosed by serologic testing only, 7 were diagnosed from tissue samples other than CSF, and 8 were negative on metagenomic NGS owing to low titers of pathogens in CSF. A total of 8 of 13 diagnoses made solely by metagenomic NGS had a likely clinical effect, with 7 of 13 guiding treatment. Routine microbiologic testing is often insufficient to detect all neuroinvasive pathogens. In this study, metagenomic NGS of CSF obtained from patients with meningitis or encephalitis improved diagnosis of neurologic infections and provided actionable information in some cases. (Funded by the National Institutes of Health and others; PDAID ClinicalTrials.gov number, NCT02910037.)
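The case counts reported in the abstract above can be cross-checked with simple arithmetic. The following is a minimal sketch that only re-derives the stated subtotals and percentages from the numbers given in the text; the variable and function names are illustrative and are not taken from the study.

```python
# Counts copied from the abstract above; all names are illustrative.
enrolled = 204
patients_with_infection = 57
infections = 58
ngs_only = 13      # found by metagenomic NGS but not by source-hospital testing
concurrent = 19    # found by both metagenomic NGS and clinical testing
missed_by_ngs = {"serology_only": 11, "non_csf_tissue": 7, "low_titer": 8}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


found_by_clinical_testing = infections - ngs_only   # the "remaining 45"
missed_total = sum(missed_by_ngs.values())          # the "26 not identified"

# The subgroups should add back up to the totals reported in the text.
assert found_by_clinical_testing == concurrent + missed_total
assert infections == ngs_only + concurrent + missed_total

print(pct(patients_with_infection, enrolled))    # patients with a diagnosed infection
print(pct(ngs_only, infections))                 # infections found only by NGS
print(pct(found_by_clinical_testing, infections))
```

The one-decimal percentages round to the whole-number values quoted in the abstract (22% and 78%).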
Abstract Background: Clinical metagenomic next-generation sequencing (mNGS) testing of cerebrospinal fluid (CSF) increases diagnostic yield for suspected central nervous system (CNS) infections. Nevertheless, up to 45% of cases remain unknown despite extensive testing and 6 months of follow-up. We developed artificial intelligence-machine learning (AI-ML) classification models (classifiers) based on RNA gene expression / host response data from CSF mNGS testing to enhance diagnostic performance. Methods: Between June 2016 and April 2023, 464 CSF mNGS results from UCSF patients with a confirmed viral (n=175), bacterial (n=91), fungal (n=44), or non-infectious (n=155) diagnosis were randomly divided into training and test subsets in an 80:20 ratio (Figure 1A). Clinicians adjudicated their confidence in the final diagnosis based on blinded medical chart review and laboratory test results. Gene (feature) selection was carried out using high-confidence samples, followed by 50-fold cross-validation. All possible pairwise comparisons were performed to generate sub-classifiers, which were then integrated into a consensus classifier. Performance metrics were obtained by running the consensus classifier on the independent test subset. A separate classifier was developed for parasitic infections (n=24) using a “leave-one-out” algorithm to assess performance (Figure 1B). Classifier results were displayed as a score ranging from 0 to 10, corresponding to specificities of ≤70% to ≥99%. Results: Classifier accuracy based on the test set was 83%, with individual area under the curve (AUC) scores ranging from 0.88 to 0.93 for each category (Figure 1C). The function of the selected genes corresponded well with the category (Figure 2). Classifier examples include patients with (i) culture-negative CNS tuberculosis, (ii) persistent Toxoplasma gondii infection despite 19 days of treatment, (iii) a rare autoimmune syndrome, and (iv) a chronic enterovirus infection misclassified as an atypical bacterial infection, suggesting that acute and chronic host response profiles differ (Figure 3). Conclusion: Host response profiling with AI-ML classifiers complements mNGS testing and can enhance diagnostic yield for unexplained CNS syndromes.
Figure 1. An artificial intelligence machine learning classifier. (A) Schematic illustration of the different steps to generate the sub-classifiers, which are then integrated into a consensus classifier. (B) The parasitic sub-classifier was generated using the "leave-one-out" algorithm due to the low sample numbers. (C) Performance metrics of the main classifiers.
Figure 2. Differentially expressed gene analysis with pairwise comparison between sub-classifiers. The left panel shows genes selected by the LASSO (least absolute shrinkage and selection operator) algorithm, and the right panel shows the pathways/functions of these genes.
Figure 3. Classifier examples. (i) Culture-negative CNS tuberculosis, (ii) persistent Toxoplasma gondii infection despite 19 days of treatment, (iii) a rare autoimmune syndrome, and (iv) a chronic enterovirus infection misclassified as an atypical bacterial infection.
Disclosures: Charles Chiu, MD, PhD, Abbott Laboratories, Inc: Grant/Research Support|Biomeme: Advisor/Consultant|Biomeme: Board Member|BiomeSense: Advisor/Consultant|BiomeSense: Board Member|Delve Bio: Advisor/Consultant|Delve Bio: Board Member|Delve Bio: Grant/Research Support|Flightpath Biosciences: Advisor/Consultant|Flightpath Biosciences: Board Member|Mammoth Biosciences: Advisor/Consultant|Mammoth Biosciences: Board Member|Pathogen detection using next generation sequencing: US patent 11380421|Poppy Health: Advisor/Consultant|Poppy Health: Board Member
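The pairwise design described in the Methods of the abstract above (one sub-classifier per pair of categories, integrated into a consensus call) can be illustrated with a small sketch. The abstract does not say how the sub-classifiers are built or combined, so everything here is an assumption: the nearest-centroid rule stands in for the LASSO-based models, majority voting stands in for the integration step, and the two-feature toy data are invented purely for illustration.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(samples):
    """Build one nearest-centroid sub-classifier per pair of classes.

    `samples` maps a class label to a list of feature vectors (toy
    stand-ins for host gene expression values).
    """
    centroids = {label: centroid(rows) for label, rows in samples.items()}
    subs = {}
    for a, b in combinations(sorted(samples), 2):
        ca, cb = centroids[a], centroids[b]
        # Default arguments bind this pair's values to this lambda.
        subs[(a, b)] = lambda x, ca=ca, cb=cb, a=a, b=b: (
            a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        )
    return subs


def consensus(subs, x):
    """Majority vote over all sub-classifiers; ties broken alphabetically."""
    votes = Counter(clf(x) for clf in subs.values())
    top = max(votes.values())
    return min(label for label, n in votes.items() if n == top)


# Invented toy training data: NOT from the study.
toy = {
    "viral": [[1.0, 0.0], [0.9, 0.1]],
    "bacterial": [[0.0, 1.0], [0.1, 0.9]],
    "fungal": [[1.0, 1.0], [0.9, 0.9]],
    "non-infectious": [[0.0, 0.0], [0.1, 0.1]],
}
subs = train_pairwise(toy)
print(len(subs))                      # number of pairwise sub-classifiers
print(consensus(subs, [0.8, 0.2]))    # consensus call for a new toy sample
```

With four categories, all possible pairwise comparisons yield six sub-classifiers. A real pipeline would also need held-out evaluation, as the abstract describes with its 80:20 split.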
Abstract Tools for rapid identification of novel and/or emerging viruses are urgently needed for clinical diagnosis of unexplained infections and pandemic preparedness. Here we developed and clinically validated a largely automated metagenomic next-generation sequencing (mNGS) assay for agnostic detection of respiratory viral pathogens from upper respiratory swab and bronchoalveolar lavage samples in <24 hours. The mNGS assay achieved mean limits of detection of 543 copies/mL, viral load quantification with 100% linearity, and 93.6% sensitivity, 93.8% specificity, and 93.7% accuracy compared to gold-standard clinical multiplex RT-PCR. Performance increased to 97.9% overall predictive agreement after discrepancy testing and clinical adjudication, which was superior to that of RT-PCR (95.0% overall agreement). To enable discovery of novel, sequence-divergent human viruses with pandemic potential, de novo assembly and translated nucleotide algorithms were incorporated into the automated SURPI+ computational pipeline used by the mNGS assay for pathogen detection. Using in silico analysis, we showed after removal of all human viral sequences from the reference database that 70 (100%) of 70 representative human viral pathogens could still be identified based on homology to related animal or plant viruses. Our assay, which was granted breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023, demonstrates the feasibility of routine mNGS testing in clinical and public health laboratories, thus enabling a robust and rapid response to the next viral respiratory pandemic.
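The sensitivity, specificity, and accuracy figures quoted in the abstract above are standard agreement metrics computed from a 2x2 comparison against a reference method. The sketch below shows the usual formulas. The abstract does not report the underlying counts, so the counts used here are hypothetical and chosen only to make the arithmetic easy to follow; the function name is also illustrative.

```python
def agreement_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy against a reference method.

    tp/fn: reference-positive samples called positive/negative by the test.
    tn/fp: reference-negative samples called negative/positive by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only; NOT the study's data.
m = agreement_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

Accuracy weights both classes by their prevalence in the sample set, which is why it falls between sensitivity and specificity here.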
We report unbiased metagenomic detection of chikungunya virus (CHIKV), Ebola virus (EBOV), and hepatitis C virus (HCV) from four human blood samples by MinION nanopore sequencing coupled to a newly developed, web-based pipeline for real-time bioinformatics analysis on a computational server or laptop (MetaPORE). At titers ranging from 10^7 to 10^8 copies per milliliter, reads to EBOV from two patients with acute hemorrhagic fever and CHIKV from an asymptomatic blood donor were detected within 4 to 10 minutes of data acquisition, while lower-titer HCV (1 × 10^5 copies per milliliter) was detected within 40 minutes. Analysis of mapped nanopore reads alone, despite an average individual error rate of 24% (range 8-49%), permitted identification of the correct viral strain in all four isolates, and 90% of the genome of CHIKV was recovered with >98% accuracy. Using nanopore sequencing, metagenomic detection of viral pathogens directly from clinical samples was performed with an unprecedented sample-to-answer turnaround time of <6 hours, a timeframe amenable to actionable clinical and public health diagnostics.
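A statement such as "90% of the genome was recovered" refers to the fraction of reference positions covered by at least one mapped read. The sketch below shows one common way to compute that fraction by merging overlapping read intervals. It is not the MetaPORE implementation; the function name, the half-open coordinate convention, and the toy intervals are all assumptions made for illustration.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of positions in [0, genome_length) covered by any interval.

    `intervals` holds (start, end) pairs, 0-based and half-open, one per
    mapped read. Overlapping or adjacent intervals are merged before counting.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clip each read to the reference boundaries.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Gap before this read: close the current merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap or adjacency: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Toy read alignments on a hypothetical 1000-base reference; NOT study data.
reads = [(0, 400), (300, 700), (800, 950)]
print(covered_fraction(reads, 1000))
```

Sorting first lets the merge run in a single pass, so positions spanned by several reads are counted once.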
Abstract We used unbiased metagenomic next-generation sequencing to diagnose a fatal case of meningoencephalitis caused by St. Louis encephalitis virus in a patient from California in September 2016. This case is associated with the recent 2015–2016 reemergence of this virus in the southwestern United States.