Recent studies have demonstrated that an excessive inflammatory response is an important contributor to death in coronavirus disease 2019 (COVID-19) patients. In this study, we propose a deep representation on heterogeneous drug networks, termed DeepR2cov, to discover potential agents for treating the excessive inflammatory response in COVID-19 patients. This work explores the multi-hub characteristic of a heterogeneous drug network that integrates eight unique networks. Inspired by the multi-hub characteristic, we design 3 billion special meta paths to train a deep representation model that learns low-dimensional vectors integrating long-range structural dependencies and complex semantic relations among network nodes. Based on the representation vectors and transcriptomics data, we predict 22 drugs that bind to tumor necrosis factor-α or interleukin-6. Their therapeutic associations with the inflammatory storm in COVID-19 patients and their molecular binding models are further validated using data from PubMed publications, ongoing clinical trials and a docking program. In addition, results on five biomedical applications suggest that DeepR2cov significantly outperforms five existing representation approaches. In summary, DeepR2cov is a powerful network representation approach and holds the potential to accelerate treatment of the inflammatory response in COVID-19 patients. The source code and data can be downloaded from https://github.com/pengsl-lab/DeepR2cov.git.
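To make the notion of a meta path concrete, the following is a minimal sketch of a meta-path-guided random walk on a toy heterogeneous network, in the style of metapath2vec. It is not the DeepR2cov implementation: the node names, node types, edges, and the function `metapath_walk` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy network; names and edges are illustrative, not real data.
NODE_TYPE = {
    "drug_a": "drug",
    "drug_b": "drug",
    "protein_x": "protein",
    "protein_y": "protein",
    "disease_k": "disease",
}
EDGES = {
    "drug_a": ["protein_x", "disease_k"],
    "drug_b": ["protein_x", "protein_y"],
    "protein_x": ["drug_a", "drug_b", "disease_k"],
    "protein_y": ["drug_b", "disease_k"],
    "disease_k": ["drug_a", "protein_x", "protein_y"],
}


def metapath_walk(start, metapath, rng):
    """Walk from `start`, visiting node types in the order given by `metapath`.

    Returns the list of visited nodes, or None if the start node has the wrong
    type or no neighbor of the required type exists at some step.
    """
    if NODE_TYPE[start] != metapath[0]:
        return None
    walk = [start]
    for wanted_type in metapath[1:]:
        candidates = [n for n in EDGES[walk[-1]] if NODE_TYPE[n] == wanted_type]
        if not candidates:
            return None
        walk.append(rng.choice(candidates))
    return walk


rng = random.Random(0)
walk = metapath_walk("drug_a", ["drug", "protein", "disease"], rng)
print(walk)
```

Walks of this kind yield node sequences that respect a chosen type pattern, and such sequences can serve as training input for a representation model.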
Cell-cell communications (CCCs) from multiple sender cells collaboratively affect downstream functional events in receiver cells, thus influencing cell phenotype and function. Ranking the importance of these CCCs and finding the dominant ones in a specific downstream functional event is of great significance for deciphering various physiological and pathogenic processes. To date, several computational methods have been developed to identify cell types that communicate through enriched ligand-receptor interactions from single-cell RNA-seq (scRNA-seq) data, but to the best of our knowledge, all of them lack the ability to identify the communicating cell type pairs that play a major role in a specific downstream functional event, which we call the “dominant cell communication assembly (DCA)”. Here, we propose scDCA, a multi-view graph learning method for deciphering DCA from scRNA-seq data. scDCA is based on a multi-view CCC network built by constructing different cell type combinations at single-cell resolution. A multi-view graph convolution network is then employed to reconstruct the expression pattern of target genes or the functional states of receiver cells. The DCA is subsequently identified by interpreting the model with the attention mechanism. scDCA was verified in a real scRNA-seq cohort of advanced renal cell carcinoma, where it accurately deciphered the DCA that affects the expression patterns of critical immune genes and the functional states of malignant cells. Furthermore, scDCA accurately captured the alteration in cell communication under clinical intervention by comparing the DCA for certain cytotoxic factors between patients with and without immunotherapy. scDCA is freely available at: https://github.com/pengsl-lab/scDCA.git.
Analysis of electronic health record (EHR) data using machine learning and statistical methods can support relevant clinical prediction tasks. However, there is no uniform standard for current EHR systems, and clinical outcome prediction models trained on one EHR dataset do not transfer well to EHR datasets from other medical institutions. Data differences between medical institutions pose a major challenge to the study of electronic health records. In this study, we propose a general transfer learning strategy that enables models to make clinical predictions across diverse EHR datasets, and we validate its versatility on three deep learning models. Two intensive care unit (ICU) databases (MIMIC-III and eICU) and one clinical task (in-hospital mortality) are used to evaluate our method. First, we trained the deep learning models on the source dataset and saved the model state after each epoch. Then, we selected the best-performing model as the pre-trained model, transferred it to the target dataset, and fine-tuned the whole network on the target dataset. Finally, we used the fine-tuned models to make predictions on the target dataset. Experimental results show that the AUROC score increased by 3%-20% with the transfer strategy, which indicates that this general strategy can provide more reliable predictions of clinical tasks across EHR databases.
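The three-step procedure described above (train on the source with a checkpoint per epoch, pick the best checkpoint, fine-tune all parameters on the target) can be sketched as follows. This is only an illustration under stated assumptions: a small logistic regression stands in for the deep learning models, synthetic data stands in for MIMIC-III and eICU, and the best checkpoint is chosen by validation log loss rather than AUROC. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    # The last entry of `weights` is the bias term.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + weights[-1])


def log_loss(weights, data):
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(predict(weights, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train_one_epoch(weights, data, lr):
    # Plain stochastic gradient descent; updates `weights` in place.
    for x, y in data:
        err = predict(weights, x) - y
        for i, xi in enumerate(x):
            weights[i] -= lr * err * xi
        weights[-1] -= lr * err


def fit(weights, train, valid, epochs, lr):
    """Train, snapshot the state after every epoch, return the best snapshot."""
    checkpoints = []
    for _ in range(epochs):
        train_one_epoch(weights, train, lr)
        checkpoints.append((log_loss(weights, valid), list(weights)))
    return min(checkpoints, key=lambda c: c[0])[1]


def make_dataset(n, shift, rng):
    # Synthetic stand-in for an EHR cohort; `shift` mimics a site difference.
    data = []
    for _ in range(n):
        x = [rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)]
        y = 1 if (x[0] - shift) + 0.5 * x[1] + rng.gauss(0.0, 0.3) > 0 else 0
        data.append((x, y))
    return data


rng = random.Random(0)
source_train, source_valid = make_dataset(400, 0.0, rng), make_dataset(100, 0.0, rng)
target_train, target_valid = make_dataset(60, 1.0, rng), make_dataset(100, 1.0, rng)

# Steps 1-2: train on the source and keep the best per-epoch checkpoint.
pretrained = fit([0.0, 0.0, 0.0], source_train, source_valid, epochs=10, lr=0.1)
loss_before = log_loss(pretrained, target_valid)

# Step 3: fine-tune every parameter on the (smaller) target dataset.
finetuned = fit(list(pretrained), target_train, target_valid, epochs=10, lr=0.05)
loss_after = log_loss(finetuned, target_valid)
print(f"target log loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

In a real setting the target validation split used for checkpoint selection would be kept separate from the held-out test split used for reporting.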
In large-scale high-throughput sequencing projects and biobank construction, sample tagging is essential to prevent sample mix-ups. Despite the availability of fingerprint panels for DNA data, little research has been conducted on sample tagging of whole genome bisulfite sequencing (WGBS) data. This study aims to construct a pipeline and identify applicable fingerprint panels to address this problem. Using autosome-wide A/T polymorphic single nucleotide variants (SNVs) obtained from whole genome sequencing (WGS) and WGBS of individuals from the Third China National Stroke Registry, we designed a fingerprint panel and constructed an optimized pipeline for tagging WGBS data. This pipeline used Bis-SNP to call genotypes from the WGBS data, and optimized genotype comparison by eliminating wildtype homozygous and missing genotypes and retaining variants with identical genomic coordinates and reference/alternative alleles. WGS-based and WGBS-based genotypes called from identical or different samples were extensively compared using hap.py. In the first batch of 94 samples, the genotype consistency rates using the autosome-wide A/T polymorphic SNV panel were 71.01%-84.23% for matched and 51.43%-60.50% for mismatched WGS and WGBS data. This capability to tag WGBS data was validated in the second batch of 240 samples, with genotype consistency rates of 70.61%-84.65% and 49.58%-61.42% for the matched and mismatched data, respectively. By testing six fingerprint panels whose variant counts spanned different orders of magnitude, we also determined that the number of genetic variants required to correctly tag WGBS data was on the order of thousands. Additionally, we affirmed this result with two self-designed panels of 1351 and 1278 SNVs, respectively.
Furthermore, this study confirmed that simply counting the genetic variants with identical coordinates and ref/alt alleles, or those with identical genotypes, could not correctly tag WGBS data. This study proposed an optimized pipeline, applicable fingerprint panels, and a lower bound on the number of fingerprint genetic variants needed for correct sample tagging of WGBS data, which are valuable for tagging WGBS data and integrating multi-omics data for biobanks.
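The genotype comparison rule described above can be illustrated with a small sketch. It is a simplification, not the study's pipeline (which relies on Bis-SNP and hap.py): genotype calls are assumed to be already parsed into dictionaries keyed by (chromosome, position, ref, alt), a site is assumed to be dropped if either call is wildtype homozygous or missing, and the helper names `consistency_rate` and `best_match` are hypothetical. The toy calls are invented for illustration.

```python
def alleles(genotype):
    # "0|1" and "1/0" both normalize to ["0", "1"].
    return sorted(genotype.replace("|", "/").split("/"))


def is_informative(genotype):
    # Exclude missing calls (".") and wildtype homozygous calls ("0/0").
    called = alleles(genotype)
    return "." not in called and any(a != "0" for a in called)


def consistency_rate(wgs_calls, wgbs_calls):
    """Fraction of shared, informative sites whose genotypes agree.

    Keys are (chrom, pos, ref, alt), so only variants with identical
    coordinates and ref/alt alleles are compared. Returns None when no
    site survives the filters.
    """
    shared = [
        site
        for site in wgs_calls
        if site in wgbs_calls
        and is_informative(wgs_calls[site])
        and is_informative(wgbs_calls[site])
    ]
    if not shared:
        return None
    matches = sum(alleles(wgs_calls[s]) == alleles(wgbs_calls[s]) for s in shared)
    return matches / len(shared)


def best_match(wgbs_calls, wgs_by_sample):
    # Assumed tagging rule: the WGS sample with the highest consistency wins.
    rates = {name: consistency_rate(calls, wgbs_calls) for name, calls in wgs_by_sample.items()}
    rates = {name: rate for name, rate in rates.items() if rate is not None}
    return max(rates, key=rates.get) if rates else None


# Invented toy calls on an A/T polymorphic panel.
wgs_s1 = {
    ("chr1", 100, "A", "T"): "0/1",
    ("chr1", 200, "A", "T"): "1/1",
    ("chr1", 300, "T", "A"): "0/0",  # wildtype homozygous: dropped
    ("chr2", 50, "A", "T"): "0/1",
    ("chr2", 80, "A", "T"): "./.",   # missing: dropped
}
wgs_s2 = {
    ("chr1", 100, "A", "T"): "1/1",
    ("chr1", 200, "A", "T"): "0/1",
}
wgbs = {
    ("chr1", 100, "A", "T"): "1/0",
    ("chr1", 200, "A", "T"): "0/1",
    ("chr1", 300, "T", "A"): "0/1",
    ("chr2", 50, "A", "T"): "0/1",
    ("chr2", 80, "A", "T"): "0/1",
    ("chr3", 5, "A", "T"): "1/1",    # not in the WGS calls: dropped
}

rate_s1 = consistency_rate(wgs_s1, wgbs)
rate_s2 = consistency_rate(wgs_s2, wgbs)
tagged = best_match(wgbs, {"S1": wgs_s1, "S2": wgs_s2})
print(rate_s1, rate_s2, tagged)
```

With real data, the separation between matched and mismatched rates reported in the abstract suggests a threshold could also be used instead of an argmax; that choice is left open here.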
With the advance of hardware and network technologies, video transmission is becoming increasingly popular, especially over mobile networks. Taking different networks and displays into account, a hybrid video transmission architecture is presented. Firstly, captured videos are compressed using an H.264/AVC hardware encoding module, packed with the RTP/RTCP protocols, and sent over UDP sockets. Secondly, a cloud server is built to forward video streams to end users. To improve security, Ngrok, an open source reverse proxy project, is included. Finally, on the client side, video data are unpacked and decompressed, and then post-processed according to the type of end device. Experimental results show that no obvious time delays are observed, and that the transmission bandwidths and packet loss rates are acceptable.
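The packing step described above can be illustrated with a minimal sketch of the 12-byte fixed RTP header (RFC 3550) placed in front of an encoded payload and sent over a UDP socket. This is not the paper's implementation: the payload type 96, the SSRC value, the 90 kHz clock with 25 frames per second, and the helper names are assumptions, and the H.264 payload is treated as opaque bytes (NAL unit fragmentation and RTCP are omitted).

```python
import socket
import struct

RTP_VERSION = 2
PAYLOAD_TYPE_H264 = 96             # assumption: a dynamic payload type
HEADER = struct.Struct("!BBHII")   # 12-byte fixed header, network byte order


def pack_rtp(payload, seq, timestamp, ssrc, marker=False):
    """Prefix `payload` with a fixed RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | PAYLOAD_TYPE_H264
    header = HEADER.pack(byte0, byte1, seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def unpack_rtp(packet):
    """Split a packet produced by `pack_rtp` back into header fields and payload."""
    byte0, byte1, seq, timestamp, ssrc = HEADER.unpack(packet[:HEADER.size])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[HEADER.size:],
    }


def send_frames(frames, address, ssrc=0x1234ABCD, clock_rate=90000, fps=25):
    """Send one RTP packet per frame to `address` (host, port) over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i, frame in enumerate(frames):
            timestamp = i * clock_rate // fps
            sock.sendto(pack_rtp(frame, i, timestamp, ssrc, marker=True), address)
    finally:
        sock.close()


# Round trip on a placeholder payload (not a real H.264 bitstream).
packet = pack_rtp(b"fake-nal-unit", seq=7, timestamp=3600, ssrc=0x1234ABCD, marker=True)
fields = unpack_rtp(packet)
print(fields["seq"], fields["timestamp"], fields["payload"])
```

A call such as `send_frames([b"frame0", b"frame1"], ("127.0.0.1", 5004))` would emit two datagrams; the port is arbitrary, and frames larger than the path MTU would need fragmentation that this sketch does not perform.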