Abstract In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvements in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5, shows excellent capabilities in general language understanding and reasoning. Researchers have also tested the GPT models on a variety of NLP-related tasks and benchmarks and obtained excellent results. To evaluate the performance of ChatGPT on biomedical tasks, this paper presents a comprehensive benchmark study on the use of ChatGPT for biomedical corpora, including article abstracts, clinical trial descriptions, biomedical questions, and so on. Through a series of experiments, we demonstrate the effectiveness and versatility of ChatGPT in biomedical text understanding, reasoning, and generation.
Abstract Rapid and accurate prediction of molecular properties is a fundamental task in drug discovery. In recent years, deep learning-based molecular property prediction methods have received much attention, and recent successes have shown that learning representations of molecular structures with graph neural networks (GNNs) can achieve better prediction results. However, most previous approaches focus on learning atomic embeddings. In this paper, we propose a novel attention method based on atom pair embeddings and apply it to two types of prediction tasks. First, atom pair embeddings were learned on 2D molecular graphs to predict a series of ligand properties. Second, atom pair embeddings were learned on ligand/protein 3D complex structures together with an axial attention network to predict protein-ligand interactions. On the MoleculeNet benchmark datasets, our method achieved better performance than previous state-of-the-art models in ten property prediction tasks, and in protein-ligand interaction prediction it also obtained results superior to a collection of reference models on the PDB2016 dataset. Our source code will be made publicly available upon acceptance of the manuscript.
Event detection is a fundamental task in information extraction. Most previous approaches view event detection as a trigger-based classification problem, focusing on using syntactic dependency structure or external knowledge to boost classification performance. To overcome the inherent issues with existing trigger classification-based models, we propose a novel approach to event detection by formulating it as a graph parsing problem, which can explicitly model multiple event correlations and naturally utilize the rich information conveyed by event types and subtypes. Furthermore, to cope with data sparsity, we employ a pretrained sequence-to-sequence (seq2seq) model to transduce an input sentence into an accurate event graph without the need for trigger words. Extensive experimental results on the public ACE2005 dataset show that our approach outperforms all previous state-of-the-art models for event detection by a large margin, obtaining an improvement of 4.2% in F1 score. The result is very encouraging since we achieve this with a conceptually simple seq2seq model; moreover, by extending the graph structure, the proposed architecture can be flexibly applied to more sentence-level information extraction problems.
Background: Colon adenocarcinoma (COAD) is the second leading cause of cancer death worldwide; thus, identification of COAD biomarkers is critical. Mitotic Arrest Deficient 2 Like 2 (MAD2L2) is a key factor in mammalian DNA damage repair and is highly expressed in many malignant tumors. This is a comprehensive study of MAD2L2 expression, its diagnostic and prognostic value, its potential biological function, and its impact on the immune system of patients with COAD. Methods: Using data obtained from TCGA, analyses of gene expression, clinical relevance, prognosis, and diagnostic value, together with GO/KEGG cluster analysis and bioinformatics statistical analysis, were performed in R. Immune responses to MAD2L2 expression in COAD were analyzed using TIMER. The expression of MAD2L2 in HCT116 cells induced by the inflammatory factor TNF-α was detected using Western blot. Results: Our results underscore the clinical diagnostic value and potential biological significance of MAD2L2 in patients with COAD. MAD2L2 was highly expressed in COAD, and its expression correlated with tumor status and colon polyps. ROC curve analysis showed that MAD2L2 expression has high diagnostic value in COAD. Immune infiltration analysis showed that MAD2L2 expression was positively correlated with neutrophil levels. The Western blot results demonstrated that MAD2L2 expression responded to TNF-α in a dose-dependent manner. GO/KEGG analysis revealed that genes overexpressed and coexpressed with MAD2L2 were mostly involved in biological processes including hypoxia response, response to reduced oxygen levels, and mitochondrial translation elongation. Conclusion: As a new COAD biomarker, MAD2L2 contributes to our understanding of how alterations in gene expression and the immunological environment contribute to the development of colon cancer. Following further investigation, MAD2L2 may prove to be a viable target for clinical diagnosis and therapy of COAD.
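The ROC curve analysis mentioned above summarizes how well a single marker separates tumor from normal samples. The following is a minimal sketch of the area under the ROC curve computed with the rank-based (Mann-Whitney) identity; the function name `roc_auc` and the expression values are hypothetical illustrations, not data or code from the study, which performed its analysis in R.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney identity.

    The AUC equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical marker expression values (arbitrary units), made up for
# illustration only; 1 = tumor sample, 0 = normal sample.
expression = [5.1, 6.3, 4.8, 7.0, 3.2, 3.9, 5.1, 2.7]
is_tumor = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(expression, is_tumor)
print(f"AUC = {auc:.3f}")
```

An AUC near 1.0 would indicate that the marker separates the two groups almost perfectly, while 0.5 corresponds to chance.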
Background: Colorectal cancer is a major global health concern, exacerbated by tumor necrosis factor-alpha (TNF-α) and its role in inflammation, with the effects of Mitotic Arrest Deficient 2 Like 2 (MAD2L2) in this context still unclear. Methods: The colorectal carcinoma cell lines HCT116 and SW620 were exposed to TNF-α for 24 h to induce an inflammatory response. Subsequent assessments measured the expression of inflammatory cytokines and the activity of the p38 mitogen-activated protein kinase (p38 MAPK) and Phosphoinositide 3-Kinase/AKT Serine/Threonine Kinase (PI3K/AKT) signaling cascades. Transcriptome sequencing and subsequent integrative analysis with the Cancer Genome Atlas (TCGA) program database revealed a significant downregulation of the key factor MAD2L2. MAD2L2 expression was enhanced via lentiviral vector-mediated transfection. The influence of this overexpression on TNF-α-induced inflammation, intracellular signaling pathways, and the migratory and invasive behaviors of the colorectal cancer cells was then examined. Results: TNF-α treatment significantly increased the expression of Interleukin-1 beta (IL-1β) and Interleukin-6 (IL-6), activated the p38 MAPK and PI3K/AKT signaling pathways, and enhanced cell migration and invasion. A decrease in MAD2L2 expression was observed following TNF-α treatment. However, overexpression of MAD2L2 reversed the effects of TNF-α, reducing IL-1β and IL-6 levels, attenuating PI3K/AKT pathway activation, and inhibiting cell migration and invasion. Conclusions: Overexpression of MAD2L2 attenuates the pro-inflammatory effects of TNF-α, suggesting that MAD2L2 plays a protective role against TNF-α-induced migration and invasion of colorectal carcinoma cells. Therefore, MAD2L2 holds potential as a therapeutic target in the treatment of colorectal cancer.
Abstract Background: Cancer cells can develop resistance to DNA interstrand crosslinking agents through a DNA repair bypass pathway called translesion synthesis (TLS). JH-RE-06, a TLS-targeting inhibitor, has been shown to increase melanoma cell susceptibility to cisplatin. Nevertheless, whether JH-RE-06 can be used in combination with mitomycin C (MMC) to benefit colorectal cancer (CRC) patients receiving hyperthermic intraperitoneal chemotherapy (HIPEC) treatment remains unknown. Methods: Colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) data were obtained from The Cancer Genome Atlas (TCGA) database, the expression of Rev1-associated proteins in normal and malignant tissues was compared, and receiver operating characteristic (ROC) curves were generated. The association between Rev1 and Rev7 expression and the prognosis of CRC patients was derived from the PrognoScan database. Expression at the protein level was verified with a tissue microarray. Western blot was performed to identify alterations in the protein levels of Rev1 and Rev7 following MMC treatment of HCT116 cells, whereas CCK8 assays revealed alterations in the IC50 value of MMC following the knockdown of Rev7 and Rev1. Co-immunoprecipitation was performed to examine the targeting of JH-RE-06. EdU assays demonstrated the inhibitory effect of JH-RE-06 and MMC on cancer cell growth; wound healing and clone formation assays were carried out to evaluate cell migration and clone formation abilities, respectively. Flow cytometry analysis was performed to detect cell apoptosis, and a commercial reagent kit was used to detect ROS and NAD+/NADH changes. Immunofluorescence was used to analyze cellular DNA damage. Finally, the potential mechanism of action and targets of JH-RE-06 in the treatment of CRC were investigated by network pharmacology.
Results: Analysis of bioinformatics data revealed high expression of Rev1 and the Rev1-associated proteins Rad18, Rev3, and Rev7 in CRC tumor tissues compared to normal tissues, with Rad18 and Rev7 showing high diagnostic value for CRC. High Rev1 expression was associated with a poor prognosis, whereas high Rev7 expression was associated with a favorable prognosis. The protein-level expression of Rev1 and Rev7 was verified by immunohistochemistry, and knockdown of Rev1 and Rev7 indicated that their downregulation may increase HCT116 susceptibility to MMC treatment. Co-treatment with JH-RE-06 may augment the therapeutic efficacy of MMC in CRC cells, increase cell apoptosis and mitochondrial and DNA damage, and limit cancer cell migration and clone formation. Results from network pharmacology revealed that JH-RE-06 treatment may also involve the MAPK, PI3K, and Akt signaling pathways. Conclusions: Rad18 and Rev7 can be employed as predictive biomarkers for CRC. Targeting TLS renders HCT116 cells sensitive to MMC treatment, and JH-RE-06 has the potential to serve as a combination therapy agent for the MMC treatment of peritoneal metastatic CRC in HIPEC.
In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvements in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general language understanding and reasoning. Researchers have also tested the GPT models on a variety of NLP-related tasks and benchmarks and obtained excellent results. Given its strong performance in everyday conversation, researchers have begun to explore ChatGPT's capacity in areas of expertise that require professional education for humans; we are interested in the biomedical domain. To evaluate the performance of ChatGPT on biomedical tasks, this article presents a comprehensive benchmark study on the use of ChatGPT for biomedical corpora, including article abstracts, clinical trial descriptions, biomedical questions, and so on. Typical NLP tasks such as named entity recognition, relation extraction, sentence similarity, question answering, and document classification are included. Overall, ChatGPT obtained a BLURB score of 58.50, while the state-of-the-art model had a score of 84.30. Through a series of experiments, we demonstrate the effectiveness and versatility of ChatGPT in biomedical text understanding, reasoning, and generation, as well as the limitations of ChatGPT built on GPT-3.5. All the datasets are available from the BLURB benchmark (https://microsoft.github.io/BLURB/index.html). The prompts are described in the article.
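A single benchmark score such as the BLURB score aggregates results over several task types. The sketch below assumes the score is a macro-average, that is, the mean within each task type followed by the mean across task types; this aggregation rule is our reading of the benchmark description, the function name `blurb_style_score` is hypothetical, and the numbers are placeholders rather than results from the article.

```python
from statistics import mean
from typing import Dict, List


def blurb_style_score(scores_by_task_type: Dict[str, List[float]]) -> float:
    """Macro-average: mean within each task type, then mean across types."""
    if not scores_by_task_type:
        raise ValueError("no task types given")
    per_type = [mean(values) for values in scores_by_task_type.values()]
    return mean(per_type)


# Placeholder scores for illustration only; NOT results from the article.
example_scores = {
    "named entity recognition": [60.0, 40.0],
    "relation extraction": [30.0],
    "question answering": [70.0, 90.0],
}
score = blurb_style_score(example_scores)
print(f"macro-averaged score = {score:.2f}")
```

Averaging within task types first keeps a task type with many datasets from dominating the overall score.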
Abstract Event extraction is a fundamental task in information extraction. Most previous approaches transform event extraction into two subtasks, trigger classification and argument classification, and solve them via classification-based methods, which suffer from some inherent drawbacks. To overcome these issues, in this paper we propose a novel event extraction model, Seq2EG, by first formulating event extraction as an event graph parsing problem and then exploiting a pre-trained sequence-to-sequence (seq2seq) model to transduce an input sentence into an accurate event graph without the need for trigger words. Based on the generative event graph parsing formulation, Seq2EG can explicitly model multiple event correlations and argument sharing, and can naturally incorporate graph-structured features and the rich semantic information conveyed by the labels of event types and argument roles. Extensive experimental results on the public ACE2005 dataset show that our approach outperforms all previous state-of-the-art models for event extraction by a large margin, obtaining improvements over the best baselines of 3.4% in F1 score for event detection and 4.7% in F1 score for argument classification.
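To make the idea of transducing a sentence into an event graph more concrete, the sketch below shows one way an event graph could be linearized into a target string for a seq2seq decoder. The bracketed format, the `Event` class, the `linearize` function, and the example labels are illustrative assumptions, not the actual output format of Seq2EG.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Event:
    """One event node with its outgoing argument edges (hypothetical schema)."""
    event_type: str  # an ACE-style type label, e.g. "Conflict.Attack"
    arguments: List[Tuple[str, str]] = field(default_factory=list)  # (role, span)


def linearize(events: List[Event]) -> str:
    """Serialize events into a bracketed string a seq2seq decoder could emit."""
    parts = []
    for ev in events:
        args = " ".join(f"( {role} {span} )" for role, span in ev.arguments)
        parts.append(f"( {ev.event_type} {args} )" if args else f"( {ev.event_type} )")
    return " ".join(parts)


# Made-up example: two events extracted from one sentence, no trigger words needed.
events = [
    Event("Conflict.Attack", [("Attacker", "rebels"), ("Target", "convoy")]),
    Event("Life.Die", [("Victim", "two soldiers")]),
]
target = linearize(events)
print(target)
```

In such a scheme, an argument shared by several events would simply appear under each event that uses it, which is one way a generated string can express argument sharing.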