Computational prediction of ATC codes of drug-like compounds using tiered learning

Abstract:

The Anatomical Therapeutic Chemical (ATC) Code System is a classification proposed by the World Health Organization (WHO) that assigns codes to compounds based on their therapeutic, pharmacological and chemical characteristics as well as the in vivo site of activity. The ability to predict the ATC code of an arbitrary compound with high accuracy can go a long way in selecting molecules for lead identification. We propose a computational approach to this problem that utilizes a natural pharmacological constraint, namely, that anatomical-therapeutic biological activity of certain types must preclude activities of many other types. The method proposed here utilizes machine learning in a tiered architecture; prediction of the ATC code at a certain level is constrained by the ATC code at the higher levels. Using this learning architecture, we have built classifiers that incorporate information from a compound's structure, as well as its chemical and protein interactions. The proposed approach has been validated using 2335 drugs from the ChEMBL database in both cross-validation and test settings. The prediction accuracy obtained with this approach is 78.72%, which is comparable to or better than that of other state-of-the-art methods.
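The tiered constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the learning algorithm, the feature set, or how many ATC levels are predicted, so a nearest-centroid classifier stands in for the real classifiers, the two-dimensional feature vectors are invented toy values, and the class name `TieredATCClassifier` is hypothetical. Only the idea is taken from the text: the prediction at a lower level is restricted to codes that extend the prediction made at the level above.

```python
from collections import defaultdict
from math import dist


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


class TieredATCClassifier:
    """Hypothetical tiered classifier: one nearest-centroid decision per ATC
    level, where each level only considers codes under the code chosen at
    the previous (higher) level."""

    def __init__(self, prefix_lengths=(1, 3)):
        # Prefix length 1 = first ATC level (e.g. "A"),
        # prefix length 3 = second ATC level (e.g. "A10").
        self.prefix_lengths = prefix_lengths
        self.centroids = {}

    def fit(self, features, codes):
        groups = defaultdict(list)
        for x, code in zip(features, codes):
            for k in self.prefix_lengths:
                groups[code[:k]].append(x)
        self.centroids = {p: centroid(rows) for p, rows in groups.items()}
        return self

    def predict_one(self, x):
        prefix = ""
        for k in self.prefix_lengths:
            # Constraint: only codes that extend the higher-level prediction.
            candidates = [p for p in self.centroids
                          if len(p) == k and p.startswith(prefix)]
            prefix = min(candidates, key=lambda p: dist(x, self.centroids[p]))
        return prefix


# Toy training data: invented 2-D feature vectors, labelled with ATC
# second-level codes (used here only as labels).
features = [(0.0, 0.0), (0.2, 0.1), (0.0, 1.0), (0.1, 1.2),
            (5.0, 0.0), (5.2, 0.1), (5.0, 1.0), (5.1, 1.1)]
codes = ["A10", "A10", "A02", "A02", "C07", "C07", "C09", "C09"]

clf = TieredATCClassifier().fit(features, codes)
print(clf.predict_one((0.1, 0.05)))
print(clf.predict_one((5.1, 1.0)))
```

In this sketch the first level narrows the candidate set before the second level is consulted, so a second-level code from a different first-level group can never be returned. A real system would replace the centroid rule with trained classifiers and use structural, chemical-interaction, and protein-interaction features as the abstract describes.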

Keywords:

ChEMBL

Code (set theory)

Identification

Drug target

Topics:

Computational Drug Discovery Methods

Machine Learning in Bioinformatics

10.1109/iccabs.2015.7344719

Deep Neural Network-Assisted Drug Recommendation Systems for Identifying Potential Drug–Target Interactions

ACS Omega (2022)

Yogesh Kalakoti Shashank Yadav Durai Sundar

In silico methods to identify novel drug–target interactions (DTIs) have gained significant importance over conventional techniques, which are labor-intensive and low-throughput. Here, we present a machine learning-based multiclass classification workflow that segregates interactions between active, inactive, and intermediate drug–target pairs. Drug molecules, protein sequences, and molecular descriptors were transformed into machine-interpretable embeddings to extract critical features from standard datasets. Tools such as the ChEMBL web resource, iFeature, and an in-house developed deep neural network-assisted drug recommendation (dNNDR)-featx were employed for data retrieval and processing. The models were trained with large-scale DTI datasets and showed an improvement in performance over baseline methods. External validation results showed that models based on att-biLSTM and gCNN could help predict novel DTIs. When tested with a completely different dataset, the proposed models significantly outperformed competing methods. The validity of novel interactions predicted by dNNDR was backed by experimental and computational evidence in the literature. The proposed methodology could elucidate critical features that govern the relationship between a drug and its target.

ChEMBL

Drug target

10.1021/acsomega.2c00424

Citations (10)

HyperPCM: Robust Task-Conditioned Modeling of Drug–Target Interactions

Journal of Chemical Information and Modeling (2024)

Emma Svensson Pieter-Jan Hoedt Sepp Hochreiter Günter Klambauer

A central problem in drug discovery is to identify the interactions between drug-like compounds and protein targets. Over the past few decades, various quantitative structure-activity relationship (QSAR) and proteo-chemometric (PCM) approaches have been developed to model and predict these interactions. While QSAR approaches solely utilize representations of the drug compound, PCM methods incorporate both representations of the protein target and the drug compound, enabling them to achieve above-chance predictive accuracy on previously unseen protein targets. Both QSAR and PCM approaches have recently been improved by machine learning and deep neural networks, that allow the development of drug-target interaction prediction models from measurement data. However, deep neural networks typically require large amounts of training data and cannot robustly adapt to new tasks, such as predicting interaction for unseen protein targets at inference time. In this work, we propose to use HyperNetworks to efficiently transfer information between tasks during inference and thus to accurately predict drug-target interactions on unseen protein targets. Our HyperPCM method reaches state-of-the-art performance compared to previous methods on multiple well-known benchmarks, including Davis, DUD-E, and a ChEMBL derived data set, and particularly excels at zero-shot inference involving unseen protein targets. Our method, as well as reproducible data preparation, is available at https://github.com/ml-jku/hyper-dti.

ChEMBL

Drug target

Training set

Applicability domain

10.1021/acs.jcim.3c01417

Citations (6)

Multi-task learning with a natural metric for quantitative structure activity relationship learning

Journal of Cheminformatics (2019)

Noureddin Sadawi Iván Olier Joaquin Vanschoren Jan N. van Rijn Jérémy Besnard

The goal of quantitative structure activity relationship (QSAR) learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the compound. We employed multi-task learning (MTL) to exploit commonalities in drug targets and assays. We used datasets containing curated records about the activity of specific compounds on drug targets provided by ChEMBL. In total, 1091 assays were analysed. As a baseline, a single task learning approach that trains random forest to predict drug activity for each drug target individually was considered. We then carried out feature-based and instance-based MTL to predict drug activities. We introduced a natural metric of evolutionary distance between drug targets as a measure of task relatedness. Instance-based MTL significantly outperformed both feature-based MTL and the base learner on 741 drug targets out of 1091. Feature-based MTL won on 179 occasions and the base learner performed best on 171 drug targets. We conclude that MTL QSAR is improved by incorporating the evolutionary distance between targets. These results indicate that QSAR learning can be performed effectively, even if little data is available for specific drug targets, by leveraging what is known about similar drug targets.

ChEMBL

Cheminformatics

Feature (linguistics)

Drug target

10.1186/s13321-019-0392-1

Citations (19)

Complementary Approaches to Existing Target Based Drug Discovery for Identifying Novel Drug Targets

Biomedicines (2016)

Suhas Vasaikar Pooja Bhatia P. G. Bhatia Koon-Chu Yaiw

In the past decade, it was observed that the relationship between the emerging New Molecular Entities and the quantum of R&D investment has not been favorable. There might be numerous reasons, but a few studies stress the introduction of the target-based drug discovery approach as one of the factors. Although a number of drugs have been developed with an emphasis on a single protein target, identification of a valid target is complex. The approach focuses on an in vitro single target, which overlooks the complexity of the cell and makes the process of validating drug targets uncertain. Thus, it is imperative to search for alternatives rather than looking at success stories of target-based drug discovery. It would be beneficial if drugs were developed to target multiple components. New approaches like reverse engineering and translational research need to take into account both system-based and target-based approaches. This review evaluates the strengths and limitations of known drug discovery approaches and proposes alternative approaches for increasing efficiency against treatment.

Identification

Drug target

10.3390/biomedicines4040027

Citations (30)

Proteomic approaches in drug discovery

Drug Discovery Today Technologies (2006)

Timothy D. Veenstra

Drug target

10.1016/j.ddtec.2006.10.001

Citations (18)

Applications of Machine Learning in Drug Target Discovery

Current Drug Metabolism (2020)

Dongrui Gao Qingyuan Chen Yuanqi Zeng Meng Jiang Yongqing Zhang

Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.

Biomedicine

Drug target

Drug Development

10.2174/1567201817999200728142023

Citations (13)

Artificial intelligence for drug discovery: Resources, methods, and applications

Molecular Therapy — Nucleic Acids (2023)

Wei Chen Xuesong Liu Sanyin Zhang Shilin Chen

Conventional wet laboratory testing, validations, and synthetic procedures are costly and time-consuming for drug discovery. Advancements in artificial intelligence (AI) techniques have revolutionized their applications to drug discovery. Combined with accessible data resources, AI techniques are changing the landscape of drug discovery. In the past decades, a series of AI-based models have been developed for various steps of drug discovery. These models have been used as complements of conventional experiments and have accelerated the drug discovery process. In this review, we first introduced the widely used data resources in drug discovery, such as ChEMBL and DrugBank, followed by the molecular representation schemes that convert data into computer-readable formats. Meanwhile, we summarized the algorithms used to develop AI-based models for drug discovery. Subsequently, we discussed the applications of AI techniques in pharmaceutical analysis including predicting drug toxicity, drug bioactivity, and drug physicochemical property. Furthermore, we introduced the AI-based models for de novo drug design, drug-target structure prediction, drug-target interaction, and binding affinity prediction. Moreover, we also highlighted the advanced applications of AI in drug synergism/antagonism prediction and nanomedicine design. Finally, we discussed the challenges and future perspectives on the applications of AI to drug discovery.

DrugBank

ChEMBL

10.1016/j.omtn.2023.02.019

Citations (113)

Drug Target Protein Prediction using SVM

Proceedings of the Korea Information Science Society Conference (한국정보과학회 학술발표논문집) (2007)

정휘성 현보라 정석훈 장우혁 한동수

Drug discovery is a long process with a low rate of successful new therapeutic discovery, regardless of advances in information technology. Identification of candidate proteins is an essential step in drug discovery, and it usually requires considerable time and effort. Drug discovery is not a logical but a fortuitous process. Nevertheless, a considerable amount of information on drugs has accumulated in UniProt, NCBI, and DrugBank. As a result, it has become possible to devise new computational methods that classify drug target candidates by extracting the common features of known drug target proteins. In this paper, we devise a method for drug target protein classification using weighted feature summation and a Support Vector Machine. According to our evaluation, the method shows moderate accuracy of 85 to 90%. This indicates that, if used appropriately, the devised method can help reduce the time and cost of the drug discovery process, particularly in identifying new drug target proteins.

DrugBank

Drug target

UniProt

Identification

Drug repositioning

Citations (1)

MBC and ECBL libraries: outstanding tools for drug discovery

Frontiers in Pharmacology (2023)

Tiziana Ginex Enrique López Madruga Ana Martínez Carmen Gil

Chemical libraries have become of utmost importance to boost drug discovery processes. It is widely accepted that the quality of a chemical library depends, among other factors, on its availability and chemical diversity, which help raise the chances of finding good hits. In this regard, our group has developed a source of useful chemicals named the Medicinal and Biological Chemistry (MBC) library. It originates from more than 30 years of experience in drug design and discovery in our research group and has successfully provided effective hits for neurological, neurodegenerative and infectious diseases. Moreover, in recent years, the European research infrastructure for chemical biology EU-OPENSCREEN has generated the European Chemical Biology library (ECBL) to be used as a source of hits for drug discovery. Here we present and discuss the updated version of the MBC library (MBC v.2022), enriched with new scaffolds and containing more than 2,500 compounds, together with ECBL, which collects about 100,000 small molecules. To properly assess the improved potential of the new version of our MBC library in drug discovery, up to 44 physicochemical and pharmaceutical properties have been calculated and compared with those of other well-known publicly available libraries. For comparison, we have used ZINC20, DrugBank, the ChEMBL library, ECBL and NuBBE along with an approved drug library. The final results confirmed the competitive chemical space covered by MBC v.2022 and ECBL, together with suitable drug-like properties. In all, we can affirm that these two libraries represent an interesting source of new hits for drug discovery.

Chemical space

DrugBank

ChEMBL

Chemical library

10.3389/fphar.2023.1244317

Citations (6)

Docking and scoring: applications to drug discovery in the interactomics era

Expert Opinion on Drug Discovery (2009)

Solène Grosdidier Juan Fernández‐Recio

Computational approaches such as docking and scoring are becoming routine in drug discovery as a complement to other more traditional techniques. However, so far, computer drug design methods have been applied to inhibit the function of individual proteins, and there is little available data on the use of these computational techniques to target protein-protein interactions. To establish a strategy for the use of current computational tools in drug discovery targeting protein-protein interactions. Individual techniques applied to specific cases could be studied to derive a general strategy for targeting protein-protein interactions. Protein docking, interface prediction and hot-spot identification can contribute to the discovery of small molecule inhibitors targeting protein interactions of therapeutic interest, especially when little structural information is available.

Docking (animal)

Drug target

Protein–ligand docking

Computational model

10.1517/17460440903002067

Citations (16)