    Abstract:
    Artificial intelligence (AI) and machine learning (ML) are growing in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, and the design of new molecules. For each of these tasks, the machine needs to read and write fluently in a chemical language. Strings are a common tool for representing molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings -- most pertinently, most combinations of symbols yield strings with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELFIES (SELF-referencIng Embedded Strings). SELFIES has since simplified and enabled numerous new applications in chemistry. In this manuscript, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete Future Projects for robust molecular representations. These involve extensions toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works that exploit the full potential of molecular string representations for the future of AI in chemistry and materials science.
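    To make the robustness claim concrete, here is a minimal sketch assuming the open-source `selfies` Python package and RDKit (function names as in their public APIs; adjust if they have changed): random strings built from the robust SELFIES alphabet always decode to valid molecules, which is exactly where random SMILES strings typically fail.

```python
# Minimal sketch of SELFIES robustness: random SELFIES strings always decode
# to chemically valid molecules. Assumes `pip install selfies rdkit`.
import random

import selfies as sf
from rdkit import Chem

random.seed(0)

# Round-trip a molecule through SELFIES.
smiles = "C=CC#N"             # acrylonitrile, used purely as an example
encoded = sf.encoder(smiles)  # e.g. "[C][=C][C][#N]"
decoded = sf.decoder(encoded)
print(encoded, "->", decoded)

# Build random strings from the semantically robust SELFIES alphabet and
# check that every one of them decodes to a SMILES string RDKit accepts.
alphabet = list(sf.get_semantic_robust_alphabet())
for _ in range(5):
    rand_selfies = "".join(random.choices(alphabet, k=10))
    rand_smiles = sf.decoder(rand_selfies)      # never raises for robust symbols
    mol = Chem.MolFromSmiles(rand_smiles)
    print(rand_selfies, "->", rand_smiles, "valid:", mol is not None)
```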
    Keywords:
    Cheminformatics
    Interpretability
    Popularity
    Representation
    Interface (matter)
    The interpretability of ML models is important, but it is not clear what it amounts to. So far, most philosophers have discussed the lack of interpretability of black-box models such as neural networks, and methods such as explainable AI that aim to make these models more transparent. The goal of this paper is to clarify the nature of interpretability by focusing on the other end of the "interpretability spectrum". I examine why some models, such as linear models and decision trees, are highly interpretable, and how more general models, such as MARS and GAM, retain some degree of interpretability. I find that while there is heterogeneity in how we gain interpretability, what interpretability amounts to in particular cases can be explicated in a clear manner.
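    As a concrete illustration of the "interpretable end" of the spectrum discussed above, the following sketch (assuming scikit-learn; the toy data is invented for illustration) shows how the two model classes the abstract highlights expose their reasoning directly: a linear model as coefficients, a shallow decision tree as if-then rules.

```python
# Interpretable models wear their logic on the outside: coefficients for a
# linear model, readable split rules for a decision tree. Assumes scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

linear = LinearRegression().fit(X, y)
# Each coefficient states how much y changes per unit change in that feature.
print("coefficients:", linear.coef_, "intercept:", linear.intercept_)

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
# The entire model is a short, human-readable set of if-then rules.
print(export_text(tree, feature_names=["x0", "x1"]))
```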
    Interpretability
    Black box
    Data augmentation strategies are actively used when training deep neural networks (DNNs), and recent studies suggest that they are effective across various tasks. However, the effect of data augmentation on the interpretability of DNNs has not yet been widely investigated. In this paper, we explore the relationship between interpretability and data augmentation strategy: models are trained with different data augmentation methods and then evaluated in terms of interpretability. To quantify interpretability, we devise three evaluation methods based on alignment with humans, faithfulness to the model, and the number of human-recognizable concepts in the model. Comprehensive experiments show that models trained with mixed-sample data augmentation exhibit lower interpretability, especially under the CutMix and SaliencyMix augmentations. This finding suggests that mixed-sample data augmentation should be adopted carefully, given its impact on model interpretability, especially in mission-critical applications.
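    For readers unfamiliar with mixed-sample augmentation, here is a minimal NumPy sketch of CutMix along the lines of the original recipe (the function name and defaults are ours, not from the paper): a rectangular patch from one image replaces a region of another, and the label is mixed in proportion to the surviving pixel area.

```python
# A minimal sketch of CutMix, one of the mixed-sample augmentations studied
# above. Images are (H, W, C) arrays; labels are one-hot vectors.
import numpy as np

def cutmix(img_a, img_b, label_a, label_b, alpha=1.0, rng=np.random.default_rng()):
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)      # mixing ratio drawn from Beta(alpha, alpha)
    cut = np.sqrt(1.0 - lam)          # side fraction so the box covers ~(1 - lam)
    ch, cw = int(h * cut), int(w * cut)
    cy, cx = rng.integers(h), rng.integers(w)
    y1, y2 = np.clip(cy - ch // 2, 0, h), np.clip(cy + ch // 2, 0, h)
    x1, x2 = np.clip(cx - cw // 2, 0, w), np.clip(cx + cw // 2, 0, w)

    # Paste the patch from image B into a copy of image A.
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]

    # Recompute lambda from the actual (clipped) patch area, then mix labels.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, lam * label_a + (1.0 - lam) * label_b
```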
    Interpretability
    Sample (material)
    Although the synthesis of programs encoding policies often carries the promise of interpretability, systematic evaluations have never been performed to assess the interpretability of these policies, likely because of the complexity of such an evaluation. In this paper, we introduce a novel metric that uses large language models (LLMs) to assess the interpretability of programmatic policies. For our metric, an LLM is given both a program and a description of its associated programming language. The LLM then formulates a natural-language explanation of the program. This explanation is subsequently fed into a second LLM, which tries to reconstruct the program from the natural-language explanation. Our metric then measures the behavioral similarity between the reconstructed program and the original. We validate our approach with synthesized and human-crafted programmatic policies for playing a real-time strategy game, comparing the interpretability scores of these policies to obfuscated versions of the same programs. Our LLM-based interpretability score consistently ranks less interpretable programs lower and more interpretable ones higher. These findings suggest that our metric could serve as a reliable and inexpensive tool for evaluating the interpretability of programmatic policies.
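    The explain-then-reconstruct pipeline can be summarized in a short, heavily hedged sketch; `ask_llm` and `run_policy` below are hypothetical stand-ins for an LLM API and a policy interpreter, not the authors' actual code, and the prompts are illustrative paraphrases of the protocol.

```python
# Hedged sketch of the LLM-based interpretability metric described above.
def interpretability_score(program, language_doc, states, ask_llm, run_policy):
    # Step 1: one LLM turns the program into a natural-language explanation.
    explanation = ask_llm(
        f"Language reference:\n{language_doc}\n\n"
        f"Explain what this policy program does:\n{program}"
    )
    # Step 2: a second LLM call rebuilds a program from the explanation
    # alone -- it never sees the original source code.
    reconstruction = ask_llm(
        f"Language reference:\n{language_doc}\n\n"
        f"Write a policy program implementing this description:\n{explanation}"
    )
    # Step 3: behavioral similarity = fraction of sampled states on which the
    # original and reconstructed programs choose the same action.
    matches = sum(
        run_policy(program, s) == run_policy(reconstruction, s) for s in states
    )
    return matches / len(states)
```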
    Interpretability
    Similarity (geometry)
    With the spread of the popularization of Marxism in recent years, extensive research has been conducted on the popularization of Marxism from different aspects and perspectives, including its scientific implications, necessity, problems, past experiences, and the ways of promoting it. This research is advantageous to the better development of the popularization of Marxism.
    Popularity
    Due to the "black-box" nature of artificial intelligence (AI) recommendations, interpretability is critical to the consumer experience of human-AI interaction. Unfortunately, improving the interpretability of AI recommendations is technically challenging and costly, so there is an urgent need for the industry to identify when the interpretability of AI recommendations is most likely to be needed. This study defines the construct of Need for Interpretability (NFI) of AI recommendations and empirically tests consumers' need for interpretability of AI recommendations in different decision-making domains. Across two experimental studies, we demonstrate that consumers do indeed have a need for interpretability toward AI recommendations, and that this need is higher in utilitarian domains than in hedonic domains. These findings can help companies identify the varying need for interpretability of AI recommendations across application scenarios.
    Interpretability
    Black box
    Abstract Cheminformatics is the use of computational and informational techniques to address a range of problems in the field of chemistry. It is also known as chemoinformatics and chemical informatics. These in silico techniques are used in pharmaceutical companies in the process of drug discovery. Cheminformatics can also be applied to data analysis in various industries, such as paper and pulp, dyes, and allied industries. The primary application of cheminformatics is the storage of information relating to compounds. Quantitative structure–activity relationship (QSAR) analysis also forms a part of cheminformatics. Several in silico cheminformatic tools are currently available for predicting the physicochemical properties and biological activities of many different chemical molecules. Thus, chemoinformatics helps to reduce the time taken to identify potential drug targets and to understand the physical, chemical, and biological properties of chemical compounds. Outputs of cheminformatics may also direct the course of wet-laboratory experiments.
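    As a taste of the in silico workflow described above, here is a small sketch (assuming RDKit, `pip install rdkit`; aspirin is used purely as an example input) computing the kind of physicochemical descriptors that typically feed QSAR models.

```python
# Computing standard physicochemical descriptors with RDKit -- the sort of
# features used as inputs to QSAR property- and activity-prediction models.
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin

print("Molecular weight:", Descriptors.MolWt(mol))
print("LogP (lipophilicity):", Descriptors.MolLogP(mol))
print("Topological polar surface area:", Descriptors.TPSA(mol))
print("H-bond donors:", Descriptors.NumHDonors(mol))
print("H-bond acceptors:", Descriptors.NumHAcceptors(mol))
```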
    Cheminformatics