    Uncertainty Modeling in Ultrasound Image Segmentation for Precise Fetal Biometric Measurements
Citations: 0 · References: 0 · Related Papers: 10
    Abstract:
Medical image segmentation, particularly of ultrasound data, is a crucial task in computer vision and medical imaging. This paper examines the uncertainty inherent in the segmentation process, focusing on fetal head and femur ultrasound images. The proposed methodology extracts target contours and explores techniques for precise parameter measurement, and uncertainty modeling methods are employed to enhance the training and testing of the segmentation network. The study reports an average absolute error of 8.0833 mm (relative error 4.7347%) for fetal head circumference measurement and an average absolute error of 2.6163 mm (relative error 6.3336%) for fetal femur measurement. Uncertainty modeling experiments employing Test-Time Augmentation (TTA) show that data uncertainty can be interpreted effectively on both datasets, suggesting that TTA-based data uncertainty can support clinical practitioners in making informed decisions and obtaining more reliable measurement results in practical clinical applications. The paper contributes to the advancement of ultrasound image segmentation, addressing critical challenges and improving the reliability of biometric measurements.
    Keywords:
    Interpretability
    Fetal head
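The main abstract above centers on estimating data uncertainty with Test-Time Augmentation (TTA): the trained segmentation network is run on several augmented copies of the same ultrasound image, the predictions are mapped back to the original frame, and their spread is read as per-pixel uncertainty. Below is a minimal sketch of that idea; the `model` interface, the flip-and-noise augmentations, and variance as the uncertainty measure are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def tta_uncertainty(model, image, n_aug=8, seed=0):
    """Estimate per-pixel data uncertainty via test-time augmentation.

    model : callable mapping an (H, W) image to an (H, W) foreground
            probability map (assumed interface, for illustration only).
    image : (H, W) ndarray.
    Returns the mean probability map and a per-pixel variance map.
    """
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_aug):
        aug = image.copy()
        flip_h = rng.random() < 0.5          # random horizontal flip
        flip_v = rng.random() < 0.5          # random vertical flip
        if flip_h:
            aug = aug[:, ::-1]
        if flip_v:
            aug = aug[::-1, :]
        aug = aug + rng.normal(0.0, 0.01, aug.shape)  # mild intensity noise

        prob = model(aug)

        # Undo the spatial transforms so all predictions are aligned.
        if flip_v:
            prob = prob[::-1, :]
        if flip_h:
            prob = prob[:, ::-1]
        preds.append(prob)

    preds = np.stack(preds)                  # (n_aug, H, W)
    mean_prob = preds.mean(axis=0)           # consensus segmentation
    uncertainty = preds.var(axis=0)          # high variance = uncertain pixel
    return mean_prob, uncertainty
```

The mean map can be thresholded to obtain the final contour for circumference or femur-length measurement, while the variance map flags boundary pixels where the measurement is least reliable.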
The interpretability of ML models is important, but it is not clear what it amounts to. So far, most philosophers have discussed the lack of interpretability of black-box models such as neural networks, and methods such as explainable AI that aim to make these models more transparent. The goal of this paper is to clarify the nature of interpretability by focusing on the other end of the "interpretability spectrum". The paper examines why some models, such as linear models and decision trees, are highly interpretable, and how more general models, such as MARS and GAM, retain some degree of interpretability. It is found that while there is heterogeneity in how we gain interpretability, what interpretability is in particular cases can be explicated in a clear manner.
    Interpretability
    Black box
    Citations (1)
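The entry above sits at the transparent end of the "interpretability spectrum" it describes: linear models and shallow decision trees can be read off directly. As a small, hedged illustration (toy data and scikit-learn calls chosen here for convenience, not taken from that paper), the sketch below prints a linear model's coefficients and a tree's learned rules.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                      # toy features
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# A linear model is interpretable because each coefficient is a readable
# statement: one unit of feature i changes the prediction by coef_[i].
lin = LinearRegression().fit(X, y)
print("intercept:", lin.intercept_, "coefficients:", lin.coef_)

# A shallow decision tree is interpretable because its learned rules can be
# printed verbatim as nested if/else thresholds.
tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["x0", "x1"]))
```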
Data augmentation strategies are actively used when training deep neural networks (DNNs). Recent studies suggest that they are effective at various tasks. However, the effect of data augmentation on DNNs' interpretability has not yet been widely investigated. In this paper, we explore the relationship between interpretability and data augmentation strategy: models are trained with different data augmentation methods and evaluated in terms of interpretability. To quantify interpretability, we devise three evaluation methods based on alignment with humans, faithfulness to the model, and the number of human-recognizable concepts in the model. Comprehensive experiments show that models trained with mixed-sample data augmentation show lower interpretability, especially for the CutMix and SaliencyMix augmentations. This new finding suggests that mixed-sample data augmentation should be adopted carefully because of its impact on model interpretability, especially in mission-critical applications.
    Interpretability
    Sample (material)
    Citations (3)
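CutMix, one of the mixed-sample augmentations the study above flags, pastes a random rectangular patch from one training image into another and mixes the one-hot labels in proportion to the pasted area. The following NumPy sketch illustrates that operation; the array layout and the Beta(alpha, alpha) sampling parameter are assumptions for illustration.

```python
import numpy as np

def cutmix(image_a, image_b, label_a, label_b, alpha=1.0, rng=None):
    """Mixed-sample augmentation: paste a random box from image_b into
    image_a and mix the one-hot labels by the pasted area.

    Images are (H, W, C) arrays; labels are one-hot vectors.
    """
    rng = rng or np.random.default_rng()
    h, w = image_a.shape[:2]

    lam = rng.beta(alpha, alpha)             # mixing ratio kept for image_a
    cut_ratio = np.sqrt(1.0 - lam)           # box side ~ sqrt of mixed area
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)

    cy, cx = rng.integers(h), rng.integers(w)        # box centre
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)

    mixed = image_a.copy()
    mixed[y1:y2, x1:x2] = image_b[y1:y2, x1:x2]      # paste the patch

    # Recompute lambda from the actual pasted area (clipping may shrink it).
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / float(h * w)
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed, mixed_label
```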
    Although the synthesis of programs encoding policies often carries the promise of interpretability, systematic evaluations were never performed to assess the interpretability of these policies, likely because of the complexity of such an evaluation. In this paper, we introduce a novel metric that uses large-language models (LLM) to assess the interpretability of programmatic policies. For our metric, an LLM is given both a program and a description of its associated programming language. The LLM then formulates a natural language explanation of the program. This explanation is subsequently fed into a second LLM, which tries to reconstruct the program from the natural-language explanation. Our metric then measures the behavioral similarity between the reconstructed program and the original. We validate our approach with synthesized and human-crafted programmatic policies for playing a real-time strategy game, comparing the interpretability scores of these programmatic policies to obfuscated versions of the same programs. Our LLM-based interpretability score consistently ranks less interpretable programs lower and more interpretable ones higher. These findings suggest that our metric could serve as a reliable and inexpensive tool for evaluating the interpretability of programmatic policies.
    Interpretability
    Similarity (geometry)
    Citations (2)
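The metric described above is a round trip: one LLM explains a program in natural language, a second LLM reconstructs a program from that explanation, and the score is the behavioral similarity between the original and the reconstruction. The sketch below outlines such a pipeline; the `explainer`/`reconstructor` callables, the `compile_policy` helper, the prompts, and exact-match action agreement are hypothetical stand-ins rather than the paper's implementation.

```python
from typing import Callable, Iterable

# Hypothetical LLM wrapper: takes a prompt string, returns the model's text.
# Any chat-completion client could be plugged in here.
AskLLM = Callable[[str], str]

def reconstruction_score(program_source: str,
                         language_description: str,
                         compile_policy: Callable[[str], Callable],
                         states: Iterable,
                         explainer: AskLLM,
                         reconstructor: AskLLM) -> float:
    """Interpretability as a round trip: explain, reconstruct, compare.

    compile_policy turns source code into a callable policy state -> action
    (assumed helper); states is a sample of game states used to compare
    behavior.
    """
    explanation = explainer(
        f"Programming language:\n{language_description}\n\n"
        f"Explain what this policy does in plain English:\n{program_source}"
    )
    reconstructed_source = reconstructor(
        f"Programming language:\n{language_description}\n\n"
        f"Write a policy in that language matching this description:\n{explanation}"
    )

    original = compile_policy(program_source)
    rebuilt = compile_policy(reconstructed_source)

    # Behavioral similarity: fraction of sampled states where both policies
    # choose the same action (one simple choice of similarity measure).
    states = list(states)
    agree = sum(original(s) == rebuilt(s) for s in states)
    return agree / len(states)
```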
In this paper, we propose new evaluation measures for scene segmentation results, based on computing the difference between a region extracted from a segmentation map and the corresponding region in an ideal segmentation. The proposed measures account for under-detected and over-detected pixels separately, and also incorporate the compactness of the region under investigation.
    Segmentation-based object categorization
    Region growing
    Citations (23)
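The evaluation measures summarized above count under-detected and over-detected pixels separately against an ideal segmentation and also factor in the region's compactness. The sketch below computes those ingredients with NumPy; the normalization by reference-region size and the isoperimetric compactness formula are illustrative assumptions rather than the paper's exact definitions.

```python
import numpy as np

def region_errors(segmented, reference):
    """Count under- and over-detected pixels separately.

    segmented, reference : boolean (H, W) masks for one region.
    Returns rates normalized by the reference region's size.
    """
    seg = segmented.astype(bool)
    ref = reference.astype(bool)
    under = np.logical_and(ref, ~seg).sum()   # reference pixels missed
    over = np.logical_and(seg, ~ref).sum()    # extra pixels detected
    n_ref = max(ref.sum(), 1)
    return under / n_ref, over / n_ref

def compactness(mask):
    """Isoperimetric compactness 4*pi*area / perimeter**2 (1.0 for a disc)."""
    mask = mask.astype(bool)
    area = mask.sum()
    # Approximate the perimeter as the count of foreground pixels that have
    # at least one 4-connected background neighbour.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~interior
    perimeter = max(boundary.sum(), 1)
    return 4.0 * np.pi * area / perimeter ** 2
```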
Due to the "black-box" nature of artificial intelligence (AI) recommendations, interpretability is critical to the consumer experience of human-AI interaction. Unfortunately, improving the interpretability of AI recommendations is technically challenging and costly. Therefore, there is an urgent need for the industry to identify when the interpretability of AI recommendations is more likely to be needed. This study defines the construct of Need for Interpretability (NFI) of AI recommendations and empirically tests consumers' need for interpretability of AI recommendations in different decision-making domains. Across two experimental studies, we demonstrate that consumers do indeed have a need for interpretability toward AI recommendations, and that the need for interpretability is higher in utilitarian domains than in hedonic domains. This study would help companies to identify the varying need for interpretability of AI recommendations in different application scenarios.
    Interpretability
    Black box