In this paper, we address the challenge of building a disclosed lifelog dataset by proposing principles for constructing and sharing such data. Based on the proposed principles, we describe the process by which we built the benchmarking lifelog dataset for the NTCIR-13 Lifelog-2 task. Further, a list of potential applications and a framework for anonymisation are proposed and discussed.
Missing data is a prevalent issue that can significantly impair model performance and interpretability. This paper briefly summarizes the development of the missing-data field with respect to Explainable Artificial Intelligence and experimentally investigates the effects of various imputation methods on the calculation of Shapley values, a popular technique for interpreting complex machine learning models. We compare different imputation strategies and assess their impact on feature importance and interactions as determined by Shapley values. We also theoretically analyze the effects of missing values on Shapley values. Importantly, our findings reveal that the choice of imputation method can introduce biases that change the Shapley values, thereby affecting the interpretability of the model, and that a lower test prediction mean squared error (MSE) does not necessarily imply a lower MSE in Shapley values, and vice versa. Moreover, while XGBoost can handle missing data directly, training XGBoost on data with missing values can seriously affect interpretability compared to imputing the data before training. This study provides a comprehensive evaluation of imputation methods in the context of model interpretation, offering practical guidance for selecting appropriate techniques based on dataset characteristics and analysis objectives. The results underscore the importance of accounting for imputation effects to ensure robust and reliable insights from machine learning models.
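
To make the kind of comparison described here concrete, the sketch below illustrates (under stated assumptions, not as this paper's exact pipeline) how Shapley values can be compared across imputation strategies: a synthetic dataset with values missing completely at random is imputed with scikit-learn's SimpleImputer and IterativeImputer, an XGBoost model is fit on each variant (and on the raw data with NaNs, which XGBoost accepts natively), and the resulting TreeExplainer Shapley values are compared against those from the complete data. The dataset, hyperparameters, and MSE-based comparison are all illustrative placeholders.

```python
# Minimal sketch: compare Shapley values obtained after different imputation
# strategies. Assumes scikit-learn, xgboost, and shap are installed; the
# dataset and missingness mechanism are synthetic placeholders.
import numpy as np
import shap
import xgboost as xgb
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + 2 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=500)

# Introduce 20% missing values completely at random (MCAR).
X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.2] = np.nan

imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "iterative": IterativeImputer(random_state=0),
}

shap_by_method = {}
for name, imp in imputers.items():
    X_imp = imp.fit_transform(X_miss)
    model = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X_imp, y)
    shap_by_method[name] = shap.TreeExplainer(model).shap_values(X_imp)

# XGBoost can also consume NaNs directly; its Shapley values may differ.
model_nan = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X_miss, y)
shap_by_method["xgboost_nan"] = shap.TreeExplainer(model_nan).shap_values(X_miss)

# MSE between each method's Shapley values and those from the complete data,
# as a simple proxy for the imputation-induced bias discussed above.
model_full = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X, y)
shap_full = shap.TreeExplainer(model_full).shap_values(X)
for name, s in shap_by_method.items():
    print(name, np.mean((s - shap_full) ** 2))
```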
Commonly used classical supervised methods often suffer from the requirement of an abundant number of training samples and are unable to generalize to unseen datasets. As a result, the broader application of any trained model is very limited in clinical settings. Few-shot approaches, however, can minimize the need for large quantities of reliable ground-truth labels, which are both labor-intensive and expensive to obtain. To this end, we propose to exploit an optimization-based implicit model-agnostic meta-learning (iMAML) algorithm in a few-shot setting for medical image segmentation. Our approach can leverage the learned weights from a diverse set of training samples and can be deployed on a new unseen dataset. We show that, unlike classical few-shot learning approaches, our method has improved generalization capability. To our knowledge, this is the first work that exploits iMAML for medical image segmentation. Our quantitative results on publicly available skin and polyp datasets show that the proposed method outperforms the naive supervised baseline model and two recent few-shot segmentation approaches by large margins.
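
As a rough illustration of the core iMAML machinery (Rajeswaran et al., 2019), the PyTorch sketch below computes the implicit meta-gradient by solving (I + H/λ)v = g with a few conjugate-gradient steps, where H is the Hessian of a task's inner loss and g the task gradient. This is a minimal sketch under simplifying assumptions, not the paper's segmentation training code; the toy 1x1-convolution "segmentation" network and random tensors are placeholders.

```python
# Minimal iMAML-style implicit meta-gradient sketch (a simplification, not
# the paper's training code). The meta-gradient is approximated by solving
# (I + H/lam) v = g with conjugate gradient (CG), where H is the Hessian of
# the task's inner-loop loss.
import torch
import torch.nn.functional as F

def hvp(loss_fn, params, v):
    """Hessian-vector product of loss_fn w.r.t. params, via double backward."""
    grads = torch.autograd.grad(loss_fn(), params, create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    hv = torch.autograd.grad((flat * v).sum(), params)
    return torch.cat([h.reshape(-1) for h in hv])

def implicit_meta_grad(loss_fn, params, g, lam=1.0, cg_steps=5):
    """Approximate (I + H/lam)^{-1} g with a few CG iterations."""
    x = torch.zeros_like(g)
    r, p = g.clone(), g.clone()
    for _ in range(cg_steps):
        Ap = p + hvp(loss_fn, params, p) / lam
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x

# Toy stand-in for a segmentation task: a 1x1-conv "network" on random data.
net = torch.nn.Conv2d(1, 2, kernel_size=1)
x = torch.randn(4, 1, 8, 8)
y = torch.randint(0, 2, (4, 8, 8))
params = list(net.parameters())
loss_fn = lambda: F.cross_entropy(net(x), y)
g = torch.cat([gi.reshape(-1) for gi in torch.autograd.grad(loss_fn(), params)])
meta_g = implicit_meta_grad(loss_fn, params, g)
print(meta_g.shape)  # flattened meta-gradient over all network parameters
```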
Surgical scene segmentation is essential for anatomy and instrument localization, which can in turn be used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presence information. In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set. The 2020 CATARACTS Semantic Segmentation Challenge, a sub-challenge of the 2020 MICCAI Endoscopic Vision (EndoVis) Challenge, presented three sub-tasks to assess participating solutions on anatomical structure and instrument segmentation. Their performance was assessed on a hidden test set of 531 images from 10 videos of the CATARACTS test set.
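
For context on how pixel-wise segmentation quality is typically quantified in such challenges, the sketch below computes a per-class Dice score from integer label masks. This is an illustrative implementation, not the official challenge evaluation code; the class count and random masks are placeholders.

```python
# Illustrative per-class Dice score for pixel-wise semantic segmentation,
# a common assessment metric for anatomy/instrument masks (not the official
# CATARACTS evaluation code).
import numpy as np

def dice_per_class(pred: np.ndarray, gt: np.ndarray, num_classes: int):
    """Dice = 2|P∩G| / (|P| + |G|) for each class label in integer masks."""
    scores = {}
    for c in range(num_classes):
        p, g = pred == c, gt == c
        denom = p.sum() + g.sum()
        if denom == 0:
            continue  # class absent from both masks; skip rather than score
        scores[c] = 2.0 * np.logical_and(p, g).sum() / denom
    return scores

# Example on random 3-class masks (placeholder data):
rng = np.random.default_rng(0)
pred = rng.integers(0, 3, size=(256, 256))
gt = rng.integers(0, 3, size=(256, 256))
print(dice_per_class(pred, gt, num_classes=3))
```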
Assisted reproductive technology (ART) addresses infertility by combining sperm and egg in vitro and implanting the resultant embryo into the female for a healthy birth. The success of assisted reproduction relies on selecting the most viable embryo, a task manually carried out by embryologists. Embryologists assess the viability of an embryo within a cohort by examining the morphological development of embryo cleavage stages. However, this selection process is subjective and resource-intensive. This study advances the field of ART by employing artificial intelligence to assess embryos at the morula stage, an essential yet less explored cleavage stage in embryo development. This approach attempts to bridge the existing gap in understanding the morphological structure of the morula and to potentially transform assisted reproductive processes by improving the embryo selection criteria. Specifically, we train a video classifier to analyze the morphological development of embryos from the beginning to the end of the morula stage and to predict the embryo's fate: whether to transfer, cryopreserve, or discard it. The classifiers efficiently predicted 'discard' videos but frequently confused 'transfer' with 'cryopreserve' videos. The classifier's understanding of the morphological characteristics at the morula stage was investigated using the Grad-CAM explainable AI technique, providing embryologists with deeper insights into embryo development at this stage. The Grad-CAM heatmaps showed distinct morula development patterns for the predicted outcomes 'cryopreserve' and 'discard', while 'transfer' videos lacked relevant morphology cues.
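
The sketch below outlines how Grad-CAM heatmaps can be extracted from a 3D-CNN video classifier; torchvision's r3d_18 stands in for the paper's (unspecified) model, and the target layer, three-class head, and input clip shape are assumptions made purely for illustration.

```python
# Hedged Grad-CAM sketch for a 3D-CNN video classifier. r3d_18, the layer
# choice, the 3-class head, and the clip shape are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision.models.video import r3d_18

model = r3d_18(num_classes=3).eval()  # e.g. transfer / cryopreserve / discard
feats, grads = {}, {}
layer = model.layer4  # last conv block; a typical Grad-CAM target

layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

clip = torch.randn(1, 3, 16, 112, 112)  # (batch, channels, frames, H, W)
logits = model(clip)
logits[0, logits.argmax()].backward()  # backprop the predicted class score

# Channel weights = gradients averaged over time and space; weighted sum of
# activations, ReLU'd, gives a coarse spatio-temporal heatmap.
w = grads["a"].mean(dim=(2, 3, 4), keepdim=True)
cam = F.relu((w * feats["a"]).sum(dim=1))  # (1, T', H', W')
cam = F.interpolate(cam.unsqueeze(1), size=clip.shape[2:],
                    mode="trilinear", align_corners=False)
print(cam.shape)  # per-frame heatmap, upsampled to the input resolution
```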