Improve Image Classification Tasks Using Simple Convolutional Architectures with Processed Metadata Injection

2019 
Image analysis tasks have multiplied with the advent of Deep Learning. In particular, techniques based on Convolutional Neural Networks (CNNs) have yielded significant improvements on important problems such as image classification and segmentation. This has led developers and researchers to study increasingly complex architectures in order to deal with the growing complexity of these challenges. In some cases, however, such architectures may prove overkill for a specific task. In a recent project we had to fully automate defect image identification in an industrial context: each image is accompanied by a set of more than 200 metadata, and both metadata and images are used to classify the samples, distinguishing severe defects from non-severe ones. A CNN with a simple architecture does not fully solve this task, while metadata alone are not sufficient to discriminate well between severe and non-severe defects. On the other hand, CNNs with more sophisticated architectures, e.g. Inception or DenseNet, can reach significant improvements, at the cost of longer training times and a larger memory footprint. In this paper we propose an approach that combines different networks and techniques. By means of a PCA on the available metadata, Hotelling's T² and Q residuals are calculated and used to remove anomalous samples; one dataset is composed of these clean samples, while another is defined using PLS-DA to identify which metadata are useful for classification. We then combine two networks: a simple convolutional architecture extracts significant features from the images, while a second neural network processes the preprocessed metadata corresponding to the image at the convolutional input; a feed-forward neural network combines the features extracted from the image with those computed from the metadata to produce the final classification. Combining dimensionality reduction and feature selection to support deep learning in this way allowed us to match, and often exceed, the results of more complex and larger models.
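A minimal sketch of the metadata-cleaning step the abstract describes: PCA on the metadata matrix, followed by Hotelling's T² and Q residuals (squared prediction error) to flag anomalous samples. The number of components and the 95th-percentile cutoffs are illustrative assumptions; the abstract does not state the thresholds actually used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def clean_samples(X_meta, n_components=10, quantile=0.95):
    """Return a boolean mask of samples kept after T^2 / Q screening.
    Thresholds are hypothetical percentile cutoffs, not the paper's values."""
    X = StandardScaler().fit_transform(X_meta)
    pca = PCA(n_components=n_components).fit(X)
    scores = pca.transform(X)                                  # shape (n, k)
    # Hotelling's T^2: squared score distance, scaled by each eigenvalue
    t2 = np.sum(scores**2 / pca.explained_variance_, axis=1)
    # Q residual (SPE): squared reconstruction error outside the PCA model
    residual = X - pca.inverse_transform(scores)
    q = np.sum(residual**2, axis=1)
    return (t2 <= np.quantile(t2, quantile)) & (q <= np.quantile(q, quantile))
```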
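For the second dataset, the abstract says PLS-DA is used to identify which metadata are useful for classification, without naming a selection criterion. The sketch below assumes VIP (Variable Importance in Projection) scores, a common choice for PLS-DA feature selection; the VIP > 1 rule is likewise a conventional assumption, not confirmed by the paper.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def plsda_vip(X, y, n_components=2):
    """Fit PLS-DA (PLS regression on the binary severity label) and
    return per-feature VIP scores; features with VIP > 1 are often kept."""
    pls = PLSRegression(n_components=n_components).fit(X, y)
    t, w, q = pls.x_scores_, pls.x_weights_, pls.y_loadings_
    p = w.shape[0]
    # Y-variance explained by each latent component: ||t_a||^2 * ||q_a||^2
    s = np.sum(t**2, axis=0) * np.sum(q**2, axis=0)
    w_norm = w / np.linalg.norm(w, axis=0, keepdims=True)
    return np.sqrt(p * (w_norm**2 @ s) / s.sum())

# Hypothetical usage: keep metadata columns whose VIP exceeds 1
# selected = X[:, plsda_vip(X, y) > 1]
```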
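Finally, a hedged sketch of the two-branch architecture the abstract outlines: a deliberately simple CNN over the defect image, a dense branch over the selected metadata, and a feed-forward head that fuses both feature sets for the binary severe/non-severe decision. Layer counts, widths, and the 128x128 grayscale input are assumptions for illustration, not the paper's actual configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model(img_shape=(128, 128, 1), n_meta=50):
    # Convolutional branch: a simple feature extractor over the image
    img_in = layers.Input(shape=img_shape)
    x = layers.Conv2D(16, 3, activation="relu")(img_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)

    # Metadata branch: dense layer over the preprocessed metadata vector
    meta_in = layers.Input(shape=(n_meta,))
    m = layers.Dense(32, activation="relu")(meta_in)

    # Fusion head: feed-forward network over the concatenated features
    z = layers.Concatenate()([x, m])
    z = layers.Dense(32, activation="relu")(z)
    out = layers.Dense(1, activation="sigmoid")(z)

    model = Model([img_in, meta_in], out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```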