Deep Multimodal Speaker Naming
Abstract:
Automatic speaker naming is the problem of localizing and identifying each speaking character in a TV, movie, or live-show video. The problem is challenging mainly because of its multimodal nature: the face cue alone is insufficient to achieve good performance. Previous multimodal approaches usually process the data of each modality individually and merge the results using handcrafted heuristics. Such approaches work well for simple scenes, but fail to achieve high performance for speakers with large appearance variations. In this paper, we propose a novel convolutional neural network (CNN) based learning framework that automatically learns the fusion function of face and audio cues. We show that, without using face tracking, facial landmark localization, or subtitles/transcripts, our system with robust multimodal feature extraction achieves state-of-the-art speaker naming performance on two diverse TV series. The dataset and implementation of our algorithm are publicly available online.
Keywords:
Merge (version control)
Heuristics
Robustness
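The abstract contrasts handcrafted merging rules with a fusion function that is learned from data. As a rough illustration only, the sketch below concatenates a face feature vector and an audio feature vector and passes them through a single linear layer with softmax. The function names, the toy feature values, and the weights are all hypothetical; the abstract does not describe the actual network architecture, and in a real system the weights would be learned rather than written by hand.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(face_feat, audio_feat, weights, bias):
    """Concatenate the two modalities, then apply one linear layer and softmax.

    weights has one row per candidate speaker; each row spans the
    concatenated (face + audio) feature vector.
    """
    fused = list(face_feat) + list(audio_feat)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy, hand-picked values (assumptions for illustration, not from the paper).
face_feat = [0.9, 0.1]   # hypothetical face descriptor
audio_feat = [0.2, 0.8]  # hypothetical audio descriptor
weights = [
    [1.0, 0.0, 0.0, 1.0],  # speaker 0
    [0.0, 1.0, 1.0, 0.0],  # speaker 1
]
bias = [0.0, 0.0]

probs = fuse_and_classify(face_feat, audio_feat, weights, bias)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

Because the weights span both modalities, training can decide how much each cue contributes to each speaker, which is the role a handcrafted merging heuristic would otherwise play.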
Citations (5)
Cost partitioning is a method for admissibly combining admissible heuristics. In this work, we extend this concept to merge-and-shrink (M&S) abstractions that may use labels that do not directly correspond to operators. We investigate how optimal and saturated cost partitioning (SCP) interact with M&S transformations and develop a method to compute SCPs during the computation of M&S. Experiments show that SCP significantly improves M&S on standard planning benchmarks.
Heuristics
Merge (version control)
Citations (4)
End-to-end learning-based image dehazing methods tend to overdehaze or underdehaze in real scenes due to inefficient feature extraction and feature fusion. In this letter, we propose a multiscale supervision-guided context aggregation network (MSGCAN) based on two principles: improving feature extraction and enhancing feature mapping. To improve feature extraction, an attention-guided context aggregation (AGCA) module is adopted to merge context features extracted by several residual dense blocks (RDB). Moreover, we output these aggregated context features on each scale and form multiscale supervision to enhance feature mapping and ensure that the extracted features on each scale contain more realistic details. The experimental results show that the proposed MSGCAN performs better than other state-of-the-art dehazing methods in both synthetic and real-world scenes.
Merge (version control)
Feature (linguistics)
Context model
Citations (19)
Merge-and-shrink (M&S) is a framework to generate abstraction heuristics for cost-optimal planning. A recent approach computes simulation relations on a set of M&S abstractions in order to identify states that are better than others. This relation is then used for pruning states in the search when a "better" state is already known. We propose the usage of simulation relations inside the M&S framework in order to detect irrelevant transitions in abstract state spaces. This potentially simplifies the abstraction allowing M&S to derive more informed heuristics. We also tailor M&S to remove irrelevant operators from the planning task. Experimental results show the potential of our approach to construct well-informed heuristics and simplify the planning tasks prior to the search.
Merge (version control)
Heuristics
Pruning
Abstraction
Citations (9)
This paper is concerned with heuristics for capacitated plant location models where locations have different capacities. In this case ADD-heuristics normally lead to bad solutions. We present some starting procedures (priority rules) in order to overcome this difficulty. Finally, we report numerical results, including comparisons between ADD-heuristics with starting procedures and DROP-heuristics.
Heuristics
Citations (0)
Deep learning has made spectacular achievements in analysing natural images, but it faces challenges in medical applications, partly because of a shortage of images. Aiming to classify malignant and benign pulmonary nodules using CT images, we explore different strategies for utilizing state-of-the-art deep convolutional neural networks (CNNs). Experiments are conducted using the Lung Image Database Consortium image collection (LIDC-IDRI), a public database containing 1018 cases. Three strategies are implemented: 1) modifying some state-of-the-art CNN architectures, 2) integrating different CNNs, and 3) adopting transfer learning. In total, 11 deep CNN models are compared using the same dataset. The study demonstrates that, for the model modification scheme, a concise CifarNet performs better than the other modified CNNs with more complex architectures, achieving an area under the ROC curve (AUC) of 0.90. Integrated CNN models do not significantly improve the classification performance, but the model complexity is reduced. Transfer learning outperforms the other two schemes, and ResNet with fine-tuning leads to the best performance, with an AUC of 0.94, a sensitivity of 91%, and an overall accuracy of 88%. Model modification, model integration, and transfer learning can play important roles in identifying and generating optimal deep CNN models for efficiently classifying pulmonary nodules based on CT images. Transfer learning is preferred when applying deep learning to medical imaging applications.
Transfer of learning
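The abstract above reports AUC, sensitivity, and overall accuracy. As a reminder of how these three metrics are defined, here is a minimal pure-Python sketch. The labels, scores, and the 0.5 threshold are made-up toy values, not data from the study, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_and_accuracy(labels, scores, threshold=0.5):
    """Threshold the scores, then compute sensitivity (true positive rate)
    and overall accuracy."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("sensitivity is undefined without positive cases")
    return tp / n_pos, (tp + tn) / len(labels)


# Toy example: 1 = malignant, 0 = benign (values invented for illustration).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(labels, scores)
sens, acc = sensitivity_and_accuracy(labels, scores)
print(auc, sens, acc)
```

AUC is independent of any decision threshold, whereas sensitivity and accuracy depend on the threshold chosen, which is why a paper typically reports AUC alongside the thresholded metrics.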
Citations (54)
Manual fruit classification is the traditional way of classifying fruits. It is manual contact labor that is time-consuming and often results in lower productivity, inconsistency, and sometimes damage to the fruits (Prabha & Kumar, 2012). New technologies such as deep learning have therefore paved the way for faster and more efficient methods of fruit classification (Faridi & Aboonajmi, 2017). A deep convolutional neural network is a deep learning model that stacks several layers of neural networks to create a more complex model capable of solving complex problems. State-of-the-art pre-trained deep learning models such as AlexNet, GoogLeNet, and ResNet-50 have been widely used. However, such models were not explicitly trained for fruit classification (Dyrmann, Karstoft, & Midtiby, 2016). The study aimed to create a new deep convolutional neural network and compare its performance with that of fine-tuned models in terms of accuracy, precision, sensitivity, and specificity.
Deep Neural Networks
Citations (2)
Researchers in both the “heuristics and biases” school and the school that extols the use of “fast and frugal” heuristics not only share similar methods but agree that people frequently make perfectly satisfactory judgments using limited information and limited computational abilities. However, those in the H&B school emphasize the degree to which the use of heuristics often prevents us from choosing options that would maximize expected value in the way that conventional rational choice theorists believe we do. Those who think of heuristics as “fast and frugal” techniques to make decisions that achieve an organism’s ends in a given environment are considerably less interested in “biases” than in achievements. Moreover, scholars in the two schools think differently about why using heuristics sometimes leads to mistakes; about the nature of rationality; about how heuristics emerge and operate; and about whether people are less prone to use heuristics when they have certain background traits or have more time or cognitive resources to reach a decision.
Heuristics
Social heuristics
Citations (0)
Deep learning is now an active research area and has achieved considerable success in computer vision and image recognition. It is a subset of machine learning. Within deep learning, the convolutional neural network (CNN) is a popular deep neural network approach. In this paper, we address how to extract useful leaf features automatically from a leaf dataset using CNNs. We show that the accuracy obtained by the CNN approach compares favorably with the accuracy obtained by a traditional neural network.
Deep Neural Networks
Citations (7)