The manifold of symmetric positive definite (SPD) matrices has drawn significant attention because of its widespread applications. SPD matrices provide compact nonlinear representations of data and form a special type of Riemannian manifold. Directly applying support vector machines on the SPD manifold may fail due to the lack of samples per class. In this paper, we propose a support vector metric learning (SVML) model on the SPD manifold. We define a positive definite kernel for point pairs on the SPD manifold and transform metric learning on the SPD manifold into a point-pair classification problem, which can be efficiently solved by standard support vector machines. Compared with directly classifying points on the SPD manifold with support vector machines, SVML effectively learns a distance metric for SPD matrices by training a binary support vector machine model. Experiments on video-based face recognition, image set classification, and material classification show that SVML outperforms state-of-the-art metric learning algorithms on the SPD manifold.
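To make the point-pair classification idea concrete, here is a minimal sketch assuming a log-Euclidean pair representation and an off-the-shelf RBF kernel in scikit-learn; the pair feature and kernel below are illustrative choices, not the kernel defined in the paper.

```python
# Hypothetical sketch of metric learning on the SPD manifold via point-pair
# classification with an SVM.  The log-Euclidean pair feature and the RBF kernel
# are illustrative assumptions, not the paper's kernel.
import numpy as np
from scipy.linalg import logm
from sklearn.svm import SVC

def spd_log_vec(X):
    """Map an SPD matrix to the vectorized matrix logarithm (log-Euclidean chart)."""
    L = logm(X)
    return np.real(L[np.triu_indices_from(L)])

def pair_feature(X, Y):
    """Feature for a pair of SPD matrices: absolute difference of log-vectors."""
    return np.abs(spd_log_vec(X) - spd_log_vec(Y))

def build_pairs(spd_mats, labels, rng, n_pairs=500):
    """Sample pairs labeled +1 (same class) or -1 (different class)."""
    feats, pair_labels = [], []
    n = len(spd_mats)
    for _ in range(n_pairs):
        i, j = rng.integers(0, n, size=2)
        feats.append(pair_feature(spd_mats[i], spd_mats[j]))
        pair_labels.append(1 if labels[i] == labels[j] else -1)
    return np.array(feats), np.array(pair_labels)

# Toy data: random SPD matrices (A @ A.T + eps * I) with dummy class labels.
rng = np.random.default_rng(0)
mats = [a @ a.T + 1e-3 * np.eye(5) for a in rng.standard_normal((60, 5, 5))]
labels = rng.integers(0, 3, size=60)

F, y = build_pairs(mats, labels, rng)
svm = SVC(kernel="rbf", gamma="scale").fit(F, y)   # binary pair classifier

# The negated decision value serves as a learned dissimilarity between two SPD matrices.
d = -svm.decision_function([pair_feature(mats[0], mats[1])])[0]
print("learned dissimilarity:", d)
```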
In this paper, we give some estimates for the essential norm of weighted composition operators from the Bloch space and the Zygmund space to the Bloch space.
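For reference, the standard unit-disc definitions assumed here: the Bloch norm, the Zygmund norm, the weighted composition operator uC_φ, and the essential norm (normalizations may differ slightly from those used in the paper).

```latex
% Standard definitions on the unit disc \mathbb{D} (assumed normalizations).
\[
  \|f\|_{\mathcal B} = |f(0)| + \sup_{z\in\mathbb D}(1-|z|^2)\,|f'(z)|,
  \qquad
  \|f\|_{\mathcal Z} = |f(0)| + |f'(0)| + \sup_{z\in\mathbb D}(1-|z|^2)\,|f''(z)|,
\]
\[
  (uC_\varphi f)(z) = u(z)\,f(\varphi(z)),
  \qquad
  \|T\|_{e} = \inf\{\,\|T-K\| : K \ \text{compact}\,\}.
\]
```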
Cross-domain face translation aims to transfer face images from one domain to another. It can be widely used in practical applications, such as photos/sketches in law enforcement, photos/drawings in digital entertainment, and near-infrared (NIR)/visible (VIS) images in security access control. Restricted by limited cross-domain face image pairs, the existing methods usually yield structural deformation or identity ambiguity, which leads to poor perceptual appearance. To address this challenge, we propose a multi-view knowledge (structural knowledge and identity knowledge) ensemble framework with frequency consistency (MvKE-FC) for cross-domain face translation. Due to the structural consistency of facial components, the multi-view knowledge learned from large-scale data can be appropriately transferred to limited cross-domain image pairs and significantly improve the generative performance. To better fuse multi-view knowledge, we further design an attention-based knowledge aggregation module that integrates useful information, and we also develop a frequency-consistent (FC) loss that constrains the generated images in the frequency domain. The designed FC loss consists of a multidirection Prewitt (mPrewitt) loss for high-frequency consistency and a Gaussian blur loss for low-frequency consistency. Furthermore, our FC loss can be flexibly applied to other generative models to enhance their overall performance. Extensive experiments on multiple cross-domain face datasets demonstrate the superiority of our method over state-of-the-art methods both qualitatively and quantitatively.
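A minimal PyTorch sketch of an FC-style loss is given below, combining a multidirection Prewitt term for high-frequency consistency with a Gaussian blur term for low-frequency consistency; the filter directions, kernel size, sigma, and weights are illustrative assumptions rather than the paper's settings.

```python
# Hypothetical sketch of a frequency-consistent (FC) loss: multidirection Prewitt
# responses for high frequencies plus Gaussian-blurred images for low frequencies.
import torch
import torch.nn.functional as F

def _prewitt_kernels():
    """Prewitt kernels in four directions (0, 45, 90, 135 degrees)."""
    k0 = torch.tensor([[-1., 0., 1.], [-1., 0., 1.], [-1., 0., 1.]])
    k90 = k0.t()
    k45 = torch.tensor([[0., 1., 1.], [-1., 0., 1.], [-1., -1., 0.]])
    k135 = torch.tensor([[1., 1., 0.], [1., 0., -1.], [0., -1., -1.]])
    return torch.stack([k0, k45, k90, k135]).unsqueeze(1)   # (4, 1, 3, 3)

def _gaussian_kernel(size=5, sigma=1.5):
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g).view(1, 1, size, size)

def fc_loss(fake, real, w_high=1.0, w_low=1.0):
    """Frequency-consistent loss on single-channel images of shape (B, 1, H, W)."""
    pk = _prewitt_kernels().to(fake.device)
    gk = _gaussian_kernel().to(fake.device)
    # High-frequency term: L1 between multidirection Prewitt edge responses.
    high = F.l1_loss(F.conv2d(fake, pk, padding=1), F.conv2d(real, pk, padding=1))
    # Low-frequency term: L1 between Gaussian-blurred images.
    low = F.l1_loss(F.conv2d(fake, gk, padding=2), F.conv2d(real, gk, padding=2))
    return w_high * high + w_low * low

# Usage with dummy grayscale batches.
fake, real = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
print(fc_loss(fake, real).item())
```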
Monotonic classification is a special kind of task in machine learning and pattern recognition, in which monotonicity constraints between the features and the decision should be taken into account. However, most existing techniques are not able to discover and represent the ordinal structures in monotonic datasets and are therefore inapplicable to monotonic classification. Feature selection has been proven effective in improving classification performance and avoiding overfitting. To the best of our knowledge, no technique has so far been specially designed for feature selection in monotonic classification. In this paper, we introduce a function called rank mutual information to evaluate the monotonic consistency between features and the decision in monotonic tasks. This function combines the advantage of dominance rough sets in reflecting ordinal structures with the robustness of mutual information. Rank mutual information is then integrated with the min-redundancy max-relevance search strategy to compute optimal subsets of features. A collection of numerical experiments is presented to show the effectiveness of the proposed technique.
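As a rough illustration, the following numpy sketch computes rank mutual information in the usual dominance-based form; the exact definition and the mRMR-style search used in the paper may differ in details.

```python
# Assumed dominance-based formulation of (ascending) rank mutual information:
#   RMI(A; D) = -1/n * sum_i log(|[x_i]_A| * |[x_i]_D| / (n * |[x_i]_A ∩ [x_i]_D|)),
# where [x_i]_A = {x_j : x_j >= x_i on every feature in A}.
import numpy as np

def dominance_sets(X):
    """Boolean matrix M with M[i, j] = True iff x_j dominates x_i on all columns of X."""
    X = np.atleast_2d(X.T).T                       # ensure shape (n, m)
    return np.all(X[None, :, :] >= X[:, None, :], axis=2)

def rank_mutual_information(X_features, y):
    """Rank mutual information between a feature subset (n, m) and the decision y (n,)."""
    n = len(y)
    A = dominance_sets(X_features)                 # dominance sets under the features
    D = dominance_sets(y.reshape(-1, 1))           # dominance sets under the decision
    a, d = A.sum(axis=1), D.sum(axis=1)
    ad = (A & D).sum(axis=1)                       # intersection sizes (>= 1: x_i dominates itself)
    return -np.mean(np.log(a * d / (n * ad)))

# Toy example: a monotone feature carries more rank information about y than a noisy one.
rng = np.random.default_rng(0)
y = np.sort(rng.integers(0, 3, size=50))
x_good = np.arange(50, dtype=float).reshape(-1, 1)     # monotonically related to y
x_noise = rng.standard_normal((50, 1))                 # unrelated
print(rank_mutual_information(x_good, y), rank_mutual_information(x_noise, y))
```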
Yager's entropy was proposed to compute the information of a fuzzy indiscernibility relation. In this paper, we present a novel interpretation of Yager's entropy from the viewpoint of the discernibility power of a relation. Some basic definitions in Shannon's information theory are then generalized based on Yager's entropy. We introduce joint entropy, conditional entropy, mutual information, and relative entropy to compute the information changes under operations on fuzzy indiscernibility relations. Conditional entropy and relative conditional entropy are proposed to measure the information increment, which is interpreted as the significance of an attribute in the fuzzy rough set model. As an application, we redefine the independence of an attribute set, reduct, and relative reduct in the fuzzy rough set model based on Yager's entropy. Experimental results show that the proposed approach is suitable for fuzzy and numeric data reduction.
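The following numpy sketch illustrates one common cardinality-based way to define entropy, joint entropy, conditional entropy, and mutual information for fuzzy indiscernibility relations; it is an assumed, generic formulation, and Yager's original measure and the paper's generalizations may differ.

```python
# Assumed, generic entropy measures for fuzzy indiscernibility relations:
#   H(R) = -1/n * sum_i log(|R(x_i)| / n),  |R(x_i)| = sum_j R(x_i, x_j),
# with the min t-norm intersecting relations for joint/conditional quantities.
import numpy as np

def relation_entropy(R):
    """Entropy of a fuzzy indiscernibility relation R (n x n, values in [0, 1])."""
    n = R.shape[0]
    card = R.sum(axis=1)                      # fuzzy cardinality of each granule R(x_i)
    return -np.mean(np.log2(card / n))

def joint_entropy(R, S):
    """Joint entropy of two relations via their min-intersection."""
    return relation_entropy(np.minimum(R, S))

def conditional_entropy(S, R):
    """H(S | R): the information increment of S beyond R (attribute significance)."""
    return joint_entropy(R, S) - relation_entropy(R)

def mutual_information(R, S):
    return relation_entropy(R) + relation_entropy(S) - joint_entropy(R, S)

# Toy fuzzy relations induced from two numeric attributes by a similarity kernel.
rng = np.random.default_rng(0)
a, b = rng.random(30), rng.random(30)
R = np.exp(-np.abs(a[:, None] - a[None, :]) / 0.2)    # similarity relation for attribute a
S = np.exp(-np.abs(b[:, None] - b[None, :]) / 0.2)    # similarity relation for attribute b
print(conditional_entropy(S, R), mutual_information(R, S))
```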
Establishing high-order interactions among pixels and object parts is one of the most fundamental problems in semantic segmentation. Recent proposals are based on non-local methods, which utilize the self-attention mechanism to capture long-range correlations. However, non-local methods can be very expensive, both in theory and in practice. Moreover, they are typically designed to address spatial correlations rather than feature correlations across channels. In this work, we propose a Row-Column Attention Network (RCANet) to encode global contextual information. It consists of a row-wise intra-channel attention module and a column-wise intra-channel attention module, followed by a cross-channel interaction module. We conduct experiments on two datasets, Cityscapes and ADE20K, and the results show that our method is comparable to state-of-the-art methods for semantic segmentation.
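The sketch below is one possible PyTorch reading of the idea (not the authors' RCANet implementation): attention restricted to rows and to columns within each channel, followed by a 1x1 convolution for cross-channel interaction; the module structure and parameterization are simplifying assumptions.

```python
# Illustrative row/column intra-channel attention with a cross-channel 1x1 conv.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RowColumnAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.cross_channel = nn.Conv2d(channels, channels, kernel_size=1)

    @staticmethod
    def _row_attention(x):
        """Intra-channel attention along the width dimension. x: (B, C, H, W)."""
        scores = x.unsqueeze(-1) * x.unsqueeze(-2)            # (B, C, H, W, W)
        attn = F.softmax(scores, dim=-1)
        return torch.matmul(attn, x.unsqueeze(-1)).squeeze(-1)

    def forward(self, x):
        row = self._row_attention(x)                                   # attend within each row
        col = self._row_attention(x.transpose(2, 3)).transpose(2, 3)   # attend within each column
        return self.cross_channel(row + col) + x                       # cross-channel mixing + residual

# Usage on a dummy feature map.
feat = torch.rand(1, 16, 32, 32)
out = RowColumnAttention(16)(feat)
print(out.shape)  # torch.Size([1, 16, 32, 32])
```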