This paper presents eight PAC-Bayes bounds for analyzing the generalization performance of multi-view classifiers. These bounds adopt data-dependent Gaussian priors that emphasize classifiers with high view agreement. The prior for the first two bounds is centered at the origin, while the prior for the third and fourth bounds is centered at a data-dependent vector. A key technique for obtaining these bounds is a pair of derived logarithmic determinant inequalities, which differ in whether the data dimensionality is involved. The centers of the fifth and sixth bounds are calculated on a separate subset of the training set. The last two bounds use unlabeled data to represent view agreement and are thus applicable to semi-supervised multi-view learning. We evaluate all the presented multi-view PAC-Bayes bounds on benchmark data, compare them with previous single-view PAC-Bayes bounds, and discuss the usefulness and performance of the multi-view bounds.
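For context, multi-view PAC-Bayes bounds of this kind refine the standard single-view PAC-Bayes theorem. In its common kl form (the Seeger/Maurer statement, given here for reference only, not as any of the eight bounds above), for a prior $P$ fixed before seeing the $m$-sample $S$ and any posterior $Q$, with probability at least $1-\delta$,

```latex
\mathrm{kl}\!\left(\widehat{R}_S(Q)\,\big\|\,R(Q)\right)
  \;\le\; \frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{m},
```

where $\mathrm{kl}(q\|p)$ is the KL divergence between Bernoulli distributions, $\widehat{R}_S(Q)$ is the empirical Gibbs risk, and $R(Q)$ is the true Gibbs risk. Data-dependent priors, as used above, tighten the $\mathrm{KL}(Q\|P)$ term by moving the prior closer to posteriors that agree across views.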
In Bayesian machine learning, sampling methods provide asymptotically unbiased estimates for inference over complex probability distributions, and Markov chain Monte Carlo (MCMC) is among the most popular of these methods. However, MCMC can suffer from high sample autocorrelation or perform poorly on some complex distributions. In this paper, we introduce Langevin diffusions into normalizing flows to construct a new dynamics-based sampling method. We propose a modified Kullback-Leibler divergence as the loss function for training the sampler, which ensures that samples generated by the proposed method converge to the target distribution. Because the gradient of the target distribution appears in the modified Kullback-Leibler divergence, its integral is intractable, so we approximate it with a Monte Carlo estimator. We also discuss the case where the target distribution is unnormalized. We illustrate the properties and performance of the proposed method on a variety of complex distributions and real datasets. The experiments indicate that the proposed method not only takes advantage of the flexibility of neural networks but also exploits the rapid convergence of the underlying dynamical system to the target distribution, demonstrating superior performance compared with dynamics-based MCMC samplers.
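The Langevin dynamics underlying such methods can be sketched as follows. This is the standard unadjusted Langevin update on a toy target (a standard Gaussian, whose score is known in closed form), not the paper's flow-based sampler or its modified Kullback-Leibler training loop:

```python
import numpy as np

def langevin_step(x, grad_log_p, step_size, rng):
    # Unadjusted Langevin update: x' = x + h * grad log p(x) + sqrt(2h) * noise.
    noise = rng.standard_normal(x.shape)
    return x + step_size * grad_log_p(x) + np.sqrt(2.0 * step_size) * noise

# Toy target: standard Gaussian, so grad log p(x) = -x.
# Note the normalizing constant is never needed, only the score.
grad_log_p = lambda x: -x

rng = np.random.default_rng(1)
x = rng.uniform(-5.0, 5.0, size=2000)   # 2000 chains run in parallel
for _ in range(500):
    x = langevin_step(x, grad_log_p, 0.05, rng)
print(x.mean(), x.std())  # approaches the target's mean 0 and std 1
```

Only the score `grad_log_p` enters the update, which is why the approach extends to unnormalized targets as discussed above.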
Recently, the restricted Boltzmann machine (RBM) has aroused considerable interest in the multiview learning field. Although effective, like many existing multiview learning models, the multiview RBM ignores the local manifold structure of multiview data. In this article, we first propose a novel graph RBM model, which preserves the data manifold structure and is amenable to Gibbs sampling. We then develop a multiview graph RBM model on the basis of the graph RBM, which performs local structural learning and multiview representation learning simultaneously. The proposed multiview model has the following merits: 1) it preserves the data manifold structure for multiview classification and 2) it performs view-consistent and view-specific representation learning simultaneously. Experimental results show that the proposed multiview model outperforms other state-of-the-art multiview classification algorithms.
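As a reminder of the sampling primitive involved, one block-Gibbs step for a plain binary RBM looks like the following; this is the generic RBM update with made-up sizes and weights, not the graph-regularized variant proposed above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_gibbs_step(v, W, b_h, b_v, rng):
    """One block-Gibbs step for a binary RBM: sample h ~ p(h|v), then v ~ p(v|h)."""
    p_h = sigmoid(v @ W + b_h)                        # hidden activation probabilities
    h = (rng.random(p_h.shape) < p_h).astype(float)   # sample hidden units
    p_v = sigmoid(h @ W.T + b_v)                      # visible reconstruction probabilities
    v = (rng.random(p_v.shape) < p_v).astype(float)   # sample visible units
    return v, h

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 4
W = rng.normal(0, 0.1, (n_vis, n_hid))
b_h, b_v = np.zeros(n_hid), np.zeros(n_vis)
v = (rng.random(n_vis) < 0.5).astype(float)
for _ in range(10):
    v, h = rbm_gibbs_step(v, W, b_h, b_v, rng)
print(v, h)
```

The bipartite structure of the RBM is what makes the blocked update possible: all hidden units are conditionally independent given the visible layer, and vice versa.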
This paper introduces an ensemble approach for electroencephalogram (EEG) signal classification, which aims to overcome the instability of the Fisher discriminant feature extractor in brain-computer interface (BCI) applications. Multiple individual classifiers are constructed by randomly selecting electrodes from a candidate set. In the feature subspace determined by a pair of randomly selected electrodes, principal component analysis (PCA) is first used for dimensionality reduction. Fisher discriminant analysis is then adopted for feature extraction, and a Bayesian classifier with a Gaussian mixture model (GMM) is trained to carry out classification. The outputs of all the individual classifiers are combined to give the final label. Experiments with real EEG signals from a BCI demonstrate the validity of the proposed random electrode selection (RES) approach.
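A minimal sketch of the random-subset ensemble idea follows, on synthetic data, with a simple shared-variance Gaussian classifier standing in for the PCA + Fisher discriminant + GMM stages; all names, sizes, and data here are illustrative, not the paper's pipeline:

```python
import numpy as np

def fit_gaussian_classifier(X, y):
    """Class-conditional Gaussians with a shared diagonal variance (a crude
    stand-in for the per-member feature extraction + Bayesian classifier)."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    var = X.var(axis=0) + 1e-6
    return classes, means, var

def predict_gaussian(model, X):
    classes, means, var = model
    scores = np.stack([-(((X - means[c]) ** 2) / var).sum(axis=1) for c in classes])
    return classes[scores.argmax(axis=0)]

def random_subset_ensemble(X, y, X_test, n_members=15, subset_size=2, seed=0):
    """Train each member on a random 'electrode' (feature) pair; majority-vote."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_members):
        idx = rng.choice(X.shape[1], size=subset_size, replace=False)
        model = fit_gaussian_classifier(X[:, idx], y)
        votes.append(predict_gaussian(model, X_test[:, idx]))
    votes = np.stack(votes)
    return np.array([np.bincount(col).argmax() for col in votes.T])

# Synthetic two-class data on 8 hypothetical channels.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (200, 8)), rng.normal(1.5, 1.0, (200, 8))])
y = np.r_[np.zeros(200, int), np.ones(200, int)]
pred = random_subset_ensemble(X, y, X, n_members=15, subset_size=2)
print((pred == y).mean())
```

Each member sees only a random electrode pair, so no single unstable feature extractor dominates; the vote averages out members that drew uninformative pairs.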
Traditional neural network approaches for traffic flow forecasting are usually single-task learning (STL) models, which do not exploit the information provided by related tasks. In contrast to STL, multitask learning (MTL) has the potential to improve generalization by transferring information in the training signals of extra tasks. In this paper, MTL-based neural networks are used for traffic flow forecasting. For neural network MTL, a backpropagation (BP) network is constructed by incorporating traffic flows at several contiguous time instants into the output layer. Nodes in the output layer can be seen as outputs of different but closely related STL tasks. Comprehensive experiments on urban vehicular traffic flow data and comparisons with STL show that MTL in BP neural networks is a promising and effective approach for traffic flow forecasting.
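The MTL output-layer idea can be sketched with a tiny one-hidden-layer BP network on synthetic periodic "traffic" data, where the three output nodes forecast three contiguous time instants; the sizes, data, and hyperparameters here are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy traffic-flow series: a daily-like periodic pattern plus noise.
t = np.arange(600)
flow = 50 + 20 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 1.0, t.size)

# Inputs: a window of past flows. Targets: the next 3 contiguous instants,
# i.e. one output node per related forecasting task (the MTL output layer).
window, horizon = 12, 3
n = len(flow) - window - horizon
X = np.stack([flow[i:i + window] for i in range(n)])
Y = np.stack([flow[i + window:i + window + horizon] for i in range(n)])
X = (X - X.mean()) / X.std()
Y = (Y - Y.mean()) / Y.std()

# One-hidden-layer BP network trained with plain full-batch gradient descent.
h, lr = 16, 0.02
W1 = rng.normal(0, 0.3, (window, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.3, (h, horizon)); b2 = np.zeros(horizon)
for epoch in range(2000):
    H = np.tanh(X @ W1 + b1)          # forward pass
    P = H @ W2 + b2
    G = 2.0 * (P - Y) / n             # dLoss/dP for mean-squared error
    gW2 = H.T @ G; gb2 = G.sum(0)
    GH = (G @ W2.T) * (1 - H ** 2)    # backprop through tanh
    gW1 = X.T @ GH; gb1 = GH.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

mse = ((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2).mean()
print(mse)
```

The three tasks share the input window and the hidden layer, so training signals for each forecasting horizon shape a common representation; this sharing is the mechanism by which MTL can improve generalization over three separate STL networks.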
Transfer learning, one of the most important research directions in machine learning, has been studied in various fields in recent years. In this paper, we integrate the theory of multi-view learning into transfer learning and propose a new algorithm named Multi-View Transfer Learning with AdaBoost (MV-TL Adaboost). Unlike many previous works on transfer learning, we not only focus on using labeled data from one task to help learn another task, but also consider how to transfer across different views synchronously. We regard both the source and target tasks as collections of several constituent views, and each task can be learned from all views simultaneously. This kind of multi-view transfer learning is implemented with the AdaBoost algorithm. Furthermore, we analyze the effectiveness and feasibility of MV-TL Adaboost, and experimental results validate the effectiveness of the proposed approach.
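The AdaBoost machinery mentioned above reweights examples between boosting rounds. A generic sketch of that update (the standard binary AdaBoost rule, not the specific MV-TL Adaboost variant, whose multi-view details are the paper's):

```python
import numpy as np

def adaboost_reweight(weights, mistakes, eps):
    """Standard AdaBoost update: compute alpha from the weighted error eps,
    up-weight misclassified examples, down-weight the rest, renormalize."""
    alpha = 0.5 * np.log((1.0 - eps) / eps)
    w = weights * np.exp(np.where(mistakes, alpha, -alpha))
    return w / w.sum(), alpha

# 10 uniformly weighted examples; a weak learner misclassifies 3 of them.
w = np.full(10, 0.1)
mistakes = np.array([True, True, True] + [False] * 7)
eps = w[mistakes].sum()            # weighted error = 0.3
w_new, alpha = adaboost_reweight(w, mistakes, eps)
print(w_new, alpha)
```

A well-known property of this update is that the misclassified examples carry exactly half the total weight afterward, forcing the next weak learner to focus on them.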
We develop a new paradigm for the task of joint entity-relation extraction: it first identifies entity spans, then performs joint inference on entity types and relation types. To tackle the joint type inference task, we propose a novel graph convolutional network (GCN) running on an entity-relation bipartite graph. By introducing a binary relation classification task, we are able to exploit the structure of the entity-relation bipartite graph in a more efficient and interpretable way. Experiments on ACE05 show that our model outperforms existing joint models in entity performance and is competitive with the state of the art in relation performance.
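One layer of graph convolution on such a bipartite graph can be sketched as follows, using the common symmetrically normalized propagation rule; the graph, sizes, and features below are made up for illustration and are not the paper's architecture:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer with symmetric normalization:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Hypothetical bipartite graph: 3 entity-span nodes, 2 candidate-relation nodes.
# Each relation node connects only to the two spans it pairs.
n_ent, n_rel, d_in, d_out = 3, 2, 4, 8
A = np.zeros((n_ent + n_rel, n_ent + n_rel))
A[0, 3] = A[3, 0] = 1   # relation node 0 links spans 0 and 1
A[1, 3] = A[3, 1] = 1
A[1, 4] = A[4, 1] = 1   # relation node 1 links spans 1 and 2
A[2, 4] = A[4, 2] = 1

rng = np.random.default_rng(0)
H = rng.normal(size=(n_ent + n_rel, d_in))
W = rng.normal(size=(d_in, d_out))
H1 = gcn_layer(A, H, W)
print(H1.shape)
```

Because edges only run between entity nodes and relation nodes, each convolution passes type evidence from spans to the relations that pair them and back, which is what makes the joint inference over entity and relation types interact.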