This paper proposes a hybrid grasshopper optimization algorithm to overcome two drawbacks of the basic grasshopper optimization algorithm: its tendency to fall into local optima and its low accuracy. First, a reverse learning strategy is used to generate the initial population, which improves global search efficiency and solution quality. Second, a dynamic compression factor is introduced to replace the linear adaptation of the key parameter in the basic grasshopper optimization algorithm, which enhances the global search ability of the algorithm. Finally, the Metropolis acceptance criterion of the simulated annealing algorithm is adopted to accept worse solutions with a certain probability, so that the algorithm can jump out of local optima. Experiments show that the hybrid grasshopper optimization algorithm has stronger global search ability and better accuracy, and can effectively escape local optima.
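As a rough illustration of two ingredients named in the abstract above, the sketch below shows a Metropolis acceptance test for a minimization problem and one common form of reverse (opposition-based) learning, which reflects a point inside its search bounds. The function names, the cooling-free temperature argument, and the reflection formula `lower + upper - x` are assumptions made for illustration; the abstract does not specify the exact formulation the authors use.

```python
import math
import random


def metropolis_accept(current_cost, candidate_cost, temperature, rng=random):
    """Decide whether a candidate replaces the current solution (minimization).

    Better or equal candidates are always accepted. Worse candidates are
    accepted with probability exp(-(candidate_cost - current_cost) / temperature).
    """
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0:
        # Assumption: a non-positive temperature means "greedy", never accept worse.
        return False
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)


def opposite_point(x, lower, upper):
    """Reflect each coordinate inside its bounds: x' = lower + upper - x.

    This is one common opposition-based learning rule, assumed here as a
    stand-in for the reverse learning strategy mentioned in the abstract.
    """
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]
```

In an initialization step, one would typically evaluate both a random point and its opposite and keep the better of the two; during the search, `metropolis_accept` would gate whether a worse position is kept.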
Tactile sensing, and force sensing in particular, is important for both humans and robots. Although some successful tactile sensors have been designed recently, robotic tactile sensing is still underdeveloped, particularly with respect to sensing multi-modal tactile information and reducing cost, so there is a great demand for new tactile sensors. When a robot performs a complicated manipulation task, such as grasping fragile objects, sensing multi-axis force is important. In this paper, focusing on force sensing, we develop a stereo-vision-based optical multi-modal sensor. The sensor consists of a soft skin with embedded markers and two RGB cameras. When an external force is applied to the sensor, the deformation of the soft layer moves the markers. A marker tracking algorithm is developed to obtain the displacement of the markers, and a contact force field is estimated from the tracked markers in the soft skin. The sensor can be made with off-the-shelf materials. We build a prototype of the sensor and demonstrate its usefulness in force estimation.
Based on the geometric features of lunar terrain, this paper treats lunar terrain reconstruction as a surface reconstruction problem. We define an energy functional consisting of a local energy term and a smooth energy term for lunar terrain reconstruction. The surface that minimizes this functional, obtained by solving partial differential equations, is defined as the optimal surface. In the smooth energy term, we design a vector field of depth discontinuousness likelihood (VFDDL) to control the direction and degree of smoothing. Experiments indicate that an accurate VFDDL leads to an accurate reconstructed surface. Thus, VFDDL transforms 3D terrain reconstruction into a 2D image processing problem. We propose a method to estimate VFDDL using local and statistical image features. Experiments verify our method and show good performance in terrain reconstruction.
State-of-the-art multi-task multi-view (MTMV) learning tackles the scenario where multiple tasks are associated with each other via multiple shared feature views. However, in practical online scenarios where the learning tasks have heterogeneous features collected from multiple views, e.g., multiple sources, state-of-the-art single-view methods do not work well. To tackle this issue, we propose a Robust Lifelong Multi-task Multi-view Representation Learning (rLM2L) model that accumulates knowledge from online multi-view tasks. More specifically, we first design a set of view-specific libraries to maintain the intra-view correlation information of each view, and further impose an orthogonality-promoting term to make the libraries as independent as possible. When a new multi-view task arrives online, the rLM2L model decomposes all views of the new task into a common view-invariant space by transferring the knowledge of the corresponding library. In this view-invariant space, the underlying inter-view correlation is captured and the task-specific views of the new task are identified jointly via a robust multi-task learning formulation. The view-specific libraries can then be refined over time so that performance keeps improving across all tasks. For model optimization, the proximal alternating linearized minimization algorithm is adopted to optimize our nonconvex model in an alternating fashion and thereby achieve lifelong learning. Finally, extensive experiments on benchmark datasets show that the proposed rLM2L model outperforms existing lifelong learning models, while discovering task-specific views from sequential multi-view tasks with less computational burden.
Consider the lifelong machine learning paradigm, whose objective is to learn a sequence of tasks by drawing on previous experience, e.g., a knowledge library or deep network weights. However, the knowledge libraries or deep networks in most recent lifelong learning models have a prescribed size, which can degrade performance on both learned tasks and incoming ones when the model is faced with a new task environment (cluster). To address this challenge, we propose a novel incremental clustered lifelong learning framework with two knowledge libraries, a feature learning library and a model knowledge library, called Flexible Clustered Lifelong Learning (FCL3). Specifically, the feature learning library, modeled by an autoencoder architecture, maintains a set of representations common across all observed tasks, and the model knowledge library is self-selected by identifying and adding new representative models (clusters). When a new task arrives, the proposed FCL3 model first transfers knowledge from these libraries to encode the new task, i.e., it effectively and selectively soft-assigns the new task to multiple representative models over the feature learning library. Then, 1) a new task with a higher outlier probability is judged to be a new representative and is used to redefine both the feature learning library and the representative models over time; or 2) a new task with a lower outlier probability only refines the feature learning library. For model optimization, we cast this lifelong learning problem as an alternating direction minimization problem that is solved as each new task arrives. Finally, we evaluate the proposed framework on several multi-task datasets, and the experimental results demonstrate that our FCL3 model achieves better performance than most lifelong learning frameworks, and even than batch clustered multi-task learning models.
The 3D mesh is an important representation of geometric data. It is widely used in computer graphics and has recently attracted increasing attention in the computer vision community. However, geometric deficiencies (e.g., duplicate elements, degenerate faces, isolated vertices, self-intersections, and inner faces) are unavoidable when mesh data is generated. Such deficiencies may violate the topological structure of an object and hinder the use of 3D meshes. In this paper, we propose an end-to-end algorithm that eliminates geometric deficiencies in 3D meshes effectively and efficiently, in a specific and reasonable order. First, duplicate elements are eliminated by counting the number of occurrences of each vertex or face. Then, degenerate faces are removed according to the cross product of two edges. Next, since isolated vertices are not referenced by any face, they can be deleted directly. Afterward, self-intersecting faces are detected and remeshed using an AABB tree. Finally, we detect and remove an inner face according to whether multiple random rays shot from the face can reach infinity. Experiments on the ModelNet40 dataset show that our method eliminates the deficiencies of 3D meshes thoroughly.
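The first three steps described above (duplicate elements, degenerate faces, isolated vertices) can be sketched in a few lines for a triangle mesh. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `clean_mesh`, the exact-coordinate test for duplicate vertices, the choice to treat faces with the same three vertices as duplicates regardless of orientation, and the tolerance `eps` are all hypothetical. Self-intersection repair and inner-face removal are not covered.

```python
def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def clean_mesh(vertices, faces, eps=1e-12):
    """Remove duplicate elements, degenerate faces, and isolated vertices.

    vertices: list of (x, y, z) tuples; faces: list of (i, j, k) index triples.
    Returns (new_vertices, new_faces).
    """
    # Step 1a: merge vertices with identical coordinates (assumed exact match).
    index_of, unique_vertices, remap = {}, [], []
    for v in vertices:
        key = tuple(v)
        if key not in index_of:
            index_of[key] = len(unique_vertices)
            unique_vertices.append(key)
        remap.append(index_of[key])
    faces = [tuple(remap[i] for i in f) for f in faces]

    # Step 1b: drop repeated faces (same vertex triple, orientation ignored).
    seen, unique_faces = set(), []
    for f in faces:
        key = tuple(sorted(f))
        if key not in seen:
            seen.add(key)
            unique_faces.append(f)

    # Step 2: a face is degenerate when the cross product of two edges vanishes.
    kept_faces = []
    for a, b, c in unique_faces:
        pa, pb, pc = unique_vertices[a], unique_vertices[b], unique_vertices[c]
        e1 = tuple(q - p for p, q in zip(pa, pb))
        e2 = tuple(q - p for p, q in zip(pa, pc))
        n = cross(e1, e2)
        if n[0] ** 2 + n[1] ** 2 + n[2] ** 2 > eps:
            kept_faces.append((a, b, c))

    # Step 3: delete vertices that no remaining face references, then reindex.
    used = sorted({i for f in kept_faces for i in f})
    compact = {old: new for new, old in enumerate(used)}
    new_vertices = [unique_vertices[i] for i in used]
    new_faces = [tuple(compact[i] for i in f) for f in kept_faces]
    return new_vertices, new_faces
```

Running the steps in this order matters in the sketch: merging vertices first exposes duplicate faces, and removing degenerate faces first can leave additional vertices isolated.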
Automatic endoscope video analysis is an essential function of medical robots and computer-aided diagnosis systems. However, the performance of video analysis algorithms is often degraded by low-quality endoscope images captured in uncontrolled environments, some of which are difficult even for humans to analyze, for example because they are over-saturated by reflection, too dark, or obscured. In this paper, we formulate gastroscopy video quality evaluation as a supervised learning problem and detect non-informative frames in gastroscopy video sequences. To this end, HSV histograms, pyramids of histograms of oriented gradients, and uniform Local Binary Patterns are extracted to represent frames. A Random Forests classifier is then used to identify non-informative frames. Experimental results on our new gastroscopy video dataset of about 110,000 frames demonstrate that the accuracy of our method is about 95%, with a false positive rate lower than 1.3%.
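To make one of the frame descriptors above concrete, the following sketch computes a uniform Local Binary Pattern histogram for a grayscale image stored as a list of rows. It assumes the basic 8-neighbor, radius-1 variant with a single extra bin for all non-uniform codes; the abstract does not state which LBP configuration was used, and the function names are illustrative.

```python
# Offsets of the 8 neighbors, listed in circular order around the center pixel.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-bit LBP code: a bit is set when the neighbor is >= the center pixel."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def is_uniform(code, bits=8):
    """A code is uniform if its circular bit string has at most 2 transitions."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    transitions = bin(code ^ rotated).count("1")
    return transitions <= 2


def uniform_lbp_histogram(image):
    """Histogram over uniform codes plus one shared bin for non-uniform codes."""
    uniform_codes = [c for c in range(256) if is_uniform(c)]
    bin_of = {c: i for i, c in enumerate(uniform_codes)}
    hist = [0] * (len(uniform_codes) + 1)  # last bin collects non-uniform codes
    # Border pixels are skipped because they lack a full 3x3 neighborhood.
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[bin_of.get(lbp_code(image, r, c), len(uniform_codes))] += 1
    return hist
```

A histogram like this would normally be normalized and concatenated with the other descriptors before being passed to a classifier.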
In this paper, we propose a general model to address the overfitting problem in online similarity learning for big data, which is generally caused by two kinds of redundancy: 1) feature redundancy, i.e., there exist redundant (irrelevant) features in the training data; and 2) rank redundancy, i.e., the non-redundant (relevant) features lie in a low-rank space. To overcome these, our model is designed to obtain a simple and robust metric matrix by detecting the redundant rows and columns of the metric matrix and constraining the remaining matrix to a low-rank space. To reduce feature redundancy, we employ group sparsity regularization, i.e., the ℓ2,1 norm, to encourage a sparse feature set. To address rank redundancy, we adopt a low-rank regularizer, the max norm, instead of computing the SVD as traditional models based on the nuclear norm do. Our model can therefore not only generate a low-rank metric matrix to avoid overfitting, but also perform feature selection simultaneously. For model optimization, an online algorithm based on the stochastic proximal method is derived to solve this problem efficiently, with a complexity of O(d^2). To validate the effectiveness and efficiency of our algorithm, we apply our model to online scene categorization and to synthesized data, and conduct experiments on various benchmark datasets with comparisons to several state-of-the-art methods. Our model is as efficient as OASIS, the fastest online similarity learning model, while generally performing as well as the accurate model OMLLR. Moreover, our model can simultaneously exclude irrelevant or redundant feature dimensions.
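The group sparsity idea above can be illustrated with the standard proximal operator of the ℓ2,1 norm, which shrinks each row of a matrix toward zero and removes rows whose Euclidean norm falls below a threshold. This is a minimal sketch assuming row-wise groups and a plain list-of-lists matrix; the function name `prox_l21` and the `threshold` parameter (step size times regularization weight) are illustrative, and the abstract does not describe the authors' exact update.

```python
import math


def prox_l21(matrix, threshold):
    """Row-wise group soft-thresholding, the proximal operator of the l2,1 norm.

    Each row r is mapped to max(0, 1 - threshold / ||r||_2) * r, so rows with
    a small norm become exactly zero (the corresponding feature is dropped).
    """
    result = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        if norm <= threshold:
            # The whole group is zeroed out: this is what yields feature selection.
            result.append([0.0] * len(row))
        else:
            scale = 1.0 - threshold / norm
            result.append([scale * x for x in row])
    return result
```

For example, with a threshold of 1.0 a row of norm 5 is scaled by 0.8, while a row of norm 0.5 is set to zero. In a stochastic proximal scheme, a step like this would follow each gradient update of the metric matrix.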