Most existing lifelong machine learning works focus on how to exploit experience accumulated from earlier tasks (e.g., a knowledge library) and transfer it to learn a new task. However, when a lifelong learning system encounters a large pool of candidate tasks, the knowledge carried by the incoming tasks is imbalanced, and the system should intelligently choose which task to learn next. In this paper, an effective "human cognition" strategy is adopted: the importance of new tasks is actively ranked in an unknown-to-known process, and the most valuable task, i.e., the one carrying the most information, is preferentially selected for learning. Specifically, we cast the assessment of each incoming task's importance (e.g., unknown or not) as an outlier detection problem, and propose a "watchdog" knowledge library that reconstructs each task under an $\ell_0$-norm constraint. The candidate tasks are then sorted in descending order of their sparse reconstruction scores, which we refer to as the "watchdog" mechanism. Following this, we design a hierarchical knowledge library for the lifelong learning framework to encode the new task with a higher reconstruction score, where the library consists of two-level task descriptors, i.e., a high-dimensional one with a low-rank constraint and a low-dimensional one. Both the "watchdog" knowledge library and the hierarchical knowledge library are optimized automatically with knowledge from both previously learned tasks and the current task. For model optimization, we explore an alternating method that iteratively updates the proposed framework with guaranteed convergence. Experimental results on several existing benchmarks demonstrate that our proposed model outperforms various state-of-the-art task selection methods.
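The "watchdog" scoring idea above, i.e., ranking candidate tasks by how poorly a sparse combination of known library atoms reconstructs them, can be sketched with a generic greedy orthogonal matching pursuit (an $\ell_0$-constrained solver). This is a minimal illustrative sketch, not the paper's actual model; the dictionary, task vectors, and sparsity level are all made-up values.

```python
import numpy as np

def omp_residual(D, y, k):
    """Greedy orthogonal matching pursuit: approximate y with at most k
    columns (atoms) of D and return the residual norm (reconstruction score)."""
    r, idx = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ r)))   # most correlated atom
        if j not in idx:
            idx.append(j)
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        r = y - D[:, idx] @ coef              # residual after least-squares fit
    return float(np.linalg.norm(r))

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 10))
D /= np.linalg.norm(D, axis=0)                # normalized "library" atoms
known = D[:, 2] + 0.5 * D[:, 7]               # well explained by the library
novel = rng.standard_normal(20)
novel /= np.linalg.norm(novel)                # poorly explained ("unknown")
scores = {"known": omp_residual(D, known, 3),
          "novel": omp_residual(D, novel, 3)}
# Sort candidates by descending score: the most novel/informative task first.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A task that lies in the span of a few library atoms gets a near-zero score, while an outlier task keeps a large residual and is ranked first for learning.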
Accurate lesion segmentation in endoscopy images is a fundamental task for the automated diagnosis of gastrointestinal tract (GI tract) diseases. Previous studies usually use hand-crafted features to represent endoscopy images, treating feature definition and lesion segmentation as two standalone tasks. Due to the possible heterogeneity between the features and the segmentation models, these methods often yield sub-optimal performance. Several fully convolutional networks have recently been developed to jointly perform feature learning and model training for GI tract disease diagnosis. However, they generally ignore local spatial details of endoscopy images, as down-sampling operations (e.g., pooling and convolutional striding) may cause irreversible loss of spatial information. To this end, we propose a multi-scale context-guided deep network (MCNet) for end-to-end lesion segmentation of GI tract endoscopy images, where both global and local contexts are captured as guidance for model training. Specifically, a global subnetwork is designed to extract the global structure and high-level semantic context of each input image. We then design two cascaded local subnetworks on top of the global subnetwork's output feature maps, aiming to capture both local appearance information and relatively high-level semantic information in a multi-scale manner. The feature maps learned by the three subnetworks are then fused for the subsequent lesion segmentation task. We evaluated the proposed MCNet on 1,310 endoscopy images from the public EndoVis-Ab and CVC-ClinicDB datasets for abnormality segmentation and polyp segmentation, respectively. Experimental results demonstrate that MCNet achieves [Formula: see text] and [Formula: see text] mean intersection over union (mIoU) on the two datasets, respectively, outperforming several state-of-the-art approaches to automated lesion segmentation of GI tract endoscopy images.
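The multi-scale fusion step described above (combining coarse global feature maps with finer local ones) can be illustrated with a toy NumPy sketch: upsample the coarse map to the local map's resolution and concatenate along channels. This is a generic stand-in, not MCNet's actual architecture; all shapes are illustrative.

```python
import numpy as np

def fuse(global_map, local_map):
    """Nearest-neighbor upsample a coarse (global) feature map to the local
    map's spatial size, then concatenate along the channel axis."""
    _, h, w = local_map.shape
    gh, gw = global_map.shape[1:]
    rows = np.repeat(np.arange(gh), h // gh)    # replicate rows
    cols = np.repeat(np.arange(gw), w // gw)    # replicate columns
    up = global_map[:, rows][:, :, cols]
    return np.concatenate([up, local_map], axis=0)

# 4-channel 8x8 "global" map fused with a 2-channel 16x16 "local" map
fused = fuse(np.ones((4, 8, 8)), np.zeros((2, 16, 16)))
```

The fused tensor keeps both the global semantic context (upsampled channels) and the local spatial detail, which a segmentation head can then consume.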
Underwater camera calibration has attracted much attention due to its significance for high-precision three-dimensional (3-D) pose estimation and scene reconstruction. However, most existing calibration methods focus on calibrating the underwater camera in a single scenario (e.g., air-glass-water), which cannot well formulate the geometry constraint and further complicates the calibration process. Moreover, the calibration precision of these methods is low, since multilayer transparent refractions with unknown layer orientation and distance make the task more difficult than calibration in air. To address these challenges, we develop a novel and efficient medium-driven method for underwater camera calibration (MedUCC), which can accurately calibrate the underwater camera parameters, including the orientation and position of the transparent glass. The key idea of this article is to leverage the light-path changes formed by refractions between different media to acquire calibration data, which better formulates the geometry constraint and yields an initial estimate of the underwater camera parameters. To improve the calibration accuracy of the underwater camera system, a quaternion-based solution is developed to refine the underwater camera parameters. Finally, we evaluate the calibration performance on an underwater camera system. Extensive experimental results demonstrate that our proposed method outperforms existing works. We also validate the proposed MedUCC method on our 3-D scanner prototype, which illustrates the superiority of the proposed calibration method.
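The light-path changes the method exploits come from Snell's law at each flat interface of the air-glass-water stack. The following sketch traces a ray through two refractions; it illustrates only the standard vector refraction formula, not MedUCC itself, and the refractive indices and angle are illustrative values.

```python
import numpy as np

def refract(d, n, n1, n2):
    """Refract unit direction d at a plane with unit normal n, going from
    index n1 to n2 (vector form of Snell's law)."""
    d, n = d / np.linalg.norm(d), n / np.linalg.norm(n)
    cos_i = -np.dot(n, d)
    sin2_t = (n1 / n2) ** 2 * (1.0 - cos_i ** 2)
    if sin2_t > 1.0:
        return None                      # total internal reflection
    return (n1 / n2) * d + (n1 / n2 * cos_i - np.sqrt(1.0 - sin2_t)) * n

# Ray hitting a flat glass port (normal +z) at 30 degrees, air->glass->water
normal = np.array([0.0, 0.0, 1.0])
ray = np.array([np.sin(np.radians(30)), 0.0, -np.cos(np.radians(30))])
in_glass = refract(ray, normal, 1.000, 1.517)   # air -> glass
in_water = refract(in_glass, normal, 1.517, 1.333)  # glass -> water
```

For parallel interfaces the glass index drops out of the final direction (only a lateral offset remains), which is why unknown glass orientation and thickness, rather than the refraction itself, make the calibration hard.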
As a typical railway maintenance apparatus, the rail inspection trolley is usually equipped with an inertial measurement unit (IMU) and several laser scanners (LSs) to measure the track geometry. To meet on-site calibration demands, a compact apparatus with a specific 3D calibration target is devised, and the measurements of each LS can accordingly be transformed into the reference coordinate frame of the IMU via our calibration apparatus. Without any auxiliary motion device, the pose of the laser plane can be calculated based on the cross-ratio invariance of perspective projection after image detection of the concentric circle centers. An additional contribution is improved laser-plane localization accuracy, achieved by a novel concentric-circle center detection and light-stripe center extraction algorithm under a dual-view perspective. Experiments verify that our method offers higher accuracy and greater convenience than the ball-target method in the on-site operating environment.
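The cross-ratio invariance underpinning the laser-plane computation is the classical fact that for four collinear points, the ratio of ratios of distances is preserved by any perspective projection. A minimal numeric check (a generic illustration, not the paper's algorithm; the points and the 1-D homography are arbitrary):

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four collinear points given as scalar
    coordinates along the line; invariant under projective transformations."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

# Four collinear points and an arbitrary 1-D homography x -> (2x + 1)/(x + 3)
pts = np.array([0.0, 1.0, 2.0, 4.0])
h = (2 * pts + 1) / (pts + 3)
before = cross_ratio(*pts)
after = cross_ratio(*h)
```

Because the cross ratio survives projection, known ratios on the calibration target let one recover metric positions of laser-stripe points from a single image, without moving the target.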
This work presents a novel visual-tactile fused clustering framework, called Lifelong Visual-Tactile Spectral Clustering (LVTSC), to effectively learn consecutive object clustering tasks for robotic perception. Lifelong learning has become an important topic in recent machine learning research, aiming to imitate "human learning" and reduce the computational cost of consecutively learning new tasks. Our proposed LVTSC model explores knowledge transfer and representation correlation from a local modality-invariant perspective under the guidance of a modality-consistent constraint. For the modality-invariant part, we design a set of modality-invariant basis libraries to capture the latent clustering centers of each modality, and a set of modality-invariant feature libraries to embed the manifold information of each modality. The modality-consistent constraint reinforces the correlation between the visual and tactile modalities by maximizing the feature manifold correspondences. As object clustering tasks arrive continuously, the overall objective is optimized by an effective alternating direction method with guaranteed convergence. The proposed LVTSC framework has been extensively validated for its effectiveness and efficiency on three challenging real-world robotic object perception datasets.
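The spectral-clustering core of such a framework can be sketched in a few lines: fuse per-modality affinity graphs, build the graph Laplacian, and partition objects with the Fiedler vector. This toy stand-in ignores the lifelong libraries and uses a naive average as the "modality-consistent" fusion; all matrices are synthetic.

```python
import numpy as np

def fused_spectral_labels(A_vis, A_tac):
    """Two-cluster spectral partition of a fused visual + tactile affinity
    matrix (simple average as a toy modality-consistency step)."""
    A = 0.5 * (A_vis + A_tac)
    d = A.sum(axis=1)
    L = np.diag(d) - A                   # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                 # second-smallest eigenvector
    return (fiedler > np.median(fiedler)).astype(int)

rng = np.random.default_rng(1)

def block_affinity():
    """Noisy affinity with two clear 3-object clusters."""
    A = np.full((6, 6), 0.05) + 0.05 * rng.random((6, 6))
    A[:3, :3] = 1.0
    A[3:, 3:] = 1.0
    return (A + A.T) / 2

labels = fused_spectral_labels(block_affinity(), block_affinity())
```

When both modalities agree on the cluster structure, the fused graph sharpens it; the modality-consistent constraint in the paper plays an analogous role at the feature-manifold level.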
Multimode interference (MMI) couplers based on silicon slot-waveguide structures have received widespread attention in recent years. The key issues to be addressed are the size and loss of such devices. This study introduces a 1 × 3 silicon-based slot-waveguide multimode interference power splitter. The device uses a gallium nitride slot-waveguide structure to reduce the length of the coupling region and decrease excess loss. To reduce the width of the coupling region, the multimode interference coupling area is designed with a parabolic structure. Introducing a tapered structure between the input/output waveguides and the coupling region further reduces excess loss and non-uniformity. Furthermore, we analyze the fabrication tolerances of the coupling region. The device is designed for the 1550 nm wavelength range and is simulated and optimized with an eigenmode expansion (EME) solver in MODE Solutions. The simulation results show that the total length of the coupling region is only 4 μm. The normalized transmission of the device is 0.992, and its excess loss and imbalance are 0.036 dB and 0.003 dB, respectively. The proposed power splitter can be applied to integrated optical circuit design, optical sensing, and optical power measurement.
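The coupling-region length of an MMI splitter is usually first estimated from the standard self-imaging relations, with the beat length $L_\pi \approx 4 n_{\mathrm{eff}} W_e^2 / (3\lambda_0)$ and, for symmetric (center-fed) excitation, $N$-fold images at $L = 3L_\pi/(4N)$. The sketch below applies these textbook formulas with purely illustrative values (effective index, effective width); it is not the paper's optimized design, which further shortens the region with the slot and parabolic geometry.

```python
# Self-imaging estimate for a 1xN MMI coupler (symmetric excitation).
# All parameter values below are illustrative, not the paper's design.
n_eff = 2.8        # effective index of the multimode region (assumed)
W_e = 2.0e-6       # effective width of the multimode region, m (assumed)
lam = 1.55e-6      # free-space wavelength, m
N = 3              # number of output images (1x3 splitter)

L_pi = 4 * n_eff * W_e ** 2 / (3 * lam)   # beat length of two lowest modes
L_mmi = 3 * L_pi / (4 * N)                # position of the first N-fold image
```

Even this crude estimate shows why narrowing the effective width (here via the parabolic coupling area) shrinks the device: the length scales with $W_e^2$.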
3D object classification has attracted considerable attention in academic research and industrial applications. However, most existing methods need to access the training data of past 3D object classes when facing a common real-world scenario: new classes of 3D objects arrive in a sequence. Moreover, the performance of advanced approaches degrades dramatically on previously learned classes (i.e., catastrophic forgetting), due to the irregular and redundant geometric structures of 3D point cloud data. To address these challenges, we propose a new Incremental 3D Object Learning (I3DOL) model, which is the first exploration of continually learning new classes of 3D objects. Specifically, an adaptive-geometric centroid module is designed to construct discriminative local geometric structures, which better characterize the irregular point cloud representation of 3D objects. Afterwards, to prevent the catastrophic forgetting caused by redundant geometric information, a geometric-aware attention mechanism is developed to quantify the contributions of local geometric structures and to exploit the unique 3D geometric characteristics with high contributions for class-incremental learning. Meanwhile, a score fairness compensation strategy is proposed to further alleviate the catastrophic forgetting caused by unbalanced data between past and new classes of 3D objects, by compensating the biased predictions for new classes in the validation phase. Experiments on representative 3D datasets validate the superiority of our I3DOL framework.
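The attention mechanism's job, weighting local geometric structures by their contribution before aggregation, can be illustrated with a generic softmax-attention pooling sketch. This is a toy stand-in for the idea, not I3DOL's learned module; the feature dimensions and the scoring vector are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_pool(feats, score_vec):
    """Score each local geometric feature, softmax the scores into weights
    summing to one, and return the weighted sum (attention pooling)."""
    scores = feats @ score_vec
    scores -= scores.max()               # numerical stability for exp
    w = np.exp(scores) / np.exp(scores).sum()
    return w, w @ feats

feats = rng.standard_normal((8, 4))      # 8 local structures, 4-dim features
w, pooled = attention_pool(feats, rng.standard_normal(4))
```

Structures with high scores dominate the pooled representation, so redundant local geometry receives small weights and contributes little to the class decision.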
Out-of-field tumor response, also called the abscopal effect, bystander effect, or non-target effect, can be regarded as a localized-irradiation-induced systemic antitumorigenic effect, indicated by shrinkage of a tumor distant from the irradiated site. Although the abscopal effect has been documented in several tumor types, it is a very rare phenomenon and is seldom clinically reported in non-small-cell lung carcinoma (NSCLC). Herein, we present a rare case of a patient with NSCLC with two lesions in the upper lobe of the left lung who, after receiving stereotactic ablative radiation therapy (SABR) to one of the tumors, had an apparent spontaneous regression of the other mass in the lung, suggestive of a radiation-induced abscopal effect.