In large-scale disaster events, planning optimal rescue routes depends on the ability to detect objects at the disaster scene, where dense and occluded objects are among the main challenges. Existing methods, which typically rely on the RGB modality, struggle to distinguish targets with similar colors and textures in crowded environments and cannot identify obscured objects. To this end, we first construct two multimodal datasets for dense and occluded vehicle detection in large-scale events, utilizing RGB and height-map modalities. Based on these datasets, we propose a multimodal collaboration network for dense and occluded vehicle detection, MuDet for short. MuDet hierarchically enhances the completeness of discriminable information within and across modalities and differentiates between simple and complex samples. It comprises three main modules: Unimodal Feature Hierarchical Enhancement (Uni-Enh), Multimodal Cross Learning (Mul-Lea), and the Hard-easy Discriminative (He-Dis) pattern. Uni-Enh and Mul-Lea enhance the features within each modality and facilitate the cross-integration of features from the two heterogeneous modalities. He-Dis separates densely occluded vehicle targets with significant intra-class differences and minimal inter-class differences by defining and thresholding confidence values, thereby suppressing the complex background. Experimental results on two re-labeled multimodal benchmark datasets, 4K-SAI-LCS and ISPRS Potsdam, demonstrate the robustness and generalization of MuDet.
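As a rough illustration of the hard-easy separation idea, the sketch below splits candidate detections by thresholding their confidence values; the threshold values and the torch-based interface are assumptions for illustration, not the exact He-Dis formulation.

```python
import torch

def split_hard_easy(confidences, easy_thr=0.7, hard_thr=0.3):
    """Split detections into easy, hard, and background groups by
    thresholding confidence values (thresholds are illustrative)."""
    easy_mask = confidences >= easy_thr                  # confidently detected vehicles
    hard_mask = (confidences >= hard_thr) & ~easy_mask   # ambiguous, densely occluded cases
    bg_mask = confidences < hard_thr                     # likely complex background, suppressed
    return easy_mask, hard_mask, bg_mask

# Example: confidences for five candidate detections from fused RGB + height-map features
scores = torch.tensor([0.92, 0.55, 0.12, 0.78, 0.34])
easy, hard, bg = split_hard_easy(scores)
```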
With the gradual opening of airspace, tracking noncooperative low-altitude, slow-speed, small-size (LSS) targets is important for maintaining security. It remains a challenging problem, especially under complex scenarios and real-time constraints. In this letter, an efficient tracking-by-relocalization (TRL) framework is proposed for small flying object tracking, aiming to alleviate the loss of moving targets against complex backgrounds. The designed relocalization module consists of a feature-aggregation module and a global search module. On the one hand, the feature-aggregation module is integrated into the framework to strengthen the ability to locate small targets. On the other hand, the global search module is developed to recover missed targets and thus improve long-term small object tracking. Notably, the basic tracking module cooperates with the designed relocalization module to achieve small-target tracking. Evaluations on two small flying-target datasets and comparisons with several state-of-the-art approaches demonstrate the effectiveness of the proposed framework.
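The relocalization loop can be pictured as follows; this is a minimal sketch assuming a base tracker that reports a confidence score and a global search routine over the full frame, with the interface names and the 0.4 threshold chosen only for illustration, not taken from TRL.

```python
def track_with_relocalization(frames, tracker, global_search, conf_thr=0.4):
    """Run a base tracker frame by frame and fall back to a global search
    when the target is likely lost (low confidence). The tracker interface
    (update/reset returning a box and a confidence) is assumed, not TRL's."""
    results = []
    for frame in frames:
        box, conf = tracker.update(frame)        # local tracking step
        if conf < conf_thr:                      # target likely lost in clutter
            box = global_search(frame)           # search the whole frame again
            if box is not None:
                tracker.reset(frame, box)        # re-initialize the base tracker
        results.append(box)
    return results
```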
Recently, many deep learning (DL) methods have been proposed for infrared small target detection (ISTD). A DL-based model for the ISTD task requires large amounts of training samples. However, the diversity of existing ISTD datasets is insufficient to train a DL model with good generalization. To solve this issue, a data augmentation method called Prior-Guided Data Augmentation (PGDA) is proposed to expand the diversity of training samples indirectly, without additional training data. Specifically, it decouples the target description and localization abilities by preserving the scale distribution and physical characteristics of targets. Furthermore, a multi-scene infrared small target dataset (MSISTD) consisting of 1077 images with 1343 instances is constructed. The numbers of images and instances in MSISTD are 2.4 and 2.5 times those of the largest existing real ISTD dataset, the Single-frame Infrared Small Target (SIRST) benchmark, respectively. Extensive experiments on the SIRST dataset and the constructed MSISTD dataset show that the proposed PGDA improves the performance of existing DL-based ISTD methods without extra model complexity. Compared with SIRST, MSISTD serves as a more comprehensive and accurate benchmark for ISTD tasks.
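As a rough sketch of prior-guided augmentation in this spirit, the example below pastes a real target chip onto a new background while rejecting sizes outside an assumed pixel-scale prior; the function name, blending rule, and scale range are illustrative assumptions, not the exact PGDA procedure.

```python
import numpy as np

def prior_guided_paste(background, target_patch, scale_prior=(2, 9), rng=None):
    """Paste a small infrared target chip onto a new background, keeping its
    original size so the scale distribution of real targets is preserved.
    Targets outside the prior scale range (in pixels) are rejected."""
    rng = rng or np.random.default_rng()
    h, w = target_patch.shape
    if not (scale_prior[0] <= max(h, w) <= scale_prior[1]):
        return background, None          # violates the scale prior, skip
    y = int(rng.integers(0, background.shape[0] - h))
    x = int(rng.integers(0, background.shape[1] - w))
    aug = background.copy()
    # Max-blend so the pasted target stays brighter than the local background
    aug[y:y + h, x:x + w] = np.maximum(aug[y:y + h, x:x + w], target_patch)
    return aug, (x, y, w, h)             # augmented image and the new box label
```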
Recently, many arbitrary-oriented object detection (AOOD) methods have been proposed and applied to remote sensing and other fields. For aerial platforms, lightweight structures and multimodal adaptation of convolutional neural network (CNN) models are urgently needed. Due to their limited model size, existing lightweight AOOD methods perform poorly, especially on multimodal tasks. In this paper, a multimodal knowledge distillation (MKD) method is proposed for AOOD in aerial images. In MKD, a multimodal dynamic label assignment strategy is designed to dynamically select the optimal positive samples and adapt to different modalities and environments. Separate multimodal localization and feature distillation modules are designed so that multimodal knowledge is complementary and can be effectively learned by the lightweight model. Experiments on a public dataset demonstrate the effectiveness and superiority of MKD.
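A generic feature distillation term of the kind such a framework might use is sketched below; the 1x1 adapter, MSE penalty, and channel sizes are assumptions, not MKD's concrete losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDistillLoss(nn.Module):
    """Align lightweight-student features with high-capacity-teacher features
    via a 1x1 adapter and an MSE penalty (a generic distillation term)."""
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.adapter = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat, weight=1.0):
        aligned = self.adapter(student_feat)
        return weight * F.mse_loss(aligned, teacher_feat.detach())

# Example with random features standing in for RGB/infrared branch outputs
loss_fn = FeatureDistillLoss(student_channels=64, teacher_channels=256)
s = torch.randn(2, 64, 32, 32)    # student feature map
t = torch.randn(2, 256, 32, 32)   # teacher feature map (same spatial size assumed)
loss = loss_fn(s, t)
```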
Objective: To explore a performance standard for hemolytic toxins in harmful bloom algae. Methods: Using Chattonella marina as the hemolytic-substance-producing organism, the methods and conditions for extracting hemolytic toxins and determining their activity were compared and optimized, including cell disruption, distillation temperature, blood origin, and storage of algal pellets. Results: The hemolytic activity of C. marina cells disrupted by ultrasonication was 288.23 HU/L, higher than that obtained by the freeze-thaw method (94.89 HU/L), suggesting that ultrasonication is the better way to break microalgal cells. When the ultrasonic treatment lasted 5, 10, 20, and 30 min, the hemolytic activities were 80.57, 157.45, 288.23, and 279.17 HU/L, respectively, indicating that 20 min of treatment was suitable. When the distillation temperatures were 40, 60, and 80 °C, the hemolytic activities were 288.23, 124.97, and 120.68 HU/L, respectively, meaning that a high distillation temperature during extraction of hemolytic substances lowered the hemolytic activity of the samples. Blood from different animals (human, fish, rat, and rabbit) exhibited different sensitivities to the hemolytic toxins, with rabbit erythrocytes being the most sensitive. The hemolytic activities toward human, fish, rat, and rabbit blood were 244.98, 288.23, 266.35, and 195.47 HU/L, respectively. Storage of algal pellets at 0 °C for 3 days caused no significant loss of hemolytic activity, whereas significant losses were observed after only one day at 20 °C or -20 °C. Conclusion: Ultrasonication is preferable to the freeze-thaw method for cell disruption. The optimal conditions for disrupting algal cells by ultrasonication were 200 W for 20 min at 4 °C. The distillation temperature during extraction of hemolytic substances should be kept below 40 °C. Rabbit erythrocytes are the most suitable blood cells for detecting hemolytic activity owing to their high sensitivity. Algal pellets can be stored at 0 °C for up to 3 days before activity determination.
Arbitrary-oriented object detection (AOOD) has been widely applied to locate and classify objects with diverse orientations in remote sensing images. However, inconsistent features for the localization and classification tasks in AOOD models may lead to ambiguity and low-quality object predictions, which constrains detection performance. In this article, an AOOD method called task-wise sampling convolutions (TS-Conv) is proposed. TS-Conv adaptively samples task-wise features from their respective sensitive regions and maps these features together in alignment to guide a dynamic label assignment for better predictions. Specifically, the sampling positions of the localization convolution in TS-Conv are supervised by the oriented bounding box (OBB) prediction associated with spatial coordinates, while the sampling positions and convolutional kernels of the classification convolution are adaptively adjusted according to different orientations to improve the orientation robustness of the features. Furthermore, a dynamic task-consistent-aware label assignment (DTLA) strategy is developed to select optimal candidate positions and assign labels dynamically according to ranked task-aware scores obtained from TS-Conv. Extensive experiments on several public datasets covering multiple scenes, multimodal images, and multiple object categories demonstrate the effectiveness, scalability, and superior performance of the proposed TS-Conv.
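A simplified, ranked task-aware assignment in the spirit of DTLA might look like the sketch below; the combined score, the weighting alpha, and the top-k rule are assumptions rather than the paper's exact definition.

```python
import torch

def dynamic_label_assignment(cls_scores, loc_scores, topk=9, alpha=0.5):
    """Rank candidate positions by a combined task-aware score and mark
    the top-k as positive samples (a simplified DTLA-style rule)."""
    # Combined score balances classification confidence and localization quality
    task_scores = alpha * cls_scores + (1 - alpha) * loc_scores
    topk_idx = torch.topk(task_scores, k=min(topk, task_scores.numel())).indices
    labels = torch.zeros_like(task_scores, dtype=torch.bool)
    labels[topk_idx] = True        # positives; the rest stay negative
    return labels

cls = torch.rand(100)   # e.g., per-position classification scores for one object
loc = torch.rand(100)   # e.g., per-position IoU of the predicted OBB
pos_mask = dynamic_label_assignment(cls, loc)
```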
Recently, some lightweight convolutional neural network (CNN) models have been proposed for airborne or spaceborne remote sensing object detection (RSOD) tasks. However, these lightweight detectors suffer from performance degradation due to the compromises made for the limited computing resources of embedded devices. To narrow this performance gap, a direction-adaptive knowledge extraction and distillation (DKED) method is proposed. Specifically, a dynamic directional convolution (DDC) is developed to extract typical arbitrary-oriented features, and a direction-adaptive knowledge distillation (DKD) strategy is designed to guide the lightweight model in learning the intrinsic knowledge of the RSOD task from the high-performance model. Experiments on public datasets demonstrate that the proposed method effectively improves the performance of the lightweight RSOD model without additional inference costs.
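As a rough sketch of how a high-performance teacher can guide the lightweight student at training time only, the example below matches softened class logits with a KL term; the temperature and this generic formulation are assumptions and omit the direction-adaptive weighting of DKD.

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soften teacher and student class predictions and match them with KL
    divergence; inference uses only the student, so no extra runtime cost."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)

student = torch.randn(8, 15)   # lightweight model class logits (15 categories assumed)
teacher = torch.randn(8, 15)   # high-performance model class logits
loss = logit_distillation_loss(student, teacher)
```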
Arbitrary-oriented object detection (AOOD) plays a significant role in image understanding for remote sensing scenarios. Existing AOOD methods face the challenges of ambiguity and high costs in angle representation. To this end, a multi-grained angle representation (MGAR) method, consisting of coarse-grained angle classification (CAC) and fine-grained angle regression (FAR), is proposed. Specifically, the designed CAC avoids the ambiguity of angle prediction through discrete angular encoding (DAE) and reduces complexity by coarsening the granularity of DAE. Based on CAC, FAR is developed to refine the angle prediction at a much lower cost than narrowing the granularity of DAE. Furthermore, an Intersection over Union (IoU)-aware FAR-Loss (IFL) is designed to improve the accuracy of angle prediction using an adaptive re-weighting mechanism guided by IoU. Extensive experiments on several public remote sensing datasets demonstrate the effectiveness of the proposed MGAR. Moreover, experiments on embedded devices show that MGAR is also friendly to lightweight deployment.
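The coarse-to-fine decoding idea behind CAC and FAR can be pictured as a bin index plus a within-bin offset; the bin count, angle range, and decoding rule below are illustrative assumptions, not the paper's exact DAE.

```python
def decode_angle(coarse_bin, fine_offset, num_bins=12, angle_range=180.0):
    """Combine a coarse angle class (which bin) with a fine regression
    offset (fraction of the bin width) into a continuous angle in degrees."""
    bin_width = angle_range / num_bins
    angle = coarse_bin * bin_width + fine_offset * bin_width
    return angle % angle_range

# Example: bin 4 of 12 (each 15 degrees wide) plus a 0.3-bin offset -> 64.5 degrees
print(decode_angle(coarse_bin=4, fine_offset=0.3))
```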