Face recognition (FR) systems powered by deep learning have become widely used in various applications. However, they are vulnerable to adversarial attacks, especially those based on local adversarial patches that can be physically applied to real-world objects. In this paper, we propose RADAP, a robust and adaptive defense mechanism against diverse adversarial patches in both closed-set and open-set FR systems. RADAP employs innovative techniques, such as FCutout and F-patch, which use Fourier space sampling masks to improve the occlusion robustness of the FR model and the performance of the patch segmenter. Moreover, we introduce an edge-aware binary cross-entropy (EBCE) loss function to enhance the accuracy of patch detection. We also present the split and fill (SAF) strategy, which is designed to counter the vulnerability of the patch segmenter to complete white-box adaptive attacks. We conduct comprehensive experiments to validate the effectiveness of RADAP, which shows significant improvements in defense performance against various adversarial patches, while maintaining clean accuracy higher than that of the undefended Vanilla model.
Single object tracking (SOT) is currently one of the most important tasks in computer vision. With the development of the deep network and the release for a series of large scale datasets for single object tracking, siamese networks have been proposed and perform better than most of the traditional methods. However, recent siamese networks get deeper and slower to obtain better performance. Most of these methods could only meet the needs of real-time object tracking in ideal environments. In order to achieve a better balance between efficiency and accuracy, we propose a simpler siamese network for single object tracking, which runs fast in poor hardware configurations while remaining an excellent accuracy. We use a more efficient regression method to compute the location of the tracked object in a shorter time without losing much precision. For improving the accuracy and speeding up the training progress, we introduce the Squeeze-and-excitation (SE) network into the feature extractor. In this paper, we compare the proposed method with some state-of-the-art trackers and analysis their performances. Using our method, a siamese network could be trained with shorter time and less data. The fast processing speed enables combining object tracking with object detection or other tasks in real time.
While deep neural networks have demonstrated remarkable performance across various tasks, they typically require massive training data. Due to the presence of redundancies and biases in real-world datasets, not all data in the training dataset contributes to the model performance. To address this issue, dataset pruning techniques have been introduced to enhance model performance and efficiency by eliminating redundant training samples and reducing computational and memory overhead. However, previous works most rely on manually crafted scalar scores, limiting their practical performance and scalability across diverse deep networks and datasets. In this paper, we propose AdaPruner, an end-to-end Adaptive DAtaset PRUNing framEwoRk. AdaPruner can perform effective dataset pruning without the need for explicitly defined metrics. Our framework jointly prunes training data and fine-tunes models with task-specific optimization objectives. AdaPruner leverages (1) An adaptive dataset pruning (ADP) module, which iteratively prunes redundant samples to an expected pruning ratio; and (2) A pruning performance controller (PPC) module, which optimizes the model performance for accurate pruning. Therefore, AdaPruner exhibits high scalability and compatibility across various datasets and deep networks, yielding improved dataset distribution and enhanced model performance. AdaPruner can still significantly enhance model performance even after pruning up to 10-30\% of the training data. Notably, these improvements are accompanied by substantial savings in memory and computation costs. Qualitative and quantitative experiments suggest that AdaPruner outperforms other state-of-the-art dataset pruning methods by a large margin.
Context-aware recommender systems (CARS), which consider rich side information to improve recommendation performance, have caught more and more attention in both academia and industry. How to predict user preferences from diverse contextual features is the core of CARS. Several recent models pay attention to user behaviors and use specifically designed structures to extract adaptive user interests from history behaviors. However, few works take item history interactions into consideration, which leads to the insufficiency of item feature representation and item attraction extraction. From these observations, we model the user-item interaction as a dynamic interaction graph (DIG) and proposed a GNN-based model called Pairwise Interactive Graph Attention Network (PIGAT) to capture dynamic user interests and item attractions simultaneously. PIGAT introduces the attention mechanism to consider the importance of each interacted user/item to both the user and the item, which captures user interests, item attractions and their influence on the recommendation context. Moreover, confidence embeddings are applied to interactions to distinguish the confidence of interactions occurring at different times. Then more expressive user/item representations and adaptive interaction features are generated, which benefits the recommendation performance especially when involving long-tail items. We conduct experiments on three real-world datasets to demonstrate the effectiveness of PIGAT.
Artificial neural networks (ANNs) have been proved to be successfully used in a variety of pattern recognition and data mining applications. However, training ANNs on large scale datasets are both data-intensive and computation-intensive. Therefore, large scale ANNs are used with reservation for their time-consuming training to get high precision. In this paper, we present cNeural, a customized parallel computing platform to accelerate training large scale neural networks with the backpropagation algorithm. Unlike many existing parallel neural network training systems working on thousands of training samples, cNeural is designed for fast training large scale datasets with millions of training samples. To achieve this goal, firstly, cNeural adopts HBase for large scale training dataset storage and parallel loading. Secondly, it provides a parallel in-memory computing framework for fast iterative training. Third, we choose a compact, event-driven messaging communication model instead of the heartbeat polling model for instant messaging delivery. Experimental results show that the overhead time cost by data loading and messaging communication is very low in cNeural and cNeural is around 50 times faster than the solution based on Hadoop MapReduce. It also achieves nearly linear scalability and excellent load balancing.