Linear approximate segmentation and compression of moving-target spatio-temporal trajectories can reduce data-storage pressure and improve the efficiency of mining target motion patterns. High-quality segmentation and compression require accurately selecting and storing as few points as possible that still reflect the characteristics of the original trajectory, whereas existing methods leave room for improvement in segmentation accuracy, compression ratio, and simplicity of parameter setting. This paper proposes a trajectory segmentation and compression algorithm based on particle swarm optimization. First, the trajectory segmentation problem is recast as a global intelligent optimization over candidate feature points, making the selection of segmentation points more accurate; then, a particle-update strategy combining neighborhood adjustment and random jumps is established to improve the efficiency of segmentation and compression. Experiments on a real data set and a simulated maneuvering-target trajectory set show that, compared with typical existing methods, the proposed method has advantages in segmentation accuracy and compression ratio.
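The idea of treating segmentation-point selection as a swarm search can be sketched roughly as follows. This is an illustrative toy, not the paper's algorithm: the fitness weights, the neighborhood-adjustment probabilities, and the single-bit random jump are all assumptions made for the sketch.

```python
import math
import random

def seg_error(traj, idx):
    """Max perpendicular distance from the original points to the
    piecewise-linear approximation defined by the kept indices."""
    err = 0.0
    for a, b in zip(idx, idx[1:]):
        (x1, y1), (x2, y2) = traj[a], traj[b]
        for k in range(a + 1, b):
            x0, y0 = traj[k]
            num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
            den = math.hypot(y2 - y1, x2 - x1) or 1e-12
            err = max(err, num / den)
    return err

def pso_segment(traj, n_particles=10, iters=30, w_err=1.0, w_cnt=0.05):
    """Toy binary PSO: each particle is a set of kept interior points;
    fitness trades off approximation error against point count."""
    n = len(traj)
    def fitness(keep):
        idx = [0] + sorted(keep) + [n - 1]
        return w_err * seg_error(traj, idx) + w_cnt * len(keep)
    swarm = [set(random.sample(range(1, n - 1), random.randint(1, n // 2)))
             for _ in range(n_particles)]
    best = min(swarm, key=fitness)
    for _ in range(iters):
        for i, p in enumerate(swarm):
            cand = set(p)
            # neighborhood adjustment: drift toward the global best
            for pt in best - p:
                if random.random() < 0.5:
                    cand.add(pt)
            for pt in p - best:
                if random.random() < 0.3:
                    cand.discard(pt)
            # random jump: flip one random interior point
            cand.symmetric_difference_update({random.randrange(1, n - 1)})
            if cand and fitness(cand) < fitness(p):
                swarm[i] = cand
        best = min(swarm + [best], key=fitness)
    return [0] + sorted(best) + [n - 1]
```

The returned index list always keeps the two endpoints, so the compressed trajectory can be reconstructed by linear interpolation between consecutive kept points.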
With the accelerating digital transformation of the power system, electronic data generated in power production and operation have grown explosively. However, the proliferation of deep forgery (deepfake) technology poses serious risks to the electronic data management of power grid enterprises, making it urgent to build a trusted management system for electronic data. This paper proposes a blockchain-based framework for deep forgery data identification and traceability. First, a blockchain-based trusted identification method for electronic data is proposed, which constructs a unique identifier for the data and embeds it in the electronic data as a digital watermark. Second, blockchain-based electronic data forensic appraisal is introduced to analyze the authenticity and similarity of electronic data. Finally, a deep forgery data traceability mechanism based on the digital identifier is designed to realize traceability and dissemination supervision of deep forgery data. Comparative analysis shows that the framework is more secure and efficient for deep forgery data supervision and can provide key support for building a trusted content system in cyberspace.
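A minimal sketch of the identification-plus-watermark idea is shown below, assuming SHA-256 content hashing for the identifier and a naive LSB embedding as the watermark; the real framework's on-chain layout, metadata fields, and watermarking scheme are not specified in the abstract, so everything here is illustrative.

```python
import hashlib
import json

def make_identifier(data: bytes, meta: dict) -> str:
    """Unique identifier = hash over the content hash plus registration
    metadata (field names are hypothetical)."""
    payload = json.dumps({"sha256": hashlib.sha256(data).hexdigest(),
                          "meta": meta}, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def embed_lsb(carrier: bytes, ident: str) -> bytearray:
    """Toy watermark: write the identifier's bits (MSB-first per hex
    nibble) into the least-significant bits of the carrier bytes."""
    bits = [(int(ident[i // 4], 16) >> (3 - i % 4)) & 1
            for i in range(len(ident) * 4)]
    out = bytearray(carrier)
    for i, b in enumerate(bits):
        out[i] = (out[i] & 0xFE) | b
    return out

def extract_lsb(carrier: bytes, n_hex: int = 64) -> str:
    """Recover the embedded identifier from the carrier's LSBs."""
    bits = [carrier[i] & 1 for i in range(n_hex * 4)]
    return "".join(
        format(sum(bit << (3 - j)
                   for j, bit in enumerate(bits[k * 4:k * 4 + 4])), "x")
        for k in range(n_hex))
```

In a deployed system the identifier would also be recorded in a blockchain transaction, so a later forensic check can compare the watermark extracted from a circulating file against the on-chain record.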
Generalizable person Re-Identification (ReID) has attracted growing attention in the computer vision community. In this work, we construct a structural causal model (SCM) among identity labels, identity-specific factors (e.g., clothing and shoe color), and domain-specific factors (e.g., background and viewpoint). Guided by the causal analysis, we propose a novel Domain Invariant Representation Learning framework for generalizable person Re-Identification (DIR-ReID). Specifically, we first propose to disentangle the identity-specific and domain-specific feature spaces, and on that basis propose an effective algorithmic implementation of backdoor adjustment, which essentially serves as a causal intervention on the SCM. Extensive experiments show that DIR-ReID outperforms state-of-the-art methods on large-scale domain-generalization ReID benchmarks.
Incremental pedestrian attribute recognition (IncPAR) aims to learn novel person attributes continuously while avoiding catastrophic forgetting, an essential problem for image forensics and security applications, e.g., suspect search. Different from conventional continual learning for visual classification, we formulate IncPAR as a problem of multi-label continual learning with incomplete labels (MCL-IL), where the training samples in a novel task are annotated with only a few categories of interest but may implicitly contain other attributes from previous tasks. Incomplete label assignment is a challenging and frequently encountered issue in real-world multi-label classification, arising for reasons such as incomplete data collection and limited annotation budgets. To tackle the MCL-IL problem, we propose a self-training approach based on dual uncertainty-aware pseudo-labeling (DUAPL) to transfer the knowledge learned in previous tasks to novel tasks. Specifically, both kinds of uncertainty, i.e., aleatoric and epistemic, are modeled to mitigate the negative influence of noisy pseudo labels induced by low-quality samples and by immature models trained inadequately in early tasks. Based on DUAPL, more reliable supervision signals can be estimated to prevent the evolving model from forgetting attributes seen in previous tasks. For standard evaluation of MCL-IL methods, two IncPAR benchmarks, termed RAP-CL and PETA-CL, are constructed by re-organizing public human attribute datasets. Extensive experiments on these benchmarks compare the proposed method with multiple baselines; its superior performance in both recognition accuracy and forgetting ratio demonstrates the effectiveness of DUAPL for IncPAR.
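The gating role of the two uncertainties can be sketched as below. This is a generic dual-uncertainty filter, not the paper's exact formulation: here epistemic uncertainty is approximated by the variance across MC-dropout forward passes, aleatoric uncertainty is assumed to come from a variance head, and all thresholds are illustrative.

```python
import numpy as np

def dual_uncertainty_pseudo_labels(mc_probs, ale_var, tau=0.5,
                                   epi_thresh=0.02, ale_thresh=0.1):
    """mc_probs: (T, N, C) sigmoid outputs from T stochastic
    (MC-dropout) forward passes; ale_var: (N, C) aleatoric variance
    predicted by the model. Returns hard multi-label pseudo labels and
    a mask of which labels are reliable enough to supervise with."""
    mean = mc_probs.mean(axis=0)             # (N, C) mean prediction
    epi = mc_probs.var(axis=0)               # epistemic: MC disagreement
    labels = (mean >= tau).astype(np.int8)   # hard pseudo labels
    # keep a pseudo label only when BOTH uncertainties are low
    mask = (epi <= epi_thresh) & (ale_var <= ale_thresh)
    return labels, mask
```

Masked-out entries would simply be excluded from the multi-label loss on old-task attributes, so unreliable predictions from an immature model do not reinforce forgetting.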
Reference-based video object segmentation is an emerging topic that aims to segment the target object in each video frame referred to by a given reference, such as a language expression or a photo mask. However, language expressions can be vague in conveying an intended concept and ambiguous when similar objects in one frame are hard to distinguish by language. Meanwhile, photo masks are costly to annotate and less practical to provide in real applications. This paper introduces a new task of sketch-based video object segmentation, an associated benchmark, and a strong baseline. Our benchmark includes three datasets, Sketch-DAVIS16, Sketch-DAVIS17, and Sketch-YouTube-VOS, which exploit human-drawn sketches as an informative yet low-cost reference for video object segmentation. We build on STCN, a popular baseline for semi-supervised VOS, and evaluate the most effective design for incorporating a sketch reference. Experimental results show that sketches are more effective and more annotation-efficient than other references, such as photo masks, language expressions, and scribbles.
Intelligent video surveillance (IVS), which applies visual analysis algorithms to explore richly structured information in big surveillance data, remains an active research topic. However, existing IVS systems either struggle to utilize computing resources adequately for efficient large-scale video analysis or are customized for specific video-analytics functions; a comprehensive computing architecture that enhances the efficiency, extensibility, and flexibility of IVS systems is still lacking. Moreover, how combinations of multiple vision modules affect the final performance of end applications of an IVS system remains an open problem. Motivated by these challenges, we develop an Intelligent Scene Exploration and Evaluation (ISEE) platform based on a heterogeneous CPU-GPU cluster and distributed computing tools, where Spark Streaming serves as the computing engine for efficient large-scale video processing and Kafka is adopted as a middleware message center to flexibly decouple the analysis modules. To validate the efficiency of ISEE and study the evaluation of composable systems, we instantiate ISEE for an end application of person retrieval with three visual analysis modules: pedestrian detection with tracking, attribute recognition, and re-identification. Extensive experiments are performed on a large-scale surveillance video dataset covering 25 camera scenes and totaling 587 hours of synchronized 720p video, where a two-stage question-answering procedure is proposed to measure the performance of execution pipelines composed of multiple visual analysis algorithms, based on millions of attribute-based and relationship-based queries. This case study of system-level evaluation may inspire researchers to improve visual analysis algorithms and combination strategies from the perspective of scalable and composable systems.
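The message-center decoupling pattern can be illustrated in miniature as follows. In ISEE the stages would be separate Spark Streaming jobs communicating through Kafka topics; here plain in-process queues stand in for topics, and the detector and attribute modules are trivial stand-ins, so only the wiring pattern is meaningful.

```python
import json
import queue

# Topics decouple producers from consumers: each stage only knows the
# topic it reads from and the topic it writes to.
topics = {"frames": queue.Queue(),
          "detections": queue.Queue(),
          "attributes": queue.Queue()}

def detector(frame):
    """Stand-in pedestrian detector: emits one box per frame."""
    return {"frame": frame["id"], "box": [0, 0, 64, 128]}

def attribute_recognizer(det):
    """Stand-in attribute module consuming detector output."""
    return {"frame": det["frame"], "attrs": ["backpack"]}

def run_pipeline(n_frames):
    for i in range(n_frames):                    # ingestion stage
        topics["frames"].put(json.dumps({"id": i}))
    while not topics["frames"].empty():          # detection stage
        frame = json.loads(topics["frames"].get())
        topics["detections"].put(json.dumps(detector(frame)))
    while not topics["detections"].empty():      # attribute stage
        det = json.loads(topics["detections"].get())
        topics["attributes"].put(json.dumps(attribute_recognizer(det)))
    return [json.loads(topics["attributes"].get())
            for _ in range(n_frames)]
```

Because every inter-stage record is a serialized message on a named topic, a module (e.g., the detector) can be swapped or scaled out without touching the other stages, which is the flexibility the middleware design targets.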