Person search unifies person detection and person re-identification (Re-ID) to locate query persons in panoramic gallery images. One major challenge is the imbalanced, long-tailed distribution of person identities, which prevents a one-step person search model from learning discriminative person features for the final re-identification. However, how to handle heavily imbalanced identity distributions in one-step person search remains under-explored. Techniques designed for long-tail classification, such as image-level re-sampling strategies, are hard to apply effectively to one-step person search, which jointly solves the person detection and Re-ID subtasks within a detection-based multi-task framework. To tackle this problem, we propose a Subtask-dominated Transfer Learning (STL) method. STL addresses the long-tail problem in the pretraining stage of the dominant Re-ID subtask and improves one-step person search by transfer learning from the pretrained model. We further design a Multi-level RoI Fusion Pooling layer to enhance the discriminative ability of person features for one-step person search. Extensive experiments on the CUHK-SYSU and PRW datasets demonstrate the superiority and effectiveness of the proposed method.
In this paper we present the design of an HDTV decoder SoC platform that integrates IP cores such as a MIPS CPU, an HDTV video decoder, a video processor, an OSD, and many peripheral IP devices. Cores are integrated into the platform through non-glue wrappers. By estimating the bus and memory access bandwidth, we can manage the data path effectively. The platform is also flexible enough that new functions can be added without changing the system structure. The SoC architecture therefore suits a wide range of digital media processing applications.
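The bandwidth estimate mentioned above can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' method: the client names, the per-frame access counts, and the bus parameters are all hypothetical values chosen for illustration, and the frame format is assumed to be 8-bit 4:2:0 (1.5 bytes per pixel).

```python
def frame_bytes(width, height, bytes_per_pixel=1.5):
    """Bytes in one uncompressed frame (1.5 bytes/pixel assumes 8-bit 4:2:0)."""
    return int(width * height * bytes_per_pixel)


def client_bandwidth(width, height, fps, accesses_per_frame):
    """Bytes per second for a client that touches every frame
    `accesses_per_frame` times (reads plus writes)."""
    return frame_bytes(width, height) * fps * accesses_per_frame


# Hypothetical memory clients and how often each touches a full frame.
clients = {
    "video_decoder": 3,    # assumed: two reference reads and one write
    "video_processor": 2,  # assumed: one read and one write
    "display": 1,          # assumed: one read
}

total = sum(client_bandwidth(1920, 1080, 30, n) for n in clients.values())

# Hypothetical bus: 64 bits wide at 133 MHz, one transfer per cycle.
bus_peak = 8 * 133_000_000
utilization = total / bus_peak

print(f"total demand: {total / 1e6:.1f} MB/s")
print(f"bus utilization: {utilization:.1%}")
```

An estimate of this kind only bounds the average demand; burst behaviour and arbitration overhead would need a more detailed model.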
The effort of designing the software system for a multiprocessor system-on-a-chip (MPSoC) grows with ever-increasing system complexity. The design methodology mainly focuses on goals such as co-development with hardware, reducing cost and time, and increasing portability, reusability, and robustness. This paper explores how to design a high-performance, robust software system using a component-based method and system-level co-development. The complex architecture and system requirements constrain the whole process. The high-definition television (HDTV) decoder SoC platform consists of two processors and customized hardware and software components. Drawing on this successful experience, we give the design and implementation details of the software components.
This paper presents a novel frame-skipping transcoding scheme for improving picture quality and reducing computational complexity. Operating in the discrete cosine transform (DCT) domain, it avoids the heavy computation of the DCT and inverse DCT (IDCT). To maintain acceptable video quality after frame-rate conversion, the proposed scheme uses two strategies: one converts the macroblock (MB) coding mode according to a judgment criterion to reduce the drift error, and the other re-calculates the prediction residual to minimize the re-encoding error. Furthermore, the transcoder uses a structure comprising a separate decoder and a partial encoder to improve implementation efficiency. Simulation results show that, compared with traditional transcoders, the proposed transcoder has lower complexity and higher performance.
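One property that makes DCT-domain processing possible is the linearity of the DCT: the transform of a difference equals the difference of the transforms, so a residual can be formed directly from coefficients without an IDCT/DCT round trip. The sketch below only demonstrates that property on a 1-D, 8-sample block; it is not the transcoding scheme itself, and the sample values are made up.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


# Made-up pixel rows: a current block and its prediction.
current = [52, 55, 61, 66, 70, 61, 64, 73]
prediction = [50, 54, 60, 68, 69, 60, 66, 70]

# Residual formed in the pixel domain, then transformed.
residual_pixel = dct_1d([c - p for c, p in zip(current, prediction)])

# Residual formed directly from DCT coefficients.
residual_dct = [a - b for a, b in zip(dct_1d(current), dct_1d(prediction))]

max_diff = max(abs(a - b) for a, b in zip(residual_pixel, residual_dct))
print(f"largest coefficient difference: {max_diff:.2e}")
```

The two residuals agree up to floating-point rounding, which is why a transcoder that already holds DCT coefficients can update residuals without leaving the transform domain.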
Focusing on retrieving abnormal events from traffic surveillance video databases, this work proposes a novel video description scheme (DS) that divides the object description into a moving object description and a still region description. Additionally, to address the challenge posed by redundant description data, we develop a group of second-order spatial relationship (SR) semantics that directly represent the changes of SR over time. Combined with the proposed DS, a semantic model based on the description data is developed to implement low-complexity traffic event detection. Experimental results show that the proposed detection model is efficient.
Raptor codes are state-of-the-art forward error correction (FEC) solutions for multimedia transmission, and they have been applied to unequal error protection (UEP) of multi-layered media such as scalable video coding. In this paper, we address the problem of UEP for single-layered video over packet erasure channels. By exploiting the different priorities of video packets within a group of pictures (GOP) and making full use of the favorable properties of standardized Raptor codes at large block lengths, we propose an optimized UEP framework for single-layered video and develop an efficient algorithm to solve it. Simulation results show that our method obtains significant gains in the presence of packet losses.
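The idea behind UEP can be sketched with an idealized erasure code. The example below assumes a code that decodes as soon as any k of the n transmitted packets arrive, and assumes independent packet losses; real Raptor codes need a small reception overhead beyond k, and the packet counts and loss rate here are hypothetical. It is an illustration of why giving more repair packets to high-priority data helps, not the optimization framework proposed in the abstract.

```python
from math import comb


def decode_prob(k, n, loss):
    """Probability that at least k of n packets arrive when each packet is
    lost independently with probability `loss` (idealized erasure code)."""
    p = 1.0 - loss
    return sum(comb(n, r) * p**r * (1.0 - p) ** (n - r) for r in range(k, n + 1))


loss_rate = 0.10  # hypothetical channel loss rate

# Hypothetical allocation: both classes carry 20 source packets, but the
# high-priority class is given more repair packets.
high = decode_prob(k=20, n=30, loss=loss_rate)
low = decode_prob(k=20, n=22, loss=loss_rate)

print(f"high-priority decode probability: {high:.4f}")
print(f"low-priority decode probability:  {low:.4f}")
```

Under these assumptions the high-priority class decodes far more reliably for the same loss rate, at the cost of extra redundancy.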