Postinfectious hydrocephalus in infants is a major health problem in sub-Saharan Africa. The conventional treatment is ventriculoperitoneal shunting, but surgeons are usually not immediately available to revise shunts when they fail. Endoscopic third ventriculostomy with choroid plexus cauterization (ETV-CPC) is an alternative treatment that is less subject to late failure but is also less likely than shunting to result in a reduction in ventricular size that might facilitate better brain growth and cognitive outcomes. We conducted a randomized trial to evaluate cognitive outcomes after ETV-CPC versus ventriculoperitoneal shunting in Ugandan infants with postinfectious hydrocephalus. The primary outcome was the Bayley Scales of Infant Development, Third Edition (BSID-3), cognitive scaled score 12 months after surgery (scores range from 1 to 19, with higher scores indicating better performance). The secondary outcomes were BSID-3 motor and language scores, treatment failure (defined as treatment-related death or the need for repeat surgery), and brain volume measured on computed tomography. A total of 100 infants were enrolled; 51 were randomly assigned to undergo ETV-CPC, and 49 were assigned to undergo ventriculoperitoneal shunting. The median BSID-3 cognitive scores at 12 months did not differ significantly between the treatment groups (a score of 4 for ETV-CPC and 2 for ventriculoperitoneal shunting; Hodges-Lehmann estimated difference, 0; 95% confidence interval [CI], -2 to 0; P=0.35). There was no significant difference between the ETV-CPC group and the ventriculoperitoneal-shunt group in BSID-3 motor or language scores, rates of treatment failure (35% and 24%, respectively; hazard ratio, 0.7; 95% CI, 0.3 to 1.5; P=0.24), or brain volume (z score, -2.4 and -2.1, respectively; estimated difference, 0.3; 95% CI, -0.3 to 1.0; P=0.12). This single-center study involving Ugandan infants with postinfectious hydrocephalus showed no significant difference between ETV-CPC and ventriculoperitoneal shunting with regard to cognitive outcomes at 12 months. (Funded by the National Institutes of Health; ClinicalTrials.gov number, NCT01936272.)
The goal of video hashing is to design hash functions that summarize videos by short fingerprints or hashes. While traditional applications of video hashing lie in database searches and content authentication, the emergence of websites such as YouTube and DailyMotion poses a challenging problem of anti-piracy video search. That is, hashes or fingerprints of an original video (provided to YouTube by the content owner) must be matched against those uploaded to YouTube by users to identify instances of "illegal" or undesirable uploads. Because the uploaded videos invariably differ from the original in their digital representation (owing to incidental or malicious distortions), robust video hashes are desired. We model videos as order-3 tensors and use multilinear subspace projections, such as a reduced-rank parallel factor analysis (PARAFAC), to construct video hashes. We observe that, unlike most standard descriptors of video content, tensor-based subspace projections can offer excellent robustness while effectively capturing the spatio-temporal essence of the video for discriminability. We introduce randomization in the hash function by dividing the video into (secret-key-based) pseudo-randomly selected overlapping sub-cubes to protect against intentional guessing and forgery. A detection-theoretic analysis of the proposed hash-based video identification is presented, in which we derive analytical approximations for error probabilities. Remarkably, these theoretical error estimates closely mimic the empirically observed error probabilities of our hash algorithm. Furthermore, experimental receiver operating characteristic (ROC) curves reveal that the proposed tensor-based video hash exhibits enhanced robustness against both spatial and temporal video distortions over state-of-the-art video hashing techniques.
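A minimal sketch of the key-based sub-cube partitioning and a multilinear subspace summary is given below. The function name `subcube_hash`, the use of truncated SVDs of the mode unfoldings in place of a full reduced-rank PARAFAC fit, and the median binarization are illustrative assumptions, not the exact hash construction of the paper.

```python
import numpy as np

def subcube_hash(video, key, n_cubes=8, cube=(16, 32, 32), rank=3):
    """Hedged sketch: hash an order-3 video tensor (frames x H x W).

    A secret key seeds the PRNG so that the overlapping sub-cubes are
    pseudo-randomly placed; each sub-cube is summarized by the leading
    singular values of its three mode unfoldings (a simple multilinear
    subspace surrogate for a reduced-rank PARAFAC fit).
    """
    rng = np.random.default_rng(key)              # key-dependent randomization
    T, H, W = video.shape
    feats = []
    for _ in range(n_cubes):
        # pseudo-random (possibly overlapping) sub-cube location
        t0 = rng.integers(0, T - cube[0] + 1)
        y0 = rng.integers(0, H - cube[1] + 1)
        x0 = rng.integers(0, W - cube[2] + 1)
        sub = video[t0:t0 + cube[0], y0:y0 + cube[1], x0:x0 + cube[2]]
        # leading singular values of each mode unfolding capture the
        # dominant spatio-temporal subspace energy of the sub-cube
        for mode in range(3):
            unfold = np.moveaxis(sub, mode, 0).reshape(sub.shape[mode], -1)
            s = np.linalg.svd(unfold, compute_uv=False)[:rank]
            feats.append(s / (np.linalg.norm(s) + 1e-12))
    feats = np.concatenate(feats)
    # binarize against the median to obtain a compact fingerprint
    return (feats > np.median(feats)).astype(np.uint8)

# usage: bits = subcube_hash(np.asarray(frames, dtype=float), key=12345)
```

Two videos would then be compared by the Hamming distance between their bit strings, with a decision threshold chosen from the detection-theoretic analysis.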
Hash functions are frequently called message digest functions. Their purpose is to extract a short binary string from a large digital message. A key feature of conventional cryptographic (and other) hashing algorithms such as message digest 5 (MD5) and secure hash algorithm 1 (SHA-1) is that they are extremely sensitive to the message; i.e., changing even one bit of the input message will change the output dramatically. However, multimedia data such as digital images undergo various manipulations such as compression and enhancement. An image hash function should instead take into account the changes in the visual domain and produce hash values based on the image's visual appearance. Such a function would facilitate comparisons and searches in large image databases. Other applications of a perceptual hash lie in content authentication and watermarking.
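The avalanche behavior of a conventional digest can be seen with Python's standard hashlib; the message below is an arbitrary placeholder.

```python
import hashlib

msg = bytearray(b"a large digital message ...")
print(hashlib.md5(msg).hexdigest())

msg[0] ^= 0x01                           # flip a single bit of the input
print(hashlib.md5(msg).hexdigest())      # the digest changes completely
```

A perceptual image hash must behave in the opposite way: two images that look alike should map to (nearly) identical hash values.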
This dissertation proposes a unifying framework for multimedia signal hashing. The problem of media hashing is divided into two stages. The first stage extracts media-dependent intermediate features that are robust under incidental modifications while being different for perceptually distinct media with high probability. The second stage performs a media-independent clustering of these features to produce a final hash.
This dissertation focuses on feature extraction from natural images such that the extracted features are largely invariant under perceptually insignificant modifications to the image (i.e., robust). An iterative, geometry-preserving feature detection algorithm is developed based on an explicit modeling of the human visual system via end-stopped wavelets. For the second stage, I show that the decision version of the feature clustering problem is NP-complete. Then, for any perceptually significant feature extractor, I develop polynomial-time clustering algorithms based on a greedy heuristic.
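The flavor of such a greedy heuristic is sketched below; the function `greedy_cluster`, the epsilon-ball rule, and the Euclidean metric are illustrative assumptions rather than the dissertation's exact algorithm.

```python
import numpy as np

def greedy_cluster(features, eps):
    """Hedged sketch of a greedy clustering heuristic (illustrative only).

    Repeatedly pick an unassigned feature vector as a cluster center and
    absorb every unassigned vector within distance eps of it; the cluster
    index can then serve as (part of) the final hash value.
    """
    features = np.asarray(features, dtype=float)
    labels = np.full(len(features), -1)
    cluster_id = 0
    for i in range(len(features)):
        if labels[i] != -1:
            continue
        labels[i] = cluster_id
        dists = np.linalg.norm(features - features[i], axis=1)
        labels[(dists <= eps) & (labels == -1)] = cluster_id
        cluster_id += 1
    return labels

# usage: labels = greedy_cluster(np.random.rand(50, 8), eps=0.5)
```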
Existing algorithms for image/media hashing exclusively employ either cryptographic or signal processing methods. A pure signal processing approach achieves robustness to perceptually insignificant distortions but compromises security, which is desirable in multimedia protection applications. Likewise, pure cryptographic techniques, while secure, completely ignore the requirement of being robust to incidental modifications of the media. The primary contribution of this dissertation is a joint signal processing and cryptography approach to building robust as well as secure image hashing algorithms. The ideas proposed in this dissertation can also be applied to other problems in multimedia security, e.g., watermarking and data hiding.
This paper reviews the second NTIRE challenge on image dehazing (restoration of rich details in a hazy image) with a focus on the proposed solutions and results. The training data consist of 55 hazy images (with dense haze generated in an indoor or outdoor environment) and their corresponding ground-truth (haze-free) images of the same scene. The dense haze was produced using a professional haze/fog generator that imitates real hazy conditions. The evaluation consists of a comparison of the dehazed images with the ground-truth images. The dehazing process was learnable through provided pairs of haze-free and hazy training images. There were ~270 registered participants, and 23 teams competed in the final testing phase. The proposed solutions gauge the state of the art in image dehazing.
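The comparison against ground truth is typically reported with full-reference metrics such as PSNR (often alongside SSIM); a minimal PSNR computation is sketched below as an illustration, not the challenge's exact scoring script.

```python
import numpy as np

def psnr(dehazed, ground_truth, peak=255.0):
    """Peak signal-to-noise ratio between a dehazed image and its
    haze-free ground truth (a standard full-reference quality metric)."""
    mse = np.mean((dehazed.astype(float) - ground_truth.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```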
This paper reviews the first NTIRE challenge on video deblurring (restoration of rich details and high-frequency components from blurred video frames) with a focus on the proposed solutions and results. A new REalistic and Diverse Scenes dataset (REDS) was employed. The challenge was divided into two tracks: Track 1 employed dynamic motion blur, while Track 2 had additional MPEG video compression artifacts. The two tracks had 109 and 93 registered participants, respectively, and in total 13 teams competed in the final testing phase. The proposed solutions gauge the state of the art in video deblurring.
Sparse modeling has demonstrated superior performance in many applications. Compared to optimization-based approaches, Bayesian sparse modeling generally provides a sparser result along with a measure of confidence. Using spike-and-slab priors, we propose hierarchical sparse models for the single-task and multitask scenarios: Hi-BCS and CHi-BCS, respectively. We draw connections between these two methods and their optimization-based counterparts and use expectation propagation for inference. Experimental results on synthetic and real data demonstrate that the performance of Hi-BCS and CHi-BCS is comparable to or better than that of their optimization-based counterparts.
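For reference, the generative model behind a spike-and-slab prior in a compressive-sensing setting can be sketched as below; the Gaussian slab, the parameter values, and the measurement setup are generic illustrations, and the block does not show the expectation-propagation inference used by Hi-BCS/CHi-BCS.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, pi, sigma_slab, sigma_noise = 200, 80, 0.05, 1.0, 0.01

# spike-and-slab prior: each coefficient is exactly zero (the "spike")
# with probability 1 - pi, and drawn from a Gaussian "slab" otherwise
support = rng.random(n) < pi
x = np.where(support, rng.normal(0.0, sigma_slab, n), 0.0)

# compressive measurements y = Phi x + noise; inference then recovers x
# (and its support) from (y, Phi)
Phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
y = Phi @ x + rng.normal(0.0, sigma_noise, m)

print(f"nonzeros: {support.sum()} of {n}, measurements: {m}")
```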
This paper reviews the first challenge on single-image super-resolution (restoration of rich details in a low-resolution image) with a focus on the proposed solutions and results. A new DIVerse 2K resolution image dataset (DIV2K) was employed. The challenge had 6 competitions divided into 2 tracks with 3 magnification factors each. Track 1 employed the standard bicubic downscaling setup, while Track 2 had unknown downscaling operators (blur kernel and decimation) that were learnable from pairs of low- and high-resolution training images. Each competition had about 100 registered participants, and 20 teams competed in the final testing phase. The proposed solutions gauge the state of the art in single-image super-resolution.
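A minimal sketch of the Track 1 data generation (known bicubic downscaling) using Pillow is shown below; the file names and the scale factor are placeholders.

```python
from PIL import Image

scale = 4                                  # one of the challenge magnification factors
hr = Image.open("hr_image.png")            # placeholder path to a high-resolution image
lr = hr.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)
lr.save("lr_image_x4.png")
# models are trained on (lr, hr) pairs and judged by how well they
# recover the high-resolution image from the low-resolution input
```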
Surviving geometric attacks in image authentication is considered to be of great importance. This is because of the vulnerability of classical watermarking and digital-signature-based schemes to geometric image manipulations, particularly local geometric attacks. In this paper, we present a general framework for image content authentication using salient feature points. We first develop an iterative feature detector based on an explicit modeling of the human visual system. Then, we compare features from two images by developing a generalized Hausdorff distance measure. The use of such a distance measure is crucial to the robustness of the scheme and accounts for feature detector failure or occlusion, which previously proposed methods do not address. The proposed algorithm withstands standard benchmark (e.g., StirMark) attacks, including compression, common signal processing operations, global as well as local geometric transformations, and even hard-to-model distortions such as print-and-scan. Content-changing (malicious) manipulations of image data are also accurately detected.
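A partial (rank-based) Hausdorff-type distance of the kind alluded to above can be sketched as follows; the quantile parameter and the symmetric combination are illustrative choices and may differ from the generalized measure developed in the paper.

```python
import numpy as np

def partial_hausdorff(A, B, frac=0.8):
    """Directed partial Hausdorff distance from point set A to point set B.

    Instead of the maximum of the nearest-neighbor distances (the classical
    Hausdorff distance), take their frac-quantile, so that a fraction of
    missed or occluded feature points does not dominate the measure.
    """
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # pairwise distances
    nearest = d.min(axis=1)                                     # NN distance for each point in A
    return np.quantile(nearest, frac)

def symmetric_partial_hausdorff(A, B, frac=0.8):
    return max(partial_hausdorff(A, B, frac), partial_hausdorff(B, A, frac))

# usage: dist = symmetric_partial_hausdorff(feat_points_1, feat_points_2)
# a small distance indicates the two images share the same salient content
```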
Image dehazing continues to be one of the most challenging inverse problems. Deep learning methods have emerged to complement traditional model-based methods and have helped define a new state of the art in achievable dehazed image quality. Yet, practical challenges remain in dehazing real-world images where the scene is heavily covered with dense haze, even to the extent that no scene information can be observed visually. Many recent dehazing methods have addressed this challenge by designing deep networks that estimate the physical parameters in the haze model, i.e., the ambient light (A) and the transmission map (t). The inverse of the haze model may then be used to estimate the dehazed image. In this work, we develop two novel network architectures to further this line of investigation. Our first model, denoted At-DH, uses a shared DenseNet-based encoder and two distinct DenseNet-based decoders to jointly estimate the scene information, viz., A and t. This is in contrast to most recent efforts (including those published in CVPR'18) that estimate these physical parameters separately. As a natural extension of At-DH, we develop the AtJ-DH network, which adds one more DenseNet-based decoder to jointly recreate the haze-free image along with A and t. Knowledge of the (ground-truth) dehazed/clean training images can be exploited by a custom regularization term that further enhances the estimates of the model parameters A and t in AtJ-DH. Experiments performed on the challenging benchmark image datasets of NTIRE'19 and NTIRE'18 demonstrate that At-DH and AtJ-DH outperform state-of-the-art alternatives, especially when recovering images corrupted by dense haze.
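For reference, the atmospheric scattering model that defines A and t, and its inversion once the networks have produced estimates, can be sketched as follows; the transmission floor t0 and the final clipping are common practical safeguards assumed here, not details taken from the paper.

```python
import numpy as np

def invert_haze_model(I, A, t, t0=0.1):
    """Recover a haze-free image J from the atmospheric scattering model
        I(x) = J(x) * t(x) + A * (1 - t(x)),
    given estimates of the ambient light A and the transmission map t.
    """
    t = np.clip(t, t0, 1.0)[..., None]        # avoid division by tiny transmission values
    J = (I - A * (1.0 - t)) / t
    return np.clip(J, 0.0, 1.0)

# I: HxWx3 hazy image in [0, 1]; A: length-3 ambient light; t: HxW transmission map
```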