A Deterministic Deduplication Model for Removing Redundancies in Similar Videos Archive

Jyoti Malhotra, Jagdish Bakal

2018

Traditional storage approaches are being challenged by huge data volumes. In multimedia content, redundant files are not always exact duplicates; files are often edited, which produces similar copies of the same file. This paper proposes a similarity-based deduplication approach that removes similar duplicates from archive storage by comparing samples of binary hashes. The query video is divided into dynamic key frames based on the video length. The binary hash codes of these frames are compared with those of the existing key frames to identify differences. A similarity score computed from these differences decides whether the duplicate copy is eliminated. Duplicate elimination proceeds in two levels: removal of exact duplicates, followed by removal of similar duplicates. The proposed approach shortens the comparison window by comparing only the candidate hash codes derived from the dynamic key frames, and aims at accurate, lossless duplicate removal. The work is implemented and tested on a generated synthetic video dataset. The results show a reduction in redundant data and an increase in available storage space. Binary hashes and similarity scores contributed to achieving a good deduplication ratio and overall performance.
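
The two-level flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hash function, the key-frame rule, or the decision threshold, so everything here is an assumption chosen for illustration. Frames are modeled as flat lists of grayscale pixel values (0 to 255), the binary hash is a simple average hash, the exact-duplicate level uses SHA-256 over the raw frames, the key-frame count is one per 30 frames, and the similarity threshold is 0.9. All names (`key_frame_indices`, `average_hash`, `Archive`, and so on) are hypothetical.

```python
import hashlib


def key_frame_indices(n_frames, frames_per_key=30, min_keys=2):
    """Pick evenly spaced key-frame indices; the count grows with video length.

    frames_per_key and min_keys are assumed values, not taken from the paper.
    """
    if n_frames == 0:
        return []
    n_keys = min(n_frames, max(min_keys, n_frames // frames_per_key))
    step = n_frames / n_keys
    return [int(i * step) for i in range(n_keys)]


def average_hash(frame):
    """Binary hash: one bit per pixel, set when the pixel exceeds the frame mean."""
    mean = sum(frame) / len(frame)
    bits = 0
    for pixel in frame:
        bits = (bits << 1) | (1 if pixel > mean else 0)
    return bits


def video_signature(frames):
    """Binary hashes of the key frames only, which keeps the comparison window short."""
    return [average_hash(frames[i]) for i in key_frame_indices(len(frames))]


def similarity(sig_a, sig_b, bits_per_hash):
    """1.0 minus the fraction of differing bits over the paired key frames."""
    n = min(len(sig_a), len(sig_b))
    if n == 0:
        return 0.0
    differing = sum(bin(a ^ b).count("1") for a, b in zip(sig_a, sig_b))
    return 1.0 - differing / (n * bits_per_hash)


def exact_digest(frames):
    """Digest of the full pixel content, used for the exact-duplicate level."""
    h = hashlib.sha256()
    for frame in frames:
        h.update(bytes(frame))
    return h.hexdigest()


class Archive:
    """Toy archive that rejects exact duplicates first, then similar duplicates."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed similarity cut-off
        self.digests = {}           # exact digest -> stored video name
        self.signatures = {}        # stored video name -> (signature, bits per hash)

    def add(self, name, frames):
        if not frames:
            raise ValueError("video has no frames")
        # Level 1: exact duplicate check.
        digest = exact_digest(frames)
        if digest in self.digests:
            return ("exact", self.digests[digest])
        # Level 2: similar duplicate check on key-frame hashes.
        signature = video_signature(frames)
        bits = len(frames[0])
        for other, (other_sig, other_bits) in self.signatures.items():
            if other_bits != bits:
                continue
            if similarity(signature, other_sig, bits) >= self.threshold:
                return ("similar", other)
        self.digests[digest] = name
        self.signatures[name] = (signature, bits)
        return ("stored", name)
```

Calling `Archive().add(name, frames)` returns a pair whose first element is `"stored"`, `"exact"`, or `"similar"`, and whose second element names the stored video or the existing copy that matched. A real system would decode actual video frames and would likely use a more robust perceptual hash; the sketch only shows how a key-frame hash comparison and a similarity score can drive the two elimination levels.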
