TH-CDP: An Efficient Block Level Continuous Data Protection System

2009 
Traditional data protection technologies, such as remote mirroring, snapshots, and backup, cannot fully protect against virus attacks or user errors. Continuous data protection (CDP), which captures every data write at the file or block level, is an enabling technology for protecting storage systems against malicious attacks and user mistakes, because it makes every block write undoable. Most existing data protection systems and prototypes are not true CDP systems, because data written between successive snapshots is left exposed; they therefore provide coarser-grained recovery points. In addition, some products work only at the file system level or the application level and so are not general purpose. This paper presents the design, implementation, and performance of a new block-level continuous data protection system, TH-CDP. Besides providing the basic functions of true CDP, TH-CDP provides a virtual volume image at any point in time without affecting the production system. Another distinctive feature of TH-CDP is its checkpointing mechanism: by encapsulating checkpoint information into the I/O request packet queue, TH-CDP can take checkpoints without temporarily halting normal business processing or blocking any incoming request. In addition, TH-CDP uses a log-structured technique to store changed block data on raw disk, which speeds up both data writing and space recycling. Extensive experiments on file systems and databases, using the IOzone and TPC-C benchmarks, show that TH-CDP effectively improves the convenience of checkpointing and of the recovery assurance process. When the changed block data amounts to as much as 20 times the original data size, the 100% sequential read speed of the oldest virtual volume image version is roughly 1/4 to 1/3 that of a normal iSCSI volume.
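
The ideas named in the abstract (an append-only log of changed blocks, checkpoint markers queued in the same stream as writes, and a virtual volume image readable at any point in time) can be illustrated with a minimal in-memory sketch. This is not TH-CDP's implementation: the class `CdpLog`, its methods, the sequence-number scheme, and the tiny block size are all hypothetical, and a real system would write to raw disk and index the log rather than scan it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

BLOCK_SIZE = 4  # tiny block size, for illustration only


@dataclass
class CdpLog:
    """Toy append-only log of block writes with in-band checkpoint markers."""
    base: Dict[int, bytes] = field(default_factory=dict)  # original volume contents
    # Each record is (sequence number, block number or None for a marker, data).
    records: List[Tuple[int, Optional[int], bytes]] = field(default_factory=list)
    checkpoints: Dict[str, int] = field(default_factory=dict)
    next_seq: int = 0

    def write(self, block_no: int, data: bytes) -> int:
        """Append a changed block to the log; nothing is overwritten in place."""
        seq = self.next_seq
        self.records.append((seq, block_no, data))
        self.next_seq += 1
        return seq

    def checkpoint(self, name: str) -> int:
        """Append a marker in the same stream as writes, so writers never pause."""
        seq = self.next_seq
        self.records.append((seq, None, b""))
        self.checkpoints[name] = seq
        self.next_seq += 1
        return seq

    def read_at(self, block_no: int, seq: int) -> bytes:
        """Read a block as it looked once all records up to `seq` were applied."""
        for rec_seq, rec_block, data in reversed(self.records):
            if rec_seq <= seq and rec_block == block_no:
                return data
        # No logged write at or before `seq`: fall back to the original volume.
        return self.base.get(block_no, b"\x00" * BLOCK_SIZE)


log = CdpLog(base={0: b"AAAA"})
log.write(0, b"BBBB")
cp = log.checkpoint("before-upgrade")
log.write(0, b"CCCC")
old_view = log.read_at(0, cp)                # block 0 as of the checkpoint
new_view = log.read_at(0, log.next_seq - 1)  # block 0 as of the latest record
```

Because the checkpoint is just another record in the ordered stream, taking one in this sketch costs a single append, and a read against an older point in time has to look further back through the log, which is consistent with the abstract's observation that the oldest image version reads more slowly than a normal volume.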