Implementation of Integer DCT in H.264 Based on CUDA

2012 
In block-based video coding standards such as MPEG-1, MPEG-2, MPEG-4, H.263 and H.264, each residual macroblock is transformed, quantized and coded. H.264 applies a 4×4 integer Discrete Cosine Transform (DCT) to luma data. Because the DCT of each block is independent of every other block, the transform can in principle be computed for all blocks in parallel. Given the GPU's strong parallel computing capability, this paper proposes a solution that parallelizes the DCT using Compute Unified Device Architecture (CUDA). Experimental results show that the CUDA-based solution achieves a speedup of 30 to 50 times over the CPU-based implementation.
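The block independence described above can be illustrated with a small CPU reference sketch in Python. It is not the paper's CUDA kernel, which the abstract does not give. It uses the standard H.264 forward core transform Y = Cf · X · Cf^T and leaves the scaling step to quantization. The function names (`forward_dct4x4`, `transform_frame`) are hypothetical, and the frame dimensions are assumed to be multiples of 4.

```python
# Forward core transform matrix of the H.264 4x4 integer DCT.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose4(m):
    """Transpose a 4x4 matrix."""
    return [list(row) for row in zip(*m)]


def forward_dct4x4(block):
    """Compute Y = CF * X * CF^T using integer arithmetic only.

    The scaling factors of the transform are not applied here; in H.264
    they are folded into the quantization step.
    """
    return matmul4(matmul4(CF, block), transpose4(CF))


def transform_frame(frame):
    """Transform every 4x4 block of a residual frame independently.

    Assumes height and width are multiples of 4 (an assumption of this
    sketch). Each loop iteration reads and writes only its own block,
    which is why the iterations could run in parallel.
    """
    height, width = len(frame), len(frame[0])
    out = [[0] * width for _ in range(height)]
    for by in range(0, height, 4):
        for bx in range(0, width, 4):
            block = [row[bx:bx + 4] for row in frame[by:by + 4]]
            coeffs = forward_dct4x4(block)
            for i in range(4):
                out[by + i][bx:bx + 4] = coeffs[i]
    return out
```

Because no iteration of the two block loops depends on another, one plausible GPU mapping is to assign each 4×4 block (or each coefficient within a block) to its own thread. Whether the paper uses this exact mapping is not stated in the abstract.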