Design and Verification of Heterogeneous Streaming Parallel Mechanisms on Kepler CUDA

2015 
In many-core parallel computing, how to optimally allocate and schedule computing-core resources according to the characteristics of parallel applications is a typical and fundamental problem that bears directly on computing performance. After analyzing the features and mechanisms of the Kepler CUDA architecture, we study and describe in detail three heterogeneous streaming parallel computing modes, together with their constraints and mechanisms. Considering the performance differences between the processing steps of a single parallel task, we further study a novel mechanism for balancing resources and performance across the whole task. Finally, we present typical implementation methods on a Kepler CUDA processor, and implement typical matrix-processing algorithms and complex target-detection algorithms in each of the three computing modes. Experiments show that these modes can adapt to different types of applications, and that the pipelining parallel computing mode usually performs better.