Exploiting Parallelism through High Level Optimization on a Heterogeneous Multicore SoC

2009 
This paper describes a heterogeneous multicore SoC named EVMP-SoC, which is composed of a RISC host processor and two slightly different SIMD synergistic processors that are specially optimized for embedded visual media applications. By using on-chip memory and a multi-channel memory access unit, the chip achieves several levels of parallelism: Single-Instruction-stream Multiple-Data-stream (SIMD) data-level parallelism (DLP), multicore thread-level parallelism (TLP), and memory tile pipeline parallelism. We used an affine transformation framework called PLuTo for code optimization on EVMP-SoC and explored multiple levels of parallelism on this chip. We found that, lacking a processor performance model, the general polyhedral affine transformation framework could not generate efficient parallel code for heterogeneous architectures. Tile scheduling and pipelining techniques were therefore adopted to make full use of the processor cores and memory bandwidth. The experimental results show that tile scheduling and pipelining are effective, and that the chip achieves a good speedup after all the parallel optimizations. Finally, a case study (a typical application of three-dimensional reconstruction from multiple images) demonstrates the efficiency and usability of the chip.
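The tile pipelining idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's scheduler: the three stage names (load, compute, store), the function names, and the one-step-per-stage timing are assumptions made only for illustration. The sketch shows how overlapping the memory transfer of one tile with the computation of another shortens the schedule compared with handling tiles one stage at a time.

```python
# Hypothetical three-stage tile pipeline: while tile t is being loaded,
# tile t-1 is computed and tile t-2 is stored. Stage names and the
# one-step-per-stage timing are illustrative assumptions.
STAGES = ("load", "compute", "store")


def pipeline_schedule(num_tiles):
    """Return a list of time steps; each step lists the (stage, tile)
    pairs that are active concurrently in that step."""
    steps = []
    for t in range(num_tiles + len(STAGES) - 1):
        active = []
        for offset, stage in enumerate(STAGES):
            tile = t - offset
            if 0 <= tile < num_tiles:
                active.append((stage, tile))
        steps.append(active)
    return steps


def serial_steps(num_tiles):
    """Number of steps when every stage of every tile runs one after another."""
    return num_tiles * len(STAGES)


def tiled_sum_of_squares(data, tile_size):
    """Process a 1-D array tile by tile, as a stand-in for a tiled kernel."""
    total = 0
    for start in range(0, len(data), tile_size):
        tile = data[start:start + tile_size]  # the "load" of one tile
        total += sum(x * x for x in tile)     # the "compute" on that tile
    return total


if __name__ == "__main__":
    for step, active in enumerate(pipeline_schedule(4)):
        print(step, active)
    print("pipelined steps:", len(pipeline_schedule(4)))
    print("serial steps:", serial_steps(4))
```

Under these assumptions, four tiles take six pipelined steps instead of twelve serial ones. On real hardware the gain depends on how well transfer time and compute time are balanced, which is where a processor performance model would come in.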