Parallelization of MATLAB Applications for a Multi-FPGA System

2001 
We present a compiler that takes high-level signal and image processing algorithms described in MATLAB and generates optimized hardware for the WildChild™ board, which has nine FPGAs and external memory. We propose a Single Program Multiple Data (SPMD) style parallelization framework to automatically generate hardware for all nine FPGAs. We propose a data alignment and data distribution scheme that minimizes communication across the FPGAs, and we present a communication framework based on the WildChild interconnection network for sending and receiving data. Our results show a speedup of around 6 to 7 on eight FPGAs. Further, we propose a prediction mechanism to extract parallelism within a single FPGA. We show that this yields much improved speedups of around 28 on eight FPGAs for the Image Thresholding benchmark. We show that the hardware generated by this framework is three times slower than the most optimized manual designs, but it can be generated in seconds, compared with the days a manual designer needs.
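As a rough illustration of the SPMD idea applied to image thresholding, the following Python sketch splits the rows of an image into contiguous blocks, one per worker, and runs the same thresholding routine on each block. It is not the compiler's actual distribution scheme, which the abstract does not detail: the row-block partitioning, the function names (`block_ranges`, `threshold_block`, `spmd_threshold`), and the sequential loop standing in for eight FPGAs are all assumptions made for illustration.

```python
def block_ranges(n_rows, n_workers):
    """Split range(n_rows) into n_workers contiguous blocks.

    Block sizes differ by at most one row (hypothetical helper).
    """
    base, extra = divmod(n_rows, n_workers)
    ranges = []
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def threshold_block(image, row_range, level):
    """The single program every worker runs, on its own block of rows."""
    lo, hi = row_range
    return [[1 if px > level else 0 for px in row] for row in image[lo:hi]]


def spmd_threshold(image, level, n_workers=8):
    """Emulate SPMD execution: each worker thresholds only its own rows.

    The loop runs sequentially here; on real hardware each iteration
    would correspond to one processing element working in parallel.
    """
    out = []
    for row_range in block_ranges(len(image), n_workers):
        out.extend(threshold_block(image, row_range, level))
    return out
```

Thresholding needs no neighbouring pixels, so in this sketch no data crosses block boundaries. That absence of inter-worker communication is the kind of property a data distribution scheme tries to preserve.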