EFFICIENCY THROUGH REDUCED COMMUNICATION IN MESSAGE PASSING SIMULATION OF NEURAL NETWORKS

1993 
Neural algorithms require massive computation and very high communication bandwidth, and are naturally expressed at a level of granularity finer than parallel systems can exploit efficiently. Mapping neural networks onto parallel computers has traditionally implied a form of clustering neurons and weights to increase the granularity. SIMD simulations may exceed a million connections per second using thousands of processors, but are often tailored to particular networks and learning algorithms. MIMD simulations require an even larger granularity to run efficiently and often trade flexibility for speed. An alternative technique based on pipelining fewer but larger messages through parallel “broadcast/accumulate trees” is explored. “Lazy” allocation of messages reduces communication and memory requirements, curbing excess parallelism at run time. The mapping is flexible to changes in network architecture and learning algorithm and is suited to a variety of computer configurations. The method pushes the limits of parallelizing backpropagation and feed-forward type algorithms. Results exceed a million connections per second on as few as 30 processors and are up to ten times better than previous results on similar hardware. The implementation techniques can also be applied in conjunction with others, including systolic and VLSI approaches.
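The central idea, moving one large message of activations per processor per layer instead of one fine-grained message per connection, can be illustrated with a minimal modern sketch. The paper predates MPI, so the code below is a hypothetical illustration rather than the authors' implementation: LAYER_IN, LAYER_OUT, and the tanh squashing function are assumptions, LAYER_OUT is assumed divisible by the processor count, and MPI_Allgather stands in for the paper's explicit broadcast/accumulate trees (typical MPI implementations realize such collectives with tree-based algorithms internally).

/* Sketch: coarse-grained message passing for one feed-forward layer.
   Each rank owns a block of output neurons plus the weight rows that
   feed them; communication is one large activation message per rank
   per layer, not one message per connection. Hypothetical example. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define LAYER_IN  512   /* neurons in the input layer  (assumed size) */
#define LAYER_OUT 256   /* neurons in the output layer (assumed size) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (LAYER_OUT % size != 0) {           /* keep the sketch simple */
        if (rank == 0) fprintf(stderr, "LAYER_OUT %% size != 0\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int my_out = LAYER_OUT / size;         /* output neurons owned locally */
    double *x       = malloc(LAYER_IN * sizeof(double));           /* input activations  */
    double *w       = malloc((size_t)my_out * LAYER_IN * sizeof(double)); /* local weights */
    double *y_local = malloc(my_out * sizeof(double));             /* local outputs      */
    double *y       = malloc(LAYER_OUT * sizeof(double));          /* assembled layer    */

    /* Placeholder initialization; a real simulator would load weights
       and a training example here. */
    for (int i = 0; i < LAYER_IN; i++) x[i] = 0.01 * i;
    for (int i = 0; i < my_out * LAYER_IN; i++) w[i] = 0.001;

    /* Local compute: weighted sums and squashing for the owned neurons. */
    for (int j = 0; j < my_out; j++) {
        double s = 0.0;
        for (int i = 0; i < LAYER_IN; i++) s += w[j * LAYER_IN + i] * x[i];
        y_local[j] = tanh(s);
    }

    /* One large message per rank, routed through the library's internal
       broadcast/reduction trees, assembles the full layer everywhere. */
    MPI_Allgather(y_local, my_out, MPI_DOUBLE,
                  y, my_out, MPI_DOUBLE, MPI_COMM_WORLD);

    if (rank == 0) printf("y[0] = %f\n", y[0]);

    free(x); free(w); free(y_local); free(y);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc (linking -lm) and run under mpirun, the sketch exchanges only p messages of my_out values per layer, rather than one message per connection, which is the granularity-coarsening effect the abstract describes.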