Efficient Cross Section Reconstruction on Modern Multi and Many Core Architectures.

2017 
Classical Monte Carlo (MC) neutron transport relies on energy lookups in long tables to compute the cross sections needed for the simulation. This process has been identified as an important performance hotspot of MC simulations: random access patterns and a large memory footprint lead to poor cache utilization, which makes the lookup ill-suited to modern architectures. A previous study [1] shows that this method offers little vectorization potential in realistic simulations because it is memory-bound. In this paper, we revisit a cross section reconstruction method introduced by Hwang [2] as an alternative. Reconstruction converts the problem from memory-bound to compute-bound: only a few variables per resonance are required, instead of the conventional pointwise table covering the entire resolved resonance region. Although the memory footprint is greatly reduced, the method is computationally expensive. After a series of optimizations, results show that the reconstruction kernel benefits substantially from vectorization and achieves 1806 GFLOPS (single precision) on a Knights Landing 7250, which represents 67% of its effective peak performance.
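For illustration, the conventional energy lookup described above can be sketched as a binary search on a sorted energy grid followed by linear interpolation. This is a minimal sketch, not the code studied in the paper: the function name `lookup_cross_section`, the flat-list layout of `energy_grid` and `xs_table`, and the choice of linear interpolation are all assumptions made here.

```python
from bisect import bisect_right


def lookup_cross_section(energy_grid, xs_table, energy):
    """Interpolate a cross section at `energy` from a pointwise table.

    Hypothetical layout: `energy_grid` is sorted ascending and
    `xs_table[i]` is the cross section tabulated at `energy_grid[i]`.
    """
    if not energy_grid[0] <= energy <= energy_grid[-1]:
        raise ValueError("energy outside the tabulated range")

    # Binary search for the interval [E_i, E_{i+1}] containing `energy`.
    # On a long table these probes touch scattered memory locations,
    # which is the access pattern that defeats the cache.
    i = min(bisect_right(energy_grid, energy) - 1, len(energy_grid) - 2)

    # Linear interpolation between the two bracketing grid points.
    e_lo, e_hi = energy_grid[i], energy_grid[i + 1]
    frac = (energy - e_lo) / (e_hi - e_lo)
    return xs_table[i] + frac * (xs_table[i + 1] - xs_table[i])
```

Each lookup performs very little arithmetic per byte fetched, which is consistent with the memory-bound behavior noted in the abstract.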