System-level reliability modeling for MPSoCs

2010 
The reliability of multi-processor systems-on-chip (MPSoCs) is affected by several interdependent system-level and physical effects. Accurate and fast reliability modeling is a primary challenge in the design and optimization of reliable MPSoCs. This paper presents a reliability modeling framework that integrates device-, component-, and system-level models. The framework contains modules for electromigration, time-dependent dielectric breakdown, stress migration, and variable-amplitude thermal cycling. A new statistical reliability distribution is proposed for accurate characterization of components containing too few devices for an extreme value distribution to be appropriate. A hierarchical, system-level, survival-lattice-based Monte Carlo technique is used to estimate the temporal fault distributions of MPSoCs that use arbitrary static and dynamic reliability-enhancing redundancy schemes. Physical process variation, which may have a significant impact on MPSoC reliability, is considered in the model. The proposed modeling technique has an average error of only 5% in mean time to failure and reduces simulation time by nearly 3 orders of magnitude relative to a non-hierarchical Monte Carlo technique.
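
For orientation, the kind of flat (non-hierarchical) Monte Carlo estimate that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration, not the paper's survival lattice method or its proposed distribution: the Weibull lifetime model, the k-out-of-n redundancy rule, and all function and parameter names are assumptions introduced here.

```python
import random


def system_failure_time(lifetimes, k):
    """Failure time of a k-out-of-n system (assumed redundancy rule).

    The system works while at least k of its n components are alive,
    so it fails at the (n - k + 1)-th component failure.
    """
    ordered = sorted(lifetimes)
    return ordered[len(ordered) - k]


def estimate_mttf(n, k, scale, shape, trials, seed=0):
    """Flat Monte Carlo estimate of mean time to failure (MTTF).

    Component lifetimes are drawn from a Weibull distribution, which is
    an illustrative assumption rather than the paper's component model.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # One trial: sample every component lifetime, then apply the
        # redundancy rule to get the system failure time.
        lifetimes = [rng.weibullvariate(scale, shape) for _ in range(n)]
        total += system_failure_time(lifetimes, k)
    return total / trials


if __name__ == "__main__":
    # Hypothetical 4-core system that needs at least 3 working cores.
    print(estimate_mttf(n=4, k=3, scale=10.0, shape=2.0, trials=10000))
```

Each trial samples every component independently, which is why the cost of a flat approach grows with both the number of components and the number of trials needed for a stable estimate.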