Accurate Model for Application Failure Due to Transient Faults in Caches

Mehrtash Manoochehri,Michel Dubois

Accurate Model for Application Failure Due to Transient Faults in Caches

2016

To select an appropriate level of error protection in caches, the impact of various protection schemes on the cache Failure In Time (FIT) rate must be evaluated for a target benchmark suite. However, while many simulation tools exist to evaluate area, power and performance for a set of benchmark programs, there is a dearth of such tools for reliability. This paper introduces a new cache reliability model called PARMA+ that has unique features which distinguish it from previous models. PARMA+ estimates a cache's FIT rate in the presence of spatial multi-bit faults, single-bit faults, temporal multi-bit faults and different error protection schemes including parity, ECC, early write-back and bit-interleaving. We first develop the model formally, then we demonstrate its accuracy. We have run reliability simulations for many distributions of large and small fault patterns and have compared them with accelerated fault injection simulations. PARMA+ has high accuracy and low computational complexity.

Keywords:

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations