A Framework for Design of Self-Repairing Digital Systems
2019
This paper introduces a scalable framework for the design of self-testable, self-correcting, and self-repairing digital systems. Modular redundancy and reprogrammability provide generic self-test and enable self-repair. Bit error rates (BER) are measured throughout the design to distinguish transient errors from errors caused by semi-permanent or permanent logic faults. Triple modular redundancy (TMR) provides error correction and fault isolation, with a fourth module available for automated repair. Modular reconfiguration (repair) occurs automatically, so the system continues to operate error-free even during dynamic partial reconfiguration (in FPGAs). The state of a repaired module is resynchronized with the running system within one cycle after the damaged module is replaced. The framework can repair multiple faults simultaneously while ensuring error-free operation. A case study evaluates the reliability improvement of an FPGA-based neural network image classification application.
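The interplay of voting, error-rate measurement, and the spare module can be illustrated with a minimal software sketch. It is not the paper's hardware implementation: the function names, the window length, and the 5% threshold are all assumptions chosen for illustration, and a real design would perform these steps in logic on the FPGA.

```python
from collections import Counter

# Assumed illustration value, not taken from the paper.
THRESHOLD = 0.05  # error rate above which a fault is treated as persistent


def vote(outputs):
    """Majority-vote three module outputs; return (value, mismatching slots)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority among module outputs")
    return value, [i for i, out in enumerate(outputs) if out != value]


def classify(error_count, cycles, threshold=THRESHOLD):
    """Label a slot from its error rate over one measurement window."""
    if error_count == 0:
        return "healthy"
    return "persistent" if error_count / cycles > threshold else "transient"


def run_window(cycles):
    """Vote every cycle and classify each of the three slots afterwards."""
    errors = [0, 0, 0]
    voted = []
    for outputs in cycles:
        value, mismatched = vote(outputs)
        voted.append(value)  # corrected output is always the majority
        for slot in mismatched:
            errors[slot] += 1
    verdicts = [classify(e, len(cycles)) for e in errors]
    return voted, verdicts


def plan_repair(active, spare, verdicts):
    """Swap the spare in for the first persistent slot.

    Returns (new active module ids, id of the module to reconfigure or None).
    """
    for slot, verdict in enumerate(verdicts):
        if verdict == "persistent":
            new_active = list(active)
            faulty = new_active[slot]
            new_active[slot] = spare
            return new_active, faulty
    return list(active), None
```

In this sketch a module that disagrees with the majority only occasionally is labelled transient and left in place, while one that disagrees in more than the assumed fraction of cycles is replaced by the fourth module and queued for reconfiguration.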