Checkpoint management with double modular redundancy based on the probability of task completion

Seong Woo Kwak, Kwan Ho You, Jung Min Yang

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

This paper proposes a checkpoint rollback strategy for real-time systems with double modular redundancy. Without built-in fault detection or spare processors, the scheme is able to recover from both transient and permanent faults. Two comparisons are conducted at each checkpoint. First, the states stored at two consecutive checkpoints of one processor are compared to check the integrity of that processor. Second, the states of the two processors are compared to detect faults, and the system rolls back to the previous checkpoint whenever the logic of the proposed scheme requires it. The fault recovery scheme induces a Markov model, which is analyzed to obtain the probability that the task completes within its deadline. The number of checkpoints is then chosen to maximize this probability.
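The idea of choosing the checkpoint count to maximize the probability of meeting the deadline can be illustrated with a much simpler model than the paper's. The sketch below is not the authors' Markov model. It assumes, purely for illustration, equally spaced checkpoints, a fixed checkpoint overhead, independent Poisson transient faults on each processor, and a rollback of exactly one segment whenever the two processor states disagree. It ignores permanent faults and the comparison of consecutive checkpoints on one processor. All function and parameter names (`completion_probability`, `best_checkpoint_count`, `task_time`, `overhead`, `fault_rate`) are hypothetical.

```python
import math


def completion_probability(n, task_time, deadline, overhead, fault_rate):
    """P(task finishes by the deadline) with n equally spaced checkpoints.

    Simplified illustrative model, not the Markov model of the paper.
    """
    # Time for one segment, including the assumed fixed checkpoint cost.
    segment = task_time / n + overhead
    # Number of segment executions that fit before the deadline.
    attempts = int(deadline // segment)
    if attempts < n:
        return 0.0
    # Assumption: a segment passes the comparison only if both processors
    # stay fault-free for its whole duration (independent Poisson faults).
    p = math.exp(-2.0 * fault_rate * segment)
    # The task completes in time if at least n of the available attempts
    # succeed, which is a binomial tail probability.
    return sum(
        math.comb(attempts, k) * p**k * (1.0 - p) ** (attempts - k)
        for k in range(n, attempts + 1)
    )


def best_checkpoint_count(task_time, deadline, overhead, fault_rate, max_n=50):
    """Return the n in 1..max_n that maximizes the completion probability."""
    return max(
        range(1, max_n + 1),
        key=lambda n: completion_probability(
            n, task_time, deadline, overhead, fault_rate
        ),
    )
```

Calling `best_checkpoint_count(80, 120, 1, 0.01)` searches checkpoint counts from 1 to 50 for a task of 80 time units with a deadline of 120. The trade-off it exposes is the one the abstract describes: more checkpoints shorten the work lost per rollback but add overhead that consumes slack before the deadline.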

Original language: English
Pages (from-to): 273-280
Number of pages: 8
Journal: Journal of Computer Science and Technology
Volume: 27
Issue number: 2
State: Published - Mar 2012

Keywords

  • Checkpoint scheme
  • Double modular redundancy (DMR)
  • Fault tolerance
  • Markov model
  • Real-time task
