An Improved Crash Recovery Approach for Distributed Systems

Ramidi, Harika Reddy
thesis / dissertation description
In this paper, we address the complex problem of recovery from concurrent failures in a distributed computing environment. We propose a new approach that handles both orphan and lost messages effectively. The proposed checkpointing and recovery approaches enable a process to restart from its most recent checkpoint and hence guarantee the least amount of recomputation after recovery. This also means that a process needs to save only its most recent local checkpoint. The proposed value of the common checkpointing interval enables an initiator process to log the minimum number of messages sent by each application process. The message complexity of both the proposed checkpointing algorithm and the recovery approach is O(n).
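For readers unfamiliar with the terms, the sketch below illustrates the standard textbook distinction between orphan and lost messages relative to a set of checkpoints. It is not the algorithm proposed in the thesis; the `Message` type, the `classify` function, and the use of local event counters as checkpoint positions are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message record; times are local event counters."""
    sender: str
    receiver: str
    send_time: int  # sender's local counter when the message was sent
    recv_time: int  # receiver's local counter when it was received


def classify(msg, checkpoint_time):
    """Classify a message relative to each process's latest checkpoint.

    checkpoint_time maps a process name to the local counter value at which
    that process took its most recent checkpoint (an assumed representation).
    """
    send_saved = msg.send_time <= checkpoint_time[msg.sender]
    recv_saved = msg.recv_time <= checkpoint_time[msg.receiver]
    if recv_saved and not send_saved:
        # The receive survives rollback but the send is undone.
        return "orphan"
    if send_saved and not recv_saved:
        # The send survives rollback but the receive is undone.
        return "lost"
    return "consistent"


# Illustrative data only: both processes checkpoint at local counter 5.
checkpoints = {"P1": 5, "P2": 5}
orphan_msg = Message("P1", "P2", send_time=6, recv_time=4)
lost_msg = Message("P1", "P2", send_time=3, recv_time=7)
ok_msg = Message("P2", "P1", send_time=2, recv_time=4)

for m in (orphan_msg, lost_msg, ok_msg):
    print(m.sender, "->", m.receiver, classify(m, checkpoints))
```

Under these assumptions, a recovery scheme has to avoid or discard orphan messages and replay lost ones (for example from a message log) so that the restarted processes resume from a consistent global state.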