Algorithm-Based Fault-tolerant Programming in Scientific Computation on Multiprocessors
- Altmann, Jorn; Bohm, A.
- IEEE PDP1995
- Efficient parallel algorithms proposed for many fundamental problems in scientific computation are sensitive to processor failures. Because of its low cost, algorithm-based fault tolerance is an attractive way to introduce fault tolerance into existing multiprocessors. To facilitate fault-tolerant programming in scientific computation, we have modified and further developed an existing parallel run-time environment. This paper primarily examines how known error-processing techniques can be tuned to the algorithm-based approach. We also study implementation design issues and the execution-time overhead of a fault-tolerant application in our run-time environment. In contrast to many other environments for parallel fault-tolerant programming, which use the master/slave programming model, our environment makes it possible to add fault tolerance to existing parallel applications in scientific computation.
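The abstract names algorithm-based fault tolerance without spelling out a concrete scheme. As an illustration of the general idea only, the sketch below uses the classic checksum-encoded matrix product (usually attributed to Huang and Abraham), which may or may not be what the paper's run-time environment implements. All function names are hypothetical, the code is sequential rather than parallel, and it assumes at most one corrupted data element in the result.

```python
def with_column_checksum(a):
    """Append a row that holds the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def locate_error(c_full, tol=1e-9):
    """Return (row, col) of a single corrupted data element, or None."""
    n = len(c_full) - 1       # number of data rows
    m = len(c_full[0]) - 1    # number of data columns
    bad_rows = [i for i in range(n)
                if abs(sum(c_full[i][:m]) - c_full[i][m]) > tol]
    bad_cols = [j for j in range(m)
                if abs(sum(c_full[i][j] for i in range(n)) - c_full[n][j]) > tol]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error pattern is not a single corrupted data element")


def correct(c_full, pos):
    """Rebuild the corrupted element from its row checksum."""
    i, j = pos
    m = len(c_full[0]) - 1
    others = sum(c_full[i][k] for k in range(m) if k != j)
    c_full[i][j] = c_full[i][m] - others


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# The product of the encoded operands carries both checksums with it.
c_full = matmul(with_column_checksum(a), with_row_checksum(b))

c_full[1][0] += 7             # simulate a fault in one result element
pos = locate_error(c_full)    # the failing row and column checks intersect here
correct(c_full, pos)
```

The low cost mentioned in the abstract shows up here as one extra row and one extra column of arithmetic, rather than a full duplicate computation. In a multiprocessor setting the checksum checks would typically run on a different processor from the one that produced the data being checked; that placement is not modelled in this sketch.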