A recovery mechanism for errors caused by a late subjob in a system handling SLA-based Grid workflows

Authors

Quan, Dang Minh; Altmann, Jorn

Issue Date
2008-05
Publisher
Inderscience
Citation
Int. J. Web and Grid Services, Vol.4, No.1, pp.35-62
Keywords
Grid computing; Service Level Agreement; SLA; Grid-based workflow; error recovery
Abstract
Supporting SLAs (Service Level Agreements) for Grid-based
workflows requires mechanisms for handling errors (i.e., the
failures of subjobs). In this paper, we propose an error
recovery mechanism that can handle one failed subjob of a workflow. The
mechanism has a maximum of three phases, depending on the
impact of the error. In each phase, a dedicated algorithm remaps
the subjobs of the workflow to the resources. The main contributions of the
paper are the error recovery mechanism for SLA-based workflows and
the mapping algorithm G-map, which is used in the first phase of the recovery
mechanism. G-map remaps the groups of subjobs that are directly
affected by an error. The efficiency of the proposed algorithm is validated
through simulation results.
ISSN
1741-1106 (print)
1741-1114 (online)
Language
English
URI
https://hdl.handle.net/10371/6766
DOI
https://doi.org/10.1504/IJWGS.2008.018493
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
