Migration and Rollback Transparency for Arbitrary Distributed Applications in Workstation Clusters

Stefan Petri, Matthias Bolz, Horst Langendörfer

Abstract

Programmers and users of compute-intensive scientific applications often do not want to (or are even unable to) code load balancing and fault tolerance into their programs. The BEAM system [18] uses a global virtual name space to provide migration and rollback transparency in user space for distributed groups of processes on workstations. System calls are interposed and their parameters translated between the name spaces. Unlike other migration mechanisms, BEAM does not require applications to be written for a specific programming model or communication library. In this paper we describe the design and implementation of a separate system call interposition process [3] that accesses the application via the debugging interface. The main advantage of this approach is that it can handle even unmodified (e.g. commercially bought) application programs. We compare measured performance figures with those of previous similar approaches [15, 20].
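
The name-space translation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `NameSpaceMap`, the function `interposed_kill`, and the choice of process IDs as the translated parameter are all assumptions made for illustration. The application only ever sees a stable virtual ID, while the interposition layer substitutes the current real ID before forwarding the call.

```python
class NameSpaceMap:
    """Hypothetical two-way map between virtual and real process IDs."""

    def __init__(self):
        self._virt_to_real = {}
        self._real_to_virt = {}

    def bind(self, virtual_pid, real_pid):
        # Drop any stale binding for this virtual PID (e.g. after a migration).
        old_real = self._virt_to_real.pop(virtual_pid, None)
        if old_real is not None:
            del self._real_to_virt[old_real]
        self._virt_to_real[virtual_pid] = real_pid
        self._real_to_virt[real_pid] = virtual_pid

    def to_real(self, virtual_pid):
        return self._virt_to_real[virtual_pid]

    def to_virtual(self, real_pid):
        return self._real_to_virt[real_pid]


def interposed_kill(names, real_kill, virtual_pid, signal_number):
    """Translate the PID argument, then forward to the real call.

    `real_kill` stands in for the underlying system call (assumption);
    passing it as a parameter keeps the sketch testable without signals.
    """
    return real_kill(names.to_real(virtual_pid), signal_number)
```

Rebinding a virtual ID after a migration leaves the application's view unchanged, which is the sense in which such a mapping can make migration transparent.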

Original language: English
Title of host publication: IPPS 1998: Parallel and Distributed Processing
Number of pages: 12
Volume: 1388
Publisher: Springer Verlag
Publication date: 01.01.1998
Pages: 159-170
ISBN (Print): 978-3-540-64359-3
ISBN (Electronic): 978-3-540-69756-5
Publication status: Published - 01.01.1998
Externally published: Yes
Event: 10 Workshops held in conjunction with 12th International Parallel Symposium and 9th Symposium on Parallel and Distributed Processing - Orlando, United States
Duration: 30.03.1998 - 03.04.1998
Conference number: 146019