CAP Research Project: Checkpoint/restart for Solaris and Linux


Abstract

Checkpointing means saving the state of a running process to permanent storage, so that the process can later be resumed from the point in its execution at which it was checkpointed. While checkpoint-and-resume is long established on mainframe systems, implementing it under Unix-like operating systems is somewhat involved. Checkpointing has obvious applications to fault tolerance, and is also useful for implementing process migration, which may be used for load balancing.

This project has investigated checkpointing under Solaris and Linux. We now have a working prototype which can checkpoint and resume a limited but non-trivial range of processes.



Objectives

The main objective of this project has been to investigate and develop methods for checkpointing processes under Solaris. We wish the checkpointing to be implemented without modification of the Solaris kernel, and without requiring modification to the executables of programs to be checkpointed.

Ideally we wish the checkpointing system to be able to save and restore the following:

In practice, we aim to implement some of the above, and to estimate the difficulty of the remainder. In particular, we wish to determine what constraints must be placed on programs for it to be possible or practical to checkpoint them.

Status

We have built a working prototype checkpointing system, called esky, which works under both Solaris 2.6 and Linux 2.2. It satisfies the basic conditions we require, namely, that it requires no kernel modifications and no modifications to executables.

esky uses a monitor program to manage the checkpointing procedure. The monitor manipulates the target process by intercepting calls to various C library functions. It does this by setting the LD_PRELOAD environment variable to the name of a shared library which the run-time linker will load before the standard C library. Hence checkpointed programs must be dynamically linked.
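The launch step described above can be sketched in C. This is a minimal illustration and not esky's actual code: the function name run_with_preload and the library path in the usage note are hypothetical. It assumes only that the monitor forks, sets LD_PRELOAD in the child's environment, and then executes the unmodified target.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of esky): run argv[0] with LD_PRELOAD set
 * to preload_lib, wait for it, and return its exit status (-1 on error or
 * if the child was killed by a signal).
 */
int run_with_preload(const char *preload_lib, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only the target's environment is changed, so the
         * run-time linker loads preload_lib before the C library. */
        if (setenv("LD_PRELOAD", preload_lib, 1) != 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A monitor might call this as run_with_preload("./libpreload-demo.so", argv), where the library name is again only a placeholder. Because the run-time linker ignores LD_PRELOAD for statically linked executables, this approach works only for dynamically linked targets, which matches the restriction stated above.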

esky is currently capable of saving and restoring the following aspects of a process's state in addition to the basic memory image and CPU state:

esky is not currently capable of checkpointing fork()-parallel programs or multithreaded programs. For open files and shared mmap()s, esky assumes that the contents of the opened or mapped file do not change between checkpoint and resume time. esky cannot restore opened files or memory mappings if the file in question has been unlinked between the time of opening or mapping and the time of restore. In particular, it is unable to handle the case where a program makes a shared mapping to a file which it then unlinks, a technique quite commonly used to implement shared memory between processes.
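The map-then-unlink idiom mentioned above, and one way a checkpointer could detect it, can be sketched as follows. This is an illustration only and not taken from esky: the function names and the temporary file name are hypothetical, and the check assumes a local filesystem on which fstat() reports a link count of zero once the last directory entry has been removed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * The idiom: create a file, unlink it at once, and keep using it through
 * a shared mapping. Returns the mapping (MAP_FAILED on error) and stores
 * the still-open descriptor in *fd_out.
 */
void *map_unlinked_file(size_t len, int *fd_out)
{
    assert(fd_out != NULL);

    char path[] = "/tmp/shm-demo-XXXXXX";    /* hypothetical name */
    int fd = mkstemp(path);
    if (fd < 0)
        return MAP_FAILED;
    unlink(path);                /* no pathname refers to the file now */

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return MAP_FAILED;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    *fd_out = fd;
    return p;
}

/*
 * Returns 1 if the file behind fd has no remaining directory entries,
 * 0 if it is still linked, and -1 if fstat() fails.
 */
int fd_is_unlinked(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    return st.st_nlink == 0;
}
```

A file in this state cannot be reopened by name at resume time, so a checkpointer would have to either refuse to checkpoint the process or save the file contents itself.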

The current version of esky (version 0.1.1) was released on November 1, 1999 under the GNU Lesser General Public License. It has a web page from which the source code can be downloaded.

Plans and Prospects

The most important features missing from esky are support for fork()-parallel programs and support for multithreaded programs. Implementing support for multiprocess programs is expected to be a substantial but essentially straightforward programming effort.

Support for multithreaded programs under Solaris is more complicated. The interface to the Solaris kernel's lightweight process (LWP) concept makes resuming a multithreaded process with the correct LWP IDs somewhat difficult. In addition, the basic method for checkpointing a multithreaded process is more complex, since some actions (such as saving the CPU state) must be performed on every LWP, whereas others (such as saving memory contents) must be performed only once for each (heavyweight) process. The problems here are significant but should not be insurmountable. Full support for Linux clone()-based threads is more difficult still. However, minor modifications to the code supporting fork()-parallel and Solaris multithreaded programs should be able to handle the most common ways of using clone().

In addition there are a number of smaller enhancements which could be made to esky. These include:

A more detailed document on esky is currently being written. It covers the capabilities and implementation of esky, and will also describe possible future extensions in more detail. An incomplete draft of this report is included in the esky package available for download. Once completed, the document is intended to become an ANU DCS technical report.

Contact: David Gibson