On the Use of Cloud Computing for Scientific Workflows
Christina Hoffa (1), Gaurang Mehta (2), Timothy Freeman (3), Ewa Deelman (2), Kate Keahey (3),
Bruce Berriman (4), John Good (4)
(1) Indiana University, (2) University of Southern California, (3) Argonne National Laboratory, (4) Caltech
Abstract
This paper explores the use of cloud computing for
scientific workflows, focusing on Montage, a widely used
astronomy application. We evaluate, from the point of view
of a scientific workflow, the tradeoffs between running in a
local environment, where one is available, and running in a
virtual environment accessed remotely over a wide-area
network. Our results show that for Montage, a workflow with
short job runtimes, the virtual environment can provide good
compute-time performance but can suffer from resource
scheduling delays and the cost of wide-area communication.
1. Introduction
Recently, cloud computing [1, 2] has attracted growing
attention as a possible solution for providing a flexible,
on-demand computing infrastructure for a number of
applications. Clouds are being explored as a solution to
some of the problems with Grid computing, but the
differences between cloud computing and the Grid are
often so blurred that the two become
indistinguishable. The term “Grid”
computing was coined in the early 1990s to liken a
distributed computing infrastructure to the electrical power
grid [3]. Like the electrical power grid, a computational
Grid uses resources that may be geographically far
apart. These resources can be allotted to combinations
of one or more groups of users, with the owners of the
resources deciding when and to whom they should be
allotted. In this manner, collaborations can integrate pools
of resources to give supercomputer-class capability to their
users.
Just as resources can be spread out geographically, so too
can the members of a virtual organization. A virtual
organization is a group of individuals or institutions who
sh