A computational grid is a network of computing resources that work together
as a single, uniform operating environment, built using grid computing
technology. It can be viewed as a virtual supercomputer designed for
large-scale applications. One important characteristic of these applications is
that they are no longer developed as monolithic, single-executable codes, but
instead incorporate multiple interdependent computational modules. The execution
of these applications involves the concurrent and sequential execution of
multiple modules in a predefined order, and the automatic and timely data
transfer between modules. These applications are often referred to as scientific
workflow applications.
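As an illustration of the execution model described above, a workflow can be sketched as a dependency graph whose modules are grouped into stages: modules within a stage may run concurrently, while stages run sequentially. The module names and the `execution_stages` helper below are hypothetical and are not taken from the system described here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_stages(dependencies):
    """Group workflow modules into stages that respect their dependencies.

    `dependencies` maps each module to the modules it depends on.
    Modules in the same stage have no unmet dependencies and could run
    concurrently; the stages themselves must run one after another.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the workflow has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


# Hypothetical workflow: two simulations depend on one preprocessing step.
workflow = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "merge": {"simulate_a", "simulate_b"},
    "visualize": {"merge"},
}
stages = execution_stages(workflow)
```

Here `simulate_a` and `simulate_b` fall into the same stage, which is where concurrent execution becomes possible, while the data transfer between modules would occur at the stage boundaries.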
A key issue in executing a scientific workflow application on
computational grids is how to map and schedule workflow modules onto multiple
distributed resources and handle module dependencies in a timely manner, so as
to deliver the performance users expect. The goal of this research is to develop a
workflow system to address the issue of workflow scheduling in computational
grid environments. In our work, we have developed a grid workflow description
language that supports resource request specification, a capability lacking in
current related efforts. An integrated workflow scheduling
architecture has been defined that provides the capabilities of workflow
execution planning, resource allocation and execution coordination. Our workflow
scheduler applies advanced scheduling techniques, such as planning, resource
reservation and performance prediction, in the resource allocation process. The
simulation results show that our workflow scheduler reduces the workflow
execution time by about 20% on average under moderate to high resource load,
compared to the scheduling policies used in most current workflow systems.
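To make the idea of planning with performance predictions concrete, the following is a minimal sketch of a generic earliest-finish-time heuristic. It is not the scheduler evaluated in this work. The function `plan_schedule`, the resource names, the predicted runtimes, and the assumption that each resource runs one module at a time are all illustrative. Data transfer time between resources is ignored.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_schedule(dependencies, predicted_runtime, resource_free_at):
    """Plan a workflow by assigning each module to the resource with the
    earliest predicted finish time (a simple greedy heuristic).

    dependencies:      module -> list of modules it depends on
    predicted_runtime: module -> {resource: predicted runtime}
    resource_free_at:  resource -> time at which it becomes available
    Returns module -> (resource, start_time, finish_time).
    """
    free_at = dict(resource_free_at)  # copy so the caller's data is untouched
    plan = {}
    for module in TopologicalSorter(dependencies).static_order():
        # A module cannot start before all of its dependencies have finished.
        ready = max((plan[d][2] for d in dependencies.get(module, ())),
                    default=0.0)
        best = None
        for resource, runtime in predicted_runtime[module].items():
            start = max(ready, free_at[resource])
            finish = start + runtime
            if best is None or finish < best[2]:
                best = (resource, start, finish)
        plan[module] = best
        free_at[best[0]] = best[2]  # hold the resource until the module ends
    return plan


# Hypothetical inputs: two resources, one of them busy until t = 2.
dependencies = {
    "prepare": [],
    "model_a": ["prepare"],
    "model_b": ["prepare"],
    "combine": ["model_a", "model_b"],
}
predicted_runtime = {
    "prepare": {"cluster_x": 4.0, "cluster_y": 6.0},
    "model_a": {"cluster_x": 10.0, "cluster_y": 12.0},
    "model_b": {"cluster_x": 8.0, "cluster_y": 9.0},
    "combine": {"cluster_x": 3.0, "cluster_y": 5.0},
}
resource_free_at = {"cluster_x": 0.0, "cluster_y": 2.0}

plan = plan_schedule(dependencies, predicted_runtime, resource_free_at)
```

Because the whole plan is computed before execution begins, the resulting start and finish times are the kind of information a resource reservation request would need.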