Embarrassingly Parallel Jobs Are Not Embarrassingly Easy to Schedule on the Grid

Enis Afgan, Department of Computer and Information Sciences, University of Alabama at Birmingham

Purushotham Bangalore, Department of Computer and Information Sciences, University of Alabama at Birmingham

Location and Time: November 17th, 2008, Room 11AB, 11:35 AM

Abstract

Embarrassingly parallel applications represent an important workload in today's grid environments. On homogeneous clusters, scheduling and executing this class of applications is considered a mostly trivial and well-understood process. However, while grid environments provide the necessary computational resources, the associated resource heterogeneity poses a new challenge for executing the tasks of such applications efficiently across multiple resources. This paper presents a set of examples illustrating how the execution characteristics of individual tasks, and consequently of a job, are affected by the choice of task execution resources, task invocation parameters, and task input data attributes. The aim of this work is to highlight this relationship between an application and an execution resource in order to promote the development of better metascheduling techniques for the grid. Exploiting this relationship can maximize application throughput and, in turn, increase resource utilization. To achieve these benefits, a set of job scheduling and execution concerns is derived, leading toward a computational pipeline for scheduling embarrassingly parallel applications in grid environments.

Links: [paper, slides]