ViGs: A Grid Simulation and Monitoring Tool for ATLAS Workflows

Aaron T. Thor, Department of Computer Science and Engineering, University of Texas at Arlington

Gergely V. Záruba, Department of Computer Science and Engineering, University of Texas at Arlington

David Levine, Department of Computer Science and Engineering, University of Texas at Arlington

Kaushik De, Department of Physics, University of Texas at Arlington

Torre J. Wenaus, Brookhaven National Lab

Location and Time: November 17th, 2008, Room 11AB, 11:10AM

Abstract

With the recent success in transmitting the first beam through the Large Hadron Collider (LHC), the experiments will soon begin generating vast amounts of data. The data to be processed will be enormous, averaging 15 petabytes per year, and will be analyzed by 100,000 to 200,000 jobs per day [1]. These jobs must be scheduled, processed, and managed on computers distributed across many countries. The ability to construct computer clusters on such a virtually unbounded scale increases throughput, removes the barrier of a single computing architecture and operating system, allows jobs to be processed across administrative boundaries, and encourages collaboration. To date, large-scale grids have mostly been set up by building experimental medium-sized clusters and testing them by trial and error. This is not only arduous but also economically inefficient. Moreover, because the performance of a grid computing architecture is closely tied to the networking infrastructure across the entire virtual organization (VO), such trial-and-error approaches will not provide representative data. A simulation environment, on the other hand, may be ideal for this evaluation, since virtually all factors within a simulated VO can easily be modified. We therefore introduce the "Virtual Grid Simulator" (ViGs), a large-scale grid environment simulator developed to study the performance, behavior, and scalability of a working grid environment while accounting for the needs of the underlying networking infrastructure.
