DataSys: Data-Intensive Distributed Systems Laboratory

Illinois Institute of Technology
Department of Computer Science


7th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) 2014

Co-located with Supercomputing/SC 2014
In cooperation with ACM SIGHPC 
New Orleans, Louisiana -- November 16th, 2014

News

Overview

The 7th workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of large-scale many-task computing (MTC) applications on large-scale clusters, clouds, grids, and supercomputers. MTC, the theme of the workshop, encompasses loosely coupled applications, which are generally composed of many tasks to achieve some larger application goal. This workshop will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel file-system contention and scalability, data management, I/O management, reliability at scale, and application scalability. We welcome paper submissions on theoretical, simulation, and systems topics, with special consideration given to papers addressing the intersection of petascale/exascale challenges with large-scale cloud computing. We invite the submission of original research work of 6 pages.

Scope

The advent of computation can be compared, in terms of the breadth and depth of its impact on research and scholarship, to the invention of writing and the development of modern mathematics. Scientific Computing has already begun to change how science is done, enabling scientific breakthroughs through new kinds of experiments that would have been impossible only a decade ago.  As computing becomes a pervasive part of the scientific process, there is a great opportunity to make powerful computing techniques, previously reserved for projects with only the largest investments, available to a broad scientific community.

The massive increase in concurrency provided by modern hardware presents a challenge to scientific applications with large existing investments in previously developed software and limited ability to redesign from scratch using the latest programming models. Many-task computing (MTC) studies technologies, simple and advanced, to rapidly compose highly scalable applications from existing sequential codes. MTC encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal. Growing from the successes of Globus, Condor, and national-scale grid computing infrastructures, MTC techniques have been deployed on many systems from single many-core systems (leveraging GPGPUs and Intel MIC accelerators), to the largest multi-petascale high-performance computing (HPC) systems. The development and deployment of these MTC systems have expanded the utility of the underlying technologies and fed back to improve the performance and usability of the technologies themselves. Similarly, technologies developed for cloud computing (including MapReduce-based models) can provide additional connections and innovations in computing techniques.  MTAGS is a unique venue to promote HPC-related concepts to the broader scientific and cloud computing communities.

We are entering a “big data” era, as advances in networking, instrumentation, simulation technologies, Internet computing and social networks are producing data at an unprecedented rate. The collection, storage, analysis and sharing of this data is thus one of the greatest challenges of the 21st century. Support for data-intensive computing is critical to advancing modern science, as the gap between the capacity and the bandwidth of storage systems has grown by more than 10-fold over the last decade. There is an emerging need for advanced techniques to manipulate, visualize and interpret large datasets. While such techniques are commonly associated with Hadoop and related systems, technologies from HPC and MTC are also applicable. This provides an opportunity to exchange large-scale data management technologies between scientific applications and industrial techniques, which is another emphasis of MTAGS.

Scientific Computing is the key to many domains' "holy grail" of new knowledge, and comes in many shapes and forms.  Exchange of ideas from HPC, MTC, and cloud communities is a critical path to the adoption of advanced techniques to best utilize emerging, highly concurrent systems.  Underlying techniques for concurrency and data processing originating in the HPC space must be delivered to the broader community to promote future investment in HPC research programs and, more generally, advance scientific investigations.

The 7th workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS14) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of large-scale many-task computing (MTC) applications on large-scale clusters, grids, supercomputers, and cloud computing infrastructure. This workshop will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel/distributed file system contention and scalability, data management, I/O management, reliability at scale, and application scalability. This workshop encourages interaction and cross-pollination between those developing applications, algorithms, software, hardware and networking, emphasizing many-task computing for large-scale distributed systems. We believe the workshop will be an excellent place to help the community define the current state of the art, determine future goals, and define architectures and services for future high-end computing infrastructure.

For more information about the workshop series, see http://datasys.cs.iit.edu/events/MTAGS. For this year’s workshop, see http://datasys.cs.iit.edu/events/MTAGS14. For last year’s workshop program agenda, accepted papers, and presentations, see http://datasys.cs.iit.edu/events/MTAGS13. For earlier workshops, see http://datasys.cs.iit.edu/events/MTAGS12, http://datasys.cs.iit.edu/events/MTAGS11/, http://datasys.cs.iit.edu/events/MTAGS10, http://datasys.cs.iit.edu/events/MTAGS09 and http://datasys.cs.iit.edu/events/MTAGS08. We also ran a special issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS), which appeared in June 2011 and can be found at http://datasys.cs.iit.edu/events/TPDS_MTC; the proceedings can be found online at http://www.computer.org/portal/web/csdl/abs/trans/td/2011/06/ttd201106toc.htm. We, the workshop organizers, also published two papers that are highly relevant to this workshop. The first, “Toward Loosely Coupled Programming on Petascale Systems”, was published in SC08; the second, “Many-Task Computing for Grids and Supercomputers”, was published in MTAGS08. Both have been highly cited, with 111 and 184 citations respectively.

Topics

We invite the submission of original work that is related to the topics below. The papers should be 6 pages, including all figures and references. We aim to cover topics related to Many-Task Computing on each of the three major distributed systems paradigms: Cloud Computing, Grid Computing and Supercomputing. Topics of interest include:

Compute resource management

o   Scheduling

o   Job execution frameworks

o   Local resource manager extensions

o   Performance evaluation of resource managers in use on large scale systems

o   Dynamic resource provisioning

o   Techniques to manage extreme concurrency and accelerators

o   Challenges and opportunities in running many-task workloads on HPC systems

o   Challenges and opportunities in running many-task workloads on Cloud Computing infrastructure

Storage architectures and implementations

o   Distributed file systems

o   Parallel file systems

o   Distributed metadata management

o   Content distribution systems for large data

o   Data caching frameworks and techniques

o   Data management within and across data centers

o   Data-aware scheduling

o   Data-intensive computing applications

o   Eventual-consistency storage usage and management

Programming models and tools

o   MapReduce, its generalizations, and implementations

o   Many-task computing middleware and applications

o   Parallel programming frameworks

o   Ensemble MPI

o   Service-oriented science applications

Large-scale workflow systems

o   Workflow system performance and scalability analysis

o   Scalability of workflow systems

o   Workflow infrastructure and e-Science middleware

o   Programming paradigms and models

Large-scale many-task applications

o   High-throughput computing (HTC) applications

o   Data-intensive applications

o   Quasi-supercomputing applications, deployments, and experiences

o   Application coupling, integration, and composition

o   Algorithms for many-task applications: Monte Carlo, parameter sweep/search, uncertainty quantification

o   Performance evaluation

Performance evaluation

o   Theoretical vs. real systems

o   Simulations

o   Reliability and fault tolerance of large systems

Important Dates

Paper Submission

Authors are invited to submit papers with unpublished, original work of not more than 6 pages of double-column text using single-spaced 10 point size on 8.5 x 11 inch pages, as per ACM 8.5 x 11 manuscript guidelines; document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates. The final 6-page papers in PDF format must be submitted online at https://cmt.research.microsoft.com/MTAGS2014/ before the deadline of September 8th, 2014 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (in cooperation with SIGHPC). Notifications of the paper decisions will be sent out by September 29th, 2014. Accepted workshop papers will be eligible for additional post-conference publication as journal articles in the IEEE Transactions on Cloud Computing, Special Issue on Many-Task Computing in the Cloud (papers will be due in February 2015). Submission implies the willingness of at least one of the authors to register and present the paper. For more information, please see http://datasys.cs.iit.edu/events/MTAGS14/.

Organization

General Chairs

Steering Committee

Program Committee

Sponsors

IITCSLUChicagoCI/UChicago

Graduate College