DataSys: Data-Intensive Distributed Systems Laboratory

Illinois Institute of Technology
Department of Computer Science


IEEE Transactions on Cloud Computing

Special Issue on Many-Task Computing in the Cloud

Abstract

The Special Issue on Many-Task Computing (MTC) in the Cloud will provide the scientific community with a dedicated forum, within the prestigious IEEE Transactions on Cloud Computing journal, for presenting new research, development, and deployment efforts of loosely coupled large-scale applications on Cloud Computing infrastructure. MTC, the theme of this special issue, encompasses loosely coupled applications, which are generally composed of many tasks to achieve some larger application goal. This special issue will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel file-system contention and scalability, data management, I/O management, reliability at scale, and application scalability. We welcome paper submissions on theoretical, simulation, and systems topics, with special consideration given to papers addressing the intersection of petascale/exascale challenges with large-scale cloud computing. We seek submissions of papers that present new, original and innovative ideas for the "first" time in TCC (Transactions on Cloud Computing). That is, submission of "extended versions" of already published works (e.g., conference/workshop papers) is not encouraged unless they contain a significant number of "new and original" ideas/contributions along with more than 49% brand "new" material. For more information on this special issue, please see http://datasys.cs.iit.edu/events/TCC-MTC15/.

Special Issue Overview

The advent of computation can be compared, in terms of the breadth and depth of its impact on research and scholarship, to the invention of writing and the development of modern mathematics. Scientific Computing has already begun to change how science is done, enabling scientific breakthroughs through new kinds of experiments that would have been impossible only a decade ago.  As computing becomes a pervasive part of the scientific process, there is a great opportunity to make powerful computing techniques, previously reserved for projects with only the largest investments, available to a broad scientific community.

The massive increase in concurrency provided by modern hardware presents a challenge to scientific applications with large existing investments in previously developed software and limited ability to redesign from scratch using the latest programming models. Many-task computing (MTC) studies technologies, simple and advanced, to rapidly compose highly scalable applications from existing sequential codes. MTC encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal. Growing from the successes of Globus, Condor, and national-scale grid computing infrastructures, MTC techniques have been deployed on many systems from single many-core systems (leveraging GPGPUs and Intel MIC accelerators), to the largest multi-petascale high-performance computing (HPC) systems. The development and deployment of these MTC systems have expanded the utility of the underlying technologies and fed back to improve the performance and usability of the technologies themselves. Similarly, technologies developed for cloud computing (including MapReduce-based models) can provide additional connections and innovations in computing techniques.  This special issue is a unique venue to promote HPC-related concepts to the broader scientific and cloud computing communities.

We are entering a “big data” era, as advances in networking, instrumentation, simulation technologies, Internet computing and social networks are producing data at an unprecedented rate. The collection, storage, analysis and sharing of this data are thus among the greatest challenges of the 21st century. Support for data-intensive computing is critical to advancing modern science, as the gap between the capacity and the bandwidth of storage systems has widened by more than 10-fold over the last decade. There is an emerging need for advanced techniques to manipulate, visualize and interpret large datasets. While large-scale data processing is commonly associated with Hadoop and related systems, technologies from HPC and MTC are also applicable. This provides an opportunity to exchange large-scale data management technologies between scientific applications and industrial techniques, which is another emphasis of this special issue.

Scientific Computing is the key to many domains' "holy grail" of new knowledge, and comes in many shapes and forms.  Exchange of ideas from HPC, MTC, and cloud communities is a critical path to the adoption of advanced techniques to best utilize emerging, highly concurrent systems.  Underlying techniques for concurrency and data processing originating in the HPC space must be delivered to the broader community to promote future investment in HPC research programs and, more generally, advance scientific investigations.

The Special Issue on Many-Task Computing (MTC) in the Cloud will provide the scientific community with a dedicated forum, within the prestigious IEEE Transactions on Cloud Computing journal, for presenting new research, development, and deployment efforts of loosely coupled large-scale applications on Cloud Computing infrastructure. MTC, the theme of this special issue, encompasses loosely coupled applications, which are generally composed of many tasks to achieve some larger application goal. This special issue will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel file-system contention and scalability, data management, I/O management, reliability at scale, and application scalability. This special issue encourages interaction and cross-pollination between those developing applications, algorithms, software, hardware and networking, emphasizing many-task computing for large-scale distributed systems. We believe this special issue will be an excellent place to help the community define the current state of the art, determine future goals, and define architectures and services for future cloud computing infrastructure.

The guest editors of this special issue have been running a workshop series called “Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS)” since 2008, at the IEEE/ACM Supercomputing/SC conference. For more information about the workshop series, see http://datasys.cs.iit.edu/events/MTAGS. For this year’s workshop, see http://datasys.cs.iit.edu/events/MTAGS14. For last year’s workshop program agenda, accepted papers and presentations, please see http://datasys.cs.iit.edu/events/MTAGS13. For the workshops of prior years, please see http://datasys.cs.iit.edu/events/MTAGS12, http://datasys.cs.iit.edu/events/MTAGS11/, http://datasys.cs.iit.edu/events/MTAGS10, http://datasys.cs.iit.edu/events/MTAGS09 and http://datasys.cs.iit.edu/events/MTAGS08. We also ran a special issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS), which appeared in June 2011 and can be found at http://datasys.cs.iit.edu/events/TPDS_MTC; the proceedings can be found online at http://www.computer.org/portal/web/csdl/abs/trans/td/2011/06/ttd201106toc.htm. We, the workshop organizers, also published two papers that are highly relevant to this special issue. The first paper, titled “Toward Loosely Coupled Programming on Petascale Systems”, was published in SC08; the second, titled “Many-Task Computing for Grids and Supercomputers”, was published in MTAGS08. Both have been highly cited, with 111 and 184 citations respectively.

Topics

We seek submissions of papers that present new, original and innovative ideas for the "first" time in TCC (Transactions on Cloud Computing). That is, submission of "extended versions" of already published works (e.g., conference/workshop papers) will only be encouraged if they contain a significant number of "new and original" ideas/contributions along with more than 49% brand "new" material. TCC expects submissions to be complete in all respects, including author names, affiliations, bios, etc. Manuscripts should be 14 double-column pages (all regular paper page limits include references and author biographies). We aim to cover topics related to Many-Task Computing and Cloud Computing. Topics of interest include:

Compute resource management

o   Scheduling

o   Job execution frameworks

o   Local resource manager extensions

o   Performance evaluation of resource managers in use on large scale systems

o   Dynamic resource provisioning

o   Techniques to manage extreme concurrency and accelerators

o   Challenges and opportunities in running many-task workloads on Cloud Computing infrastructure

Storage architectures and implementations

o   Distributed file systems

o   Parallel file systems

o   Distributed metadata management

o   Content distribution systems for large data

o   Data caching frameworks and techniques

o   Data management within and across data centers

o   Data-aware scheduling

o   Data-intensive computing applications

o   Eventual-consistency storage usage and management

Programming models and tools

o   MapReduce, its generalizations, and implementations

o   Many-task computing middleware and applications

o   Parallel programming frameworks

o   Ensemble MPI

o   Service-oriented science applications

Large-scale workflow systems

o   Workflow system performance and scalability analysis

o   Scalability of workflow systems

o   Workflow infrastructure and e-Science middleware

o   Programming paradigms and models

Large-scale many-task applications

o   High-throughput computing (HTC) applications

o   Data-intensive applications

o   Quasi-supercomputing applications, deployments, and experiences

o   Application coupling, integration, and composition

o   Algorithms for many-task applications: Monte Carlo, parameter sweep/search, uncertainty quantification

o   Performance evaluation

Performance evaluation

o   Theoretical vs. real systems

o   Simulations

o   Reliability and fault tolerance of large systems

Important Dates

Paper Submission

Authors are invited to submit unpublished and original work to the IEEE Transactions on Cloud Computing (TCC), Special Issue on Many-Task Computing in the Cloud. If the paper is extended from an initial work, the submission must contain at least 50% new material that qualifies as “brand” new ideas and results. The paper must be in the IEEE TCC format, namely 14 double-column pages or 30 single-column pages (Note: All regular paper page limits include references and author biographies). Please note that the double-column format will translate more readily into the final publication format. A double-column page is defined as a 7.875”×10.75” page with 10-point type, 12-point vertical spacing, and 0.5-inch margins. A single-column page is defined as an 8.5”×11” page with 12-point type and 24-point vertical spacing, containing approximately 250 words. All of the margins should be one inch (top, bottom, right and left). These length limits take into account reasonably sized figures and references. Papers must be submitted using the submission system at https://mc.manuscriptcentral.com/tcc-cs, by selecting the special issue option “SI-MTC”. For more information, please see http://datasys.cs.iit.edu/events/TCC-MTC15/. Further questions can be emailed to the editors at mtc15-tcc-editors@datasys.cs.iit.edu.

Organization

Guest Editors (mtc15-tcc-editors@datasys.cs.iit.edu)