DataSys: Data-Intensive Distributed Systems Laboratory

Illinois Institute of Technology
Department of Computer Science


6th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) 2013

Co-located with Supercomputing/SC 2013
Denver, Colorado -- November 17th, 2013

Panel: Many-Task Computing meets Big Data

Location: Room 502
Sunday, November 17th, 2013
4PM - 5:15PM

Slides   

Panel Abstract   

 

Applications and experiments in all areas of science are becoming increasingly complex and more demanding in terms of their computational and data requirements. Some applications generate data volumes reaching hundreds of terabytes and even petabytes. As scientific applications become more data-intensive, the management of data resources and of dataflow between storage and compute resources is becoming the main bottleneck. Analyzing, visualizing, and disseminating these large data sets has become a major challenge, and data-intensive computing is now considered the "fourth paradigm" in scientific discovery, after theoretical, experimental, and computational science. Some call this data-intensive computing, while others call it Big Data.

Many-Task Computing (MTC) is a computing model that aims to bridge the gap between high-performance computing (HPC) and high-throughput computing (HTC). MTC denotes high-performance computations comprising multiple distinct activities, coupled via data movement operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. The dataflow-focused nature of many MTC-optimized systems makes them a good fit for many of the challenges brought forward by Big Data. This panel will discuss how many-task computing can address the data-intensive computing challenges of today and tomorrow at increasingly large scales.

 

Panelists

Dr. Robert Grossman

Professor and Director, Division of Biological Sciences & Computation Institute, University of Chicago

Dr. Xian-He Sun

Chair and Professor, Computer Science, Illinois Institute of Technology

Dr. Judy Qiu

Assistant Professor, Computer Science and Informatics, Indiana University

Dr. Alexandru Iosup

Assistant Professor, Faculty of Engineering, Mathematics and Computer Science, Delft University of Technology, the Netherlands 

Moderator

TBA

Panelist Biographies

Dr. Robert Grossman is the Chief Research Informatics Officer (CRIO), the Director of the Initiative in Data Intensive Science, and a Professor in the Division of Biological Sciences at the University of Chicago. He is also a Core Faculty member and Senior Fellow at the Institute for Genomics and Systems Biology (IGSB) and the Computation Institute. His research group focuses on big data, data science, bioinformatics, cloud computing, and related areas. He is also the Founder and a Partner of Open Data Group, which since 2002 has provided analytic services that let companies build predictive models over big data. He is the Chair of the not-for-profit Open Cloud Consortium, which develops and operates clouds to support research in science, medicine, health care, and the environment. He can be reached via LinkedIn or Google+.
Dr. Xian-He Sun is the director of the SCS laboratory. He is the Chair and a professor of the Department of Computer Science at the Illinois Institute of Technology, an IEEE Fellow, and a guest faculty member in the Division of Mathematics and Computer Science at Argonne National Laboratory. His current research interests include parallel and distributed processing, memory and I/O systems, software systems for Big Data applications, and performance evaluation and optimization.
Dr. Judy Qiu is an Assistant Professor in the School of Informatics and Computing at Indiana University. Her research interests are in data-intensive computing at the intersection of Cloud and multicore technologies, with an emphasis on life science applications using MapReduce and traditional parallel and distributed computing approaches. Dr. Qiu leads the SALSA project in the Pervasive Technology Institute at Indiana University. Data-intensive science, Cloud computing, and multicore computing are converging and will revolutionize the next generation of computing in architectural design and programming challenges. They enable the pipeline: data becomes information becomes knowledge becomes wisdom.
Dr. Alexandru Iosup received his Ph.D. in Computer Science in 2009 from the Delft University of Technology (TU Delft), the Netherlands. He is currently an Assistant Professor with the Parallel and Distributed Systems Group at TU Delft. He was a visiting scholar at U. Dortmund, U. Wisconsin-Madison, U. Innsbruck, and U. California-Berkeley in 2004, 2006, 2008, and 2010, respectively. In 2011 he received a Dutch NWO/STW Veni grant (the Dutch equivalent of the US NSF CAREER). His research interests are in the area of distributed computing; keywords: cloud computing, grid computing, peer-to-peer systems, scientific computing, massively multiplayer online games, scheduling, scalability, reliability, performance evaluation, and workload characterization. Dr. Iosup is the author of over 50 scientific publications and has received several awards and distinctions, including best paper awards at IEEE CCGrid 2010, Euro-Par 2009, and IEEE P2P 2006. He is the co-founder of the Grid Workloads, the Peer-to-Peer Trace, and the Failure Trace Archives, which provide open access to workload and resource operation traces from large-scale distributed computing environments. He is currently working on cloud resource management for e-Science and consumer workloads.