Tutorial Program
Time: 1:30 PM - 5:30 PM, Thursday, May 29th.
- Autonomic Clouds
Omer Rana and Manish Parashar. Room: Regency D.
Cloud computing continues to increase in complexity for a number of reasons: (i) a growing number of configuration options from public Cloud providers (Amazon, for instance, offers over 4,000 different configuration options); and (ii) growing variability in the types of application instances that can be deployed on such platforms, ranging from tuning options in hypervisors that associate virtual machine instances with physical machines, through storage, compute, and I/O preferences that trade off power against price, to operating system configurations that provide differing degrees of security. The same complexity appears in the enterprise-scale datacenters that dominate industrial computing infrastructure, which continue to grow in size and are enabling new classes and scales of complex business applications. Autonomic computing offers self-* capabilities that enable systems to manage themselves. Although proposed as a vision by IBM Research, the concepts behind autonomic systems are much older; they can be applied to individual components within a Cloud system (resource manager/scheduler, power manager, etc.) or within an application that uses such a system. Understanding where this capability can be used most effectively is a decision that is often hard to get right, and one this tutorial explores.
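The self-management the abstract describes is usually structured as IBM's Monitor-Analyze-Plan-Execute loop. The following is a minimal illustrative sketch, not material from the tutorial: one loop iteration that scales a hypothetical VM pool toward a target per-VM utilization (the function name and parameters are assumptions for illustration).

```python
# Illustrative sketch of one MAPE-style autonomic iteration for a
# hypothetical VM pool. Nothing here is from the tutorial itself.
def autonomic_step(load, vms, target_per_vm=0.7):
    """Return the new VM count after one Monitor-Analyze-Plan-Execute pass.

    load: total utilization demand (2.5 = "2.5 fully busy VMs' worth")
    vms:  current number of VMs
    """
    # Monitor: observe per-VM utilization.
    utilization = load / vms
    # Analyze + Plan: decide whether to scale out, scale in, or hold.
    if utilization > target_per_vm:
        # Scale out to bring utilization back to the target (ceil division).
        plan = int(-(-load // target_per_vm))
    elif vms > 1 and load / (vms - 1) <= target_per_vm:
        # Scale in by one if one fewer VM still meets the target.
        plan = vms - 1
    else:
        plan = vms
    # Execute: apply the new allocation.
    return max(1, plan)
```

The same loop shape applies whether the managed element is a scheduler, a power manager, or an application-level controller; only the monitored metric and the planning rule change.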
- Algorithms and tools to increase the efficiency of data centers: the case of Eco4Cloud
Carlo Mastroianni and Ian Taylor. Room: Acapulco.
Improving the efficiency of data centers, with a focus on power consumption and carbon emissions, is a topical theme attracting a large and growing amount of academic and industrial research. Despite notable progress in reducing PUE (Power Usage Effectiveness), which measures physical efficiency, there is still much room for improvement in computational efficiency: the IT resources of data centers are used inefficiently, as most servers exploit only a fraction of their computational power. A viable solution is to dynamically consolidate virtual machines onto as few servers as possible, and either put the remaining servers into a sleep state or reuse them to accommodate additional workload. The tutorial will present the rationale and the mathematical basis of the problem, and will review some state-of-the-art algorithms. It will then introduce Eco4Cloud (www.eco4cloud.com), a bio-inspired approach to the adaptive and dynamic assignment of virtual machines to servers, along with the tools used to assess its performance: mathematical models, simulation, and physical deployment. A demo will show how the Eco4Cloud integrated dashboard can be used to monitor resource utilization and consolidate the load by dynamically assigning virtual machines to physical servers.
- Accelerating Big Data Processing with Hadoop and Memcached on Modern Clusters
Dhabaleswar Panda and Xiaoyi Lu. Room: Regency C.
Apache Hadoop is gaining prominence in handling Big Data and analytics. Similarly, Memcached is becoming important for large-scale query processing in Web 2.0 environments. This middleware is traditionally written with sockets and does not deliver the best performance on clusters with modern high-performance networks. In this tutorial, we will provide an in-depth overview of the architecture of Hadoop components (HDFS, MapReduce, RPC, HBase, etc.) and Memcached. We will examine the challenges in redesigning the networking and I/O components of this middleware for modern interconnects and protocols (such as InfiniBand, iWARP, RoCE, and RSocket) with RDMA, and for modern storage architectures. Using the publicly available RDMA for Apache Hadoop software package (http://hadoop-rdma.cse.ohio-state.edu), we will present case studies of the new designs for several Hadoop components and their associated benefits. Through these case studies, we will also examine the interplay between high-performance interconnects, storage systems (HDD and SSD), and multi-core platforms in achieving the best solutions for these components.
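To make concrete what "written with sockets" means for Memcached, the sketch below frames its well-documented text protocol, the CRLF-terminated byte stream that a traditional client writes to a TCP socket and that RDMA-based redesigns bypass. This is only protocol framing for illustration; the helper names are our own.

```python
# Illustrative framing of Memcached's text protocol "set" and "get"
# commands (the socket-level traffic that RDMA-based designs replace).
# Per the protocol: commands end in CRLF, and "set" carries a data block
# whose length is announced in the <bytes> field.
def encode_set(key, value, flags=0, exptime=0):
    """Build the bytes for: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n"""
    data = value.encode() if isinstance(value, str) else value
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode()
    return header + data + b"\r\n"

def encode_get(*keys):
    """Build the bytes for: get <key> [<key> ...]\r\n"""
    return ("get " + " ".join(keys) + "\r\n").encode()
```

Every cache operation in a conventional deployment pays the cost of serializing these messages and pushing them through the kernel TCP stack; that per-operation overhead is what motivates moving such middleware onto RDMA-capable interconnects.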
- Globus: Scalable Research Data Management Infrastructure for Campuses and High-Performance Computing Facilities
Steve Tuecke and Rajkumar Kettimuthu. Room: Toronto.
The rapid growth of data in science is placing massive demands on campus computing centers and high-performance computing (HPC) facilities. Computing facilities must provide robust data services built on high-performance infrastructure, while continuing to scale as needs increase. Traditional research data management (RDM) solutions are difficult to use and error-prone, and the underlying networking and security infrastructure is often complex and inflexible, resulting in user frustration and sub-optimal resource usage. An approach increasingly common in HPC facilities is software-as-a-service (SaaS) solutions such as Globus for moving, syncing, and sharing large data sets. The SaaS approach allows HPC resource owners and systems administrators to deliver enhanced RDM services to end users at an optimal quality of service, while minimizing the administrative and operations overhead associated with traditional software. Usage of Globus has grown rapidly, with more than 14,500 registered users and over 37 petabytes moved. Globus's reliable file transfer and data sharing services are key functionality for bridging campus and external resources, and are enabling scientists to easily scale their research workflows. Tutorial attendees will be introduced to the RDM functions of Globus, learn how scientists and resource owners are using Globus, and have the opportunity for hands-on interaction with Globus at various levels of technical depth.
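"Reliable file transfer" in this setting means the service, not the user, verifies integrity and retries failures. The sketch below is an assumption-laden toy (it is not Globus's implementation and uses none of its APIs): a local copy standing in for the transfer, with end-to-end checksum verification and bounded retries.

```python
# Illustrative sketch of reliable transfer semantics: checksum the source
# and destination end-to-end, retrying on mismatch. NOT Globus code; the
# local file copy stands in for a network transfer.
import hashlib
import shutil

def sha256_of(path):
    """Stream a file through SHA-256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def reliable_copy(src, dst, retries=3):
    """Copy src to dst, verifying integrity; retry up to `retries` times."""
    for _attempt in range(retries):
        shutil.copyfile(src, dst)              # the (possibly faulty) transfer
        if sha256_of(src) == sha256_of(dst):   # end-to-end verification
            return True
        # checksum mismatch: fall through and retry the transfer
    return False
```

The value of the SaaS model is that this verify-and-retry loop (plus credential handling, endpoint discovery, and performance tuning) runs on hosted infrastructure instead of in scripts each researcher maintains.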