ClusterComp Published Libre

CampusCloud: Aggregating Universities Computing Resources in Ad-Hoc Clouds

Hany H Ammar1,5, Alaa Hamouda2, Mustafa Gamal3, Walid Abdelmoez4, Ahmed Moussa5

1The Lane Department of Computer Science and Electrical Engineering, West Virginia University, USA

2Computer Engineering Department, Faculty of Engineering, Alazhar University, Cairo, Egypt

3MOVE-IT Company, Cairo, Egypt

4Arab Academy of Science and Technology, Egypt

5Computer Science Department, Faculty of Computers and Information, Cairo University, Egypt

Abstract— Cloud Computing has recently emerged as a new computing paradigm based on the concept of virtualization with the goal of creating a shared and highly scalable computing infrastructure from aggregated physical resources to deliver seamless and on-demand provisioning of software, hardware, and data as services. Universities typically have large amounts of computing resources to support instructional and research activities. This paper investigates the challenges of developing a Campus Cloud based on aggregating resources in multiple universities. The requirements model and the architecture model of this cloud environment are presented. An implementation methodology using open source cloud middleware is also discussed.

Keywords- Cloud Computing; Software Engineering; Grid Computing; High Performance Computing

I. INTRODUCTION

Resource availability is a key factor in the prosperity of any society. Computing resources are particularly important, since they are needed in many applications, such as e-learning, e-banking, e-business, and e-government. In particular, to allow for faster economic growth, computing infrastructure is essential for small companies in their typical day-to-day business and in research and development activities. However, to attain their full potential, computing resources need to be efficiently utilized in an aggregated manner.

Cloud Computing has recently emerged as a new computing paradigm. The Cloud is a type of parallel and distributed computing designed to scale and share computing resources among multiple consumers. This yields improved utilization rates, as servers are not unnecessarily left idle [1].

Deploying applications on a Cloud can help to achieve scalability and simplify/optimize IT environments. A variety of challenges arise when deploying and operating applications and services on a Cloud. Some examples of such challenges are: how to manage and guarantee service level agreements (SLAs) of services deployed in the Cloud; how to integrate services deployed on-premise and on different Clouds; how to deploy applications and business processes and monitor their runtime status, among others [2, 3].

Universities typically have large amounts of computing resources to support instructional and research activities. These are dispersed across instruction and research laboratories and administration offices in most university faculties. Moreover, the university campus has a reasonable level of computer network connectivity among almost all university buildings and to the Internet. However, these resources are underutilized; for instance, most computer labs are used only during instruction sessions, which span a small fraction of their asset lifetime, and they are often used for simple applications that consume a small fraction of their computing power. Given that computer manufacturing technology advances at a fast pace, and that today’s computers are most likely going to be replaced within three to four years, there is a pressing need to make full use of the available computing resources before their asset lifetime ends.

Several projects have been conducted in past years to develop campus grids. One example of a campus grid is the University of Virginia Campus Grid (UVaCG) [8]. The grid has been designed explicitly to re-use as much existing infrastructure in the campus environment as possible in creating a grid based on the Web Services Resource Framework (WSRF). Another example of a campus grid is CamGrid at Cambridge University [9]. CamGrid is a distributed computing resource based on the Condor middleware [9]. Yet another example, also developed using the Condor middleware, is the Oxford University campus grid named OxGrid [10]. The design of the OxGrid system is such that registered users have seamless access to a variety of computational and data storage resources around the university.

This paper investigates the challenges of developing a Campus Cloud based on aggregating resources in multiple universities. Section II discusses the requirements model, while Section III presents the architecture model of the campus cloud environment. The proposed system design is illustrated in Section IV, and the implementation methodology is discussed in Section V. Finally, conclusions and future work are provided in Section VI.

© ICCIT 2012

II. SYSTEM REQUIREMENTS

This section describes the Campus Cloud requirements to show the benefit and impact of this work for the community. The requirements focus on the basic use cases that can be provided in the first release of Campus Cloud. The first subsection describes the different cloud service models. The next subsection then describes the Campus Cloud use cases and their relation to these models in more detail.

A. Cloud Service Models

Cloud service models define the most abstract level of cloud usage. The NIST [4] definition of cloud computing defines three delivery models, each of which has its own use cases:

Software as a Service (SaaS): The consumer uses applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser, or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.

Platform as a Service (PaaS): The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.

Infrastructure as a Service (IaaS): The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components.

Each of these delivery models has its use cases, as described later in this section. There are also use cases related to the deployment model of the cloud. The following are the different cloud deployment models based on the NIST definition:

Private cloud: The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.

Public cloud: The cloud infrastructure is provisioned for open use by the general public. It may be owned, managed, and operated by a business, academic, or government organization, or some combination of them. It exists on the premises of the cloud provider.

Community cloud: The cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be owned, managed, and operated by one or more of the organizations in the community, a third party, or some combination of them, and it may exist on or off premises.

Hybrid cloud: The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).

Campus Cloud can be considered a public cloud. Campus Cloud can also be deployed in more than one university, with the deployments collaborating with each other in the form of a hybrid cloud or a community cloud.

The next subsection describes the Campus Cloud users and the different usage scenarios in the form of use cases.

B. Campus Cloud Use Cases

Cloud users can be end users, researchers, software developers, enterprise applications, or another cloud. Campus Cloud can provide services for all of these user types. Each user type is described in more detail below.

1) End User Cloud Service

The SaaS model is the best fit for the end user, who focuses on the service provided to satisfy his or her needs and does not need advanced technical skills to use it. An end user can be a home user or a faculty member (student or instructor).

Home users are users who access the cloud from home to read email, use Facebook, or carry out other Internet activities. Campus Cloud can give them a way to share files by providing storage and a portal for uploading and downloading files.

Faculty members: Campus Cloud can host a Learning Management System (LMS) for uploading teaching materials, including videos and files. Campus Cloud can also host student lab applications such as Matlab, PSPICE, and Java.

2) Researchers

Many researchers use computers to solve their problems, and much of this research needs high computational power, especially research based on simulation techniques. Researchers can use Campus Cloud to allocate more computation power and storage to run their research applications. By leveraging a student lab with 31 computers, we were able to produce results in two weeks that would have taken an entire year on a single CPU.

Researchers can also conduct research on Cloud Computing itself, which requires deeper access to the cloud infrastructure. Researchers use the cloud services in all three models, as follows:

IaaS model: to allocate additional virtual machines and control their usage.

PaaS model: to develop applications, such as Matlab applications, and deploy them to the cloud.

SaaS model: to use the cloud as a file server and to use other SaaS services, such as email server functions.
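As a rough check on the student-lab figures quoted above, the implied speedup S and parallel efficiency E can be worked out as follows. This assumes that one year is about 52 weeks and that the "single CPU" is comparable to one lab computer; neither assumption is stated in the text, so the result is only indicative.

```latex
\[
S = \frac{T_{1}}{T_{31}} \approx \frac{52\ \text{weeks}}{2\ \text{weeks}} = 26,
\qquad
E = \frac{S}{N} \approx \frac{26}{31} \approx 0.84
\]
```

Under these assumptions, the 31 machines were used at roughly 84% efficiency, which is plausible for a lab that is shared with instruction sessions.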

3) Software Developer

Software developers use the cloud mainly through the PaaS service model. Use cases for software developers are as follows:

SaaS development framework: it should provide multi-tenancy support and handle the data store for their applications.

Deploy a developed application.

Test environment to test an application on the cloud before releasing it.

Version management to manage different versions of their applications.

Database hosting.

Allocate and de-allocate resources.

Upload, deploy, start, stop, restart, and delete images.

Development tools.

Application Programming Interface (API) to access all of the previous services from code.
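The image-related use cases in the list above (upload, deploy, start, stop, restart, delete) could be exposed to developers through a small programmatic interface. The following is a minimal in-memory sketch of such an interface; the class name `CampusCloudClient`, its methods, and the `Instance` model are all hypothetical and are not part of any Campus Cloud release or of any middleware named in this paper.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict


@dataclass
class Instance:
    """A virtual machine started from an uploaded image (hypothetical model)."""
    instance_id: int
    image: str
    state: str = "running"


@dataclass
class CampusCloudClient:
    """In-memory stand-in for a developer-facing API; it contacts no real service."""
    images: Dict[str, bytes] = field(default_factory=dict)
    instances: Dict[int, Instance] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def upload_image(self, name: str, data: bytes) -> None:
        self.images[name] = data

    def deploy(self, image: str) -> Instance:
        # Deploying creates a new running instance backed by the image.
        if image not in self.images:
            raise KeyError(f"unknown image: {image}")
        inst = Instance(next(self._ids), image)
        self.instances[inst.instance_id] = inst
        return inst

    def stop(self, instance_id: int) -> None:
        self.instances[instance_id].state = "stopped"

    def start(self, instance_id: int) -> None:
        self.instances[instance_id].state = "running"

    def restart(self, instance_id: int) -> None:
        self.stop(instance_id)
        self.start(instance_id)

    def delete_image(self, name: str) -> None:
        # Refuse to delete an image that still backs an instance.
        if any(i.image == name for i in self.instances.values()):
            raise RuntimeError(f"image in use: {name}")
        del self.images[name]
```

A developer would call `upload_image` once, then `deploy` as many times as needed; keeping image management and instance lifecycle in one client mirrors the way the use cases are grouped in the list.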

4) Enterprise Applications

The cloud can be used for business in all three of its service models. Use cases for enterprises are as follows:

Use the cloud's large storage capacity to store and share files.

Enterprise applications can use cloud services to store and share data and files through the cloud API.

Enterprise applications can integrate with modules hosted in the cloud.

Part or all of an enterprise application can be hosted or deployed in the cloud.

5) Another Cloud

Clouds can provide services to each other through a collaboration interface. Use cases for a cloud using another cloud are:

Acquire computation power from Campus Cloud during peak times.

Acquire more storage from Campus Cloud.

Figure 1 summarizes Campus Cloud’s main use cases.

Figure 1: Campus Cloud Use Cases

III. SYSTEM ARCHITECTURE

To satisfy the system requirements, the general architecture of the Campus Cloud system, shown in Figure 2, includes three layers: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).

IaaS delivers computer infrastructure (typically a platform virtualization environment) as a service, along with raw (block) storage and networking. Rather than purchasing servers, software, data-center space, or network equipment, clients instead buy those resources as a fully outsourced service. Section IV-A discusses the design of the IaaS layer in the Campus Cloud.

PaaS offerings facilitate the deployment of applications without the cost and complexity of buying and managing the underlying hardware and software and of provisioning hosting capabilities. They provide all of the facilities required to support the complete life cycle of building and delivering web applications and services, entirely available from the Internet. PaaS offerings may include facilities for application design, application development, testing, deployment, and hosting, as well as application services such as team collaboration, web service integration, database integration, security, scalability, storage, persistence, state management, application versioning, application instrumentation, and developer community facilitation. These services may be provisioned as an integrated solution over the web. Section IV-B discusses the design of the PaaS layer in the Campus Cloud.

SaaS is a software delivery model in which software and its associated data are hosted centrally and are typically accessed by users through a thin client, normally a web browser, over the Internet. SaaS has become a common delivery model for most business applications, including accounting, collaboration, customer relationship management (CRM), enterprise resource planning (ERP), invoicing, human resource management (HRM), content management (CM), and service desk management. SaaS has been incorporated into the strategy of all leading enterprise software companies. A SaaS layer may also include components such as exception handling, logging, data validation, and billing. Section IV-C discusses the design of the SaaS layer in the Campus Cloud.

Figure 2: Overall System Architecture

IV. SYSTEM DESIGN

We present the details of the system design for the three layers (IaaS, PaaS, and SaaS) as follows:

A. IAAS DESIGN

In this section we describe the architecture design of the IaaS layer, the lowest layer of Campus Cloud. The design of Campus Cloud can be developed using the open source framework defined by the RESERVOIR Project [5]. This project is funded by the FP7 program of the European Union to develop a framework for IaaS cloud service providers.

We start by describing the architecture of OpenNebula (ON) [6], the open source cloud middleware that serves as the virtualized execution environment manager for the virtualized resources in the RESERVOIR framework. Figure 3 shows the architecture of ON and the application programming interfaces (APIs) provided. In order to support hybrid cloud environments consisting of private and public clouds, ON implements the functionality supported by Amazon's EC2 API [5], mainly the functions related to virtual machine management. It also supports the Open Cloud Computing Interface (OCCI), which provides a remote management API for IaaS-based services. OCCI can also be used to serve the PaaS and SaaS layers of the cloud computing architecture.

The ON Cloud API (OCA) provides an interface to the core components of ON and is used to develop advanced IaaS tools in Java or Ruby. The ON core interacts with the physical infrastructure via drivers for storage, monitoring, virtualization, and authentication. A persistent database is used to save the state information of ON as well as other accounting information.
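To make the virtual machine management functions more concrete, the sketch below renders a VM description in the KEY = VALUE template style that OpenNebula uses for VM definitions. It is an illustration only: the attribute names (`NAME`, `CPU`, `MEMORY`, `DISK`, `NIC`) follow the OpenNebula template format as commonly documented and should be checked against the deployed version, the image and network names are hypothetical, and the helper `render_vm_template` is our own and does not contact OpenNebula or use the OCA.

```python
def render_vm_template(name: str, cpu: float, memory_mb: int,
                       image: str, network: str) -> str:
    """Return a VM description in OpenNebula's KEY = VALUE template style."""
    lines = [
        f'NAME   = "{name}"',
        f'CPU    = {cpu}',
        f'MEMORY = {memory_mb}',                 # memory in megabytes
        f'DISK   = [ IMAGE = "{image}" ]',       # disk backed by a registered image
        f'NIC    = [ NETWORK = "{network}" ]',   # attach to a virtual network
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical image and network names, for illustration only.
    print(render_vm_template("student-lab-vm", 1, 512, "lab-image", "campus-net"))
```

Generating templates from code rather than writing them by hand would let a higher layer request many similar virtual machines with different sizes.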

Figure 3: OpenNebula Architecture and APIs [7]

The RESERVOIR framework provides a layer above ON, namely Claudia, for service providers to manage the IaaS services. The components of this layer are shown in figure 4. The Dashboard provides a web GUI to manage the cloud. The monitoring component stores and distributes the status of the services. The lifecycle manager controls the deployment and dynamic scalability processes of the services. The scalability and optimization manager dynamically drives the configuration and scalability of services. Figure 5 shows the overall RESERVOIR framework that combines both ON and Claudia. Service providers can access the service manager

Figure 4: The Claudia Service manager Architecture [6]

Figure 5: The RESERVOIR Framework [6]

ON and OCCI can be used by researchers and developers to satisfy the IaaS-related use cases described in the requirements section, where both researchers and developers can allocate and de-allocate resources. As shown in figure 4, the OCCI layer satisfies the use cases related to interfacing with other clouds.
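The dynamic scalability role described above for the scalability and optimization manager can be illustrated with a generic threshold rule. The sketch below is not Claudia's actual rule engine, which the text does not describe; the function name `desired_instances`, the CPU thresholds, and the instance limits are all assumptions chosen for illustration.

```python
def desired_instances(current: int, avg_cpu: float,
                      low: float = 0.30, high: float = 0.75,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count a simple threshold rule would ask for.

    avg_cpu is the average CPU utilization of the service, as a fraction in [0, 1].
    """
    if not 0.0 <= avg_cpu <= 1.0:
        raise ValueError("avg_cpu must be a fraction between 0 and 1")
    if avg_cpu > high:
        target = current + 1      # scale out by one instance under heavy load
    elif avg_cpu < low:
        target = current - 1      # scale in by one instance when mostly idle
    else:
        target = current          # load is within the comfortable band
    # Never go below the minimum or above the configured maximum.
    return max(min_instances, min(max_instances, target))
```

Changing the count by one instance per decision keeps the rule easy to reason about; a production manager would typically also add a cool-down period between decisions.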

B. PAAS DESIGN

To satisfy the system requirements of the developer and researcher, we propose AppScale as the PaaS framework. AppScale is an open-source framework for running Google App Engine applications. It is an implementation of a cloud computing PaaS platform that supports Xen, KVM, Amazon EC2, and Eucalyptus. It has been developed and is maintained by the RACELab at UC Santa Barbara [8]. AppScale allows users to upload multiple App Engine applications to a cloud. It supports multiple distributed backends such as HBase, Hypertable, Apache Cassandra, MySQL Cluster, and Redis. It supports Python, Go, and Java applications, taking the open source SDK provided by Google App Engine and implementing scalable services such as the datastore, memcache, blobstore, users API, and channel API.

C. SAAS DESIGN

Several platforms are now emerging to support SaaS development. Oracle provides the 'Oracle SaaS Platform', which utilizes Oracle's products. Google App Engine, Microsoft, and Amazon also offer development environments for cloud computing, and there are other platforms such as Heroku and 10gen [2]. The main customers of SaaS services are small and medium enterprises around the world. While these platforms do a good job of hiding many of the infrastructure problems from application developers, they still lack many features that SaaS application developers need from a platform, mainly:

1. Interoperability. Almost all offerings lack a standard method of accessing different services; hence applications coded against one platform will not work on another, which causes platform lock-in. Standardization is needed in many ways.

2. Ready-made components. Many SaaS applications require features like account activity tracking, action logging, security management, and much more. While these can be implemented at the application level, it makes more sense to provide them as platform components, along with any other utilities that could be shared by many applications.

3. Openness. Most of the current solutions are closed source. A real achievement would be to define a platform that any data center (given certain requirements) can implement and deploy with relative ease. What is needed is a platform that open source developers can contribute to or easily take into account when building systems.

For these reasons, we propose developing a SaaS platform that provides special services (APIs) to SaaS applications. Examples of these services are logging, exception handling, and data binding and validation. Business services like billing and UI design can be provided in a higher layer, as shown in figure 6.

Figure 6: SaaS Framework
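The platform services named above (logging, exception handling, and data validation) could be offered to SaaS applications as a wrapper around their request handlers. The following is a minimal sketch under that assumption; the decorator `platform_service`, the `ValidationError` class, the logger name, and the `create_invoice` handler are hypothetical and only illustrate how such shared components might be packaged.

```python
import functools
import logging

logger = logging.getLogger("saas_platform")  # hypothetical logger name


class ValidationError(ValueError):
    """Raised when a request fails the platform's validation step."""


def platform_service(required_fields):
    """Wrap a handler with validation, logging, and exception handling."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(request: dict) -> dict:
            missing = [f for f in required_fields if f not in request]
            try:
                if missing:
                    raise ValidationError(f"missing fields: {missing}")
                logger.info("calling %s", handler.__name__)
                return {"ok": True, "result": handler(request)}
            except Exception as exc:
                # The platform logs the failure and returns a uniform error shape.
                logger.exception("handler %s failed", handler.__name__)
                return {"ok": False, "error": str(exc)}
        return wrapper
    return decorator


@platform_service(required_fields=["tenant", "amount"])
def create_invoice(request: dict) -> str:
    """Example application handler; it contains business logic only."""
    return f"invoice for {request['tenant']}: {request['amount']}"
```

With this arrangement the application code stays free of logging and error-handling boilerplate, which is the motivation given for ready-made components.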

V. IMPLEMENTATION METHODOLOGY

In this section we briefly outline our implementation methodology. We adopt a bottom-up approach in which we first identify the physical resource layer based on the clusters available in the different collaborating institutions.

We propose to start with resources that have already been aggregated into clusters in the collaborating institutions. For example, a model for exploiting resource aggregation to build higher computational power has already been built at the Faculty of Computers and Information, Cairo University. HiPer-FC is the High Performance Computing Laboratory of the Faculty of Computers and Information at Cairo University. This is a project for building High Performance Computing systems based on Linux clusters for use in research and education. It is also intended to be used for testing Cloud Computing middleware and software solutions. The latest incarnation of the project is a grid of two clusters: a cluster of 12 single-core nodes, and a 14-node multi-core cluster made up of 5 quad-core machines and 9 dual-core machines, for a total of 38 cores on the second cluster. The overall computing power of the two-cluster grid is now 50 cores. The success of the project led to visibility at both national and international levels. Nationally, the project led to another ambitious cluster-building effort at the AUC, Department of Physics, and to collaboration on several projects with the National Authority for Remote Sensing and Space Sciences (NARSS). The heterogeneous structure of HiPer-FC was implemented intentionally for research and development on load balancing. The techniques developed for load balancing and performance enhancement will in turn be used and tested when several clusters and computing nodes are aggregated in Campus Cloud. Physical resources at Alazhar University and the Arab Academy of Science and Technology will also be aggregated in the Campus Cloud environment to form a larger pool of resources. We will also explore the development of a federation of different clouds from the resources in the three institutions.
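The HiPer-FC core counts quoted above can be checked with a few lines of code, and the same numbers can drive a simple static load split across the two heterogeneous clusters. The proportional split below is only an assumed baseline for illustration; the load balancing techniques actually developed on HiPer-FC are not described in this paper, and the function `split_tasks` and the cluster labels are our own.

```python
# Cores per node in each cluster, taken from the description of HiPer-FC.
clusters = {
    "single-core cluster": [1] * 12,           # 12 single-core nodes
    "multi-core cluster": [4] * 5 + [2] * 9,   # 5 quad-core + 9 dual-core nodes
}
cores = {name: sum(nodes) for name, nodes in clusters.items()}
total_cores = sum(cores.values())


def split_tasks(n_tasks: int, cores: dict) -> dict:
    """Split n_tasks across clusters in proportion to their core counts."""
    total = sum(cores.values())
    shares = {name: n_tasks * c // total for name, c in cores.items()}
    leftover = n_tasks - sum(shares.values())
    # Hand leftover tasks to the clusters with the largest fractional remainders.
    by_remainder = sorted(cores, key=lambda name: (n_tasks * cores[name]) % total,
                          reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    print(cores, total_cores)
    print(split_tasks(100, cores))
```

A static split like this ignores differences in clock speed and current load, which is exactly where dynamic load balancing on a heterogeneous grid would improve on it.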

The development of the IaaS layer will be based on the open source frameworks and middleware described in the previous section. For virtualization and cloud development, we adopt Xen as the virtualization tool, OpenNebula as the IaaS middleware, and Claudia for managing the IaaS services. The PaaS and SaaS layers will be developed later using the frameworks described in the previous sections.

The implementation will first support the activities of the researcher and developer, allowing them to allocate resources based on virtual machine images that include Apache and Apache FTP servers. Therefore, the use cases in the requirements model that can be developed first are those for the researcher and developer. These use cases are: Allocate Resources, Deploy Applications (using Java or Python), and Web Host. A virtual machine image will also be made available for student programming labs.

VI. CONCLUSION AND FUTURE WORK

We presented in this paper the requirements and design models for a cloud computing environment that can be implemented using open-source frameworks and tools. The environment can be developed by aggregating existing underutilized campus resources in multiple institutions.

As future work, we intend to implement the proposed environment and run pilot tests for scientific applications and for the development of student projects. The proposed environment can provide cloud computing research infrastructure and advance the state of the art of software development methodologies to help solve research, business, and industry problems. The proposed cloud also enables world class research activities by providing computing resources for high performance computing and large database intensive applications.

ACKNOWLEDGEMENT

This research work is funded by Qatar National Research Fund (QNRF) under the National Priorities Research Program (NPRP) Grant No.: 09-1205-2-470.

REFERENCES

1. Bernstein, David; Ludvigson, Erik; Sankar, Krishna; Diamond, Steve; Morrow, Monique, “Blueprint for the Intercloud – Protocols and Formats for Cloud Computing Interoperability”, IEEE Computer Society, 24-5-2009.

2. Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, James Broberg, Ivona Brandic, “Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility”, Journal of Future Generation Computer Systems, December 2008.

3. K. Lai, L. Rasmusson, E. Adar, L. Zhang, and B. A. Huberman, “Tycoon: An implementation of a distributed, market-based resource allocation system”, Multiagent and Grid Systems, 1(3):169–182, 2005.

4. P. Mell, T. Grance, “The NIST Definition of Cloud Computing”, U.S. Department of Commerce, Special Publication 800-145, Sep 2011, pp 6-7.

5. The RESERVOIR Project http://62.149.240.97/

6. OpenNebula http://www.opennebula.org/

7. Urquhart, James (22 June 2009). "The new generation of cloud-development platforms". cnet News. CBS Interactive Inc. http://news.cnet.com/8301-19413_3-10270365-240.html. Retrieved 2009-09-23.

8. Marty Humphrey and Glenn Wasson, “The University of Virginia Campus Grid: Integrating Grid Technologies with the Campus Information Infrastructure”, Lecture Notes in Computer Science, Volume 3470/2005, pp 50-58.

9. CamGrid. http://www.escience.cam.ac.uk/projects/camgrid/

10. OxGrid. http://www.oerc.ox.ac.uk/resources/oxgrid/oxgrid-concept
