
Received December 8, 2018, accepted January 14, 2019, date of publication January 31, 2019, date of current version February 22, 2019.

Digital Object Identifier 10.1109/ACCESS.2019.2894725

Nutshell—Simulation Toolkit for Modeling Data Center Networks and Cloud Computing

UBAID UR RAHMAN1,2, KASHIF BILAL1, AIMAN ERBAD1, OSMAN KHALID2, AND SAMEE U. KHAN3

1Department of Computer Science and Engineering, Qatar University, Doha, Qatar
2Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Abbottabad 22060, Pakistan
3Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND 58105, USA

Corresponding author: Aiman Erbad ([email protected])

This work was supported by the National Priorities Research Program from the Qatar National Research Fund (a member of Qatar Foundation) under Grant [8-519-1-108]. The work of S. U. Khan was supported by the National Science Foundation.

ABSTRACT Cloud computing provides flexibility, reliability, and scalability to its consumers. Applications that run on the Cloud are now more resource hungry than ever, and the need is constantly growing. The number of Cloud applications of a diverse nature is growing, and they expect greater performance at low cost. Fulfilling the quality of service (QoS) demands of such diverse applications is a challenging task and requires efficient network architectures, robust scheduling schemes, powerful and adaptable routing algorithms, reliable resource management, effective data exchange policies, and QoS improvement policies. The research community endeavors to provide solutions to the various challenges faced by Cloud computing. These solutions must be tested thoroughly before their implementation in a real Cloud. Simulation is a viable choice for testing a solution in different scenarios. A number of simulators are available, each focusing on certain aspects of the Cloud while neglecting others. Their generalized implementation of the Cloud ignores various critical factors, such as networking characteristics, consequently affecting the tests and results of a solution. To tackle these issues, simplify the simulation process, and provide a detailed implementation of the Cloud, we present Nutshell: a novel Cloud simulator that makes it easy to model, simulate, and experiment with new Cloud solutions. Salient features offered by Nutshell include 1) provision of a platform for modeling and simulating Cloud infrastructure, 2) built-in essential components and helpers to create new architectures, 3) pre-built data center architectures, 4) plug-in-based architecture, 5) communication protocols, 6) pre-built virtual machine scheduler, 7) addressing schemes, 8) user applications, 9) virtual machines, 10) job splitting, and 11) simulation data collection and exporting.

INDEX TERMS Cloud computing, cloud simulator, data center architectures, energy models, FatTree routing, scheduling algorithms, simulation, simulator, virtual machines.

I. INTRODUCTION
Cloud computing is a service-based, pay-as-you-go business model [1], [2]. The Cloud’s elastic nature and economical services make it an ideal solution for processing large data sets for startups, educational institutions, research centers, and companies that require analytics and business intelligence (BI) to predict market trends [1], [3]. The Cloud is powered by data centers, which are facilities housing tens to hundreds of thousands of computational nodes interconnected via a communication infrastructure [3]–[5]. User applications run on these computational nodes, communicate with each other, and work towards a common goal [2]. User applications have different configurations and requirements. The majority of applications have high network dependencies; e.g., data analytics are carried out as MapReduce jobs [6], in which data is mapped onto different nodes and processed individually. The results are collected at a single node in the reduce phase, resulting in many-to-many communication scenarios. All communication takes place through the data center’s network, including users’ data, network management information, replicated data, and system state data. Management data is required by schedulers for deciding where to place a Virtual Machine (VM) or whether to migrate a VM in case of an expected Quality of Service (QoS) violation [6]–[8]. Such enormous volumes of data communication require a carefully designed communication network, as it is a pivotal component of Cloud infrastructure in terms of performance.

The associate editor coordinating the review of this manuscript and approving it for publication was Kashif Munir.

2169-3536 2019 IEEE. Translations and content mining are permitted for academic research only.

Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

VOLUME 7, 2019

https://orcid.org/0000-0002-4381-8094

https://orcid.org/0000-0002-1985-4896

U. U. Rahman et al.: Nutshell—Simulation Toolkit for Modeling Data Center Networks and Cloud Computing

Diverse disciplines that have adopted Cloud computing for their applications’ computing and storage needs have introduced new challenges. The Cloud community actively seeks optimal solutions to problems related to networking, scheduling, energy efficiency, brokering policies, QoS policies, and VM migration policies. Proposed solutions need to be tested against different realistic scenarios to gauge their performance and efficacy. Deployment without rigorous and realistic testing may yield unanticipated results. For example, Google reported a 20% revenue loss due to a 0.5-second delay in search results [9], [10]. Similarly, a delay of 0.1 seconds reduced Amazon’s sales by 1%. A misconfiguration in Blackberry’s network core switch caused its failure, leaving millions of customers without internet access for three days [11], [12].

Testing of a solution can be performed on a physical Cloud testbed [13], [14]. However, configuring and maintaining a testbed requires excessive time, cost, and resources, making this approach infeasible. An affordable, dynamic, and widely practiced alternative is simulation. A number of simulators are available for Cloud computing, mostly focusing on certain aspects of the Cloud while neglecting others [15]–[26]. Major shortcomings of existing simulators are: (a) the abstraction of the network and network components, (b) absence of realistic data center interconnection architectures, (c) an absent or limited communication model, (d) absent or limited network addressing schemes, (e) abstract packet routing, and (f) absent or very limited communication details. Creating a simulation in existing simulators can be difficult due to the absence of data center architectures, components, and helpers.

Considering these shortcomings, we present our simulator Nutshell: a state-of-the-art simulator for the Cloud. Salient features of Nutshell include:

• Modeling and simulation of Cloud infrastructures considering realistic data center network architectures.
• Essential components and helpers to assist in the quick creation of Cloud simulations.
• Plug-in architecture that makes integration of new algorithms easy.
• Pre-built data center architectures, such as FatTree [6], ThreeTier [27], and DCell [28].
• Communication protocols, especially a router-compatible FatTree routing protocol.
• Pre-built VM scheduling schemes, such as First Come First Serve – First Fit (FCFS-FF), Shortest Job First – First Fit (SJF-FF), and Longest Job First – First Fit (LJF-FF).
• Realistic IP addressing schemes.
• Virtualization, with VMs that have both computational and networking capability.
• Job splitting compatibility and VM collaboration.
• Simulation data collection and exporting for analysis.

The rest of the paper is organized as follows. Section II presents a comparison of existing Cloud simulators and their limitations. Section III presents a review of data center architectures. Section IV introduces the Nutshell simulator in detail. This section is divided into multiple subsections explaining the various components available in Nutshell for creating a simulation, how application execution is performed, data center architecture implementation details, Nutshell VM schedulers, and the built-in data collectors of Nutshell. Section V presents details of the reports that can be collected from a simulation. Section VI presents the results obtained from simulations carried out with the proposed Nutshell simulator, and finally Section VII concludes the paper.

II. RELATED WORK
We conducted a detailed survey of existing simulators in [3] to establish the need for a new simulator. The chosen Cloud simulators include CloudSim, simulators built as extensions of CloudSim, and others, such as GreenCloud and DCSim. Our study highlights their strengths and weaknesses. The study concluded that most simulators have an abstract network implementation in which low-level details are missing. In this section, Table 1 compares the previous simulators on the basis of the following parameters:

• Platform: The framework that is extended, e.g., CloudSim extends SimJava.
• Language: The programming language of the simulator.
• Availability: Availability of the simulator to the community, free or commercial.
• Graphical User Interface (GUI): Whether the simulator is equipped with a GUI.
• Application Model: The user application model and traffic generation.
• Communication Model: The communication features among simulator nodes.
• Energy Model: Models for calculating the energy consumption of the data center or its components.
• VM Support: The presence of models that simulate VMs in the simulator.
• SLA Support: Models for calculating SLA violations.
• Cost Model: Models for the Cloud’s pay-as-you-go cost calculations.
• Network Topology Model: Models that implement data center network topologies, such as ThreeTier, FatTree, DCell, or any other variation.
• Addressing Schemes: Addressing scheme models for each Network Interface Controller (NIC) in the data center topology, e.g., FatTree has a special addressing scheme.
• Congestion Control: The availability of features that detect and mitigate congestion in the data center network.
• Traffic Patterns: Realistic traffic loads for the data center.

TABLE 1. Summarized analysis of cloud simulators.

Data centers constitute the underlying architecture of Cloud computing. Existing simulators are often limited, as they focus on a particular Cloud module. Each simulator usually introduces some limitations to reduce complexity through an abstract implementation of the network and network components. A closer look at realistic data center dynamics reveals that one of the most important components is the data center’s "network".

Performance of Cloud applications depends on the resilience, reliability, and efficiency of the data center network [12]. CloudSim [1] and most of its extensions, such as CloudAnalyst [29], EMUSIM [30], CDOSim [15], and MR-CloudSim [16], have similar network implementations, i.e., topology information is contained in a BRITE file, and the simulators calculate packet transmission times and delays following the principles of graph theory. In this group of simulators, therefore, the network models are very abstract and deviate from realistic scenarios.

CloudSim extensions equipped with a network and network components include NetworkCloudSim [17] and DartCSim+ [18]. The NetworkCloudSim network implementation is also abstract because switches are like magic boxes that only take data from upstream and forward it downstream, and vice versa. There are no queues or buffers on switch ports. Therefore, computed delays deviate from a realistic scenario. There is also no mechanism for tracking packet loss or network congestion. The network topology is defined in a BRITE file, and data center network topologies are not implemented. DartCSim+ is more computation oriented. There is an implementation of the NIC, but it lacks queues. The NIC is mostly used to calculate the power consumed by the network component. Although the simulator has some mechanisms for retransmission of lost packets, congestion is not detected. It has an abstract network implementation. There is no implementation of queues on switches either, and Data Center Network (DCN) topologies are absent.

CloudSim’s extension TeachCloud [19] has various DCN topologies; however, they are abstract, as the purpose of the simulator is to teach Cloud computing. No detailed implementation of the network is present. GreenCloud [20] has a detailed implementation of the network and network components. It supports detection of packet loss and retransmission of lost packets. However, the simulator is more power oriented, focusing on energy calculation for the data center, and uses generic ThreeTier topologies. There is no implementation of VMs, VM scheduling, or VM transfer.

ICanCloud [21], MDCSim [22], GroudSim [23], DCSim [24], SimIC [25], and SPECI [26] are Cloud simulators with either a limited implementation of the network or none at all. All of these simulators lack multiple DCN topologies, such as FatTree, and mechanisms for congestion and packet-loss detection.

FIGURE 1. ThreeTier architecture.

In summary, existing simulators lack the features necessary to provide accurate testing results. Therefore, there is a need for a new simulator with features that help users create simulations quickly and with enough detail to give confidence in a new solution for the Cloud.

III. DATA CENTER ARCHITECTURES
This section briefly introduces two data center architectures, ThreeTier and FatTree, for readers new to the Cloud and explains a major issue with the FatTree architecture’s routing when it is implemented with routers. This section helps in understanding the purpose of each component introduced in our proposed simulator, Nutshell.

A. THREE TIER
ThreeTier is the most commonly deployed data center architecture. The architecture consists of three layers. The first is the lower layer, called the access network layer, which is connected to the resources. The second is the middle layer, called the aggregation layer. The third is the top layer, called the core network. Both the upper and middle layers comprise switches or routers, as shown in Figure 1 [31].

FIGURE 2. FatTree data center architecture with k = 4.

FIGURE 3. Two-level table example. This is the table at switch 10.2.2.1.

In ThreeTier, the computational resources are grouped together in racks. A rack typically holds around 40 computational resources, connected to a switch called the Top of the Rack (ToR) switch [32]. This is the access network, or access layer, of the ThreeTier architecture. Multiple access layer switches are connected to the middle layer routers, introducing redundancy. The aggregate layer is the second layer of routers within the ThreeTier network architecture. The third layer in the architecture is equipped with high-end routers, which are able to process packets faster than the aggregate and access network switches. Each core layer router is connected to all of the routers in the middle layer. A rack’s internal traffic is handled by the ToR switch. Traffic between two racks whose ToR switches are connected to the same aggregate router is handled by that aggregate router. Traffic between racks whose source and destination ToR switches are linked to different aggregate layer routers is forwarded to a core layer router, which is connected to every aggregate router, and the flow is then forwarded to the destination aggregate router [6]. Higher layers of the ThreeTier network architecture experience higher oversubscription ratios. The oversubscription ratio is the ratio of the worst-case available bandwidth between the end hosts to the total bisection bandwidth of the network topology [6].
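The ThreeTier forwarding rules described above can be summarized in a few lines of Python. This is only an illustrative sketch, not Nutshell code: the function name `handling_layer`, the rack and router labels, and the `rack_to_aggregates` mapping are all hypothetical names introduced here.

```python
def handling_layer(src_rack, dst_rack, rack_to_aggregates):
    """Return the highest ThreeTier layer a flow between two racks must reach.

    rack_to_aggregates is a hypothetical mapping from a rack name to the set
    of aggregate routers its ToR switch is connected to.
    """
    if src_rack == dst_rack:
        # Intra-rack traffic never leaves the ToR (access) switch.
        return "access"
    if rack_to_aggregates[src_rack] & rack_to_aggregates[dst_rack]:
        # Both ToR switches share at least one aggregate router.
        return "aggregate"
    # No shared aggregate router: the flow must cross the core layer.
    return "core"


# Hypothetical wiring used only for this example.
uplinks = {"rack1": {"agg1"}, "rack2": {"agg1"}, "rack3": {"agg2"}}
print(handling_layer("rack1", "rack1", uplinks))  # access
print(handling_layer("rack1", "rack2", uplinks))  # aggregate
print(handling_layer("rack1", "rack3", uplinks))  # core
```

The higher a flow has to climb, the more flows share the same uplinks, which is why the higher layers experience higher oversubscription.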

B. FAT TREE
The FatTree data center architecture is a Clos-based arrangement of commodity network switches that delivers a 1:1 oversubscription ratio [6]. As in the ThreeTier data center architecture, the computational resources and commodity network devices are arranged in three layers, but with different interconnectivity patterns and numbers of devices, as shown in Figure 2.

In FatTree, the number of pods, denoted by "k", dictates the connectivity and the number of computational resources and devices on each layer. A pod is a logical organizational unit of FatTree. The core layer in the FatTree architecture has a total of (k/2)^2 switches. The aggregate and access layers each contain k/2 switches in each pod. Each access switch is connected to k/2 computational servers and to k/2 aggregate switches in the same pod. The total number of switches in a single pod is k, i.e., k/2 aggregate and k/2 access switches, arranged in two levels as shown in Figure 2. The FatTree DCN architecture exhibits better scalability, throughput, and energy efficiency than the ThreeTier DCN. The FatTree architecture uses a custom addressing and routing scheme [6].
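The device counts given above follow directly from k. The sketch below tabulates them, assuming the topology has k pods (consistent with the pod range [0, k – 1] used by the addressing scheme). The function name `fattree_size` and the dictionary keys are illustrative and are not part of Nutshell.

```python
def fattree_size(k: int) -> dict:
    """Return device counts for a FatTree with k pods (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be a positive even integer")
    half = k // 2
    return {
        "pods": k,
        "core_switches": half * half,    # (k/2)^2 core switches
        "aggregate_switches": k * half,  # k/2 aggregate switches per pod
        "access_switches": k * half,     # k/2 access switches per pod
        "switches_per_pod": k,           # k/2 aggregate + k/2 access
        "servers": k * half * half,      # k/2 servers per access switch
    }


print(fattree_size(4))
```

For k = 4, the topology of Figure 2, this gives 4 core switches, 8 aggregate switches, 8 access switches, and 16 servers.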



FIGURE 4. Pod connectivity test with router in Cisco packet tracer.

1) FAT TREE ADDRESSING SCHEME
The original FatTree architecture uses an IPv4 addressing scheme in dotted notation; all addresses assigned to devices in the architecture are taken from the 10.0.0.0/8 block. The IPs are assigned under the following conditions:

For pod switches, the address has the form 10.pod.switch.1, where:
• pod is the pod number, in the range [0, k – 1].
• switch is the switch number, in the range [0, k – 1], assigned from left to right and bottom to top.

For core switches, the address has the form 10.k.j.i, where:
• j and i denote the switch’s coordinates in the (k/2)^2 core switch grid, each in the range [1, k/2], assigned from top-left to top-right.

For hosts, the address has the form 10.pod.switch.ID, where:
• ID is the host’s position in that subnet, in the range [2, k/2 + 1], assigned from left to right.

Figure 2 shows examples of this addressing scheme for a FatTree with k = 4.
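The addressing rules above can be enumerated programmatically. The sketch below is not the Nutshell AddressingScheme implementation, and `fattree_addresses` is a hypothetical helper. It also assumes that switch numbers 0 to k/2 – 1 in each pod are the access (bottom) switches, which follows from the bottom-to-top numbering, so that hosts attach only to those subnets.

```python
def fattree_addresses(k: int) -> dict:
    """Enumerate FatTree addresses for pod switches, core switches, and hosts."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be a positive even integer")
    half = k // 2
    # Pod switches: 10.pod.switch.1 with pod and switch in [0, k - 1].
    pod_switches = [f"10.{pod}.{sw}.1" for pod in range(k) for sw in range(k)]
    # Core switches: 10.k.j.i with j and i in [1, k/2].
    core_switches = [
        f"10.{k}.{j}.{i}"
        for j in range(1, half + 1)
        for i in range(1, half + 1)
    ]
    # Hosts: 10.pod.switch.ID with ID in [2, k/2 + 1].
    # Assumption: switches 0 .. k/2 - 1 of a pod are its access switches.
    hosts = [
        f"10.{pod}.{sw}.{host_id}"
        for pod in range(k)
        for sw in range(half)
        for host_id in range(2, half + 2)
    ]
    return {"pod_switches": pod_switches, "core_switches": core_switches, "hosts": hosts}


addresses = fattree_addresses(4)
print(addresses["core_switches"])  # ['10.4.1.1', '10.4.1.2', '10.4.2.1', '10.4.2.2']
print(addresses["hosts"][:2])      # ['10.0.0.2', '10.0.0.3']
```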

2) FAT TREE ROUTING
In the FatTree topology, the routing table is modified to allow a two-level prefix lookup. The main table holds an additional pointer to a small secondary table containing suffix and port entries. The main table is used to match the network prefix. If a match is found, the packet is forwarded to the corresponding port. If the search by network prefix fails, the pointer to the secondary table is followed and the suffix is looked up, as shown in Figure 3. The packet is forwarded to the port whose suffix matches.

The routing algorithm works differently on core switches. On the first two levels of switches in the FatTree, the traffic is filtered, as both layers contain terminating network prefixes for the subnets in their pod. If a source host sends a packet to a destination host that lies in the same pod but has a different subnet address, the upper-layer switches in that pod have a terminating prefix and forward the traffic to the destination subnet’s switch.

For all other outgoing inter-pod traffic, the secondary table is searched. If the suffix matches a table entry, the packet is forwarded to that port. On the core side, only the /16 prefix is checked to forward the packet to its appropriate pod. The pod switch then checks for the terminating prefix and forwards the packet to the destination subnet’s switch.
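The two-level lookup described in this subsection can be sketched with Python's standard `ipaddress` module. The table entries below are illustrative values for an upper-layer pod switch and are not copied from Figure 3. The function name, the table layout, and the use of the last address octet as the suffix are assumptions made for this example.

```python
import ipaddress


def two_level_lookup(dst, prefix_table, suffix_table):
    """Return the output port for dst using a prefix table with a suffix fallback.

    prefix_table: list of (ip_network, port) terminating prefixes.
    suffix_table: dict mapping the host ID (last octet) to a port.
    """
    addr = ipaddress.ip_address(dst)
    # First level: try the terminating prefixes, longest prefix first.
    for network, port in sorted(prefix_table, key=lambda e: e[0].prefixlen, reverse=True):
        if addr in network:
            return port
    # Second level: no prefix matched, so match on the address suffix.
    host_id = int(addr) & 0xFF
    return suffix_table.get(host_id)


# Illustrative tables (assumed values, not taken from the paper's figure).
prefixes = [
    (ipaddress.ip_network("10.2.0.0/24"), 0),
    (ipaddress.ip_network("10.2.1.0/24"), 1),
]
suffixes = {2: 2, 3: 3}

print(two_level_lookup("10.2.0.3", prefixes, suffixes))  # 0 (intra-pod, prefix match)
print(two_level_lookup("10.3.0.2", prefixes, suffixes))  # 2 (inter-pod, suffix match)
print(two_level_lookup("10.3.1.3", prefixes, suffixes))  # 3 (inter-pod, suffix match)
```

Matching inter-pod traffic on the host suffix spreads flows for different destination hosts over different uplinks, which is the load-distribution effect the scheme aims for.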

3) FAT TREE IMPLEMENTATION ISSUE
The FatTree architecture proposed in [6] describes a switched (layer 3) network. Connectivity in FatTree is infeasible when it is implemented using routers. Routers divide the broadcast domain, making the address assignment of the FatTree architecture impractical. To explain the issue, we start with addressing. The servers are arranged in a rack, and a router replaces the ToR switch. Therefore, there is a point-to-point link from the router to each server, and each link has its own broadcast domain. As seen in Figure 4, when the FatTree topology (depicted in Figure 2) is built in Cisco Packet Tracer [33] and the same address is assigned to two different ports of the router for a single pod, an error is shown.

The same pod test has been modified in terms of addressing to split the broadcast domain, as shown in Figure 5. In Figure 5, the assigned addresses are still from the same network, 10.0.0.0/8. The single address block is divided using classless addressing, and each link is given a subnet with a /30 mask.
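A /30 subnet contains exactly two usable addresses, which is what a point-to-point link needs. The sketch below shows one way to carve such subnets out of 10.0.0.0/8 with the standard `ipaddress` module. Handing out subnets sequentially is an assumption for illustration and does not necessarily match the allocation shown in Figure 5. `link_subnets` is a hypothetical helper name.

```python
import ipaddress
from itertools import islice


def link_subnets(block: str, count: int):
    """Return (subnet, endpoint_a, endpoint_b) for the first `count` /30 links."""
    network = ipaddress.ip_network(block)
    plan = []
    # subnets() is lazy, so only the requested number of /30s is generated.
    for subnet in islice(network.subnets(new_prefix=30), count):
        end_a, end_b = subnet.hosts()  # a /30 has exactly two usable hosts
        plan.append((str(subnet), str(end_a), str(end_b)))
    return plan


for subnet, end_a, end_b in link_subnets("10.0.0.0/8", 3):
    print(subnet, end_a, end_b)
# 10.0.0.0/30 10.0.0.1 10.0.0.2
# 10.0.0.4/30 10.0.0.5 10.0.0.6
# 10.0.0.8/30 10.0.0.9 10.0.0.10
```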

This change in the addressing scheme requires changes in the creation of the routing table and in the packet forwarding scheme. Using the secondary table, the suffix of the destination IP address is extracted and matched; the closest matching suffix is found, and the packet is forwarded to the corresponding port. The closest match is taken so that traffic is evenly distributed among the links.

A routing algorithm that creates the routing table and implements packet forwarding is essential and has therefore been developed in Nutshell for the FatTree topology.



FIGURE 5. Pod connectivity test with modified addressing scheme.

IV. NUTSHELL CLOUD SIMULATOR
A. SIMULATION ARCHITECTURE
Findings from the analysis presented in [3] and the related work motivate us to develop a new fine-grained network Cloud simulator that helps users create accurate simulations quickly. It must also help users focus on their own solution by relieving them of the complexities involved in creating Cloud architectures. Nutshell is built on NS-3 [34], which was selected for its salient features:

• Besides simulation, NS-3 can also act as an emulator, which means that the simulation can be run directly on a physical cluster.
• Being a network simulator, NS-3 offers detailed and fine-grained networking models.
• NS-3 imitates near-realistic packets with a dummy payload instead of statistically calculating time spans for each packet.
• NS-3 implements queues on each network component, portraying the exact behavior of a network device and what happens to a packet in different scenarios, such as congestion.

Before explaining Nutshell, its components, and its inner workings, it is best to show how a simulation can be created with Nutshell. The next subsection shows demo code for the ThreeTier and FatTree data center architectures.

1) NUTSHELL SIMULATION CODE USE
We start with example simulation code to illustrate how easy it is to create simulations in Nutshell. All that a Nutshell user needs to do is create and configure an object for the data center architecture and then pass this object to the architecture’s class. Nutshell has a plug-in architecture, which means that any algorithm the user is working on can be passed to the configuration object, and the architecture class will use it while running the simulation. Figures 6 and 7 show example simulation code for the ThreeTier and FatTree data center architectures.

FIGURE 6. ThreeTier code example in Nutshell.

FIGURE 7. FatTree code example in Nutshell.

In the Figure 6 code sample, Lines 1–7 define the link objects used for connections from node to access switch, access to aggregate switch, and aggregate to core. Line 8 declares the configuration object. Line 9 configures node properties with ranges of 500–1000 TFLOPS processing power, 10–30 GB of primary storage, and 1–10 TB of secondary storage. Nodes will be created within these ranges. Line 10 configures general VM properties; the parameters are: number of VMs to create, minimum and maximum processing power, minimum and maximum primary storage, minimum and maximum secondary storage, and minimum and maximum application size, respectively. Lines 11 and 12 configure data requirements for VMs; the parameters are: flag to enable the data requirement, number of VMs that require data from a storage server, distribution of the required data size, minimum amount of data required, maximum amount of data required, minimum hard disk read/write rate, maximum hard disk read/write rate, minimum memory read/write rate, maximum memory read/write rate, minimum and maximum number of processor accesses to memory, minimum and maximum percentage of memory available for data fetching, minimum and maximum hard disk access time, and minimum and maximum memory access time, respectively. Line 13 configures the splitting of a VM in case of resource unavailability and the ratio in which the VM should be divided. Line 14 configures the arrival time of VMs, assigned within the given range. Line 15 sets the network configuration for VMs; the parameters are: Maximum Transmission Unit (MTU) size, protocol, and maximum data rate of the NIC. Line 16 sets the number of storage servers in the topology. Line 19 sets the number of nodes each access switch will have. Line 20 sets the number of switches: the first parameter is the number of access switches, the second is the number of aggregate switches, and the third is the number of core switches. Line 21 sets the number of logical pods. Line 22 sets the network address from which addresses are assigned. Line 23 sets the links declared in Lines 1–7. Line 24 declares the scheduler object, which is assigned to the configuration in Line 25. Lines 26–38 configure data collectors for network and VM statistics. Line 39 creates the ThreeTier topology, passing the configuration object to the constructor. The topology class then begins its work: it creates the topology and gathers simulation data. After the simulation finishes, the data is exported to Excel sheets and text files with the configured prefix string (Line 43).

The Figure 7 code has steps similar to Figure 6. The difference is in Lines 1–3, which create a link in the same way; however, this single link is used for all connections. Line 4 creates the FatTree configuration object, and Line 15 sets the number of pods, which drives the number of nodes, switches, and links in the whole topology.

FIGURE 8. Nutshell software architecture.

2) NUTSHELL COMPONENTS
Figure 8 shows the layered architecture of the Nutshell simulator. The bottom layer is NS-3, which provides the core functionalities of network elements and protocols at a granular level. The next layer, Utilities, takes care of converting data used by different models, performs basic time calculations, and is used to represent resources in other models of the simulator. The Utilities layer includes Processing Power, Storage, and Application Size.

The Component layer is placed above Utilities. Models in this layer use utilities to define their resources. This layer’s models implement basic components of a data center. The Computational Node represents a node in the data center with resources. The Virtual Machine (computational and network capable) implements virtualization and executes user applications of a particular application size. Routing Protocols are additional protocols, like FatTree routing, that an architecture needs to enable successful communication. Addressing Schemes dynamically create IP addresses to fit a particular scenario. The Storage Server application enables a node to become a storage server and transmit data to its clients.

Helpers constitute the next layer and are added to reduce the complexity of creating and managing Nutshell objects. Helpers provide an interface for working easily with the lower layer of Nutshell components. Helpers include the Computational Node Helper for computational nodes, the Virtual Machine Helper for interaction with VM models, the Virtual Machine Container to contain and manage the created VMs, the FatTree Routing Helper to handle the FatTree routing model, the Connectivity Helper to minimize repeated blocks of code and help create and configure basic connectivity between nodes and switches, and Access Network to create a fully configured access network.

The Architecture Configurations layer provides configuration classes specific to each architecture. Users can configure the object according to their scenario. DCN Architecture classes use the configured object to create the network and configure themselves. The Scheduler layer contains models used to schedule VMs on computational nodes. The Data Collector exists on the side, containing classes that collect statistics related to nodes and the data center network in a simulation.

B. SIMULATING CLOUD IN NUTSHELL
The Cloud uses a data center infrastructure comprising hundreds of thousands of computational nodes interconnected via a communication medium [3]–[5]. To simulate the Cloud in Nutshell, we start with the basic building block, the computational node. Figure 9 shows the high-level architecture of the Computational Node. Nutshell utilities are used for defining computational node resources. As discussed earlier, utilities are used to define resources, convert human-readable units into base units, and perform certain calculations.

Processing power calculates the amount of time it willtake to execute an application. The application size holdsthe number of instructions/floating point operations (FLOPS)and is used by the processing power utility to calculate thetime, using (1).

$$E = \frac{n}{p} + \Delta, \tag{1}$$

where E represents the expected execution time, n is the number of instructions, p is the processing power, and Δ is an optional parameter representing any extra delay caused by perturbations and interference. Users can specify the value of Δ based on their simulation scenario. The default value of Δ is zero. The utilities’ acceptable formats and example mapped values are listed in Table 2.

FIGURE 9. Computational node high level architecture.
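Reading Equation (1) as E = n/p + Δ, with n the number of instructions (or floating point operations), p the processing power in operations per second, and Δ the optional extra delay, the calculation can be sketched as follows. The function and parameter names are illustrative and are not the Nutshell utility API; the workload size in the example is an arbitrary assumed value.

```python
def expected_execution_time(instructions: float,
                            processing_power: float,
                            extra_delay: float = 0.0) -> float:
    """Return E = n / p + delta in seconds.

    instructions: application size n (number of operations).
    processing_power: p, in operations per second.
    extra_delay: optional delta for perturbations and interference (default 0).
    """
    if processing_power <= 0:
        raise ValueError("processing_power must be positive")
    return instructions / processing_power + extra_delay


# Assumed example: a 2e12-operation application on a 500 TFLOPS node.
print(expected_execution_time(2e12, 500e12))        # execution time with delta = 0
print(expected_execution_time(2e12, 500e12, 0.5))   # same job with a 0.5 s extra delay
```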

In Figure 9, the internet stack contains the list of protocols, and a protocol such as the FatTreeRouting protocol is placed here. NetDevices are part of NS-3. The final component in Figure 9 is the VMs, which run on the computational node. VMs are discussed in the next subsection. Creating and managing individual ComputationalNode objects is difficult, especially when hundreds of thousands of nodes are required; therefore, the ComputationalNodeContainer helper assists in creating a large number of computational nodes. This helper, defined in Nutshell, can create the required number of computational nodes with a heterogeneous or homogeneous resource configuration.

1) VIRTUAL MACHINES
Cloud computing relies heavily on virtualization technology. Virtualization enables a single physical machine to be used as multiple virtual machines by sharing resources to maximize resource utilization [35]–[37]. Nutshell implements virtualization and provides models for VM creation and execution. The VM model class inheritance diagram is shown in Figure 10.

In Figure 10, the VirtualMachine class defines basic resource requirements, i.e., processing power, storage, and application size. With this model, a user can create a VM that only runs a computational application. For an application that requires data locally available on the VM’s storage, the ComputationalLocalDataVM model is used. This model requires the user to set more properties, such as the application’s required data size, access times, read/write rates for both primary and secondary storage, and the amount of primary storage available for data fetching. Data fetching is sequential. Applications whose parts execute simultaneously and collaborate to complete the whole job require network access [2], [23], [35], for which NetworkVM and its extensions are used. The NetworkVM class is a base class that defines properties and methods to be used by its extensions. These include sockets (listening and transmitting), the transmission data rate for the NIC, the data source, the data source IP and port, etc. The models that are actually used to simulate collaborating application parts are ProducerVM, ConsumerProducerVM, and ConsumerVM.

TABLE 2. Nutshell utilities acceptable formats and mapped value.

FIGURE 10. Virtual machine class inheritance in Nutshell.

Cloud applications/jobs can be represented as workflows. Workflows are compositions of tasks combined with precedence constraints and are modeled by Directed Acyclic Graphs (DAGs) [38]. A DAG is a directed graph, i.e., a collection of vertices and edges, with no cycles or loops [39]. VMs in the Cloud collaborate and complete tasks, gradually advancing towards the end result, similar to a DAG traversal. A VM that belongs to a job may depend on another VM for data, as shown in Figure 11.

In Figure 11, VM-B depends on VM-A for data, for which the ConsumerProducerVM model is used in Nutshell, since VM-B first consumes data, completes its application execution, and then produces data for another VM. VM-C depends on VM-B, for which the ConsumerVM model is used. VM-A is implemented with ProducerVM, as it only produces data for another VM after its application execution. Currently in Nutshell, a VM that consumes data has to wait until the data has been received completely before continuing with its own execution. To ensure complete data transfer, a retransmission mechanism has been implemented. To make it easier to manage VMs of different types and their installed applications, and to control VMs, Nutshell provides VirtualMachineHelper.

FIGURE 11. Job split into three VMs.
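The mapping from a position in the job chain of Figure 11 to a VM model can be sketched in a few lines of Python. This is only an illustration, not Nutshell code: the function name `assign_roles` is hypothetical, the job is assumed to be a simple linear chain, and returning the plain `VirtualMachine` model for an unsplit job is an assumption.

```python
def assign_roles(num_parts):
    """Return a VM model name for each part of a job split into a linear chain.

    Hypothetical helper: the first part only produces data, the last part only
    consumes data, and every part in between consumes and then produces.
    """
    if num_parts < 1:
        raise ValueError("a job needs at least one part")
    if num_parts == 1:
        # Assumption: an unsplit job needs no collaboration at all.
        return ["VirtualMachine"]
    roles = []
    for index in range(num_parts):
        if index == 0:
            roles.append("ProducerVM")
        elif index == num_parts - 1:
            roles.append("ConsumerVM")
        else:
            roles.append("ConsumerProducerVM")
    return roles


# VM-A, VM-B, VM-C from Figure 11
print(assign_roles(3))
```

For the three-VM job of Figure 11 this yields ProducerVM for VM-A, ConsumerProducerVM for VM-B, and ConsumerVM for VM-C.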

2) STORAGE SERVER

VMs may require data from a storage server in the data center. Nutshell has an application model (StorageServer) that turns a computational node into a storage server. The server listens for incoming requests and responds by sending data to the consumer. The receiver of the data sends the amount of data it requires. A scenario in which a VM requests data from a storage server and the storage server responds is depicted in Figure 12.

So far, we have discussed a computational node and how it can be used to run different VMs or a server application. The next step is to connect different nodes to create a data center topology. Nutshell makes it easy to connect and organize the whole architecture with helpers such as ConnectivityHelper and AccessNetwork. To manage the distribution of IP addresses to networks, AddressingScheme is used in Nutshell.



FIGURE 12. Storage server communication with VM.

FIGURE 13. Basic access network.

3) CONNECTIVITY HELPER

Creating connections is the most frequent operation in a network simulation script and therefore accounts for much of the script's size. Extracting the connectivity code into a separate class makes it more manageable and easier to debug.

The connectivity helper can create connections between both layer 2 and layer 3 devices. A layer 2 connection is a multiport bridge connection with Carrier Sense Multiple Access (CSMA) enabled links, dividing only the collision domain. A layer 3 connection connects a router to other devices (a router, node, or layer 2 device) via point-to-point links.

4) ACCESS NETWORK

The common part of the ThreeTier, FatTree, and DCell architectures is the access network (or cell). The access network is a star topology: each link has a computational node at one end and a switch or router at the other end, as shown in Figure 13. The AccessNetwork class in Nutshell models this topology and uses the connectivity helper to create the connection to each node.

TABLE 3. List of network addresses created by AddressingScheme.

5) ADDRESSING SCHEME

Different data center architectures have different addressing schemes. In FatTree specifically, the routing is dictated by the addressing scheme. The AddressingScheme class creates subnets from a single address and maintains a list of the created network addresses. The subnets are created with an equal number of hosts. For instance, for 2 hosts and 4 networks with the address 10.0.0.0/24, the scheme creates 4 network addresses, each with 2 addresses available for hosts, 1 for the network address, and 1 for broadcast. The resulting addresses are shown in Table 3.
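The subnetting in this example can be reproduced with Python's standard `ipaddress` module. The sketch below is not the AddressingScheme implementation: `create_subnets` is a hypothetical name, and it assumes that the subnets are carved contiguously from the start of the base network, which yields /30 subnets for 2 hosts.

```python
import ipaddress
import itertools


def create_subnets(base, hosts_per_network, num_networks):
    """Carve equally sized subnets out of `base` (hypothetical helper)."""
    base_net = ipaddress.ip_network(base)
    # Each subnet needs the host addresses plus a network and a broadcast address.
    needed = hosts_per_network + 2
    host_bits = (needed - 1).bit_length()  # smallest power of two >= needed
    new_prefix = base_net.max_prefixlen - host_bits
    subnets = list(
        itertools.islice(base_net.subnets(new_prefix=new_prefix), num_networks)
    )
    if len(subnets) < num_networks:
        raise ValueError("base network is too small for the requested subnets")
    return subnets


for net in create_subnets("10.0.0.0/24", hosts_per_network=2, num_networks=4):
    hosts = [str(h) for h in net.hosts()]
    print(net, "hosts:", hosts, "broadcast:", net.broadcast_address)
```

Under these assumptions the four networks are 10.0.0.0/30, 10.0.0.4/30, 10.0.0.8/30, and 10.0.0.12/30, each with two usable host addresses.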

There are scenarios in which two consecutive top-level networks differ by more than one network, e.g., the first network address is 10.0.0.0/24 and the required second network is 10.0.3.0/24. For such scenarios, the class provides a function to add jumps to network addresses, i.e., if the first network address is 10.0.0.0/24 and the jump is 10 networks, then the next network address is 10.0.11.0/24. A jump vector is required for creating network addresses with the required jumps. Consider the following scenario, in which three further network addresses are required after the first:

• The second network address comes 5 networks after the first (jump = 5).

• The third network address is the one immediately after the second (jump = 0).

• The fourth network address comes 10 networks after the third (jump = 10).

The jump vector becomes [5, 0, 10], and is provided to the class to generate the network addresses.
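The effect of a jump vector can be illustrated with a short Python sketch. The function name `addresses_with_jumps` is hypothetical; the only behavior taken from the text is that a jump of j skips j networks, so the next network is j + 1 network blocks further on (a jump of 10 from 10.0.0.0/24 gives 10.0.11.0/24).

```python
import ipaddress


def addresses_with_jumps(first, jumps):
    """Generate network addresses starting at `first`, applying each jump in turn."""
    first_net = ipaddress.ip_network(first)
    block = first_net.num_addresses  # size of one network, e.g. 256 for a /24
    networks = [first_net]
    index = 0
    for jump in jumps:
        index += jump + 1  # skip `jump` networks, then take the next one
        address = first_net.network_address + index * block
        networks.append(ipaddress.ip_network(f"{address}/{first_net.prefixlen}"))
    return networks


print([str(n) for n in addresses_with_jumps("10.0.0.0/24", [5, 0, 10])])
```

With the jump vector [5, 0, 10] and a first network of 10.0.0.0/24, this sketch produces 10.0.6.0/24, 10.0.7.0/24, and 10.0.18.0/24 as the second, third, and fourth networks.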

C. APPLICATION EXECUTION DETAILS

In the previous section, we discussed VMs and their execution of a user's application in the Cloud. In Nutshell, a VM first executes the application and then either generates an output or terminates itself. This section explains how the application execution time is calculated. For a computational application, the calculation is simple: the execution time is given by the processing power utility in (1). If the user application requires data, the time calculation is different.

To calculate this time, it is essential to understand the data fetch process, which is shown in Figure 14. In an AMD processor, the chipset circuit is on the CPU. The data is brought from the Hard Disk Drive (HDD) to Random Access Memory (RAM) and then fetched by the processor. The VM calculates the time required to access all data used by the application, and the time it takes to execute the application as a whole, using (2)–(8).

FIGURE 14. Transition of data from HDD to CPU.

$$\mathrm{RAM}_a = \frac{P_s \times \mathrm{RAM}_D}{100}. \tag{2}$$

$\mathrm{RAM}_a$ reflects the available RAM, $\mathrm{RAM}_D$ is the percentage of RAM to be used for data fetching, and $P_s$ represents the size of primary storage for the VM.

$$N_{MEM \to HDD} = \frac{D_p}{\mathrm{RAM}_a}. \tag{3}$$

where $N_{MEM \to HDD}$ is the number of accesses from memory to hard drive, $D_p$ is the data to be transferred for processing, and $\mathrm{RAM}_a$ is the available RAM from (2).

$$D_{RAM \to PROC} = \frac{D_p}{n_a}. \tag{4}$$

where $D_{RAM \to PROC}$ refers to the size of data that is transferred from RAM to the processor in one access, $n_a$ is the average number of accesses from processor to RAM during the program execution, and $D_p$ is the data to be processed. The values of the memory and hard disk read/write rates are given in MB/s and must be converted to B/s using (5):

$$r = v \times 2^{20}, \tag{5}$$

where $r$ is the rate in B/s and $v$ is the read/write rate provided in MB/s. The total time taken to transfer the whole data from hard disk to memory is given by (6):

$$T_{HDD \to MEM} = N_{MEM \to HDD} \times \left( AT_{HDD} + \frac{\mathrm{RAM}_a}{r_{HDD}} \right). \tag{6}$$

In (6), the following parameters are used:

• $T_{HDD \to MEM}$: Time required to transfer all data from hard disk to memory (in seconds).

• $N_{MEM \to HDD}$: Number of accesses required to transfer all data during program execution.

• $AT_{HDD}$: The access time for hard disk (in seconds).

• $\mathrm{RAM}_a$: Amount of RAM available for data storage.

• $r_{HDD}$: The data read/write rate for hard disk.

To calculate the time taken to transfer data from RAM to the processor, the size of the data moved on each access, $D_{RAM \to PROC}$, is first calculated using (4). The total time is then calculated using (7):

$$T_{MEM \to PROC} = N_{access} \times \left( AT_{MEM} + \frac{D_{RAM \to PROC}}{r_{MEM}} \right). \tag{7}$$

In (7), the parameters used are:

• $T_{MEM \to PROC}$: Time required to transfer all data from memory to processor (in seconds).

• $N_{access}$: The number of accesses to memory by the processor.

• $AT_{MEM}$: The access time for memory to locate the data in the array (in seconds).

• $D_{RAM \to PROC}$: The amount of data transferred in a single access to memory by the processor.

• $r_{MEM}$: The read/write rate for memory.

FIGURE 15. Class hierarchy of data center configuration.

Finally, (8) gives the total time taken to transfer the data from hard disk to processor:

$$T = T_{HDD \to MEM} + T_{MEM \to PROC}. \tag{8}$$
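Equations (2)–(8) chain together into a single calculation, which the following Python sketch walks through. It is a direct transcription of the formulas and not Nutshell code: the function and parameter names are hypothetical, sizes are assumed to be in bytes, and $N_{access}$ in (7) is assumed to be the same quantity as $n_a$ in (4). The input values are invented purely for illustration.

```python
MIB = 2 ** 20  # conversion factor from MB/s to B/s, as in (5)


def data_fetch_time(primary_storage, ram_percent, data_size, n_access,
                    at_hdd, at_mem, hdd_rate_mb, mem_rate_mb):
    """Total HDD -> RAM -> processor transfer time in seconds, following (2)-(8)."""
    ram_a = primary_storage * ram_percent / 100               # (2)
    n_mem_hdd = data_size / ram_a                             # (3)
    d_ram_proc = data_size / n_access                         # (4)
    r_hdd = hdd_rate_mb * MIB                                 # (5)
    r_mem = mem_rate_mb * MIB                                 # (5)
    t_hdd_mem = n_mem_hdd * (at_hdd + ram_a / r_hdd)          # (6)
    t_mem_proc = n_access * (at_mem + d_ram_proc / r_mem)     # (7)
    return t_hdd_mem + t_mem_proc                             # (8)


# Illustrative values only: 1 GiB of primary storage with 50% usable for
# fetching, 2 GiB of data, 1024 processor-to-RAM accesses.
total = data_fetch_time(
    primary_storage=2 ** 30, ram_percent=50, data_size=2 ** 31, n_access=1024,
    at_hdd=0.01, at_mem=1e-7, hdd_rate_mb=128, mem_rate_mb=2048,
)
print(f"{total:.7f} s")
```

With these inputs, (6) contributes 4 × (0.01 + 4) = 16.04 s and (7) contributes 1024 × (10⁻⁷ + 2⁻¹⁰) ≈ 1.0001 s, so the total is about 17.04 s.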

D. DATA CENTER ARCHITECTURES

Nutshell's utilities, components, and helpers support creating a data center simulation from scratch. However, for a user working on a particular Cloud problem, e.g., a routing protocol, creating the data center architecture is extra work. Nutshell therefore provides configurable data center architectures out of the box. As seen in the subsection Nutshell Simulation Code Use, the user only has to configure a configuration class object and pass it to the architecture class object, and the whole architecture adopts this configuration.

1) CONFIGURATION CLASSES

The configuration classes used in Nutshell are the main data center configuration class (DatacenterConfig), the ThreeTier configuration class (ThreeTierConfig), the FatTree configuration class (FatTreeConfig), and the DCell configuration class (DCellConfig). Their inheritance is shown in Figure 15.

DatacenterConfig: The data center configuration class is responsible for holding the configuration of computational nodes and VMs. Configuration data structures set a range (minimum and maximum value) for processing power, primary storage, and secondary storage. A VM needs additional configuration, which includes the application size (minimum and maximum), the read/write rates for primary and secondary storage, access times, the amount of data required, and the percentage of primary storage available for storing fetched data. The configuration also has other attributes, such as whether to enable splitting a VM (a whole job) into multiple VMs, the maximum number of splits, the split ratio (which part gets how much of the application size), and the arrival time range.

ThreeTierConfig: This is the configuration class specific to the ThreeTier architecture. It holds:

• Number of core, aggregate, and access routers.

• Number of nodes per access router.

• Number of pods (only logical, no effect on number of routers or nodes).

• A base network address (network address and subnet mask).

• Point-to-point connections for all three levels.

• An internet stack.

• Addressing scheme, if a custom addressing scheme is to be applied.

• VM scheduler.

FatTreeConfig: This is the configuration holder for the FatTree architecture. It holds:

• Number of pods to create.

• Point-to-point link for connections.

• An internet stack.

• A custom routing object (default is FatTreeIpv4RoutingProtocol).

• Addressing scheme (default is FatTreeAddressingScheme).

• A base network address (network address and subnet mask).

• VM scheduler.
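The shape of these configuration holders can be sketched with Python dataclasses. This is an illustrative stand-in only: Nutshell's classes are not written in Python, every field name and default value below is hypothetical, only a subset of the listed attributes is modeled, and the inheritance from a common DatacenterConfig base is assumed from the description of Figure 15.

```python
from dataclasses import dataclass
from typing import Tuple

Range = Tuple[float, float]  # (minimum, maximum)


@dataclass
class DatacenterConfig:
    """Node and VM settings shared by all architectures (placeholder values)."""
    processing_power: Range = (1000.0, 4000.0)
    primary_storage: Range = (1024.0, 8192.0)
    secondary_storage: Range = (10240.0, 102400.0)
    application_size: Range = (100.0, 1000.0)
    arrival_time: Range = (0.0, 60.0)
    enable_split: bool = False
    max_splits: int = 1
    split_ratio: Tuple[int, ...] = (1,)


@dataclass
class ThreeTierConfig(DatacenterConfig):
    num_core: int = 2
    num_aggregate: int = 4
    num_access: int = 8
    nodes_per_access: int = 10
    num_pods: int = 2  # logical only
    base_network: str = "10.0.0.0/24"


@dataclass
class FatTreeConfig(DatacenterConfig):
    num_pods: int = 4
    base_network: str = "10.0.0.0/8"


config = ThreeTierConfig(num_access=8, nodes_per_access=10, enable_split=True,
                         max_splits=3, split_ratio=(1, 2, 3))
print(config)
```

The point of the design is that an architecture object only needs one configured object: the shared node and VM ranges travel with the architecture-specific counts.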

2) THREE TIER ARCHITECTURE

A configured ThreeTierConfig object is passed to a ThreeTier class object to create the architecture. The object calculates the number of computational nodes and creates them, setting processing power, primary storage, and secondary storage from the defined ranges. Equation (9) gives the total number of nodes.

$$T_n = N_{acc\,sw} \times n_{acc}. \tag{9}$$

where $T_n$ is the total number of nodes, $N_{acc\,sw}$ is the number of access switches, and $n_{acc}$ is the number of nodes per access switch. The total number of networks and their differences (jumps) are calculated next, according to the addressing scheme. Equations (10) and (11) calculate the number of aggregate and access switches per logical pod:

$$p_{agg} = \frac{N_{agg\,sw}}{N_{pods}} \tag{10}$$

$$p_{acc} = \frac{N_{acc\,sw}}{N_{pods}} \tag{11}$$

where $N_{agg\,sw}$ is the number of aggregate switches, $N_{acc\,sw}$ is the number of access switches, and $N_{pods}$ is the number of logical pods. Equation (12) gives the number of links between aggregate routers ($L_{Agg \to Agg}$), each of which is a two-way connection between adjacent routers:

$$L_{Agg \to Agg} = N_{pods} \times \left( (2 \times p_{agg}) - 2 \right) \tag{12}$$

The number of links between aggregate and access routers ($L_{Agg \to Acc}$) is calculated using (13):

$$L_{Agg \to Acc} = N_{pods} \times \left( p_{agg} \times p_{acc} \right) \tag{13}$$

The total number of networks is calculated using (14):

$$t_n = N_{core\,sw} + L_{Agg \to Agg} + N_{agg\,sw} + N_{acc\,sw} \tag{14}$$

where $N_{core\,sw}$ is the number of core switches. The addressing scheme for ThreeTier creates the correct network addresses for each layer, which requires the correct network differences (jumps) between consecutive network addresses. Algorithm 1 creates the jump vector for the addressing scheme to use. Each address created is a subnet of 256 host addresses, numbered 0–255.
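The counts in (9)–(14) and the jump vector of Algorithm 1 can be traced with a small Python sketch. The function names are hypothetical, the switch counts are assumed to divide evenly by the number of pods, and the loop in Algorithm 1 is read as covering indices 0 to totalNetworks − 1. The example topology is invented for illustration.

```python
def three_tier_counts(n_core, n_agg, n_acc, nodes_per_acc, n_pods):
    """Evaluate equations (9)-(14) for a ThreeTier topology."""
    total_nodes = n_acc * nodes_per_acc                      # (9)
    p_agg = n_agg // n_pods                                  # (10)
    p_acc = n_acc // n_pods                                  # (11)
    l_agg_agg = n_pods * ((2 * p_agg) - 2)                   # (12)
    l_agg_acc = n_pods * (p_agg * p_acc)                     # (13)
    total_networks = n_core + l_agg_agg + n_agg + n_acc      # (14)
    return {
        "total_nodes": total_nodes, "p_agg": p_agg, "p_acc": p_acc,
        "l_agg_agg": l_agg_agg, "l_agg_acc": l_agg_acc,
        "total_networks": total_networks,
    }


def three_tier_jumps(total_networks, n_acc, n_agg, agg_to_agg_links, n_core,
                     nodes_per_acc, acc_per_pod):
    """Build the jump vector following the limits used in Algorithm 1."""
    first_limit = n_acc
    second_limit = first_limit + n_agg
    third_limit = second_limit + agg_to_agg_links
    fourth_limit = third_limit + n_core
    jumps = []
    for i in range(total_networks):
        if i < first_limit:
            jumps.append(nodes_per_acc)
        elif i < second_limit:
            jumps.append(acc_per_pod)
        elif i < third_limit:
            jumps.append(0)
        elif i < fourth_limit:
            jumps.append(n_core)
    return jumps


counts = three_tier_counts(n_core=2, n_agg=4, n_acc=8, nodes_per_acc=10, n_pods=2)
jumps = three_tier_jumps(counts["total_networks"], 8, 4, counts["l_agg_agg"], 2,
                         10, counts["p_acc"])
print(counts)
print(jumps)
```

For 2 core, 4 aggregate, and 8 access switches in 2 logical pods with 10 nodes per access switch, the sketch gives 80 nodes, 4 aggregate-to-aggregate links, 16 aggregate-to-access links, and 18 networks.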

The topology is created in the next step, starting with the basic access network. Aggregate and core switches are created and connected together, starting with the connections from aggregate to access routers. The two-way aggregate-to-aggregate connections inside the logical pods are created next, skipping the connection of border aggregate routers. Finally, each core router is connected to each aggregate router, as dictated by the topology. With the topology created, the VM scheduler initiates the scheduling of VMs.

3) FAT TREE ARCHITECTURE

A configured FatTreeConfig object is used by FatTree to create the topology. The value for pods (k) is used to calculate the number of routers and nodes. Addresses are generated by a FatTreeAddressingScheme object. Topology creation starts with creating the pods. Inside the pod creation process, nodes are created; their total number is calculated using (15):

$$t_n = \frac{p^3}{4} \tag{15}$$

where $t_n$ is the total number of nodes and $p$ is the number of pods. The access network and aggregate routers are created next for the pod, and the access and aggregate routers of the pod are connected. This is repeated for every pod, [0 – (k−1)]. Once connectivity is established, the aggregate routers are ready to connect to core routers. The aggregate routers are connected to k/2 core switches. After the process is finished, the VM scheduler is called to begin its job of assigning VMs to the computational nodes.

FatTree Addressing Scheme: From the discussion in the subsection FatTree Implementation Issue, it was concluded that the addressing scheme of the FatTree architecture does not work in a router-based architecture. To map an almost similar addressing scheme, the network subnets are created with a subnet mask of /30. An example is shown in Figure 16 for Pod 0. With the change in addressing scheme, the routing must also be updated, as discussed next.
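The sizing of a FatTree from the pod count k can be sketched as follows. Only the node count from (15) and the k/2 core connections per aggregate router come from the text above; the remaining counts are the standard values for a k-ary fat-tree as described in [6], and the function name `fat_tree_sizes` is hypothetical.

```python
def fat_tree_sizes(k):
    """Element counts for a k-pod fat-tree (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return {
        "pods": k,
        "nodes": k ** 3 // 4,               # equation (15)
        "nodes_per_pod": k ** 2 // 4,       # standard fat-tree value
        "access_per_pod": half,             # standard fat-tree value
        "aggregate_per_pod": half,          # standard fat-tree value
        "core": half * half,                # standard fat-tree value
        "core_links_per_aggregate": half,   # k/2, as stated in the text
    }


print(fat_tree_sizes(4))
```

For k = 4 this gives 16 nodes, 2 access and 2 aggregate routers per pod, and 4 core routers.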

Algorithm 1 ThreeTier Addressing Scheme Jumps Algorithm
Input: totalNetworks, which is the total number of networks to be created
Output: jumps array with addresses for each Access Network of ThreeTier

    /* numAccSwitches is the number of access switches */
    firstLimit ← numAccSwitches;
    /* numAggSwitches is the number of aggregate switches */
    secondLimit ← firstLimit + numAggSwitches;
    /* aggToAggLinks is the number of links between aggregate to aggregate switches */
    thirdLimit ← secondLimit + aggToAggLinks;
    /* numCore is the number of core switches in topology */
    fourthLimit ← thirdLimit + numCore;
    /* counter for jump array */
    jc ← 0;
    /* jumps array to be returned at the end */
    jumps[n];
    for i ← 0 to totalNetworks do
        if i < firstLimit then
            /* numNodesPerAcc is the number of nodes per access switch */
            jumps[jc] ← numNodesPerAcc;
        if i >= firstLimit AND i < secondLimit then
            /* numAccPerPod is the number of access switches per pod */
            jumps[jc] ← numAccPerPod;
        if i >= secondLimit AND i < thirdLimit then
            jumps[jc] ← 0;
        if i >= thirdLimit AND i < fourthLimit then
            jumps[jc] ← numCore;
        jc++;
    return jumps;

4) FAT TREE ROUTING

FatTreeIpv4RoutingProtocol creates the routing table for FatTree and uses it to forward traffic. After initialization, the protocol finds the router index. The index helps determine whether the router is a core router or not. A UDP socket is created for each interface to send and receive updates. Algorithm 2 shows the initialization of the routing protocol.

During the initialization, an update is sent to each connected node. The update algorithm collects each interface address and calculates the network address (i.e., prefix) and the number of ports. For each interface, a routing table entry object is created and an update packet is sent to each of the neighbors attached to it. Upon receiving an update, the node calculates the routing table. The table created is different for core routers and pod routers (aggregate and access). The core routers use a prefix length of /16, while the pod routers use a prefix length of /30.

Algorithm 2 FatTree Routing Protocol Initialization

    /* default port for FatTree routing protocol */
    FAT_PORT ← 2222;
    /* GetAllPorts() refers to NS3 API method to get node's all ports */
    routerPorts ← GetAllPorts();
    portAddress[n];
    socketList[n];
    receivingSocket ← 0;
    /* get address of each port */
    for i ← 0 to routerPorts.Size() do
        portAddress[i] ← routerPorts[i].getAddressByMask("0.0.255.0");
    /* create socket to connected ports */
    for i ← 0 to routerPorts.Size() do
        for j ← 0 to portAddress.Size() do
            broadCastAddr ← portAddress[j].getBroadCast();
            if InterfaceScope == GLOBAL then
                socket ← UdpSocket.CreateSocket(portAddress[j], FAT_PORT);
                socketList.push_back(socket);
    if notSet(receivingSocket) then
        receivingSocket ← UdpSocket.CreateSocket(any, FAT_PORT);
        receivingSocket.SetCallback(populateTable);

After network convergence, traffic can be forwarded. The lookup method searches the routing table for possible routes, and upon a successful search the packet is forwarded to the next hop or destination. Algorithm 3 shows the route lookup of the routing protocol.

Routing Helper: The helper's main objective is to create the routing protocol object and aggregate it to the node it is assigned to.
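The role-dependent prefix matching used during lookup can be illustrated with a deliberately simplified Python sketch. It is not FatTreeIpv4RoutingProtocol: the function `lookup`, the routing table layout, and the interface names are hypothetical, and the sketch covers only the prefix comparison (/16 on core routers, /30 on pod routers), leaving out the suffix matching and closest-match selection of the full lookup.

```python
import ipaddress


def lookup(destination, routes, is_core):
    """Return the next hop for `destination`, or None if the packet is discarded.

    `routes` is a list of (network, next_hop) pairs. Core routers compare the
    destination under a /16 prefix, pod routers under a /30 prefix.
    """
    prefix_len = 16 if is_core else 30
    dest_network = ipaddress.ip_network(f"{destination}/{prefix_len}", strict=False)
    for network, next_hop in routes:
        if ipaddress.ip_network(network) == dest_network:
            return next_hop
    return None


core_routes = [("10.0.0.0/16", "if0"), ("10.1.0.0/16", "if1")]
pod_routes = [("10.0.0.0/30", "if0"), ("10.0.0.4/30", "if1")]

print(lookup("10.1.2.6", core_routes, is_core=True))
print(lookup("10.0.0.6", pod_routes, is_core=False))
print(lookup("10.0.0.9", pod_routes, is_core=False))
```

A destination that matches no entry returns None, which corresponds to discarding the packet.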

Algorithm 3 FatTree Routing Protocol Route Lookup

    pathFound ← false;
    if IsRouterCore then
        packetDestNetwork ← packetDestAddr.GetNetwork("0.0.255.255");
        for i ← 0 to prefixNetworkList.Size() do
            if packetDestNetwork == prefixNetworkList[i] then
                paths.push_back(prefixNetworkList[i]);
                pathFound ← true;
    else
        packetDestNetwork ← packetDestAddr.GetNetwork("0.0.0.255");
        for i ← 0 to prefixNetworkList.Size() do
            if packetDestNetwork == prefixNetworkList[i] then
                paths.push_back(prefixNetworkList[i]);
                pathFound ← true;
    if pathFound then
        destSuffix ← packetDestAddr.GetSuffix("255.255.255.0");
        for i ← 0 to suffixList.Size() do
            if destSuffix == suffixList[i] then
                paths.push_back(suffixList[i]);
                pathFound ← true;
    if pathFound then
        route ← findCloseMatch(packetDestAddr, paths);
        packet.route(route);
    else
        packet.discard();

E. VIRTUAL MACHINE SCHEDULERS

In the previous section, we discussed the data center architectures in Nutshell. To introduce user applications and schedule VMs, one of the global VM schedulers in Nutshell is used. The global VM schedulers available are (a) First Come First Serve–First Fit, (b) Shortest Job First–First Fit, and (c) Longest Job First–First Fit. Each scheduler requires a DatacenterConfig object. In addition to the configuration object, the class requires a ComputationalNodeContainer of the data center's computational nodes and an Ipv4InterfaceContainer of these nodes. The class creates a list of VMs, configuring each VM's resources, application size, arrival time, required data amount, read/write rates, access times for HDD and RAM, memory percentage for data fetching, and transmission rate using the random distribution defined in the configuration. To schedule in the right order, the list is sorted by arrival time. The user can configure VMs to split. VM splitting is an idea from grid computing, where a job is divided across different nodes for faster execution [2], [35]. When the configuration allows splitting, the scheduler needs the split ratio. For instance, the ratio 1:2:3 is translated as: execute (1/6 × VM resources) on one computational node, (2/6 × VM resources) on a second node, and (3/6 × VM resources) on a third node.
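The translation of a split ratio into per-node shares is simple arithmetic, sketched below in Python. The function name `split_by_ratio` is hypothetical, and exact fractions are used here only so that the shares always add up to the original amount.

```python
from fractions import Fraction


def split_by_ratio(total, ratio):
    """Divide `total` resource units into shares proportional to `ratio`."""
    denominator = sum(ratio)
    return [Fraction(total * part, denominator) for part in ratio]


shares = split_by_ratio(600, [1, 2, 3])
print([str(share) for share in shares])
```

For a VM with 600 resource units and the ratio 1:2:3, the three parts receive 100, 200, and 300 units, i.e., 1/6, 2/6, and 3/6 of the VM's resources.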

The scheduler also creates a number of storage servers, as set in the configuration object. The created servers are installed on the last nodes, by index, in the computational node container. The scheduler then initializes each VM and schedules it on a computational node that meets its requirements. Next, we discuss the VM schedulers available in Nutshell.

Algorithm 4 FCFS-FF Begin Scheduling

    if vmList.Size() == 0 then
        CreateVmList();
        SortVmListByTime();
    if numOfStorageServer > 0 AND dataSource == STORAGE_SERVER then
        CreateStorageServers();
    if VmRequireData then
        if distribution == RANDOM then
            /* randomly selects VMs to set data source as storage server */
            for i ← 0 to numOfStorageServer do
                randomIndex ← RandomValue(0, numOfStorageServer);
                vmList[randomIndex].dataSource ← STORAGE_SERVER;
    while vmList.Size() > 0 do
        vm ← vmList[0];
        vmList.erase(0);
        schedule(vm);

1) FIRST COME FIRST SERVE–FIRST FIT VM SCHEDULER

The first come first serve, first fit (FCFS-FF) VM scheduler is the simplest and most straightforward. The algorithm schedules each arriving VM on the first available computational node. It first checks whether the VM to be executed requires data. If it does, there is another check for the data source. For a data source set to storage server, the VM created is of the ConsumerVm type, with a random connection to one of the storage servers. For a data source set to local, a VM of type ComputationalLocalDataVm is created. Alternatively, if required data is not set for the VM in question, a simple computational VM is created, of type ComputationalLocalDataVm with the required data attribute set to false.

If the search for a node fails, the method checks the attribute that allows the scheduler to split the VM. If it is set to true, the VM is split according to the ratio. Then, the dispatching algorithm cycles again for each part of the VM split. If the source of data is a storage server, the first VM part created is of ConsumerProducerVm type; the other parts can be either ConsumerProducerVm (in case of a split into more than 2) or ConsumerVm (last split).

If the VM's data source is local, the first VM part is of ProducerVm type; the other parts can be either ConsumerProducerVm (in case of a split into more than 2) or ConsumerVm (last split). If the VM does not need any data, then it is by default a computational VM, split into a number of VMs according to the ratio. All VMs can run concurrently on different nodes. After a successful dispatch, the VM is added to the executed VM list; after a failed attempt, it is added to the not-executed list. The algorithms are given in Algorithms 4–6.
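The overall FCFS-FF flow (order by arrival time, place on the first node that fits, and fall back to splitting) can be sketched in Python as follows. This is a reduced model rather than the Nutshell scheduler: resources are collapsed into a single integer `demand`, VM types and data sources are ignored, each split part is assumed to go to a different node, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: int
    used: int = 0

    def fits(self, demand):
        return self.capacity - self.used >= demand


@dataclass
class VM:
    name: str
    arrival: float
    demand: int


def place_split(vm, nodes, ratio):
    """Find a distinct first-fit node for each part of `vm`, or return None."""
    total = sum(ratio)
    parts = [vm.demand * r // total for r in ratio]
    parts[-1] += vm.demand - sum(parts)  # give any rounding remainder to the last part
    taken, chosen = set(), []
    for part in parts:
        index = next((i for i, n in enumerate(nodes)
                      if i not in taken and n.fits(part)), None)
        if index is None:
            return None
        taken.add(index)
        chosen.append((nodes[index], part))
    return chosen


def fcfs_first_fit(vms, nodes, ratio=None):
    """Schedule VMs in arrival order; return (executed, not_executed)."""
    executed, not_executed = [], []
    for vm in sorted(vms, key=lambda v: v.arrival):
        node = next((n for n in nodes if n.fits(vm.demand)), None)
        if node is not None:
            node.used += vm.demand
            executed.append((vm.name, [node.name]))
            continue
        placement = place_split(vm, nodes, ratio) if ratio else None
        if placement is None:
            not_executed.append(vm.name)
            continue
        for chosen_node, part in placement:
            chosen_node.used += part
        executed.append((vm.name, [n.name for n, _ in placement]))
    return executed, not_executed


nodes = [Node("A", 4), Node("B", 4), Node("C", 4)]
vms = [VM("v3", 2.0, 5), VM("v1", 0.0, 2), VM("v2", 1.0, 6)]
executed, not_executed = fcfs_first_fit(vms, nodes, ratio=(1, 2))
print(executed, not_executed)
```

In this example v1 fits on node A, v2 fits on no single node and is split 1:2 across A and B, and v3 cannot be placed even after splitting, so it ends up in the not-executed list.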



TABLE 4. List of built-in collectors and their details.



Algorithm 5 FCFS-FF Dispatching VM to Node
Input: A VM that is about to be dispatched on a node

    if vmRequireData then
        if vm.dataAmount > vm.secondaryStorage then
            AddToNotExecutedList(vm);
            return;
    for i ← 0 to numOfNodes do
        if resources available then
            if vmRequireData then
                if vm.dataSource == STORAGE_SERVER then
                    dispatchVM ← Initialize(vm, CONSUMER);
                else
                    dispatchVM ← Initialize(vm, COMPUTATIONAL_LOCAL_DATA_VM);
            else
                dispatchVM ← Initialize(vm, COMPUTATIONAL_VM);
            /* The VM is initialized, now to dispatch it to first fit node */
            vmInstalled ← InstallOnFirstFit(dispatchVM);
            vmInstalled.Start();
            if start successful then
                break;
                return;
        else
            found ← false;
    if not found then
        SplitAndDispatch();

FIGURE 16. Router based addressing scheme for FatTree architecture.

Algorithm 6 FCFS-FF splitAndDispatch Function
Input: VM to be split and dispatched

    /* split vm according to ratio set by user */
    splitVmList[n] ← SplitVM(vm, ratio);
    /* list of nodes from first fit */
    selectedNodes[n] ← findFirstNodes(vm, splitVmList);
    /* keep track of assigned portion */
    assigned[n];
    /* check if enough nodes to execute VM */
    if selectedNodes.Size() == splitVmList.Size() then
        for i ← 0 to splitVmList.Size() do
            partialVm ← splitVmList[i];
            dispatchVm;
            if require data then
                if data source == STORAGE_SERVER then
                    if i >= 0 AND i < (splitVmList.Size() − 1) then
                        /* consumer producer vm connected to either a storage server or previous consumer producer and/or next consumer producer or consumer */
                        dispatchVm ← Initialize(partialVm, CONSUMER_PRODUCER_VM);
                    else
                        /* consumer connected to previous consumer producer */
                        dispatchVm ← Initialize(partialVm, CONSUMER_VM);
                else
                    if i == 0 then
                        dispatchVm ← Initialize(partialVm, PRODUCER_VM);
                    else if i > 0 AND i < (splitVmList.Size() − 1) then
                        dispatchVm ← Initialize(partialVm, CONSUMER_PRODUCER_VM);
                    else
                        dispatchVm ← Initialize(partialVm, CONSUMER_VM);
            else
                dispatchVm ← Initialize(partialVm, COMPUTATIONAL_VM);
            /* installing */
            vmInstalled ← Install(dispatchVm, selectedNodes[i]);
            if install success then
                assigned[i] ← true
        if all assigned items true then
            AddToExecuted(vm);
            startVM(splitVmList);
        else
            AddToNotExecuted(vm);

2) SHORTEST JOB FIRST–FIRST FIT VM SCHEDULER

The Shortest Job First–First Fit (SJF-FF) VM scheduler schedules incoming VMs by prioritizing the VMs with the shortest jobs. SJF-FF is mostly similar to FCFS-FF; the only difference is how the VM list is sorted. The dispatching algorithms are the same and are discussed in the FCFS-FF section. The scheduling initiation shown in Algorithm 4 is implemented with one line changed: instead of calling the SortVmListByTime function, the scheduling initiation of SJF-FF calls the SortVmListByShortestJob function.

The SortVmListByShortestJob algorithm, shown in Algorithm 7, first sorts the list by arrival time. Then, the algorithm iterates through the whole list, comparing the current VM with the next. For VMs with the same arrival time, the algorithm adds weight by comparing different VM requirements, i.e., (a) data size, since data over the network takes more time than locally available data, (b) application size, since the larger the application, the more time it requires to complete, and (c) processing power, since more processing power means less execution time. The combined weight of these requirements decides which VM is shortest and is therefore prioritized and moved up to execute first, i.e., the VM with the lower score in the comparison gets priority.

Algorithm 7 SortVmListByShortestJob Function

    SortVmListByTime();
    /* Now to check for shortest job between VMs with similar arrival time */
    for i ← 0 to n − 1 do
        checkingVm ← vmList[i];
        nextVm ← vmList[i + 1];
        checkVmScore ← 0;
        nextVmScore ← 0;
        if checkingVm.arrivalTime == nextVm.arrivalTime then
            if checkingVm.dataSize > nextVm.dataSize then
                checkVmScore += 10;
                nextVmScore += 5;
            else
                checkVmScore += 5;
                nextVmScore += 10;
            if checkingVm.applicationSize > nextVm.applicationSize then
                checkVmScore += 10;
                nextVmScore += 5;
            if checkingVm.processingPower > nextVm.processingPower then
                checkVmScore += 2;
                nextVmScore += 5;
            if checkVmScore > nextVmScore then
                vmList[i] ← nextVm;
                vmList[i + 1] ← checkingVm;
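The weighted comparison can be traced with a small Python sketch of the scoring pass. The class and function names are hypothetical, the weights are the ones listed in Algorithm 7, and the loop stops one element before the end of the list so that the "next" VM always exists, which is an interpretation of the loop bound.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    arrival_time: float
    data_size: int
    application_size: int
    processing_power: int


def pair_scores(current, following):
    """Score two VMs against each other using the weights from Algorithm 7."""
    current_score = following_score = 0
    if current.data_size > following.data_size:
        current_score += 10
        following_score += 5
    else:
        current_score += 5
        following_score += 10
    if current.application_size > following.application_size:
        current_score += 10
        following_score += 5
    if current.processing_power > following.processing_power:
        current_score += 2
        following_score += 5
    return current_score, following_score


def sort_by_shortest_job(vms):
    """Sort by arrival time, then swap neighbors with equal arrival time by score."""
    ordered = sorted(vms, key=lambda vm: vm.arrival_time)
    for i in range(len(ordered) - 1):
        current, following = ordered[i], ordered[i + 1]
        if current.arrival_time == following.arrival_time:
            current_score, following_score = pair_scores(current, following)
            if current_score > following_score:  # the lower score runs first
                ordered[i], ordered[i + 1] = following, current
    return ordered


vms = [
    VM("a", 0.0, data_size=100, application_size=50, processing_power=10),
    VM("b", 0.0, data_size=10, application_size=5, processing_power=10),
    VM("c", 1.0, data_size=1, application_size=1, processing_power=10),
]
print([vm.name for vm in sort_by_shortest_job(vms)])
```

Here VMs a and b arrive together; a scores 20 against b's 10 because of its larger data and application sizes, so b is moved ahead of a, while c keeps its place because it arrives later. Reversing the final comparison gives the longest-job-first ordering.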

3) LONGEST JOB FIRST–FIRST FIT VM SCHEDULER

The Longest Job First–First Fit (LJF-FF) VM scheduler prioritizes the VM with the longest job. The algorithm is similar to the SJF-FF VM scheduler; the only difference is how the VM list is sorted. Like SJF-FF, LJF-FF uses a weighted score to prioritize a VM, and the two differ only in how the scores are compared: in LJF-FF, the VM with the higher score in a comparison is prioritized and moved up in the list. The algorithm is shown in Algorithm 8.

Algorithm 8 SortVmListByLongestJob Function

    SortVmListByTime();
    /* Now to check for longest job between VMs with similar arrival time */
    for i ← 0 to n − 1 do
        checkingVm ← vmList[i];
        nextVm ← vmList[i + 1];
        checkVmScore ← 0;
        nextVmScore ← 0;
        if checkingVm.arrivalTime == nextVm.arrivalTime then
            if checkingVm.dataSize > nextVm.dataSize then
                checkVmScore += 10;
                nextVmScore += 5;
            if checkingVm.applicationSize > nextVm.applicationSize then
                checkVmScore += 10;
                nextVmScore += 5;
            if checkingVm.processingPower > nextVm.processingPower then
                checkVmScore += 2;
                nextVmScore += 5;
            /* for longest job, if the score of the current VM is lower than that of the next, we swap them */
            if checkVmScore < nextVmScore then
                vmList[i] ← nextVm;
                vmList[i + 1] ← checkingVm;

V. REPORTING IN NUTSHELL

As stated in the related work section, most existing Cloud simulators neglect network details at a granular level. Nutshell aims to overcome this limitation by providing trace sources in the different components of the simulator. Most simulations require the statistics listed in Table 4, which are provided by the built-in data collector classes. Users can use the built-in data collectors or create their own by extending the base collector. A user can connect to the many trace sources provided by each component, such as PhyTxBegin, MacTxDrop, PhyRxBegin, and MacRxDrop provided by PointToPointNetDevice, and Enqueue and Dequeue provided by Queue, to name a few. The complete list of available network-related trace sources is available in [40].



TABLE 5. ThreeTier simulation configuration.

FIGURE 17. VM details dump for executed, split executed or not executedVMs.

Nutshell also logs each individual node's utilization, which can be exported by setting a single flag. Logs of VMs that were executed, failed to execute, or were "split and executed" can also be dumped; the results are shown in Figure 17.

TABLE 6. FatTree simulation configuration.

VI. RESULTS IN NUTSHELL

Results for a simulation in Nutshell with the parameters listed in Tables 5 and 6 are shown in Figures 18–23. Figures 18 and 19 show results obtained with the built-in network data collector class for the ThreeTier and FatTree data center architectures, respectively. The calculation of each column is explained in Table 4. The network collector calculates the average of flows at a regular interval (6 seconds). The data shows that FatTree has fewer flows than ThreeTier, as the routers' communication is counted as flows, which take a portion of the bandwidth and thereby reduce throughput. Packet loss is also greater in ThreeTier than in FatTree.

FIGURE 18. Three tier network data.

FIGURE 19. FatTree network data.

FIGURE 20. FatTree node 0 utilization.

FIGURE 21. FatTree Avg. resource utilization.

FIGURE 22. ThreeTier node 0 utilization.

FIGURE 23. ThreeTier Avg. resource utilization.

With node utilization data, the user performing a simulation can get a clear picture of what a scheduling algorithm is doing and can fine-tune the algorithm based on the utilization data. The data also helps the user calculate the energy consumption of each node and find an optimal way to manage nodes in the data center.

The data dump for VMs executed on a node can further help narrow down the scheduler's decision policy and point to any scenario that may result in an SLA violation.

VII. CONCLUSION AND FUTURE WORK

Cloud technology is evolving. Evolution induces problems and new challenges, and mitigating these problems necessitates new solutions. New solutions cannot be tested on a real system, as this involves a great cost for the tester as well as the system provider. In such a scenario, simulation is a viable solution. Various simulators exist for the Cloud, focusing on certain aspects of the Cloud and neglecting others, e.g., lacking the accuracy required to build confidence in a solution due to the absence or an abstract implementation of the network and network components. A data center's network performance determines the performance and resilience of the Cloud. Neglecting network details skews the performance results of a developed solution. Existing simulators also lack the tools necessary to speed up the simulation development process, and a user often ends up working on unrelated areas of the Cloud.

To overcome these limitations, we have developed a new tool for Cloud simulation, Nutshell. Nutshell has a detailed implementation of networks and networking components. Nutshell provides its users with the tools to create data center network architectures quickly, so that they can focus on their narrowed-down problem. Nutshell also provides built-in data center architectures, with the ability to scale to any requirement. With Nutshell's plug-in architecture, users can easily integrate their solutions with existing models. Global schedulers are also available as part of the package, along with models for collecting and exporting simulation data. The simulator takes a maximum number of variables into account while simulating, thereby resulting in increased confidence in a solution. In future, we plan to extend Nutshell with the following:

• Modification of VMs to incorporate asynchronous data acquisition and processing. VM collaboration expansion to 1:n and n:n, to leverage BigData processing.

• Incorporate data writing to Storage Servers and implementation of SANs. Implement policies regarding data write.

• DCell architecture rectification to conform to the system of Nutshell.

• New global and local scheduling schemes.

• New congestion detection and control schemes.

• Implement new models representing real traffic patterns of data center.

ACKNOWLEDGMENT

NS3 code base support was provided by Tommaso Pecorella and the NS3 development team.

REFERENCES[1] R. N. Calheiros, R. Ranjan, C. A. F. De Rose, and R. Buyya.

(2009). ‘‘CloudSim: A novel framework for modeling and simulationof cloud computing infrastructures and services.’’ [Online]. Available:https://arxiv.org/abs/0903.2525

[2] I. Foster, Y. Zhao, I. Raicu, and S. Lu, ‘‘Cloud computing and grid comput-ing 360-degree compared,’’ in Proc. Grid Comput. Environ. Work. (GCE),Nov. 2008, pp. 1–10.

[3] U. U. Rahman, O. Hakeem, M. Raheem, K. Bilal, S. U. Khan, andL. T. Yang, ‘‘Nutshell: Cloud simulation and current trends,’’ inProc. IEEEInt. Conf. Smart City/SocialCom/SustainCom (SmartCity), Dec. 2015,pp. 77–86.

[4] K. Bilal et al., ‘‘A comparative study of data center network architectures,’’in Proc. 26th Eur. Conf. Modeling Simulation, 2012, pp. 526–532.

[5] K. Bilal, S. U. R. Malik, S. U. Khan, and A. Y. Zomaya, ‘‘Trends andchallenges in cloud datacenters,’’ IEEE Cloud Comput., vol. 1, no. 1,pp. 10–20, May 2014.

[6] M. Al-Fares, A. Loukissas, and A. Vahdat, ‘‘A scalable, commodity datacenter network architecture,’’ ACM SIGCOMM Comput. Commun. Rev.,vol. 38, no. 4, p. 63, 2008.

[7] K. Bilal et al., ‘‘A taxonomy and survey on Green data center networks,’’Futur. Gener. Comput. Syst., vol. 36, pp. 189–208, Jul. 2014.

[8] R. Buyya, R. Ranjan, and R. N. Calheiros, ‘‘Modeling and simulationof scalable cloud computing environments and the CloudSim toolkit:Challenges and opportunities,’’ in Proc. Int. Conf. High Perform. Comput.Simulation (HPCS), 2009, pp. 1–11.

[9] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, ‘‘The cost of acloud: Research problems in data center networks,’’ SIGCOMM Comput.Commun. Rev., vol. 39, no. 1, pp. 68–73, 2009.

[10] K. Bilal, S. U. Khan, and A. Y. Zomaya, ‘‘Green data center networks:Challenges and opportunities,’’ in Proc. 11th Int. Conf. Frontiers Inf.Technol. (FIT), Dec. 2013, pp. 229–234.

[11] M. Perlin. Downtime, Outages and Failures—Understanding TheirTrue Costs. Accessed: Dec. 14, 2018. [Online]. Available: https://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html

[12] K. Bilal, O. Khalid, S. Ur, R. Malik, M. Usman, and S. Khan, ‘‘Faulttolerance in the cloud,’’ Encycl. Cloud Comput., pp. 1–13, 2016.

[13] M. Rosbach, ‘‘Verification of network simulators: The good, the bad andthe ugly,’’ M.S. thesis, Univ. Oslo, Oslo, Norway, 2012

[14] X. Bai, M. Li, B. Chen, W.-T. Tsai, and J. Gao, ‘‘Cloud testing tools,’’in Proc. 6th IEEE Int. Symp. Services Syst. Eng. (SOSE), Dec. 2011,pp. 1–12.

[15] F. Fittkau, S. Frey, and W. Hasselbring, ‘‘CDOSim: Simulating clouddeployment options for software migration support,’’ in Proc. IEEE 6thInt. Workshop Maintenance Evol. Service-Oriented Cloud-Based Syst.(MESOCA), Sep. 2012, pp. 37–46.

[16] J. Jung and H. Kim, ‘‘MR-CloudSim: Designing and implementingMapReduce computing model on CloudSim,’’ in Proc. Int. Conf. ICTConverg., 2012, pp. 504–509.

[17] S. K. Garg and R. Buyya, ‘‘NetworkCloudSim: Modelling parallel appli-cations in cloud simulations,’’ in Proc. 4th IEEE Int. Conf. Utility CloudComput. (UCC), Dec. 2011, pp. 105–113.

[18] X. Li, X. Jiang, K. Ye, and P. Huang, ‘‘DartCSim+: Enhanced cloudsimwith the power and network models integrated,’’ in Proc. IEEE Int. Conf.Cloud Comput. (CLOUD), Jun. 2013, pp. 644–651.

[19] Y. Jararweh, Z. Alshara, M. Jarrah, M. Kharbutli, and M. N. Alsaleh,‘‘TeachCloud: A cloud computing educational toolkit,’’ Int. J. Cloud Com-put., vol. 2, pp. 237–257, Jan. 2012.

[20] D. Kliazovich, P. Bouvry, and S. U. Khan, ‘‘GreenCloud: A packet-levelsimulator of energy-aware cloud computing data centers,’’ J. Supercom-put., vol. 62, no. 3, pp. 1263–1283, 2012.

[21] A. Núñez et al., ‘‘iCanCloud: A flexible and scalable cloud infrastructuresimulator,’’ J. Grid Comput., vol. 10, no. 1, pp. 185–209, 2012.

[22] S. H. Lim, B. Sharma, G. Nam, E. K. Kim, and C. R. Das, ‘‘MDCSim:A multi-tier data center simulation, platform,’’ in Proc. IEEE Int. Conf.Cluster Comput. (ICCC), Aug. 2009, pp. 1–9.

[23] S. Ostermann, K. Plankensteiner, R. Prodan, and T. Fahringer, ‘‘GroudSim:An event-based simulation framework for computational grids andclouds,’’ in Euro-Par 2010 Parallel Processing (Lecture Notes in Com-puter Science), vol. 6586. Amsterdam, The Netherlands: Springer, 2011,pp. 305–313.

[24] M. Tighe, G. Keller, M. Bauer, and H. Lutfiyya, ‘‘DCSim: A data centresimulation tool for evaluating dynamic virtualized resource management,’’in Proc. 8th Int. Conf. Netw. Service Manage. (CNSM), Workshop Syst.Vitalization Manage. (SVM), Oct. 2012, pp. 385–392.

[25] S. Sotiriadis, N. Bessis, N. Antonopoulos, and A. Anjum, ‘‘SimIC: Design-ing a new Inter-Cloud simulation platform for integrating large-scaleresource management,’’ in Proc. Int. Conf. Adv. Inf. Netw. Appl. (AINA),2013, pp. 90–97.

[26] I. Sriram, ‘‘SPECI, a simulation tool exploring cloud-scale data centres,’’ inCloud Computing (Lecture Notes in Computer Science), vol. 5931. 2009,pp. 381–392.

[27] Data Center Architecture Overview, Cisco, San Jose, CA, USA, 2008.[28] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, ‘‘Dcell: A scalable

and fault-tolerant network structure for data centers,’’ ACM SIGCOMMComput. Commun. Rev., vol. 38, no. 4, pp. 75–86, 2008.

[29] B. Wickremasinghe, R. N. Calheiros, and R. Buyya, ‘‘CloudAnalyst:A cloudsim-based visual modeller for analysing cloud computing environ-ments and applications,’’ in Proc. Int. Conf. Adv. Inf. Netw. Appl. (AINA),2010, pp. 446–452.

[30] S. Mostinckx, T. Van Cutsem, S. Timbermont, E. G. Boix, É. Tanter, andW. De Meuter, ‘‘Mirror-based reflection in AmbientTalk,’’ Softw., Pract.Exp., vol. 39, pp. 661–699, May 2009.

[31] Data Center Multi-Tier Model Design, Cisco, San Jose, CA, USA, 2010,pp. 1–24.

[32] A. Greenberg et al., ‘‘VL2: A scalable and flexible data center network,’’ACM SIGCOMM Comput. Commun. Rev., vol. 39, pp. 51–62, Aug. 2009.

[33] Cisco. Packet Tracer. Accessed: Nov. 5, 2018. [Online]. Available:https://www.netacad.com/about-networking-academy/packet-tracer/

VOLUME 7, 2019 19941


[34] NS-3. Accessed: May 1, 2015. [Online]. Available: https://www.nsnam.org[35] R. Buyya, C. S. Yeo, and S. Venugopal, ‘‘Market-oriented cloud com-

puting: Vision, hype, and reality for delivering IT services as computingutilities,’’ in Proc. 10th IEEE Int. Conf. High Perform. Comput. Commun.(HPCC), Sep. 2008, pp. 5–13.

[36] R. Shea, F. Wang, H. Wang, and J. Liu, ‘‘A deep investigation into networkperformance in virtual machine based cloud environments,’’ in Proc. IEEEIEEE Conf. Comput. Commun. (INFOCOM), Apr. 2014, pp. 1285–1293.

[37] Understanding Full Virtualization, ParaVirtualization, and HardwareAssist, VMware, Palo Alto, CA, USA, 2007, p. 17.

[38] D. Kliazovich, J. E. Pecero, A. Tchernykh, P. Bouvry, S. U. Khan, andA. Y. Zomaya, ‘‘CA-DAG: Communication-aware directed acyclic graphsfor modeling cloud computing applications,’’ in Proc. IEEE Int. Conf.Cloud Comput. CLOUD, Jun. 2013, pp. 277–284.

[39] F. V. Jensen, An Introduction to Bayesian Networks, vol. 210. London,U.K.: UCL Press, 1996.

[40] NS-3 List of Trace Sources. Accessed: Nov. 5, 2018. [Online].Available: https://www.nsnam.org/docs/release/3.17/doxygen/group___trace_source_list.html

UBAID UR RAHMAN received the B.S. degree in telecommunication and networking and the M.S. degree in computer science from COMSATS University Islamabad, Pakistan. He worked on a simulator project with Qatar University, Doha, Qatar. He has been running an IT business in Pakistan. He is working on projects in healthcare, artificial intelligence, natural language processing, and big data. He received a campus medal and an institute medal in the B.S. degree.

KASHIF BILAL received the B.S. and M.S. degrees in computer science from the COMSATS Institute of Information Technology, Pakistan, and the Ph.D. degree from the Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND, USA, in 2014. He is currently pursuing the Ph.D. degree from Qatar University, Doha, Qatar.

He served as a Lecturer with COMSATS University Islamabad, from 2004 to 2011. He has published 15 research papers in peer reviewed conferences and journals. He has co-authored two book chapters. His research interests include data center networks, distributed computing, wireless networks, and expert systems. He received a campus medal in the B.S. degree.

AIMAN ERBAD received the B.Sc. degree in computer engineering from the University of Washington, the M.Sc. degree in embedded systems and robotics from the University of Essex, and the Ph.D. degree in computer science from The University of British Columbia. He is currently an Assistant Professor and the Director of Research Support with Qatar University. His research interests include cloud computing, distributed systems, and multimedia networking and systems.

OSMAN KHALID received the master’s degree in computer engineering from the Center for Advanced Studies in Engineering, Pakistan, and the Ph.D. degree from North Dakota State University, Fargo, USA. He is currently an Assistant Professor with COMSATS University Islamabad, Pakistan. His research interests include the Internet of Things (IoT), fog computing, opportunistic networks, recommendation, trust, and reputation systems.

SAMEE U. KHAN is currently the Program Director with the National Science Foundation, where he is responsible for the Smart & Autonomous Systems Program, Critical Resilient Interdependent Infrastructure Systems and Processes Program, and Computer Systems Research Cluster. He is also a faculty member at North Dakota State University. His work appears in over 350 publications with an h-index of 38 and an i10-index of 136. He maintains the GreenCloud simulator and the CloudNetSim++ simulator. Some example funded projects that he has worked on include NSF GARDE, S&T CO2, NSF MRI, FNR GreenIT, and FNR TITAN. His research interests include optimization, robustness, and the security of computer systems.

He is a Fellow of the IET and the BCS. He is a member of the Executive Committee of the IEEE Technical Committee on Scalable Computing, the IEEE Technical Committee on Cyber-Physical Cloud Systems, and the IEEE SMC Technical Committee on Cybermatics. He is the Chair of the Steering Committee of the IEEE Technical Area in Green Computing and the Vice Communication Chair of the IEEE Special Technical Community on Sustainable Computing. He is an Associate Editor of the IEEE ACCESS, the IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, IET Wireless Sensor Systems, Scalable Computing and Communications, IET Cyber-Physical Systems, and the IEEE IT Professional. He is an ACM Distinguished Speaker and an IEEE Distinguished Lecturer. He is also on the Advisory Board of the IET Book Series on big data.
