Location-based Search Service for a P2P OpenStack System

DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2018

Location-based Search Service for a P2P OpenStack System

TONY THOMAS

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE



Master in ICT Innovation
Date: October 24, 2018
Industry Supervisor(s): João Monteiro Soares, Fetahi Wuhib
Academic Supervisor: Prof. Rolf Stadler
Examiner: Prof. Viktoria Fodor
Swedish title: Lägesbaserad Söktjänst för ett Peer-to-Peer OpenStack System
School of Electrical Engineering and Computer Science, KTH


Abstract

This thesis designs and develops a mechanism for searching for nodes in a Peer-to-Peer (P2P) system based on their geographic location. For a P2P node, the mechanism builds and maintains a distributed search index. We introduce a bootstrapping (or startup) protocol, NSBootstrap, which builds this search index for a P2P node. We also introduce a lookup protocol, NSSearch, that utilizes this index.

The mechanism has been tested in an emulated P2P environment with up to 3000 nodes. The evaluation suggests that the state maintained on a node grows logarithmically with N, the total number of P2P nodes in the system. Moreover, a new node joining the P2P system requires log(N) steps to converge through our mechanism. Additionally, a valid search query within the scope of the search index is completed in log(N) steps.


Summary (Sammanfattning)

This thesis designs and develops a mechanism that enables searching for peer-to-peer nodes based on their geographic position. In a peer-to-peer node, the mechanism creates and maintains a distributed search index. The work introduces a bootstrapping (or startup) protocol, which we call NS-Bootstrap, that is responsible for building the search index for a peer-to-peer node. We also introduce a search protocol that uses this index, NSSearch. The mechanism has been tested in an emulated peer-to-peer environment with up to 3000 nodes. The evaluation suggests that the state data maintained in a node grows logarithmically with N, where N is the total number of peer-to-peer nodes in the system.

In addition, the experiments show that a new node joining the peer-to-peer system requires log(N) steps to converge via our mechanism. Furthermore, a valid search query within the scope of the search index is completed in log(N) steps.


Acknowledgements

First and foremost, I would like to thank my industry supervisors, Dr. João Monteiro Soares and Dr. Fetahi Wuhib (Ericsson Research), for their consistent support and guidance throughout the thesis work. As a team, we would like to thank Prof. Rolf Stadler for his interest and support throughout the thesis work, especially in shaping this thesis. Additionally, I would like to thank Ericsson CSP Cloud Resource Optimization team manager Mattias Wildeman for his encouragement and support. I would like to extend my gratitude to the EIT Digital Master School program and KTH Royal Institute of Technology, Stockholm, for offering relevant courses as part of their ICT Innovation master program. Last but not least, I would like to thank my family and friends, who continue to bless me with their support.


Contents

1 Introduction
    1.1 Problem presentation
    1.2 Approach
    1.3 Contribution
    1.4 Outline

2 Related Research
    2.1 Flooding-based approaches
        2.1.1 Iterative deepening
        2.1.2 Echo protocols
    2.2 Tree-based approaches
        2.2.1 GAP protocol
    2.3 Index-based approaches
        2.3.1 Local index based search
        2.3.2 Routing index based search
    2.4 Summary

3 Background
    3.1 Peer to Peer (P2P) systems
    3.2 Membership management in P2P systems
        3.2.1 CYCLON protocol
    3.3 P2P architecture for OpenStack
        3.3.1 OpenStack
        3.3.2 P2P OpenStack architecture
        3.3.3 P2P OpenStack system: Scheduling a workload
    3.4 Summary

4 Location-based P2P lookup
    4.1 Search architecture
    4.2 Abstract Namespace Tree (ANT)
        4.2.1 Tracking the state of an ANT
        4.2.2 Use case: Geographical locations
    4.3 The query language
    4.4 NSBootstrap: Node bootstrap protocol
        4.4.1 Using CYCLON for finger table maintenance
        4.4.2 Stable state for NSBootstrap
        4.4.3 Optimize NSBootstrap resource utilization
        4.4.4 Factors affecting NSBootstrap convergence time
    4.5 NSSearch: The lookup protocol
        4.5.1 Iterative traversal of an ANT by NSSearch
    4.6 Summary

5 Evaluations and discussions
    5.1 Test Setup
    5.2 NSBootstrap
        5.2.1 Metrics measured
        5.2.2 Validation
        5.2.3 Evaluation
        5.2.4 Results and discussion
        5.2.5 Conclusions
    5.3 NSSearch
        5.3.1 Metrics measured
        5.3.2 Validation
        5.3.3 Evaluation
        5.3.4 Results and discussion
    5.4 Summary

6 Conclusions
    6.1 Future Work
    6.2 Experiences

Bibliography

A Sample Code
    A.1 Sample NSBootstrap operation
    A.2 Sample NSBootstrap implementation
    A.3 Sample CYCLON implementation
    A.4 Finding next prefix of a label
    A.5 NSSearchIterative


List of Figures

1.1 The problem addressed by this thesis work. A workload scheduling problem is translated to a resource lookup problem. Users can later set up their workload on the k nodes returned.

2.1 Tree-based approaches for resource lookup build an index for a node with pointers to its parent, peers, and children.

3.1 P2P Architecture with OpenStack instances sharing an identity service. Image is courtesy of Xin et al. [2].

4.1 Search architecture, figure inspired from [21].

4.2 The lifetime of a node executing our solution. CYCLON protocol is an integral part of our solution.

4.3 An ANT formed out of consistent labeling extending basePrefix root. Physical P2P nodes are at the leaf level (shaded gray).

4.4 A P2P node should keep track of at least one node in all childPrefixes of all its prefixLevels. In effect, node4 should keep track of at least one node on all branches of the ANT which includes a shaded token (along with its leaf sibling, node2).

4.5 ANT formed based on geographical location. This is proposed as a solution for our P2P OpenStack location-based search problem.

5.1 Sample ANT state for various configurations of D, K and L.

5.2 Evaluation results for a system with fixed K and L values with Depth D.

5.3 Evaluation results for a system with fixed D and L values with K value changed for 10x runs.

5.4 Evaluation results for a system with fixed D and K values with the number of Leaf P2P nodes L changed between iterations.

5.5 CPU and Memory usage by a sample (N+1)th node on executing NSBootstrap. The annotation points to the moment in time when the node attained a NSBootstrap stable state.

5.6 The percentage of Ntotal a node n should keep track of while executing NSBootstrap.

5.7 Number of total search iterations in the system for a searchPrefix of depth d and count c. The iterations for count=1 overlap the results for count=2.

A.1 NSBootstrap: Lifetime of a new node root.a.b.node4 joining a P2P system.


List of Tables

4.1 childPrefixes for all prefixLevels of root.a.b.node4 from the sample tree in Figure 4.3. The prefixLevel root.a.b. does not have a childPrefix as it does not have any non-leaf nodes on the ANT.

4.2 Description and properties of key terms used throughout this report.

4.3 Utility functions used in Algorithm 1.

4.4 Finger table of node root.a.b.node4 from Figure 4.3 after execution of NSBootstrap lines 3-6.

4.5 Finger table of node root.a.b.node4 from Figure 4.3 after execution of NSBootstrap lines 7-8.

4.6 Finger table of node root.a.b.node4 from Figure 4.3 after several CYCLON shuffling rounds at root. prefixLevel.

4.7 The search index built by NSBootstrap (a), and the actual finger tables built and maintained by CYCLON (b-d) for root.a.b.node4 from Figure 4.3.

4.8 Utility functions used in NSSearch iterative implementation (Algorithm 2).

5.1 Various configurations of D, K and L used for our experiments to evaluate NSBootstrap.

5.2 Sample search queries built to evaluate NSSearch by changing searchPrefix and count.

5.3 NSSearch iteration steps executed by node10 to find 4 fingers in root.a.a.b.b.a.b. searchPrefix. These 7 steps are responsible for the maximum value for d=6, c=4 in Figure 5.7.


Chapter 1

Introduction

OpenStack [1] is a set of software tools for cloud computing, popular in both industry and academia. It is mostly deployed as an Infrastructure as a Service (IaaS) platform. When deployed as an IaaS, OpenStack acts as a cloud operating system managing large pools of computing, storage and networking resources. One of the primary functionalities of OpenStack is to schedule Virtual Machines (VMs), among other workloads. Academia has been investigating architectures to scale up OpenStack both (1) vertically, by supplying more compute, storage and networking resources to a single OpenStack instance, and (2) horizontally, by stacking multiple OpenStack instances that share workloads. The P2P OpenStack architecture by Xin et al. [2] is a step in the latter direction.

In general, P2P architectures are attractive for scaling up large systems. Namely, they have a low set-up barrier and offer the possibility to aggregate resources. Moreover, their distributed nature provides resilience against faults and brute force attacks.

The P2P OpenStack system proposed by Xin et al. involves a message relaying service, called an agent, residing on top of an OpenStack instance. An agent together with its corresponding OpenStack instance forms an OpenStack cloudlet. A user interacts with an agent in the system and obtains a single point of view of the whole infrastructure. Xin et al. designed and developed a working implementation of the architecture and found improvements in resource utilization and service time for scheduling workloads [2].

Scheduling a workload in the proposed P2P OpenStack system architecture involves selecting one or more OpenStack cloudlets, followed by filtering based on their physical properties. Modern technologies like 5G, Edge and Fog computing require advanced selection and filtering of these cloudlets, say, based on their geographic location. This thesis work addresses that use case.

1.1 Problem presentation

Modern technologies like 5G, Edge and Fog computing require workloads to be deployed at specific geographical locations [3, 4]. Hybrid clouds [5] also require deploying a workload at a particular location. For a cloud service customer, scheduling a workload at a specific location can be beneficial as it (1) allows services to be set up in accordance with the data privacy laws of a region and (2) helps in satisfying Quality of Service requirements. The P2P OpenStack architecture was designed to satisfy these requirements.

This thesis addresses the requirement of a P2P OpenStack system to schedule a workload at a given location. A rough overview of the addressed problem, and of how scheduling a workload translates to a search problem, is depicted in Figure 1.1.

Figure 1.1: The problem addressed by this thesis work. A workload scheduling problem is translated to a resource lookup problem. Users can later set up their workload on the k nodes returned.

1.2 Approach

The thesis explores a distributed solution based on a consistent labeling of P2P nodes (cloudlets) to solve the location-based search problem. Our mechanism builds and maintains an index per node. The index acts as a cache to speed up search and discovery within the system.


A centralized solution using a registry is a simple approach to the problem addressed by the thesis. Popular services like the Domain Name System (DNS) already employ a centralized approach. We do not consider a centralized solution, since:

1. Centralized solutions create bottlenecks or single points of failure in the system.

2. A core driving principle behind the P2P architecture for OpenStack is to avoid any centralized functionality [2]. Hence, adding a centralized component defeats the purpose of the architecture.

1.3 Contribution

The major contributions of this thesis work are:

1. Design and development of algorithms to solve the location-based search problem. This involves algorithms to (1) build a search index for a participating node and (2) execute a search by traversing this index.

2. Validation of the solution and evaluation of its impact on participating nodes using an emulated testbed of up to 3000 P2P nodes.

1.4 Outline

The rest of the report is organized as follows. Chapter 2 details relevant works that solve problems similar to the one addressed by the thesis. This is followed by a background study on P2P systems and, specifically, the P2P OpenStack system by Xin et al. [2] in Chapter 3. Chapter 4 describes the core contribution of this thesis, including the two algorithms that make up our solution. Chapter 5 includes the evaluation methods and test results for the mechanisms this thesis proposes. Future work and experiences are included in Chapter 6. Appendix A includes sample code snippets that were generated as part of this thesis work.


Chapter 2

Related Research

The location-based workload scheduling requirement of the P2P OpenStack system translates to a P2P resource lookup problem. Resource lookup in a P2P network is a well-known engineering problem. Li et al. list multiple approaches for solving the search problem using centralized, structured, unstructured and hybrid P2P networks [6]. In this chapter, we detail several such techniques based on unstructured P2P systems.

2.1 Flooding-based approaches

A flooding-based approach involves a node broadcasting a query to all (or a subset) of its peers in a P2P network and later collecting the results. Flooding is effective in guaranteeing a result (if it exists in the system). However, it creates a large number of messages in the system [6]. Li et al. describe multiple optimizations of flooding-based P2P search that control the degree of flooding and thereby reduce the number of messages in the network. Multiple P2P search implementations based on flooding are listed below. For these methods, a node in a P2P system is expected to know about at least one of its peers (neighbors).

2.1.1 Iterative deepening

Iterative deepening involves repeated Breadth First Search (BFS) traversal through a P2P system while incrementing the depth of traversal after each failed try [7]. The depth of traversal is controlled by a depth limit parameter. A query is terminated when the search is answered or when the depth limit is met. While the depth limit is not met, nodes retransmit the message to other nodes in the system if the query is not answered.

Iterative deepening is a controlled flooding approach. However, it can result in multiple duplicate messages in the system. Iterative deepening is a possible solution for the use case addressed by this thesis work. However, the uncertainties connected with (1) the number of messages in the system and (2) the execution time drive us to look for alternatives.
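The repeated, depth-limited traversal described above can be sketched as follows. This is a minimal single-process illustration, not the implementation from [7]: the overlay is assumed to be an adjacency dictionary, the names `bfs_limited`, `iterative_deepening`, `has_resource` and `overlay` are hypothetical, and each round simply restarts the BFS from the querying node. Every message sent over a link is counted, including messages to nodes that were already reached, which makes the duplicate-message cost visible.

```python
from collections import deque

def bfs_limited(graph, start, has_resource, depth_limit):
    """BFS from start that stops forwarding once depth_limit hops are reached.

    Returns (hits, messages): matching nodes and the number of query
    messages sent over overlay links, duplicates included.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    hits, messages = [], 0
    while frontier:
        node, depth = frontier.popleft()
        if has_resource(node):
            hits.append(node)
        if depth == depth_limit:
            continue  # depth limit met: this node does not retransmit
        for peer in graph[node]:
            messages += 1  # one message per link, even if peer was seen
            if peer not in seen:
                seen.add(peer)
                frontier.append((peer, depth + 1))
    return hits, messages

def iterative_deepening(graph, start, has_resource, max_depth):
    """Retry the search with depth limits 1, 2, ... until it is answered."""
    total_messages = 0
    for depth_limit in range(1, max_depth + 1):
        hits, messages = bfs_limited(graph, start, has_resource, depth_limit)
        total_messages += messages
        if hits:
            return hits, depth_limit, total_messages
    return [], max_depth, total_messages

# Hypothetical five-node overlay used only for illustration.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
```

On this overlay, a query issued at `a` for a resource held only by `e` is answered in the third round, after 2 + 6 + 9 = 17 link messages in total, which illustrates why the message count is hard to bound in advance.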

2.1.2 Echo protocols

Introduced by Segall et al., the Echo protocol provides controlled expansion and contraction of messages in P2P networks [8]. For a P2P resource lookup, Echo can be used to flood the network in a deterministic fashion and collect the result at the querying node. The algorithm describes two phases for a message in a P2P system: (1) the expansion phase and (2) the contraction phase.

During the expansion phase, Echo builds a minimum spanning tree rooted at the node that received a lookup query. The root node creates and forwards an echo message to all its neighbors. When a node receives an echo message, it (1) stores a pointer to the sender as its parent, (2) starts local operations as instructed by the echo message, and (3) forwards the echo message to all its neighbors except its parent. Later, these pointers are used to build a minimum spanning tree.

The contraction phase starts once the query reaches a leaf node in the P2P system. In other words, when a node receives an echo message and has no other neighbor to forward it to, the contraction phase starts for that echo message. The leaf node replies to its parent node with the results of its local operation. This message is propagated all the way back to the root node. In general, nodes closer to the root node process more information.

The Echo protocol is a possible solution for the location-based search problem identified in this thesis. It has the advantage of minimizing the amount of state stored locally per node. However, the protocol has the overhead of flooding the system with messages for every query, albeit in a controlled way.
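The two phases can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the protocol of [8]: messages are delivered in FIFO order inside one process, so the parent of a node is the first sender that reaches it, and the names `echo_query`, `local_op` and `overlay` are hypothetical.

```python
from collections import deque

def echo_query(graph, root, local_op):
    """Simulate one Echo wave and return (parent pointers, results at root)."""
    # Expansion: each node remembers the first sender as its parent and
    # forwards the echo message to its remaining neighbors.
    parent = {root: None}
    order = []  # order in which nodes first receive the echo message
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for peer in graph[node]:
            if peer not in parent:
                parent[peer] = node
                queue.append(peer)

    # Contraction: results flow back along the parent pointers. Walking the
    # reception order backwards guarantees children reply before parents.
    collected = {node: list(local_op(node)) for node in order}
    for node in reversed(order):
        if parent[node] is not None:
            collected[parent[node]].extend(collected[node])
    return parent, collected[root]

# Hypothetical five-node overlay used only for illustration.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
```

In this sketch the lists held by nodes near the root grow as replies are merged, which mirrors the observation that more information is processed closer to the root.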


2.2 Tree-based approaches

A tree-based solution for P2P search requires a participating node to maintain a nominal number of pointers to other nodes in the system. These pointers can be to a relative parent, peer(s) or child node(s) in the network graph of the system. Even though unstructured P2P systems have a flat structure, these pointers create a tree on top of the P2P system. An example is shown in Figure 2.1.

Figure 2.1: Tree-based approaches for resource lookup build an index for a node with pointers to its parent, peers, and children.

Tree-based approaches are found to be more effective than flooding-based approaches, as a node (1) need not flood the entire network to answer a query, and (2) has to store only a finite (and constantly changing) amount of information about a set of other nodes (namely its parent, peers, and children). An implementation of the tree-based approach, and how it can solve the location-based lookup problem, is given below.

2.2.1 GAP protocol

Introduced by Dam and Stadler (2005), GAP stands for Generic Aggregation Protocol [9]. It is an extension of a BFS algorithm introduced by Dolev et al. [10]. Nodes executing the algorithm store pointers to a relative parent in the system and share information about their distance to a root node. For a node on the network graph, GAP requires it to store (1) pointers to its children and (2) the value of a system property F (after computing it). A spanning tree to obtain these pointers can be constructed with an initial run of the Echo protocol (Section 2.1.2).


For a given system property F agreed throughout the system, a participating node k with children 1..n stores the result of

$$\sum_{i=1}^{n} F_i.$$

This aggregate gives a node an overview of the properties shared by its children on a link. This information is shared with its neighbors on the GAP tree. Examples of the function F are the current aggregate CPU load on an interface, the network usage on a link, etc.

A change in the value of F at a node is propagated to all neighbors of that node. A P2P node updates its parent node about this change. Peers of the node in the network graph then receive this updated information. Hence, GAP is an ideal solution to keep track of (system properties of) a constantly changing P2P network.
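The aggregation and the upward propagation of changes described above can be sketched as follows. This is an illustrative model, not the GAP implementation of [9]: it assumes that the value a child reports is its own local value of F plus the sum reported by its own children, and the class `GapNode` with its methods is hypothetical.

```python
class GapNode:
    """A node on the aggregation tree (hypothetical illustration)."""

    def __init__(self, name, local_value=0):
        self.name = name
        self.local_value = local_value  # this node's own value of F
        self.parent = None
        self.child_aggregates = {}  # child name -> last value it reported

    def aggregate(self):
        # Assumption: the reported value is the local value plus the
        # sum over the children, i.e. the sum F_1 + ... + F_n.
        return self.local_value + sum(self.child_aggregates.values())

    def attach(self, child):
        """Register child under this node and refresh the aggregates."""
        child.parent = self
        child._report()

    def set_local_value(self, value):
        """Change the local value of F and propagate the change upwards."""
        self.local_value = value
        self._report()

    def _report(self):
        # Walk towards the root, updating what each parent stores.
        node = self
        while node.parent is not None:
            node.parent.child_aggregates[node.name] = node.aggregate()
            node = node.parent
```

For example, with a root holding 1, children `a` (2) and `b` (3), and a grandchild `c` (4) under `a`, the root stores 6 for `a` and 3 for `b`; raising the value at `c` to 10 updates the entry for `a` at the root to 12 without contacting `b`.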

For the location-based search problem addressed in this thesis, GAP is a promising solution. Physical resources such as processing power and memory at a link can be aggregated. However, locally caching all geographical locations covered by a link is a challenging task, especially with memory and storage limitations at a node level. Moreover, expressing geographical locations as a mathematical function F is equally challenging.

2.3 Index-based approaches

Index-based approaches build a lookup index per node. An example is a database which stores information about the location of user data in a P2P system. This information can be populated actively through successful communications and queries, or passively by analyzing past request-response paths. Two implementations using an index-based approach, and how they can be possible solutions for the location-based lookup problem addressed by this thesis work, are presented below.

2.3.1 Local index based search

Yang et al. (2002) describe an index-based approach in which a node stores enough information about other nodes within a k-hop distance from it [7]. Nodes also agree on the maximum depths (on a network graph of the P2P system) at which a query should be processed. Nodes at other depths just forward the query without processing it; these are termed node-skips.

This approach is similar to iterative deepening (Section 2.1.1). However, not all nodes are required to process a query. For the location-based lookup problem addressed by this thesis work, building a local search index per node is a promising option. However, for a node, the search index will have to maintain information about the geographic locations covered by all nodes within a k-hop distance. Moreover, the node-skips affect the accuracy of the solution.
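To make the storage requirement concrete, the following sketch builds such a local index for one node. It is an illustration under assumptions rather than the scheme of [7]: locations are reduced to plain string labels, and `build_local_index`, `location_of` and `overlay` are hypothetical names.

```python
from collections import deque

def build_local_index(graph, node, k, location_of):
    """Map each location label to the nodes within k hops of node."""
    index = {}
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        current, hops = frontier.popleft()
        index.setdefault(location_of[current], set()).add(current)
        if hops == k:
            continue  # nodes further away are outside the local index
        for peer in graph[current]:
            if peer not in seen:
                seen.add(peer)
                frontier.append((peer, hops + 1))
    return index

# Hypothetical overlay and location labels used only for illustration.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
location_of = {"a": "SE", "b": "SE", "c": "FI", "d": "DE", "e": "DE"}
```

With k = 1 the index at `a` covers only `a`, `b` and `c`; with k = 2 it also covers `d`. The index grows with the size of the k-hop neighborhood, which is the state cost noted above.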

2.3.2 Routing index based search

Introduced by Crespo et al. (2002), the Routing index is similar to the Local index (Section 2.3.1) [11]. Routing index based search requires a node to store (1) information about its immediate 1-hop routes, and (2) hints about the resources these hops can serve. This allows a node to make informed request-routing decisions while handling a search query, which results in a controlled forwarding of messages in the system.

For the location-based search problem addressed by this thesis, a routing index per node results in a controlled forwarding of messages in the system. However, a geographic location has a complicated representation (address fields, state, country, continent, etc.). Unless all participating nodes know how to derive routing decisions from these representations, single-hop routing information alone makes it difficult to route a location-based search request.
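A routing decision based on per-neighbor hints might look like the sketch below. It assumes, purely for illustration, that each hint is a count of nodes believed to be reachable through a neighbor at a given location label; the function `rank_next_hops` and the sample `routing_index` are hypothetical and do not reproduce the scheme of [11].

```python
def rank_next_hops(routing_index, wanted_location):
    """Order 1-hop neighbors by their hint for wanted_location.

    routing_index maps neighbor -> {location label: hinted node count}.
    Neighbors with no hint for the location are not returned.
    """
    candidates = [
        (hints.get(wanted_location, 0), neighbor)
        for neighbor, hints in routing_index.items()
    ]
    candidates = [(count, neighbor) for count, neighbor in candidates if count > 0]
    # Highest hint first; ties are broken by neighbor name for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [neighbor for _, neighbor in candidates]

# Hypothetical routing index held by one node.
routing_index = {
    "b": {"SE": 3, "DE": 1},
    "c": {"DE": 4},
    "d": {},
}
```

The sketch only works because every node interprets the label `DE` in the same way. With richer location representations, the comparison inside `rank_next_hops` is exactly the part that all nodes would need to agree on.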

2.4 Summary

This chapter introduced multiple research works that solve the P2P search problem. These are possible solutions to our location-based search problem. In general, we find that (1) nodes should agree on a representation of geographical locations throughout the system, and (2) participating nodes need to store some state information to avoid flooding the system.

The next chapter details the existing technologies and protocols that our solution employs to deliver its functionality.


Chapter 3

Background

In this chapter, we present background information related to P2P systems, their classification, and functionalities like membership management. We also discuss the CYCLON protocol, which plays a fundamental part in our solution. We then provide background information on OpenStack and the P2P OpenStack system architecture introduced by Xin et al., and detail how workload scheduling is currently handled by a P2P OpenStack system.

3.1 Peer to Peer (P2P) systems

P2P systems are widely studied and are alternatives to centralized systems. They involve loosely or strictly connected nodes participating in system-related tasks.

Ou et al. suggest that the origin of P2P technologies can be dated back to the 1960s [12]. Since then, P2P technologies have been adopted widely for distributed file sharing, torrents, etc. Ou et al. list multiple benefits of a P2P system compared to a traditional centralized system:

1. P2P systems are self-organizing and tolerant to frequent node addition, failure and node churn.

2. P2P systems assure load balancing and decentralization. This makes them fault tolerant and resilient to Denial of Service attacks.

3. Nodes can be loosely coordinated with little to no administrative arrangements.


3.2 Membership management in P2P systems

Ideally, a flat overlay network on top of physical nodes forms the P2Pnetwork for a system. Membership management protocols are respon-sible for setting up this overlay network. Hence, they are integral tothe functioning of a P2P system. The following properties of a P2Psystem make membership management a challenging task:

1. Lack of global knowledge at node level: Gossip-based dissemination protocols control flooding of messages in a system by forwarding a request to a selected subset of nodes. A node selects this subset from its local index. For correct results, these protocols assume that every node in the system is included in the local index list of some other node. In other words, they assume that the neighbor list of each node is drawn uniformly from the system. Ganesh et al. detail why this is not possible without complicated synchronization and why the size of stored state affects the performance of the system [13]. Hence, the lack of global knowledge at the node level makes maintaining connectivity in a P2P system challenging.

2. Flooding is not an optimal solution: Flooding can broadcast state information throughout a P2P system. It is discouraged primarily due to the large number of messages it creates in the P2P system. Lv et al. detail that flooding creates an imbalance in the total load on participating nodes [14].

Membership management is handled with respect to the structure of the P2P overlay. Centralized, loosely structured and unstructured are three of these P2P overlay architectures. For these three, membership management is handled as follows:

1. Centralized/highly structured P2P overlays make it easier to solve the membership management problem. In this architecture, a central directory actively keeps track of nodes in the system along with resources attached to them. An example of this setup is Napster, where nodes query a central directory to fetch information about the location of music files. However, the approach has limited scalability and results in unbalanced role sharing among participating nodes.


2. Loosely structured/decentralized P2P overlays do not have a centralized meta-data store, but information is spread throughout the P2P system in an orderly fashion so that queries are satisfied without flooding the network. Handling node churn and failure is a challenging task in this architecture. Hybrid overlay networks are a variation of this architecture. In hybrid overlays, a set of nodes, termed super-peers or dominating nodes, are responsible for serving a subset of normal nodes [6].

3. Unstructured P2P overlays are of much academic interest. They form P2P systems with nodes sharing equal roles. Nodes keep track of a set of other nodes (neighbors) through a membership management protocol. Gnutella is an example of such a protocol [15]. Information about the topology and resources held is stored arbitrarily among nodes. CYCLON is an example of an unstructured P2P membership protocol that maintains a random neighbor list. Resource lookup gets more complex in such a setting as a node has information only about a subset of its peers.

3.2.1 CYCLON protocol

CYCLON, introduced by Voulgaris et al., is a robust, scalable, decentralized and inexpensive membership management protocol for an unstructured P2P system [16]. The protocol is characterized by its building of network graphs with low diameter, low clustering and highly symmetric node degrees. Each node maintains an index table with information about a number of other nodes, called a neighbor table, on which CYCLON executes its shuffling algorithm. The shuffling algorithm is responsible for (1) introducing new nodes to the system and (2) notifying participating nodes about node churn or failure events. It is detailed later in this section.

CYCLON also stores an additional age parameter for each entry in the neighbor table. This parameter is a relative measurement of the existence of a node in the system. The P2P OpenStack architecture developed by Xin et al. uses CYCLON for membership management. We also use CYCLON in our solution for the location-based search problem. CYCLON maintains a neighbor table using CYCLON shuffle rounds. A shuffle round is an exchange of information between two nodes P and Q using the following steps, adapted from Voulgaris et al.:


1. Increase by one the age of all neighbors.

2. Select neighbor Q with the highest age among all neighbors, and l − 1 other random neighbors. l is a system parameter, called Shuffle Length.

3. Replace Q’s entry with an entry of age 0 and with P’s address.

4. Send the updated subset to peer Q.

5. Receive from Q a subset of no more than l of its own neighbors.

6. Discard entries pointing at P and entries already contained in P’s cache.

7. Update P’s cache to include all remaining entries, by firstly using empty cache slots (if any), and secondly replacing entries among the ones originally sent to Q.

In essence, the algorithm makes sure that every node in the system keeps a minimum number of periodically varying neighbors in its neighbor table. The age parameter also makes sure that there are no circular dependencies within the system, as explained and evaluated in [16].
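The shuffle steps listed above can be illustrated with a minimal in-memory sketch. It is not the implementation used in this work: the names Entry, Node, merge and shuffle are hypothetical, message passing is replaced by direct method calls, and we assume that Q answers with a random subset of its cache and that Q's own entry is the first one overwritten at P.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Entry:
    address: str
    age: int = 0


@dataclass
class Node:
    address: str
    cache_size: int
    cache: list = field(default_factory=list)

    def merge(self, received, sent):
        # Steps 6-7: skip self-pointers and duplicates; use empty slots first,
        # then overwrite entries that were sent to the peer.
        known = {e.address for e in self.cache}
        replaceable = list(sent)
        for entry in received:
            if entry.address == self.address or entry.address in known:
                continue
            if len(self.cache) >= self.cache_size:
                if not replaceable:
                    break
                victim = replaceable.pop()
                self.cache = [c for c in self.cache if c is not victim]
                known.discard(victim.address)
            self.cache.append(Entry(entry.address, entry.age))
            known.add(entry.address)


def shuffle(p, nodes, shuffle_len, rng):
    """One shuffle round initiated by p; nodes maps address -> Node."""
    if not p.cache:
        return
    for e in p.cache:                                   # step 1
        e.age += 1
    oldest = max(p.cache, key=lambda e: e.age)          # step 2
    others = [e for e in p.cache if e is not oldest]
    picked = rng.sample(others, min(shuffle_len - 1, len(others)))
    # Step 3: Q's entry is replaced by a fresh entry pointing at P.
    outgoing = [Entry(p.address, 0)] + [Entry(e.address, e.age) for e in picked]
    q = nodes[oldest.address]                           # step 4 (direct call)
    reply_src = rng.sample(q.cache, min(shuffle_len, len(q.cache)))
    reply = [Entry(e.address, e.age) for e in reply_src]  # step 5
    q.merge(outgoing, reply_src)
    # Assumption: Q's entry is listed last so that it is overwritten first.
    p.merge(reply, picked + [oldest])
```

In this sketch, a node A holding neighbors B, C and D, with C being the oldest, ends a round with C having learned about A and with A's pointer to C replaced by a neighbor received from C.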

3.3 P2P architecture for OpenStack

In this section, we introduce OpenStack, a cloud Infrastructure as a Service (IaaS). We also detail the motivations behind, and the system design of, the P2P OpenStack architecture [2, 17] with its shortcomings.

3.3.1 OpenStack

OpenStack is an open-source set of tools and services which provide a framework to deploy public and private clouds. It provides a cost-efficient and stable solution for service providers to deploy and maintain private clouds when compared to proprietary solutions like Amazon AWS, Google or Microsoft cloud solutions.

This thesis work does not modify or extend any individual OpenStack service. Additionally, OpenStack services are available to the end-user over RESTful APIs, which make it possible to integrate multiple instances of them in a P2P fashion.


3.3.2 P2P OpenStack architecture

Vertical scaling up of OpenStack by adding compute nodes to a standalone OpenStack instance has been found to create bottlenecks as the number of compute nodes increases. Benchmarking [18] shows how the CPU and memory usage rises as a total of 1000 compute nodes are added to a single OpenStack instance. A previous study by Cisco Inc. [19] found an increased rate of request failures at a configuration of ≥ 147 compute nodes per OpenStack instance. These were the motivations behind the P2P OpenStack architecture [2].

In their architecture, Xin et al. use a message proxy service called an agent, on top of individual OpenStack instances. An OpenStack instance with its corresponding agent is termed an OpenStack cloudlet. The agent overlay forms the P2P network in their architecture.

Figure 3.1: P2P Architecture with OpenStack instances sharing an identity service. Image is courtesy of Xin et al. [2].

The architecture is inspired by the OpenStack multi-region deployments where one identity (keystone) service serves multiple OpenStack instances. An overview of the setup is given in Figure 3.1. The P2P OpenStack architecture employs CYCLON for membership management.


3.3.3 P2P OpenStack system: Scheduling a workload

Xin et al. employ a random selection algorithm to select an OpenStack instance while scheduling a workload. An agent uses a power of two choices [20] selection to decide to which (other) OpenStack instance a request gets forwarded. The method randomly selects two OpenStack instances from an agent’s neighbor table. Three weighting markers are then used to select an instance from the two. These markers are (1) disk usage, (2) available free memory, and (3) boot image availability at an OpenStack instance. However, the random selection has the following limitations:

• There is no guarantee that a resource is found, even though it might exist somewhere on the P2P system.

• It lacks the ability to process and satisfy a geographical location-based scheduler instruction. The randomness in the filtering process makes it challenging to apply a weighting based on geographical location markers.

These shortcomings are motivations for our location-based search mechanism introduced in upcoming sections.
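The power of two choices selection described above can be sketched as follows. This is only an illustration under stated assumptions: the class InstanceState, its fields, and the lexicographic ordering in rank are hypothetical, since the way the three markers are combined is not specified here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceState:
    name: str
    disk_usage: float      # fraction of disk in use, lower is better
    free_memory_mb: int    # higher is better
    has_boot_image: bool   # True if the requested boot image is available


def rank(state):
    # Assumed ordering of the three markers; the source does not specify it.
    return (state.has_boot_image, state.free_memory_mb, -state.disk_usage)


def pick_instance(neighbors, rng):
    """Sample two neighbors uniformly at random and keep the better-ranked one."""
    if not neighbors:
        return None
    candidates = rng.sample(neighbors, min(2, len(neighbors)))
    return max(candidates, key=rank)
```

The sketch makes both limitations visible: only instances that happen to be in the neighbor table and in the random sample are ever considered, and nothing in the ranking refers to the location of an instance.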

3.4 Summary

This chapter introduced P2P systems, their properties and the challenges around P2P membership management. We also discussed CYCLON, a membership management protocol for unstructured P2P networks. The chapter then detailed the P2P OpenStack architecture in Section 3.3.2 and provided insight into the existing workload scheduling strategy employed by that architecture.

In the following chapters, we present our solution for the location-based search problem.


Chapter 4

Location-based P2P lookup

In this chapter, we describe our approach to solve the location-based resource lookup problem for a P2P OpenStack system. We also expect that the solution can be used to solve other resource lookups in a generic P2P system. Keeping this in mind, we present the semantics of our solution in general terms.

The upcoming sections are organized as follows. Section 4.1 describes an overall architecture of our solution. This is followed by Section 4.2 which introduces and defines the key terms used in our solution. Specific requirements of the P2P OpenStack system and how our approach can solve them are detailed in subsections that follow. Later, we define the query language processed by the system in Section 4.3. Section 4.4 has the algorithm and implementation of NSBootstrap, our node bootstrap protocol. This is followed by the architecture and implementation of our search protocol NSSearch, in Section 4.5.

4.1 Search architecture

Our P2P search architecture is described in Figure 4.1. This architecture is inspired by the spatial search system described by Uddin et al. in [21]. The search is designed to run on top of a decentralized P2P system. Users submit their queries on a node in the system, termed the search node, and expect the results of their query to be retrieved on the same node.

As described in Section 3.3.2, the P2P plane of a P2P OpenStack system consists of agent nodes running on top of OpenStack instances. These agents form the P2P layer. The solution proposed by this thesis is an indexing-based solution. Hence, for a node, it (1) creates an index, (2) maintains the index and (3) executes a search on top of this index.

Figure 4.1: Search architecture, figure inspired by [21].

The creation, maintenance, and traversal of this index are major contributions of this thesis work. This index is termed the finger table of a node throughout this report. Items in this table are termed fingers for a given node. This is analogous to the concept of neighbors in membership management protocols listed in Section 3.2. The algorithm expects each node in the system to have a unique label associated with it. The semantics of this label are described in upcoming sections. The algorithm that initializes the finger table is denoted as NSBootstrap. NSBootstrap also triggers a finger table population and maintenance protocol, CYCLON. An overview of the lifetime of a node executing our solution is described in Figure 4.2. NSSearch is the location-based search algorithm which makes use of the finger table presented in this thesis. These algorithms are described in detail in upcoming sections.

Figure 4.2: The lifetime of a node executing our solution. The CYCLON protocol is an integral part of our solution.


Once a bootstrapping node attains a minimum set of location information of the system in its finger table, we say that the node has attained a stable state. Defining this stable state for the system is also a contribution of this report.

4.2 Abstract Namespace Tree (ANT)

In this section, we present our label-based approach to solve the location-based lookup problem identified in this report.

Popular distributed systems like Cassandra recommend that administrators use consistent naming across their replicas within data-centers [22]. These names/labels help replicas in routing decisions within the system. Cloud providers like Amazon EC2 also have availability zones, which are labels of their data-centers like eu-central-1, eu-north-2, etc.

We extend this labeling approach to hold more information to assist nodes in a P2P system to execute a location-based resource lookup. We propose a labeling scheme that (1) is hierarchical at the naming level, (2) extends a common prefix, basePrefix, and (3) represents the physical location of a node. The labeling scheme creates a tree rooted at basePrefix for a P2P system, termed ANT. The approach can be represented as an enhanced use of Routing Indexes described in Section 2.3.2.

Our solution is different from general tree-based solutions in the following ways: (1) physical nodes are not placed in a hierarchical order on the ANT and are present only at the leaf level, and (2) a node perceives its position on the ANT through its label. Hence, the notion of the tree is an abstract one and we term it the Abstract Namespace Tree of the system. The abstract nature of the ANT frees a node from (1) storing extra state information of its related nodes (parent, peers, and children) and (2) maintaining these relations during churns, failures or exit.

In our approach, the naming across participating P2P nodes forms an ANT rooted at basePrefix. Peers are labeled uniquely, extending this basePrefix with string tokens separated by a delimiter. For a node in the P2P OpenStack system, these string tokens can be a representation of its physical location.

Example: root.eu.se.stockholm.kista.dc1.node1 is a label extending the basePrefix root. It carries information about the physical location of node1 in the P2P system.

Figure 4.3 depicts an ANT extending the basePrefix root. In this example, we assume that string tokens a, b, ..., h denote the physical locations of nodes 1 to 9. We use this representation instead of the actual geographical location of nodes throughout this thesis report.

Figure 4.3: An ANT formed out of consistent labeling extending the basePrefix root. Physical P2P nodes are at the leaf level (shaded gray).

node4 in Figure 4.3 has the label root.a.b.node4.

Additionally, we define the following properties of the label of a P2P node present in an ANT:

1. prefixLevels: For a given label, we identify prefixLevels as extensions of the basePrefix in its label. A P2P node root.a.b.node4 in Figure 4.3 has the following prefixLevels: root., root.a., root.a.b.

2. childPrefixes for a prefixLevel: For a given prefixLevel, we term childPrefixes as its immediate non-leaf children on the ANT. A P2P node root.a.b.node4 in Figure 4.3 has childPrefixes root.a., root.d., root.e. at the root. prefixLevel. For the same node, all childPrefixes for all of its prefixLevels are described in Table 4.1.


Table 4.1: childPrefixes for all prefixLevels of root.a.b.node4 from the sample tree in Figure 4.3. The prefixLevel root.a.b. does not have a childPrefix as it does not have any non-leaf nodes on the ANT.

prefixLevel    childPrefixes
root.          root.a., root.d., root.e.
root.a.        root.a.b., root.a.c.
root.a.b.      −
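The two definitions above can be made concrete with a short sketch. The function names prefix_levels and child_prefixes are ours, and the label list contains only the labels from Figure 4.3 that are named in the text, so it is a partial view of the sample tree. A global list of labels is used here purely for illustration; a real node only sees the labels in its finger table.

```python
DELIM = "."


def prefix_levels(label, base_prefix="root."):
    """Return all prefixLevels of a label, from basePrefix downwards."""
    if not label.startswith(base_prefix):
        raise ValueError("label does not extend the basePrefix")
    tokens = label[len(base_prefix):].split(DELIM)[:-1]  # drop the leaf name
    levels = [base_prefix]
    for token in tokens:
        levels.append(levels[-1] + token + DELIM)
    return levels


def child_prefixes(level, labels):
    """Return the immediate non-leaf children of a prefixLevel."""
    children = set()
    for label in labels:
        if not label.startswith(level):
            continue
        rest = label[len(level):].split(DELIM)
        if len(rest) >= 2:  # the next token has children, so it is not a leaf
            children.add(level + rest[0] + DELIM)
    return sorted(children)


# Labels from Figure 4.3 that are mentioned in the text (partial tree).
LABELS = [
    "root.a.b.node2",
    "root.a.b.node4",
    "root.a.c.node3",
    "root.d.node1",
    "root.e.f.g.node6",
]

for level in prefix_levels("root.a.b.node4"):
    print(level, child_prefixes(level, LABELS))
```

For root.a.b.node4, the loop lists the same childPrefixes per prefixLevel as Table 4.1.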

To summarize, some properties of the ANT which our algorithms utilize are:

1. Actual physical P2P nodes are present only at the leaf of the ANT.

2. basePrefix is a property of the P2P system. Nodes are assigned labels extending basePrefix prior to bootstrap.

3. A physical node belongs to all its prefixLevels. A P2P node root.a.b.node4 belongs to the root., root.a. and root.a.b. prefixLevels.

Why not use DNS for location-based lookups?

DNS is hierarchical, centralized and builds a tree rooted at its Top Level Domain (TLD) in the Domain Name Space. For example, a domain www.example.com has a TLD of .com. A domain name is an extension of its parent node in the Domain Name Space, separated by a dot. For our specific use case of the location-based resource lookup in P2P systems, we do not use the DNS system as it requires physical nodes to be arranged in a hierarchical fashion. Moreover, a leading principle of our P2P OpenStack architecture is to avoid any supernodes or centralized components in the system [2].

4.2.1 Tracking the state of an ANT

Our solution architecture expects a node to build an index (finger table) to track the state of the ANT it is part of. We specify the following limited information a P2P node should keep track of:

1. Its own unique label in the system. Our algorithm requires that a unique label be assigned to a node during its boot-up. A network identifier is also stored; this is assumed rather than mentioned explicitly.

Figure 4.4: A P2P node should keep track of at least one node in all childPrefixes of all its prefixLevels. In effect, node4 should keep track of at least one node on all branches of the ANT which include a shaded token (along with its leaf sibling, node2).

2. A finger table with a list of all its prefixLevels along with information of at least one node in every childPrefix for each of them. The actual number of nodes to keep track of is configurable, and its impact on our solution is detailed in Section 4.4.3.

3. Additionally, information of all of its leaf siblings on the ANT. A P2P node root.a.b.node4 in Figure 4.3 should store information of root.a.b.node2 at its root.a.b. prefixLevel. This ensures that every node in the system belongs to at least one prefixLevel of some other node in the ANT.

As an example, Figure 4.4 describes the prefixLevels a node root.a.b.node4 should keep track of. These prefixLevels are shaded gray. The actual state information stored would be the network identifier and label of a node in its corresponding prefixLevels.
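The tracking requirements above can be expressed as a check over a finger table. The sketch below is illustrative only: the function missing_coverage and the dictionary layout of the finger table are assumptions, network identifiers are omitted, and the full label list is used solely by the checker and is not something a node would know. We also assume that a node's own label counts as a pointer into the childPrefix it belongs to.

```python
def prefix_levels(label, base_prefix="root."):
    tokens = label[len(base_prefix):].split(".")[:-1]
    levels = [base_prefix]
    for token in tokens:
        levels.append(levels[-1] + token + ".")
    return levels


def child_prefixes(level, labels):
    found = set()
    for label in labels:
        rest = label[len(level):].split(".") if label.startswith(level) else []
        if len(rest) >= 2:
            found.add(level + rest[0] + ".")
    return sorted(found)


def leaf_siblings(my_label, labels):
    parent = prefix_levels(my_label)[-1]
    return sorted(
        l for l in labels
        if l != my_label and l.startswith(parent) and "." not in l[len(parent):]
    )


def missing_coverage(my_label, finger_table, all_labels):
    """List (prefixLevel, missing childPrefix or sibling) pairs for a node."""
    levels = prefix_levels(my_label)
    missing = []
    for level in levels:
        row = finger_table.get(level, set())
        for child in child_prefixes(level, all_labels):
            if not any(finger.startswith(child) for finger in row):
                missing.append((level, child))
    for sibling in leaf_siblings(my_label, all_labels):
        if sibling not in finger_table.get(levels[-1], set()):
            missing.append((levels[-1], sibling))
    return missing


ALL = ["root.a.b.node2", "root.a.b.node4", "root.a.c.node3",
       "root.d.node1", "root.e.f.g.node6"]
ME = "root.a.b.node4"
table = {
    "root.": {ME, "root.e.f.g.node6", "root.d.node1"},
    "root.a.": {ME},
    "root.a.b.": {ME},
}
print(missing_coverage(ME, table, ALL))
```

With the table shown, node4 still lacks a pointer into root.a.c. and its leaf sibling node2; once both are added, the list of missing entries becomes empty.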

To summarize, an overview of the key terms introduced by ANT is given in Table 4.2.


Table 4.2: Description and properties of key terms used throughout this report.

Term           Description                                            Properties
basePrefix     top level string token of the ANT                      All nodes belong to basePrefix. Example: root.
prefixLevel    extension of basePrefix                                Determines the size of the finger table. Example: root., root.a., root.a.b.
childPrefix    immediate non-leaf child of a prefixLevel on an ANT    Example: root.a., root.d., root.e. are childPrefixes for the root. prefixLevel

4.2.2 Use case: Geographical locations

The geographic location of a node can be used to generate logical labels for participating nodes. Classical address identifiers can also be used to label a node, which can assist later with location-based resource lookups.

Figure 4.5: ANT formed based on geographical location. This is proposed as a solution for our P2P OpenStack location-based search problem.


For example, a labeling scheme of the form root.continent.country.state.city.street.datacenter.nodeName can assist our search algorithm in efficiently locating a node at a given location. This is the exact use case addressed by this thesis for a P2P OpenStack system. A sample setup following this naming scheme is shown in Figure 4.5.

4.3 The query language

In this section, we define the query language used by our search protocol. We denote it in equation 4.1 below using BNF notation. In its raw form, namespace-based labeling can support a query to find searchCount nodes matching a prefix searchPrefix in the system’s ANT.

query : prefix = searchPrefix ∧ count = searchCount (4.1)

Notation 4.1 denotes an ∧ (AND) operator between two query parameters: (1) prefix, and (2) count. The search returns searchCount results for a given searchPrefix if found in the system. An example query processed by our search mechanism is:

query : (prefix = root.a.b.) ∧ (count = 5)
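The two query parameters can be represented as a small value object. The sketch below is a hypothetical illustration: Query and match_local are our names, and match_local only filters a list of labels already known to one node. It does not show how the search protocol traverses the P2P system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    prefix: str   # searchPrefix, e.g. "root.a.b."
    count: int    # searchCount, the number of results requested

    def __post_init__(self):
        if self.count < 1:
            raise ValueError("count must be a positive integer")


def match_local(query, labels):
    """Return at most query.count known labels that extend query.prefix."""
    hits = sorted(label for label in labels if label.startswith(query.prefix))
    return hits[:query.count]


KNOWN = ["root.a.b.node2", "root.a.b.node4", "root.a.c.node3",
         "root.d.node1", "root.e.f.g.node6"]
print(match_local(Query(prefix="root.a.b.", count=5), KNOWN))
```

For the example query above, only two of the known labels extend root.a.b., so fewer than the requested five results are returned.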

4.4 NSBootstrap: Node bootstrap protocol

We term the protocol that is responsible for initializing a finger table for a node NSBootstrap. The protocol also kick-starts the finger table population and maintenance protocol, CYCLON, for a P2P node.

Pre-conditions: A P2P node is configured with its label and a pointer to an introducer node in the system. The introducer has pointers to at least one other node which belongs to the basePrefix level. Since all P2P nodes on the ANT belong to basePrefix, the introducer stores pointers to a random subset of P2P nodes in the system. Newly joined nodes are configured to update the introducer with their information on start-up.

The node information passed between nodes executing NSBootstrap is always the node's network identifier (e.g. IP address) and its label. An overview of the NSBootstrap protocol executed by a new node n joining the system is:


1. Node n determines its prefixLevels L1, L2, ..., LN and adds its information to all these levels. This initializes its finger table.

2. Node n updates the introducer about it joining the system. Simultaneously, it requests pointers to other nodes in basePrefix. The response is saved as fingers for the basePrefix.

3. With CYCLON, node n exchanges its information with its basePrefix fingers, growing the finger table for basePrefix. As soon as it finds a node matching its next prefixLevel L2, NSBootstrap advances to that prefixLevel.

4. Node n continues exchanges at this prefixLevel, growing and advancing through its finger table until all its prefixLevels have been traversed.

The exchanges mentioned in steps (3) and (4) above populate and maintain the finger table for a node. These exchanges are programmed to maintain information about all parts of the ANT that a P2P node is expected to keep track of. The actual workflow of these exchanges is discussed later in this section.

Algorithm 1 is the pseudo-code for NSBootstrap.


Algorithm 1 Pseudocode for a node n joining our ANT-based system and executing the NSBootstrap protocol.
Input: introducer - pointer to the introducer node; myLabel - label of n.
Result: Finger table F of n populated; maintenance protocol CYCLON triggered for all prefixLevels of the node.
Data structures:
 1: F := finger table of n
Message types:
 2: message(IPAddress, Label)
NSBootstrap(introducer, myLabel):
 3: prefixLevels ← getAllPrefixLevels(myLabel)
 4: for prefix in prefixLevels do
 5:     F[prefix].append(myLabel)
 6: end for
 7: fingers ← introducer.RetrieveForPrefix(basePrefix)
 8: F[basePrefix].append(fingers)
 9: nextPrefix ← getNextPrefix(basePrefix)
10: PopulateAtLevel(basePrefix, nextPrefix)
11: return True
PopulateAtLevel(currentPrefix, nextPrefix):
12: if nextPrefix == myLabel then
13:     return True
14: end if
15: StartCYCLONAtLevel(currentPrefix) // Finger table population. Main thread waits until a node matching nextPrefix shows up.
16: targetFinger ← F[currentPrefix].fingerMatching(nextPrefix)
17: F[nextPrefix].append(targetFinger)
18: currentPrefix := nextPrefix
19: nextPrefix ← getNextPrefix(currentPrefix)
20: PopulateAtLevel(currentPrefix, nextPrefix)
21: return True

As mentioned, NSBootstrap (1) initializes the finger table for a node, and (2) kick-starts the finger table population and maintenance process, CYCLON.

The algorithm is divided as follows. Lines 3-11 represent the primary execution thread of NSBootstrap. An auxiliary function PopulateAtLevel, represented on lines 12-21, is called by NSBootstrap and iterates through each prefixLevel. The NSBootstrap algorithm stops its execution once it has traversed all prefixLevels of a node. However, it triggers a background finger table population and maintenance process, StartCYCLONAtLevel (line 15), for each prefixLevel. This background process terminates only once the node leaves the system (or in the event of a crash/failure). Moreover, Algorithm 1 uses the utility functions described in Table 4.3.

Table 4.3: Utility functions used in Algorithm 1.

Line     Utility function                   Description
3        getAllPrefixLevels(label)          params: label; returns: prefixLevels for a label
7        node.RetrieveForPrefix(prefix)     params: prefix; returns: fingers for a given prefix fetched from a node
9, 19    getNextPrefix(currentPrefix)       params: currentPrefix; returns: next string token in label after currentPrefix
16       fingers.fingerMatching(prefix)     params: prefix; returns: nodes matching prefix from a list of fingers

For a node n with a label myLabel, NSBootstrap starts by determining its prefixLevels using a utility function (line 3). For node root.a.b.node4 from our previous examples, this function would return the following prefixLevels: root., root.a., root.a.b.

The algorithm initializes a finger table F with these prefixLevels as its rows. For each of these prefixLevels, the algorithm then adds information about the executing node in lines 4-6. Rows of the finger table are exchanged as a whole, and hence adding information about myLabel to the finger table of node n is a necessity. The state of the finger table for a node root.a.b.node4 after this step is shown in Table 4.4.


Table 4.4: Finger table of node root.a.b.node4 from Figure 4.3 after execution of NSBootstrap lines 3-6.

prefixLevel    Fingers
root.          root.a.b.node4
root.a.        root.a.b.node4
root.a.b.      root.a.b.node4
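The initialization in lines 3-6 of Algorithm 1 can be sketched in a few lines. The Python names below mirror the utility functions of Table 4.3 but are our own rendering. In particular, we assume that getNextPrefix returns the full extended prefix (and the label itself at the last level), which is one reading that makes the exit test on line 12 work.

```python
BASE_PREFIX = "root."


def get_all_prefix_levels(label):
    """Line 3: every prefixLevel of the label, starting at the basePrefix."""
    tokens = label[len(BASE_PREFIX):].split(".")[:-1]
    levels = [BASE_PREFIX]
    for token in tokens:
        levels.append(levels[-1] + token + ".")
    return levels


def get_next_prefix(label, current_prefix):
    """Lines 9 and 19: extend current_prefix by the next token of the label.

    Assumption: at the last prefixLevel the label itself is returned, so the
    comparison with myLabel on line 12 ends the recursion.
    """
    rest = label[len(current_prefix):].split(".")
    if len(rest) == 1:
        return label
    return current_prefix + rest[0] + "."


def init_finger_table(label):
    """Lines 4-6: one row per prefixLevel, each seeded with the node itself."""
    return {level: [label] for level in get_all_prefix_levels(label)}


table = init_finger_table("root.a.b.node4")
for level, fingers in table.items():
    print(level, fingers)
```

For root.a.b.node4, the printed rows correspond to Table 4.4.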

NSBootstrap advances by fetching information about basePrefix from its introducer in line 7. The step also updates the introducer about node n joining the system. This process makes sure that the newly joined node has information about at least one other node on the ANT. This step also ensures connectivity throughout the P2P system, which is explained later in this chapter. The state of the finger table for a node root.a.b.node4 after this step is described in Table 4.5.

Table 4.5: Finger table of node root.a.b.node4 from Figure 4.3 after execution of NSBootstrap lines 7-8.

prefixLevel    Fingers
root.          root.a.b.node4, root.e.f.g.node6
root.a.        root.a.b.node4
root.a.b.      root.a.b.node4

Once the algorithm has received information about another node in basePrefix, it calls the auxiliary function PopulateAtLevel on line 10. The function performs the following steps:

1. It triggers CYCLON, the finger table population and maintenance process, through StartCYCLONAtLevel for each prefixLevel (line 15). The process executes in the background and is explained in upcoming sections.

2. It waits for information about other nodes matching nextPrefix to show up at a given prefixLevel. CYCLON is responsible for gathering information about other nodes in the system for a given prefixLevel. The auxiliary function remains in a blocked state until it finds a node matching nextPrefix. For example, at the root. level, the auxiliary function remains blocked until CYCLON adds at least one entry for the root.a. prefixLevel.


3. It recursively advances through all prefixLevels of a node, executing the steps mentioned in (1) and (2) above.

CYCLON populates the finger table of a P2P node with shuffling rounds (Section 4.4.1). An example state of the finger table of node root.a.b.node4 after several rounds of CYCLON shuffling for the root. prefixLevel is given in Table 4.6.

Table 4.6: Finger table of node root.a.b.node4 from Figure 4.3 after several CYCLON shuffling rounds at the root. prefixLevel.

prefixLevel    Fingers
root.          root.a.b.node4, root.e.f.g.node6, root.d.node1, root.a.c.node3
root.a.        root.a.b.node4
root.a.b.      root.a.b.node4

Eventually, the auxiliary function iterates through all prefixLevels of a node. An exit condition is met for the last prefixLevel on line 12, and NSBootstrap terminates. However, the background maintenance processes spawned by the auxiliary function continue to execute until the node leaves the system. This approach has the following benefits:

1. A failure, crash or removal event at a P2P node is recognized by the maintenance process. When such an event is recognized, the maintenance process can remove the failing node from its finger table.

2. Information about new nodes joining the P2P system is propagated through the system.

The following section describes the finger table population and maintenance process in detail.

4.4.1 Using CYCLON for finger table maintenance

NSBootstrap utilizes the CYCLON protocol for populating and maintaining the finger table of a P2P node. The CYCLON protocol used by NSBootstrap is the original work by Voulgaris et al. [16]. The algorithm is described in Section 3.2.1. CYCLON has the advantage of being inexpensive and robust (time to converge, no circular relations).

Voulgaris et al. define a system parameter Shuffle Length for CYCLON exchanges. This constant is the maximum number of elements a node P exchanges with another node Q in one round of CYCLON. Since NSBootstrap executes CYCLON at multiple levels, the algorithm requires the following enhancements:

1. An extension to run multiple instances of a CYCLON process per P2P node. For NSBootstrap, an instance of the CYCLON protocol is executed per prefixLevel.

2. Redefining the CYCLON Shuffle Length to make sure that a node preserves at least one pointer to every childPrefix of a prefixLevel after a CYCLON round. This is achieved by always exchanging Shuffle Length number of fingers per childPrefix for a given prefixLevel in a CYCLON round. In a CYCLON exchange from P to Q, this ensures that all childPrefixes for a prefixLevel known to P are discovered by Q. Additionally, P does not lose all pointers to a childPrefix for a prefixLevel after a CYCLON round.

One round of CYCLON

The original work by Voulgaris et al. defines a CYCLON round [16] as a single CYCLON shuffle operation. An overview of what happens during a CYCLON round at a prefixLevel during NSBootstrap for a node P is given below:

1. The age of all fingers in a given prefixLevel is incremented by 1.

2. P selects a peer Q with the highest age from its finger table row for prefixLevel.

3. P exchanges information about Shuffle Length (or fewer, if fewer are available) of its own fingers per childPrefix for the prefixLevel with Q. The selected fingers are modified as per original CYCLON recommendations prior to the exchange.

4. P receives from Q information about Shuffle Length fingers per childPrefix for the same prefixLevel.

5. P updates its finger table with this new set of fingers, without losing all pointers to any of its childPrefixes for the prefixLevel. This update step follows the original CYCLON recommendations.
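The five steps above can be sketched in Python as follows. This is a minimal single-process sketch under stated assumptions, not the thesis implementation: the names `child_prefix`, `pick_per_child_prefix` and `cyclon_round` are hypothetical, a finger table row is assumed to be a dict from label to age, the peer's row is accessed directly instead of over the network, and eviction of old entries is left out, so no childPrefix can lose its pointers.

```python
import random
from collections import defaultdict


def child_prefix(label, prefix_level):
    """childPrefix of `label` directly below `prefix_level` (hypothetical helper)."""
    rest = label[len(prefix_level):]
    head, sep, _ = rest.partition(".")
    return prefix_level + head + sep


def pick_per_child_prefix(row, prefix_level, shuffle_length, rng):
    """Pick up to `shuffle_length` fingers for every childPrefix present in `row`."""
    groups = defaultdict(list)
    for label in row:
        groups[child_prefix(label, prefix_level)].append(label)
    picked = {}
    for labels in groups.values():
        for label in rng.sample(labels, min(shuffle_length, len(labels))):
            picked[label] = row[label]
    return picked


def cyclon_round(p_label, p_row, rows, prefix_level, shuffle_length=1, rng=random):
    """Run one simplified round for node P at one prefixLevel.

    `p_row` maps finger label -> age. `rows` maps a peer label to that peer's
    row and stands in for the network. Returns the contacted peer, or None.
    """
    # Step 1: age every finger in this row.
    for label in p_row:
        p_row[label] += 1
    # Step 2: contact the oldest finger other than P itself.
    peers = [label for label in p_row if label != p_label]
    if not peers:
        return None
    q_label = max(peers, key=lambda label: p_row[label])
    q_row = rows[q_label]
    # Step 3: P offers fingers per childPrefix; as in original CYCLON, the
    # entry for Q is swapped for a fresh entry (age 0) pointing at P.
    sent = pick_per_child_prefix(p_row, prefix_level, shuffle_length, rng)
    sent.pop(q_label, None)
    sent[p_label] = 0
    # Step 4: Q answers with its own selection per childPrefix.
    received = pick_per_child_prefix(q_row, prefix_level, shuffle_length, rng)
    # Step 5: merge. Nothing is evicted in this sketch (assumption), so
    # neither side can lose all pointers to a childPrefix.
    for label, age in received.items():
        p_row.setdefault(label, age)
    for label, age in sent.items():
        q_row.setdefault(label, age)
    return q_label
```

A bounded finger table would additionally need an eviction rule that skips the last remaining finger of each childPrefix; that rule is not shown here.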


NSBootstrap search index and CYCLON finger tables

For a P2P node executing our solution, NSBootstrap creates a search index termed the finger table. This index is later used by our distributed search algorithm to traverse the P2P network. NSBootstrap uses CYCLON to populate and maintain this finger table. CYCLON requires a finger table to execute its shuffling operation. The distinction between the search index built per P2P node by NSBootstrap and the actual finger tables maintained by CYCLON is represented in Table 4.7.

For a P2P system executing NSBootstrap, CYCLON can be replaced with an indexing-based approach for P2P search (Section 2.3) that generates a search index similar to Table 4.7 (a). However, CYCLON is preferred in our solution as the P2P OpenStack architecture utilized the protocol for membership management [2].

Table 4.7: The search index built by NSBootstrap (a), and the actual finger tables built and maintained by CYCLON (b-d) for root.a.b.node4 from Figure 4.3.

(a) Search index built by NSBootstrap for node root.a.b.node4 (Figure 4.3)

prefixLevel    Fingers
root.          root.a.b.node4, root.e.f.g.node6, root.d.node1, root.a.c.node3
root.a.        root.a.b.node4, root.a.b.node2, root.a.c.node3
root.a.b.      root.a.b.node4, root.a.b.node2

(b) CYCLON finger table for root. prefixLevel

Finger              Age
root.a.b.node4      0
root.e.f.g.node6    2
root.d.node1        3
root.a.c.node3      4

(c) CYCLON finger table for root.a. prefixLevel

Finger              Age
root.a.b.node4      1
root.a.b.node2      3
root.a.c.node3      2

(d) CYCLON finger table for root.a.b. prefixLevel

Finger              Age
root.a.b.node4      0
root.a.b.node2      1


Connectivity of the network graph

The finger table of a P2P node stores pointers to other nodes in the P2P network. NSBootstrap uses CYCLON to build and maintain this finger table. Connectivity between P2P nodes is essential to ensure that a resource lookup through the finger tables returns a correct result. Voulgaris et al. state that CYCLON guarantees connectivity between P2P nodes within a finite time in a failure-free environment [16]. NSBootstrap inherits this connectivity property from CYCLON.

However, there are edge cases in which the overlay network graph becomes disconnected. A crash, failure or removal of a P2P node can result in a disconnection. We do not investigate these cases within the scope of this thesis.

Similar to most index-based solutions, NSBootstrap eventually converges to a stable state. The properties of this stable state are described in Section 4.4.2.

4.4.2 Stable state for NSBootstrap

A bootstrap algorithm is evaluated for the time and effort it takes for a new node joining a system to attain a stable state. We say that a node executing NSBootstrap has attained a stable state if and only if its finger table contains the following:

1. At least one entry (excluding itself) for each of its prefixLevels, if such an entry exists in the system.

2. At least one entry (excluding its own entry) for each childPrefix of all its prefixLevels, if such an entry exists in the system.

3. All entries of its leaf siblings in the ANT for its parent prefixLevel.
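The three conditions can be checked mechanically when the full set of labels in the system is known, as it is in a test harness. The sketch below is one possible reading of the conditions, not the thesis code: `prefix_levels`, `child_prefix` and `is_stable` are hypothetical names, the finger table is assumed to map each prefixLevel to a list of labels, and `all_labels` is global knowledge that a real P2P node would not have.

```python
def prefix_levels(label):
    """All prefixLevels of a label, e.g. root.a.b.node4 -> root., root.a., root.a.b."""
    parts = label.split(".")[:-1]  # drop the leaf name
    return [".".join(parts[:i]) + "." for i in range(1, len(parts) + 1)]


def child_prefix(label, prefix_level):
    """childPrefix of `label` directly below `prefix_level`."""
    rest = label[len(prefix_level):]
    head, sep, _ = rest.partition(".")
    return prefix_level + head + sep


def is_stable(label, finger_table, all_labels):
    """Return True if `finger_table` satisfies the three stable state conditions."""
    others = {other for other in all_labels if other != label}
    levels = prefix_levels(label)
    for level in levels:
        known = set(finger_table.get(level, ())) - {label}
        in_level = {other for other in others if other.startswith(level)}
        # Condition 1: at least one other entry per prefixLevel, if one exists.
        if in_level and not known:
            return False
        # Condition 2: at least one other entry per existing childPrefix.
        needed = {child_prefix(other, level) for other in in_level}
        covered = {child_prefix(finger, level) for finger in known}
        if not needed <= covered:
            return False
    # Condition 3: every leaf sibling under the parent prefixLevel is known.
    parent = levels[-1]
    siblings = {
        other for other in others
        if other.startswith(parent) and "." not in other[len(parent):]
    }
    return siblings <= set(finger_table.get(parent, ()))
```

With the search index of Table 4.7 (a), and assuming the system contains only the five labels that appear in that table, `is_stable("root.a.b.node4", ...)` holds; removing root.a.c.node3 from the root.a. row breaks condition 2.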

The lifetime of a P2P node until it attains an NSBootstrap stable state is described in Figure A.1.

4.4.3 Optimize NSBootstrap resource utilization

In this section, we list a set of optimizations for NSBootstrap with respect to its utilization of P2P node resources. These optimizations aim to reduce the number of bytes (1) stored per P2P node, and (2) exchanged per NSBootstrap transaction. Both have a positive impact on NSBootstrap resource utilization on a P2P node. We define the following parameters, which can be configured to tune the execution of NSBootstrap.

1. Maximum number of children nodes per childPrefix: This determines the maximum number of other nodes whose information is stored per childPrefix for a prefixLevel. For NSBootstrap to function, a value ≥ 1 is required to satisfy the stable state conditions (Section 4.4.2). A higher value can speed up lookups, as more data is available, but increases the state stored per node.

2. CYCLON Shuffle Length: Determines the maximum number of fingers exchanged per childPrefix in a CYCLON round at a prefixLevel. For a node P executing NSBootstrap, a Shuffle Length of 1 would exchange one entry per childPrefix at a prefixLevel with a node Q in a single CYCLON round. A value ≥ 1 is required for this parameter.

3. CYCLON round intervals: The original CYCLON algorithm does not mandate the interval between successive shuffling rounds [16]. Reducing this value can speed up NSBootstrap. However, this can increase the number of requests per unit time in the system, affecting CPU and network usage and eventually performance. This parameter should be tuned considering the volatility of nodes in the P2P network.
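The three tuning parameters could be grouped into a small configuration object that rejects invalid values. This is an illustrative sketch only: the class and field names are hypothetical, the defaults follow the values used in the evaluation setup of Section 5.1, and the requirement that the interval be positive is an assumption rather than something the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NSBootstrapConfig:
    """Hypothetical container for the three NSBootstrap tuning parameters."""

    # Maximum number of other nodes stored per childPrefix (must be >= 1).
    max_fingers_per_child_prefix: int = 2
    # Fingers exchanged per childPrefix in one CYCLON round (must be >= 1).
    shuffle_length: int = 1
    # Seconds between successive CYCLON rounds (assumed to be positive).
    round_interval_seconds: float = 3.0

    def __post_init__(self):
        if self.max_fingers_per_child_prefix < 1:
            raise ValueError("max_fingers_per_child_prefix must be >= 1")
        if self.shuffle_length < 1:
            raise ValueError("shuffle_length must be >= 1")
        if self.round_interval_seconds <= 0:
            raise ValueError("round_interval_seconds must be positive")
```

Validating at construction time keeps a misconfigured node from starting with a Shuffle Length of 0, which would prevent it from ever reaching the stable state.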

4.4.4 Factors affecting NSBootstrap convergence time

Section 4.4.2 lists the conditions for a node executing NSBootstrap to attain an NSBootstrap stable state. For a P2P system, our solution converges after all nodes in the system have attained an NSBootstrap stable state. Since a CYCLON round consumes time and computational power, we identify the following factors that can affect the time and effort it takes for a system to converge through NSBootstrap.

1. Exchanging information with siblings on the same branch: The CYCLON protocol selects a peer from its finger table based on its age to perform a shuffle operation. As per NSBootstrap requirements, leaf siblings on the ANT keep track of similar branches on the ANT. Hence, for a node executing NSBootstrap, exchanging information with peers (or siblings) on its branch on the ANT can reduce the number of CYCLON rounds required to attain a stable state. In contrast, exchanging information with nodes in a different branch on the ANT increases the number of CYCLON rounds needed to attain the same result.

2. Being introduced to a node in its own branch: A new P2P node is introduced to the system with information about some other random node. However, introducing a node P to an existing node Q on its own branch on the ANT can reduce the number of CYCLON rounds P takes to attain a stable state. The contrary results in more CYCLON rounds to attain the same result.

These factors are used to explain and form assumptions later in Chapter 5.

4.5 NSSearch: The lookup protocol

In this section, we introduce our lookup protocol, NSSearch. The protocol depends on the NSBootstrap algorithm and is a major contribution of this thesis. The protocol assumes that every node in the P2P system has initialized NSBootstrap; a node may or may not have attained the stable state defined in Section 4.4.2.

The query language processed by the search system is described in Section 4.3. An example query processed by NSSearch is to find searchCount nodes in a searchPrefix prefix. NSSearch utilizes the finger table built with NSBootstrap as an index to navigate through the P2P system systematically. Additionally, we term the node executing a search the search node. An overview of the NSSearch algorithm to find searchCount nodes in searchPrefix, as executed by a search node P, is given below:

1. P checks if searchCount nodes exist for searchPrefix in its finger table F. If yes, it returns these searchCount nodes as results. If not, the protocol advances to the next step.

2. P finds the longest prefix match between its label and searchPrefix. Fingers for this prefix are pulled from the finger table F. These fingers are further queried for nodes that match searchPrefix.

3. The algorithm further advances in an iterative or a recursive manner.


For a search node P with label myLabel, the lookup in step 2 above translates to a string comparison between searchPrefix and myLabel. For example, assume root.a.b.node4 from Figure 4.3 as a search node, processing a query of the following form via NSSearch:

query : (prefix = root.e.f.h.) ∧ (count = 2)

A longest prefix match between root.a.b.node4 and root.e.f.h. will result in:

longestMatch = longestPrefixMatch(root.e.f.h., root.a.b.node4)
             = root.

For node P, NSSearch now proceeds to figure out the next finger(s) to contact. NSBootstrap makes sure that a node stores information about at least one extra node for each of its childPrefixes at all its prefixLevels (Section 4.4.2). NSSearch leverages this knowledge to query a node's finger table for other fingers matching the next childPrefix after longestMatch. In the above example, this would be:

childSearchPrefix = (root.e.f.h.).getAfterPrefix(root.)
                  = root.e.

For node P, NSSearch now has the knowledge of (1) longestMatch, the prefixLevel it has in its finger table F, and (2) childSearchPrefix, the prefix which should match one or more of its longestMatch prefixLevel fingers. NSSearch identifies the fingers which could possibly answer a search query as possible searchTargets. In the above example of root.a.b.node4, this would be:

searchTargets = F[root.].match(root.e.)
              = root.e.f.g.node6
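The three string operations of this worked example can be sketched as plain Python functions. The function names mirror the utility functions of the thesis, but the bodies are assumptions about how they might be written; the finger table row is taken from Table 4.7 (a).

```python
def longest_prefix_match(search_prefix, label):
    """Longest common prefix, in whole dot-separated components, of the
    search prefix and the prefixLevels of `label` (leaf name excluded)."""
    wanted = search_prefix.rstrip(".").split(".")
    own = label.split(".")[:-1]
    common = []
    for a, b in zip(wanted, own):
        if a != b:
            break
        common.append(a)
    return ".".join(common) + "." if common else ""


def get_after_prefix(search_prefix, longest_match):
    """Extend `longest_match` by the next component of `search_prefix`."""
    rest = search_prefix[len(longest_match):]
    head, sep, _ = rest.partition(".")
    return longest_match + head + sep if head else longest_match


def match(fingers, child_search_prefix):
    """Fingers whose label lies under `child_search_prefix`."""
    return [finger for finger in fingers if finger.startswith(child_search_prefix)]


# Row for prefixLevel root. of node root.a.b.node4 (Table 4.7 (a)).
row_root = ["root.a.b.node4", "root.e.f.g.node6", "root.d.node1", "root.a.c.node3"]

longest = longest_prefix_match("root.e.f.h.", "root.a.b.node4")  # "root."
child = get_after_prefix("root.e.f.h.", longest)                 # "root.e."
targets = match(row_root, child)                                 # ["root.e.f.g.node6"]
```

Comparing whole components rather than characters avoids treating root.ab. as a prefix match for root.a.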

NSSearch advances further in an iterative (Algorithm 2) or recursive manner with these searchTargets. In the following sections, we describe the iterative traversal method in detail. The traversals are similar to the iterative and recursive methods used by the Domain Name System (DNS) [23] to resolve domain names to IP addresses.


4.5.1 Iterative traversal of an ANT by NSSearch

As the name suggests, a search node executing NSSearch iteratively is responsible for (1) finding the right nodes to query, (2) querying these nodes, and (3) reporting the results back to the user. Algorithm 2 gives the pseudocode for a node executing NSSearch in an iterative fashion.

Algorithm 2 Iterative NSSearch: Pseudocode for a node n receiving and executing NSSearch in an iterative fashion
Input: A query for count number of nodes in searchPrefix
Result: List with count number of nodes in searchPrefix (if found)
Data Structures:
 1: F := finger table of n;

NSSearch(searchPrefix, count)
 2: if F[searchPrefix].length ≥ count then
 3:     return F[searchPrefix].get(count)
 4: end if
 5: longestMatch ← getLongestPrefixMatch(myLabel, searchPrefix)
 6: childSearchPrefix ← getAfterPrefix(longestMatch, searchPrefix)
 7: searchTargets ← F[longestMatch].match(childSearchPrefix)
 8: results = NSSearchIterative(searchPrefix, count, searchTargets)
 9: return results

NSSearchIterative(searchPrefix, count, searchTargets)
10: results = []
11: for target in searchTargets do
12:     if results.length ≥ count then
13:         break
14:     end if
15:     FOUND, newResults ← target.FindForPrefix(searchPrefix)
16:     if FOUND == TRUE then
17:         results.append(newResults)
18:     end if
19:     searchTargets.append(newResults)
20: end for
21: return results

Preconditions: It is assumed that a node has achieved an NSBootstrap stable state as per Section 4.4.2. Even though we present the algorithm to work as soon as NSBootstrap has initialized the finger table F, this is not evaluated within the scope of this thesis.


Algorithm 2 is organized as follows. The main execution thread of the program is described in lines 2-9. NSSearch starts by checking if the answer for a query exists in its own finger table F in lines 2-4. If not, it proceeds to figure out further searchTargets using an auxiliary function NSSearchIterative in lines 10-21. Moreover, Algorithm 2 utilizes the utility functions described in Table 4.8.

Table 4.8: Utility functions used in the NSSearch iterative implementation (Algorithm 2).

Line   Utility Function                              Description
5      getLongestPrefixMatch(label, searchPrefix)    params: label, searchPrefix
                                                     returns: longest string match between label and searchPrefix
6      getAfterPrefix(longestMatch, searchPrefix)    params: longestMatch, searchPrefix
                                                     returns: next string token after longestMatch in searchPrefix
15     node.FindForPrefix(searchPrefix)              params: searchPrefix
                                                     returns: tuple of (True, results) if results found,
                                                     else tuple of (False, further search targets)

The auxiliary function NSSearchIterative (lines 10-21) iterates through possible searchTargets to retrieve count number of results for the query. Once the algorithm has iterated through all searchTargets, this result list is returned as a response to the search query in line 9.

The utility function FindForPrefix is responsible for querying other nodes in the system for possible results. P2P nodes have their own implementation to answer a query through FindForPrefix. Essentially, a node Q receiving a query through FindForPrefix from P executes NSSearch steps 1-7 and returns the searchTargets (or actual results if found) without further executing the auxiliary function NSSearchIterative. The boolean FOUND (line 15) is returned True if Q belongs to searchPrefix. In this case, the variable newResults (line 15) holds results which can be returned at P. If Q does not belong to searchPrefix, the boolean FOUND is returned False. In this case, the variable newResults holds possible other search targets which P can query further for results.
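The interplay between NSSearch, NSSearchIterative and FindForPrefix can be tried out on an in-memory model. The sketch below is a hedged illustration, not the thesis implementation: the `network` dict stands in for remote calls, node labels under root.e.f.h. are invented for the example, an index-based loop replaces the `for` loop of Algorithm 2 so the target list can grow safely, and duplicate targets are skipped (an added safeguard) so the loop always terminates.

```python
def longest_prefix_match(search_prefix, label):
    wanted = search_prefix.rstrip(".").split(".")
    own = label.split(".")[:-1]  # prefixLevel components only
    common = []
    for a, b in zip(wanted, own):
        if a != b:
            break
        common.append(a)
    return ".".join(common) + "." if common else ""


def get_after_prefix(search_prefix, longest_match):
    head, sep, _ = search_prefix[len(longest_match):].partition(".")
    return longest_match + head + sep if head else longest_match


def find_for_prefix(network, q_label, search_prefix):
    """Answer of node Q: (True, results) or (False, further search targets)."""
    table = network[q_label]
    if q_label.startswith(search_prefix):
        return True, sorted({q_label, *table.get(search_prefix, [])})
    longest = longest_prefix_match(search_prefix, q_label)
    child = get_after_prefix(search_prefix, longest)
    return False, [f for f in table.get(longest, []) if f.startswith(child)]


def ns_search(network, p_label, search_prefix, count):
    """Iterative NSSearch as executed by search node P."""
    table = network[p_label]
    local = list(table.get(search_prefix, []))
    if len(local) >= count:                       # lines 2-4
        return local[:count]
    longest = longest_prefix_match(search_prefix, p_label)
    child = get_after_prefix(search_prefix, longest)
    targets = [f for f in table.get(longest, [])
               if f.startswith(child) and f != p_label]
    results, position = [], 0
    while position < len(targets) and len(results) < count:   # lines 11-20
        target = targets[position]
        position += 1
        found, new = find_for_prefix(network, target, search_prefix)
        for label in new:
            if found and label not in results:
                results.append(label)
            if label not in targets and label != p_label:
                targets.append(label)
    return results[:count]


# Hypothetical finger tables; only the rows needed for the example are filled.
network = {
    "root.a.b.node4": {"root.": ["root.a.b.node4", "root.e.f.g.node6",
                                 "root.d.node1", "root.a.c.node3"]},
    "root.e.f.g.node6": {"root.e.f.": ["root.e.f.g.node6", "root.e.f.h.node5"]},
    "root.e.f.h.node5": {"root.e.f.h.": ["root.e.f.h.node5", "root.e.f.h.node7"]},
    "root.e.f.h.node7": {"root.e.f.h.": ["root.e.f.h.node7", "root.e.f.h.node5"]},
}
found_nodes = ns_search(network, "root.a.b.node4", "root.e.f.h.", 2)
```

In this run, root.a.b.node4 first contacts root.e.f.g.node6, which is outside the search prefix and redirects it to root.e.f.h.node5; that node belongs to the prefix and returns both matching labels.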


4.6 Summary

In this chapter, we have introduced and detailed the core contributions of this thesis. We started by detailing the overall architecture of our proposed P2P search solution, followed by the concept of the Abstract Namespace Tree (ANT). NSSearch, the search algorithm, is presented after the node bootstrap algorithm, NSBootstrap. The presented search algorithm is a straightforward traversal through the index built during NSBootstrap. We have also presented 3 optimization parameters for NSBootstrap, which are relevant for the solution evaluation presented in the following chapter.


Chapter 5

Evaluations and discussions

In this chapter, we detail the evaluation tests we did as part of this thesis work for NSBootstrap and NSSearch. Since we are introducing these two algorithms in this thesis, we define:

• Validation: We validate whether the algorithm delivers its expected results.

• Evaluation: We evaluate the results of the algorithm in varying environments. We also evaluate the resource utilization (CPU usage) of the algorithm on P2P nodes.

5.1 Test Setup

In this section, we detail the testbed and mechanics used while validating and evaluating our solutions.

All tests are run on a setup of N P2P nodes, with N varying from 108 to 3000. Each P2P node is deployed as a single Python process running on an HTTP port. To deal with testbed limitations, we group multiple such processes on a single Virtual Machine (VM) and run the VMs in parallel. The VMs remain in a single logical network and each has a configuration of 8 virtual CPU cores and 16 GB of RAM. Also, we deploy at most 125 such processes per VM to stay below 100% CPU usage (to avoid unrealistic results).

From an implementation perspective, the nodes start to execute NSBootstrap as soon as they receive their unique label from a script. Unlike a real-world setup, labels are assigned dynamically during the experiment. The Python processes we implemented are analogous to agent processes running on top of an OpenStack installation in a P2P OpenStack system (Section 3.3).

Moreover, P2P nodes represent the leaf level nodes of the system's ANT for every configuration.

All tests are repeated 10 times and we define 3 parameters which are varied, one after the other. These 3 parameters define the total number of leaf nodes in the system and the shape/size of the system ANT formed. These 3 parameters are:

1. Depth (D): The depth of the ANT excluding its leaf nodes. An increase of this number has an exponential effect on the total number of leaf nodes in the system.

2. K: The number of children every non-leaf vertex has on an ANT. This defines the expansion of the ANT or its width. An increase of this number has a polynomial effect on the total number of leaf nodes in the system.

3. Leaves (L): The number of leaf physical nodes on any branch on the ANT.

The total number of leaf P2P nodes in the system is hence determined by the parameters D, K, L and is denoted as N.
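The text does not give a closed form for N, but the definitions of D, K and L suggest N = L · K^D: K^D branches at depth D, each holding L leaf nodes. The short check below treats this relation as an inferred assumption and verifies it against the configurations listed in Table 5.1; the function name is hypothetical.

```python
def total_leaf_nodes(depth, k, leaves):
    """Inferred relation N = L * K**D (assumption, not stated in the text)."""
    return leaves * k ** depth


# (D, K, L, Nodes) rows copied from Table 5.1 (a), (b) and (c).
TABLE_5_1 = [
    (3, 3, 4, 108), (4, 3, 4, 324), (5, 3, 4, 972), (6, 3, 4, 2916),
    (3, 4, 4, 256), (3, 5, 4, 500), (3, 7, 4, 1372), (3, 9, 4, 2916),
    (3, 5, 1, 125), (3, 5, 2, 250), (3, 5, 8, 1000), (3, 5, 24, 3000),
]

mismatches = [row for row in TABLE_5_1
              if total_leaf_nodes(row[0], row[1], row[2]) != row[3]]
```

The relation also makes the growth rates explicit: N is exponential in D, polynomial (degree D) in K, and linear in L.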

Moreover, the following configuration parameters were used for running our evaluation tests:

1. Any new P2P node P is introduced to the system with the information of one additional node Q (P ≠ Q) by a static introducer within the system. This means that the node P has to populate its finger table all the way from a single pointer (to Q) to eventually attain an NSBootstrap stable state (Section 4.4.2).

2. For each prefixLevel in the system, a node P stores information about 1 or 2 other nodes (excluding itself) in its table for each of its childPrefixes per prefixLevel. For example, node root.a.b.node4 from Figure 4.3 stores information about 1 or 2 other nodes in root.a., root.d. and root.e. for the root. prefixLevel.


(a) Initial, D=2, K=2, L=1 configuration. N=4
(b) Varying L, D=2, K=2, L=2 configuration. N=8
(c) Varying K, D=2, K=3, L=1 configuration. N=8
(d) Varying D, D=3, K=2, L=1 configuration. N=8

Figure 5.1: Sample ANT state for various configurations of D, K and L

3. The CYCLON Shuffle Length is set by default to 1. Refer to Section 4.4.3 for details on this parameter. The parameter is set to the lowest possible value it can take to stress-test the bootstrapping process. Moreover, a node is considered to have attained a stable state if and only if it satisfies all conditions for our NSBootstrap stable state (Section 4.4.2).

4. For any node in the system, an execution round of our finger table maintenance protocol (CYCLON) happens at an interval of 3 seconds.


5.2 NSBootstrap

NSBootstrap is a contribution of this thesis work and defines the bootstrapping phase of a P2P node. For a new node joining a system, the bootstrap phase is considered complete when it attains all of the NSBootstrap stable state requirements (Section 4.4.2). Our strategy to evaluate NSBootstrap is described below:

1. Start N P2P nodes at the same time and assign each of them a unique label.

2. Monitor CPU usage and the number of CYCLON rounds required by all (N) nodes in the system until all nodes attain an NSBootstrap stable state (Section 4.4.2).

3. Start the N + 1th P2P node and assign it a unique label.

4. Monitor CPU usage and the number of CYCLON rounds required by the N + 1th node until it attains an NSBootstrap stable state.

5.2.1 Metrics measured

The metrics measured are:

1. CPU and memory usage by a node executing NSBootstrap until it attains a stable state.

2. The total number of CYCLON rounds executed at all prefixLevels by a node executing NSBootstrap until it attains a stable state. A CYCLON round in our setup is defined in Section 4.4.1.

5.2.2 Validation

From the evaluation experiments, we see that all P2P nodes in the system attained an NSBootstrap stable state after a finite number of CYCLON rounds. This validates our NSBootstrap protocol.

5.2.3 Evaluation

The configurations of D, K, and L we used for these experiments are listed in Table 5.1.


(a) Varying D, with constant K and L

D    K    L    Nodes    Repeat
3    3    4    108      x10
4    3    4    324      x10
5    3    4    972      x10
6    3    4    2916     x10
Total runs: 400

(b) Varying K, with constant D and L

D    K    L    Nodes    Repeat
3    3    4    108      x10
3    4    4    256      x10
3    5    4    500      x10
3    7    4    1372     x10
3    9    4    2916     x10
Total runs: 500

(c) Varying L, with constant K and D

D    K    L    Nodes    Repeat
3    5    1    125      x10
3    5    2    250      x10
3    5    4    500      x10
3    5    8    1000     x10
3    5    24   3000     x10
Total runs: 500

Table 5.1: Various configurations of D, K and L used for our experiments to evaluate NSBootstrap

A point on the resultant plot is the average measurement of 10 runs. The upper bound and lower bound of the error bar represent the maximum and minimum values obtained, respectively.
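The aggregation behind each plotted point can be sketched as follows. The function name is hypothetical and the sample values are made up purely to show the shape of the computation; they are not measurements from the experiments.

```python
from statistics import mean


def summarize_runs(rounds_per_run):
    """Collapse repeated runs into one plot point with min/max error bars."""
    return {
        "average": mean(rounds_per_run),
        "lower": min(rounds_per_run),
        "upper": max(rounds_per_run),
    }


# Made-up CYCLON round counts for 10 repetitions of one configuration.
point = summarize_runs([4, 5, 6, 5, 5, 4, 6, 5, 5, 5])
```

Because the error bars show the extremes rather than a standard deviation, a single outlier run is enough to widen them.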

5.2.4 Results and discussion

Varying D experiments

The results of varying D experiments are plotted in Figure 5.2.

Observations

1. We see a linearly increasing line for the average number of CYCLON rounds as Depth (D) increases.

2. For the average number of CYCLON rounds of N P2P nodes in the system, we see that the measurement range increases as the Depth (D) increases.


(a) Average CYCLON rounds for N nodes in the system. The total number of nodes (N) is given inside parentheses for the resulting configuration on the horizontal axis. A point in the plot is the average of N*10 measurements.

(b) Average CYCLON rounds for an N + 1th node joining the system after N nodes have attained a stable state as per Section 4.4.2.

Figure 5.2: Evaluation results for a system with fixed K and L values and varying Depth D.


3. For a new P2P node N + 1 joining the system, we see that the range of measurements around the average CYCLON rounds is negligible.

Discussion

1. As expected, the average total number of CYCLON rounds increases linearly with the depth D of the ANT. As D increases, a P2P node on the ANT has to keep track of more prefixLevels, which results in more CYCLON rounds executed per P2P node on average. This is also observed for an N + 1th node joining the system.

2. We observe a varying range of measurements around the average number of CYCLON rounds in Figure 5.2(a). This is expected due to (1) the randomness in the CYCLON protocol, and (2) the chance for a new P2P node to get introduced to another node in its own branch on the ANT. The latter can speed up NSBootstrap and reduce the total number of CYCLON rounds required.

3. For an N + 1th P2P node joining the system, attaining a stable state is a predictable process. From Figure 5.2(b), one can deduce that the total number of iterations required to attain a stable state for a node is directly proportional to the depth D of its label.

4. For a new N + 1th P2P node, the average number of total CYCLON rounds required is always the depth (D) of its label plus a fixed value (2 for this experiment, Figure 5.2(b)). This is explained below with an example.

Assume a new P2P node root.c.c.c.108 joining a P2P system with an ANT of K=3, L=4 and D=3 configuration. It is presumed that all other 107 nodes in the system have already attained an NSBootstrap stable state as per Section 4.4.2. Let us assume that the static introducer node in the setup provides root.c.c.c.108 with information about another node, root.b.a.a.54. The following steps are responsible for its 5 total CYCLON rounds:

1. Round=1, exchange (empty here) and add information of root. from the introducer node. This fetches 1 node in the root. prefix. Let us assume this to be root.b.a.a.54.


2. Round=2, exchange and add information of root. from root.b.a.a.54. This fetches information of at least 1 node in the root.c. prefix. Let us assume this to be root.c.a.a.node53.

3. Round=3, exchange and add information of root.c. from root.c.a.a.node53. This fetches information of at least 1 node in the root.c.c. prefix. Let us assume this to be root.c.c.a.node52.

4. Round=4, exchange and add information of root.c.c. from root.c.c.a.node52. This fetches information of at least 1 node in the root.c.c.c. prefix. Let us assume this to be root.c.c.c.node51.

5. Round=5, exchange and add information of root.c.c.c. from root.c.c.c.node51. This fetches information of all nodes in the root.c.c.c. prefix. Let us assume this to be [root.c.c.c.node50, root.c.c.c.node49 ..].

6. Node root.c.c.c.108 has achieved a stable state as per Section 4.4.2.
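The round count of this walkthrough can be reproduced from the label alone: one round with the introducer, followed by one round per prefixLevel of the joining node. The sketch below encodes that reading; `prefix_levels` and `join_schedule` are hypothetical names, and the schedule describes the walkthrough above, not a general guarantee.

```python
def prefix_levels(label):
    """All prefixLevels of a label, e.g. root.c.c.c.108 -> root., ..., root.c.c.c."""
    parts = label.split(".")[:-1]  # drop the leaf name
    return [".".join(parts[:i]) + "." for i in range(1, len(parts) + 1)]


def join_schedule(label):
    """Rounds of the walkthrough: (who is contacted, at which prefixLevel)."""
    levels = prefix_levels(label)
    # Round 1 uses the introducer at the root level; each following round
    # uses a finger learned in the previous round, one prefixLevel deeper.
    return [("introducer", levels[0])] + [("finger", level) for level in levels]


schedule = join_schedule("root.c.c.c.108")
depth = len(prefix_levels("root.c.c.c.108")) - 1  # D = 3 for this label
```

A label of depth D has D + 1 prefixLevels (including root.), so the schedule has D + 2 entries, which matches the 5 rounds observed for D = 3.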

Varying K experiments

The results of varying K experiments are plotted in Figure 5.3.

Observations

1. We do not observe any direct relationship between the measured value for the average number of CYCLON rounds and K.

2. Comparing Figure 5.3(a) and 5.2(a), we observe that the maximum average number of total CYCLON rounds required by N nodes in the system is significantly lower in the former plot for similar values of N. For example, with N=2916, a D=6, K=3 configuration requires about 250 CYCLON rounds on average per P2P node to attain an NSBootstrap stable state. However, for the same N, a D=3, K=9 configuration requires only about 50 CYCLON rounds on average per P2P node to attain an NSBootstrap stable state.

3. For a new N + 1th P2P node joining a system, increasing K seems not to have an effect on its number of CYCLON rounds, as per Figure 5.3(b). The error is almost negligible in this case as well.


(a) Average number of CYCLON rounds for N nodes in the system. The total number of nodes (N) is given inside parentheses for the resulting configuration on the horizontal axis. A point in the plot is the average of N*10 results.

(b) Average number of CYCLON rounds for an N + 1th node joining the system after N nodes have attained a stable state as per Section 4.4.2.

Figure 5.3: Evaluation results for a system with fixed D and L values, with the K value changed between configurations (10 runs each).


Discussion

1. For a system with N P2P nodes, two factors influence the average number of CYCLON rounds it takes to attain an NSBootstrap stable state with an increase in K and L values. Firstly, with an increase in L value, there is an increased chance for a node n to find and exchange its fingers with another node m, which belongs to n’s branch on the ANT. This can speed up the bootstrap process. Secondly, there is an increase in total nodes N in the system, pushing up the number of CYCLON rounds required to attain an NSBootstrap stable state. We speculate that an effect of the former factor is responsible for a higher average number of CYCLON rounds for the K=3 configuration when compared to K=4.

2. From Figure 5.3(a), we find that the average number of CYCLON rounds is not related to K. We also find a dip at K=4, which seems to suggest that K≈L is a good configuration of the ANT to reduce the average number of CYCLON rounds. We further speculate that K≈L provides a middle ground between the factors mentioned in 1 above.

3. From Figure 5.3(a) and Figure 5.2(a), the average number of total CYCLON rounds is significantly smaller when increasing K than when increasing D for a similar N value. Since increasing K does not add more prefixLevels to a P2P node, this is expected.

4. The number of CYCLON rounds required for a new N + 1th P2P node to attain an NSBootstrap stable state is independent of K. Since (1) the N + 1th node talks to a node n in the system which has already attained a stable state and (2) the depth of the ANT remains the same throughout the configurations, this is an expected result.


Varying L experiments

The results of varying L experiments are plotted in Figure 5.4.Observations

1. From figure 5.4(a), we observe an initial high average for the totalCYCLON rounds for N nodes in L=1 configuration. Later, we seethat the value falls to less than 20 when 2 ≤ L ≤ 24.

2. We also observe that the variance in the measurements of theaverage CYCLON rounds reduces significantly for 2 ≤ L ≤ 24

cases in Figure 5.4(a).

3. For a new N+1th node joining the system, the average number ofCYCLON rounds remains almost a constant. However, there areminor variations in the upper and lower bound of the averageCYCLON rounds when 1 ≤ L ≤ 8.

Discussion

1. At L=1, only a single P2P node exists on a branch of the ANT. This reduces the chance of an NSBootstrap best case (Section 4.4.4) and accounts for the initially high average for the total CYCLON rounds in Figure 5.4(a) at the L=1 configuration.

2. For the cases 1 < L ≤ 24, more than one leaf P2P node shares the same branch on the ANT. This increases the chances of an NSBootstrap best case (Section 4.4.4). This factor, along with the previous speculation about K≈L being a better state for NSBootstrap, can be the reason for the dip at L = 2, which persists all the way to L = 24.

3. For a new (N+1)th P2P node, we observe that increasing L does not affect the total number of CYCLON rounds required to attain an NSBootstrap stable state. This is expected, as the non-leaf nodes of the ANT remain constant. For the (N+1)th node, a single CYCLON exchange with another node in the same branch can fetch all the information it needs to attain a stable state.


(a) Average number of CYCLON rounds for N nodes in the system. The total number of nodes (N) is given in parentheses for the resulting configuration on the horizontal axis. A point in the plot is the average of N*10 measurements.

(b) Average number of CYCLON rounds for an (N+1)th node joining the system after N nodes have attained a stable state as per Section 4.4.2.

Figure 5.4: Evaluation results for a system with fixed D and K values, with the number of leaf P2P nodes L changed between iterations.


4. For the (N+1)th node, we see a maximum average number of CYCLON rounds of 6 for the L=1 configuration. In this configuration, the node is alone in its branch, and hence it needs extra CYCLON exchanges to attain a stable state. This is also the NSBootstrap worst-case scenario for attaining a stable state (Section 4.4.4).

5. The average number of CYCLON rounds remains ≈ 5 for the (N+1)th P2P node, which is an expected value for a D=3 configuration (refer to the Varying D discussions).

CPU Utilization

The CPU and real memory usage of all nodes in the system are tracked. However, we present only two results in this thesis. Figure 5.5 has the CPU and memory usage plots for an (N+1)th node from two rounds of the Varying D experiment.

Figure 5.5 shows the CPU usage, in percent, of a single P2P node (the (N+1)th node in the experiment). The blue scatter represents the value of this metric over time, which is on the horizontal axis. The plot assumes that the node joined the P2P system at t=0. The real memory usage of the P2P node is plotted in the same figure with a red line. The time t at which the node attained an NSBootstrap stable state is marked by a pointer.

Figure 5.5: CPU and memory usage of a sample (N+1)th node executing NSBootstrap. The annotation points to the moment in time when the node attained an NSBootstrap stable state.


Observations: We see multiple CPU spikes before the time at which the node attains an NSBootstrap stable state. Afterwards, the CPU usage recedes to normal behavior.

Discussions: The period in the lifetime of a node from the moment it joins the system until it attains a stable state includes CPU-intensive steps. Once it attains an NSBootstrap stable state, the node continues to execute our maintenance protocol at a fixed interval.

Total number of fingers a node should keep track of

Assuming a system with an ANT with parameters D, K, and L, we can predict the maximum number of fingers a node should keep track of while executing NSBootstrap. The total number of P2P nodes in the system, $N_{total}$ (previously represented as N), is a function f of D, K and L, given by Equation 5.1:

$$N_{total} = f(D, K, L) = K^{D} \cdot L \tag{5.1}$$

Assuming $SC_{prefix}$ P2P nodes are stored per childPrefix for each prefixLevel, we find that the total number of fingers a P2P node should keep track of, $N_{fingers}$, is a function g of D, K, L and $SC_{prefix}$, given by Equation 5.2:

$$N_{fingers} = g(D, K, L, SC_{prefix}) = D \cdot (SC_{prefix} \cdot K) + L \tag{5.2}$$

From Equations 5.1 and 5.2, we can deduce that the percentage of the total nodes in the system ($N_{total}$) that a node should keep track of is:

$$N_{tracked}(\%) = \frac{N_{fingers}}{N_{total}} = \frac{D \cdot (SC_{prefix} \cdot K) + L}{K^{D} \cdot L} \tag{5.3}$$

With Equation 5.3 and our testing configurations in Table 5.1, we generate Figure 5.6.
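As a minimal sketch of Equations 5.1 to 5.3, the following Python functions compute the three quantities. The function names are hypothetical, and the configuration D=6, K=3, L=4 with SC_prefix=2 is chosen only for illustration (it matches the NSSearch setup in Section 5.3 rather than a specific row of Table 5.1). The ratio in Equation 5.3 is multiplied by 100 here so that the result reads as a percentage, which is an assumption on our part.

```python
def n_total(depth, k, leaves):
    """Equation 5.1: total P2P nodes for an ANT with parameters D, K, L."""
    return k ** depth * leaves


def n_fingers(depth, k, leaves, sc_prefix):
    """Equation 5.2: upper bound on the fingers a single node tracks."""
    return depth * (sc_prefix * k) + leaves


def tracked_percent(depth, k, leaves, sc_prefix):
    """Equation 5.3, scaled by 100 (assumption) to read as a percentage."""
    return 100.0 * n_fingers(depth, k, leaves, sc_prefix) / n_total(depth, k, leaves)


# Illustrative configuration (assumed): D=6, K=3, L=4, SC_prefix=2.
print(n_total(6, 3, 4))                       # 2916
print(n_fingers(6, 3, 4, 2))                  # 40
print(round(tracked_percent(6, 3, 4, 2), 2))  # 1.37
```

For this configuration, a node would track at most 40 of the 2916 nodes, which illustrates why the tracked share shrinks quickly as D grows.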

Observations and Discussions: Figure 5.6 shows a falling trend as the number of P2P nodes in the system increases. Increasing the K value of the ANT increases the percentage of P2P nodes tracked per participating P2P node in the system. Since K adds to the total number of fingers at a prefixLevel, this is an expected result. Moreover, we see that increasing D eventually leads to a very low $N_{tracked}(\%)$. This is expected, as our experiments kept the K value constant (and hence the number of nodes tracked per prefixLevel constant) while increasing D.

Figure 5.6: The percentage of $N_{total}$ a node n should keep track of while executing NSBootstrap.

5.2.5 Conclusions

The experiments in general validate our NSBootstrap protocol, and the following conclusions can be drawn from the results:

1. In general, the average number of CYCLON rounds increases as the total number of P2P nodes (N) increases, at a rate that depends on the configuration of the ANT. In other words, increasing K or L results in a lower average number of CYCLON rounds (to attain an NSBootstrap stable state) than increasing D for the same value of N in the system. This can help in planning real-world P2P systems.

2. For a new (N+1)th P2P node joining a system with N P2P nodes already in their stable state, NSBootstrap completes in a total number of steps predictable from the D and L values of the ANT. From our experiments and theoretical knowledge of the stable state requirements, we find that the average number of CYCLON rounds executed by the (N+1)th node, $I_{avg}$, is:

$$I_{avg} = \begin{cases} 2 + (D - 1) + 1 & \text{if } L > 1 \\ 2 + (D - 1) & \text{if } L = 1 \end{cases} \tag{5.4}$$

The value of L only has a +/- 1 effect on the average total CYCLON rounds.

3. The percentage of total nodes in the system that a node P should keep track of, $N_{tracked}(\%)$, provides a measure of the memory footprint of an index-based P2P solution. For NSBootstrap, we see a decreasing trend in this memory footprint as the size of the system increases, as shown in Figure 5.6. The actual value of the percentage is given by Equation 5.3.

4. For the same number of total nodes in the system N, increasing the K value of the ANT is observed to reduce the average number of CYCLON rounds per P2P node. This is explained as follows: (1) increasing D requires P2P nodes in the system to keep track of additional prefixLevels, (2) increasing the K value reduces the number of unique fingers tracked in the system by a node P, and (3) increasing L requires a node to keep track of more nodes at the leaf level.

5. From our measurements and previous knowledge of NSBootstrap and ANT, we can conclude that the average number of CYCLON rounds executed by a P2P node in the system depends on the D value of its ANT. We see a linear relation between D and the average CYCLON rounds in Figure 5.2(b). From this, we can infer that the average number of CYCLON rounds required by a node in the system is proportional to log(N) (where N is the total number of P2P nodes in the system), as log(N) provides the D value of the ANT following our configurations.
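The relation between N, D and the expected join cost in conclusions 2 and 5 can be sketched as follows. This is only an illustration under the assumption that the ANT is complete, so that N = K^D * L holds exactly; the function names are hypothetical, and the depth is recovered with integer arithmetic rather than a floating-point logarithm.

```python
def ant_depth(n_total, k, leaves):
    """Recover D from N_total = K**D * L (Equation 5.1) using integers only."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if n_total % leaves != 0:
        raise ValueError("n_total must be a multiple of the leaves per branch")
    remaining, depth = n_total // leaves, 0
    while remaining > 1:
        if remaining % k != 0:
            raise ValueError("n_total / leaves must be an exact power of k")
        remaining //= k
        depth += 1
    return depth


def expected_join_rounds(depth, leaves):
    """Equation 5.4: average CYCLON rounds for a newly joining node."""
    rounds = 2 + (depth - 1)
    return rounds + 1 if leaves > 1 else rounds


depth = ant_depth(2916, k=3, leaves=4)
print(depth)                           # 6
print(expected_join_rounds(depth, 4))  # 8
print(expected_join_rounds(3, 1))      # 4
```

Because D grows as the base-K logarithm of N/L, the value returned by expected_join_rounds grows logarithmically with the system size for fixed K and L.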

5.3 NSSearch

In our setup, NSSearch is run on a node after it joins the system and has attained an NSBootstrap stable state. For evaluations, we implement the iterative version of NSSearch. Our strategy to evaluate NSSearch for a system with an ANT of depth D is described below:


1. Wait for N nodes to execute NSBootstrap and attain a stable state on a system with an ANT configuration of D=6, K=3 and L=4.

2. Generate 10 searchPrefix values for each depth d ∈ [1..D], select a random node searchNode ∈ N, and query it for count ∈ [1..d].

3. For each value of count and searchPrefix, repeat the experiment 10 times, monitoring searchNode for the metrics mentioned in Section 5.3.1.

5.3.1 Metrics measured

We measure the total number of successive requests to other nodes performed by a searchNode to complete an NSSearch query. We term these requests NSSearch iterations. This is the number of times the loop inside our NSSearchIterative function in Algorithm 2 (lines 11-20) is executed to produce a response for a query. Practically, a search iteration involves a search node P querying another node Q in the system for possible query results using the FindForPrefix utility function. One NSSearch iteration accounts for 2 messages passed in the system (a request and its response).

5.3.2 Validation

From the evaluation experiments, we observe that all NSSearch search queries returned results within a finite number of steps. This validates our NSSearch protocol.

5.3.3 Evaluation

Table 5.2 lists the various search configurations deployed to evaluate NSSearch. Similar to the NSBootstrap test setup, the following is assumed:

1. The number of nodes stored per childPrefix is set to 2. In other words, a node has pointers to either 1 or 2 other nodes in its finger table per childPrefix for each of its prefixLevels.

2. The NSBootstrap CYCLON round interval is set to 3 seconds. Moreover, the CYCLON shuffle length is set to 1.

3. A system with an ANT of depth D=6, K=3 and L=4 is used for all experiments. The mean of 10 runs is plotted for each experiment along with its range.


Table 5.2: Sample search queries built to evaluate NSSearch by changing searchPrefix and count.

| Depth (D) | Sample searchPrefix | count (C) | Repeats |
|---|---|---|---|
| 0 | root. | 1, 2, 3, 4 | x10 |
| 1 | root.a. | 1, 2, 3, 4 | x10 |
| 2 | root.a.a. | 1, 2, 3, 4 | x10 |
| 3 | root.a.a.b. | 1, 2, 3, 4 | x10 |
| 4 | root.a.a.b.b. | 1, 2, 3, 4 | x10 |
| 5 | root.a.a.b.b.a. | 1, 2, 3, 4 | x10 |
| 6 | root.a.a.b.b.a.b. | 1, 2, 3, 4 | x10 |
| Total runs | | | 240 |

5.3.4 Results and discussion

We have our evaluation results for NSSearch in Figure 5.7.

Figure 5.7: Number of total search iterations in the system for a searchPrefix of depth d and count c. The iterations for count=1 overlap the results for count=2.


Observations:

1. On the horizontal axis, we have the depth of the searchPrefix, increasing by one as we move from left to right. For example, a sample search for a query with a depth of 5 would be to find count nodes in root.a.a.b.b.a.. On the vertical axis, we have the total NSSearch iterations.

2. Searching for count=1 or 2 requires either 0 or 1 iteration for all searchPrefix depths tested in this experiment.

3. The number of messages passed (or the search iterations) increases almost linearly as the depth of the searchPrefix increases.

Discussions

1. Given the specifications of our finger table, results for a search with count=1 or 2 can be found either (1) in the same branch or (2) at a node in another branch, for any P2P node in the system. This explains the lower red line (which also masks the yellow line) in Figure 5.7.

2. Searching for count>2 requires more requests through the system. A P2P node alone cannot satisfy these requests, as we store only 1 or 2 nodes per childPrefix for each prefixLevel. We see that the number of search iterations increases as the depth of our searchPrefix increases (d from 4 to 6). We discuss the values with an example later in this section for a d > 2 case.

3. We see a variance in the total search iterations as the count and depth of the searchPrefix increase. We explain this with the following: (1) the randomness of NSBootstrap and CYCLON (Section 4.4.4), and (2) a search node lying on the same branch as the searchPrefix will be able to reply with fewer iterations (refer to the example below).

4. From our measurements, we can conclude that the maximum number of NSSearch iterations required to answer a valid search query with count ≤ L is log(N) + 1. Moreover, the average number of NSSearch iterations required to answer a valid search with count ≤ L is log(N).


An example execution of a search query is described below. Assume a node root.c.c.a.a.a.a.node10 receives a search query of the form:

$$\text{query}: \ \text{prefix} = \texttt{root.a.a.b.b.a.b.} \ \wedge \ \text{count} = 4 \tag{5.5}$$

node10 should execute the steps described in Table 5.3.

| Iteration | Process executed by a search node | Returns |
|---|---|---|
| 1 | Find someone in root.a. at root. prefixLevel | root.a.c.d.a.b.c.node1 |
| 2 | Ask root.a.c.d.a.b.c.node1 for 4 fingers in root.a.a. | root.a.a.d.a.b.c.node2 ... |
| 3 | Ask root.a.a.d.a.b.c.node2 for 4 fingers in root.a.a.b. | root.a.a.b.a.b.c.node3 ... |
| 4 | Ask root.a.a.b.a.b.c.node3 for 4 fingers in root.a.a.b.b. | root.a.a.b.b.b.c.node4 ... |
| 5 | Ask root.a.a.b.b.b.c.node4 for 4 fingers in root.a.a.b.b.a. | root.a.a.b.b.a.c.node5 ... |
| 6 | Ask root.a.a.b.b.a.c.node5 for 4 fingers in root.a.a.b.b.a.b. | root.a.a.b.b.a.b.node6 ... |
| 7 | Ask root.a.a.b.b.a.b.node6 for 4 fingers in root.a.a.b.b.a.b. | root.a.a.b.b.a.b.node6, root.a.a.b.b.a.b.node7, root.a.a.b.b.a.b.node8, root.a.a.b.b.a.b.node9 |

Table 5.3: NSSearch iteration steps executed by node10 to find 4 fingers in the root.a.a.b.b.a.b. searchPrefix. These 7 steps are responsible for the maximum value for d=6, c=4 in Figure 5.7.
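The sequence of prefixes visited in Table 5.3 can be reproduced with a small sketch. It is not the NSSearchIterative implementation: it only enumerates the prefix hops, assuming the search node shares nothing but root. with the searchPrefix and that every hop returns a node one level deeper. The function names are hypothetical.

```python
def prefix_chain(search_prefix):
    """List every prefix from the first level below the root down to the target."""
    labels = [part for part in search_prefix.split('.') if part]
    return ['.'.join(labels[:i]) + '.' for i in range(2, len(labels) + 1)]


def worst_case_iterations(search_prefix):
    """One iteration per prefix hop, plus a final one to collect the fingers."""
    return len(prefix_chain(search_prefix)) + 1


for hop in prefix_chain('root.a.a.b.b.a.b.'):
    print(hop)
print(worst_case_iterations('root.a.a.b.b.a.b.'))  # 7
```

For a searchPrefix of depth d=6 this yields six hops and one final request, which agrees with the seven iterations in Table 5.3 and with the log(N) + 1 bound discussed above.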

5.4 Summary

In this section, we have evaluated both the NSBootstrap and NSSearch algorithms based on our implementations. As stated above, the core contribution of the thesis is the node bootstrapping protocol, NSBootstrap. NSSearch can be understood as performing a traversal through the cache built by NSBootstrap. In general, we draw the following conclusions from our experiments:


• A P2P node executing NSBootstrap completes its bootstrapping phase over time with respect to its position on the ANT. The average number of CYCLON rounds it takes to complete this phase is proportional to log(N), where N is the total number of nodes in the system.

• A new P2P node joining a system of N nodes, after the N nodes have already attained an NSBootstrap stable state, has to perform on average log(N) CYCLON rounds to attain an NSBootstrap stable state.

• Horizontal scaling of a P2P system executing NSBootstrap can be planned with knowledge of the K, D and L values of the system's ANT. Adding more P2P nodes by increasing K or L shows better results (a smaller number of CYCLON iterations to attain an NSBootstrap stable state) than increasing D.

• Searching through the system using NSSearch always returns a response to a valid query with a number of steps proportional to the searchPrefix depth (d) of the search query.

• There is a degree of randomness connected with NSBootstrap and NSSearch due to the underlying CYCLON algorithm, which builds up the finger table cache.


Chapter 6

Conclusions

In this chapter, we conclude the thesis work by listing future work and experiences from this project.

6.1 Future Work

We foresee the following future work from this thesis:

1. We designed our search mechanism with geographical locations in mind. We believe it can be applied more broadly by coding system properties into labels. In a real-world P2P OpenStack system, these can be physical properties like RAM, GPU capacity, etc. What can and cannot be coded into a label needs to be determined.

2. Integrating the proposed solution with the existing P2P OpenStack system was not done as part of the thesis work. However, a centralized version of this approach was implemented outside the scope and timeline of the thesis work. Comparing our solution with a centralized solution was not done as part of this thesis work either; it would have generated evaluation results with respect to scalability and performance.

3. From our plots, we can see that K≈L provides a better state of the ANT with respect to the average number of CYCLON rounds executed by the system. This was not explored further within the scope of this thesis work due to resource and time constraints. Testing with multiple other K≈L configurations should provide more insights.


4. NSBootstrap should react to node churn and failures. This is briefly mentioned in Section 4.4. However, no evaluations were performed for such cases. Additionally, how the system propagates this information could be investigated.

5. The search index (finger table) built by NSBootstrap can be generated with other P2P indexing-based search approaches. For example, the GAP protocol can be used instead of CYCLON to build and maintain the finger table for a node. Evaluating the convergence time of such a solution would allow comparisons with our solution.

6. All evaluations for NSSearch were performed after the whole system had attained a stable state. However, a realistic evaluation should also test the correctness of the search protocol during the system bootstrapping phase.

6.2 Experiences

If we were to start the thesis over again, we would do the following differently:

1. We should have used both a simulation and an emulation model for the test setup. We chose an emulated model with multiple threads representing CYCLON executions on a P2P node. This was done to integrate the evaluation setup directly with the existing P2P OpenStack setup. However, this limits our ability to introduce more nodes into the system due to computational resource constraints. To test very large systems (∼3 million nodes), a simulated version of the solution would have been better. A simulated version can run in a single thread and can still provide valid results for our metrics (except time measurements). Testing our solution with very large systems would have provided better insights into the scalability and performance of the solution.

2. With respect to implementation, we used Python threads to implement CYCLON execution. If we were to rethink the implementation, we could have gone for other asynchronous solutions (like Python 3 asyncio). This would have reduced the complexity of the evaluation phase.


3. With respect to tooling, Python matplotlib was used to prepare the plots in this thesis work. However, using R plots would have generated better plots for our results (especially for CPU metrics).

4. As mentioned in the future work, topics related to a P2P node leaving and failing in a system are not evaluated in this thesis. Our evaluation tests also assume that all nodes are started at the same time and are healthy throughout the process. Adding realistic factors (occasional node failures and joins) to the evaluations would have been a real addition to the thesis work.
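To illustrate the asyncio alternative mentioned in item 2 above, the following is a minimal sketch of one cooperative task per prefixLevel in place of one thread per prefixLevel. It is not part of our implementation: cyclon_rounds and run_all are hypothetical names, the gossip exchange itself is replaced by a placeholder log entry, and the interval is shortened from 3 seconds so that the example finishes quickly.

```python
import asyncio


async def cyclon_rounds(prefix_level, rounds, interval, log):
    """Run `rounds` placeholder gossip rounds for one prefixLevel."""
    for round_no in range(1, rounds + 1):
        # A real round would exchange fingers with a peer at this point.
        log.append((prefix_level, round_no))
        await asyncio.sleep(interval)


async def run_all(prefix_levels, rounds=3, interval=0.01):
    """Run the rounds of all prefixLevels concurrently in a single thread."""
    log = []
    await asyncio.gather(
        *(cyclon_rounds(level, rounds, interval, log) for level in prefix_levels)
    )
    return log


log = asyncio.run(run_all(['root.', 'root.a.', 'root.a.a.']))
print(len(log))  # 9
```

Since all tasks share one thread, the log can be appended to without locks, which is one way such a design might simplify collecting evaluation metrics.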


Bibliography

[1] OpenStack. OpenStack Docs: Introduction to OpenStack. https://docs.openstack.org/security-guide/introduction/introduction-to-openstack.html. (Accessed on 03/09/2018).

[2] Xin Han. “Scaling OpenStack Clouds Using Peer-to-peer Technologies”. In: Master’s thesis, Chalmers University of Technology, Gothenburg, Sweden (2017).

[3] Flavio Bonomi et al. “Fog computing and its role in the internet of things”. In: Proceedings of the first edition of the MCC workshop on Mobile cloud computing. ACM. 2012, pp. 13–16.

[4] Tom H Luan et al. “Fog computing: Focusing on mobile users at the edge”. In: arXiv preprint arXiv:1502.01815 (2015).

[5] Borja Sotomayor et al. “Virtual infrastructure management in private and hybrid clouds”. In: IEEE Internet computing 13.5 (2009).

[6] Xiuqi Li and Jie Wu. “Searching techniques in peer-to-peer networks”. In: Handbook of Theoretical and Algorithmic Aspects of Ad Hoc, Sensor, and Peer-to-Peer Networks (2006), pp. 613–642.

[7] Beverly Yang and Hector Garcia-Molina. “Improving search in peer-to-peer networks”. In: Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on. IEEE. 2002, pp. 5–14.

[8] Adrian Segall. “Distributed network protocols”. In: IEEE transactions on Information Theory 29.1 (1983), pp. 23–35.

[9] Mads Dam and Rolf Stadler. “A generic protocol for network state aggregation”. In: self 3 (2005), p. 411.

[10] Shlomi Dolev, Amos Israeli, and Shlomo Moran. “Self-stabilization of dynamic systems assuming only read/write atomicity”. In: Distributed Computing 7.1 (1993), pp. 3–16.


[11] Arturo Crespo and Hector Garcia-Molina. “Routing indices for peer-to-peer systems”. In: Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on. IEEE. 2002, pp. 23–32.

[12] Zhonghong Ou. “Structured peer-to-peer networks: Hierarchical architecture and performance evaluation”. In: Dissertation (2010).

[13] Ayalvadi J Ganesh, A-M Kermarrec, and Laurent Massoulié. “Peer-to-peer membership management for gossip-based protocols”. In: IEEE transactions on computers 52.2 (2003), pp. 139–149.

[14] Qin Lv et al. “Search and replication in unstructured peer-to-peer networks”. In: Proceedings of the 16th international conference on Supercomputing. ACM. 2002, pp. 84–95.

[15] Matei Ripeanu. “Peer-to-peer architecture case study: Gnutella network”. In: Peer-to-Peer Computing, 2001. Proceedings. First International Conference on. IEEE. 2001, pp. 99–100.

[16] Spyros Voulgaris, Daniela Gavidia, and Maarten Van Steen. “Cyclon: Inexpensive membership management for unstructured p2p overlays”. In: Journal of Network and Systems Management 13.2 (2005), pp. 197–217.

[17] Robin Joseph. Enhancing OpenStack clouds using P2P technologies. 2017.

[18] 6.1. 1000 Compute nodes resource scalability testing - performance_docs 0.0.1.dev196 documentation. https://docs.openstack.org/developer/performance-docs/test_results/1000_nodes/index.html. (Accessed on 03/09/2018).

[19] OpenStack Havana Scalability Testing - Test Plan [Solutions] - Cisco. https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/OpenStack/Scalability/OHS/OHS2.html. (Accessed on 03/09/2018).

[20] Michael Mitzenmacher. “The power of two choices in randomized load balancing”. In: IEEE Transactions on Parallel and Distributed Systems 12.10 (2001), pp. 1094–1104.

[21] Misbah Uddin, Rolf Stadler, and Alexander Clemm. “A bottom-up design for spatial search in large networks and clouds”. In: International Journal of Network Management (2016), e2041.


[22] Initializing a multiple node cluster (multiple datacenters). https://docs.datastax.com/en/cassandra/2.1/cassandra/initialize/initializeMultipleDS.html. (Accessed on 03/09/2018).

[23] David Ulevitch. Recursive DNS nameserver. US Patent 8,606,926. Dec. 2013.


Curriculum Vitae


Tony Thomas [email protected] | (+46) 721285096 | Stockholm, Sweden | Github | LinkedIn  

Experience

3YOURMIND GmbH, Python Backend Developer / Freelancer, Berlin, JUN 2017 - PRESENT

- Designed and developed RESTful APIs for a 3D printing industry e-commerce platform
- Wrote and integrated 3rd-party Single Sign-On (SAML, OAuth, JWT) plugins for customers
- Developed on python-django, django-rest-framework and the Vue.js frontend framework. Took part in critical design decisions.
- Optimized database queries to achieve significant improvements in page load time. Ensured a team culture with clean-code, TDD, readability and peer-review practices.

Ericsson Cloud Research, Master Thesis Student & Summer Worker, Stockholm, JAN 2018 - SEP 2018

- Designed, developed and evaluated a location-based distributed search mechanism for Peer-to-Peer systems
- Integrated the mechanism with an in-house developed distributed OpenStack system
- Automated tests with moderately large systems (1-3000 nodes) using Python, Docker, OpenStack

Igalia, Coding Experience Intern, Spain (Remote), AUG 2016 - JUL 2017

- Designed and developed a dashboard application to visualize analytical data from browser testing bots. Utilized by the internal browser development team to formulate decisions.
- Wrote RESTful APIs to accept data streams. Developed on Python, django-rest-framework and the AngularJS frontend framework. Used JQuery Flot to draw plots.
- Deployed, with source code available

Education

KTH Royal Institute of Technology & TU Berlin, Master of Science, Stockholm and Berlin, AUG 2016 - AUG 2017 in Berlin, AUG 2017 - PRESENT in Stockholm

- EIT ICT Innovation program. Specialized in Internet Technologies and Architecture
- Communication System Design (worked with mininet, Apache Cassandra)
- Management of Networks and Networked Systems (Machine Learning basics using scikit and pandas)

Projects

Bounce Handler, Google Summer of Code 2014, Wikimedia Foundation (Remote)

- Processes bounce emails to the Wikimedia production cluster. Deployed on 740+ wikis including English Wikipedia.
- On average, handles ~9 bounces per day. Reduced the percentage of total mail bounce events at the production cluster.
- Developed on PHP, bash, puppet, vagrant, bind9, postfix.

Skills

Programming languages / frameworks: Python, Django, Flask & SQLAlchemy, Tornado, PHP, Java, C++; Machine Learning: scikit, pandas; Technologies: Linux, Docker, OpenStack; Databases: MySQL, PostgreSQL; Frontend: AngularJS, Vue.js, JQuery; Monitoring: Grafana, New Relic; Inspection: Django debug toolbar

Publications

Reactive Replica Selection Algorithm for Geo-Distributed Systems, SNCNW 2018, Sweden. Co-authored with Seregi and Bogdanov, KTH. Introduced and evaluated a novel Replica Selection Algorithm (RSA) for Apache Cassandra. Based on work done as part of the Communication System Design project at KTH.

Policy Based Storage Abstraction for Video Surveillance Systems, ICCIC, 2016, India. Co-authored with K. Rajan, Johnson, S. Anjana, Alangot. Introduced and developed a policy-based storage control abstraction for a video surveillance system and partially deployed it on the University premises.


Appendix A

Sample Code

A.1 Sample NSBootstrap operation


A.2 Sample NSBootstrap implementation

Sample code of the NSBootstrap algorithm, which iterates through prefixLevels. The call to execute_cyclon_at_level starts the CYCLON thread for the current prefixLevel.

Listing A.1: Sample NSBootstrap algorithm

def populate_fingers(provider, current_prefix='root.',
                     next_prefix='', my_label='', previous_prefix=None):
    """
    Core logic that iterates through prefixLevels and starts a
    CYCLON thread per prefixLevel
    :param provider: The provider node for this level
    :param current_prefix: Current prefix executing NSBootstrap
    :param next_prefix: Next prefix after current prefix
    :param my_label: Label of the node
    :param previous_prefix: Prefix handled by the previous call
    """
    provider.fetch_and_udpate_for_prefix(
        current_prefix, my_label)

    # Unless this is the last prefixLevel, wait until the index at
    # this level holds more nodes than just this one
    if next_prefix != '{0}.'.format(my_label):
        wait_till_more_than_you_at_this_level(current_prefix,
                                              provider, my_label)

    # Start CYCLON at level
    execute_cyclon_at_level(current_prefix, my_label)
    if next_prefix == '{0}.'.format(my_label):
        # Last prefixLevel reached: nothing further to bootstrap
        return

    next_prefix_provider = wait_till_node_matching_next_prefix(
        current_prefix, next_prefix, my_label
    )

    # Move one prefixLevel down and repeat
    previous_prefix = current_prefix
    current_prefix = next_prefix
    next_prefix = get_next_prefix(
        current_prefix=current_prefix, my_label=my_label
    )

    return populate_fingers(
        provider=next_prefix_provider['ip'],
        current_prefix=current_prefix,
        next_prefix=next_prefix, my_label=my_label,
        previous_prefix=previous_prefix
    )
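The order in which NSBootstrap visits prefixLevels can be sketched without any networking. The following is a minimal illustration, not the thesis implementation: the function name bootstrap_trace and the step names are hypothetical, remote calls are replaced by entries appended to a list, and the waiting steps of Listing A.1 are omitted. It assumes labels are dot-separated with the node name as the last component.

```python
def bootstrap_trace(my_label):
    """Return the ordered steps a node would take while bootstrapping.

    Hypothetical helper for illustration only: each network call of
    the real protocol is replaced by a (step, prefixLevel) tuple.
    """
    parts = my_label.split('.')
    # prefixLevels are the proper, dot-terminated prefixes of the label.
    levels = ['.'.join(parts[:depth]) + '.'
              for depth in range(1, len(parts))]

    steps = []
    for index, level in enumerate(levels):
        steps.append(('fetch_index', level))
        steps.append(('start_cyclon', level))
        # A provider for the next level is only needed if one exists.
        if index + 1 < len(levels):
            steps.append(('find_provider_for', levels[index + 1]))
    return steps


if __name__ == '__main__':
    for step, level in bootstrap_trace('root.a.b.node4'):
        print(step, level)
```

For the label root.a.b.node4 the trace covers the levels root., root.a. and root.a.b. in that order, which matches the sequence shown in Figure A.1.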

A.3 Sample CYCLON implementation

Sample code of the CYCLON protocol, implemented in Python to run for each prefixLevel. The function accepts a prefixLevel and executes an instance of CYCLON in a thread. The neighbor table of a node is stored in Controller.memory_cache.

Listing A.2: Sample CYCLON implementation

import json
import random
import time


def _execute_cyclon_at_level(current_prefix, my_label):
    while True:
        # Age every finger at this level except our own entry
        for idx, node in enumerate(
                Controller.memory_cache[current_prefix]):
            if node['label'] != my_label:
                Controller.memory_cache[
                    current_prefix][idx]['age'] += 1

        # Pick the oldest finger; ties are broken at random
        oldest_finger = sorted(
            Controller.memory_cache[current_prefix],
            key=lambda x: (x['age'], random.random()),
            reverse=True
        )[0]

        fingers_without_oldest = list(
            filter(
                lambda x: x['label'] != oldest_finger['label'],
                Controller.memory_cache[current_prefix]
            )
        )

        random_other_samples = get_random_other_smart_samples(
            fingers_without_oldest, my_label, current_prefix
        )

        if not random_other_samples:
            time.sleep(3)
            continue

        # Advertise ourselves with age zero in the table we send
        to_send_fingers = update_my_value_to_zero(
            random_other_samples, my_label,
            Controller.memory_cache['ip']
        )

        to_send_target = oldest_finger['ip']
        their_fingers_raw = send_table_to_node_and_fetch(
            to_send_target, current_prefix,
            to_send_fingers
        )
        if not their_fingers_raw:
            # FAIL: remove this node from the neighbor table
            Controller.memory_cache[current_prefix] = [
                finger for finger
                in Controller.memory_cache[current_prefix]
                if finger['ip'] != to_send_target
            ]
            time.sleep(3)
            continue

        their_fingers = json.loads(their_fingers_raw)
        update_my_fingers_for_prefix(
            my_label, current_prefix, their_fingers,
            to_send_fingers
        )
        # CYCLON inter-iteration wait time
        time.sleep(3)
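To make the exchange in Listing A.2 easier to follow, here is a minimal in-memory sketch of a single CYCLON shuffle between two nodes. It is not the thesis implementation: the names cyclon_shuffle and _merge are hypothetical, the neighbor table is simplified to a dict of label to age, the network is replaced by a shared dict, and ties for the oldest neighbor are not broken at random as they are in the listing.

```python
import random


def cyclon_shuffle(caches, initiator, shuffle_len, cache_size, rng):
    """Run one CYCLON exchange started by `initiator`.

    `caches` maps a node label to its neighbor table, a dict of
    {neighbor_label: age}. Assumes shuffle_len >= 1. Returns the label
    of the exchange target, or None if the initiator has no neighbors.
    """
    cache = caches[initiator]
    if not cache:
        return None

    # 1. Age every neighbor by one.
    for label in cache:
        cache[label] += 1

    # 2. Pick the oldest neighbor as the exchange target.
    target = max(cache, key=lambda label: cache[label])

    # 3. Send a fresh entry for ourselves plus random other neighbors.
    others = [label for label in cache if label != target]
    sent = rng.sample(others, min(shuffle_len - 1, len(others)))
    outgoing = {label: cache[label] for label in sent}
    outgoing[initiator] = 0

    # 4. The target answers with a random subset of its own table.
    target_cache = caches[target]
    reply_keys = rng.sample(sorted(target_cache),
                            min(shuffle_len, len(target_cache)))
    reply = {label: target_cache[label] for label in reply_keys}

    # 5. Both sides merge what they received.
    del cache[target]
    _merge(cache, reply, initiator, sent, cache_size)
    _merge(target_cache, outgoing, target, reply_keys, cache_size)
    return target


def _merge(cache, received, owner, replaceable, cache_size):
    """Add received entries, evicting only entries that were just sent."""
    replaceable = [label for label in replaceable if label in cache]
    for label, age in received.items():
        if label == owner or label in cache:
            continue  # never point at ourselves, never duplicate
        if len(cache) >= cache_size:
            if not replaceable:
                continue  # nothing left that may be evicted
            del cache[replaceable.pop()]
        cache[label] = age
```

After one shuffle the initiator has dropped its oldest neighbor and learned entries from that neighbor's table, while the target has learned a fresh entry for the initiator. This is what keeps the per-level neighbor tables mixing over time.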

The CYCLON function is invoked once per prefixLevel using the Eventlet green threads library:

Listing A.3: Invoking a CYCLON thread using Eventlet library

import eventlet


def execute_exchange_at_level(current_prefix, my_label):
    # Run the CYCLON loop for this prefixLevel in its own green thread
    eventlet.spawn_n(
        _execute_cyclon_at_level,
        current_prefix=current_prefix, my_label=my_label
    )
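The one-worker-per-prefixLevel pattern can also be tried without installing Eventlet. The sketch below swaps in the standard-library threading module, so it uses operating-system threads rather than green threads and is not what the thesis code does. The names start_level_workers and make_counting_worker are hypothetical, and the worker only counts rounds instead of running CYCLON.

```python
import threading


def start_level_workers(levels, worker, rounds):
    """Start one daemon thread per prefixLevel and return the threads."""
    threads = []
    for level in levels:
        thread = threading.Thread(
            target=worker, args=(level, rounds), daemon=True
        )
        thread.start()
        threads.append(thread)
    return threads


def make_counting_worker():
    """Build a stand-in worker that only counts its rounds per level."""
    counts = {}
    lock = threading.Lock()

    def worker(level, rounds):
        for _ in range(rounds):
            # The lock protects the shared dict across threads.
            with lock:
                counts[level] = counts.get(level, 0) + 1

    return worker, counts


if __name__ == '__main__':
    worker, counts = make_counting_worker()
    for thread in start_level_workers(['root.', 'root.a.'], worker, 3):
        thread.join()
    print(counts)
```

Unlike this bounded stand-in, the loop in Listing A.2 never returns, so in a real deployment the workers would be left running rather than joined.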

A.4 Finding next prefix of a label

Listing A.4: Sample prefix matching implementation to figure out the next prefix for a label at a prefix

import re


def get_next_prefix(current_prefix='root.', my_label=''):
    """
    For a label root.a.b.node4, it should return root.a.
    for a current prefix of root.
    :param current_prefix: current prefix in question
    :param my_label: Sample node label
    :return: the next prefix, or my_label if it does not start
        with current_prefix
    """
    # Escape the prefix so that its dots are matched literally
    matched_until = re.match(re.escape(current_prefix), my_label)
    try:
        found_string_till = matched_until.span()[1]
    except AttributeError:
        # re.match returned None: the label is outside this prefix
        return my_label

    # now add this to the rest
    return '{0}{1}.'.format(current_prefix,
                            my_label[found_string_till:].split('.')[0])

This function is used extensively inside NSBootstrap and NSSearch to figure out the next prefix of a given node at a prefixLevel.
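As a small worked example of the prefix step, the sketch below walks a label down from root. one level at a time. It is a self-contained variant written for illustration: the name next_prefix is hypothetical, and it matches the prefix literally with re.escape, since an unescaped dot in a regular expression matches any character.

```python
import re


def next_prefix(current_prefix, my_label):
    """Return the prefix one level below current_prefix for my_label.

    Returns my_label unchanged when it does not start with
    current_prefix.
    """
    matched = re.match(re.escape(current_prefix), my_label)
    if matched is None:
        return my_label
    rest = my_label[matched.end():]
    return '{0}{1}.'.format(current_prefix, rest.split('.')[0])


if __name__ == '__main__':
    label = 'root.a.b.node4'
    prefix = 'root.'
    # Stop once the prefix has grown into the full label plus a dot,
    # the same termination test that Listing A.1 uses.
    while prefix != label + '.':
        print(prefix)
        prefix = next_prefix(prefix, label)
```

The escaping matters for labels such as rootXa.node1: the unescaped pattern root. would match it, although that label is not under the prefix root. at all.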

A.5 NSSearchIterative

Listing A.5 shows how a node responds to a search query. The final return statement calls the iterative function _execute_query_on_targets, which further queries searchTargets when required.

Listing A.5: Sample NSSearch implementation in an iterative fashion

def execute_nssearch_iterative(my_label, search_prefix,
                               search_count):
    iterations = 0
    answers = []

    try:
        # Copy the cached fingers so that later extends do not
        # modify the node's own neighbor table
        answers = list(Controller.memory_cache[search_prefix])
        next_search_nodes = list(answers)
        if len(answers) >= search_count:
            # The local index alone answers the query
            return True, iterations, answers
    except KeyError:
        # No local entry for this prefix: find nodes to ask instead
        next_search_nodes = get_next_search_node(
            my_label, search_prefix
        )
        if not next_search_nodes:
            return False, iterations, []
    iterations += 1
    next_search_nodes = remove_myself_from_fingers(
        next_search_nodes, my_label
    )

    return _execute_query_on_targets(
        next_search_nodes, search_prefix, iterations,
        search_count, answers
    )

Listing A.6 shows the actual iterative process of pinging searchTargets for replies.

Listing A.6: Node receiving NSSearchIterative pings possible search targets for answers

import json

import requests


def _execute_query_on_targets(
        next_search_nodes, search_prefix, iterations,
        search_count, answers=None
):
    if answers is None:
        answers = []
    prefix = service_path_map['locator']['prefix']
    api_ver = service_path_map['locator']['api_version']
    found = False
    next_search_nodes = generate_unique_list(next_search_nodes)

    for target in next_search_nodes:
        iterations += 1
        try:
            response = requests.get(
                'http://{0}/{1}/{2}/search/find'.format(
                    target['ip'], prefix, api_ver
                ),
                params={
                    'search': search_prefix,
                    'count': search_count
                }
            )
        except requests.RequestException:
            # Target unreachable: try the next one
            continue

        response_parsed = json.loads(response.text)
        if response_parsed['success']:
            answers.extend(response_parsed['fingers'])
            answers = generate_unique_list(answers)

            if len(answers) >= search_count:
                return True, iterations, answers

        # Fingers returned by the target become further search targets
        next_search_nodes.extend(response_parsed['fingers'])
        next_search_nodes = generate_unique_list(
            next_search_nodes
        )

    if len(answers) >= search_count:
        found = True

    return found, iterations, answers
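The iterative query pattern of Listings A.5 and A.6 can be tried offline with a small simulation. The sketch below is an illustration under stated assumptions, not the thesis implementation: the HTTP calls are replaced by lookups in a dict, the function name iterative_search is hypothetical, a visited set and a queue are used so that no node is asked twice, and the rule for choosing further targets (fingers of the longest locally known prefix of the searched prefix) is an assumption about how a node might route a query.

```python
from collections import deque


def iterative_search(indexes, start_nodes, search_prefix, search_count):
    """Query nodes one by one until enough matching nodes are known.

    `indexes` stands in for the network: it maps a node label to that
    node's local index, {prefix: [labels of known nodes]}.
    Returns (found, queries_sent, answers).
    """
    queue = deque(start_nodes)
    visited = set()
    answers = []
    queries = 0

    while queue and len(answers) < search_count:
        target = queue.popleft()
        if target in visited:
            continue
        visited.add(target)
        queries += 1

        index = indexes.get(target, {})
        if search_prefix in index:
            # The target knows nodes under the searched prefix.
            for label in index[search_prefix]:
                if label not in answers:
                    answers.append(label)
        else:
            # Assumed routing rule: follow the fingers stored for the
            # longest prefix of search_prefix that the target knows.
            known = [p for p in index if search_prefix.startswith(p)]
            if known:
                queue.extend(index[max(known, key=len)])

    return len(answers) >= search_count, queries, answers[:search_count]
```

With three simulated nodes, a search for root.e.g. that starts at root.e.f.node8 is first redirected to root.e.g.node9 and then answered there, so it completes in two queries. A search for a prefix nobody knows ends once every reachable node has been asked.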

(a) Initial state of the system
(b) Node root.a.b.node4 joins, introduced to root.e.f.h.node8
(c) root.a.b.node4 populates root. level
(d) root.a.b.node4 fetch more information for root. level
(e) root.a.b.node4 populates root.a. and root.a.b. level, attains a stable state

Figure A.1: NSBootstrap: Lifetime of a new node root.a.b.node4 joining a P2P system

TRITA-EECS-EX-2018:703
www.kth.se