DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2018

Private, secure, and censorship resistant document sharing for individuals and groups based on distributed ledger technology

JENS RÖWEKAMP

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


TRITA EECS-EX-2018:519

www.kth.se

Abstract

The scandal around Facebook and Cambridge Analytica in 2017 showed drastically that new concepts to share and store information need to be developed in order to minimize the huge potential for abuse resulting from centralized information stored at trusted third parties.

This thesis analysed to what degree current document exchange systems (e.g. Dropbox) comply with the information security services confidentiality, integrity, privacy, anonymity, authenticity of authors, non-repudiation, and accountability. The result is that all analysed systems lack support for privacy and anonymity, mainly due to their centralized design, missing (meta)data encryption, and the regulations of the jurisdictions in which they operate.

Based on that analysis, a decentralized concept for document sharing in a peer-to-peer fashion was developed. It utilises client-side encryption, the separation of data and metadata, metadata masking through Tor hidden services, and distributed ledger technology for directory service provision.

The concept was proven through the prototype implementation of a document exchange software called docShare, and its information security services were compared with the previously analysed exchange technologies. The analysis showed that docShare provides better information security services but still leaks identity information in the form of IP addresses when interacting with the distributed ledger Ethereum, mainly because Ethereum doesn’t support traffic anonymization through Tor.


Referat

Privat, säker och censurresistent documentsharing för individer och grupper baserad på DLT

Skandalen kring Facebook och Cambridge Analytica i 2017 visade drastiskt att nya koncept för hur information delas och sparas behöver utvecklas för att reducera den stora missbrukspotentialen som är ett resultat av att information sparas centralt hos betrodda tredje partier.

Denna avhandling analyserar till vilken utsträckning nuvarande document exchange system(s) (till exempel Dropbox) följer säkerhetsservices såsom förtrolighet, integritet, anonymitet, autorernas autenticitet, påvislighet och räkenskap. Undersökningen visar att alla analyserade system saknar stöd för integritet och anonymitet, mest på grund av den centraliserade designen, saknande informationskryptering och de juridiska reglerna som gäller för dem.

Baserad på denna undersökning utvecklades ett koncept för peer-to-peer document sharing som innebär att information krypteras, att information och metainformation separeras, metainformation skyddas genom TOR hidden services samt att DLT används för katalogtjänster.

Detta koncept bevisades genom prototypisk implementation av en dokumentutbytningssoftware som kallas för docShare vars information security services jämfördes med andra analyserade utbytetekniker. Analysen visade att docShare har en bättre information security service tillhandhållande, men den läcker fortfarande identitetinformationer i form av IP adresser när den interagerar med den distribuerande ledger Ethereum, främst för att Ethereum inte stödjer traffic anonymisering genom Tor.


Acknowledgments

I am deeply thankful to

• my family, friends, fellow students, and professors for their support during my studies,

• Pia Ströhle and Anja Pflugfelder for translating my abstract to Swedish, and

• George Lucas for creating the Star Wars universe.

Menlo Park, CA, 27th August 2018
Jens Röwekamp


Contents

1 Introduction
    1.1 Problem statement
    1.2 Research methodology
    1.3 Goals
        1.3.1 Objectives
        1.3.2 Deliverables
    1.4 Purpose
    1.5 Scope, limitations, and assumptions
    1.6 Ethical and sustainable aspects
    1.7 Outline

2 Technical background
    2.1 Information security services
        2.1.1 Data integrity
        2.1.2 Data confidentiality
        2.1.3 Data security, privacy, and anonymity
        2.1.4 Non-repudiation
        2.1.5 Accountability
        2.1.6 Authentication
        2.1.7 Access Control
        2.1.8 Availability
    2.2 Cryptography
        2.2.1 Symmetric cryptography
        2.2.2 Hashing
        2.2.3 Asymmetric cryptography
        2.2.4 Public key management
        2.2.5 Digital envelopes
        2.2.6 Zero-knowledge proof
    2.3 Distributed ledger
        2.3.1 Bitcoin
        2.3.2 Ethereum
        2.3.3 Tendermint
        2.3.4 Corda
    2.4 The Onion Router
        2.4.1 Traffic anonymization
        2.4.2 Location-hidden-services

3 Security assessment of related work
    3.1 Document exchange use case analysis and sensitivity categorization
    3.2 Centralized file-sharing services
        3.2.1 Box
        3.2.2 Dropbox
        3.2.3 Nextcloud
        3.2.4 Citrix RightSignature
    3.3 Decentralized data storage and file-sharing services
        3.3.1 Sia
        3.3.2 Storj
        3.3.3 SecuRES
    3.4 Miscellaneous
        3.4.1 Secure E-Mail through OpenPGP
    3.5 Summary
    3.6 Conclusion

4 Prototype specification
    4.1 Architecture
        4.1.1 Identity management
        4.1.2 Anonymization network
        4.1.3 docShare client
    4.2 Protocols
        4.2.1 User registration in the identity management system (IMS)
        4.2.2 Legal identity verification of an entity in the IMS
        4.2.3 Data exchange with compliance to private
        4.2.4 Data exchange with compliance to anonymous
        4.2.5 Data exchange with compliance to tracking
        4.2.6 Data exchange with compliance to business

5 Prototype implementation
    5.1 Limitations to the specification
    5.2 Public identity management service
    5.3 Key value storage for tracking
    5.4 Anonymization network
    5.5 docShare client
        5.5.1 SQLite3 metadata database
        5.5.2 docShare library
        5.5.3 Services
        5.5.4 Terminal scripts

6 Prototype evaluation
    6.1 Evaluation environment
    6.2 Public identity management
    6.3 Partner management
    6.4 Document exchange with regards to secure and private
    6.5 Document exchange with regards to anonymous
    6.6 Document exchange with regards to tracking
    6.7 Document exchange with regards to business
    6.8 Summary
    6.9 Theoretical connection of the concept

7 Conclusion
    7.1 Recommendations for future work

Bibliography

Appendices
    A Declaration of independence
    B Installation and usage guide
    C Digital Content

List of Figures

2.1 Structure of a hash-linked list.
2.2 Structure of a Merkle tree.
2.3 Structure of the Bitcoin blockchain.
2.4 High level structure of a replicated state machine.
2.5 Traffic anonymization in Tor.
2.6 Simplified version of the Tor rendezvous protocol.

4.1 Prototype architecture overview.

5.1 Screenshot of docShare’s public identity management system’s web-frontend.
5.2 Screenshot of docShare’s key value store’s web-frontend.
5.3 Message format specification used in docShare communications.
5.4 Digital envelope format for documents of sensitivity secure and private.
5.5 Digital envelope format for documents of sensitivity tracking.

6.1 Environment used to evaluate docShare’s implementation.


List of Tables

2.1 Specification of combinations of data security, privacy, and anonymity.

3.1 Summary of information security services based on the use case analysis.
3.2 Categorization of documents based on their need of information security services.
3.3 Compliance of analysed document-sharing systems with defined document sensitivity levels in section 3.1.
3.4 Overview of the analysed systems with regards to the information security services and features provided.

5.1 List of docShare’s message protocol formats.
5.2 Message handling by comDaemon.

6.1 Summary of docShare’s implementation evaluation regarding complied with information security services.
6.2 Summary of docShare’s implementation evaluation regarding complied with document’s levels of sensitivity.


List of Abbreviations

BYOE bring-your-own-encryption

CA certificate authority

Dapp distributed application

DHT distributed hash table

DLP data loss prevention

DSS document sharing system

EVM Ethereum Virtual Machine

IMS identity management system

PKCS Public-Key Cryptography Standard

PKI public key infrastructure

SSO single-sign-on

Tor The Onion Router

URL uniform resource locator

UTXO Unspent Transaction Output


Chapter 1

Introduction

The Facebook scandal around Cambridge Analytica in 2017 showed drastically that centralized information stored at trusted third parties has a huge potential for abuse. In Facebook’s case, private information of over 50 million users was illegally shared without consent and used by Cambridge Analytica to provide voter profiles used to influence the U.S. elections in 2016 and the Brexit referendum. [1]

The Snowden leaks from 2013 disclosed the federal surveillance projects PRISM, Tempora and XKeyScore and how British and U.S. intelligence agencies use them to massively collect meta- and content data at internet backbones around the globe. [2] Given that the U.S. military kills on the basis of metadata, [3] the protection of metadata at third party services gains a higher significance than one would naturally anticipate.

This thesis analyses a sub-problem of this general problem space by examining how (sensitive) documents are shared through the services of trusted third parties like Dropbox [4].

1.1 Problem statement

The exchange and signing of a contract between two parties is an example of secure document exchange. Depending on the sensitivity of the document to be exchanged, different precautions need to be considered and different technologies are used. The parties could for example meet in person, use a courier for the delivery, send the contract via fax or (secure) e-mail, or use a document sharing service like Dropbox or Citrix RightSignature [5]. Concepts from information security like data integrity, confidentiality, privacy, anonymity, non-repudiation, authenticity of authors and accountability are essential to consider when choosing the right exchange technology.

All above mentioned technologies offer different degrees of information security when exchanging a document. Depending on the sensitivity of the document to be exchanged, different information security services are required. Especially when using the services of a trusted third-party, these information security services are hard to impossible to verify, as a client relies on the correct server-side implementation and security provision, which she can neither influence nor control.

Third-party services can be further undermined by law. The USA Patriot Act [6], for example, allowed U.S. federal agencies to access files hosted by any U.S. company for terror prevention purposes. Its successor, the Freedom Act [7], is currently used to legally access data stored at cloud services hosted in the United States. Dropbox for example states in its transparency reports [8] that it was forced in 1,619 cases in 2016 to provide user content through search warrants. The number of national security letters received and complied with, however, had to be kept secret, so the number of unreported cases might be much higher.

Recent developments in distributed ledger technology allow the development of distributed systems that don’t rely on the services of a trusted third-party or communication partner and could be utilized for secure document exchange.

1.2 Research methodology

A constructive research methodology [9, 10] was chosen. From the assumption of lacking information security services in current document sharing systems two research questions emerged:

1. To what degree do current document sharing systems (DSS) meet the information security requirements of their users?

2. How could the information security of document exchange be improved using distributed ledger technology?

To answer the first research question, a literature review will be conducted to define information security parameters for document exchange systems. Afterwards, commonly used DSS will be analysed with regards to their compliance with the examined information security parameters.

To answer the second research question, a feasibility study will be conducted to build a decentralized DSS that can operate without trusting any third-party. After the feasibility is proven through prototype implementation, the theoretical connection and the research contribution of the solution will be shown. Finally, the scope of applicability of the solution will be examined.

1.3 Goals

This thesis aims at analysing existing DSS with regards to their information security services and features provided, and at designing and implementing a private, secure, and censorship resistant DSS for individuals and groups on the basis of distributed ledger technology. The DSS should guarantee different levels of information security based on the sensitivity of the documents to be exchanged.

At the end of this thesis two goals will be achieved. First, relevant document exchange systems will be analysed and compared with regards to their supported information security services, and second, a prototype of a decentralized document sharing system addressing the limitations identified during the former analysis will be built.

1.3.1 Objectives

To reach these goals, the following objectives need to be met.

1. Different information security services for the exchange of documents need to be identified. Then documents need to be categorised into levels of sensitivity based on their need of security services.

2. Existing document sharing systems need to be analysed with regards to their compliance with information security services, and key concepts need to be extracted.

3. A concept for a decentralized DSS which supports the exchange of documents of different levels of sensitivity between individuals and groups needs to be created.

4. A rudimentary prototype that proves the concept for each defined level of sensitivity needs to be created.

5. The prototype needs to be analysed with respect to its compliance with the documents’ levels of sensitivity as defined in objective 1.

1.3.2 Deliverables

This project has the following deliverables:

Report: This report as analysis of existing DSS and as feasibility evaluation of the specified and prototyped decentralized document sharing system.

Prototype: A rudimentary proof-of-concept of a decentralized DSS to share documents of different levels of sensitivity between groups and individuals.

1.4 Purpose

The purpose of this thesis is twofold. First, uncensored information exchange is needed to maintain free speech, formation of opinion and democracy. Considering recent efforts in the European Parliament to introduce upload filters that remove illegally uploaded copyrighted material from content platforms like YouTube or Wikipedia, [11] a reality is close in which content platform operators need to fear huge penalties if they share copyrighted material. As a consequence, they will most likely implement nontransparent self-learning algorithms to decide which material to publish and which to filter. These algorithms will filter material which they can’t identify with certainty as non-copyrighted and will therefore also remove legitimate material. But the larger danger lies in the huge abuse potential inherent in nontransparent censorship infrastructure. Imagine a scenario where a content provider uses its nontransparent censorship infrastructure for its own agenda, for example to influence the next elections.

Second, as advancements in machine learning, storage, and computational power allow data and metadata to be collected and analysed on a large scale, profiling of this data for marketing, terror prevention, and other purposes became reality. The problem is that people aren’t aware whether they are tracked and how their information is used, and consequently could change their behaviour to avoid unpleasant situations. Consider an imaginary scenario where an internet service provider tracks what kind of food you ordered through the internet and shares that information with your health insurance company, which will increase or decrease your premiums depending on the healthiness of the food ordered.

The author believes that through the concept of a decentralized, private, secure, and censorship resistant information exchange system, alternatives can be built to ease the illustrated situations and to help to maintain free speech, formation of opinion, democracy, and unaltered behaviour.

1.5 Scope, limitations, and assumptions

The scope of this thesis is the design and evaluation of information security services of (distributed) document sharing systems. Therefore, performance considerations are secondary. Further, due to limitations in time and resources, this thesis is subject to the following assumptions and limitations:

• To ensure anonymity, a TCP anonymization software called The Onion Router (Tor) (cf. section 2.4) is used in design and implementation. It is assumed that Tor, if used as documented in its best practices, sufficiently anonymizes its users.

• It is assumed that more than 50% of the participating nodes used in the distributed ledger are behaving as intended and aren’t malicious.

• It is assumed that the used cryptographic algorithms (SHA-256, RSA and AES) and the underlying operating system on which these algorithms are executed are secure.

• It is assumed that every user will act as miner to maintain the distributed ledger. Therefore, incentive considerations of the used distributed ledger won’t be handled in detail.

• The evaluated implementation will be seen as proof-of-concept; there will be no formal verification of the algorithms used.


1.6 Ethical and sustainable aspects

Regarding ethics and sustainability there are two important aspects to consider.

First, there is the ethical dilemma that users on the one hand need tools to exchange legal information like contracts or trade secrets securely and privately. On the other hand, the same tools could be used to exchange illegal information like child abuse material securely and privately. From the point of view of technology there is no difference in the data being transmitted. All content would be transferred as an encrypted bitstream to guarantee confidentiality. Therefore, it is hard to impossible to control whether illegal material is transmitted without sacrificing the security services provided. It must further be considered that the definitions of legal and illegal material differ between jurisdictions, even between persons. If content filtering were technologically possible without sacrificing security services, another question would arise: Who would be the party to decide which contents are allowed to be shared and which are forbidden? As a consequence of this dilemma, the author decided not to publish the source code of the prototype implementation in an online repository, but to hand it out to interested researchers on personal request.

The second aspect is of a more technical nature and impacts sustainability. Many distributed ledger technologies like Bitcoin (cf. section 2.3.1) provide a publicly available append-only log that grows indefinitely over time. In this log a series of transactions is saved to maintain a global state of the system. Therefore, the whole log is needed to verify and add new entries. As the solution discussed later will be based on distributed ledger technology, the question arises whether it is sustainable to first download a huge transaction log (e.g. 100 GB) only to share a small file (e.g. 4 MB).

Further, to agree on such a shared transaction log a consensus algorithm is needed. Currently proof of work is one of the most commonly used consensus algorithms in distributed ledgers. Unfortunately, proof of work uses a lot of computational resources. According to [12, 13] Bitcoin mining has an energy footprint similar to that of the Republic of Ireland. Therefore, alternative consensus algorithms should be considered to reduce the energy footprint of the solution to be developed.

1.7 Outline

This thesis is structured as follows:

Chapter 2 covers the technical background needed to understand this thesis. It further discusses and defines relevant information security services for exchanging documents.

In chapter 3 documents are categorised into levels of sensitivity based on their need of information security services and other features provided. The categorization is performed on the basis of practical use cases. Afterwards, relevant DSS and concepts are introduced and analysed with regards to their information security services and features provided.

In chapter 4 the architecture of a decentralized DSS is sketched that exchanges documents of different levels of sensitivity between individuals and groups.

Chapter 5 describes the prototype implementation as proof-of-concept.

In chapter 6 the prototype is evaluated with regards to its information security services. It further shows the theoretical connection of the concept.

Finally, chapter 7 discusses the research contribution, draws a conclusion, and finishes with recommendations for future work.


Chapter 2

Technical background

This chapter covers the technical background needed to understand this thesis and provides the relevant definitions used within it.

2.1 Information security services

Information security is the discipline of guaranteeing certain security properties in information systems by providing corresponding security services. These security properties and services are explained below. [14, p.153,274ff]

2.1.1 Data integrity

Data integrity is the “property that data has not been changed, destroyed, or lost in an unauthorized or accidental manner.” [14, p.95]

Data integrity services can’t protect data from being changed but ensure that changes to data are detectable. Among other techniques, hashes can be used to guarantee data integrity. If the hash of a transferred file equals the hash of its original, it was transferred with integrity.
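The hash comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the thesis prototype; the helper names `sha256_hex` and `transferred_with_integrity` and the sample contract text are assumptions made for this example, and SHA-256 is used because it is the hash function named in section 1.5.

```python
import hashlib
from typing import Iterable


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash data chunk by chunk so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def transferred_with_integrity(original_hash: str, received: Iterable[bytes]) -> bool:
    """Return True if the received data hashes to the value published by the sender."""
    return sha256_hex(received) == original_hash


# Hypothetical document: the sender publishes the hash, the receiver re-computes it.
original = [b"contract v1, ", b"signed by both parties"]
published = sha256_hex(original)

intact = transferred_with_integrity(published, [b"contract v1, signed by both parties"])
tampered = transferred_with_integrity(published, [b"contract v2, signed by both parties"])
```

Note that the check only detects a change. It cannot prevent one, and it is only meaningful if the published hash itself reaches the receiver over a channel the attacker cannot modify.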

2.1.2 Data confidentiality

Data confidentiality is the “property that data is not disclosed to system entities unless they have been authorized to know the data.” [14, p.94]

Data confidentiality services provide data confidentiality by using encryption. An example of a protocol which provides data confidentiality is SSL [15].

Depending on an application’s use case, data confidentiality can be applicable only to actual user data, or also to its metadata.


2.1.3 Data security, privacy, and anonymity

Data security is the property to “protect data from disclosure, alteration, destruction, or loss that either is accidental or is intentional but unauthorized.” [14, p.97] It can be achieved by combining data integrity and data confidentiality services. Data security only protects data from illegal parties. Both legal parties (like trusted third parties that operate an infrastructure service used) and partners have access to the data transferred.

Privacy describes the right of an entity to choose and enforce how much personal information is shared with its environment. [14, p.232] In the context of this thesis, data is shared with regards to privacy if only partners that have been chosen explicitly have access to the data.

Anonymity is the property that an identity is unknown or concealed to every party and can’t be deanonymized by analysing metadata. [14, p.18]

Table 2.1 summarizes the definitions and specifies combinations of data security,privacy, and anonymity.

              Parties
Shared data   Partners   Legal   Illegal   Specification
-----------   --------   -----   -------   -----------------------------
Ids           ✓          ✓       ✗         Security
Metadata      ✓          ✓       ✗
Content       ✓          ✓       ✗
-----------   --------   -----   -------   -----------------------------
Ids           ✓          ✗       ✗         Security & privacy
Metadata      ✓          ✗       ✗
Content       ✓          ✗       ✗
-----------   --------   -----   -------   -----------------------------
Ids           ✗          ✗       ✗         Security, privacy & anonymity
Metadata      ✗          ✗       ✗
Content       ✓          ✗       ✗

Table 2.1: Specification of combinations of data security, privacy, and anonymity. Adapted from [16]. Ids: Data identifying a user (e.g. name, telephone nr, IP address, e-mail address). Metadata: Some property of the user (e.g. gender, age, location, business partners). Content: Data to be exchanged (e.g. contracts, files). Partners: Entities explicitly chosen to exchange information with. Legal: Entities not explicitly chosen but necessary to exchange the information (e.g. mail relays). Illegal: Entities not explicitly chosen and not necessary to exchange the information (e.g. government agencies).
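To make the reading of Table 2.1 concrete, the matrix can be encoded as a small lookup structure. The following Python sketch is illustrative only: the dictionary layout and the functions `may_access` and `classify` are assumptions of this example, not definitions from the thesis. `classify` returns the strictest specification that a set of observed disclosures still satisfies.

```python
# Table 2.1 as data: for each specification, the parties that may see each kind of data.
TABLE_2_1 = {
    "security, privacy & anonymity": {
        "ids": set(),
        "metadata": set(),
        "content": {"partners"},
    },
    "security & privacy": {
        "ids": {"partners"},
        "metadata": {"partners"},
        "content": {"partners"},
    },
    "security": {
        "ids": {"partners", "legal"},
        "metadata": {"partners", "legal"},
        "content": {"partners", "legal"},
    },
}


def may_access(specification: str, data_kind: str, party: str) -> bool:
    """Look up a single cell of the table."""
    return party in TABLE_2_1[specification][data_kind]


def classify(disclosures):
    """Return the strictest specification met by a set of (data_kind, party) disclosures.

    The dictionary is ordered from strictest to weakest, so the first match wins.
    None means that not even plain data security is provided.
    """
    for specification in TABLE_2_1:
        if all(may_access(specification, kind, party) for kind, party in disclosures):
            return specification
    return None


anonymous = classify({("content", "partners")})
private = classify({("content", "partners"), ("ids", "partners")})
secure = classify({("content", "partners"), ("metadata", "legal")})
broken = classify({("content", "illegal")})
```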

2.1.4 Non-repudiation

Non-repudiation “provides protection against false denial of involvement in a [communications] association.” [14, p.200] A non-repudiation service provides the recipient of a message with evidence that proves its origin and provides the sender of the message with evidence that proves it was received as addressed.

Technical and legal aspects of non-repudiation need to be distinguished. Technical non-repudiation only assures that a digital signature was created with the corresponding private key. Legal non-repudiation refers to the possession and control of the private key used. [14, p.200ff]
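Technical non-repudiation can be illustrated with a deliberately simplified signature scheme. The sketch below uses textbook RSA without padding and with a toy key built from two small, well-known Mersenne primes; this is an assumption made purely for illustration, it is not secure, and it is not the signature scheme of the thesis prototype. All names (`sign`, `verify`, `digest_as_int`) are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib

# Toy key material: far too small for real use, chosen only so the example is self-contained.
P = 2**61 - 1                       # Mersenne prime
Q = 2**31 - 1                       # Mersenne prime
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the signer


def digest_as_int(message: bytes) -> int:
    """Map a message to an integer below N via SHA-256 (no padding: illustration only)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Create a signature; this step needs the private exponent D."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Check a signature; this step needs only the public values E and N."""
    return pow(signature, E, N) == digest_as_int(message)


contract = b"I agree to the terms of the contract."
signature = sign(contract)
accepted = verify(contract, signature)
rejected = verify(b"I never agreed to anything.", signature)
```

A valid signature shows that the holder of the private exponent signed exactly this message, which is the technical aspect. Whether a particular person possessed and controlled that key is the legal aspect and cannot be answered by the mathematics alone.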

2.1.5 Accountability

Accountability is the “property of a system or system resource that ensures that the actions of a system entity may be traced uniquely to that entity, which can then be held responsible for its actions.” [14, p.13] Accountability is for example important to trace authorized changes in a document back to the entity which altered the document. An audit service can be used to achieve accountability. It records actions of system entities and their resulting system events. [14, p.26]

2.1.6 Authentication

Authentication is the “process of verifying a claim that a system entity or system resource has a certain attribute or value.” [14, p.25] Authentication is widely used in computer systems to verify the identity of a user. Its process consists of two steps. In the identification step the claimed value (e.g. a user identifier) is presented to the authentication service. In the verification step authentication information belonging to the value (e.g. a password) is presented or generated. It acts as evidence to prove the binding between the attribute and that for which it is claimed. [14, p.25ff]

Authenticity is the “property of being genuine and able to be verified and be trusted.” [14, p.28] In the context of document exchange an entity must be able to verify the authenticity of another entity and its messages. When an identity is being registered in a system, the system is responsible for proving the identity’s authenticity and its eligibility.
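The two steps of authentication can be sketched with a password-based example. This is a minimal sketch under stated assumptions: the in-memory user store, the function names `register` and `authenticate`, and the choice of PBKDF2 with 100,000 iterations are illustrative and not taken from the thesis.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000   # assumed work factor for this example
_users = {}            # hypothetical in-memory store: user id -> (salt, derived key)


def register(user_id: str, password: str) -> None:
    """Store a salted, derived key instead of the password itself."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    _users[user_id] = (salt, key)


def authenticate(user_id: str, password: str) -> bool:
    # Identification step: the claimed value (the user identifier) is presented.
    record = _users.get(user_id)
    if record is None:
        return False
    # Verification step: the authentication information (the password) is checked.
    salt, expected = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


register("alice", "correct horse battery staple")
ok = authenticate("alice", "correct horse battery staple")
wrong_password = authenticate("alice", "guess")
unknown_user = authenticate("bob", "correct horse battery staple")
```

The constant-time comparison via `hmac.compare_digest` avoids leaking how many leading bytes of the derived key matched.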

2.1.7 Access Control

An access control service protects system resources against unauthorized access. The system’s access policy defines which resources are accessible by which entities. It can for example be implemented as an access control list or an access control matrix. [14, p.11ff] Authorization defines the “process for granting approval to a system entity to access a system resource.” [14, p.29] Authorization depends on the form of access control used. It could for example be achieved through sharing of a secret (e.g. a decryption key).
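An access control list as mentioned above could be sketched as follows. The class name, the method names, and the example resource and rights are assumptions of this illustration; a real system would additionally have to authenticate the entity before consulting the list.

```python
from collections import defaultdict


class AccessControlList:
    """Per-resource list of entities and the rights granted to them."""

    def __init__(self):
        # resource -> entity -> set of rights
        self._entries = defaultdict(lambda: defaultdict(set))

    def grant(self, entity: str, resource: str, right: str) -> None:
        """Authorization: approve a right on a resource for an entity."""
        self._entries[resource][entity].add(right)

    def revoke(self, entity: str, resource: str, right: str) -> None:
        self._entries[resource][entity].discard(right)

    def is_allowed(self, entity: str, resource: str, right: str) -> bool:
        """Access decision: anything not explicitly granted is denied."""
        return right in self._entries.get(resource, {}).get(entity, set())


acl = AccessControlList()
acl.grant("alice", "contract.pdf", "read")
acl.grant("alice", "contract.pdf", "write")
acl.grant("bob", "contract.pdf", "read")
acl.revoke("alice", "contract.pdf", "write")
```

With these grants, `acl.is_allowed("bob", "contract.pdf", "read")` is true, while a write request by either user, or any request on a resource without entries, is denied.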

2.1.8 Availability

Availability is the “property of a system resource being accessible, or usable or operational upon demand, by an authorized system entity, according to performance specifications for the system.” [14, p.30] Performance specifications are usually specified by quantitative metrics. An availability service protects a system against denial-of-service attacks to guarantee its availability. It therefore relies on proper resource management and access control. [14, p.30ff]

2.2 Cryptography

Cryptography is a mathematical science that deals with transforming data to achieve data integrity, data confidentiality or data authenticity. [14, p.90ff] This section describes the main cryptographic concepts used to provide the different information security services.

2.2.1 Symmetric cryptography

In symmetric cryptography the same (secret) key is used to perform both of two counterpart cryptographic operations. Usually symmetric cryptography is used to encrypt plaintext into ciphertext and to decrypt ciphertext back into plaintext. Symmetric cryptography ensures data confidentiality if data is encrypted before it is sent. The disadvantage compared to asymmetric cryptography lies in the costly distribution of the secret key if done securely. The security of the data depends on every entity that owns a copy of the key. [14, p.296ff]

An example algorithm for symmetric cryptography is AES [17].

2.2.2 Hashing

A hash function H maps an arbitrary, variable-length bit string s into a fixed-length string h = H(s), called the hash. A secure hash function, which can for example be used for fingerprinting, is a one-way function and is in addition either weakly or strongly collision-free:

• One-way function: Given H and h it is computationally infeasible to find s.

• Weakly collision-free: Given H and an input s it is computationally infeasible to find a different input s’ such that H(s) = H(s’), or

• Strongly collision-free: Given H it is computationally infeasible to find any pair of different inputs s and s’ such that H(s) = H(s’). [14, p.140ff]

An example of a secure hash function is SHA-256 [18]. Interesting data structures can be built upon secure hash functions to ensure data integrity.
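The fixed-length fingerprint property can be illustrated with a minimal sketch that uses SHA-256 from the Python standard library. The helper name `fingerprint` and the sample inputs are chosen for illustration only.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the input as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = fingerprint(b"a")
long_input = fingerprint(b"a" * 1_000_000)
other_input = fingerprint(b"b")

# The hash always has 256 bits (64 hex characters), whatever the input length.
assert len(short_input) == len(long_input) == 64
# A different input results in a different fingerprint.
assert short_input != other_input
print(short_input)
```

Comparing two such fingerprints is enough to detect that two data objects differ, without comparing the objects themselves.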

Hash-linked list

A hash-linked list, also called blockchain, is a data structure similar to a linked list that uses hash pointers instead of pointers to ensure data integrity. A hash pointer points to the previous block of the list and includes its cryptographic hash (cf. figure 2.1). To verify the integrity of the whole blockchain a user only has to remember the hash pointer pointing to the head of the list. He then successively calculates the hash of the block the pointer is pointing to and compares both hashes. If the hashes match he repeats this action with the following hash pointer until he reaches the genesis block. Then the integrity of the whole list is verified.

Figure 2.1: Structure of a hash-linked list. Each block contains data and a hash pointer to the previous block.

If an adversary tampers with the data stored in one block of the linked list, the hash pointer of the following block won’t match the hash of the block it is pointing to. Therefore, an adversary would have to tamper with the hash pointer of the following block, which would result in changing its block hash. As a result, an adversary would be forced to alter all hash pointers up to the one pointing to the head of the list to legitimate his changes. As the user remembers this hash it can’t be changed by the adversary and therefore his change will be detected. [19]
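The verification walk and the tamper detection described above can be sketched as follows. This is a minimal illustration, not the block format of any real blockchain: the `Block` class, the all-zero genesis pointer and the use of SHA-256 over pointer plus data are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import List

GENESIS_POINTER = bytes(32)  # assumed marker: the genesis block points to nothing


@dataclass
class Block:
    data: bytes
    prev_hash: bytes  # hash pointer to the previous block


def block_hash(block: Block) -> bytes:
    """Hash a block over its hash pointer and its data."""
    return hashlib.sha256(block.prev_hash + block.data).digest()


def append(chain: List[Block], data: bytes) -> bytes:
    """Append a block and return the head hash the user has to remember."""
    prev_hash = block_hash(chain[-1]) if chain else GENESIS_POINTER
    chain.append(Block(data, prev_hash))
    return block_hash(chain[-1])


def verify(chain: List[Block], head_hash: bytes) -> bool:
    """Walk from the head back to the genesis block and check every hash pointer."""
    expected = head_hash
    for block in reversed(chain):
        if block_hash(block) != expected:
            return False
        expected = block.prev_hash
    return expected == GENESIS_POINTER


chain: List[Block] = []
for payload in (b"genesis", b"second", b"third"):
    head = append(chain, payload)

assert verify(chain, head)
chain[1].data = b"tampered"      # the adversary changes one block ...
assert not verify(chain, head)   # ... and the remembered head hash exposes it
```

The user only stores `head`. Any change to an earlier block changes that block's hash, so the walk fails at the first pointer that no longer matches.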

Merkle tree

A Merkle tree is a binary tree that uses hash pointers instead of pointers (cf. figure 2.2). Similar to hash-linked lists only the root hash pointer needs to be remembered to guarantee data integrity.

Merkle trees have two other valuable features. Membership of a data object can be proven in O(log n) time and space by showing only the items on the path from the leaf data object up to the root hash pointer (as indicated in red). If all hashes match up, the shown data object is a member of the Merkle tree. If the data objects in the leaves are ordered (e.g. alphabetically) a non-membership proof of any data object is also possible. For this, the paths of the data objects immediately before and after the missing one have to be shown. [19]
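A membership proof of this kind can be sketched as follows. The sketch assumes SHA-256, a non-empty list of data objects, and that an odd-sized level is padded by duplicating its last hash. That padding rule is a simplification chosen for the example and is not taken from [19].

```python
import hashlib
from typing import List, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(data_objects: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, from the leaf hashes up to the single root hash."""
    level = [sha256(obj) for obj in data_objects]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # assumed padding rule for odd levels
            levels[-1] = level
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def membership_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect the sibling hashes from leaf to root; the flag marks a left sibling."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_membership(obj: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root using only the object and the sibling hashes."""
    current = sha256(obj)
    for sibling, sibling_is_left in proof:
        current = sha256(sibling + current) if sibling_is_left else sha256(current + sibling)
    return current == root


objects = [b"alice", b"bob", b"carol", b"dave", b"erin"]
levels = build_levels(objects)
root = levels[-1][0]
proof = membership_proof(levels, 2)          # prove that "carol" is a member

assert verify_membership(b"carol", proof, root)
assert not verify_membership(b"mallory", proof, root)
assert len(proof) == 3                       # logarithmic in the number of leaves
```

The verifier needs only the remembered root hash, the data object and one sibling hash per tree level.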

2.2.3 Asymmetric cryptography

Asymmetric cryptography, also called public-key cryptography, uses a key-pair to perform two counterpart cryptographic operations. One key is used for the first operation and the other for its counterpart. Usual applications are encryption, digital signatures, and key agreement.

The two keys are differentiated by their accessibility: public and private. The public key is meant to be exchanged with a third party so that the third party can perform one operation (e.g. verify a signature). In contrast, the private key is meant to be kept private and used for the counterpart operation (e.g. signing a document). Usually every party has its own key-pair and publishes its public key to assure a mutual provision of services. [14, p.21ff]

Examples of asymmetric cryptography are RSA [20] and Elliptic Curve Cryptography [21, 22].

Figure 2.2: Structure of a Merkle tree. The data objects are stored in the leaves and every node above them holds hash pointers to its children.

Encryption

If asymmetric cryptography is used for encryption, the public key is used by a third party to encrypt content for the entity owning the corresponding private key. Compared to symmetric encryption, asymmetric encryption uses more resources [23]. Therefore, asymmetric encryption is usually used to securely communicate a secret key for symmetric encryption if larger amounts of data need to be transferred (cf. section 2.2.5).

Digital signatures

A digital signature is a value which can be associated with a data object to enable a recipient to verify the data’s integrity and authenticity. To create it, the sender hashes the data object and encrypts the hash with his private key; the encrypted hash is called the digital signature. He then sends both the data object and the digital signature to the recipient. The recipient uses the sender’s public key to decrypt the hash from the digital signature. Afterwards he hashes the data object and compares both hashes. If the hashes match, the recipient has confirmed the object’s integrity and authenticity. [14, p.104ff]
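The hash-then-sign flow can be made concrete with a deliberately insecure toy sketch based on textbook RSA. The key is built from two publicly known Mersenne primes, no padding scheme is used, and the hash is simply reduced modulo the RSA modulus. All of this is assumed for illustration only and must not be used to protect real data.

```python
import hashlib

# Toy RSA key built from two publicly known primes: insecure, illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data object and map the hash into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Sender: transform the hash with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Recipient: recover the hash with the public key and compare it."""
    return pow(signature, E, N) == digest_as_int(data)


document = b"contract v1"
signature = sign(document)

assert verify(document, signature)
assert not verify(b"contract v2", signature)
```

A practical system would rely on a vetted cryptographic library with a standardized signature scheme instead of this hand-rolled arithmetic.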

Key agreement

Key agreement algorithms generate a shared secret key between multiple parties based on one’s own private key and the other parties’ public keys. Only the public keys need to be transferred; the calculated secret key is never sent through any communication channel. [14, p.102ff]

An example algorithm is Diffie-Hellman-Merkle [24].
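A minimal sketch of such a key agreement between two parties is shown below. The prime modulus, the base and the function names are assumptions picked for readability; real deployments use standardized groups and a key derivation step on the result.

```python
import secrets

P = 2**127 - 1   # toy prime modulus, far too small for real use
G = 3            # assumed public base


def generate_keypair() -> tuple:
    """Return a (private, public) pair; only the public part is ever transferred."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, other_public: int) -> int:
    """Combine one's own private key with the other party's public key."""
    return pow(other_public, own_private, P)


alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()

# Both parties arrive at the same secret without ever sending it.
assert shared_secret(alice_private, bob_public) == shared_secret(bob_private, alice_public)
```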

2.2.4 Public key management

According to B. Schneier [25, p.169-187] proper key management is the hardest part of cryptography. Of what use are the best cryptographic algorithms if you can’t secure your private keys or can’t obtain the public keys of the entities you want to communicate with? Public key management systems try to fill this gap. Here are some of their key responsibilities.

• Identities need to be bound to their public keys. This is usually done in (public key) certificates.

• Identities of owners of certificates need to be verifiable so that certificates can be trusted.

• Certificates need to be able to be updated/replaced by their owner.

• Certificates need to be able to be revoked if their private key is compromised.

• Certificates need to be exchanged without the possibility of tampering.

In the following, three different public key management systems are described.

Public key infrastructure

In public key infrastructures (PKIs) someone trustworthy signs the public key of another entity to verify that entity’s identity. Centralized certificate authorities (CAs) take this role of trusted third parties. If a certificate is verified by a CA, another entity needs to trust the CA that the certificate belongs to the right identity. This has the benefit that an entity only has to verify and manage the root certificates of a couple of CAs and does not have to verify the identity of every entity it communicates with. Furthermore, CAs take care of certificate update, replacement, and revocation. PKIs are hierarchical key management schemes where CAs are able to certify other CAs in a tree structure. CAs can also provide a list of public keys and identities of every entity they certified. This makes it easy to look up and obtain public key certificates of identities someone hasn’t communicated with before.


On the downside, everyone has to trust the CAs not to behave maliciously. They are a single point of failure. There are documented cases [26, 27] where a CA undermined the integrity of its clients by issuing or revoking wrong certificates. CAs can further be censored by law if they operate in jurisdictions of certain countries.

The X.509 certificate infrastructure [28] is an example of a PKI. It is the most widely used one on the internet (e.g. in TLS/SSL, which is the basis for HTTPS).

Web of trust

The web of trust was first introduced by PGP [29] and defines a distributed key management approach. Instead of trusting whole CAs, each entity can define its own set of trusted entities and rules. Trusted entities, also called introducers, are similar to CAs in PKIs. They verify and sign certificates of identities they know. Based on the introducers’ signatures appended to the public key certificate a third party can decide if she wants to trust the identity of the certificate or not. If she knows and trusts a couple of the appended introducers she will probably trust the certificate. Otherwise, if she doesn’t know any of the introducers, she will probably not trust the certificate. Therefore, it is essential that introducers are careful when verifying identities.

The web of trust model is resistant against denial of service and censorship as there is no central authority. On the downside, there is no central repository of public keys to look up identities. As a result, certificate updates or revocations propagate slowly through the system as every entity needs to be informed by an introducer.

Distributed ledger based

A different approach to designing public key management can be achieved through distributed ledger technology (cf. section 2.3). Here all public key certificates can be stored in one single distributed ledger. This makes it easy to look up identities, and every entity can update and revoke its own identities. Only the verification of identities needs to be taken care of. This could be done in the distributed ledger itself or locally in a fashion similar to the web of trust.

Examples of distributed ledger based public key management systems are Emercoin [30], the decentralized public key infrastructure [31] or the BIX Certification Infrastructure [32, 33].

2.2.5 Digital envelopes

A digital envelope is a data format in which content data intended for one or more recipients is encrypted with a one-time key. This one-time key is itself encrypted in a format that only the recipients can decrypt and is appended to the message. This ensures that nobody except the intended recipients can decrypt the content data. [14, p.103ff]


In Public-Key Cryptography Standard (PKCS) #7 [34, p.18ff] the content data is encrypted with a randomly generated symmetric key, and this encryption key is asymmetrically encrypted for every recipient with the recipient’s public key.

2.2.6 Zero-knowledge proof

Zero-knowledge proofs [35] are used to prove possession of some information (e.g. a secret) to another entity without revealing any of that information except that one possesses it. [14, p.342] Zero-knowledge proofs usually take the form of interactive protocols and involve two types of interaction partners: one prover and one or multiple verifiers.

The verifier asks the prover a series of questions. If the prover knows the secret she can answer the questions correctly. If she does not, she has some chance of answering correctly, e.g. 50% for every question. With each question the chances of answering correctly without knowing the secret drop rapidly. In the 50% example the chance of answering all questions correctly is $1/2^n$ after n questions. Consequently, a verifier just has to define for himself his acceptable probability of being cheated and calculate the number of questions to ask, e.g. 20 for a probability lower than 0.0001%.

The questions need to be in a form such that the verifier won’t get any information about the secret itself, but only about the possession of the secret. Therefore, the information the prover wants to prove possession of needs to be a solution to a mathematically hard problem. Every time a new question is asked the prover transforms the hard problem into another hard problem that is isomorphic to the original. She then commits to the solution of the isomorphic problem and transfers both to the verifier. Afterwards, the verifier asks the prover to either show that both problems are isomorphic, or to reveal the solution in the commitment. Not all mathematically hard problems can be used for zero-knowledge proofs, but many can. These include the discrete log of a given value [36] or the Hamiltonian cycle for a large graph [37]. M. Blum and Goldreich et al. documented [37, 38] that any NP statement has a zero-knowledge proof if it is translated into an instance of a mathematically hard problem that can be used for zero-knowledge proofs.

Another feature of interactive zero-knowledge proofs is that the verifier can’t convince anybody else of the outcome of the proof with a recording of the zero-knowledge protocol. The verifier would need to be trusted: he and the prover could have collaborated, or the recording could have been tampered with. If any third party needs to be convinced, non-interactive zero-knowledge proofs need to be used. Here the prover generates n isomorphic problems and commits to their solutions. The first n bits of the hash of the concatenated commitments are used as the basis for the decision to either prove the isomorphism or reveal the solution of the commitment. [25, p.101-111]
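The relation between the number of questions and the probability of being cheated can be checked with a few lines of code. The function names are chosen for this illustration, and the sketch assumes that a cheating prover guesses each answer independently.

```python
def cheating_probability(questions: int, guess_chance: float = 0.5) -> float:
    """Probability that a prover without the secret answers every question correctly."""
    return guess_chance ** questions


def questions_needed(max_cheating_probability: float, guess_chance: float = 0.5) -> int:
    """Smallest number of questions that pushes the cheating probability below the bound."""
    questions = 0
    while cheating_probability(questions, guess_chance) >= max_cheating_probability:
        questions += 1
    return questions


# 0.0001% corresponds to a probability of 1e-6.
assert questions_needed(1e-6) == 20
assert cheating_probability(20) < 1e-6 < cheating_probability(19)
```

With a 50% guessing chance, every additional question halves the probability of being cheated, which is why twenty questions already reach the bound named in the text.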

Practical applications of zero-knowledge proofs in computer science are authentication [39] or enforcing honest behaviour while maintaining privacy in distributed ledgers through zk-SNARKs [40].


2.3 Distributed ledger

In a ledger all business activities (e.g. asset transfers) are recorded as transactions. To reproduce the current state of a system all transactions need to be executed in the order they were recorded. But when multiple parties interact with each other and keep their own ledgers and states, problems and incidents caused by diverging ledgers (e.g. through fault or fraud) can appear.

A distributed ledger is a ledger maintained by a group of entities that do not fully trust each other. Distributed ledgers are systems that provide useful and trustworthy services like maintaining a shared state, mediating exchanges, and providing a secure computing engine. Many distributed ledgers can further execute arbitrary tasks, typically called smart contracts. Distributed ledgers offer an integrity-focused solution to Byzantine fault tolerant [41] atomic broadcast. Because of their Byzantine fault tolerance distributed ledgers can act as trusted and dependable third parties. They can operate without the need for central administration or central data storage, which makes them resistant against censorship. [42, p.14] [43]

Usually distributed ledgers consist of the following components:

• a peer-to-peer protocol to distribute transactions between nodes,

• a consensus algorithm to order transactions and maintain a coherent replicated state,

• a business logic to execute valid transactions,

• a tamper free data structure for storing executed transactions,

• an authentication mechanism to distinguish its users and rights, and

• an economic incentive to participate in the system.

A simple form of a distributed ledger is the replicated state machine used by Tendermint, described in section 2.3.3 and depicted in figure 2.4. Usually blockchains (cf. section 2.2.2) are used to achieve immutable and verifiable append-only logs, but reliable time in combination with multi-signatures could also be used. [44]

Applications based on distributed ledgers can be either permissioned (private) or unpermissioned (public) and can be used to achieve technical non-repudiation, availability, and accountability.

2.3.1 Bitcoin

The first known application of an unpermissioned distributed ledger is the digital currency Bitcoin [45]. In Bitcoin the shared data is the transaction log of every bitcoin consumed and generated, from which the current balance of every account can be deduced.

Bitcoin is the first known application that used a blockchain as a replicated tamper-free data structure to record transactions. Each block of the blockchain is divided into header and transaction data (cf. figure 2.3). The header stores relevant metadata, like the hash pointer pointing to the previous block, the Merkle root hash of the Merkle tree of transactions, and a nonce for the mining puzzle. The transactions are stored in a Merkle tree appended to the header. Only the header is used to calculate the hash of the block. Data integrity is ensured as the header contains the Merkle root hash of the included transactions.

Figure 2.3: Structure of the Bitcoin blockchain. Metadata is stored in the header of each block and transactions are stored in a Merkle tree. Only the header is used to calculate the hash of the block. Data integrity is guaranteed as the header contains the Merkle root hash of included transactions.

Bitcoin uses a consensus algorithm called proof of work, which combines cryptography and economics, to add new blocks to the blockchain. A node that wants to add a new block has to find a nonce (an integer of its choice) so that the resulting hash of the block is smaller than a given threshold. Once a node has found an appropriate nonce it can announce the new block to all other nodes. Every node that receives the new block has to verify the validity of its transactions based on its local copy of the blockchain and check that the calculated hash is below the given threshold. Bitcoins consumed in transactions further need to be checked to be signed by their former owners. If all conditions are met, the node can add the proposed block to its local blockchain.

As proposed blocks are propagated asynchronously in the network, race conditions can occur if multiple nodes propose new blocks simultaneously. In this case the blockchain is forked. To overcome this problem, in Bitcoin the longest valid branch is always the effective one. Therefore, the number of future blocks mined on top of the competing branches decides which branch is legitimate and which isn’t. In practice a mined block counts as confirmed if six or more blocks are mined on top of it. As solving a mining puzzle is computationally expensive it is assumed that it is uneconomic or infeasible for a single entity to catch up that number of blocks, especially as the threshold is re-calculated every 2016 blocks (≈ two weeks) based on the hashing power of the entire Bitcoin mining network, so that a new block is found every ten minutes on average.

Bitcoin further introduced incentives in the form of block rewards and transaction fees to facilitate honest behaviour. Every block has a special transaction called the coinbase transaction, which generates a certain amount of bitcoin¹ that the miner can send to an address of his choosing. Every other transaction consumes bitcoins from one or multiple input addresses, creates new bitcoins of up to the same amount and assigns them to the defined output addresses. The difference between the amount of consumed coins and created coins is the transaction fee a miner can pocket. These economic incentives (and exchanges into fiat money) guarantee that enough miners participate in the mining process and therefore guarantee Bitcoin’s stability and success. [46, 47]
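The mining puzzle of proof of work can be sketched as a brute-force search for a nonce. The header bytes, the 8-byte nonce encoding and the threshold below are simplifications assumed for the example and do not match Bitcoin's real 80-byte header layout. Only the double application of SHA-256 follows Bitcoin.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    """Hash header and nonce with double SHA-256 and interpret the result as an integer."""
    payload = header + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(hashlib.sha256(payload).digest()).digest(), "big")


def mine(header: bytes, threshold: int) -> int:
    """Try nonces one by one until the block hash falls below the threshold."""
    nonce = 0
    while block_hash(header, nonce) >= threshold:
        nonce += 1
    return nonce


def is_valid(header: bytes, nonce: int, threshold: int) -> bool:
    """Check performed by every node that receives the block: a single hash computation."""
    return block_hash(header, nonce) < threshold


# Assumed toy difficulty: the hash must start with 16 zero bits,
# which takes about 2**16 attempts on average.
THRESHOLD = 1 << (256 - 16)
HEADER = b"previous-block-hash|merkle-root-hash"

nonce = mine(HEADER, THRESHOLD)
assert is_valid(HEADER, nonce, THRESHOLD)
print("nonce found:", nonce)
```

Finding the nonce takes many attempts while checking it takes one hash computation. Lowering the threshold makes the search proportionally harder, which is how the difficulty is adjusted.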

¹ The amount is currently 12.5 BTC but is halved every 210,000 blocks (≈ 4 years) until the maximum of 21 million bitcoins is reached.

2.3.2 Ethereum

Ethereum was introduced by V. Buterin [48] to create an unpermissioned distributed ledger with a built-in Turing-complete programming language for building distributed applications (Dapps). It adopts some concepts of the Bitcoin system and addresses several important limitations. Like Bitcoin, Ethereum uses a blockchain for storing executed transactions and an adaptation of the proof-of-work consensus algorithm called Greedy Heaviest Observed Subtree [49] to address issues resulting from Ethereum’s fast confirmation times.

In addition to externally owned accounts, which are controlled by private keys, Ethereum introduces a new form of accounts named contract accounts. Contract accounts, also called smart contracts or Dapps, are a form of autonomous agents that are controlled by their internal contract code. Contract accounts have direct control over their key/value store, which keeps track of persistent variables, and over their balance of ether, the internal currency of Ethereum.

The communication in Ethereum is done by messages and transactions. Transactions are signed data packets that store a message sent by an externally owned account. Messages are data objects that are sent between contract accounts and only exist in the Ethereum execution environment. Messages and transactions contain, amongst others, their recipient, the amount of ether to transfer alongside the message, an optional data field, and a STARTGAS value.

Contract code is invoked when a contract account receives a message or transaction. Transaction fees are deducted on the basis of the transaction length and the instructions completed by the contract code. As contract code can contain infinite loops the transaction fees can’t be calculated in advance. Therefore, the sender specifies in the STARTGAS parameter how much gas² he is willing to spend at most for his transaction. If the provided maximal transaction fee isn’t enough, all state changes are reverted and the complete transaction fee is sent to the miner. If the maximal transaction fee is sufficient, only the actual transaction fee is deducted, and the rest is refunded to the sender.

Ethereum contracts are written in a low-level stack-based bytecode language, which can be compiled from higher-level languages like Serpent, and are executed in the Ethereum Virtual Machine (EVM).

Unlike blocks in the Bitcoin blockchain, Ethereum blocks contain a copy of both the transactions and the most recent system state. Additional metadata like the block number and the mining difficulty is also included. As a result, light nodes don’t need to store the entire blockchain history to reproduce the latest state. For efficiency, the states are stored in a Patricia tree structure [50], which uses pointers to values of previous states. Furthermore, contract accounts have access to the blockchain header, which can act as a valuable source of randomness in their applications.

Ethereum suffers from the same scalability issues as Bitcoin. Every transaction needs to be processed and verified by every node in the network and needs to be recorded in the blockchain. This results in an unbounded growth of the blockchain over time, which endangers Ethereum’s decentralization. At some critical point in the future the blockchain will be so large, e.g. 100TB, that it is infeasible for commodity household nodes to keep a local copy. At this point only large organizations with their own data centres will be able to verify the integrity of every transaction. Light nodes are then forced to trust some supposedly honest full node(s) without the instant possibility to discover if they were cheated.

Currently, Ethereum is experimenting with switching from its proof-of-work algorithm to a less resource-intensive and more censorship-resistant proof-of-stake consensus algorithm called Casper [51, 52].

² Gas is an internal unit for the transaction costs. The price in ether is calculated from the current gas price: gas ∗ gas price = price in ether.
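The STARTGAS accounting of this section can be summarized in a small sketch. It only models the fee settlement and not the EVM itself; the `Receipt` type, the function name and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Receipt:
    success: bool   # False means all state changes are reverted
    fee_paid: int   # amount sent to the miner, in ether units
    refund: int     # amount returned to the sender, in ether units


def settle(start_gas: int, gas_price: int, gas_needed: int) -> Receipt:
    """Settle a transaction given the gas limit, the gas price and the gas the code needs."""
    max_fee = start_gas * gas_price          # gas * gas price = price in ether
    if gas_needed > start_gas:
        # Out of gas: the whole maximal fee goes to the miner, nothing is refunded.
        return Receipt(success=False, fee_paid=max_fee, refund=0)
    actual_fee = gas_needed * gas_price
    return Receipt(success=True, fee_paid=actual_fee, refund=max_fee - actual_fee)


assert settle(start_gas=100, gas_price=2, gas_needed=60) == Receipt(True, 120, 80)
assert settle(start_gas=100, gas_price=2, gas_needed=150) == Receipt(False, 200, 0)
```

Because an exhausted gas limit still costs the full fee, senders have an incentive to choose STARTGAS carefully, and an infinite loop in contract code always terminates once the gas is used up.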

2.3.3 Tendermint

Tendermint [42] is the name of a software platform consisting of a Byzantine fault tolerant consensus protocol, its implementation, an interface to build arbitrary applications on top of the consensus, and tools for deployment and management. Tendermint uses its own consensus algorithm, achieving thousands of transactions per second with dozens of nodes distributed around the globe and latencies of about one second. Internally, Tendermint abstracts applications into a replicated state machine and uses a blockchain to store the transaction log (cf. figure 2.4).

In Tendermint, nodes that “mine” new blocks in the blockchain are called validators. The set of validators is fixed in size and the validators know each other. Validators are responsible for maintaining a full copy of the replicated state, proposing new blocks, and voting on them. A round robin algorithm decides which node proposes the next block. Once a block is proposed, a vote about its validity is performed in two phases before it is committed. In the first phase each node broadcasts its opinion about the validity of the proposed block. In the second phase every node broadcasts its opinion on whether to commit the proposed block, based on the validity information previously received from every other validator.


Figure 2.4: High level structure of the replicated state machine implemented by Tendermint. The transaction log and resulting state is replicated across multiple nodes (diamonds). Source [42, p.7].

If a positive confirmation of more than 2/3 of all validators is received, the proposed block is added to the blockchain. Otherwise a new round with the next proposer is started. Local timeouts are used to deal with offline validators and faulty network connections. The timeout resets once a valid block is committed or a new round is started.

Tendermint introduces governance algorithms to allow changes of the protocol or validator set. They are essentially based on proposal and voting. Compared to proof-of-work consensus there is no cost for an entity to operate multiple validators. Therefore, it is important that each validator’s identity is proven during registration. Unfortunately, Tendermint doesn’t specify a mechanism to do this and leaves it to the application developer (e.g. through external channels). Furthermore, Tendermint has neither an internal currency nor incentives for behaving honestly. Disincentives for fraudulent behaviour based on exclusion are left for future work.
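The commit rule and the proposer rotation described for Tendermint can be sketched in a few lines. This is a simplified model with equally weighted validators and hypothetical function names, not Tendermint's actual implementation.

```python
from typing import List


def has_quorum(positive_votes: int, validator_count: int) -> bool:
    """True if strictly more than 2/3 of all validators confirmed the block."""
    # Integer arithmetic avoids rounding problems with the fraction 2/3.
    return 3 * positive_votes > 2 * validator_count


def proposer_for_round(validators: List[str], round_number: int) -> str:
    """Plain round robin: the proposer role moves on with every new round."""
    return validators[round_number % len(validators)]


validators = ["v0", "v1", "v2", "v3"]

assert has_quorum(3, len(validators))        # 3 of 4 is more than 2/3
assert not has_quorum(2, 3)                  # exactly 2/3 is not enough
assert proposer_for_round(validators, 0) == "v0"
assert proposer_for_round(validators, 5) == "v1"
```

With four validators the block is committed even if one validator is offline or faulty, while a second missing vote forces a new round with the next proposer.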


2.3.4 Corda

Corda [53, 54, 55, 56] is a permissioned distributed ledger developed by R3CEV LLC with the purpose of recording and enforcing business agreements among registered financial institutions. It uses an Unspent Transaction Output (UTXO) model similar to Bitcoin where states are the atomic unit of information. Each state is labelled after the transaction which created it. In contrast to all other distributed ledgers described here, data is shared on a need-to-know basis. Therefore, consensus is reached between the parties to a deal and not between all participants. There is no single point that records every transaction. Thus, there is only minimal support for rollbacks.

Corda doesn’t use a blockchain to order transactions. Instead it uses pluggable notaries that aren’t tied to any particular consensus algorithm. The role of miners is abstracted away as transactions aren’t ordered into blocks. Notaries and timestamp services provide timestamping and transaction ordering functionality. Each transaction includes a window of timestamps in which it is asserted to have occurred and a signature of the notary who checked that none of the inputs had been consumed.

Corda uses the X.509 certificate infrastructure [28] for connecting public keys to identities. A network map service publishes the IP addresses through which every node can be reached alongside their identity certificates and provided services. Sybil attacks [57] are avoided as each participant needs to be authorized and identified before it can join the network. A point-to-point messaging network ensures that transactions get delivered to the right parties.

Corda has neither an internal currency nor transaction fees as its permissioned nature and limited use case is incentive enough for all participants. If smart contracts rely on outside facts, a trusted oracle needs to be queried that provides deterministic data to all participants. To increase security, dedicated secure signing devices³ that store the private key and sign transactions are supported on the client side.

2.4 The Onion Router

Tor [58] is an anonymization software that hides the source IP addresses of TCP connections by relaying them through various middleman computers. It further offers perfect forward secrecy, congestion control, decentralized directory servers, integrity checking, configurable exit policies, and location-hidden-services via rendezvous points. Location-hidden-services, usually called hidden services, are services that are anonymously hosted in the Tor network and can’t be censored by law, as their location is protected by several middleman computers as well.

Between 1.5 and 2 million clients use the Tor network daily to anonymize their network traffic, and on average 60,000 hidden services are hosted in the Tor network [59].

Through Tor, anonymity and service availability can be achieved via decentralized directory servers and firewall-piercing hidden services.

³ e.g. TREZOR - https://trezor.io/ - last accessed 05.10.2017


2.4.1 Traffic anonymization

Tor anonymizes TCP traffic by relaying it through, usually three, middleman computers. These middleman computers run a copy of the Tor software and are usually called Tor nodes. Tor nodes can be configured to relay traffic only within the Tor network or also to the rest of the internet. Figure 2.5 depicts how the traffic anonymization works.

Figure 2.5: Traffic anonymization in the Tor network. Adapted from [60]. (Diagram: Tor client, Tor directory server, entry node, relay node, exit node, and the destinations www.kth.se and www.kernel.org, connected in steps 1 to 5 by encrypted and unencrypted links.)

The Tor client first downloads a list of active Tor nodes from a directory server including their network configuration. Then a random path of three or more Tor nodes is chosen to anonymize the traffic. A circuit of encrypted connections through the relays of the network is established incrementally. Asymmetric cryptography is used to exchange symmetric link keys. For each hop along the circuit a new link key is negotiated. No individual relay ever knows the complete path a data packet has taken. Relays just know their predecessor and successor in the circuit. Once a circuit has been established, data can be exchanged anonymously. Any application which uses TCP streams and supports the SOCKS protocol [61] can be anonymized. To reach a good balance between efficiency and anonymity Tor circuits are changed every 10 minutes.
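The layering of link keys along a circuit can be illustrated with a toy sketch. The stream cipher below (SHA-256 of key and counter, XORed onto the data) is a stand-in assumed for the example and is not the cipher construction of the Tor protocol; the key negotiation itself is left out and replaced by random keys.

```python
import hashlib
import secrets
from typing import List


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR the data with SHA-256(key || counter) blocks.

    Applying the function twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def wrap(payload: bytes, link_keys: List[bytes]) -> bytes:
    """Client: add one encryption layer per hop, the exit node's layer first."""
    for key in reversed(link_keys):
        payload = keystream_xor(key, payload)
    return payload


def peel(cell: bytes, link_key: bytes) -> bytes:
    """Relay: remove exactly one layer with its own link key."""
    return keystream_xor(link_key, cell)


# One symmetric link key per hop (entry, relay, exit), assumed to be already negotiated.
link_keys = [secrets.token_bytes(32) for _ in range(3)]
message = b"GET / HTTP/1.1"

cell = wrap(message, link_keys)
assert cell != message
for key in link_keys:            # entry node, relay node, exit node in turn
    cell = peel(cell, key)
assert cell == message           # only after the last hop is the payload readable
```

Each relay can remove only its own layer, so the entry and relay nodes see ciphertext and only the exit node obtains the payload that is forwarded to the destination.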

2.4.2 Location-hidden-services

In addition to offering TCP traffic anonymization from client to server, Tor also offers server location anonymization via hidden services. For this purpose it uses a rendezvous protocol that has recently been upgraded to version 3 [62]. Compared to version 2 it relies on stronger cryptographic algorithms, offers additional client authorization, and prevents hidden service names from being leaked by dishonest hidden service directory nodes (cf. [63, 64]). Figure 2.6 shows a simplification of how the rendezvous protocol works.

Figure 2.6: Simplified version of the Tor rendezvous protocol. Adapted from [65]. (Diagram: Tor client, hidden service, hidden service directory and Tor circuits inside the Tor cloud, connected in steps 1 to 5. Legend: IP1-3 = introduction points, RP = rendezvous point, TI = temporary index, cookie = one-time secret, identity encryption key, IP1-3 communication keys.)

Initially, just as in traffic anonymization, the Tor client and the hidden service download a list of Tor nodes from a directory server. The directory server further provides a random value, mutually agreed upon by the Tor directory authority nodes, which is also needed. This random value changes periodically, i.e. every 24 hours, and is used to prevent DoS attacks on hidden service directory nodes.

Each hidden service uses multiple asymmetric keypairs.

• A master (hidden service) identity keypair is used for long term identification. Its public key is part of the .onion address that uniquely identifies a hidden service.

• A temporary blinded signing keypair is derived from the master identity keypair and the downloaded random value. It changes every time a new random value is announced. Its public key is used as index in the hidden service directory and is therefore depicted as “temporary index” in figure 2.6.

• A descriptor signing keypair is used to sign the hidden service descriptors uploaded to the hidden service directory. It is signed by the temporary blinded signing keypair. Its public part is included in the unencrypted section of the hidden service descriptor.

• For every introduction point two keypairs, one for authentication and one for encryption, are created. Their public keys are included in the encrypted part of the hidden service descriptor and labelled as “IP communication keys” in figure 2.6.

In the beginning a hidden service establishes permanent Tor circuits (of usually three hops) to multiple (usually three) Tor nodes, which will act as introduction points, and negotiates communication keys with them that identify it as the hidden service. Afterwards it builds its hidden service descriptor. The descriptor is divided into an unencrypted and a double encrypted section. Important parts of the unencrypted section are a copy of the descriptor signing key and its signature over the whole descriptor to ensure data integrity. As the descriptor signing key is signed by the blinded signing key its authenticity can also be confirmed. Important parts of the double encrypted section include information about the chosen introduction points and communication keys. The first layer of encryption provides confidentiality against everyone who doesn’t know the public identity key of the hidden service. The second layer of encryption protects against entities that do not possess valid client credentials and is only useful if client authorization is enabled.

Once the hidden service descriptor is built it is uploaded to the responsible hidden service directory nodes, which are arranged in a distributed hash ring. The responsible hidden service directory nodes are derived from, amongst others, the temporary blinded signing key and therefore change over time. Usually two nodes are responsible for hosting a hidden service descriptor to ensure availability.

If a client wants to contact a hidden service it has to know its public identity key, or rather its .onion address. In conjunction with the publicly available random value of the Tor directory authorities it can derive the hidden service’s public temporary blinded signing key. From this it determines the responsible hidden service directories from which the hidden service descriptor can be downloaded. The temporary blinded signing key is used to validate the descriptor’s integrity and authenticity. Then the descriptor is decrypted using the public identity key and optionally provided client credentials.

Once the introduction point information is acquired, the client connects to a random Tor node which will act as rendezvous point. The encrypted rendezvous point contact information along with an authentication cookie is sent to one of the introduction points, which forwards it to the hidden service.

The hidden service decrypts the information and connects to the rendezvous point. Afterwards it authenticates itself to the client using the acquired authentication cookie. At this point client and hidden service can communicate anonymously, privately and securely.
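The lookup of the responsible hidden service directory nodes on the distributed hash ring can be sketched as follows. This is a simplified illustration and not Tor's actual algorithm: the real protocol uses Ed25519 key blinding and a more elaborate index construction, whereas the sketch simply hashes the blinded key together with the random value. The hash function, the function names, and the node identifiers are assumptions made for this example.

```python
import hashlib
from bisect import bisect_right
from typing import List


def ring_position(data: bytes) -> int:
    """Map arbitrary bytes to a position on the hash ring."""
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")


def responsible_directories(blinded_key: bytes, shared_random: bytes,
                            directory_ids: List[bytes],
                            replicas: int = 2) -> List[bytes]:
    """Return the `replicas` directory nodes that follow the descriptor index."""
    ring = sorted(directory_ids, key=ring_position)
    positions = [ring_position(node) for node in ring]
    # Assumption: the descriptor index depends on the blinded key and the
    # periodically changing random value, so it moves around the ring over time.
    index = ring_position(blinded_key + shared_random)
    start = bisect_right(positions, index)
    count = min(replicas, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


if __name__ == "__main__":
    nodes = [bytes([i]) * 4 for i in range(8)]  # hypothetical node identifiers
    for period in (b"random-day-1", b"random-day-2"):
        chosen = responsible_directories(b"blinded-key", period, nodes)
        print(period, [node.hex() for node in chosen])
```

Because both the hidden service and the client can compute the same index from public information, they arrive at the same directory nodes without any further coordination.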


Chapter 3

Security assessment of related work

In this chapter basic features and information security services required for document sharing are determined based on the analysis of three use cases. Afterwards documents are categorised into levels of sensitivity based on the information security services they need. Then existing document sharing systems are analysed with regards to the information security services and features they provide. In a final step existing solutions are compared regarding their compliance with the defined sensitivity levels. Observed designs and limitations will form the basis for the design and implementation of the document sharing system described in chapters 4 and 5.

3.1 Document exchange use case analysis and sensitivity categorization

Based on the information security services identified in section 2.1 (security, privacy, anonymity, non-repudiation, legal non-repudiation, sender authenticity, recipient authenticity, and accountability), 2^8 = 256 individual combinations are theoretically possible to form document sensitivity categories. As some information security services, like anonymity and sender authenticity, are mutually exclusive and some, like privacy and security, rely on each other, the number of possible combinations can be reduced to 84.¹ As these are still too many combinations to check for compliance in existing DSS, use cases adapted from the post office analogue are used to extract five document sensitivity categories. An alternative approach, limiting the number of considered use cases directly on the basis of the end-users’ need for information security services, was dismissed as no statistics could be found and end-users don’t have sufficient information security awareness to make a reasonable statement [66].

¹ cf. “sensitivity-categorization-calculation.py” in appendix C.

The simplest use case is the exchange of documents or information with the sole intent that they will be consumed by their recipient(s). This includes sending a postcard to a friend, sharing holiday photos with the family in a sealed envelope, or tipping off a federal agency. In this case data security is fundamental. Therefore, data integrity and data confidentiality need to be met. Both rely on some form of authorization, identity management and identification. Depending on the sensitivity of the information to be exchanged privacy and sender anonymity could also be required. In addition, the authenticity of sender and recipient needs to be verifiable. In case anonymity is required only the authenticity of the recipient needs to be verifiable, as sender authenticity directly conflicts with sender anonymity. It is optional that the exchanged information is buffered at an intermediary until it is fetched by the receiver.

The second use case extends the above-mentioned exchange of information with explicit send and receive confirmations. This is for example important to prove that a contract cancellation or a valuable good reached its destination before a certain deadline, and is equivalent to parcel tracking in the post office analogue. In terms of information security services this behaviour is called non-repudiation and implies that the receiving party can’t be trusted to send a receipt in good faith.

In the last use case a document or information is exchanged with the intent of being altered by the recipient and sent back to the sender. Examples include the exchange and signing of a legally binding contract or collaborative work on a document. Therefore, the information security services of the former use case need to be extended with accountability. Here, especially legal aspects of non-repudiation need to be considered.
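The reduction of the theoretically possible combinations of the eight information security services, mentioned at the beginning of this section, can be illustrated with a short enumeration. The following sketch is not the script from appendix C. The dependency rules it applies are assumptions: only the conflict between anonymity and sender authenticity and the reliance of privacy on security are named in the text, while the other two rules are plausible guesses. Under exactly these assumed rules the enumeration yields 84 combinations; the original script may encode its rules differently.

```python
from itertools import product

SERVICES = (
    "security", "privacy", "anonymity", "non_repudiation",
    "legal_non_repudiation", "sender_authenticity",
    "recipient_authenticity", "accountability",
)


def is_consistent(combo: dict) -> bool:
    """Check one combination against the assumed rules."""
    if combo["privacy"] and not combo["security"]:
        return False  # privacy relies on security (named in the text)
    if combo["anonymity"] and combo["sender_authenticity"]:
        return False  # mutually exclusive (named in the text)
    if combo["anonymity"] and not combo["privacy"]:
        return False  # assumption: anonymity builds on privacy
    if combo["legal_non_repudiation"] and not combo["non_repudiation"]:
        return False  # assumption: legal variant needs the technical one
    return True


def consistent_combinations():
    """Yield every on/off assignment of the services that passes the rules."""
    for flags in product((False, True), repeat=len(SERVICES)):
        combo = dict(zip(SERVICES, flags))
        if is_consistent(combo):
            yield combo


all_count = 2 ** len(SERVICES)
valid_count = sum(1 for _ in consistent_combinations())

if __name__ == "__main__":
    print(all_count, valid_count)
```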

Table 3.1 summarizes for each use case the needed information security services.

                                      Use case
                                      one        two        three
Security                              ✓          ✓          ✓
Privacy                               (✓)        (✓)        (✓)
Anonymity                             (✓)        (✓)        ✗
Non-repudiation                       ✗          ✓          ✓
Legal non-repudiation                 ✗          ✗          ✓
Authenticity of sender / recipient    (✓) / ✓    ✓ / ✓      ✓ / ✓
Accountability                        ✗          ✗          ✓

Table 3.1: Summary of information security services based on the use case analysis. Notation: ✓: required, (✓): requirement depends on the sensitivity of the document, and ✗: not required.

Based on the use case analysis documents will be categorised into five categories as depicted in table 3.2. The first three categories, secure, private, and anonymous, reflect the first use case and only differ in the sensitivity of the data to be transmitted. They are further set in italics to differentiate them from their respective information security services. Categories four and five, tracking and business, reflect use cases two and three, each with security and privacy required.

                                      Document category / level of sensitivity
                                      secure     private    anonymous  tracking   business
Security                              ✓          ✓          ✓          ✓          ✓
Privacy                               ✗          ✓          ✓          ✓          ✓
Anonymity                             ✗          ✗          ✓          ✗          ✗
Non-repudiation                       ✗          ✗          ✗          ✓          ✓
Legal non-repudiation                 ✗          ✗          ✗          ✗          ✓
Authenticity of sender / recipient    ✓ / ✓      ✓ / ✓      ✗ / ✓      ✓ / ✓      ✓ / ✓
Accountability                        ✗          ✗          ✗          ✗          ✓

Table 3.2: Categorization of documents into levels of sensitivity based on their need of information security services. The following document categories are defined: secure, private, anonymous, tracking and business.
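The categorization in table 3.2 can be read as a set of required services per category. A minimal sketch of how a system's provided services could be checked against these requirements is given below. The dictionary keys, the function name, and the representation as Python sets are assumptions for illustration; the example input corresponds to the Box column of table 3.4.

```python
# Required information security services per document category (table 3.2).
REQUIRED = {
    "secure": {"security", "sender_authenticity", "recipient_authenticity"},
    "private": {"security", "privacy", "sender_authenticity",
                "recipient_authenticity"},
    "anonymous": {"security", "privacy", "anonymity",
                  "recipient_authenticity"},
    "tracking": {"security", "privacy", "non_repudiation",
                 "sender_authenticity", "recipient_authenticity"},
    "business": {"security", "privacy", "non_repudiation",
                 "legal_non_repudiation", "sender_authenticity",
                 "recipient_authenticity", "accountability"},
}


def supported_categories(provided: set) -> list:
    """Return the categories whose required services are all provided."""
    return [name for name, needed in REQUIRED.items() if needed <= provided]


# Services marked as supported for Box in table 3.4.
box_services = {"security", "non_repudiation", "sender_authenticity",
                "recipient_authenticity", "accountability"}

if __name__ == "__main__":
    print(supported_categories(box_services))
```

The check only tests whether all required services are present. It does not model that sender authenticity conflicts with anonymity, which would need an additional exclusion rule.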

In the following, different centralized and non-centralized document exchange systems are analysed with regards to the information security services they provide and their compliance with the defined document categories. Implemented key concepts and algorithms are highlighted to form the basis for the design of the document exchange system to be developed.

As many of the analysed systems are closed source software the analysis can’t be based on the source code of the software but will be based on publicly available information like product catalogues and industry certifications. If no information about a given information security property can be found it is assumed to be not supported.

3.2 Centralized file-sharing services

In centralized file-sharing services files are copied via the network to a centralized third-party that stores the files for their customers and takes care of data availability, redundancy, and synchronization. Centralized file hosters usually maintain their own identity management system based on their customers’ e-mail addresses. In business segments self-hosted IMS like Microsoft Active Directory are also supported. Files can usually be shared directly between users of these directories or with external parties via hyperlinks.

In their “Cloud Adoption and Risk Report 2016 Q4” [67] Skyhigh analysed the cloud usage data for more than 30 million users worldwide at companies across all industries. According to their analysis over 20,000 cloud services were used in 2016 and an average company uploads between 9.8 TB and 24.5 TB per month into the cloud. Skyhigh further states that 18.1% of all documents uploaded to a centralized file-sharing or collaboration service contain sensitive information. 43.1% of all uploaded files are shared (mostly in the same organization), and 9.3% of files shared externally contain sensitive information. Information security services vary drastically by cloud service. Only 42.1% of all services state that the customer owns all data uploaded and only 16.6% delete data immediately after account termination. Even worse, only 8.7% commit not to share customer data with third parties like advertising companies, only 8.6% encrypt data at rest, and only 0.8% allow customers to manage their own encryption keys; hence there is considerable room for improvement.

The following analysis will focus on two of the most popular file-sharing services for businesses and consumers [67, p.24ff], Box and Dropbox, one open source file-sharing service, Nextcloud, and one file exchange service specialized in electronic signatures, Citrix RightSignature.

3.2.1 Box

Box [68] is a secure centralized file-sharing platform for businesses. It is developed and maintained by Box, Inc. and offers services for file-sharing, collaboration, and fine-grained access control. Box ensures data confidentiality by encrypting data in transit with TLS and data at rest with AES-256. It further offers a bring-your-own-encryption (BYOE) solution named KeySafe [69, 70] where enterprises can manage their own encryption keys. In this case files are encrypted with an additional customer key and Box, Inc. has no possibility to access the files. Box further supports data integrity, accountability, and technical non-repudiation by using version control and maintaining an append-only log for file access and decryption [71]. Box’s centralized IMS and its support for companywide third-party IMS, like LDAP or ADFS2, guarantee authenticity of sender and recipient. Users are identified by their e-mail addresses. No further identity verification process is mentioned. Therefore, legal non-repudiation isn’t achieved. Box uses N+2 node fault tolerance to prevent data from being lost and promises an availability of 99.9% [72]. Other provided features include document watermarking, two-factor authentication, support for single-sign-on (SSO), in-region storage and data loss prevention (DLP) for mobile devices [73].

Box doesn’t offer privacy or anonymity as the platform needs access to ids and metadata to operate. As Box is a closed-source ecosystem a customer has no possibility to verify if all features are correctly implemented. Consequently, the customer needs to trust Box, Inc. to host its data responsibly. To maintain trust in its services Box complies with accepted standards including ISO 27001, ISO 27018, SOC 1 to 3, FedRAMP and FIPS 140-2. As Box is a centralized solution it is not censorship resistant.

3.2.2 Dropbox

Dropbox [4] is a centralized file-sharing platform for consumers and businesses. It offers services for file-sharing, synchronization, collaboration, and fine-grained access control. Similar to Box, it encrypts data in transit with TLS and data at rest with AES-256. To minimize network overhead data is split into blocks and only changed blocks are synchronized. Metadata and block data are decoupled and stored at different places to increase security. Hashes and redundancy ensure data integrity. Dropbox doesn’t offer clients a way to manage their own encryption keys. As Dropbox Inc. is a U.S. company it can be forced by law to decrypt files for government agencies (cf. [8]). Consequently, security, privacy and anonymity are not achieved. Certificate pinning is used to verify the identity of Dropbox’s servers. In addition, Dropbox uses logging and version control to ensure accountability and technical non-repudiation. It further supports its own identity management system and third-party directories to ensure authenticity. Users are identified by their e-mail addresses. Independent third-party audits, vulnerability reward programs, and compliance with accepted standards² are used to maintain trust in Dropbox services.

Additional services provided include perfect forward secrecy for HTTPS sessions, transparency reports for government data requests, in-region storage for businesses in Europe, two-factor authentication, local network synchronization, disaster recovery plans and practices, support for single-sign-on through a third-party identity provider, data loss prevention, remote wipe support for stolen mobile devices, and development APIs [74].
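The block-based synchronization described for Dropbox, where only changed blocks are transferred, can be sketched with per-block hashes. The block size, the hash function, and the function names below are assumptions chosen for readability; the source does not state how Dropbox implements this.

```python
import hashlib


def block_hashes(data: bytes, block_size: int) -> list:
    """Split data into fixed-size blocks and hash each block."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4) -> list:
    """Indices of blocks in `new` that differ from `old` or did not exist."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [i for i, digest in enumerate(new_hashes)
            if i >= len(old_hashes) or digest != old_hashes[i]]


if __name__ == "__main__":
    before = b"aaaabbbbcccc"
    after = b"aaaaXbbbcccc"
    # Only the second block changed, so only that block needs to be uploaded.
    print(changed_blocks(before, after))
```

With fixed block boundaries an insertion at the start of a file shifts all following blocks, so every block would count as changed. Real systems may use other chunking strategies to avoid this.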

² i.e. ISO 27001, ISO 27017, ISO 27018, SOC 1 to 3, . . .

3.2.3 Nextcloud

Nextcloud [75] is an open-source and secure centralized file-sharing platform that can be hosted on anyone’s own premises. It offers services for collaboration and for file and calendar sharing and synchronization. Nextcloud supports internal as well as external storage services like local hard drives, AWS buckets, Dropbox, or Google Drive. It offers transport security through TLS and optional server-side AES-256 encryption for data at rest. In its next version Nextcloud will also support end-to-end encryption where the client is in charge of its own encryption keys. Here, digital enveloping will be used for the realization [76].

Once a client activates end-to-end encryption an asymmetric key-pair will be created. Nextcloud’s server will act as PKI root authority and issue a public key certificate based on the public key the client uploads. The private key will be symmetrically encrypted with a 12-word mnemonic and uploaded to the server. This 12-word mnemonic needs to be remembered by the client to add new devices to its account. This is for convenience and seems to be less error prone than remembering the whole private key. End-to-end encrypted files will be encrypted with a random AES-256 password. This password will be stored in a metadata file which will be encrypted with the public keys of the persons allowed to access the file.

Next to end-to-end encryption, Nextcloud further supports various security features, including single-sign-on, two-factor authentication and third-party active directory and authentication support. Bug bounty programs, external code audits and reviews, and compliance with industry standards like ISO 27001 clause 14 complete Nextcloud’s security offering [77].

Nextcloud’s open source code basis allows it to be easily extended. In its app store various third-party extensions can be found. These can add, amongst others, support for decentralized storage providers, like Sia (cf. section 3.3.1), or integrate other projects like Draw.io.

As Nextcloud needs access to metadata and user ids to operate it doesn’t offer privacy or anonymity. Unfortunately, no information could be found about non-repudiation and accountability. Authenticity of authors is supported through Nextcloud’s access control mechanisms that identify a user based on her e-mail address.

3.2.4 Citrix RightSignature

Citrix RightSignature [5] is a centralized document-sharing platform to collect legally binding electronic signatures by adhering to the U.S. E-Sign Act and the Uniform Electronic Transactions Act. Citrix uses TLS for transport security and relies on Amazon’s datacentre redundancy and physical access control to protect data from loss and from being accessed in an unauthorized manner. There is no additional encryption of data at rest [78]. Complex hash algorithms are used to guarantee data integrity of both signed documents and their complete audit log of interactions. The complete audit log stores who interacted when and how with the shared document and therefore guarantees accountability.

Citrix RightSignature uses its own proprietary identity authentication system. Instead of relying on certificates or username password combinations parties are identified and authenticated by multiple factors. These include the e-mail address used to open the document (a unique document link is sent to every party via e-mail), a biometric signature analysis (based on unique characteristics related to the speed and timing of a person’s signature given) and the IP address captured. In addition, unique identifiers of a signing party like its face through a webcam, its phone number through a challenge response protocol, or its driver’s license or passport number can be used to verify an identity. All these factors are added to the complete audit log to ensure the authenticity of authors and non-repudiation [79, 80].

Citrix RightSignature doesn’t meet the requirements for security, privacy, and anonymity as, amongst other reasons, data is stored unencrypted in a third-party datacentre and Citrix is a U.S. company and therefore bound to U.S. jurisdiction. In addition, Citrix RightSignature is a closed-source ecosystem and uses a proprietary identity authentication system. It further doesn’t state any compliance with accepted standards. In conclusion, any party needs to trust Citrix RightSignature not to behave maliciously and to have implemented its services correctly.

3.3 Decentralized data storage and file-sharing services

Instead of renting storage from one single provider, users in decentralized data storage services rent storage from each other. Due to their decentralized design decentralized storage solutions don’t rely on a single trusted third-party to be operational and as a result are more censorship resistant than centralized solutions. On the other hand, different mechanisms need to be implemented to guarantee data security, privacy, reliability, and availability when storing files at multiple untrusted third-parties. Especially incentives to participate in the system, ways to deal with failing or inaccessible nodes, and the shift from server-side security to client-side security provision need to be considered.

In the following two decentralized data storage services and one decentralized file-sharing service will be analysed with regards to their design, features and information security services provided.

3.3.1 Sia

Sia [81] is a software developed by Nebulous Inc. for decentralized data storage. Currently it is only useful for archival purposes, but in a future release file-sharing functionality between Sia users will also be implemented [82].

Sia uses a distributed ledger similar to Bitcoin to govern payments for provided storage in its internal currency Siacoin. Contrary to Bitcoin, transactions in Sia can’t execute scripts. Instead they can contain storage contracts, storage proofs and up to 64 KB of arbitrary data.

Storage contracts are agreements between users that hold the amount of data to store, price, and duration. They further include a deposit of both client and host. Storage contracts are updated in revisions. Only the contract with the highest revision number that is signed by every contracting party is valid. This has the benefit that only the first and latest revision need to be committed to the blockchain. All other revisions can be negotiated between the contracting parties off the blockchain as all funds are in escrow. In addition, a Merkle root hash of the stored data is updated in the storage contract with every revision.

The Merkle root hash is one key ingredient for the proof of storage a host has to provide. Before data is uploaded it is split into chunks of 4 MB and each chunk is individually encrypted. The Merkle root hash is built using consecutive segments of 64 bytes as leaves. To prove the storage a random segment is picked based on the block prior to the termination of the contract and the total amount of data stored. The host has to commit that segment and its membership proof to the blockchain as proof of storage. Settled siacoins will be transferred from the client to the host only if the proof of storage was successful. Otherwise deposited siacoins will be sent to contractually agreed penalty addresses.

The 64 KB arbitrary data field of Sia transactions is used by hosts to advertise their storage conditions and prices. Developers can further use it to build applications on top of the Sia blockchain [83, 84, 85].
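The storage proof described above, a Merkle tree over 64-byte segments plus a membership proof for one challenged segment, can be sketched as follows. This is an illustration under assumptions and not Sia's implementation: the hash function (BLAKE2b with 32-byte digests), the leaf and node prefixes, and the handling of odd levels by duplicating the last node are choices made for this example.

```python
import hashlib

SEGMENT_SIZE = 64  # leaf size in bytes, as described in the text


def _hash(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()


def _leaves(data: bytes) -> list:
    # Prefix 0x00 marks leaves, 0x01 marks inner nodes (assumption).
    return [_hash(b"\x00" + data[i:i + SEGMENT_SIZE])
            for i in range(0, len(data), SEGMENT_SIZE)]


def merkle_root_and_proof(data: bytes, index: int):
    """Return the Merkle root and the membership proof for segment `index`."""
    level = _leaves(data)
    if not 0 <= index < len(level):
        raise IndexError("segment index out of range")
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_hash(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(segment: bytes, proof: list, root: bytes) -> bool:
    """Recompute the path from the segment to the root and compare."""
    node = _hash(b"\x00" + segment)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _hash(b"\x01" + pair)
    return node == root


if __name__ == "__main__":
    stored = bytes(range(256)) + b"x" * 64      # five segments
    root, proof = merkle_root_and_proof(stored, 3)
    print(verify(stored[192:256], proof, root))  # host answers honestly
```

The verifier only needs the root hash from the storage contract, the challenged segment, and a proof whose length grows logarithmically with the number of segments.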

Regarding the defined information security services Sia offers data integrity by using Merkle root hashes, data confidentiality by using client-side encryption, and therefore data security. Privacy isn’t achieved. Even though users communicate only with their public Siacoin addresses and therefore achieve pseudonymity on blockchain transactions, no anonymity or proxy network is used to hide their IP address when interacting with hosts or miners. To achieve availability Reed-Solomon error correction codes [86] are used. Per default data is uploaded to 30 different hosts and only 10 are required to restore the data. Sia’s code base is open source and makes the service verifiable. Escrow and storage proofs guarantee that hosts will get paid and clients get refunded if they were cheated. Therefore, no trust between participating parties is required.

As Sia is currently only a data storage service and not a file-sharing service, non-repudiation, authenticity of authors, accountability and collaboration are not applicable.
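The effect of the default redundancy setting, 30 hosts of which any 10 suffice to restore the data, can be made concrete with a short calculation. The sketch below assumes that hosts fail independently and that each host is reachable with the same probability; both are simplifying assumptions, and the uptime values are hypothetical.

```python
from math import comb


def availability(total_hosts: int, required_hosts: int,
                 host_uptime: float) -> float:
    """Probability that at least `required_hosts` of `total_hosts` are up."""
    return sum(
        comb(total_hosts, k)
        * host_uptime ** k
        * (1 - host_uptime) ** (total_hosts - k)
        for k in range(required_hosts, total_hosts + 1)
    )


storage_overhead = 30 / 10     # three times the original data is stored
tolerated_failures = 30 - 10   # up to 20 hosts may be unreachable

if __name__ == "__main__":
    for uptime in (0.5, 0.7, 0.9):  # hypothetical per-host uptimes
        print(uptime, availability(30, 10, uptime))
```

Even with unreliable individual hosts the data remains retrievable with high probability, at the price of storing three times the original amount of data.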

3.3.2 Storj

Storj [87] is a secure decentralized data storage developed by Storj Labs Inc. that doesn’t offer any file-sharing services. Similar to Sia, data is encrypted client-side before it is uploaded. Data is sharded into chunks of fixed size to preserve metadata privacy. Consequently, a host can’t infer from the chunk size what type of information was uploaded. Chunks of files are uploaded to multiple hosts (per default three) to achieve availability in case hosts are inaccessible.

Proofs of retrievability are used to guarantee that a host is still storing a file. Merkle trees and Merkle proofs are used in the implementation. Storj supports complete and partial audits of chunks. Both use a challenge response protocol where a pre-defined salt is sent to the host, who uses this salt to generate a membership proof. To validate a proof of retrievability a client only has to remember the set of salts belonging to a file, its Merkle root hash and the depth of the Merkle tree.

Storj doesn’t rely on distributed ledger technology to store and communicate metadata. Instead distributed hash tables (DHTs) are used. Storj uses and extends the Kademlia protocol [88] for efficient message routing between its users. A user creates an ECDSA keypair equivalent to that of a Bitcoin wallet so that its node id equals its wallet address. These Bitcoin addresses can be used for instant payments for storage contracts, file downloads and proofs of retrievability. However, in its reference implementation Storj will use its own cryptocurrency Storjcoin. For the duration of a contract money is held in escrow in a micropayment channel between client and host.

Proofs of retrievability in Storj are only sent between the contracting parties without non-repudiation. Thus, they can’t be publicly verified. Therefore, penalties for malicious behaviour can’t be enforced as no party can prove the wrong behaviour. It also complicates the development of reliable reputation systems as an important reliability metric isn’t publicly available.

Without the knowledge about who is trustworthy and who isn’t there is no certainty that a host storing important data won’t be inaccessible in the near future. Therefore, a client needs to regularly check through proofs of retrievability whether all chunks are still retrievable from all hosts. In case a host is inaccessible the client needs to initiate a re-upload of the related chunks to restore redundancy. Therefore, it is suggested that every client permanently runs a so-called “bridge service” that takes care of the contract negotiation, audit insurance and verification, payments, and file monitoring [89].
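The salted challenge response audit described above can be sketched in a reduced form. The sketch deliberately leaves out the Merkle tree: instead of remembering only a root hash and the tree depth, the client here keeps one hash of the expected answer per salt. The hash function, the salt length, and all function names are assumptions, so this is a simplified variant and not Storj's actual protocol.

```python
import hashlib
import os


def answer(salt: bytes, chunk: bytes) -> bytes:
    """Host side: prove possession by hashing the salt with the stored chunk."""
    return hashlib.sha256(salt + chunk).digest()


def prepare_audits(chunk: bytes, count: int) -> list:
    """Client side, before upload: create (salt, hashed expected answer) pairs."""
    audits = []
    for _ in range(count):
        salt = os.urandom(16)
        expected = hashlib.sha256(answer(salt, chunk)).digest()
        audits.append((salt, expected))
    return audits


def check(audit: tuple, response: bytes) -> bool:
    """Client side, during an audit: compare the response with the stored hash."""
    _salt, expected = audit
    return hashlib.sha256(response).digest() == expected


if __name__ == "__main__":
    chunk = b"encrypted chunk content"
    audits = prepare_audits(chunk, count=3)
    salt = audits[0][0]
    print(check(audits[0], answer(salt, chunk)))        # honest host
    print(check(audits[0], answer(salt, b"lost data")))  # host lost the chunk
```

Each salt can only be used once, because a host could cache its answer. The number of prepared salts therefore limits how many audits a client can perform without downloading the chunk again.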

Regarding information security services Storj provides data security by achieving data confidentiality through client-side encryption and data integrity through hashing in Merkle trees. Privacy and anonymity aren’t achieved as clients’ IP addresses aren’t anonymized through proxy or anonymization networks. Storj’s open source code basis and its decentralized design mean that participants don’t need to trust any third party.

As Storj is only a data storage service and not a file-sharing service, non-repudiation, authenticity of authors, accountability and collaboration are not applicable.

3.3.3 SecuRES

“SecuRES: Secure Resource Sharing System” [90] is the title of the bachelor’s thesis of D. Svensson and P. Leung. The authors investigated in 2015 to what extent public ledger technology can be used to create a decentralized digital resource sharing system. They especially focussed on non-repudiation, data confidentiality and data integrity, and complemented their work with a prototype implementation.

In their prototype they combined concepts of the Bitcoin blockchain and Storj’s decentralized data storage. Metadata is stored in transactions in the blockchain and files are sharded into chunks, encrypted and uploaded to storage nodes. Transactions contain, amongst others, the sender, recipients, file creation time, file description, and the file decryption key encrypted with the recipient’s public key. In SecuRES’s case both client and storage nodes are responsible for monitoring the file state by using proofs of retrievability.

Due to their limited time and resources the authors weren’t able to specify any IMS. Consequently, users are recognized by their wallet address only. They further had no time to think about incentives to participate in the system.

Nonetheless, their prototype concept shows that decentralized file-sharing based on public ledger technology is possible. Even though the prototype isn’t publicly available the authors achieved non-repudiation by using the blockchain as metadata storage, data integrity by using file hashing in Merkle trees, and data confidentiality by using client-side encryption and digital enveloping of encryption keys. Privacy and anonymity weren’t achieved as not all metadata in the blockchain is encrypted. Everyone can reproduce which user was interacting with whom by looking at the transaction input and output addresses even though wallets are pseudonyms. In addition, IP addresses weren’t anonymized. Therefore, storage nodes can infer from the IP address used for uploading and downloading a chunk who interacted with each other. The authenticity of authors is given as each transaction needs to be signed by its sender, even though no identification mechanism during the registration is specified. The accountability property is achieved through the linkage of transactions of the same file in the blockchain. However, the availability of the system is unknown as important parts such as incentives aren’t implemented and prototype tests weren’t performed.
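How linked metadata transactions can provide accountability for a file may be sketched as follows. The record fields follow the transaction content listed above (sender, recipients, creation time, description, encrypted decryption key). The class name, the `previous` reference, the use of a plain dictionary as ledger, and the placeholder key strings are assumptions for illustration and do not reflect the SecuRES data format.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FileTransaction:
    sender: str
    recipients: Tuple[str, ...]
    created: str
    description: str
    encrypted_key: str               # file key encrypted for the recipient
    previous: Optional[str] = None   # id of the preceding transaction, if any

    def tx_id(self) -> str:
        """Identify the transaction by a hash over all of its fields."""
        payload = json.dumps([self.sender, list(self.recipients), self.created,
                              self.description, self.encrypted_key,
                              self.previous])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def history(ledger: Dict[str, FileTransaction],
            tx_id: Optional[str]) -> List[FileTransaction]:
    """Follow the `previous` links back to the first transaction of a file."""
    chain = []
    while tx_id is not None:
        tx = ledger[tx_id]
        chain.append(tx)
        tx_id = tx.previous
    return chain[::-1]


if __name__ == "__main__":
    first = FileTransaction("alice", ("bob",), "2018-01-01T10:00:00Z",
                            "contract draft", "ENC(key, bob)")
    second = FileTransaction("bob", ("alice",), "2018-01-02T09:30:00Z",
                             "signed contract", "ENC(key, alice)",
                             previous=first.tx_id())
    ledger = {tx.tx_id(): tx for tx in (first, second)}
    for tx in history(ledger, second.tx_id()):
        print(tx.created, tx.sender, tx.description)
```

Because each identifier covers the reference to its predecessor, the recorded order of modifications cannot be changed afterwards without changing all later identifiers.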

3.4 Miscellaneous

This section discusses alternatives for exchanging documents that are neither purely centralized nor purely decentralized file-sharing services.

3.4.1 Secure E-Mail through OpenPGP

OpenPGP [91] is a security software that provides information security services for messages and data files, key management services, and certificate services. Applied to e-mail OpenPGP offers data security by providing data confidentiality through encryption and data integrity through digital signatures.

A prerequisite of using OpenPGP is the successful exchange of public keys and the associated e-mail addresses. Otherwise digital envelopes can’t be encrypted, and digital signatures can’t be validated. Public keys can be acquired from various key servers. Unfortunately, these usually don’t provide any identification. Therefore, identities behind acquired keys still need to be personally verified through an external channel.

As OpenPGP is only able to encrypt the content of e-mails, their metadata and ids are still available to the mail relays responsible for the exchange. Therefore, privacy and anonymity can’t be achieved. A recently discovered attack called efail [92] exploits the fact that OpenPGP only encrypts content, and is able to undermine its confidentiality on certain client implementations. Furthermore, OpenPGP doesn’t implement any mechanisms to force a recipient to confirm the receipt of a message. Consequently, non-repudiation and accountability aren’t feasible.

3.5 Summary

Table 3.3 below summarizes the results of the preceding security assessment and shows that none of the analysed document exchange systems supports the defined sensitivity levels private, anonymous, tracking, and business.

                   Document category / level of sensitivity
                   secure     private    anonymous  tracking   business
Box                ✓          ✗          ✗          ✗          ✗
Dropbox            ✗          ✗          ✗          ✗          ✗
Nextcloud          ✓          ✗          ✗          ✗          ✗
RightSignature     ✗          ✗          ✗          ✗          ✗
Sia                n.a.       n.a.       n.a.       n.a.       n.a.
Storj              n.a.       n.a.       n.a.       n.a.       n.a.
SecuRES            ✓          ✗          ✗          ✗          ✗
Secure E-Mail      ✓          ✗          ✗          ✗          ✗

Table 3.3: Compliance of the analysed document-sharing systems with the document sensitivity levels defined in section 3.1. The following notation is used: ✓: supported, ✗: not supported, and n.a.: not applicable.

Table 3.4 gives a detailed overview of the information security services provided by each analysed system. It further states additional features like censorship resistance or the need to trust a third party.

3.6 Conclusion

To conclude, the initial assumption that current document exchange systems lack information security services was confirmed, which shows the demand for practical alternatives. Based on the analysis of three use cases, five sensitivity categories of documents with different needs for information security services were extracted. Afterwards four centralized document sharing systems, two decentralized data storage services, one decentralized data sharing service, and document exchange through secure e-mail were analysed with regard to their compliance with the defined sensitivity categories.

None of the analysed DSS was able to support all four levels of sensitivity. Indeed, only three of the analysed eight systems were able to support the most basic document category secure. The more advanced categories private, anonymous, tracking, and business weren’t supported by any DSS.

The analysed systems mainly lack support for privacy and anonymity, on which the more sensitive categories are built. The main limitations are the leakage of information through non-anonymized network connections and centralized metadata storage.


                                    Centralized                                Decentralized
Service / feature                   Box    Dropbox  Nextcloud  RightSignature  Sia    Storj  SecuRes  Secure E-Mail
Security                            ✓      ✗        ✓          ✗               ✓      ✓      ✓        ✓
Privacy                             ✗      ✗        ✗          ✗               ✗      ✗      ✗        ✗
Anonymity                           ✗      ✗        ✗          ✗               ✗      ✗      ✗        ✗
Non-repudiation                     ✓      ✓        unk.       ✓               ✗      ✗      ✗        ✗
Legal non-repudiation               ✗      ✗        unk.       ✓               ✗      ✗      ✗        ✗
Authenticity of sender / recipient  ✓/✓    ✓/✓      ✓/✓        ✓/✓             n.a.   n.a.   ✓/✓      ✓/✓
Accountability                      ✓      ✓        unk.       ✓               n.a.   n.a.   ✓        ✗
Information buffering               ✓      ✓        ✓          ✓               ✓      ✓      ✓        ✓
Censorship resistant                ✗      ✗        ✗          ✗               ✓      ✓      ✓        ✗
Verifiable source code              ✗      ✗        ✓          ✗               ✓      ✓      ✗        ✓
Third-party audit                   ✓      ✓        ✓          ✗               ✗      ✗      ✗        ✗
No third-party trust                ✗      ✗        ✗          ✗               ✓      ✓      ✓        ✗

Table 3.4: Overview of the analysed systems with regard to the information security services and features provided. The following notation is used: ✓: supported, ✗: not supported, unk.: unknown, and n.a.: not applicable.


The analysis further showed that it is difficult to precisely define and verify information security services with only little information about the document exchange systems under analysis, especially when it comes to specifying the legal parties that are permitted to access the data.

Another result of the analysis is that legal non-repudiation and identity verification are hard to implement. Only Citrix RightSignature is able to provide legal non-repudiation through its proprietary identification process. All other services achieved either only technical non-repudiation or none. The problem with technical non-repudiation, though, is that an entity can always claim that it lost its identification credentials in a public place and someone else used them in its name (cf. [25, p.111]).

Centralized DSS are characterized by a central trusted arbitrator between the exchanging parties. This arbitrator is mainly responsible for identity management, authentication, redundant file buffering in case an exchanging party is offline, security provision through encryption and hashing, and non-repudiation through a log of actions. The main limitations result from the centralized structure of such systems. Clients rely on correct server-side security provision and implementation and have no possibility of direct control, even though independent third-party audits and certifications try to fill this gap. Dropbox’s transparency reports [8] further show that centralized services are forced, by the jurisdictions in which they operate, to decrypt data for federal agencies. Central arbitrators are a single point of failure that can be used for censorship.

Decentralized systems try to avoid this single point of failure by distributing responsibilities evenly across the system. To this end, new mechanisms for decentralized identity management, authentication, security provision, non-repudiation, and file buffering are implemented. They usually rely on distributed ledger technology, public key encryption, and digital enveloping. Unfortunately, none of the analysed decentralized systems on the market currently offers services for document exchange, only for decentralized storage. The shortcomings discovered in distributed DSS are the identity validation of transaction partners with regard to legal non-repudiation, the redundant buffering of information at untrusted and unreliable intermediaries, and the incentives to participate in the system that are needed to provide a consistent service.


Chapter 4

Prototype specification

In this chapter the prototype specification of a document sharing system for individuals and groups, called docShare, is sketched. It should support the exchange of documents of the sensitivity categories defined in section 3.1: secure, private, anonymous, tracking, and business. docShare’s implementation is covered in chapter 5, and its evaluation in chapter 6.

Based on the analysis of existing document exchange systems in chapter 3, three overall requirements for docShare can be extracted. First, docShare needs to be designed as a distributed system to avoid censorship through single points of failure and to strengthen the overall robustness of the system. Second, to preserve privacy, IP addresses and other identifying attributes need to be masked when interacting with entities that were not explicitly chosen as exchange partners. Third, the source code and design specifications of docShare need to be publicly available for everyone to verify its functionality.

However, due to the dual-use nature of private and anonymous document exchange technology (cf. section 1.6) the source code of docShare won’t be publicly available in an online repository. Instead, interested readers are asked to obtain a copy of the source code attached to appendix C either from KTH or from the author directly.

4.1 Architecture

The architecture of docShare combines two public decentralized services to achieve privacy and anonymity for the exchange of documents. A decentralized anonymization network, e.g. Tor, will be used for anonymous routing of packets and for providing an infrastructure for location hidden services operated by the clients. A distributed ledger will be used for decentralized identity management and will map these hidden services to each user. As depicted in figure 4.1, parties will communicate directly with each other when they are online. There will be no document buffering at an intermediary, even though this feature could easily be added. The IMS is depicted as a smart contract within the Ethereum ecosystem. This is only an example.

Figure 4.1: Overview of docShare’s architecture and functionality. [Figure not reproduced. It shows the docShare clients of Alice, Bob, Carol, and Dave with their hidden services (abc.onion, def.onion, ghi.onion, jkl.onion) and an exit node inside the anonymization network (e.g. Tor), and the IMS as smart contract, maintained by miners, in a public distributed ledger (e.g. Ethereum), which maps identifiers to .onion addresses and public keys. Labelled interactions: IMS entry updates, IMS entry fetching, identification and document exchange setup, private document exchange, and anonymous document exchange.]

Any distributed ledger based IMS could be used in the implementation.

4.1.1 Identity management

The decentralized identity management will be realized, similar to the decentralized public key infrastructure [31], in a public distributed ledger. This has the benefit that every user can create an identity whenever she likes and is directly in charge of her identity entry and its management.

Each entry in the IMS describes a receiving service of the respective entity in the anonymization network, i.e. a .onion address, and includes the entity’s public key for symmetric key exchange and signature verification. Both attributes are bound to a unique user identifier, which could simply be an integer. Every user identifier can be claimed by any entity during registration on a first-come, first-served basis. The IMS should also allow optional fields that can be defined by the client herself. These include her name, telephone number, expiration date of the account, time periods when she will be available to receive documents, or her willingness to accept anonymous documents.

Using a public distributed ledger as identity management system implies that the legal identity of each entity can’t be verified during registration. As a result, a client needs to verify the legal identity behind an identifier before she can be sure with whom she is communicating. This could be done through an external channel, or through video chat where both parties show each other their face, passport, and a proof of recency, e.g. the daily newspaper. Once the legal identity is verified, the binding between identifier and identity can be saved locally on each client’s computer to avoid future legal identity verifications for that entity. Only for the exchange of documents of sensitivity business, where legal non-repudiation is required, does the legal identity verification need to be performed again before the document exchange. Otherwise a transaction partner could claim that someone else used his authentication credentials.

Decentralized identity management systems based on public distributed ledger technology can be realized in two variants. They can either be built as an independent distributed ledger or on top of an existing distributed ledger, e.g. as a Dapp in Ethereum. The realization as an independent distributed ledger has the benefit that all aspects can be tailored to the purpose of the decentralized DSS. Therefore, there won’t be much overhead with regard to resource consumption. Furthermore, the blockchain growth would be proportional to the usage of the system. On the other hand, incentives to participate in the system and the mining of blocks need to be taken care of. Existing distributed ledgers already have an active community to secure their network. On the downside, in this case, the user might be required to download a huge transaction log to get started. Furthermore, the transaction log would not grow linearly with the usage of the DSS, as many other applications are using the distributed ledger as well.

With regard to docShare’s concept and information security services, only the distributed ledger’s properties of a decentralized, Byzantine fault tolerant, append-only log of transactions are relevant. As the prototype implementation is furthermore only seen as a proof of concept, the decision which of the two variants to use will be based on the ease of implementation and the systems available on the market.

4.1.2 Anonymization network

The anonymization network has two responsibilities in docShare. Its first responsibility is to mask network traffic between the docShare clients and the distributed IMS during the registration and update of entries to preserve anonymity. If the IMS is built upon an existing distributed ledger, transaction updates can be fetched without using the anonymization network, as users can’t be correlated to docShare based on their blockchain download pattern: all updates need to be fetched, as by every other user of the ledger, to verify the latest state. If a dedicated ledger is used for the IMS, the anonymization network should be used for fetching new entries, as here a correlation is possible. Its second responsibility is the provision of the infrastructure for location hidden services operated by each docShare client. Using hidden services allows the client to communicate anonymously with other entities. Through hidden services documents can be exchanged anonymously, entities can carefully prove their identity to each other, and private channels outside the anonymization network for faster data transfer can be negotiated.

Tor will be used in the realization as it offers both traffic anonymization and location hidden services, but any other anonymization network that meets both requirements would be sufficient. Location hidden services have another useful feature: they are operational behind NAT gateways. Therefore, configuration changes in the network infrastructure aren’t necessary.

4.1.3 docShare client

The docShare client is the main software that needs to be installed to transfer documents via docShare. It offers interfaces to interact with the anonymization network and the distributed IMS. Consequently, it depends on installed instances of Tor and the distributed IMS client. The docShare client is responsible for the management and verification of the legal identities behind the entries in the IMS, and for the exchange of documents of the sensitivity categories secure, private, anonymous, tracking, and business.

In the realization the docShare client provides hidden services through which anonymous communication between the entities can be established. Access control mechanisms, based on filter lists of allowed and blocked entities, regulate from whom documents and messages are received in order to avoid denial-of-service attempts. The docShare client is further responsible for the client-side encryption of the documents to be exchanged. Digital enveloping will be used to support the exchange of files within groups.

Each docShare client maintains its own database of verified legal identities behind the entities in the IMS. Entities can be verified through external channels, i.e. by personally exchanging the entities’ identifiers in the IMS, or via the docShare client itself. In the latter case both entities establish a video or message-only conference, and within this conference convince each other of their identities. The docShare client is further responsible for the registration and management of the client’s own identity in the distributed IMS.

Details about the communication protocols used are covered in the next section.
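The access control by filter lists described for the docShare client could look roughly like the following minimal sketch. The class name, the field names, the `accept_anonymous` flag, and the rule that a block entry overrides an allow entry are assumptions made for illustration; the specification only states that lists of allowed and blocked entities are used.

```python
from dataclasses import dataclass, field


@dataclass
class AccessFilter:
    """Decides whether an incoming request may be processed.

    Identifiers are the integer user identifiers from the IMS. All names
    and the precedence rule below are assumptions of this sketch.
    """
    allowed: set = field(default_factory=set)
    blocked: set = field(default_factory=set)
    accept_anonymous: bool = False

    def permits(self, sender_id=None):
        # Anonymous senders carry no identifier at all.
        if sender_id is None:
            return self.accept_anonymous
        # A block entry always wins over an allow entry.
        if sender_id in self.blocked:
            return False
        return sender_id in self.allowed


# Example: identifier 71 is both allowed and blocked, so it is rejected.
acl = AccessFilter(allowed={24, 71}, blocked={71})
```

Rejecting every identifier that is not explicitly allowed keeps unknown parties from occupying the hidden service, which matches the stated goal of avoiding denial-of-service attempts.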

4.2 Protocols

The following communication protocols are used to provide docShare’s functionality. The communication partners are Alice, the first client, Bob, the second client, Carol, the third client, and the IMS.

4.2.1 User registration in the IMS

To register a new entity in the IMS, the following protocol should be used between Alice and the IMS. If a dedicated IMS is used, the download of new transactions from the IMS is performed through the anonymization network.

1. Alice creates an asymmetric key pair that can be used for encryption and digital signatures.

2. Alice configures her anonymization network and IMS client software and caches her hidden service identifier.

3. Alice downloads all updates from the IMS and reproduces its latest state.

4. Alice creates an IMS entry with a random identifier that hasn’t occurred before and signs it with her private key. This entry must comprise the random identifier, Alice’s public key, and her hidden service identifier. Optional fields, like Alice’s name and time periods of availability, can be added.

5. Alice transfers the signed entry via the anonymization network to the IMS.

6. The IMS receives Alice’s signed entry, validates its signature, and checks that the claimed identifier hasn’t been used before. If both requirements are met, her entry is added to the next transaction. Otherwise, Alice’s request is dropped.

7. Alice waits for and downloads the next six updates from the IMS and checks whether her entry is included in the latest state. If not, steps 3 to 7 are repeated.

In this protocol steps 1 and 2 could be executed in parallel.
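The first-come, first-served claiming of identifiers in steps 4 and 6 can be illustrated with a small in-memory model. This is only a sketch: `IdentityRegistry`, `fresh_identifier`, the message layout, and the injected `verify_signature` callable are hypothetical, since the specification fixes neither a signature scheme nor an entry encoding, and a real IMS would live in a distributed ledger rather than in a dictionary.

```python
import secrets


class IdentityRegistry:
    """In-memory stand-in for the IMS; all names here are hypothetical."""

    def __init__(self, verify_signature):
        # verify_signature(public_key, message, signature) -> bool is
        # injected, because the specification does not fix a signature scheme.
        self._verify = verify_signature
        self._entries = {}

    def register(self, identifier, public_key, onion, signature, optional=None):
        message = f"{identifier}|{public_key}|{onion}".encode()
        if not self._verify(public_key, message, signature):
            return False          # step 6: invalid signature, request dropped
        if identifier in self._entries:
            return False          # step 6: identifier already claimed
        self._entries[identifier] = {
            "public_key": public_key,
            "onion": onion,
            "optional": dict(optional or {}),
        }
        return True

    def lookup(self, identifier):
        return self._entries.get(identifier)


def fresh_identifier(registry, bits=64):
    """Step 4: draw random identifiers until an unclaimed one is found."""
    while True:
        candidate = secrets.randbits(bits)
        if registry.lookup(candidate) is None:
            return candidate


# Toy verifier for demonstration only: accepts a fixed marker as signature.
registry = IdentityRegistry(lambda pk, msg, sig: sig == b"valid")
alice_id = fresh_identifier(registry)
first = registry.register(alice_id, "pkA", "abc.onion", b"valid")
second = registry.register(alice_id, "pkM", "xyz.onion", b"valid")
```

The second registration fails because the identifier is already taken, which is why Alice has to check the resulting state in step 7 and retry with a new identifier if somebody else claimed hers first.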

4.2.2 Legal identity verification of an entity in the IMS

The legal identity verification of an entity in the IMS can be done in various ways and mainly depends on how well both entities know and trust each other. The identity verification process needs to protect both entities from falsely identifying the other party and from leaking their own identity information in case the opposing party is malicious. Two basic cases can be differentiated. In the first case Alice and Bob know each other and have an external channel through which they can communicate to verify their identities. In the second case Alice and Bob might know each other but have no external channel from the beginning.

It should be noted that neither form of identification can be used for legal non-repudiation. For legal non-repudiation the identification also needs to be stored in a tamper-proof data structure by both entities. More information can be found in section 4.2.6.

Verification through an external channel

The following protocol should be used by Alice and Bob to verify each other’s legal identity with the help of an external channel. This external channel could be a personal meeting between the two parties or a phone call in which both parties can identify each other by voice.

1. Alice and Bob communicate their IMS identifier and a one-time secret (per person) to each other through the external channel.

2. Alice and Bob both whitelist each other’s identifier in their docShare client to allow message chatting with the other party.

3. Alice initiates a chat with Bob through Bob’s hidden service, encrypted with the public key listed in Bob’s IMS entry.

4. Alice and Bob verify their IMS identifiers by telling each other the other party’s one-time secret.

5. If the verification was successful, Alice marks Bob’s identifier as verified on her computer and Bob marks Alice’s identifier as verified on his computer.
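The handling of the one-time secrets in steps 1, 4, and 5 might be sketched as follows. The helper names, the secret length, and the choice to discard a secret after the first attempt are assumptions of this sketch; the protocol itself only requires that each party can check the secret it handed over through the external channel.

```python
import hmac
import secrets


def new_one_time_secret(nbytes=16):
    """Create a secret to hand over through the external channel (step 1)."""
    return secrets.token_hex(nbytes)


def secret_matches(expected, presented):
    """Compare in constant time so timing reveals nothing about the secret."""
    return hmac.compare_digest(expected.encode(), presented.encode())


class PendingVerification:
    """Tracks one partner until the secret was echoed back (steps 4 and 5)."""

    def __init__(self, partner_id, secret):
        self.partner_id = partner_id
        self._secret = secret
        self.verified = False

    def confirm(self, presented):
        # The secret is single-use: it is discarded after the first attempt.
        if self._secret is not None and secret_matches(self._secret, presented):
            self.verified = True
        self._secret = None
        return self.verified
```

Alice would create a `PendingVerification` for Bob’s identifier with the secret she gave him in person, and mark Bob as verified locally once `confirm` returns `True` for the value received in the encrypted chat.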

Verification without external channel

Without an external possibility to verify each other’s identity, Alice and Bob have to verify each other through docShare. This raises the problem that a malicious entity could trick Alice into revealing her identity without revealing his own. In this case he would learn that Alice actively uses docShare, and would learn her hidden service identifier. Therefore, the author encourages users to be extremely cautious when verifying their own identity with this method. The following protocol should be used by Alice and Bob.

1. Alice finds Bob’s identifier in the IMS, whitelists him for message chatting, and sends Bob an encrypted request through Bob’s hidden service stating that she would like to identify herself to Bob.

2. Bob, who doesn’t know Alice yet, can either accept her offer or not, based on his own trust preferences and Alice’s entry in the IMS. If Bob accepts, he initiates an encrypted chat with Alice through Alice’s hidden service. Otherwise the protocol terminates.

3. During the chat Alice convinces Bob to a certain degree of being Alice, and Bob convinces Alice of being Bob. If both partners are confident enough, they initiate a video conference to remove the last doubts about their identities. It is recommended that each party shows the other a proof of recency, e.g. today’s newspaper, to assure the other party that it is not viewing a recording.

4. If the verification was successful, Alice marks Bob’s identifier as verified on her computer and Bob marks Alice’s identifier as verified on his computer.

4.2.3 Data exchange with compliance to private

The following protocol describes how Alice can privately share a document with Bob and Carol. It assumes that Alice successfully verified her identity to Bob and Carol and vice versa.

1. Alice digitally envelopes the digitally signed document to share, so that only Bob and Carol can decrypt it and verify its integrity and authenticity.

2. Alice creates a random uniform resource locator (URL) for the digitally enveloped document and publishes the document under this URL on her hidden service or, if available, on a private service outside the anonymization network.

3. Alice sends the URL to Bob and Carol through their hidden services.

4. Bob and Carol fetch the enveloped document from the URL, decrypt it, and verify its integrity and authenticity.

It should be noted that documents exchanged in compliance with private also meet the requirements for secure.
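The random URL from step 2 could be generated as in the following sketch. The function name, the `http` scheme, the path layout, and the token length are assumptions; the protocol only demands that the URL is random, so that nobody who was not sent the URL can guess it.

```python
import secrets


def random_document_url(onion_host, nbytes=32):
    """Build an unguessable URL on the sender's hidden service.

    token_urlsafe(32) draws 32 random bytes, i.e. 256 bits of entropy.
    """
    token = secrets.token_urlsafe(nbytes)
    return f"http://{onion_host}/{token}"


# Example with the placeholder address used in figure 4.1.
url = random_document_url("abc.onion")
```

Because the enveloped document is encrypted for Bob and Carol anyway, the random URL is a second line of defence that mainly hides the existence of the document from other visitors of the hidden service.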

4.2.4 Data exchange with compliance to anonymous

The following protocol describes how Alice can anonymously share a document with Bob. It assumes that Bob allows the anonymous receipt of documents through his hidden service and that Alice successfully verified Bob’s identity, e.g. through a trusted third party.

1. Alice hashes the document to share and digitally envelopes the hash and the document, so that only Bob can decrypt them.

2. Alice uploads the digitally enveloped document to Bob’s hidden service.

3. Bob decrypts the digitally enveloped document and verifies its integrity by using the included hash.
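The integrity check in steps 1 and 3 can be sketched independently of the enveloping itself. The example below assumes SHA-256 as hash function and a simple "hash followed by document" layout; neither is prescribed by the specification, and the encryption for Bob is deliberately left out.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def pack_with_hash(document: bytes) -> bytes:
    """Step 1 (before enveloping): prepend the document's SHA-256 hash."""
    return hashlib.sha256(document).digest() + document


def unpack_and_verify(payload: bytes) -> bytes:
    """Step 3 (after decrypting): split the payload and check the hash."""
    digest, document = payload[:DIGEST_SIZE], payload[DIGEST_SIZE:]
    if not hmac.compare_digest(digest, hashlib.sha256(document).digest()):
        raise ValueError("document does not match the included hash")
    return document
```

Unlike in the private case, no signature is involved here: the hash protects the document against corruption but, by design, proves nothing about the sender.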

4.2.5 Data exchange with compliance to tracking

There are principally three ways to achieve non-repudiation, which is the main requirement for tracking. Non-repudiation can be achieved through a trusted arbitrator who takes an active role in the document exchange, through direct document exchange via oblivious transfer [25, p.166ff], and through optimistic document exchange where an arbitrator is only involved if one party is dishonest.

Non-repudiation through a trusted active arbitrator

The ISO/IEC 13888 standards [93, 94, 95] provide a straightforward solution to non-repudiation. They involve a trusted third party which receives the expectations of both communication parties, the document to exchange, and its receipt. If both document and receipt are received by the third party and the expectations are met, they are forwarded to the sender and receiver.

There are several drawbacks to this solution. First, data collected at the third party could deanonymize the communication partners and therefore violate privacy and anonymity. Second, the third party is always involved in the exchange, even if both parties are honest. Third, involving a third party can lead to performance bottlenecks if its computational resources aren’t sufficient. Fourth, trusted third parties are an easy way to introduce censorship into a system. Therefore, trusted third parties won’t be used in docShare to ensure non-repudiation.

Non-repudiation through oblivious transfer

Another way to exchange documents for receipts was shown by Even et al. [96]. The authors describe a protocol for certified mail that relies on oblivious transfer to keep both parties honest and assumes that both parties have approximately the same computational resources. In oblivious transfer a sender sends one out of two recognizable messages to a recipient but doesn’t know which message is received. The document exchange protocol works as follows.

1. Alice encrypts the document d with a one-time key kd and sends the encrypted document kd(d) to Bob.

2. Alice creates n key pairs kpAn(k1,k2) where k1 is chosen randomly and k2 is the XOR of kd and k1.

3. Alice creates a dummy document dd, copies it 2n-1 times so that 2n copies exist, encrypts each copy with a different key of the n key pairs, and sends them to Bob.

4. Bob generates n key pairs kpBn(k1,k2) and n unique receipts, each with a left half and a right half.

5. Bob encrypts the receipts with the n key pairs, where k1 is used for the left half and k2 for the right half, and sends them to Alice.

6. Alice and Bob send each other one of the two keys of each of the n key pairs through oblivious transfer, decrypt the halves they can, and make sure that they are valid.

7. Alice and Bob send each other the first bits of all 2n keys and verify that the first bits of the n already known keys match.

8. Step 7 is repeated for the second bits, third bits, etc. until all keys have been transferred.

9. Alice and Bob decrypt the remaining halves of the received messages. Alice has n receipts from Bob, and Bob can XOR any key pair to get the decryption key for d.

10. Alice and Bob exchange the private keys used during the oblivious transfer and verify that the other party did not cheat.
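The key construction from step 2 and the recovery from step 9 rest on a simple XOR identity: since k2 = kd XOR k1, it follows that k1 XOR k2 = kd for every pair. A minimal sketch, with hypothetical function names and an assumed key length of 16 bytes:

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(kd: bytes, n: int):
    """Step 2: create n pairs (k1, k2) with k1 random and k2 = kd XOR k1."""
    pairs = []
    for _ in range(n):
        k1 = secrets.token_bytes(len(kd))
        pairs.append((k1, xor_bytes(kd, k1)))
    return pairs


def recover_key(pair) -> bytes:
    """Step 9: XOR both halves of any one pair to obtain kd."""
    k1, k2 = pair
    return xor_bytes(k1, k2)


kd = secrets.token_bytes(16)
pairs = split_key(kd, n=8)
```

A single half of a pair is a uniformly random string and reveals nothing about kd, which is why Bob learns the document key only once he holds both halves of at least one pair. The sketch covers only this key arithmetic, not the oblivious transfer or the bitwise release in steps 6 to 8.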

Alice could cheat in step 2 and use a different key kx instead of kd to generate k2. Bob would still be able to decrypt the dummy document, but has no possibility to detect, until step 9, that Alice has cheated. Therefore, his receipt is only one part of the complete receipt. The other part is Alice’s proof that each of the key pairs she sent to Bob yields kd.

Unfortunately, Even et al. don’t describe what such a proof should look like. In case of a conflict an arbitrator would probably demand that Bob shows all of Alice’s key pairs to validate that Alice sent the correct keys. This requires Bob to store the received key pairs indefinitely. If Bob loses the received key pairs, accidentally or on purpose, there is no way to prove whether Alice’s receipt is valid or not. Nonetheless, as Alice still has a (valid) receipt from Bob, Bob can’t deny that a transaction happened between them.

Due to the uncertainty about how Alice can prove that each sent key pair yields kd, oblivious transfer won’t be used in docShare to ensure non-repudiation.

Non-repudiation through optimistic protocols

Optimistic protocols for non-repudiation only require a trusted arbitrator in case one of the communication parties is dishonest. In the usual case that both communication partners are honest, the communication takes place directly between them. This has two benefits compared to non-repudiation through a trusted active arbitrator. First, in the usual honest case no additional metadata that could violate privacy or anonymity is stored at any third party. This also brings performance benefits and avoids a single point of failure. Second, in the case of a dishonest party a distributed ledger can be used as trusted third party. Only one of the communication partners needs to actively interact with the distributed ledger. The other just needs to fetch updates. Therefore, privacy requirements can be met easily.

To address the privacy requirements of tracking, docShare will use an adapted version of the optimistic non-repudiation protocol described in [97] and [98]. A distributed ledger will be used as trusted arbitrator to harden docShare against censorship. Similar to the IMS, the trusted arbitrator can be realized as an independent distributed ledger or built upon an existing distributed ledger. Its actual realization in the prototype will be based on the ease of implementation and the systems available on the market.


The adapted optimistic protocol for non-repudiation works as follows:

1. Alice creates a one-time key kd and uses it to encrypt the document d.

2. Alice and Bob negotiate a temporary transport key kt to encrypt their message transfer.

3. Alice sends a signed message, encrypted with kt, to Bob including kd(d), its hash h(kd(d)), d’s description, and a deadline td by which Alice expects Bob’s receipt. The message is equivalent to the following question: “Bob, would you like to receive this document with the following description and commit to sign its receipt by deadline td?”

4. Bob decrypts the message, verifies Alice’s signature, and reads d’s description. If he doesn’t want to receive the document, the protocol terminates. Otherwise Bob sends a signed message, encrypted with kt, back to Alice including “yes”, d’s description, h(kd(d)), td, a fallback reference rf, a fallback symmetric encryption key kf, and a fallback deadline tf. The message is equivalent to the following statement: “Alice, yes, I would like to receive the described document and commit to sign its receipt by deadline td. If I don’t sign the receipt by td, I acknowledge the receipt of d if you upload the document’s decryption key kd, encrypted with kf, under the reference rf to the distributed ledger by tf.”

5. Alice decrypts the message, verifies Bob’s signature, and checks the message’s validity. This includes that the proper document is cited and that tf lies reasonably far in the future. If all requirements are met, Alice sends a signed answer, encrypted with kt, to Bob including d’s decryption key kd and h(kd(d)).

6. Bob decrypts the message, verifies Alice’s signature, and uses kd to decrypt the document. Afterwards he sends a signed receipt, encrypted with kt, back to Alice. It contains the word “receipt” and a copy of the message received in step 5.

7. If Alice hasn’t received a receipt from Bob by td, she uploads the document’s decryption key kd, encrypted with kf, under the reference rf to the distributed ledger.
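The fallback in step 7 and the commitment Bob made in step 4 can be modelled with a few lines of code. Everything in this sketch is an assumption made for illustration: the `Ledger` class is a plain dictionary standing in for a distributed ledger, time is represented by integers, and the encryption of kd with kf is replaced by a bytewise XOR, which is only acceptable here because it is a toy stand-in and not a recommendation.

```python
def xor_bytes(a, b):
    """Toy stand-in for symmetric encryption with the fallback key kf."""
    return bytes(x ^ y for x, y in zip(a, b))


class Ledger:
    """Append-only key-value log standing in for the distributed ledger."""

    def __init__(self):
        self._log = {}

    def upload(self, reference, value, now):
        # First write wins; later writes to the same reference are ignored.
        self._log.setdefault(reference, (value, now))

    def fetch(self, reference):
        return self._log.get(reference)


def alice_fallback(ledger, receipt, now, td, rf, kf, kd):
    """Step 7: publish kf(kd) under rf once td passed without a receipt."""
    if receipt is None and now > td:
        ledger.upload(rf, xor_bytes(kd, kf), now)
        return True
    return False


def receipt_acknowledged(ledger, rf, tf):
    """Bob's commitment from step 4 holds if the entry appeared by tf."""
    entry = ledger.fetch(rf)
    return entry is not None and entry[1] <= tf
```

Only Alice writes to the ledger, and the entry is meaningless to anyone who does not know rf and kf. Bob, who chose both values, can fetch the entry and decrypt kd, so he cannot claim that he never obtained the means to read the document.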

4.2.6 Data exchange with compliance to business

Business has two additional requirements compared to tracking, namely legal non-repudiation and accountability.

Accountability can be achieved by versioning the changes to every document. Documents could reference the hash of their predecessor in their header and thus build a hash-linked list reflecting the changes made in every iteration.

To achieve legal non-repudiation both communication partners need to legally identify each other before each transfer, map the identification to each other’s public keys, and store that identification process in a tamper-proof data structure in case a dispute arises. As an additional requirement, it mustn’t be possible to exploit the combination of a private key and any identification recording to generate new valid identification recordings. Otherwise a malicious party could publish its private key in the public domain and claim that another party, which had already obtained a valid identification recording from a former exchange, identified itself as the malicious party. As legal non-repudiation depends on the legal jurisdiction both exchange partners agree to operate in, it needs to obey that jurisdiction’s guidelines and data privacy regulations. This makes it a very interesting and challenging problem to solve.

Unfortunately, the author hasn’t found a solution to the problem yet. Therefore, contrary to the initial intention, data exchange with compliance to business won’t be supported in docShare.
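Although business isn’t supported, the hash-linked versioning proposed above for accountability is easy to illustrate. The following sketch assumes SHA-256, a fixed placeholder value for the predecessor of the first version, and hypothetical names throughout; it is not taken from the prototype.

```python
import hashlib
from dataclasses import dataclass

GENESIS = "0" * 64  # predecessor value of the first version (an assumption)


@dataclass(frozen=True)
class Version:
    prev_hash: str   # hash of the preceding version, kept in the "header"
    content: bytes

    def digest(self) -> str:
        return hashlib.sha256(self.prev_hash.encode() + self.content).hexdigest()


def append_version(history, content):
    """Add a new version that references the hash of the current latest one."""
    prev = history[-1].digest() if history else GENESIS
    history.append(Version(prev, content))


def history_is_consistent(history):
    """Check that every version references the hash of its predecessor."""
    expected = GENESIS
    for version in history:
        if version.prev_hash != expected:
            return False
        expected = version.digest()
    return True
```

Because each hash covers the predecessor’s hash as well as the content, replacing any earlier version invalidates the references of all later versions, so silent changes to the history become detectable.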


Chapter 5

Prototype implementation

In this chapter a reference implementation of docShare, following its design specification in chapter 4, is described. The accompanying source code and the virtual machines of the test environment can be found in appendix C. The installation and usage guide in appendix B has more information about the usage of the prototype and the software environment needed to reproduce this research.

5.1 Limitations to the specification

Due to its purpose as a rudimentary proof of concept, not all features mentioned in the specification are implemented. Only those necessary to validate the initial assumption are implemented, namely that the exchange of documents of different levels of sensitivity, as described in section 3.1, is possible in a decentralized and censorship-resistant fashion using distributed ledger technology.

Therefore, the implementation is subject to the following limitations:

• Only one mode of legal identity verification, through an external channel, is supported. Legal identity verification without an external channel using chat and video conference isn’t implemented (cf. section 4.2.2).

• Only one channel of document sharing, through the anonymization network, is supported. Faster private document sharing through a public peer-to-peer connection isn’t implemented (cf. section 4.2.3).

5.2 Public identity management service

The public identity management service is realized as an Ethereum smart contract. The contract serves as a key-value store to which users can upload their identity information. An identity consists of the Tor hidden service address used to operate the anonymous docShare endpoints, the public RSA key used for digital enveloping during document exchange, and optional fields a user can define for herself. Every time a user creates a new identity, a unique identity identifier in the form of an integer is assigned by the smart contract.

A web frontend was created to interact with the smart contract. As shown in figure 5.1, users can use the frontend to look up identity information, update their own identity information, and permanently deactivate their identity in case the Ethereum wallet used to create the identity was compromised.

Figure 5.1: Screenshot of docShare’s public identity management system’s web-frontend.
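The behaviour of the identity contract can be modelled in a few lines of Python. This is only an illustrative in-memory model, not the smart contract from appendix C. The class and field names (`IdentityRegistry`, `onion_address`, `rsa_public_key`, `extra`), the start of the id counter at 0, and the rule that only the creating wallet may change an identity are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    owner: str                 # wallet address that created the identity (assumed access rule)
    onion_address: str         # Tor hidden service address of the docShare endpoints
    rsa_public_key: str        # public RSA key used for digital enveloping
    extra: dict = field(default_factory=dict)  # optional user-defined fields
    active: bool = True


class IdentityRegistry:
    """In-memory stand-in for the key value store kept by the smart contract."""

    def __init__(self):
        self._identities = {}
        self._next_id = 0      # assumption: ids are consecutive integers

    def create(self, owner, onion_address, rsa_public_key, **extra):
        identity_id = self._next_id
        self._next_id += 1
        self._identities[identity_id] = Identity(
            owner, onion_address, rsa_public_key, dict(extra)
        )
        return identity_id

    def lookup(self, identity_id):
        return self._identities[identity_id]

    def update(self, caller, identity_id, **changes):
        identity = self._owned_active(caller, identity_id)
        for name, value in changes.items():
            if name not in ("onion_address", "rsa_public_key", "extra"):
                raise KeyError(name)
            setattr(identity, name, value)

    def deactivate(self, caller, identity_id):
        # Deactivation is permanent: there is no method to reactivate.
        self._owned_active(caller, identity_id).active = False

    def _owned_active(self, caller, identity_id):
        identity = self._identities[identity_id]
        if identity.owner != caller:
            raise PermissionError("only the creating wallet may change an identity")
        if not identity.active:
            raise ValueError("identity was permanently deactivated")
        return identity
```

In this model a deactivated identity can still be looked up, so partners can see that it must no longer be used.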

5.3 Key value storage for tracking

The key value store that holds encrypted decryption keys in case a sender did not receive a receipt for documents of sensitivity tracking (cf. section 4.2.5) is also realized as an Ethereum smart contract. The contract has procedures to store arbitrary string tuples in Ethereum’s blockchain and to query these tuples using the first string as key.

A web-frontend (cf. figure 5.2) was built to access the information stored in the key value store. It also shows the block timestamp of a key value tuple’s creation.

Figure 5.2: Screenshot of docShare’s key value store’s web-frontend.
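The store and query procedures can be pictured with a small in-memory model. It is a sketch under assumptions, not the contract itself: the class name `KeyValueLedger` is hypothetical, the block timestamp is imitated by a local clock, and the rule that a key can be written only once is an assumption that the text does not state.

```python
import time


class KeyValueLedger:
    """In-memory stand-in for the key value store contract."""

    def __init__(self, clock=time.time):
        self._entries = {}
        self._clock = clock    # stands in for the block timestamp

    def store(self, key, value):
        # Assumption: a key is written once and never overwritten.
        if key in self._entries:
            raise KeyError("key already used")
        self._entries[key] = (value, int(self._clock()))

    def query(self, key):
        """Return the stored value and the timestamp of its creation."""
        return self._entries[key]
```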

5.4 Anonymization network

Tor is used to mask docShare’s connection metadata and to provide the infrastructure for location hidden services. To make use of Tor’s location hidden services, Tor needs to be configured to forward docShare’s service endpoints, which listen on ports 2342, 8888, and 9999, into the Tor network.

The Tor software is further responsible for the end-to-end encryption of the communication channels of the provided hidden services by following Tor’s rendezvous protocol.
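The forwarding of the three ports can be sketched as a small helper that renders the corresponding lines of a Tor configuration file. `HiddenServiceDir` and `HiddenServicePort` are standard torrc options. The directory path, the use of a single hidden service for all three ports, and the mapping of each virtual port to the same local port are assumptions of this sketch.

```python
def render_torrc(hidden_service_dir, ports=(2342, 8888, 9999)):
    """Return torrc lines that publish local TCP ports as one hidden service."""
    lines = [f"HiddenServiceDir {hidden_service_dir}"]
    for port in ports:
        # Virtual port on the .onion address -> same port on localhost.
        lines.append(f"HiddenServicePort {port} 127.0.0.1:{port}")
    return "\n".join(lines) + "\n"


# Hypothetical directory; Tor stores the service's keys and hostname there.
print(render_torrc("/var/lib/tor/docshare/"))
```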

5.5 docShare client

The docShare client is the main program a user interacts with to exchange documents of different levels of sensitivity. It is written in Python 3 and consists of four main components:

• a SQLite3 database for local metadata storage,

• three network daemons that form the Tor endpoints for network communication,

• 13 terminal scripts to share documents through the Tor endpoints, to visualize information, and to manage the identity verification, and

• the docShare library that bundles common configurations, cryptographic functions, and interfaces to interact with the Ethereum smart contracts and the local database.

5.5.1 SQLite3 metadata database

docShare’s SQLite3 metadata storage is responsible for storing information regarding partner verification, shared documents, and received documents. It consists of six tables: “partners”, “shared”, “sharedConfirmation”, “sharedMapping”, “received”, and “receivedConfirmation”.

The “partners” table stores additional information about an identity of the public identity management system. Entries are linked by their unique id. A user can add a name for the partner, the information whether the partner is verified for document exchange, and the authentication token exchanged through the legal identity verification process.

The “shared” and “sharedMapping” tables hold information regarding documents exchanged with a single partner or with groups of partners, such as the document locator. If documents of sensitivity tracking are exchanged, additional receipt metadata, such as the actual receipt or the fallback deadline, is stored in the “sharedConfirmation” table.

The “received” table stores general metadata about received share offerings, i.e. the sender, the document locator, and whether the document was downloaded. If a document of sensitivity tracking is received, additional metadata, such as the fallback reference, is stored in the “receivedConfirmation” table.
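A minimal sketch of how such a database could be created with Python’s built-in `sqlite3` module is given below. Only the six table names are taken from the description above. All column names and types are assumptions chosen to match the described content and may differ from the schema in appendix C.

```python
import sqlite3

# Column names are hypothetical; only the table names come from the text.
SCHEMA = """
CREATE TABLE partners (
    id INTEGER PRIMARY KEY,              -- identity id from the public IMS
    name TEXT,
    verified INTEGER NOT NULL DEFAULT 0,
    auth_token TEXT
);
CREATE TABLE shared (
    id INTEGER PRIMARY KEY,
    description TEXT,
    locator TEXT NOT NULL
);
CREATE TABLE sharedMapping (
    share_id INTEGER NOT NULL REFERENCES shared(id),
    partner_id INTEGER NOT NULL REFERENCES partners(id)
);
CREATE TABLE sharedConfirmation (
    share_id INTEGER NOT NULL REFERENCES shared(id),
    partner_id INTEGER NOT NULL REFERENCES partners(id),
    receipt TEXT,
    fallback_deadline INTEGER
);
CREATE TABLE received (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES partners(id),
    locator TEXT NOT NULL,
    downloaded INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE receivedConfirmation (
    received_id INTEGER NOT NULL REFERENCES received(id),
    fallback_reference TEXT
);
"""


def create_database(path=":memory:"):
    """Create the six metadata tables and return the open connection."""
    connection = sqlite3.connect(path)
    connection.executescript(SCHEMA)
    return connection


def table_names(connection):
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
    return {name for (name,) in rows}
```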

5.5.2 docShare library

The docShare library bundles common configurations, cryptographic functions, and interfaces to interact with the Ethereum smart contracts and the local database. It is referenced by the docShare network services and terminal scripts to keep the code reusable and maintainable. It further defines the message format for the network communication and the format for digital enveloping.

Message format specification

As illustrated in figure 5.3, messages in docShare have a defined format. They consist of the sender identifier, the communication protocol used, the actual payload data to transmit, a timestamp, and the sender’s digital signature over the first four fields.

| sender | protocol | data | time | signature |
|---|---|---|---|---|
| 8 bytes | 2 bytes | 1024 bytes | 8 bytes | 384 bytes |

Total message length: 1426 bytes

Figure 5.3: Message format specification used in docShare communications.


The timestamp is used to prevent replay attacks, and the signature guarantees data integrity and authenticity. Each message has a fixed length of 1426 bytes to avoid side channel attacks that analyse differences in the size of transmitted packets.

docShare defines nine message protocols using this format. Their details can be found in table 5.1.

| id | name | purpose | data field content |
|---|---|---|---|
| 0 | partner verification request | authenticate yourself to your partner for document exchange | the partner’s one-time secret exchanged through the external channel during verification |
| 1 | resource locator request | request a list of all resource locators your partner shares with you | none |
| 2 | resource locator offer | transmit up to 4 resource locators shared with this partner | up to 4 resource locators |
| 3 | confirmed resource locator offer | transmit up to 4 resource locators shared with this partner that require a receive confirmation | up to 4 resource locators |
| 4 | resource locator summary | inform the partner about the number of sent/received shared resource locator offers | number of transmitted/received shared resource locators |
| 5 | accept resource locator offer | accept to confirm the receipt of a shared resource that requires confirmation | hashsum of the encrypted archive, expiry time of the offer, fallback reference, fallback encryption key, fallback deadline |
| 6 | resource locator decryption | share the decryption key of a shared resource that requires confirmation | hashsum of the encrypted archive, expiry time of the offer, decryption key |
| 7 | resource receive confirmation | acknowledge the receipt of a shared resource | “acknowledge receipt of file with hashsum: [hashsum of encrypted archive] from your offer deadlined: [expiry time of the offer]” |
| 666 | error notification | inform the recipient that something went wrong | error message |

Table 5.1: List of docShare’s message protocol formats.
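The fixed-length layout of 8 + 2 + 1024 + 8 + 384 = 1426 bytes can be illustrated with Python’s `struct` module. The sketch below makes several assumptions that the specification above does not state: big-endian unsigned integers for sender, protocol, and time, zero bytes as padding of the data field, and a timestamp in whole seconds. The `sign` argument is a placeholder callable; a 3072-bit RSA signature is exactly 384 bytes long, which fits the signature field.

```python
import struct
import time

# sender (8) | protocol (2) | data (1024) | time (8) | signature (384) = 1426 bytes
MESSAGE = struct.Struct(">QH1024sQ384s")
SIGNED_PART = struct.Struct(">QH1024sQ")     # the first four fields


def pack_message(sender_id, protocol, data, sign, now=None):
    """Build one fixed-length message; sign(bytes) must return 384 bytes."""
    if len(data) > 1024:
        raise ValueError("data field holds at most 1024 bytes")
    timestamp = int(time.time()) if now is None else now
    # struct pads the data field with zero bytes up to 1024 bytes.
    signed = SIGNED_PART.pack(sender_id, protocol, data, timestamp)
    signature = sign(signed)
    if len(signature) != 384:
        raise ValueError("expected a 384-byte signature")
    return signed + signature


def unpack_message(raw):
    """Split a 1426-byte message into its five fields."""
    sender_id, protocol, data, timestamp, signature = MESSAGE.unpack(raw)
    # Stripping the padding also strips payload bytes that end in zero;
    # a real format would need an explicit length or text-only payloads.
    return sender_id, protocol, data.rstrip(b"\x00"), timestamp, signature
```

Because every message is padded to the same size, an observer cannot tell a short resource locator request from a full resource locator offer by its length.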


Digital envelope formats

docShare uses two different digital envelope formats to exchange documents of sensitivity secure, private, and tracking.

The digital envelope format for sensitivities secure and private is depicted in figure 5.4. The content data and its digital signature are encrypted with a randomly generated AES-256 key, which is asymmetrically encrypted for every recipient with the recipient’s public key.

Figure 5.4: Digital envelope format for documents of sensitivity secure and private. (Diagram: an outer archive that contains a nested archive with the file and its signature.)

To mask the number of recipients, fake decryption keys are randomly generated, asymmetrically encrypted with randomly generated fake public keys, and added to the archive until the number of stored keys reaches a multiple of 20.
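The number of fake keys needed for a given number of recipients follows directly from this padding rule. The helper below is a small sketch; the function name `fake_key_count` is hypothetical, and the requirement of at least one recipient is an assumption.

```python
def fake_key_count(recipients, block=20):
    """Number of decoy keys needed so that the key count is a multiple of block."""
    if recipients < 1:
        raise ValueError("an envelope needs at least one recipient")
    # Python's modulo of a negative number is non-negative, so this is the
    # distance from `recipients` up to the next multiple of `block`.
    return -recipients % block
```

With 3 recipients, 17 fake keys are added; with exactly 20 recipients, none are needed; with 21 recipients, 19 fake keys bring the total to 40.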

The digital envelope format for documents of sensitivity tracking (cf. figure 5.5) extends the digital envelope format of sensitivities secure and private with an additional nested encrypted archive. As in the format above, the first AES encrypted archive can be decrypted with the asymmetrically encrypted decryption key provided for each recipient. However, instead of getting access to the content data, recipients get access to the content data’s file descriptor and its digital signature.


Figure 5.5: Digital envelope format for documents of sensitivity tracking. (Diagram: an outer archive that contains a nested archive with the file descriptor, its signature, and a further nested archive with the file and its signature.)

The file descriptor holds information about the encrypted content data, the hash of the nested encrypted archive it refers to, a timestamp stating when the offer expires, and the information needed to proceed with the optimistic protocol for tracking. This information could take the form of a legally binding contract. The key to decrypt the content data is not part of the envelope format and needs to be obtained through the optimistic document exchange protocol.

5.5.3 Services

docShare uses three background daemons to handle the exchange of documents of different sensitivities and to perform the identity verification. These services use TCP sockets that are bound to localhost and forwarded through Tor. Therefore, no additional network channel encryption is implemented, as Tor provides end-to-end encryption for location hidden services.

comDaemon

The communication daemon (comDaemon) takes care of all protocol communication in docShare and handles messages of the format described in section 5.5.2. Once a message arrives, the daemon checks whether it carries a valid signature and whether its timestamp is not older than 60 seconds. The first check guarantees the message’s integrity and authenticity, and the second reduces the probability of successful replay attacks.

If both checks succeed, the message is handled depending on its protocol field. Otherwise the TCP connection is terminated. The comDaemon accepts messages of the protocols partner verification request, resource locator request, resource locator offer, confirmed resource locator offer, accept resource locator offer, and resource receive confirmation. Every message except those of protocol partner verification request also requires that the sender of the message is a verified partner. If the sender is not, the connection is terminated. Table 5.2 describes how the comDaemon handles the different protocols of received messages once they pass all checks.

| message protocol | action taken by comDaemon |
|---|---|
| partner verification request | Check if the provided one-time secret matches the one assigned to the sender in the database. If so, verify the sender as partner and reply with a partner verification request. Otherwise close the connection. |
| resource locator request | Send all resource locators of shared documents assigned to this partner via resource locator offer and conf. resource locator offer messages to the partner’s comDaemon. Afterwards reply with the number of shared resources via a resource locator summary message. |
| resource locator offer | Save the contained resource locators in the database and reply with a resource locator summary message. |
| conf. resource locator offer | Save the contained resource locators in the database and reply with a resource locator summary message. |
| accept resource locator offer | Check if hashsum, expiry date, and provided fallback data are valid. If so, save the received message, update the fallback data in the database, and reply with the decryption key via a resource locator decryption message. Otherwise reply with an error message. |
| resource receive confirmation | Check if the receipt states a valid hashsum and deadline. If so, save the received confirmation, update the database that a confirmation was received, and close the connection. Otherwise reply with an error message. |

Table 5.2: Message handling by comDaemon.
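The checks that precede the per-protocol handling can be summarized in one function. This is a sketch, not the comDaemon’s code: the function name and its boolean inputs are hypothetical, the protocol ids follow table 5.1, and treating a message with a protocol that the comDaemon does not accept like a failed check is an assumption.

```python
import time

MAX_AGE_SECONDS = 60
PARTNER_VERIFICATION_REQUEST = 0
# Ids of the six accepted protocols: partner verification request, resource
# locator request, (confirmed) resource locator offer, accept resource
# locator offer, and resource receive confirmation.
ACCEPTED_PROTOCOLS = {0, 1, 2, 3, 5, 7}


def should_handle(protocol, timestamp, signature_ok, sender_verified, now=None):
    """Return True if a message passes all checks, False to drop the connection."""
    now = time.time() if now is None else now
    if not signature_ok:
        return False          # integrity or authenticity cannot be guaranteed
    if now - timestamp > MAX_AGE_SECONDS:
        return False          # older than 60 seconds, possibly a replay
    if protocol not in ACCEPTED_PROTOCOLS:
        return False          # assumption: unknown protocols are dropped
    if protocol != PARTNER_VERIFICATION_REQUEST and not sender_verified:
        return False          # only verified partners may use other protocols
    return True
```

The sketch does not reject timestamps that lie in the future; whether the prototype does so is not stated.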

shareDaemon

The share daemon (shareDaemon) is responsible for providing access to shared documents of sensitivity secure, private, and tracking based on their resource locator. It is realized as an HTTP service providing access via HTTP GET request. Only partners that know the correct 255-character resource locator, drawn from an alphabet of 66 characters, can download the encrypted document.

Assuming a future TCP connection establishment time of 1 ms¹ in the Tor network, it would take 66^255 ms (around 3 × 10^453 years) to iterate through all possible resource locators. It is therefore deemed unrealistic that an attacker can use this technique to find out how many documents a particular person is sharing through docShare. Even if an attacker were lucky enough to download a shared document, they could not access its content, as it is encrypted. In addition, they would learn neither the intended recipients nor the actual number of recipients, as both are masked by the digital envelope formats. Therefore, this rudimentary proof of concept refrains from adding a further layer of authentication to shareDaemon’s implementation.
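The size of the locator space can be checked with Python’s arbitrary-precision integers. A locator of 255 characters over an alphabet of 66 characters allows 66^255 different values. The concrete alphabet below, the 66 unreserved URL characters, is an assumption, since the text only states the alphabet’s size, and the function names are hypothetical.

```python
import secrets
import string

# Assumed alphabet: letters, digits, and "-._~" (66 characters in total).
ALPHABET = string.ascii_letters + string.digits + "-._~"
LOCATOR_LENGTH = 255


def new_locator():
    """Draw a random resource locator from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(LOCATOR_LENGTH))


def years_to_enumerate(ms_per_attempt=1):
    """Whole years needed to try every possible locator once."""
    combinations = len(ALPHABET) ** LOCATOR_LENGTH
    ms_per_year = 365 * 24 * 60 * 60 * 1000
    return combinations * ms_per_attempt // ms_per_year
```

Even at one attempt per millisecond, the result is a number with more than 450 decimal digits, which supports the conclusion that guessing a locator is not a realistic attack.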

anonDaemon

The anonymous receive daemon (anonDaemon) is used to anonymously receive documents from any docShare user. It is implemented as an HTTP service that receives arbitrary UTF-8 encoded data² via HTTP POST request and stores it on disk. To prevent two documents with the same name from overwriting each other, the name of an uploaded document is prefixed with the current timestamp.

Due to the high abuse potential of operating an anonymous receive daemon, it is deactivated by default and needs to be explicitly activated in docShare’s configuration.
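The timestamp prefix can be sketched as follows. The exact naming scheme of the prototype is not given in the text, so the format `<seconds>_<name>` and the function name `storage_name` are assumptions. Dropping the directory part of the submitted name is an extra precaution added in this sketch.

```python
import os
import time


def storage_name(uploaded_name, now=None):
    """Prefix an uploaded file name with the current timestamp."""
    timestamp = int(time.time()) if now is None else int(now)
    # basename() drops any directory part a sender might have put into the name.
    return f"{timestamp}_{os.path.basename(uploaded_name)}"
```

With a resolution of one second, two uploads of the same name within the same second would still collide, so a finer timestamp or a counter would be needed to rule that out.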

5.5.4 Terminal scripts

The last components of the docShare client are the terminal scripts used to interact with the local docShare instance and with partners’ docShare hidden service endpoints. Their main purposes are the sharing of documents, partner management, system initialization, and information visualization. As a proof of concept does not require a graphical user interface, terminal scripts were chosen as the form of implementation to reduce the development time. Each script is briefly described below, stating its purpose, command line arguments, and main execution steps.

add_confirmed_share.py

This script is used to share a document of sensitivity tracking with one or multiple partners.

It requires the file to share, the file’s description, the number of days the sharing offer stays valid, and the recipients’ ids or names as command line arguments.

When the script is called, it checks the command line arguments for validity, creates the encrypted archive of the digital envelope format for tracking, moves the archive to the folder from which the shareDaemon can serve it, updates the metadata database tables “shared”, “sharedMapping”, and “sharedConfirmation” accordingly, and sends messages of protocol conf. resource locator offer to each recipient’s comDaemon.

¹ Currently the average round trip time of TCP connections in the internet is around 90 ms.
² To reduce the development time, only UTF-8 encoded data is supported. This is sufficient for a rudimentary proof of concept.

add_partner.py

This script is used to add a user from the IMS as a potential partner and to initiate the verification process of the exchanged one-time secrets.

It requires the potential partner’s user id in the IMS, a local identification string (e.g. a name), the user’s own one-time secret exchanged with the potential partner, and the one-time secret received from the potential partner as command line arguments.

When the script is called, it checks the user id for validity, updates the respective fields in the metadata database table “partners”, and sends a message of protocol partner verification request to the potential partner’s comDaemon. If it receives a partner verification request message as reply, it checks whether the stored and received one-time secrets match and changes the status from potential partner to partner.

add_share.py

This script is used to share a document of sensitivity secure or private with one or multiple partners.

It requires the file to share, the file’s description, and the recipients’ ids or names as command line arguments.

When the script is called, it checks the command line arguments for validity, creates the encrypted archive of the digital envelope format for secure and private, moves the archive to the folder from which the shareDaemon can serve it, updates the metadata database tables “shared” and “sharedMapping”, and sends messages of protocol resource locator offer to each recipient’s comDaemon.

delete_partner.py

This script is used to delete a (potential) partner from the metadata database.

It requires the id and the name of the partner to delete as command line arguments.

When the script is called, it checks the command line arguments for validity and deletes the corresponding entry from the “partners” table.


delete_share.py

This script is used to delete a shared document from docShare.

It requires the id of the share to delete and its description as command line arguments.

When the script is called, it checks the command line arguments for validity, obtains the corresponding resource locator from the database, deletes the respective archive from shareDaemon’s folder, and deletes the share’s entries from the “shared”, “sharedMapping”, and “sharedConfirmation” tables.

download_received.py

This script is used to download a document of sensitivity secure, private, or tracking from a partner.

It requires the document’s share id in the database as command line argument.

When the script is called, it checks the command line argument for validity and constructs the document’s resource locator from the partner’s .onion address in the IMS and the locator information stored in the “received” table. Then the share is downloaded to a temporary directory. Afterwards the script iterates through the archive’s encrypted decryption keys and tries to decrypt each of them with its private key. Once a decryption key has been decrypted, the encrypted archive is decrypted with it. Depending on the digital envelope format detected, the script checks either the file’s validity or the file descriptor’s validity using the digital signature provided.

If the archive is of digital envelope format secure or private, all temporary files are deleted, and the file and its signature are moved to docShare’s receive directory. Afterwards the status of the share is updated in the “received” table.

If the archive is of digital envelope format tracking, the encrypted archive’s hashsum and the offer’s deadline and description are extracted from the file descriptor. The archive is checked for integrity using the hashsum. Afterwards the file’s description is displayed to the user, who has to explicitly agree to confirm the receipt of the file. If the user agrees, a fallback reference and a fallback encryption key are generated, and the fallback deadline is set to one hour in the future. The corresponding entries are updated in the “receivedConfirmation” table. Then a message of protocol accept resource locator offer is sent to the partner. The received reply is stored in the “receivedConfirmation” table and checked for integrity and proper protocol. Then the decryption key is extracted and the archive is decrypted. Finally, the content file’s validity is checked using the digital signature provided, the corresponding entry in the “received” table is updated, and a resource receive confirmation is sent to the partner’s comDaemon.
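Two steps of this procedure lend themselves to a short sketch: the search for the recipient’s own entry among the encrypted decryption keys, and the calculation of the fallback deadline. The helper names are hypothetical. `try_decrypt` stands for a callable that wraps the RSA decryption with the recipient’s private key and returns `None` when an entry was not encrypted for this recipient; that interface is an assumption of the sketch.

```python
FALLBACK_WINDOW_SECONDS = 60 * 60   # the fallback deadline lies one hour ahead


def find_archive_key(encrypted_keys, try_decrypt):
    """Return the first key that try_decrypt can open, or None if none is ours."""
    for encrypted_key in encrypted_keys:
        key = try_decrypt(encrypted_key)
        if key is not None:
            return key
    return None


def fallback_deadline(now):
    """Timestamp (in seconds) until which the fallback key must be uploaded."""
    return int(now) + FALLBACK_WINDOW_SECONDS
```

Because real and fake keys look alike, a recipient cannot tell which of the remaining entries belong to other recipients and which are padding.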


initialize.py

This script is used to initialize docShare’s SQLite3 metadata database and to create the 3072-bit RSA key pair.

No command line arguments are required.

share_file_anonymously.py

This script is used to share a document of sensitivity anonymous with one or multiple recipients.

It requires the file to share and the recipients’ ids as command line arguments.

When the script is called, it checks the command line arguments for validity and tries to send the file to the recipients’ anonDaemons via HTTP POST request.

show_partner.py

This script is used to show all partner information from the “partners” table.

It has no command line arguments.

show_received.py

This script is used to show relevant information related to received share offerings and their download status.

It has an optional “verbose” argument to display more information.

show_shares.py

This script is used to show relevant information related to documents shared with partners and their status.

It has an optional “verbose” argument to display more information.

update_received.py

This script is used to manually synchronize all share offerings of documents that partners shared with the user.

It has no command line arguments.

When the script is called, it sends a resource locator request message to every partner.


upload_key_to_ledger_from_accepted_but_unconfirmed_shares.py

This script is used to determine the documents of sensitivity tracking that require a manual upload of the encrypted decryption key to the fallback reference. This is the case when the optimistic protocol failed, i.e. an accept resource locator offer message was received but no resource receive confirmation.

It has no additional command line arguments.

When this script is called, it determines the documents of sensitivity tracking that require a manual upload of the encrypted decryption key, decrypts it, and uploads it to the corresponding fallback reference. Afterwards the corresponding entries in the “sharedConfirmation” table are updated.


Chapter 6

Prototype evaluation

In this chapter the prototype implementation of chapter 5 is analysed with regard to its compliance with the document sensitivity categories defined in section 3.1 and to its functionality as outlined in chapter 4. Each of docShare’s components that interacts via the network is analysed with regard to its information security services to validate the global compliance of the system. Once the validation is complete, the theoretical connection of the concept is drawn.

6.1 Evaluation environment

docShare was tested and evaluated in a virtual environment of three Ubuntu 17.10 virtual machines, which is attached in appendix C and sketched in figure 6.1. The environment was set up according to the installation and usage guide of appendix B using its own private Ethereum test network. Each of the three virtual machines runs an instance of the Go Ethereum client Geth, the anonymization software Tor, docShare, and the web-frontend to access the Ethereum smart contracts for the IMS and the simple key value store.

6.2 Public identity management

In its web-frontend docShare uses the JavaScript library web3.js to interact with the Ethereum smart contract for the public identity management service. As web3.js and the Ethereum smart contract interact through Geth’s RPC API bound to localhost, no information security evaluation is needed at this interface, because no information leaves the host.

Therefore, only the connection between the Ethereum nodes during the submission of transactions that create or change an identity needs to be analysed. Here Ethereum only provides confidentiality, integrity, and sender authenticity by using hashes and asymmetric cryptography. Privacy is not achieved, as Geth does not have interfaces to route traffic through Tor’s SOCKS proxy. As a result, IP addresses are shared with the receiving Ethereum nodes.


Figure 6.1: Environment used to evaluate docShare’s implementation. (Diagram: the hosts Alice, Bob, and Carol connected via the internet.)

Geth’s usage of UDP for node discovery and its block update process would not be critical if docShare’s smart contracts were part of Ethereum’s global distributed ledger, as an adversary would not be able to distinguish which smart contracts are used by the client. In the evaluated test network, however, unmasked metadata in the form of IP addresses also undermines the users’ privacy here.

From a functionality standpoint, creating new identities, updating existing identities, and permanently deactivating identities were successfully tested using the web-frontend provided and by verifying the changes on the two other nodes’ web-frontends.

6.3 Partner management

The partner management and verification communicate solely through Tor hidden services. Therefore, confidentiality and privacy are achieved, as Tor’s rendezvous protocol sufficiently masks TCP metadata and encrypts the communication channels. Data integrity and authenticity are met through docShare’s defined message format using digital signatures over the transferred content data.

The functionality of the partner management was successfully tested by simulating the exchange of one-time tokens and user ids between Alice, Bob, and Carol through external channels and using the script add_partner.py. The deletion of a partner was also successfully tested using the script delete_partner.py. The handling of wrong command line arguments and of wrong exchanged one-time tokens was tested as well.

6.4 Document exchange with regards to secure and private

The document exchange for the sensitivities secure and private uses the same scripts. Therefore, they only need to be analysed once. As with partner management, the sharing of document offers through the comDaemon provides confidentiality and privacy through Tor’s rendezvous protocol. Likewise, data integrity and authenticity are met through docShare’s message format.

The actual fetching of documents through the shareDaemon and the client script download_received.py is also secured through Tor with respect to both security and privacy. Integrity and authenticity are achieved through docShare’s digital envelope format with digital signatures over the content data. Asymmetric cryptography is not only used for encryption but also to mask the actual recipient information.

The functionality of document exchange with regards to secure and private was successfully tested by sharing a test file between Alice and Bob, and another test file between Bob, Alice, and Carol, and by comparing the files’ SHA-256 hashes once they had been downloaded by all parties. In another test, Alice temporarily deleted Carol as partner and Carol tried to share a file with Alice; this sharing failed as expected. The deletion of a file via delete_share.py was also successfully tested by observing the deletion of the share in the file system. The manual synchronization via update_received.py was successfully tested as well: first the host Bob shut down its docShare daemons, then Carol shared a file with Bob but could not submit the resource locator offer, and afterwards Bob’s docShare daemons were reactivated and update_received.py was executed.
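The hash comparison used in these tests can be reproduced with Python’s `hashlib` module. The sketch below is not part of the prototype; the function names are hypothetical, and the file is read in chunks so that large test files do not have to fit into memory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA-256 digest of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```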

6.5 Document exchange with regards to anonymous

The anonDaemon is implemented as a Tor hidden service and does not require partner verification. Confidentiality, privacy, and sender anonymity are achieved through Tor’s rendezvous protocol and built-in channel encryption. No additional services for data integrity are implemented. Recipient authenticity can be achieved if sender and recipient verified each other beforehand through an external channel.

This service was successfully tested by activating the anonDaemon in Alice’s docShare configuration and sharing a UTF-8 encoded file from Bob using share_file_anonymously.py. For verification, the SHA-256 hash of the original shared file and the SHA-256 hash of the received file were compared.


6.6 Document exchange with regards to tracking

Tracking uses the same combination of hidden service, docShare message format, and digital envelope as document exchange with regards to secure and private. Therefore, confidentiality, integrity, privacy, and authenticity are met in the case that the optimistic protocol completes without the need to store the encrypted decryption key in the key value store smart contract.

In case the optimistic protocol fails and the upload of fallback data is required, docShare suffers from the same privacy issues as the public identity management does when interacting with Geth and Ethereum to update identities.

The implemented functionality was tested successfully. Bob is able to share a file with Carol and Alice using add_confirmed_share.py, and Alice can retrieve the file via download_received.py. Once the process is done, Bob has Alice’s accept resource locator offer and resource receive confirmation in his local database. The case in which Carol retrieves the file via a modified version of download_received.py that does not send a resource receive confirmation was tested as well. In this case Bob was able to store Carol’s accept resource locator offer message in his database and to upload the decrypted encryption key to the fallback reference using upload_key_to_ledger_from_accepted_but_unconfirmed_shares.py.

To avoid a mapping between the Ethereum wallet used to register oneself in the IMS and the wallet used to upload fallback keys to the key value smart contract, it is advised to use a second, independent wallet for uploading fallback keys. docShare supports this in its configuration.

6.7 Document exchange with regards to business

As stated in section 4.2.6, data exchange with compliance to business was not implemented in docShare. Therefore, its implementation cannot be evaluated, and the sensitivity business is not supported.

6.8 Summary

To summarize, docShare currently fully supports only the document exchange of sensitivity secure and partially supports the sensitivities private, anonymous, and tracking. A detailed summary is given in tables 6.1 and 6.2. docShare’s main limitation is the missing masking of IP address metadata when interacting with the Ethereum smart contracts. Ethereum currently has no support for SOCKS proxies to update entries anonymously. Furthermore, Ethereum’s node discovery protocol is based on UDP and therefore also leaks identity information in the form of IP addresses. The sensitivity anonymous is not fully achieved, as no data integrity services are implemented to verify that a file was uploaded correctly.

| | public identity management | partner management | document exchange: secure | document exchange: private | document exchange: anonymous | document exchange: tracking | document exchange: business |
|---|---|---|---|---|---|---|---|
| Confidentiality | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | n.a. |
| Integrity | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | n.a. |
| Privacy | ✗ | ✓ | ✓ | ✓ | ✓ | (✓) | n.a. |
| Anonymity | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | n.a. |
| Non-repudiation | n.a. | n.a. | ✗ | ✗ | ✗ | ✓ | n.a. |
| Legal non-repudiation | n.a. | n.a. | ✗ | ✗ | ✗ | ✓ | n.a. |
| Authenticity of sender / recipient | ✓ / n.a. | ✓ / ✓ | ✓ / ✓ | ✓ / ✓ | ✗ / ✓ | ✓ / ✓ | n.a. / n.a. |
| Accountability | ✓ | n.a. | ✗ | ✗ | ✗ | ✗ | n.a. |

Table 6.1: Summary of the evaluation of docShare’s implementation regarding compliance with information security services. The following notation is used: ✓: complied, ✗: not complied, (✓): partially complied, and n.a.: not applicable.

| | public identity management | partner management | document exchange: secure | document exchange: private | document exchange: anonymous | document exchange: tracking | document exchange: business | total |
|---|---|---|---|---|---|---|---|---|
| secure | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | n.a. | ✓ |
| private | ✗ | ✓ | ✓ | ✓ | ✗ | (✓) | n.a. | (✓) |
| anonymous | ✗ | ✗ | ✗ | ✗ | (✓) | ✗ | n.a. | (✓) |
| tracking | n.a. | n.a. | ✗ | ✗ | ✗ | ✓ | n.a. | (✓) |
| business | n.a. | n.a. | ✗ | ✗ | ✗ | ✗ | n.a. | ✗ |

Table 6.2: Summary of the evaluation of docShare’s implementation regarding compliance with the documents’ levels of sensitivity. The following notation is used: ✓: complied, ✗: not complied, (✓): partially complied, and n.a.: not applicable.


Nonetheless, the prototype implementation shows that a decentralized DSS on the basis of distributed ledger technology that complies with the defined document sensitivities secure, private, anonymous, and tracking is possible with slight modifications to the implementation and a change of the distributed ledger used for the IMS and the key value store, as all implemented functionality was tested successfully.

6.9 Theoretical connection of the concept

To align with the constructive research methodology, the theoretical connection of docShare is shown in this section. docShare differentiates itself from all other analysed DSS by separating the storage of content data and metadata. Metadata is handled and stored at the client. It is used for the local identity management of trusted partners and for the management of shared documents, including their encryption. Content data could be stored at any location, as its locator information and decryption management are transferred directly via private peer-to-peer connections utilizing Tor hidden services. Identity management and name system services are designed as decentralized systems using distributed ledger technology to combine a central repository for looking up entries with the benefits of censorship resistant and denial of service resistant distributed systems.

With this design, it is still possible to attack a single host’s Tor hidden service endpoints via denial of service attacks or to attack single distributed ledger nodes, but this will not impact the global state of the system. Relying on Tor hidden services further makes users mostly independent of their network operator and its attempts to censor services (e.g. through DNS filters).


Chapter 7

Conclusion

To analyse to what degree current document sharing systems meet the information security requirements of their users, a literature review was conducted to extract and define the information security services relevant for document exchange, namely: data integrity, data confidentiality, security, privacy, anonymity, authenticity of sender and recipient, (legal) non-repudiation, and accountability. Based on the extracted information security services and practical use cases derived from the post office analogue, five combinations of information security services were defined as document sensitivity categories. These document sensitivity categories (secure, private, anonymous, tracking, and business) form the basis for the information security compliance analysis of commonly used document exchange systems. Four centralized file-sharing services, including the market leaders Box and Dropbox, three decentralized data storage and file-sharing services, and the exchange of digital assets through secure e-mail were analysed based on their publicly available information, with the result that none of the systems complies with any of the higher document sensitivity categories private, anonymous, tracking, or business. Existing document systems mainly lack support for masking the metadata needed to share documents. Therefore, (trusted) third parties are able to learn who shared which documents with whom. The leakage of identifiers in the form of IP addresses is also a major concern. The analysis further pointed out that the legal jurisdictions in which the services operate can undermine a service's confidentiality by forcing the operator to decrypt data and hand it over to federal agencies.

As a result, a new concept for sharing documents using distributed ledger technology and anonymization software was developed, which exchanges documents without the need for centralized metadata storage or a trusted third party. In this concept the client is directly responsible for its own metadata management. This includes metadata related to shared document locators, their encryption keys, and the client's partner management. The concept uses a twofold identity management system. Global identities, with their accessibility endpoint as a Tor hidden service and their public RSA key, are available in a decentralized directory in the form of a distributed ledger. The partner management, including real-name partner verification, is delegated to the client. Therefore, only the client holds all information about whom it interacts with. Different protocols from the literature were analysed for applicability and ease of implementation to support the higher sensitivity categories tracking and business. In the end an optimistic protocol was chosen to implement technical non-repudiation. Only if the optimistic protocol fails is a distributed ledger needed to resolve the conflict. An interesting problem was found regarding legal non-repudiation: how to preserve the related information in a data structure that can only be used once but is still legally binding. Unfortunately, it could not be solved within the scope of this thesis. Therefore, no working concept to support sensitivity tracking could be provided.

Based on the developed concept, a prototype named docShare was implemented as a rudimentary proof of concept. It uses a private Ethereum network for the global identity management and as a key-value store in case the optimistic protocol of sensitivity tracking fails. Its functionality was demonstrated through test cases, and its compliance with the defined sensitivities was analysed. It was found that traffic to the distributed ledger used, Ethereum, cannot be anonymized to mask metadata when creating or updating entries. Therefore, metadata in the form of IP addresses can leak while interacting with the Ethereum smart contracts. As a result, docShare fully supports only sensitivity secure, and partially supports sensitivities private, anonymous, and tracking. With that key limitation identified, a future prototype can be built on the basis of the developed concept and a different distributed ledger that supports metadata anonymization, thereby fully complying with the defined document sensitivities secure, private, anonymous, and tracking.
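The twofold identity management summarised above can be pictured with a small data model. The following Python sketch is illustrative only: the class and field names (`GlobalIdentity`, `Partner`, `PartnerBook`) and all values are assumptions made for this example and are not taken from the docShare source. It shows the intended split, in which the public directory record carries only an ID, a hidden service address, and a public RSA key, while real names stay in a client-local store.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class GlobalIdentity:
    """Public record as it could appear in the decentralized directory."""
    ims_id: int
    onion_address: str   # Tor hidden service endpoint
    rsa_public_key: str  # PEM-encoded public key


@dataclass
class Partner:
    """Client-local record; never published to the ledger."""
    real_name: str
    identity: GlobalIdentity
    verified: bool = False  # set after out-of-band real-name verification


@dataclass
class PartnerBook:
    """Local partner management: maps a directory ID to a known partner."""
    partners: Dict[int, Partner] = field(default_factory=dict)

    def add(self, partner: Partner) -> None:
        self.partners[partner.identity.ims_id] = partner

    def lookup(self, ims_id: int) -> Optional[Partner]:
        return self.partners.get(ims_id)


# Hypothetical values, for illustration only.
directory_entry = GlobalIdentity(
    ims_id=2,
    onion_address="exampleonionaddress.onion",
    rsa_public_key="-----BEGIN PUBLIC KEY-----...",
)
book = PartnerBook()
book.add(Partner("Alice Example", directory_entry, verified=True))
```

An observer of the directory sees only `directory_entry`; the mapping from ID 2 to a real name exists solely in the local `book`.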

7.1 Recommendations for future work

Several distributed ledger related topics are good starting points for future work. An obvious one is implementing a newer version of docShare on a distributed ledger that supports metadata anonymization through Tor's SOCKS proxy. Another starting point is more sustainability related: analysing consensus algorithms that rely on fewer computational resources than proof-of-work consensus. Also, the indefinite growth of current distributed ledgers over time should be analysed, as it will eventually impact their decentralization and censorship resistance. In a distant future the distributed ledger's records will be so large (e.g. 10 TB) that only bigger companies can afford to store all transactions. At that point ordinary household computers will not be able to store a local copy of the transaction log and will have to rely on the bigger companies to validate all transactions. Legal aspects of maintaining a distributed ledger are also a promising field for research. People already abuse the structure of the blockchain by uploading virus fragments and child abuse material directly into the blockchain [99]. Depending on the jurisdiction a miner operates from, this could lead to serious accusations and imprisonment. Therefore, either technical or legal solutions to this problem need to be found.
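As a rough illustration of the first recommendation, the following Python sketch prepares a request that could be sent to a ledger node through Tor's SOCKS proxy. It rests on several assumptions: Tor runs locally on its default SOCKS port 9050, the ledger node offers an HTTP JSON-RPC interface, and the method name `eth_blockNumber` serves only as a familiar example. It covers client-to-node API traffic only; whether a ledger's peer-to-peer traffic can be tunnelled in the same way is exactly the open question raised above.

```python
import json
from typing import Dict, List, Optional

TOR_SOCKS_HOST = "127.0.0.1"  # assumption: Tor runs on the same machine
TOR_SOCKS_PORT = 9050         # Tor's default SocksPort


def tor_proxies(host: str = TOR_SOCKS_HOST, port: int = TOR_SOCKS_PORT) -> Dict[str, str]:
    """Proxy mapping in the form accepted by the `requests` library.

    The socks5h scheme asks the proxy to resolve host names, so DNS
    lookups (including .onion names) are not performed locally.
    """
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def json_rpc_payload(method: str, params: Optional[List] = None, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    })


proxies = tor_proxies()
body = json_rpc_payload("eth_blockNumber")
```

With `requests` and its SOCKS extra installed, these two values could be passed as `requests.post(node_url, data=body, headers={"Content-Type": "application/json"}, proxies=proxies)`, where `node_url` is a placeholder for the endpoint of the ledger node.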


Besides distributed ledger related research, one could investigate how to support legal non-repudiation in distributed document exchange and how to store it in an unforgeable and non-reusable format. It might also be worth investigating which other areas besides document sharing could benefit from the separation of metadata and content data, and how important IP address masking really is for end users.

On the prototype implementation side, the switch from Tor version 2 to Tor version 3, which hinders denial-of-service attempts against Tor hidden services through double-encrypted hidden service location identifiers, and the storage of encrypted content data at untrusted intermediaries could also be worth evaluating. One could also investigate improving data availability by uploading digitally enveloped documents to untrusted intermediaries.


Bibliography

[1] N. Confessore, “Cambridge analytica and facebook: The scandal and the fallout so far,” New York Times, 2018. [Online]. Available: https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html (Accessed 2018-06-02).

[2] E. Macaskill and G. Dance, “Nsa files: Decoded - what the revelations mean for you,” The Guardian, 2013. [Online]. Available: https://www.theguardian.com/world/interactive/2013/nov/01/snowden-nsa-files-surveillance-revelations-decoded (Accessed 2018-06-02).

[3] J. Naughton, “Death by drone strike, dished out by algorithm,” The Guardian, 2016. [Online]. Available: https://www.theguardian.com/commentisfree/2016/feb/21/death-from-above-nia-csa-skynet-algorithm-drones-pakistan (Accessed 2018-06-02).

[4] Dropbox, Inc. (2017) Dropbox - homepage. [Online]. Available: https://www.dropbox.com/ (Accessed 2017-10-11).

[5] Citrix. (2017) Rightsignature homepage. [Online]. Available: https://rightsignature.com/ (Accessed 2017-09-05).

[6] Congress, US, “h.r.3162 - 107th congress (2001-2002): uniting and strengthening america by providing appropriate tools required to intercept and obstruct terrorism (usa patriot act) act of 2001,” Washington, DC, 2001.

[7] ——, “h.r.2048 - 114th congress (2015-2016): Uniting and strengthening america by fulfilling rights and ensuring effective discipline over monitoring (usa freedom act) act of 2015,” Washington, DC, 2015.

[8] Dropbox, Inc., “Transparency reports.” [Online]. Available: https://www.dropbox.com/transparency/reports (Accessed 2017-10-25).

[9] E. Kasanen, K. Lukka, and A. Siitonen, “The constructive approach in management accounting research,” Journal of management accounting research, vol. 5, p. 243, 1993.


[10] G. Crnkovic, “Constructive research and info-computational knowledge generation,” Model-Based Reasoning in Science and Technology, vol. 314, pp. 359–380, 2010.

[11] European Commission, “Proposal for a directive of the european parliament and of the council on copyright in the digital single market,” 2016. [Online]. Available: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52016PC0593 (Accessed 2018-07-01).

[12] K. J. O’Dwyer and D. Malone, “Bitcoin mining and its energy footprint,” 2014.

[13] A. de Vries, “Bitcoin’s growing energy problem,” Joule, vol. 2, no. 5, pp. 801–805, May 2018. doi: 10.1016/j.joule.2018.04.016. [Online]. Available: http://dx.doi.org/10.1016/j.joule.2018.04.016

[14] R. Shirey, “Internet Security Glossary, Version 2,” Internet Requests for Comments, RFC 4949, August 2007. [Online]. Available: https://tools.ietf.org/html/rfc4949 (Accessed 2017-09-07).

[15] A. Freier, P. Karlton, and P. Kocher, “The Secure Sockets Layer (SSL) Protocol Version 3.0,” Internet Requests for Comments, RFC 6101, August 2011. [Online]. Available: https://tools.ietf.org/html/rfc6101 (Accessed 2017-09-12).

[16] S. Muftic, N. bin Abdullah, and I. Kounelis, “Business information exchange system with security, privacy, and anonymity,” Journal of Electrical and Computer Engineering, vol. 2016, 2016.

[17] NIST, “197: Advanced encryption standard (aes),” Federal information processing standards publication, November 2001. [Online]. Available: http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.197.pdf (Accessed 2017-09-14).

[18] ——, “180-4: Secure hash standard,” Federal information processing standards publication, August 2015. [Online]. Available: http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf (Accessed 2017-09-14).

[19] E. W. Felten. (2015) Hash pointers and data structures. Bitcoin and Cryptocurrency Technologies Lecture at Coursera.org. [Online]. Available: https://www.coursera.org/learn/cryptocurrency/lecture/EYEAo/hash-pointers-and-data-structures (Accessed 2017-09-21).

[20] R. Rivest, A. Shamir, and L. Adleman, “Cryptographic communications system and method,” Sep. 20 1983, US Patent 4,405,829.

[21] N. Koblitz, “Elliptic curve cryptosystems,” Mathematics of computation, vol. 48, no. 177, pp. 203–209, 1987.


[22] V. S. Miller, “Use of elliptic curves in cryptography,” in Conference on the Theory and Application of Cryptographic Techniques. Springer, 1985, pp. 417–426.

[23] P. Mahajan and A. Sachdeva, “A study of encryption algorithms aes, des and rsa for security,” Global Journal of Computer Science and Technology, 2013.

[24] W. Diffie and M. Hellman, “New directions in cryptography,” IEEE transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976.

[25] B. Schneier, Applied cryptography: protocols, algorithms, and source code in C, 20th ed. John Wiley & Sons, 2017. ISBN 978-1-119-09672-6

[26] R. Gill. (2016) Trust in the era of hackable certificate authorities. Akamai. [Online]. Available: https://enterprise-access.akamai.com/blog/trust-in-the-era-of-hackable-certificate-authorities/ (Accessed 2017-09-16).

[27] L. Zeltser. (2015) How digital certificates are used and misused. [Online]. Available: https://zeltser.com/how-digital-certificates-are-used-and-misused/ (Accessed 2017-09-16).

[28] R. Housley, W. Ford, W. Pok, and D. Solo, “Internet X.509 Public Key Infrastructure - Certificate and CRL Profile,” Internet Requests for Comments, RFC 2459, August 2011. [Online]. Available: http://www.ietf.org/rfc/rfc2459.txt (Accessed 2017-09-16).

[29] P. R. Zimmermann, The Official PGP User’s Guide. Cambridge, MA, USA: MIT Press, 1995. ISBN 0-262-74017-6

[30] emercoin.com. (2017) emercoin website. [Online]. Available: https://emercoin.com/ (Accessed 2017-09-18).

[31] C. Allen, A. Brock, V. Buterin, J. Callas, D. Darje, C. Lundkvist, P. Kravchenko, J. Nelson, D. Reed, M. Sabadello, G. Slepak, N. Thorp, and H. T. Wood, “Decentralized public key infrastructure,” 2015. [Online]. Available: https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust/blob/master/final-documents/dpki.pdf (Accessed 2017-09-18).

[32] S. Muftic, “Bix certificates: Cryptographic tokens for anonymous transactions based on certificates public ledger,” Ledger, vol. 1, pp. 19–37, 2016.

[33] ——, “Blockchain identity management system based on public identities ledger,” Apr. 25 2017, US Patent 9,635,000.

[34] B. Kaliski, “PKCS #7: Cryptographic Message Syntax - Version 1.5,” Internet Requests for Comments, RFC 2315, March 1998. [Online]. Available: https://tools.ietf.org/html/rfc2315 (Accessed 2017-09-18).


[35] S. Goldwasser, S. Micali, and C. Rackoff, “The knowledge complexity of interactive proof systems,” SIAM Journal on computing, vol. 18, no. 1, pp. 186–208, 1989.

[36] D. Chaum, J.-H. Evertse, and J. Van De Graaf, “An improved protocol for demonstrating possession of discrete logarithms and some generalizations,” in Workshop on the Theory and Application of Cryptographic Techniques. Springer, 1987, pp. 127–141.

[37] M. Blum, “How to prove a theorem so no one else can claim it,” in Proceedings of the International Congress of Mathematicians, vol. 1, 1986, p. 2.

[38] O. Goldreich, S. Micali, and A. Wigderson, “Proofs that yield nothing but their validity or all languages in np have zero-knowledge proof systems,” Journal of the ACM (JACM), vol. 38, no. 3, pp. 690–728, 1991.

[39] U. Feige, A. Fiat, and A. Shamir, “Zero-knowledge proofs of identity,” Journal of cryptology, vol. 1, no. 2, pp. 77–94, 1988.

[40] E. Ben-Sasson, A. Chiesa, E. Tromer, and M. Virza, “Succinct non-interactive zero knowledge for a von neumann architecture.” in USENIX Security Symposium, 2014, pp. 781–796.

[41] L. Lamport, R. Shostak, and M. Pease, “The byzantine generals problem,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 4, no. 3, pp. 382–401, 1982.

[42] E. Buchman, “Tendermint: Byzantine fault tolerance in the age of blockchains,” Guelph, Ontario, Canada, 2016.

[43] C. Cachin and M. Vukolić, “Blockchains consensus protocols in the wild,” arXiv preprint arXiv:1707.01873, 2017.

[44] S. Muftic. (2017, June) Blockchain and smart contracts. GDG Meetup - Presentation. [Online]. Available: https://youtu.be/w7p6B-SS1PA?t=1h1m41s (Accessed 2017-09-20).

[45] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008.

[46] A. Narayan. (2015) How bitcoin achieves decentralization. Bitcoin and Cryptocurrency Technologies Lecture at Coursera.org. [Online]. Available: https://www.coursera.org/learn/cryptocurrency/home/week/2 (Accessed 2017-09-27).

[47] J. Bonneau. (2015) Mechanics of bitcoin. Bitcoin and Cryptocurrency Technologies Lecture at Coursera.org. [Online]. Available: https://www.coursera.org/learn/cryptocurrency/home/week/3 (Accessed 2017-09-27).


[48] V. Buterin et al., “A next-generation smart contract and decentralized application platform,” white paper, 2014.

[49] Y. Sompolinsky and A. Zohar, “Secure high-rate transaction processing in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 507–527.

[50] G. H. Gonnet, R. A. Baeza-Yates, and T. Snider, “New indices for text: Pat trees and pat arrays.” Information Retrieval: Data Structures & Algorithms, vol. 66, p. 82, 1992.

[51] V. Zamfir, “Introducing casper “the friendly ghost”,” Ethereum Blog, 2015. [Online]. Available: https://blog.ethereum.org/2015/08/01/introducing-casper-friendly-ghost (Accessed 2017-09-29).

[52] V. Buterin, “Casper version 1 implementation guide,” Ethereum Github repository, 2017. [Online]. Available: https://github.com/ethereum/research/wiki/Casper-Version-1-Implementation-Guide (Accessed 2017-09-29).

[53] R. G. Brown, J. Carlyle, I. Grigg, and M. Hearn, “Corda: An introduction,” R3 CEV, August, 2016. [Online]. Available: https://www.researchgate.net/profile/Ian_Grigg/publication/308636477_Corda_An_Introduction/links/57e994ed08aed0a291304412.pdf (Accessed 2017-07-20).

[54] M. Hearn, “Corda–a distributed ledger,” Corda Technical White Paper, 2016. [Online]. Available: https://block.academy/researches/corda-technical-whitepaper.pdf (Accessed 2017-07-20).

[55] R. G. Brown. (2017) The corda way of thinking. [Online]. Available: https://gendal.me/2017/02/21/the-corda-way-of-thinking/ (Accessed 2017-07-20).

[56] ——. (2016) Introducing r3 corda: A distributed ledger designed for financial services. [Online]. Available: https://gendal.me/2016/04/05/introducing-r3-corda-a-distributed-ledger-designed-for-financial-services/ (Accessed 2017-07-20).

[57] J. R. Douceur, “The sybil attack,” in International Workshop on Peer-to-Peer Systems. Springer, 2002, pp. 251–260.

[58] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The second-generation onion router,” DTIC Document, Tech. Rep., 2004.

[59] Tor Project, “Tor Metrics,” 2017. [Online]. Available: https://metrics.torproject.org/ (Accessed 2017-05-14).


[60] The Tor project. (unknown) Tor: Overview. [Online]. Available: https://www.torproject.org/about/overview.html.en (Accessed 2017-10-05).

[61] M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas, and J. Jones, “SOCKS Protocol Version 5,” Internet Requests for Comments, RFC 1928, March 1996. [Online]. Available: http://www.ietf.org/rfc/rfc1928.txt (Accessed 2017-10-05).

[62] N. Mathewson, “Tor Rendezvous Specification - Version 3,” Tor Gitweb, Tech. Rep., September 2017. [Online]. Available: https://gitweb.torproject.org/torspec.git/plain/rend-spec-v3.txt (Accessed 2017-10-05).

[63] A. Biryukov, I. Pustogarov, and R.-P. Weinmann, “Trawling for tor hidden services: Detection, measurement, deanonymization,” in Security and Privacy (SP), 2013 IEEE Symposium on. IEEE, 2013, pp. 80–94.

[64] G. Noubir and A. Sanatinia, “Honey onions: Exposing snooping tor hsdir relays,” DEF CON, vol. 24, 2016.

[65] The Tor project. (unknown) Tor: Hidden service protocol. [Online]. Available: https://www.torproject.org/docs/hidden-services.html.en (Accessed 2017-10-05).

[66] S. Talib, N. L. Clarke, and S. M. Furnell, “An analysis of information security awareness within home and work environments,” in Availability, Reliability, and Security, 2010. ARES’10 International Conference on. IEEE, 2010, pp. 196–203.

[67] Skyhigh Networks, “Cloud adoption & risk report 2016 q4,” Campbell, CA 95008, USA, Tech. Rep., 2017. [Online]. Available: https://www.skyhighnetworks.com/cloud-report/ (Accessed 2017-10-11).

[68] Box, Inc. (2017) Box - homepage. [Online]. Available: https://www.box.com/ (Accessed 2017-10-10).

[69] ——, “Box keysafe.” [Online]. Available: https://cloud.app.box.com/v/KeySafeDatasheet (Accessed 2017-10-10).

[70] T. Shields and H. Shey, “Quick take: Use “customer-managed keys” to regain control of your data,” February 2015. [Online]. Available: https://www.box.com/security/forrester-encryption-key-management (Accessed 2017-10-10).

[71] Box, Inc., “Box: Redefining content security,” Security White Paper. [Online]. Available: https://cloud.app.box.com/v/RedefiningContentSecurity (Accessed 2017-10-10).

[72] Box Inc., “Comprehensive security at all levels.” [Online]. Available: https://www.box.com/static/download/Security_Overview_2-1.pdf (Accessed 2017-10-10).


[73] Box, Inc., “Box: Securing business information in the cloud.” [Online]. Available: https://cloud.app.box.com/v/SecurityeBook (Accessed 2017-10-10).

[74] Dropbox, Inc., “Dropbox business security,” whitepaper. [Online]. Available: https://cfl.dropboxstatic.com/static/business/resources/dfb_security_whitepaper-vfllunodj.pdf (Accessed 2017-10-11).

[75] Nextcloud GmbH. (2017) Nextcloud - homepage. [Online]. Available: https://nextcloud.com/ (Accessed 2017-10-13).

[76] ——, “End-to-end encryption design,” no. September 20, 2017. [Online]. Available: https://nextcloud.com/endtoend/ (Accessed 2017-10-13).

[77] ——. (2017) Nextcloud - homepage - security and authentication. [Online]. Available: https://nextcloud.com/secure/ (Accessed 2017-10-13).

[78] Citrix. (2017) Rightsignature homepage - electronic signature security. [Online]. Available: https://rightsignature.com/security (Accessed 2017-10-17).

[79] ——. (2017) Rightsignature homepage - legality of electronic signatures. [Online]. Available: https://rightsignature.com/legality (Accessed 2017-10-17).

[80] ——. (2017) Rightsignature homepage - are electronic signatures legally binding? [Online]. Available: https://rightsignature.com/legality/are-electronic-signatures-legally-binding (Accessed 2017-10-17).

[81] Nebulous Inc. (2017) Sia - homepage. [Online]. Available: https://sia.tech/ (Accessed 2017-10-19).

[82] Z. Herbert, D. Vorick, and L. Champine. (2017) Trello - sia public roadmap. [Online]. Available: https://trello.com/b/Io1dDyuI/sia-public-roadmap (Accessed 2017-10-19).

[83] D. Vorick and L. Champine, “Sia: Simple decentralized storage,” 2014.

[84] dinkel. (2017) Sia wiki - contracts. [Online]. Available: https://siawiki.tech/renter/contracts (Accessed 2017-10-19).

[85] Taek. (2015) Sia forum - how sia works. [Online]. Available: https://forum.sia.tech/topic/108/how-sia-works (Accessed 2017-10-19).

[86] I. S. Reed and G. Solomon, “Polynomial codes over certain finite fields,” Journal of the society for industrial and applied mathematics, vol. 8, no. 2, pp. 300–304, 1960.

[87] Storj Labs Inc. (2017) Storj - homepage. [Online]. Available: https://storj.io/ (Accessed 2017-10-21).


[88] P. Maymounkov and D. Mazieres, “Kademlia: A peer-to-peer information system based on the xor metric,” in International Workshop on Peer-to-Peer Systems. Springer, 2002, pp. 53–65.

[89] S. Wilkinson, T. Boshevski, J. Brandoff, and V. Buterin, “Storj a peer-to-peer cloud storage network,” 2016. [Online]. Available: https://storj.io/storj.pdf (Accessed 2017-09-08).

[90] D. Svensson and P. Leund, “Secures: Secure resource sharing system,” Bachelor’s Thesis, KTH Royal Institute of Technology, Brinellvägen 8, 114 28 Stockholm, Sweden, 2015.

[91] J. Callas, L. Donnerhacke, H. Finney, D. Shaw, and R. Thayer, “OpenPGP Message Format,” Internet Requests for Comments, RFC 4880, November 2007. [Online]. Available: https://tools.ietf.org/html/rfc4880 (Accessed 2017-09-08).

[92] D. Poddebniak, C. Dresen, J. Müller, F. Ising, S. Schinzel, S. Friedberger, J. Somorovsky, and J. Schwenk, “Efail: Breaking s/mime and openpgp email encryption using exfiltration channels (draft 0.9.1).” [Online]. Available: https://efail.de/efail-attack-paper.pdf (Accessed 2019-05-27).

[93] “Information technology – Security techniques – Non-repudiation – Part 1: General,” International Organization for Standardization, Geneva, CH, Standard, Jul. 2009.

[94] “Information technology – Security techniques – Non-repudiation – Part 2: Mechanisms using symmetric techniques,” International Organization for Standardization, Geneva, CH, Standard, Dec. 2010.

[95] “Information technology – Security techniques – Non-repudiation – Part 3: Mechanisms using asymmetric techniques,” International Organization for Standardization, Geneva, CH, Standard, Dec. 2009.

[96] S. Even, O. Goldreich, and A. Lempel, “A randomized protocol for signing contracts,” Communications of the ACM, vol. 28, no. 6, pp. 637–647, 1985.

[97] N. Asokan, M. Schunter, and M. Waidner, “Optimistic protocols for fair exchange,” in Proceedings of the 4th ACM conference on Computer and communications security. ACM, 1997, pp. 7–17.

[98] B. Pfitzmann, M. Schunter, and M. Waidner, “Provably secure certified mail,” 2000.

[99] R. Matzutt, J. Hiller, M. Henze, J. H. Ziegeldorf, D. Müllmann, O. Hohlfeld, and K. Wehrle, “A quantitative analysis of the impact of arbitrary blockchain content on bitcoin,” in Proceedings of the 22nd International Conference on Financial Cryptography and Data Security (FC). Springer, 2018. [Online]. Available: https://fc18.ifca.ai/preproceedings/6.pdf (Accessed 2018-06-05).


Appendix A

Declaration of independence

I hereby certify that I have written this thesis independently and have only used the specified sources and resources indicated in the bibliography.

Menlo Park, CA, 27th August 2018

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Jens Röwekamp


Appendix B

Installation and usage guide

Installation

This section shows the installation of a docShare instance based on a fresh 64-bit Ubuntu 17.10 Desktop installation.¹

First, extract the contents of the Code.tar archive of appendix C to the Desktop.

Second, install the dependencies Geth, Tor and the required Python3 libraries:

Listing B.1: Software dependency installation.

    sudo apt-get install software-properties-common
    sudo add-apt-repository -y ppa:ethereum/ethereum
    sudo apt-get update
    sudo apt-get install ethereum tor python3-pip git
    cd ~/Desktop/docShare/bin/
    pip3 install -r requirements.txt

Third, configure the Tor daemon in /etc/tor/torrc to forward ports 2342, 8888, and 9999 to the Tor network as Tor hidden services.

Listing B.2: Tor hidden service configuration for docShare.

    HiddenServiceDir /var/lib/tor/docShare/
    HiddenServicePort 2342 127.0.0.1:2342
    HiddenServicePort 8888 127.0.0.1:8888
    HiddenServicePort 9999 127.0.0.1:9999

Afterwards restart the Tor daemon to apply the new configuration.
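The hidden service configuration from Listing B.2 can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming the configuration text has already been read from /etc/tor/torrc; the function name `hidden_service_ports` and the embedded sample are illustrations and not part of docShare.

```python
from typing import Dict, List


def hidden_service_ports(torrc_text: str) -> Dict[str, List[int]]:
    """Map each HiddenServiceDir to the virtual ports declared after it."""
    services: Dict[str, List[int]] = {}
    current = None
    for raw in torrc_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if parts[0] == "HiddenServiceDir" and len(parts) >= 2:
            current = parts[1]
            services[current] = []
        elif parts[0] == "HiddenServicePort" and current is not None and len(parts) >= 2:
            services[current].append(int(parts[1]))
    return services


# Sample text mirroring Listing B.2.
SAMPLE = """\
HiddenServiceDir /var/lib/tor/docShare/
HiddenServicePort 2342 127.0.0.1:2342
HiddenServicePort 8888 127.0.0.1:8888
HiddenServicePort 9999 127.0.0.1:9999
"""

ports = hidden_service_ports(SAMPLE)
# Ports required by this guide that are not declared for the docShare service.
missing = {2342, 8888, 9999} - set(ports.get("/var/lib/tor/docShare/", []))
```

An empty `missing` set indicates that all three ports named in this guide are declared for the docShare hidden service.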

Fourth, initialize the Ethereum testnet and determine your enode id for the testnet mashup.

¹ Used installation medium: ubuntu-17.10.1-desktop-amd64.iso


Listing B.3: Ethereum testnet initialization.

    cd ~/Desktop/Ethereum\ Testnet/
    ./start.sh init
    ./connect.sh
    admin.nodeInfo
    exit
    ./stop.sh

Afterwards, configure the Ethereum testnet by setting the obtained enode information and IP information in ~/Desktop/Ethereum Testnet/start.sh for every node you want to communicate with.

Now you can start the Ethereum testnet again, create your wallet information and mine Ether.
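The enode value reported by admin.nodeInfo usually has to be combined with the node's reachable IP address before it can be shared with other nodes. The following Python sketch shows one way to do that. It assumes the common enode URL form `enode://<128 hex characters>@<host>:<port>`; the helper name `with_public_ip`, the node ID, and the IP address are placeholders for illustration.

```python
import re

# enode://<node id: 128 hex characters>@<host>:<port>[?query]
ENODE_RE = re.compile(
    r"^enode://([0-9a-fA-F]{128})@([^:/?]+|\[[^\]]*\]):(\d+)(\?.*)?$"
)


def with_public_ip(enode: str, ip: str) -> str:
    """Return the enode URL with its host part replaced by `ip`."""
    match = ENODE_RE.match(enode)
    if match is None:
        raise ValueError("not a well-formed enode URL")
    node_id, _host, port, query = match.groups()
    return f"enode://{node_id}@{ip}:{port}{query or ''}"


# Hypothetical node ID and a documentation-range IP address.
reported = "enode://" + "ab" * 64 + "@[::]:30303"
peer_url = with_public_ip(reported, "192.0.2.10")
```

The resulting `peer_url` is the kind of value that would be entered in start.sh for each peer.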

Listing B.4: Ethereum testnet account initialization.

    cd ~/Desktop/Ethereum\ Testnet/
    ./start.sh
    ./connect.sh
    admin.peers
    personal.newAccount()
    miner.start()
    exit

Fifth, initialize the smart contracts in the Ethereum testnet using the web service at remix.ethereum.org. Open Firefox, browse to http://remix.ethereum.org, and use the open icon to add a local file. Select ~/Desktop/Smart Contracts/IMS.sol and open it from the browser. Afterwards switch to the run tab on the right and select web3 provider as environment. Confirm http://localhost:8545 to interact with the local Ethereum testnet. Before you can deploy the smart contract you have to unlock the account. To do so, use connect.sh to connect to Geth and type personal.unlockAccount(eth.accounts[0]). Now you can hit the deploy button in the Remix IDE to deploy the contract. Save the contract address and repeat the procedure for ~/Desktop/Smart Contracts/KV.sol to deploy the smart contract for the key-value store.

Once both smart contracts are published you have to adapt the scripts ~/Desktop/docShare/bin/docShare.py, ~/Desktop/Contract Webfrontends/js/kv.js, and ~/Desktop/Contract Webfrontends/js/app.js to use the right contract addresses.
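Because the two contract addresses are copied by hand into three scripts, a quick format check before pasting can catch truncated values. The following Python sketch is an assumption-laden illustration: the helper names and the sample addresses are hypothetical, and the check covers only the textual shape of an address (it does not verify a checksum or that a contract exists at the address).

```python
import re

# An Ethereum address is written as "0x" followed by 40 hexadecimal digits.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def is_contract_address(value: str) -> bool:
    """True if `value` has the textual shape of an Ethereum address."""
    return ADDRESS_RE.fullmatch(value.strip()) is not None


def check_addresses(addresses: dict) -> list:
    """Return the names whose address is malformed."""
    return [name for name, value in addresses.items()
            if not is_contract_address(value)]


# Hypothetical addresses as they might be copied from the Remix IDE.
deployed = {
    "IMS": "0x" + "1a" * 20,
    "KV": "0x" + "2b" * 19,  # one byte too short, e.g. a truncated paste
}
malformed = check_addresses(deployed)
```

Any name listed in `malformed` should be copied again from the IDE before the scripts are edited.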

Sixth, initialize the docShare database and create your asymmetric key pair.

Listing B.5: docShare initialization.

    cd ~/Desktop/docShare/bin
    ./initialize.py

Seventh, start the web-frontend to create a new identity in the identity management system.

Listing B.6: Web-frontend start and identity registration in IMS.

    cd ~/Desktop/Contract\ Webfrontends/
    ./start.sh
    cd ~/Desktop/Ethereum\ Testnet/
    ./connect.sh
    personal.unlockAccount(eth.accounts[0])
    exit

Afterwards, navigate in Firefox to http://localhost:8080 and add your contact details for the IMS. Please ensure that the hidden service address equals the hidden service address in /var/lib/tor/docShare/hostname and the RSA public key equals the public key in ~/Desktop/docShare/public.pem.

Finally, feel free to set your assigned IMS ID as ownId in ~/Desktop/docShare/lib/docShare.py.
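The comparison asked for above is easy to get wrong by eye, since a pasted key may differ from the file only in line breaks. A minimal Python sketch of a whitespace-insensitive comparison follows; the function names are illustrative and not part of docShare, and reading /var/lib/tor/docShare/hostname may require elevated privileges.

```python
from pathlib import Path


def same_value(entered: str, stored: str) -> bool:
    """Compare two values while ignoring all whitespace and line breaks."""
    return "".join(entered.split()) == "".join(stored.split())


def matches_file(entered: str, path: Path) -> bool:
    """True if the value typed into the web form equals the file content."""
    return same_value(entered, path.read_text())


# Intended use (paths as given in this guide; the variables are placeholders):
#   matches_file(onion_address, Path("/var/lib/tor/docShare/hostname"))
#   matches_file(public_key_pem, Path("~/Desktop/docShare/public.pem").expanduser())
```

The comparison is deliberately case-sensitive, because the Base64 body of a PEM key distinguishes upper and lower case.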

Now you can start docShare via:

Listing B.7: docShare start.

    cd ~/Desktop/docShare/
    ./start.sh

Usage

User registration

Listing B.8: Actions on docShare1

    cd ~/Desktop/docShare/bin
    ./add_partner.py 2 docShare2 blub blab
    ./show_partner.py

Listing B.9: Actions on docShare2

    cd ~/Desktop/docShare/bin
    ./add_partner.py 0 docShare1 blab blub
    ./show_partner.py

Document sharing with regards to private

Listing B.10: Actions on docShare1

    cd ~/Desktop/docShare/bin
    ./add_share.py test_msg.py test_msg.py docShare2
    ./show_shares.py

Listing B.11: Actions on docShare2

    cd ~/Desktop/docShare/bin
    ./show_received.py
    ./download_received.py 1

Document sharing with regards to tracking

Listing B.12: Actions on docShare1

    cd ~/Desktop/docShare/bin
    ./add_confirmed_share.py initialize.py "A Python script that initializes docShare's database and keys" 1 docShare2
    ./show_shares.py

Listing B.13: Actions on docShare2

    cd ~/Desktop/docShare/bin
    ./show_received.py
    ./download_received.py 2

Appendix C

Digital Content

data medium here

SHA-256: aeb41948850f541aec9686c861ee5c3204950901cd0d579caf652f09c293007c  hashsums.txt
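The digest above can be checked with Python's standard library. The sketch below assumes that the digest refers to the file hashsums.txt on the data medium, which the line suggests but does not state explicitly; the function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator

# Digest as printed in this appendix.
EXPECTED_SHA256 = "aeb41948850f541aec9686c861ee5c3204950901cd0d579caf652f09c293007c"


def read_chunks(path: Path, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield the file content in fixed-size chunks to bound memory use."""
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hex-encoded SHA-256 digest of the concatenated chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_matches(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA-256 digest equals the expected value."""
    return sha256_hex(read_chunks(path)) == expected.lower()
```

Calling `file_matches(Path("hashsums.txt"))` from the directory that contains the file would then report whether the content matches the printed digest.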
