NDN DeLorean: An Authentication System for Data Archives in Named Data Networking Yingdi Yu (UCLA), Alexander Afanasyev (Florida International University), Jan Seedorf (HFT Stuttgart), Zhiyi Zhang (UCLA), Lixia Zhang (UCLA), ACM Information Centric Networking Conference, September 27, 2017, Berlin, Germany


NDN and Data-Centric Security

• In NDN you sign the data with a digital signature...
• ...so the users can check if they get the right data

• Data secured both in motion and at rest

[Figure: chain of “Signed by” links among named packets: data /USAToday/Headline/2015/10/22/html/_chunk=2, key /USAToday//Editor/Section/KEY, and KeyLocator fields /USAToday/Author/CompuFax/KEY and /USAToday/Editor-in-chief/KEY]

Mismatch Between Data and Signature Lifetimes

• Data lifetime can be significantly longer than its signature’s lifetime
  • Parent certificate expiration or compromise, key compromise, crypto algorithm compromise, …

• Periodic re-signing is unlikely to be feasible
  • Does not scale in the long term
  • Data may outlive its producer

• Need “look back” data authentication
  • Check signature validity at the time of data production

[Figure: timeline marking three events: data is produced, signature expires, data is retrieved]

Look Back Data Authentication

• Need a certified timestamp for the past time point
  • Trusted service
  • Hash-chain or block-chain

• DeLorean: Multi-level Merkle-tree based hash chain

[Figure: Producer and Consumer connected through a Timestamp/Bookkeeping Service, captioned “Security through publicity”]

DeLorean Workflow Overview

[Figure: workflow among Publishers, the DeLorean Service (with Storage), Consumers, and Auditors]

• Publishers: request proofs of signature existence
• DeLorean Service: aggregate requests and publish “Chronicle Volumes” for rolling time periods
• Consumers: retrieve data and part of volume chronicle as a proof; verify proof and signature
• Auditors: audit published volumes
• Example timestamps shown: 2015-10-22 10:40am, 2015-10-22 10:50am, 2016-05-10 3:30pm

DeLorean Chronicle Tree Construction

• Chronicle
  • K-ary Merkle tree
  • Root hash (chronicle digest) fixes the state of the Chronicle
  • Each new volume updates nodes along the path to root
  • May create new root and intermediate nodes

• Existence verification:
  • O(log_k m)

• Consistency verification:
  • O(log_k m)

• Add/check volume to/in a 20-year-old 32-ary Chronicle with 10-min wide Volumes
  • 4 hash computations
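The “4 hash computations” figure can be sanity-checked with a little arithmetic. The sketch below assumes a full k-ary tree in which a leaf-to-root path costs one hash per level, and a 365-day year; the function names are illustrative and not part of DeLorean.

```python
def levels_needed(leaves: int, arity: int) -> int:
    """Smallest number of levels h such that arity**h >= leaves."""
    if leaves < 1 or arity < 2:
        raise ValueError("need leaves >= 1 and arity >= 2")
    levels, capacity = 0, 1
    while capacity < leaves:
        capacity *= arity
        levels += 1
    return levels


def years_covered(levels: int, arity: int = 32, volume_minutes: int = 10) -> float:
    """Years of volumes that fit under a root `levels` levels above the leaves."""
    minutes_per_year = 60 * 24 * 365  # assumption: 365-day year
    return arity ** levels * volume_minutes / minutes_per_year


if __name__ == "__main__":
    # Four levels of a 32-ary tree hold 32**4 = 1,048,576 ten-minute volumes.
    print(f"4 levels cover about {years_covered(4):.2f} years")
    print(f"1,000,000 leaves need {levels_needed(1_000_000, 32)} levels")
```

Under these assumptions a four-level 32-ary tree covers just under 20 years of ten-minute volumes, and the same four levels are enough for a volume of about one million signatures.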

[Figure: chronicle tree growing over time slots t0 to t5 as volumes v0 to v4 are added, with intermediate nodes c1,0, c1,1, c’1,2, c2,0, c’2,0, c’2,1 and c’3,0; successive roots are labeled “Chronicle digest”]

DeLorean Chronicle Volume Construction

• Volume
  • K-ary Merkle tree
  • Root hash (volume digest) fixes the state of the time-specific volume
  • Each new signature updates nodes along the path to root
  • May create new root and intermediate nodes

• Existence verification:
  • O(log_k m)

• Consistency verification:
  • O(log_k m)

• Add/check signature to/in a Volume with > 1,000,000 signatures
  • 4 hash computations
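As a rough illustration of how a volume digest could be derived from the collected signatures, the sketch below builds a k-ary Merkle root bottom-up. SHA-256, plain concatenation of child digests, and hashing a partially filled node over only its existing children are assumptions made for the example; the slides do not specify these details, and `merkle_root` is a hypothetical name.

```python
import hashlib


def merkle_root(leaves: list[bytes], arity: int = 32) -> bytes:
    """Return the root digest of a k-ary Merkle tree over `leaves`."""
    if not leaves:
        raise ValueError("a tree needs at least one leaf")
    if arity < 2:
        raise ValueError("arity must be at least 2")
    # Level 0: hash every leaf (here, the raw signature bytes).
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    # Repeatedly group up to `arity` digests under one parent until one remains.
    while len(level) > 1:
        level = [
            hashlib.sha256(b"".join(level[i:i + arity])).digest()
            for i in range(0, len(level), arity)
        ]
    return level[0]
```

The sketch recomputes the whole tree for clarity; an incremental implementation would only rehash the nodes on the path from the new leaf to the root, which is where the one-hash-per-level cost comes from.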

[Figure: chronicle tree over time slots t0 to t5 and volumes v0 to v4, with root c’3,0 labeled “Chronicle digest”, shown together with a volume tree over signatures s0, s1, s2, with nodes v1,0, v’1,0, v’’1,0, v’1,1 and root v’2,0 labeled “Volume digest”]

Proof of Existence

• Existence of volume v4 in the Chronicle at time t4
  • { c’1,2, c’2,1, c’3,0 }

• Existence of signature s2 in volume v4 (at time t4)
  • { v’1,1, c’2,0,}

• Each node of chronicle and volume tree is published as NDN data packet
  • “Complete” nodes
    • Final version: all children are present and fixed
  • “Incomplete” nodes
    • Transient state
    • The latest transient node “fixes” all previous nodes


Proof of Consistency

• Periodic retrieval of current chronicle root
  • c2,0, c’3,0, c’’3,0, …

• Check if the “newer” root
  • Incorporates the old root (if old root complete)
  • References the same subset of children (if incomplete)

• Easy to catch misbehavior

• Trivial networking and storage burden on auditors
  • Only root node needs to be stored
  • Usually one node retrieval

[Figure: two successive snapshots of the chronicle tree over time slots t0 to t5: an earlier one over volumes v0 to v3 with nodes c1,0, c1,1, c’2,0 and c2,0, and a later one that adds v4 with nodes c’1,2, c’2,1 and root c’3,0]

Name: /DeLorean/_CHRONICLE/complete/2,1/abc1e3..
Content: a2ed8b.. 7ac9dd.. 757be1.. 1b595f.. ... (32 children hashes)
Signature: ...

NDN Data Packet for DeLorean Nodes (Chronicle)

• Naming convention
  • Uniquely identify a node in a particular state of Chronicle and/or Volume tree

• Given a time point, the name of any node is determined

[Figure: 32-ary chronicle tree with nodes labeled layer,index: root 3,0; layer 2 nodes 2,0, 2,1, 2,2; layer 1 nodes 1,0 ... 1,64; leaf indices 0, 1, ..., 32, ..., 2048, 2049]

/[service-prefix]/_CHRONICLE/[NODE-STATE]/[layer],[index]/[hash]

/DeLorean/_CHRONICLE/incomplete=2050/1,64/1ffa1

/DeLorean/_CHRONICLE/incomplete=2050/2,2/abc1e3

/DeLorean/_CHRONICLE/complete/2,0/a2ed8b

Name: /DeLorean/_CHRONICLE/incomplete=2050/3,0/7ac9dd..
Content: a2ed8b.. abc1e3.. abc1e3.. (3 children hashes)
Signature: ...

NDN DeLorean ICN ’17, September 26–28, 2017, Berlin, Germany

Figure 5: Two-level hierarchy of the timestamp service

(and all other children digests) was produced by the DeLorean service; while the digests provide assurances of the Chronicle and Volume trees consistency.

5.2 Volume Construction
Each DeLorean volume is a set of signatures (their cryptographic hashes) that were witnessed during the corresponding time period. During the current time period, the DeLorean service collects the signatures from the publishers by adding them as leaves to the volume tree. As soon as the time period is over, the DeLorean adds the latest version of the volume digest to the chronicle tree, effectively “sealing” the volume from further modifications. After this, the publishers can contact the service and retrieve DeLorean proofs of signature existence during the volume’s time period, which can be used to reliably roll-back clocks for future validation of their data.

5.3 Proof of Existence
In order to prove the existence of a particular leaf node in a Merkle tree, one needs to be able to reconstruct a part of the tree along the path from the leaf to the root. For example, given the current state of the chronicle represented by its digest $c'_{3,0}$, to check that the volume with digest $v_1$ exists in the Chronicle tree (Figure 5), the path $v_1 \to c_{1,0} \to c_{2,0} \to c'_{3,0}$ needs to be reconstructed: $c^{*}_{1,0} = \mathrm{hash}(v_0, v_1)$, $c^{*}_{2,0} = \mathrm{hash}(c^{*}_{1,0}, c_{1,1})$, $c^{*}_{3,0} = \mathrm{hash}(c^{*}_{2,0}, c'_{2,1})$. After that, equality between $c^{*}_{3,0}$ and $c'_{3,0}$ serves as a proof of chronicle consistency and that $v_1$ is part of it. Similarly, one can prove that signature $s_1$ exists in the Volume tree using the equality between reconstructed volume tree root $v^{*}_{2,0}$ and a known volume digest $v_1$.

The computational cost to prove the existence of a single volume is $O(\log V)$ hash computations, where $V$ is the total number of volumes in the chronicle; and the computational cost to verify the existence of a signature in the chronicle is $O(\log V + \log S)$, where $S$ is the (average) number of signatures witnessed per time period. The arity of the Merkle tree used for Chronicle and Volume trees will determine the base of the logarithm in the above equations.

Given a volume tree, a consumer can quickly locate a record according to the local record index. Based on the local index, a consumer can compute a verification path from the record back to the volume tree root and verify the existence of the record in a volume, same way as verifying the volume existence in chronicle.

In summary, to assure that a specific signature was witnessed by the DeLorean service at a specific time interval, the consumer needs to have: (1) volume digest, (2) volume index, and (3) index of the signature within the volume. The first two are used to reconstruct the relevant portions of the chronicle tree; and the last one along with the fingerprint of the data signature (obtained from data directly) can reconstruct and verify consistency of the volume tree.
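The path reconstruction described in this section can be sketched as follows. This is an illustrative sketch, not the authors' implementation: SHA-256 over concatenated child digests is an assumed hash construction, and `node_hash`, `reconstruct_root` and `proves_existence` are hypothetical names. Each path entry gives the position of the current node among its parent's children and the digests of its siblings in order.

```python
import hashlib


def node_hash(*children: bytes) -> bytes:
    """Digest of a parent node (assumed: SHA-256 over concatenated children)."""
    return hashlib.sha256(b"".join(children)).digest()


def reconstruct_root(leaf: bytes, path: list[tuple[int, list[bytes]]]) -> bytes:
    """Rebuild the root digest from a leaf digest and its per-level siblings."""
    node = leaf
    for position, siblings in path:
        # Re-insert the current node at its position among the siblings.
        children = siblings[:position] + [node] + siblings[position:]
        node = node_hash(*children)
    return node


def proves_existence(leaf: bytes, path: list[tuple[int, list[bytes]]],
                     known_root: bytes) -> bool:
    """True if the reconstructed root equals the published digest."""
    return reconstruct_root(leaf, path) == known_root


if __name__ == "__main__":
    # Binary example mirroring the text: v1 -> c1,0 -> c2,0 -> c'3,0.
    v0, v1 = node_hash(b"volume 0"), node_hash(b"volume 1")
    c11, c21 = node_hash(b"c1,1"), node_hash(b"c'2,1")  # stand-in sibling digests
    c30 = node_hash(node_hash(node_hash(v0, v1), c11), c21)
    path_v1 = [(1, [v0]), (0, [c11]), (0, [c21])]
    print(proves_existence(v1, path_v1, c30))
```

The verifier needs only the sibling digests along one path, so the work grows with the height of the tree rather than with the number of leaves.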

5.4 Publishing Node States
In order to successfully prove the existence of any volume or signature in the chronicle, one needs to obtain sibling nodes in the chronicle and volume trees. To allow that, DeLorean leverages the power of NDN to publish each such node as an immutable, named, and cryptographically signed NDN data packet. We use the following naming convention for these data packets (Figure 7): “/<prefix>/<TREE-TYPE>/<NODE-STATE>/<NODE-INDEX>/<DIGEST-VALUE>”, where

• “<prefix>” is a prefix to identify instance of the DeLorean chronicle service, e.g., if there is only one service in the world, this could be just “/DeLorean”;

• “<TREE-TYPE>” is an identification of the tree type and can be either “_CHRONICLE” or “_VOLUME-<ID>” for the chronicle or a specific volume tree. “<ID>” in the volume tree name is a sequence number of the volume, since the beginning of DeLorean service instance;

• “<NODE-STATE>” is the state of the Merkle tree node, which is either “complete” when a node has the full set of descendents or “incomplete=<ID>” when one or more descendents do not yet exist. Recall that until a subtree of the Merkle tree is complete, the root digest of this tree changes whenever a new leaf node is added. “<ID>” in the incomplete tree is used to disambiguate the name for different incomplete states.

The naming of the incomplete state nodes is designed to simplify retrieval of the latest state, is consistent across all incomplete nodes at a given time, and is directly related to the number of nodes in the tree. In other words, at any given point of time, the current state of any node in the chronicle tree is determined by the current number of volumes. For a 32-ary chronicle tree with the largest volume sequence number $s$, for the node with index $i$ at the level $l$ (both $i$ and $l$ start from 0):

$$\text{“<NODE-STATE>”} = \begin{cases} \text{“complete”}, & \text{if } s \ge 32^{l} \times (i + 1) \\ \text{“incomplete-(s+1)”}, & \text{otherwise} \end{cases}$$

For example in Figure 6, all incomplete nodes are published using the “incomplete-2050” state.

• “<NODE-INDEX>” is a tuple of the level of the node in the Merkle tree (0 for leaf nodes, up to $\lceil \log_k N \rceil$ for parent nodes, where $N$ is the number of leaf nodes and $k$ arity of the Merkle tree) and the index of the node at the specified level (from 0 to $k$), such as “2,1”.

Given the sequence number $s$ of the leaf node of interest, and the desired level $l$ of the intermediate node, its index is $(l, \lfloor s \times k^{-l} \rfloor)$.
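The naming rules above can be turned into a small helper that derives a node's name from the current largest sequence number. This is a sketch under stated assumptions: it applies the completeness rule exactly as written in the text (a node at level l with index i is complete once s is at least 32^l × (i + 1)), uses the “incomplete=<ID>” spelling from the naming convention, and the function names are hypothetical.

```python
def node_index(s: int, level: int, arity: int = 32) -> tuple[int, int]:
    """(level, index) of the ancestor at `level` of the leaf with sequence number s."""
    return level, s // arity ** level


def node_state(s: int, level: int, index: int, arity: int = 32) -> str:
    """State component of the name, following the rule stated in the text."""
    if s >= arity ** level * (index + 1):
        return "complete"
    return f"incomplete={s + 1}"


def node_name(prefix: str, tree_type: str, s: int, level: int, index: int,
              digest_hex: str, arity: int = 32) -> str:
    """Assemble /<prefix>/<TREE-TYPE>/<NODE-STATE>/<NODE-INDEX>/<DIGEST-VALUE>."""
    state = node_state(s, level, index, arity)
    return f"{prefix}/{tree_type}/{state}/{level},{index}/{digest_hex}"


if __name__ == "__main__":
    s = 2049  # largest volume sequence number in the example tree
    level, index = node_index(s, 1)
    print(node_name("/DeLorean", "_CHRONICLE", s, level, index, "1ffa1"))
    print(node_name("/DeLorean", "_CHRONICLE", s, 2, 0, "a2ed8b"))
```

With s = 2049 this reproduces the example names shown on the slide, such as the incomplete node 1,64 and the complete node 2,0. The digest strings are the placeholder values from the slide, not real hashes.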

Signed by the DeLorean service for provenance only

NDN Data Packet for DeLorean Nodes (Volume Trees)

• Naming convention
  • Uniquely identify a node in a particular state of Chronicle and/or Volume tree


/[service-prefix]/_VOLUME-[time-index]/[NODE-STATE]/[layer],[index]/[hash]


Name: /DeLorean/_VOLUME-5/complete/2,1/abc1e3..
Content: a2ed8b.. 7ac9dd.. 757be1.. 1b595f.. ... (32 children hashes)
Signature: ...

Name: /DeLorean/_VOLUME-5/incomplete=2050/3,0/7ac9dd..
Content: a2ed8b.. abc1e3.. abc1e3.. (3 children hashes)
Signature: ...

Signed by the DeLorean service for provenance only

Node Retrieval

• Nodes at higher layers are frequently retrieved and benefit from caching
  • Complete nodes never change
  • Most higher-level incomplete nodes don’t change often

• Can be replicated anywhere in the network

Public Audit with Merkle Tree

• All the users can verify consistency of the timestamp service
  • The more users, the more secure and (publicly) reliable
  • Each published volume needs to be checked around the time it is published to ensure timestamp trust

• Difficult to create double history
  • NDN interest does not carry sender address
  • Interest may not reach timestamp service (satisfied by cache)

[Figure: an Interest for /DeLorean/_CHRONICLE/incomplete=2050/1,64/1ffa1, annotated “From whom?”]

Evaluation: Overview

• Analytical evaluations
  • Necessary storage capabilities at the DeLorean service provider
  • Verification cost at users
  • Needed number of auditors and audits per auditor

• Keep in mind …
  • Not every signature goes to DeLorean
  • For large volume archives, DeLorean needs to track only the signature of a manifest

• Real-world example: newspaper archive in public libraries
  • (based on data from statistica.com)
  • 700,000 pieces of newspaper content published per day
  • On average 486 pieces of content published per minute

Evaluation: DeLorean Service Storage Requirements

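[Figure: bar chart of yearly storage requirement (GB/year) against signatures per minute, for ∆ = 10 minutes]

• 500 signatures per minute: 12 GB
• 2,500 signatures per minute: 59.3 GB
• 5,000 signatures per minute: 119 GB
• 7,500 signatures per minute: 178 GB
• 10,000 signatures per minute: 237 GB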

Yearly storage requirement is linear to the number of witnessed signatures

Yearly storage requirements at DeLorean service provider depending on signatures per minute

(Arity of Merkle tree: 32; duration of a timeslot within a volume: 10 minutes)

Amount of data a user would need to retrieve to verify the existence of a signature at a certain point in the past:

• For 32-ary trees and volume timeslot 10 minutes
  • 1500-byte data packet retrieval for the first 20 years
  • 4 x 1500-byte data packet retrieval for years 20-600

→ Verification costs clearly negligible!

Evaluation: Required Number of Auditors

• Decentralized auditing
  • Evaluation: probability that there is a volume that has not been verified by at least one auditor around the time the volume has been finalized

[Figure: probability (0% to 100%) against the number of auditors per epoch, |A| (1,000 to 10,000), with an inset for 0 to 750 auditors. Legend: Epoch, days (period during which each auditor fetches the chronicle at least once): 1, 3, 7; ∆, minutes (timeslot for volume creation): 10, 60]

Discussion and Future Work

• Incentives to audit
  • Users
  • Providers
  • Competitors

• Recovery from inconsistency
  • “Bulletin boards” to post detected problems
    • Auditors cannot forge bad reports because of NDN signatures

• Transition to new provider(s)

• Relation to real-time data production
  • Produce now, get proof of existence later

Summary

• With data-centric security, data lifetime can be longer than its signing key’s validity period

• DeLorean provides a publicly verifiable (“trust through transparency”) bookkeeping service to enable “look back” validation of long-lived data
  • Collects signatures (signature digests) from producers
  • Publishes volumes of signatures collected within corresponding time periods
  • Efficient (storage, update, and lookup) Merkle-tree structure for signature volumes and chronicle of volumes

• Opportunities
  • Decouple the lifetime of data and signature
  • Make short-lived keys feasible