Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt
ICDE 2012Mai 30, 2012
Introducing Access Control in Webdamlog
Serge AbiteboulINRIA Saclay & ENS Cachan
Abiteboul – DBPL - 2013
2
•The Web as a distributed knowledge base
•Webdamlog: a rule-based language for the Web
•Access control in Webdamlog
•The Webdamlog system
•Conclusion
Abiteboul – DBPL - 2013
3
A typical Web user’s data•What kinds of data?
- data: photos, music, movies, reports, email
- metadata: photo taken by Alice in Paris on ...
- ontologies: Alice’s ontology and mapping with other ontologies
- localization: Alice’s pictures are on Picasa, back-ups are at INRIA
- security: Facebook credentials (Alice, 123456)
- annotations: Alice likes Elvis’ website
- beliefs: Alice believes Elvis is alive
- external knowledge: Bob keeps copies of Alice’s pictures
- time, provenance, ...
all kinds
Social data
Abiteboul – DBPL - 2013
4
A typical Web user’s data•What kinds of data?
•Where is the data?
- laptop, desktop, smartphone, tablet, car computer
- mail, address book, agenda
- Facebook, LinkedIn, Picasa, YouTube, Tweeter
- svn, Google docs
- also access to data / information of family, friends, companies associations
all kinds
everywhere
Abiteboul – DBPL - 2013
5
A typical Web user’s data•What kinds of data?
•Where is the data?
all kinds
everywhere•What kind of organization?
- terminology: different ontologies
- systems: personal machines, social networks
- distribution: different localization
- security: different protocols
- quality: incomplete / inconsistent information
heterogeneous
Abiteboul – DBPL - 2013
6
Example of processingAlice and Bob are getting engaged. Their friends want to offer them an album of photos where they are together
To make such a photo album
• Find friends of Alice & Bob (say with Facebook)
• for each friend, find where she keeps her photos (say, Picassa)
• find the means to access her photos possibly via friends
• find the photos that feature Bob and Alice together, e.g., using tags or face recognition software
• possibly ask someone to verify the results
Some reasoning is needed to execute these tasks automatically!
Abiteboul – DBPL - 2013
7
A typical Web user
• Overwhelmed by the mass of information
• Cannot find the information needed
• Is not aware of important events
• Cannot manage/control how others access and use his/her own data
Abiteboul – DBPL - 2013
8
YOU need help!
How can systems help?
• We need to move from a Web of text to a Web of knowledge
- In the spirit of semantic Web
• To better support user needs,
- Systems need to analyze what is happening and construct knowledge
- Systems should exchange knowledge
- Systems should reason and infer knowledge
Abiteboul – DBPL - 2013
9
Thesis
All this forms a distributed knowledge base
with processing based on automated reasoning
Abiteboul – DBPL - 2013
10
Our topic
•Distributed reasoning
Exchanging facts and rules Webdamlog
• Access control with access control
Abiteboul – DBPL - 2013
11
•The Web as a distributed knowledge base
•Webdamlog: a rule-based language for the Web
•Access control in Webdamlog
•The Webdamlog system
•Conclusion
Abiteboul – DBPL - 2013
12
Webdamlog: a datalog-style language
Datalog
A prehistoric language by Web time...
+ nice and compact syntax
+ well-studied with many extensions
+ recursion essential: network cycles
Webdamlog
Not as simple/beautiful & procedural
Needed for real Web applications!
Webdamlog is not datalog
Abiteboul – DBPL - 2013
13
Webdamlog: an extension of datalog
Datalog program fof(x,y) :- friend(x,y)fof(x,y) :- friend(x,z), fof(z,y)
Extensional facts (stored in the database)friend(“peter”,”paul”) friend(“paul”, “mary”) friend(“mary”,”sue”)
Intentional facts (derived)fof(“peter”,”paul”) fof(“peter”,”mary”) fof(“peter”, “sue”)fof(“paul”, “mary”) fof(“paul”, “sue”) fof(“mary”,”sue”)
Abiteboul – DBPL - 2013
14
Webdamlog: an extension of datalogExtends datalog
• negation, updates, distribution, delegation, time
For a world that is
• distributed: autonomous and asynchronous peers
• dynamic: knowledge evolves; peers come and go
Influenced by
• Active XML (INRIA) - for distribution & intentional data
• Dedalus (UC Berkeley) - for time & implementation
Abiteboul – DBPL - 2013
15
FactsFacts are of the form m@p(a1, ..., an), where
m is a relation name & p is a peer name
a1, ..., an are data values (n is the arity of m@p)
the set of data values includes the relations and peer names
Examples
friend@my-iphone(“peter”, “paul”) extensional
fof@my-iphone(“adam”, “paul”) intentional
Abiteboul – DBPL - 2013
16
Examples of facts
data & metadata: pictures@alice-iphone(1771.jpg, “Paris”, 11/11/2011)
ontology: [email protected]("Elvis”, theKing)
annotations: [email protected](“wikipedia.org”, encyclopedia)
localization: where@alice(pictures, picasa/alice)
access rights: right@picasa(pictures, friends, read)
security: secret@picasa/alice; public@picasa/alice
Abiteboul – DBPL - 2013
17
Rules
Rules are of the form
$R@$P($U) :- (not) $R1@$P1($U1), ..., (not) $Rn@$Pn($Un)
where
$R, $Ri are relation terms
$P, $Pi are peer terms
$U, $Ui are tuples of terms
Safety condition
$R and $P must appear positively bound in the body
each variable in a negative literal must appear positively bound in the body
A term is a variable
or a constant
Examples coming up, stay tuned
Abiteboul – DBPL - 2013
18
State transition
Choose some peer p randomly – asynchronously
Compute the transition of p
the database updates at p
the messages sent to other peers
the delegations of rules to other peers
Keep going forever
(I0, Γ0, ∅) ➝ (I1, Γ1, Γ1*) ➝... ➝ (In, Γn, Γn
*) ➝...
Fair sequence: each peer is selected infinitely often
Abiteboul – DBPL - 2013
19
The semantics of rulesClassification based on locality and nature of head predicates (intentional or extensional)
• Local rule at my-laptop: all predicates in the body of the rules are from my-laptop
Local with local intentional head classic datalog
Local with local extensional head database update
Local with non-local extensional head messaging between peers
Local with non-local intentional head view delegation
Non-local general delegation
Abiteboul – DBPL - 2013
20
Local rules with local intentional head
Example: Rule at peer my-laptop
friend is extensional, fof is intentional
fof@my-iphone($x, $y) :- friend@my-iphone($x,$y)
fof@my-iphone($x,$y) :- friend@my-iphone($x,$z), fof@my-iphone($z,$y)
fof is the transitive closure of friend
Datalog = Webdamlog with only local rules and local intentional head
Abiteboul – DBPL - 2013
21
Local rules with local extensional head
A new fact is inserted into the local database
believe@my-iphone(“Alice”, $loc) :-
tell@my-iphone($p,”Alice”, $loc),
friend@my-iphone($p)
Abiteboul – DBPL - 2013
22
Local rules with non-local extensional head
A new fact is sent to an external peer via a message
$message@$peer($name, “Happy birthday!”) :-
today@my-iphone($date),
birthday@my-iphone($name, $message, $peer, $date)
Extensional facts:
today@my-iphone(March 6)
birthday@my-iphone("Manon”, “sendmail”, “gmail.com”, March 6)
[email protected]("Manon”, “Happy birthday”)
Abiteboul – DBPL - 2013
23
Local rules with non-local intentional head
View delegation!
boyMeetsGirl@gossip-site($girl, $boy) :-
girls@my-iphone($girl, $loc),
boys@my-iphone($boy, $loc)
Semantics of boyMeetGirl@gossip-site is a join of relations girls and boys from my-iphone
Formally, my-iphone delegates a rule boyMeetGirl@gossip-site(g,b) for each g, b, l, girls@my-iphone(g,l), boys@my-iphone(b,l)
Abiteboul – DBPL - 2013
24
Non-local rules: general delegation
(at my-iphone): boyMeetsGirl@gossip-site($girl, $boy) :-
girls@my-iphone($girl, $loc),
boys@alice-iphone($boy, $loc)
Suppose that girls@my-iphone(“Alice”, “Julia's birthday”) holds.
Then my-iphone installs the following rule at alice-iphone
(at alice-iphone): boyMeetsGirl@gossip-site(“Alice”, $boy) :-
boys@alice-iphone($boy, “Julia's birthday”)
When girls@my-iphone(“Alice”, “Julia's birthday”) no longer holds, my-iphone uninstalls the rule
Abiteboul – DBPL - 2013
26
Complexity of delegation: illustration
fof(x,y) :- friend(x,y)
(at p) fof@p(x,y) :- peers@p($q), friend@$q(x,y)
If peers@p contains 100 000 tuples
peers@p(q1), ...., peers@p(q100 000)
This rule will install 100 000 rules!
for i=1 to 100 000 (at qi) fof@p(x,y) :- friend@qi(x,y)
Data complexity transformed into program complexity
Abiteboul – DBPL - 2013
27
Summary of results [PODS 2011]
• Formal definition of the semantics of Webdamlog
• Results on expressivity
- the model with delegation is more general, unless all peers and programs are known in advance
• Convergence is very hard to achieve
- positive Webdamlog
- strongly stratified programs with negation
Abiteboul – DBPL - 2013
28
•The Web as a distributed knowledge base
•Webdamlog: a rule-based language for the Web
•Access control in Webdamlog
•The Webdamlog system
•Conclusion
Abiteboul – DBPL - 2013
Requirements
Data access Users would like to control who can read and modify their information
Data dissemination Users would like to control how their data are transferred from one participant to another, and how they are combined, with the owner of each piece of data keeping some control over it
Application control Users would like to control which applications can run on their behalf, and what information these applications can access.
29
Abiteboul – DBPL - 2013
30
The general picture
• The privileges we consider: read, write, grant
• For read:
• Coarse grained access control: at the relation level
• Fine grain access control: at the tuple level
Abiteboul – DBPL - 2013
31
Insertion in extentional relations
Definition of intensional relations• Requires write privilege on the target relation
• [at Alice] alicePhotos@Bob($f) :- person@Alice($p,
“Friend”),personInPhoto@Alice($pid, $p),
photo@Alice($pid,−, $f)
• [at Alice] allPhotos@Alice($f) : alicePhotos@Alice($f)
• [at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)
Abiteboul – DBPL - 2013
32
Who can read a fact ? – default
• Extensional relations: if you have read privilege to the relation
• Intensional relations:
if you have read privilege to the relation &
if you can read all the tuples that have been used to create this fact – provenance of the fact
Abiteboul – DBPL - 2013
33
Digression: provenance
• Provenance of a tuple
•How it was constructed: conjunction
• Alternatives: disjunction
Abiteboul – DBPL - 2013
34
Digression: provenance graph
gossip@p(Jane, John)rule3
×
girls@p(Jane, Julia’s birthday) boys@p(John, Julia's birthday)rule1
× ×
boyMeetsGirl@p(Jane, John)
×+
(Also used for maintenance in case of update)
Abiteboul – DBPL - 2013
35
Coarse grain access control• [at Alice] alicePhotos@Bob($f) :-
person@Alice($p, “Friend”),personInPhoto@Alice($pid, $p),
photo@Alice($pid,−, $f)
• alicePhotos@Bob is extensional
•Whoever has read access to alicePhotos@Bob sees all the relation
Abiteboul – DBPL - 2013
36
Fine grain access control• [at Alice] allPhotos@Alice($f) :
alicePhotos@Alice($f)
• [at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)
• allPhotos@Alice is intensional
• Sue who has read privilege to allPhotos@Alice and alicePhotos only, can see only the photos of Alice in allPhotos
• Lili who has read privilege to the three relations, sees everything
Abiteboul – DBPL - 2013
37
Overwriting the default for intensional data
• Let us change the rule to:
• [at Alice] allPhotos@$x($f) :-alicePhotos@Alice($f),
friends@Alice($x)
• Issue: you can read the photos only if you also have read privilege to friends@Alice
Abiteboul – DBPL - 2013
38
Overwriting the default for intensional data
• [at Alice] allPhotos@$x($f) :-alicePhotos@Alice($f),
[hide friends@Alice($x)]
•Hide: block the provenance from friends@Alice
• Similar mechanism for extensional data – expose
Abiteboul – DBPL - 2013
39
Issues with non local rules• [at Bob]
message@Sue(“I hate you”) :- date@Alice(d)
aliceSecret@Bob(x) :- date@Alice(d), secret@Alice(x)
Ignoring access rights, by delegation, this results in running
• [at Alice]
message@Sue(“I hate you”) :- date@Alice(d)
aliceSecret@Bob(x) :- date@Alice(d), secret@Alice(x)
Abiteboul – DBPL - 2013
40
Default solution: sand boxWe run the rule at Alice in a Sandbox
•We use the access rights of Bob
So the second rule does not succeed in sending secrets
• The message specifies that this is done at Bob’s request
So requires authentication/signatures
• Alternative: delegation without sandbox. Possible if the peer that asks for the delegation is given the privilege to install rules at the other peer – Here if Alice gives Bob the right to install a rule in her environment
Abiteboul – DBPL - 2013
41
Access control implementation • A program with access control is compiled
locally in a Webdamlog program without that is executed
• Access control data is managed like any other data
Relation acl (defines relation access)
Relation kind (ext or int)
• Based on provenance implemented as a distributed graph
•On-going work on optimization
Abiteboul – DBPL - 2013
42
•The Web as a distributed knowledge base
•Webdamlog: a rule-based language for the Web
•Access control in Webdamlog
•The Webdamlog system
Abiteboul – DBPL - 2013
43
The Webdamlog engine
Based on Bud
• developed at UC Berkeley
Manages knowledge
- Stores facts and rules
- exchanges knowledge with other engines
- performs reasoning
Abiteboul – DBPL - 2013
44
The engine: beyond Bud
• Compilation of
(Bud’s language)
• Main Webdamlog features not supported by Bud
1. Variable relation and peer names
2. Delegations with dynamic changes of the program
Webdamlog+AC ⇒ Webdamlog ⇒ Bloom
Abiteboul – DBPL - 2013
45
The Webdamlog peer
Support communication with other peers and with users
Support common security protocols
Support wrappers to external systems such as Facebook
Provides Web interfaces
Abiteboul – DBPL - 2013
46
Provenance graphs
• Records the history of derivation
• Provenance semiring semantics [Green et al. 07]
• Used for performance optimization
• Used for fine grain access control
• Other possible uses such as explanation of results
Abiteboul – DBPL - 2013
47
•The Web as a distributed knowledge base
•Webdamlog: a rule-based language for the Web
•Access control in Webdamlog
•The Webdamlog system
•Conclusion
Abiteboul – DBPL - 2013
Thesis
Let us turn the Web into a distributed knowledge base
with billions of users
supported by billions of systems
analyzing information
extracting knowledge
exchanging knowledge
inferring knowledge
48
Abiteboul – DBPL - 2013
49
WebdamlogLanguage
•A language for distributed data management [PODS 2011]
•Datalog with distribution, updates, messaging
•Main novelty: delegation
Implementation
•WebdamExchange peer in Java [demo ICDE 2011]
•Webdamlog engine based on Bud [demo Sigmod 2013]
Access control: on-going work with Miklau-Stoyanovich
Probabilistic Webdamlog: on-going work with Deutch-Vianu
Cambridge University Press, 2012
http://webdam.inria.fr/Jorge
Grazie !
Top Related