![Page 1: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/1.jpg)
[email protected] Stockholm Summit 2018-12-06
Journey in country of data access governance
2018-04-18 1
![Page 2: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/2.jpg)
Who is talking?
Magnus Runesson
Data Engineer @ Svenska Spel
Developer, Ops, RDBMS, Big Data, High performance
![Page 3: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/3.jpg)
Gaming is for everyone's enjoyment
![Page 4: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/4.jpg)
Agenda
• Why?
• Svenska Spel’s data warehouse
• Atlas & Ranger
• How did we implement it?
• Learnings
• Conclusions
![Page 5: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/5.jpg)
Why?
GDPR requires
• clear purpose for PII data
• privacy by design
• clear consent or legal ground
• not to use/store PII if not needed
• people own their own data
• penalty if not followed
New gaming market requires
• introduce multi-tenancy
![Page 6: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/6.jpg)
Goals
• Our customers' and partners' integrity is protected
• Follow competition regulation
• Users only have access to data intended for the current purpose
• Keep doing our required processing
• Adaptable for new requirements
• Maintainable solution
![Page 7: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/7.jpg)
Svenska Spel’s data warehouse
![Page 8: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/8.jpg)
Svenska Spel’s data warehouse
• Moved from classic Cognos + Oracle
• HDP 2.6 using Hive
• Includes Personally Identifiable Information (PII)
• 300+ event streams in
• 150+ published tables and views
![Page 9: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/9.jpg)
Model based development
Used data are
• Understood
• Documented
• Modelled
Modelled with Data Vault
Oracle SQL Developer Data Modeler
SQL code generated from model
![Page 10: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/10.jpg)
Data Vault
• History tracking
• Uniquely linked
• Pattern based
• Easy to generate code
• Easy to add new sources
[Diagram: Data Vault with one Link, two Hubs and three Satellites]
![Page 11: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/11.jpg)
[Architecture diagram. Labels: Data Lake, Integration, Data Vault, Dimension mart, CRM mart, ETL, Anonymization, BI mart, Exasol, Tableau, Hadoop, Presentation, Role based access, CRM, Whitelisting, …]
![Page 12: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/12.jpg)
Apache Atlas and Ranger
![Page 13: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/13.jpg)
Apache Atlas
Metadata about resources
A resource is
• Table
• Column
• Schema
• File on HDFS
• …
Lineage
![Page 14: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/14.jpg)
Atlas tags
Tags have no meaning in themselves
Your business vocabulary defines the meaning
Examples of tags:
• Business entity owning the data
• Indication of sensitive data
The rules in Ranger enforce the policy
Separate metadata from policy implementation
[Figure: a PII tag]
![Page 15: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/15.jpg)
Apache Ranger
Is user U allowed to do operation O on resource R?
• Access
• Row based filtering
• Masking
• Audit logging
Resources are referred to by tags
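The question Ranger answers can be illustrated with a minimal Python sketch. This is a toy in-memory model, not Ranger's API: the names `RESOURCE_TAGS`, `USER_GROUPS`, `ALLOW`, `AUDIT_LOG` and `is_allowed`, the users and the groups are all hypothetical, and denying anything without a matching allow entry is an assumption of the sketch.

```python
# Toy model only: none of these names come from Ranger itself.
RESOURCE_TAGS = {"dim_mart.customer_d": {"PII_table"}}   # resource -> tags
USER_GROUPS = {"alice": {"analyst"}, "bob": {"crm"}}      # user -> groups
ALLOW = [("analyst", "select", "PII_table")]              # (group, operation, tag)
AUDIT_LOG = []                                            # one entry per decision


def is_allowed(user, operation, resource):
    """Answer: is user U allowed to do operation O on resource R?"""
    groups = USER_GROUPS.get(user, set())
    tags = RESOURCE_TAGS.get(resource, set())
    # The policy refers to the resource by tag, never by table name.
    allowed = any(
        group in groups and op == operation and tag in tags
        for group, op, tag in ALLOW
    )
    AUDIT_LOG.append((user, operation, resource, allowed))
    return allowed
```

Because the allow list names tags rather than tables, a newly tagged table is covered without touching the policy.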
![Page 16: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/16.jpg)
Table in Hive before we started our work
Table: customer

| Customer_id | Name | Postal_code | Has_phone | Marketing |
| --- | --- | --- | --- | --- |
| 1 | Steve | 12345 | False | False |
| 2 | Bill | 54321 | True | False |
| 3 | Paul | 54672 | False | True |

![Page 17: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/17.jpg)
Add PII tags on table and columns in Atlas. No behaviour change.
Table: customer

| Customer_id | Name | Postal_code | Has_phone | Marketing |
| --- | --- | --- | --- | --- |
| 1 | Steve | 12345 | False | False |
| 2 | Bill | 54321 | True | False |
| 3 | Paul | 54672 | False | True |

[Figure labels: PII_table, PII, PII]
![Page 18: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/18.jpg)
We set a rule in Ranger to mask PII columns
Analyst view
Table: customer

| Customer_id | Name | Postal_code | Has_phone | Marketing |
| --- | --- | --- | --- | --- |
| 17 | ABC | 12345 | False | False |
| 42 | DEF | 54321 | True | False |
| 13 | BDE | 54672 | False | True |

[Figure labels: PII_table, PII, PII]
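The effect of a tag-driven masking rule can be sketched in a few lines of Python. The slide does not say which masking function is used, so the truncated hash below is only a stand-in. The sketch assumes that Customer_id and Name are the PII-tagged columns, as the masked values on the slide suggest. `COLUMN_TAGS`, `mask` and `analyst_view` are hypothetical names.

```python
import hashlib

# Assumption: these two columns carry the PII tag.
COLUMN_TAGS = {"Customer_id": {"PII"}, "Name": {"PII"}}

ROWS = [
    {"Customer_id": 1, "Name": "Steve", "Postal_code": "12345", "Has_phone": False, "Marketing": False},
    {"Customer_id": 2, "Name": "Bill", "Postal_code": "54321", "Has_phone": True, "Marketing": False},
    {"Customer_id": 3, "Name": "Paul", "Postal_code": "54672", "Has_phone": False, "Marketing": True},
]


def mask(value):
    """Replace a value with a short deterministic hash (illustrative mask only)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:8]


def analyst_view(rows):
    """Return copies of the rows with every PII-tagged column masked."""
    return [
        {
            col: (mask(val) if "PII" in COLUMN_TAGS.get(col, set()) else val)
            for col, val in row.items()
        }
        for row in rows
    ]
```

The rule is written once against the tag, so any column that later receives the PII tag is masked without a new rule.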
![Page 19: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/19.jpg)
Ranger restricts our CRM user to only see rows with Marketing = True
Table: customer

| Customer_id | Name | Postal_code | Has_phone | Marketing |
| --- | --- | --- | --- | --- |
| 3 | Paul | 54672 | False | True |

[Figure labels: PII_table, PII, PII]
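Row based filtering on the same example table can be sketched as a per-group predicate. This is only an illustration of the behaviour, not of how Ranger implements it. `ROW_FILTERS` and `visible_rows` are hypothetical names, and letting a group without a filter see all rows is an assumption of the sketch.

```python
ROWS = [
    {"Customer_id": 1, "Name": "Steve", "Marketing": False},
    {"Customer_id": 2, "Name": "Bill", "Marketing": False},
    {"Customer_id": 3, "Name": "Paul", "Marketing": True},
]

# Hypothetical per-group row filters; a group without a filter sees all rows.
ROW_FILTERS = {"crm": lambda row: row["Marketing"] is True}


def visible_rows(rows, group):
    """Return only the rows that the group's filter lets through."""
    keep = ROW_FILTERS.get(group)
    return [row for row in rows if keep is None or keep(row)]
```

Calling `visible_rows(ROWS, "crm")` keeps only the row where Marketing is True, matching the CRM user's view on the slide.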
![Page 20: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/20.jpg)
How did we implement this?
![Page 21: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/21.jpg)
Development process
Change model
Store model
Generate code
Deploy
PII
Add rules
![Page 22: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/22.jpg)
HQL generator
• In-house tool
• Template based generation of SQL/HQL
• Generates files with tag information
• For tables and columns respectively
[Diagram labels: HQL generator, CSV, SQL, PII]
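Template based generation from one model can be sketched as follows. The in-house generator is not shown in the talk, so this is only a guess at the general idea. The `MODEL` structure, the column types and the function names are hypothetical, and the semicolon-separated tag layout follows the column tag file used in this deck.

```python
from string import Template

# Hypothetical model entry; the real model is kept in a data modelling tool.
MODEL = {
    "schema": "dim_mart",
    "table": "customer_d",
    "columns": [
        {"name": "customer_id", "type": "BIGINT", "tags": ["PII", "Sensitive"]},
        {"name": "has_phone", "type": "BOOLEAN", "tags": []},
    ],
}

DDL = Template("CREATE TABLE ${schema}.${table} (\n${columns}\n);")


def generate_ddl(model):
    """Render the table DDL from the model."""
    columns = ",\n".join(f"  {c['name']} {c['type']}" for c in model["columns"])
    return DDL.substitute(schema=model["schema"], table=model["table"], columns=columns)


def generate_column_tags(model):
    """Render the column tag file from the same model."""
    lines = ["schema;table;attribute;tags"]
    for c in model["columns"]:
        tags = ",".join(c["tags"])
        lines.append(f"{model['schema']};{model['table']};{c['name']};{tags}")
    return "\n".join(lines) + "\n"
```

Generating the DDL and the tag file from the same model is what keeps the two from drifting apart.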
![Page 23: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/23.jpg)
Tag file for columns
schema;table;attribute;tags
dim_mart;customer_d;customer_id;PII,Sensitive
dim_mart;customer_d;has_phone;
There is a corresponding file for tables, without the attribute (column) field
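A file in this layout can be read with Python's standard `csv` module, as in the minimal sketch below. It is not the parser used by the policy tool. The function name `read_column_tags` is hypothetical, and the sample text is the two rows from the slide.

```python
import csv
import io

SAMPLE = """schema;table;attribute;tags
dim_mart;customer_d;customer_id;PII,Sensitive
dim_mart;customer_d;has_phone;
"""


def read_column_tags(text):
    """Map (schema, table, attribute) to the set of tags listed for that column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    result = {}
    for row in reader:
        key = (row["schema"], row["table"], row["attribute"])
        # Tags are comma separated; an empty field means the column has no tags.
        result[key] = {tag.strip() for tag in row["tags"].split(",") if tag.strip()}
    return result
```

Columns with an empty tags field, such as `has_phone`, stay in the result with an empty set, so a tool can tell "known and untagged" apart from "not in the model".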
![Page 24: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/24.jpg)
Ranger rules
Hand-coded rules per tag
Policy tool applies the rule on all tables with the tag
Can be different rules for different users
Filter gets appended to the WHERE condition by Ranger
Used for
• Row based filtering (access)
• Masking (anonymization)
Catch-all rule to deny access to tables not in our model
![Page 25: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/25.jpg)
Ranger rule filter example
{ "command": "apply_tag_row_rule",
  "filters": [
    { "groups": [ "tenant_1" ], "users": [],
      "tagFilterExprs": [
        { "tags": [ "multitenant" ], "filterExpr": "${table}.tenant_id = 1" } ] },
    ...
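How such a rule could turn into a per-table filter is sketched below in Python. The rule on the slide is truncated, so the sketch closes it after the first filter to make it parse. Matching a table when it shares at least one tag with the rule is an assumption, as are the function names `row_filters_for` and `add_where`. The `${table}` placeholder happens to match the syntax of Python's `string.Template`, which is used here for the substitution.

```python
import json
from string import Template

# The slide's rule, closed after the first filter so that it is valid JSON.
RULE = json.loads("""
{ "command": "apply_tag_row_rule",
  "filters": [
    { "groups": ["tenant_1"], "users": [],
      "tagFilterExprs": [
        { "tags": ["multitenant"], "filterExpr": "${table}.tenant_id = 1" } ] } ] }
""")


def row_filters_for(rule, group, table, table_tags):
    """Return the filter expressions that apply to `table` for `group`."""
    exprs = []
    for flt in rule["filters"]:
        if group not in flt["groups"]:
            continue
        for tag_expr in flt["tagFilterExprs"]:
            # Assumption: one shared tag is enough for the expression to apply.
            if set(tag_expr["tags"]) & set(table_tags):
                exprs.append(Template(tag_expr["filterExpr"]).substitute(table=table))
    return exprs


def add_where(query, exprs):
    """Append the filters to a query that has no WHERE clause yet (simplification)."""
    if not exprs:
        return query
    return query + " WHERE " + " AND ".join(exprs)
```

For group `tenant_1` and a table `customer_d` tagged `multitenant`, the expression expands to `customer_d.tenant_id = 1`.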
![Page 26: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/26.jpg)
Deployment process
Files: *.sql, table_tags.csv, column_tags.csv, ranger_policies.json
• Apply *.sql DDL
• Policy tool - tag files
• Policy tool - policy file
![Page 27: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/27.jpg)
Policytool
• Makes it easy to manage Atlas tags and Ranger policy rules
• Command line tool
• Consumes tags from CSV files
• Consumes policies from JSON files
• Calls the Atlas and Ranger APIs
• Ensures the same access on Hive as on HDFS (except for filtering and masking)
• Supports tag-based filtering
![Page 28: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/28.jpg)
Put everything together
![Page 29: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/29.jpg)
Development process
Change model
Store model
Generate code
Deploy
PII
Add rules
Files: *.sql, column_tags.csv, table_tags.csv, tag_row_policies.csv
![Page 30: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/30.jpg)
Deployment process
Files: *.sql, column_tags.csv, table_tags.csv, ranger_policies.json
• Apply *.sql DDL
• Policy tool - tag files
• Policy tool - policy file
![Page 31: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/31.jpg)
Change in view of an Analyst
[Figure: Before. Labels: CRM, Analyst]
![Page 32: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/32.jpg)
Learnings
![Page 33: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/33.jpg)
Clear business rules
• Work closely with the business
• Avoid overly complex rules
• Minimize the number of rules
• Use {user}, public and the other aliases that Ranger provides
![Page 34: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/34.jpg)
Systematic model
• People unconsciously do things differently
• Keep HDFS and Hive rules in sync
• Use tags as much as possible
![Page 35: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/35.jpg)
Automate
• Ensure rules are in sync with what is deployed
• Use CI/CD
• Ask Hortonworks (HW) for the latest patches on HDP 2.6.5 (ATLAS-2634, HIVE-20633, ATLAS-2891, ATLAS-2975)
![Page 36: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/36.jpg)
Hey, would it not be nice to have the same rules in the presentation layer?
![Page 37: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/37.jpg)
[Architecture diagram. Labels: Data Lake, Integration, Data Vault, Dimension mart, CRM mart, ETL, Anonymization, BI mart, Exasol, Tableau, Hadoop, Presentation, Role based access, CRM, Whitelisting, …]
![Page 38: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/38.jpg)
Atlas & Ranger on Exasol
• Transfer rules and tags to Exasol
• Use virtual schemas to apply them
• Reduce amount of data in Exasol
• Lower license cost
• Single source of truth of access policies
![Page 39: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/39.jpg)
Experiences of Atlas and Ranger
• Simple and easy model
• Limited performance penalty
• Tag on table with masking rule => all columns masked
• Lots of moving pieces
• API documentation is hard to understand
• Restriction on Ranger row based filtering (not on tags)
• Row based filtering and masking are not applied on direct file access
![Page 40: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/40.jpg)
Reached Goals
• Our customers' and partners' integrity is protected
• Users only have access to data intended for the current purpose
• Keep doing our required processing
• Adaptable for new requirements
• Maintainable solution
![Page 41: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/41.jpg)
Conclusions
• Goals reached
• No SQL changes
• Scales when new datasets are added
• Our data model is guaranteed to be in sync
• Better comments in Hive
• Minimal impact on ETL developers' workflow
![Page 42: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/42.jpg)
Takeaways
• Make it as simple as possible
• Automate
• Know your tool
• Be clear on your authorization model
• Know your data
![Page 43: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/43.jpg)
Resources
• cobra-policytool on GitHub: https://github.com/SvenskaSpel/cobra-policytool
![Page 45: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/45.jpg)
BONUS - How everything is connected
![Page 46: access governance Journey in country of data€¦ · Journey in country of data access governance 2018-04-18 1. S v 2018-04-18 2 Who is talking? Magnus Runesson Data Engineer @ Svenska](https://reader035.fdocuments.in/reader035/viewer/2022081611/5f06e1dc7e708231d41a34bf/html5/thumbnails/46.jpg)