Best Practices for Protecting Sensitive Data Across the Big Data Platform

© 2016 MapR Technologies | © 2016 Dataguise, Inc.

Transcript of Best Practices for Protecting Sensitive Data Across the Big Data Platform


Best Practices for Protecting Sensitive Data Across the Big Data Platform

Mitesh Shah, MapR | Product Manager, Security & Data Governance

Venkat Subramanian, CTO | Dataguise

Business Intelligence Trend for 2016 onwards…

IT-led, System-of-Record
• Limited access
• Glacial speed of response

Pervasive, Business-led, Self-service Analytics
• Near Real-time
• Agile BI & Analytics
• Deeper Insights into Diverse Data

Rita Sallam (Gartner)*


Big Data Paradox

Data is the Biggest Asset

Data is also the Biggest Vulnerability


Secure Business Execution

The ability of an Enterprise to safely and responsibly leverage the value of all of their data assets to gain new business insights, maximize competitive advantage, and drive revenue growth.


MapR and Dataguise…

Enable SECURE BUSINESS EXECUTION

Through

Trusted Platform and Sensitive Data Management


Big Data Platform Needs to be Trusted (not just secure)

Can we properly identify users?

Can we authorize access to data?

Can we plug in existing enterprise systems?

Is my data highly available?

Is there a proper paper trail?

Have others done this before?

Is multi-tenancy supported? Are apps supported across geographies and data centers?

Is my data governed?

TRUSTED

SECURE

Questions to Ask of Your Big Data Vendor. Verify the Platform is Trusted.


MapR Trust Model

[Diagram labels: Credibility, Vulnerability Management, Detection, Response, Compliance, Governance, Resilience]

Four Pillars of Security
• Authentication
• Authorization
• Auditing
• Data Protection


What’s the (Big) Difference?

Flexibility
• Multiple execution engines: Hive, Spark, MapReduce, Drill…

Scale
• 1000s of users, groups and applications sharing the same cluster
• 100s of data sources
• PBs of data

Multi-Structured Data
• Multiple data formats: Parquet, JSON, CSV, MapR-DB tables


MapR Trust Model (Product Security)

Flexible Authentication
• Ticket-based authentication for all services in the cluster
• Integration with LDAP, Active Directory and other third-party directory services
• Kerberos or username/password authentication

Granular Authorization
• Access Control Expressions (ACEs)
• Protect files, tables, column families, columns, and management objects
• Extend to role-based access control (RBAC) with custom role functions
• Drill Views

Robust Auditing
• All events recorded immediately in JSON log files, with minimal performance impact
• Includes data access and administrative actions
• Ad hoc queries and custom reports on audit logs via SQL and standard BI tools

Ubiquitous Data Protection
• Encryption for data in motion
  - Within a cluster
  - Between clusters
  - Between client and cluster
• Encryption for data at rest
  - LUKS
  - Self-encrypting disk
  - Partners
• NSA-level cryptographic algorithms


Granular Authorization with MapR


The Problem with POSIX Permissions

POSIX Permissions

-rw-rw---- bruce dev-team
(user, group, other)

Scenario 1: Sally needs to read the file. Options:
1. Change ownership of the file to Sally.
2. Add Sally to the dev-team group, even if she’s not a developer.
3. Allow ‘others’ to read the file.

Scenario 2: Groups ‘QA’ and ‘Support’ need to read the file. Options:
1. Allow ‘others’ to read the file.
2. Create a supergroup ‘Tech’ that includes all members from dev, QA, and Support, then run: chgrp Tech <filename>

Scenario 3: All members that belong to both dev_team and managers. Options: ???

POSIX Permissions Are Limiting

AUTHORIZATION


POSIX ACLs vs ACEs

Access Control Lists

MapR Access Control Expressions

r : user:sally | (group:dev_team & group:managers)

AUTHORIZATION

Which one is easier to set and understand? Which one allows for higher granularity?


MapR Has ACEs for Files and MapR-DB Records

Example: user:mary | (group:admins & group:VP) & user:!bob

Permissions on files, tables, column families, columns, JSON documents and sub-documents

AUTHORIZATION

Use Access Control Expressions (ACEs) to set granular permissions.
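To make the boolean semantics of the example expression concrete, here is a minimal Python sketch of an evaluator for the notation used on the slide. It is an illustration only, not MapR's implementation or its exact syntax: the names `Principal` and `evaluate_ace` are hypothetical, and the operator precedence (`!` binds tightest, then `&`, then `|`) is an assumption.

```python
import re
from dataclasses import dataclass, field

# Tokens in the slide's notation: user:NAME, group:NAME, an optional "!"
# after the colon for negation, the operators & and |, and parentheses.
TOKEN = re.compile(r"\s*((?:user|group):!?\w+|[&|()])")


@dataclass
class Principal:
    """Hypothetical stand-in for an authenticated user and their groups."""
    name: str
    groups: set = field(default_factory=set)


def tokenize(expr):
    tokens, pos = [], 0
    while expr[pos:].strip():
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected text: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate_ace(expr, who):
    """Return True if `who` satisfies `expr` (assumed precedence: ! > & > |)."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def atom():
        nonlocal pos
        tok = peek()
        if tok is None or tok in ("&", "|", ")"):
            raise ValueError("expected a principal or '('")
        pos += 1
        if tok == "(":
            value = or_expr()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return value
        kind, _, name = tok.partition(":")
        negated = name.startswith("!")
        name = name.lstrip("!")
        hit = who.name == name if kind == "user" else name in who.groups
        return not hit if negated else hit

    def and_expr():
        nonlocal pos
        value = atom()
        while peek() == "&":
            pos += 1
            rhs = atom()  # always parse the right side, then combine
            value = value and rhs
        return value

    def or_expr():
        nonlocal pos
        value = and_expr()
        while peek() == "|":
            pos += 1
            rhs = and_expr()
            value = value or rhs
        return value

    result = or_expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

Under these assumptions, the slide's expression grants access to mary, and to any member of both admins and VP other than bob.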


File ACEs – Key Features

Intuitive Inheritance: Subdirectories and files inherit permissions from the parent directory

Whole-Volume ACEs: Volume-level filter, useful in multitenant environments

Roles: Arbitrary grouping of users according to your business needs

High Performance: No performance hit

Boolean Operators: Allow for ultra fine-grained permissions

AUTHORIZATION


File ACEs: Whole Volume ACE Example

Whole-Volume ACE
r: group:finance

Jane (Finance) grants read access to Bob (Developer).
File: /finance/final_report.csv
r: user:bob

Bob cannot read the file /finance/final_report.csv because the whole-volume ACE allows read access to the finance group only.
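The whole-volume ACE acts as a filter in front of the file-level ACE: a read succeeds only if both allow it. A minimal Python sketch of that rule follows. The `User` type and the three functions are hypothetical, and letting Jane keep her own read access on the file is an assumption that the slide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user record: a name plus group memberships."""
    name: str
    groups: frozenset = frozenset()


def volume_allows_read(user):
    # Whole-volume ACE from the slide: r: group:finance
    return "finance" in user.groups


def file_allows_read(user):
    # File ACE from the slide: r: user:bob
    # Assumption: the file's owner, jane, can also read it.
    return user.name in {"bob", "jane"}


def can_read(user):
    # Both the volume-level filter and the file-level ACE must pass.
    return volume_allows_read(user) and file_allows_read(user)


jane = User("jane", frozenset({"finance"}))
bob = User("bob", frozenset({"developers"}))

print(can_read(jane))
print(can_read(bob))
```

Bob passes the file-level check but fails the volume-level filter, which matches the outcome described on the slide.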

AUTHORIZATION


ACEs for Streams Too

AUTHORIZATION


Robust Auditing with MapR


MapR Audits

• Who touched customer records outside of business hours?

• What actions did users take in the days before leaving the company?

• What operations were performed without following change control?

• Are users accessing sensitive files from protected/secured source IPs?

• Why do my reports look different, despite sourcing from same underlying data?

Monitoring

Incident Response

Security

AUDITING

Serving Security Analysts


MapR Audits – Key Features

Data Access
• Files
• MapR-DB Tables

Cluster Operations
• Administrative Operations
• Maprcli commands

Authentication Requests

Secure

High Performance

Flexible
• Retention Period
• Maxsize
• Coalesce Interval
• Selective Auditing

JSON Format

{"timestamp":"{$date=2015-06-01T05:24:58.231Z}","operation":"GETATTR","user":"root","uid":"0","ipAddress":"10.10.x.x","nfsServer":"10.10.x.x","srcPath":"/dbtest.0/","srcFid":"2147.16.2","VolumeName":"mktg_files","volumeId":"mktg_files","status":"0"}

AUDITING


Querying Audit Logs with SQL

Example: detect suspicious, failed commands
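The SQL query from this slide is not part of the transcript. As a rough stand-in, the same kind of filter can be sketched in Python over JSON audit records. The field names (`user`, `operation`, `status`) follow the sample record in the deck, but the records below are hypothetical, and treating any non-zero `status` as a failure is an assumption.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON document per line.
SAMPLE_LOG = """\
{"timestamp":"2015-06-01T05:24:58.231Z","operation":"GETATTR","user":"root","srcPath":"/dbtest.0/","status":"0"}
{"timestamp":"2015-06-01T05:25:10.000Z","operation":"GETATTR","user":"mallory","srcPath":"/finance/","status":"13"}
{"timestamp":"2015-06-01T05:25:12.000Z","operation":"GETATTR","user":"mallory","srcPath":"/hr/","status":"13"}
{"timestamp":"2015-06-01T05:26:00.000Z","operation":"GETATTR","user":"bob","srcPath":"/finance/","status":"13"}
"""


def failed_operations(lines):
    """Yield records whose status is non-zero (assumed to mean failure)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if str(record.get("status", "0")) != "0":
            yield record


def failures_by_user(lines):
    """Count failed operations per user, most failures first."""
    counts = Counter(r.get("user", "unknown") for r in failed_operations(lines))
    return counts.most_common()


print(failures_by_user(SAMPLE_LOG.splitlines()))
```

A user with an unusually high failure count is a candidate for closer review; the threshold for "suspicious" is a policy decision that this sketch leaves open.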


Auditing After-Hours Access
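The query shown on this slide is not part of the transcript. A minimal Python sketch of the same idea follows, assuming JSON audit records with the `timestamp`, `user`, and `srcPath` fields from the sample record in the deck. The business-hours window (09:00 to 17:00, Monday to Friday), the lack of time-zone conversion, and the sample records themselves are all assumptions.

```python
import json
import re
from datetime import datetime, time

BUSINESS_START = time(9, 0)   # assumed start of business hours
BUSINESS_END = time(17, 0)    # assumed end of business hours

# Matches an ISO-8601 UTC timestamp, whether bare or wrapped as {$date=...}.
ISO_TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z")

# Hypothetical audit records.
SAMPLE_LOG = [
    '{"timestamp":"2015-06-01T05:24:58.231Z","operation":"GETATTR","user":"root","srcPath":"/customers/a.csv","status":"0"}',
    '{"timestamp":"2015-06-01T10:15:00.000Z","operation":"GETATTR","user":"jane","srcPath":"/customers/a.csv","status":"0"}',
    '{"timestamp":"{$date=2015-06-06T11:00:00.000Z}","operation":"GETATTR","user":"eve","srcPath":"/customers/b.csv","status":"0"}',
    '{"timestamp":"2015-06-01T23:00:00.000Z","operation":"GETATTR","user":"bob","srcPath":"/public/readme.txt","status":"0"}',
]


def parse_timestamp(value):
    match = ISO_TS.search(value)
    if match is None:
        raise ValueError(f"no timestamp found in {value!r}")
    return datetime.strptime(match.group(), "%Y-%m-%dT%H:%M:%S.%fZ")


def is_after_hours(ts):
    # Weekends count as after hours; weekdays only inside the window do not.
    if ts.weekday() >= 5:
        return True
    return not (BUSINESS_START <= ts.time() < BUSINESS_END)


def after_hours_access(lines, path_prefix):
    """Return (user, timestamp, path) for after-hours access under a path."""
    hits = []
    for line in lines:
        record = json.loads(line)
        path = record.get("srcPath", "")
        if not path.startswith(path_prefix):
            continue
        ts = parse_timestamp(record["timestamp"])
        if is_after_hours(ts):
            hits.append((record.get("user", "unknown"), ts.isoformat(), path))
    return hits


for hit in after_hours_access(SAMPLE_LOG, "/customers/"):
    print(hit)
```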


Data Protection with MapR


Encryption at Rest (Today)

Sensitive Data
• SSN
• Credit Card #
• Health Records
• Name + Age + Address

Many Options for Block-Level, Disk-Level, and Field-Level Encryption
• Volume
• Self-Encrypting Disk
• Use Partners for Masking, Tokenization, Format Preserving Encryption

DATA PROTECTION


Sensitive Data Management with Dataguise


Cost of a Data Breach

“Hackers and criminal insiders cause the most data breaches…malicious attacks can take an average of 256 days to identify…The most costly breaches continue to occur in the US and Germany at $217 and $211 per compromised record…If a healthcare organization has a breach, the average cost could be as high as $363.”

Time and Financial Impact on Organizations

Ponemon Institute’s 2015


Secure Environment

Perimeter Security
• Physical security, Firewalls, IDS/IPS…

Volume/File-level Encryption
• Control over data access
• Meeting regulatory compliance…

Aren’t these enough? YOU NEED BOTH… AND MORE


PHI: Guidance for Data De-Identification

Sensitive/Privacy Data
• Name
• Address
• Dates – Birth, Death...
• Telephone Numbers
• Device Identifiers and Serial Numbers
• Email Addresses
• SSN
• Medical Record Numbers
• Account Numbers
• …


What Should We Do?

At a Granular (cell) Level:
• Precisely locate sensitive content across ALL repositories
• Protect those assets appropriately: masking, encryption
• Provide “controlled” access to data
• Enable employees and trusted partners to make data-driven decisions

[Graphic: RISKS (breach, security, compliance) weighed against VALUE (revenue, data-driven decisions, business intelligence)]


DgSecure

DETECT: Where sensitive content is present in structured, unstructured & semi-structured data

AUDIT: Who has access to which sensitive data; identify misalignments and risk factors

PROTECT: Sensitive data at the element level; encrypt/decrypt with RBAC, mask, or redact

MONITOR: Based on alert policies, track sensitive data access through a 360° dashboard


Across Hadoop, RDBMS, Files, NoSQL DB

DgSecure: On Premise, in the Cloud, or Hybrid


How do we do that in DgSecure?


Complex Sensitive Data Detection

SENSITIVE DATA DISCOVERY FOR COMPLEX ENVIRONMENTS

Patterns in “Strings”
• Digit patterns: 4451 3340 0023 1200 8/16 B7127157 Expires 04-19-15

Patterns in “Grammar”
• August Thomson vs 1240 August Ave vs 12 August 1994

Patterns in Context (Dependent)
• Other data elements in horizontal or vertical vicinity: ‘94538’ near address elements

Patterns in Combination (Composite)
• CCN & Name; CCN, Name, Expiry; not just CCN

Patterns in Knowledge
• Ontologies: HL7 Encoding, Financial Market Data
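Two of the pattern types above, string patterns and composite patterns, can be illustrated with a short Python sketch. It is not how DgSecure detects data: the regular expressions, the Luhn checksum filter, the 40-character context window, and the names `luhn_ok` and `find_cards` are all assumptions chosen for the example.

```python
import re

# String pattern: 16 digits in groups of four, optionally separated.
CARD = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")

# Nearby expiry text, used as the second element of a composite match.
EXPIRY = re.compile(
    r"\b(?:expires?|exp\.?)\s*\d{2}[-/]\d{2}(?:[-/]\d{2,4})?\b",
    re.IGNORECASE,
)


def luhn_ok(candidate):
    """Luhn checksum, used here to discard digit runs that are not card numbers."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_cards(text, window=40):
    """Return card-like values and whether an expiry appears close by."""
    findings = []
    for match in CARD.finditer(text):
        if not luhn_ok(match.group()):
            continue
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window]
        findings.append({
            "value": match.group(),
            "composite": EXPIRY.search(nearby) is not None,
        })
    return findings


# 4111 1111 1111 1111 is a widely used test number that passes the Luhn check.
print(find_cards("Card 4111 1111 1111 1111 Expires 04-19 on file"))
print(find_cards("Order id 1234 5678 9012 3456 shipped"))
```

The checksum removes many false positives from the string pattern alone, and the composite flag marks hits where a second element (here, an expiry date) raises confidence that the value really is a card number.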

DISCOVERY FOR:

Data at Rest
• Hadoop (HDFS)
• DBMS
• Teradata
• Files
• SharePoint

Data in Motion
• Flume (into HDFS)
• FTP (into HDFS or between file systems)
• Sqoop (into HDFS)
• Kafka (Q3 2016)


Sensitive Data Protection

Masking
• Obfuscation, one-way operation
• Multiple options in DgSecure: fictitious but realistic values, X’ing out part of the content…
• Consistent masking to retain statistical distribution of data

Encryption
• Encrypted cell/row
• Accessible by authorized users only: Hive, bulk, via App
• Granular protection

Redaction
• X’ing out entire sensitive data cell
• Nullifying

Masking & Encryption in Hadoop


Data Masking: Multiple Options (Examples)

Masking Option Applied | Original Value | Masked Value
Telephone, Random (realistic, fictitious) | (508) 850-0058 | (325) 418-0131
Telephone, Character (hide digits) | (508) 850-0058 | XXX-XXX-0058
Telephone, Intellimask (replace first 3 digits) | (508) 850-0058 | (451) 850-0058
Telephone, FPM: Format Preserving (replace characters and digits with the same type) | (508) 850-0058; 508 850 0058; 508-850-0058 | (729) 432-9647; 729 432 9647; 729-432-9647
Telephone, Static Masking (replace all with (111) 222-3333) | (508) 850-0058 | (111) 222-3333
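Three of the masking options in the table can be approximated in a few lines of Python. This is a sketch under stated assumptions, not DgSecure's algorithms: the function names are hypothetical, and the format-preserving variant is a simple keyed, deterministic character substitution rather than a cryptographic format-preserving encryption scheme.

```python
import hashlib
import hmac
import random
import string


def mask_characters(phone, keep_last=4):
    """Character masking: hide all but the last digits, e.g. XXX-XXX-0058."""
    digits = [c for c in phone if c.isdigit()]
    if len(digits) != 10:
        raise ValueError("expected a 10-digit telephone number")
    shown = ["X"] * (10 - keep_last) + digits[-keep_last:]
    return "{}{}{}-{}{}{}-{}{}{}{}".format(*shown)


def mask_static(_phone):
    """Static masking: every value becomes the same constant."""
    return "(111) 222-3333"


def mask_format_preserving(value, key):
    """Replace digits with digits and letters with letters, keeping layout.

    The random stream is seeded from an HMAC of the letters and digits only,
    so the same input masks to the same output, whatever its punctuation.
    """
    canonical = "".join(c for c in value if c.isalnum())
    seed = hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(seed)
    masked = []
    for c in value:
        if c.isdigit():
            masked.append(rng.choice(string.digits))
        elif c.isalpha():
            pool = string.ascii_uppercase if c.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            masked.append(c)  # punctuation and spacing are preserved
    return "".join(masked)


key = b"demo-key"  # hypothetical key, for illustration only
print(mask_characters("(508) 850-0058"))
print(mask_static("(508) 850-0058"))
print(mask_format_preserving("(508) 850-0058", key))
print(mask_format_preserving("508-850-0058", key))
```

Seeding from the canonical characters mirrors the table's FPM row, where the same number written in three layouts maps to the same masked digits in each layout.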


Unstructured Data – Any Sensitive Elements?

RAW DATA

Masking Data in Hadoop (Cell Level)

RAW DATA

Masking Data in Hadoop (Cell Level)

MASKED DATA

Encrypting Data in Hadoop (Cell Level)

MASKED DATA | ENCRYPTED DATA

Masking Data in Hadoop (Cell Level)


Decryption through Hive Queries

User WITHOUT access privileges for Names and SSN

Decryption through Hive Queries

User WITH access privileges for Names and SSN

BI Use Cases and Sensitive Elements

Use cases (Analytic and Transactional): Brand Sentiment, Log Analysis, Customer Retention, Clinical Trial Analysis, Payments Risk Mgmt., Trading System Perf., Risk Modeling, Supply Chain Optimization, Smart Metering, Insurance Premiums, Process Efficiency, Person of Interest Discovery, Dynamic Pricing, IT Security Intelligence, Real-time Upsell, Monitoring Sensors

Sensitive elements: Name, Address, Email Address, Customer Lifetime Value, IP Address, URL, Medical Record Number, Social Security Number, Telephone Number, Date of Birth (DOB), IP Addresses, Credit Card Number, Credit Limit, Purchase Amount, VIN, Device ID


Protection Policy: Encryption, Masking

Use cases (Analytic and Transactional): Brand Sentiment, Log Analysis, Customer Retention, Clinical Trial Analysis, Payments Risk Mgmt., Trading System Perf., Risk Modeling, Supply Chain Optimization, Smart Metering, Insurance Premiums, Process Efficiency, Person of Interest Discovery, Dynamic Pricing, IT Security Intelligence, Real-time Upsell, Monitoring Sensors

Sensitive elements: Name, Address, Email Address, Customer Lifetime Value, IP Address, URL, Medical Record Number, Social Security Number, Telephone Number, Date of Birth (DOB), Medical Test Results, Credit Card Number, Credit Limit, Purchase Amount, VIN, Device ID, Transaction Date

Protection options: Mask, Encrypt


DgSecure Solution Workflow


DgSecure for Hadoop: Policy

DETECT AUDIT PROTECT REPORT

• Policy
  - Per Data Feed?
  - Protection Options
• Custom Elements
  - Singleton
  - Composite
  - Dependent
• Domain Definition
• Key Management


DgSecure for Hadoop: Detection

In-Flight
Within HDFS
Full vs. Incremental
Structured, Semi, Unstructured
Quick Scan
Element Count

DETECT AUDIT PROTECT REPORT


DgSecure for Hadoop: Access Audit

Files/Directories
- Sensitive Elements
- Protected?
- Who has access?

Users
- What can they access?

DETECT AUDIT PROTECT REPORT


DgSecure for Hadoop: Protection

Domain Based
Masking
Redaction
Encryption
- Field or Record
- AES or FPE

DETECT AUDIT PROTECT REPORT


DgSecure for Hadoop: Reports

Job Level
- Sensitive elements
- Directories & Files
- Remediation applied

Dashboard
- Directory or by policy
- Drill-down

Audit report
- User actions

Notifications

DETECT AUDIT PROTECT REPORT


DgSecure Monitor


DgSecure Monitor

Precisely Focused on Monitoring Sensitive Data
• Where the sensitive content is and how much of it there is (density)
• How it is protected
• What data is accessed
• Who is accessing it

Across All Enterprise Repositories
• Hadoop and Cassandra
• Cloud support (AWS S3 and Azure Blob)

Continuous, Near-real-time Anomalous Behavior Detection
• Using machine learning to build user profiles
• Complex event processing to detect breaches

“Out of the Box” Templates
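The slide does not describe the models behind the user profiles. As a deliberately simple stand-in, the sketch below builds a per-user baseline from past daily access counts and flags a day that sits far above it. The z-score rule, the threshold of 3.0, and the names `build_profile` and `is_anomalous` are assumptions for illustration, not DgSecure's method.

```python
from statistics import mean, pstdev


def build_profile(history):
    """Map each user to (mean, standard deviation) of past daily access counts."""
    return {
        user: (mean(counts), pstdev(counts))
        for user, counts in history.items()
        if counts
    }


def is_anomalous(profile, user, todays_count, threshold=3.0):
    """Flag counts far above the user's own baseline."""
    if user not in profile:
        return True  # no baseline yet: treat as worth reviewing
    mu, sigma = profile[user]
    if sigma == 0:
        return todays_count != mu
    return (todays_count - mu) / sigma > threshold


# Hypothetical history: sensitive records accessed per day, per user.
history = {
    "alice": [10, 12, 11, 9, 13],
    "bob": [5, 5, 5, 5],
}
profile = build_profile(history)

print(is_anomalous(profile, "alice", 12))
print(is_anomalous(profile, "alice", 40))
```

A production system would account for seasonality, role changes, and multiple signals at once; the point here is only the shape of a profile-then-compare check.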


DgSecure Monitor

[Architecture diagram: DgSecure monitors sensitive information in on-premise data stores (NoSQL, RDBMS, Hadoop) and cloud data stores (S3, RDBMS, Blob Storage, Hadoop). Components shown: DgSecure Repository, Monitoring Metadata, Monitoring Metadata Manager, Detection, Data Access Information, Monitoring Engine.]


Secure Business Workflow

Enterprise Data Marketplace Use Case


Data Marketplace End-to-End Workflow

• Multiple Data Feeds with their own Policies
• Data Asset Marketplace: Data Assets (Indexed)
• Access Granted upon Request, per policy & compliance

[Workflow diagram, steps numbered 1 to 8 under the banner SECURE BUSINESS EXECUTION. Data path: Sources (Data Feeds 1 to 4), Landing Zone, Data Process, Data Lake. Stages: Collect Metadata, Annotate & Publish, Browse & Request, Approve, Analyze & Report. Roles: Data Admin, Data Scientist. Other labels: Set policy per feed; Set Access Control; Metadata Repository; Data Sets 1 to 4 by quarter (Q1 to Q4) and region (Region 1 to 4); Access Given; Access Denied.]
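The browse, request, and approve stages of the workflow can be sketched as a small Python model. Everything here is hypothetical: the `DataAsset` and `AccessRequest` types, the role names, and the approval rule (deny anything that contains unprotected sensitive data, otherwise grant by role) are assumptions made to illustrate the flow, not the behavior of either product.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    REQUESTED = "requested"
    GRANTED = "access given"
    DENIED = "access denied"


@dataclass
class DataAsset:
    """A published data set plus the metadata collected for it."""
    name: str
    contains_sensitive: bool
    protected: bool  # True once sensitive cells are masked or encrypted
    allowed_roles: set = field(default_factory=set)


@dataclass
class AccessRequest:
    user: str
    role: str
    asset: DataAsset
    status: Status = Status.REQUESTED


def approve(request):
    """Decide a request from the asset's metadata (assumed policy)."""
    asset = request.asset
    if asset.contains_sensitive and not asset.protected:
        request.status = Status.DENIED
    elif request.role in asset.allowed_roles:
        request.status = Status.GRANTED
    else:
        request.status = Status.DENIED
    return request.status


q1_sales = DataAsset("Q1 Region 1 Data Set 1", True, True, {"data_scientist"})
raw_feed = DataAsset("Data Feed 4 (raw)", True, False, {"data_scientist"})

print(approve(AccessRequest("dana", "data_scientist", q1_sales)))
print(approve(AccessRequest("dana", "data_scientist", raw_feed)))
```

The design choice worth noting is that the decision reads only metadata about the asset, which is why the earlier discovery and annotation stages have to populate the repository before requests can be evaluated.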

CISO/CPO: Set policy per data feed type

Data Asset Owner: Provenance metadata

Run Discovery to detect sensitive data
Metadata to repository
Mask/Encrypt to protect sensitive data
Metadata incl. lineage to repository

IT/Set Process: Use Metadata to set access control

Data Asset owner adds annotations & adds to Data Asset Index

Data Scientist browses available data sets and makes access request

Data owner approves request; sets access control in Ranger

Data Scientist runs data mining/BI/Analytics

Other Data Sources


MapR + Dataguise: Comprehensive Data Security

[Diagram labels: Active Directory, Disk, Authentication, Authorization, Auditing, Data Protection, Incident Response, Compliance, Vulnerability Management]


Q&A