Protecting customer data - Sogeti.nl Management...
Case @Rabo IDB
Co Meerveld 05-03-2015
Testdata management
Testdata management - Case Rabobank
Agenda
• Opening
• Introduction / context
• Why?
• Process steps in Rabo
• Recommendations
• Questions
Introduction to IDB International Direct Banking
IDB Corporate Movie
Application overview
Why?
Hack exposes 94 million credit card numbers
A student performing a Google search for his name discovered a publicly accessible "test" database containing student names, birth dates and Social Security numbers for him and about 2,000 other Los Rios Community College District students.
UC San Francisco: A development server that was less secure than the live server was hacked. The development server contained approx. 7000 individuals’ financial data.
Rabobank leaks 3000 psychological reports of entrepreneurs
HITRUST had a non-critical, standalone public web server compromised by an SQL injection that resulted in some test data being leaked … The database in question was a test database that was populated with information from rosters previously made public from planning meetings held during 2008, in addition to some factitious data created by our developers … The user names and passwords mentioned were available only in the test database. The server did not contain any personal health or other sensitive information …
Audit versus Testers
Audit finding:
“During the audit we found that data scrambling or masking techniques to ensure the confidentiality of customer data are only in place for two IDB countries i.e. New Zealand and Australia. Furthermore, we found that the data could still easily be read. It is the responsibility of the data owners within IDB to ensure that customer data is sufficiently protected when copied to other environments for testing purposes.”
“… investigating the options to implement data scrambling/masking, and ensuring that this is consistently applied for all IDB environments in PRG and TST in order to ensure compliance with security policies and regulations.”
Contradictory scrambling requirements
On the one hand, the architect registered a requirement:
• client id, subscription id, applicant id must be scrambled
On the other hand, the test teams (Core Banking and FEE) put forward exactly the opposite requirement:
• The scrambling shouldn’t change the subscription id and client id (we didn’t discuss application id), because the functional and regression testing of projects and releases is based on test cases that use the existing client ids
Our process
Tools?
Single DB?
Chain?
Objective
Requirement:
“To comply with the rule that data may not link to the corresponding natural person”.
The aim is:
• To have production data anonymized and still useful as test data
• To maximize reuse of production data (shuffle function)
• To minimize use of generated test data (names like ‘Customer 1’, ‘Customer 2’, etc.)
• To keep chain consistency (use of the same algorithm for each application)
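One way to picture chain consistency is a deterministic mapping: if every application scrambles an identifier with the same function and the same secret, the scrambled value still matches across databases. The sketch below is only an illustration of that idea, not the method used at IDB; the function name `pseudonymize_id`, the keyed-hash (HMAC) approach and the 9-digit output length are all assumptions.

```python
import hashlib
import hmac


def pseudonymize_id(value: str, secret: bytes, digits: int = 9) -> str:
    """Map an identifier to a fixed-length numeric pseudonym.

    Hypothetical helper: the same value and secret always give the same
    result, so separate databases in the chain stay consistent.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    # Reduce the first 8 bytes of the digest to the requested number of digits.
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return str(number).zfill(digits)


# Assumed shared secret; in practice it would be kept outside the test environments.
SECRET = b"example-secret"

client_in_app_a = pseudonymize_id("1001", SECRET)
client_in_app_b = pseudonymize_id("1001", SECRET)
```

Because the output space is limited to a fixed number of digits, two different identifiers can in principle map to the same pseudonym, so a real implementation would need a uniqueness check.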
IDB policy
• Current data protection laws prohibit the use of production data for testing purposes.
• IDB has to comply with data protection laws. IDB wants to protect customer data.
• In accordance with the security policy, the use of scrambled data in test environments is mandatory.
• All IDB test environments (PRG, TEST, PreProd) must contain scrambled data.
• All branches must comply.
How does it work?
• Scrambling of every application (database) within the chain: Thaler, MDB3, CCW, Messagent, Datamart.
• Vertical deployment of scrambled data (using back-up and restore)
• Scrambling template for each application
  • Develop once, use many
  • Repeatable process
• Scrambling of different kinds of customer data.
• Possibility to add ‘extra test cases’ during scrambling (e.g. regression test cases)
• Maintenance of templates:
  • Changed application, data model or requirements
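The template idea above can be sketched as a mapping from column names to scrambling functions that is written once and applied on every refresh. This is a minimal illustration under assumptions: the column names, the `CUSTOMER_TEMPLATE` mapping and the `apply_template` helper are hypothetical and do not describe the actual tooling.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]
Template = Dict[str, Callable[[str], str]]


def replace_last_char(value: str) -> str:
    """Replace the last character by 'x' (empty values stay empty)."""
    return value[:-1] + "x" if value else value


def replace_comment(_: str) -> str:
    """Replace free-text comments by placeholder text."""
    return "Lorem ipsum"


# Hypothetical template for one customer table: column name -> function.
CUSTOMER_TEMPLATE: Template = {
    "last_name": replace_last_char,
    "comment": replace_comment,
}


def apply_template(rows: Iterable[Row], template: Template,
                   extra_rows: Iterable[Row] = ()) -> List[Row]:
    """Scramble every templated column, then append extra test cases."""
    scrambled = [
        {col: template[col](val) if col in template else val
         for col, val in row.items()}
        for row in rows
    ]
    # Extra rows (e.g. regression test cases) are added unchanged.
    return scrambled + [dict(row) for row in extra_rows]


production_rows = [{"id": "1", "last_name": "Jansen", "comment": "Called about a loan"}]
regression_rows = [{"id": "T1", "last_name": "Regression", "comment": ""}]
result = apply_template(production_rows, CUSTOMER_TEMPLATE, regression_rows)
```

When the application or data model changes, only the template mapping needs maintenance; the apply step stays the same.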
Examples scrambling functions
Data element        Used function
Client names        Shuffle of available names (first name and last name); replace last character by ‘x’
Telephone numbers   Substitute telephone number by 1's except first 4 values
Birthdate           Alter the birthdate by + or - a random number of days, with a maximum of 90
Address             Shuffle of street names, postal codes, cities, etc.; replace house number by random value
Comment fields      Replace by “Lorem ipsum…”
Email address       Rebuild based on new name
Trading name        Replace by fictitious values
Account number      Generate new account numbers (incl. IBAN)
…                   …
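A few of the functions listed above can be sketched in a handful of lines. The helper names are hypothetical and the behaviour is a simplified reading of the slide, not the actual implementation; the IBAN check digits follow the standard mod-97 calculation, and the bank code and account number used are example values only.

```python
import random
from datetime import date, timedelta
from typing import List


def shuffle_values(values: List[str], rng: random.Random) -> List[str]:
    """Redistribute existing values (e.g. last names) over the records."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def mask_phone(phone: str) -> str:
    """Keep the first 4 characters and replace the rest by 1's.

    Simplification: separators such as spaces are replaced as well.
    """
    return phone[:4] + "1" * max(len(phone) - 4, 0)


def shift_birthdate(born: date, rng: random.Random, max_days: int = 90) -> date:
    """Move the birthdate by a random number of days, at most max_days.

    Note: a shift of 0 days is possible in this sketch.
    """
    return born + timedelta(days=rng.randint(-max_days, max_days))


def iban_check_digits(country: str, bban: str) -> str:
    """Compute the two IBAN check digits with the mod-97 rule."""
    rearranged = bban + country + "00"
    # Letters become numbers: A=10 ... Z=35; digits stay as they are.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return f"{98 - int(numeric) % 97:02d}"


def is_valid_iban(iban: str) -> bool:
    """Check an IBAN without spaces against the mod-97 rule."""
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


rng = random.Random(0)  # fixed seed so a scrambling run can be repeated
names = ["Jansen", "De Vries", "Bakker"]
new_names = shuffle_values(names, rng)
new_phone = mask_phone("0612345678")
new_birthdate = shift_birthdate(date(1980, 5, 17), rng)
bban = "ABNA0417164300"  # example bank code plus account number
new_iban = "NL" + iban_check_digits("NL", bban) + bban
```

Shuffling keeps realistic values in the data set, which matches the aim of reusing production data instead of generating names like ‘Customer 1’.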
Solution Overview
[Diagram: applications Thaler, MDB3, CCW, …; environments PRG, TEST, PRE_PROD, …; arrows numbered 1, 2, 3 for the steps below]
Step by step:
1. Anonymize each application on a separate environment, keeping chain consistency
2. Make a back-up of the anonymized database
3. Refresh environments using standard restore procedures
What will change?
• Application databases outside of the production environment will (must!) always be anonymised.
• Scrambled test data sets are refreshed on a weekly basis (scrambling is time-consuming: it involves all (5) environments for all (6) countries)
• Standard restores are done from scrambled production datasets (max. one week old)
• Other (non-standard) restores need to take scrambling time into account (expect to add 8 hours or more to the (Thaler) restore time)
• Unscrambled production data restore is still possible (by exception only)
Recommendations, be aware!
• Availability of test environments
• Communication with stakeholders
• Balance between usability and compliance requirements
• Distribution of test data
• How to maintain
Questions?