DATPROF Test data Management (data privacy & data subsetting) - English
![Page 1: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/1.jpg)
Test Data Management
Harald Kikkers, Maarten Urbach & Bert Nienhuis
![Page 2: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/2.jpg)
DATPROF
Data Integration
Test Data Management
• Dutch Software supplier
• Founded in 1998
• Partners: ITCG, Sogeti, …
![Page 3: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/3.jpg)
…and many more!
![Page 4: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/4.jpg)
MANY ORGANISATIONS USE MULTIPLE COPIES OF PRODUCTION DATABASES
![Page 5: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/5.jpg)
PURPOSES:
• TESTING
• DEVELOPMENT
• OUTSOURCING
• MARKETING
• TRAINING
![Page 6: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/6.jpg)
Agile Development
• Building the right product
• Room for change
• Every 2-4 weeks working increments of the software
• Progress in development
![Page 7: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/7.jpg)
Test Automation
How to test all these iterations?
And… what data to use?
?
![Page 8: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/8.jpg)
| | Team 1 | Team 2 | Team 3 |
|---|---|---|---|
| Production | 6 TB | 500 GB | 10 GB |
| Test | 6 TB | 500 GB | 10 GB |
| Development | 6 TB | 500 GB | 10 GB |

Total: 19.53 TB
![Page 9: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/9.jpg)
| | Team 1 | Team 2 | Team 3 |
|---|---|---|---|
| Production | 6 TB | 500 GB | 10 GB |
| Test | 6 TB | 500 GB | 10 GB |
| Development | 6 TB | 500 GB | 10 GB |

Total: 19.53 TB

Test: Team 1, Team 2, Team 3
Development: Team 1, Team 2, Team 3
![Page 10: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/10.jpg)
| | Team 1 | Team 2 | Team 3 |
|---|---|---|---|
| Production | 6 TB | 500 GB | 10 GB |
| Development | 6 TB | 500 GB | 10 GB |
| Test | 6 TB | 500 GB | 10 GB |
| Development | 6 TB | 500 GB | 10 GB |
| Test | 6 TB | 500 GB | 10 GB |
| Development | 6 TB | 500 GB | 10 GB |
| Test | 6 TB | 500 GB | 10 GB |

Total: 45.57 TB
![Page 11: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/11.jpg)
| | Team 1 | Team 2 | Team 3 |
|---|---|---|---|
| Production | 6 TB | 500 GB | 10 GB |
| Development | 600 GB | 50 GB | 1 GB |
| Test | 600 GB | 50 GB | 1 GB |
| Development | 600 GB | 50 GB | 1 GB |
| Test | 600 GB | 50 GB | 1 GB |
| Development | 600 GB | 50 GB | 1 GB |
| Test | 600 GB | 50 GB | 1 GB |
| Subset | 10 % | 10 % | 10 % |

Total: 10.4 TB
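The three totals on these slides can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB) and that each column is one team's database; the function name and parameters are illustrative, not part of any DATPROF tooling.

```python
# Production database sizes per team in GB, as shown on the slides
# (assumption: decimal units, 1 TB = 1000 GB).
PRODUCTION_GB = {"Team 1": 6000, "Team 2": 500, "Team 3": 10}


def total_tb(full_copies: int, subset_copies: int = 0, subset_percent: int = 0) -> float:
    """Total storage in TB for full and subsetted copies of all three databases."""
    one_copy_gb = sum(PRODUCTION_GB.values())
    subset_gb = one_copy_gb * subset_percent / 100
    return (full_copies * one_copy_gb + subset_copies * subset_gb) / 1000


# Production plus one shared test and one shared development copy.
shared = total_tb(full_copies=3)
# Production plus three development and three test copies at full size.
per_team = total_tb(full_copies=7)
# Production plus six copies that are each a 10 % subset.
subsetted = total_tb(full_copies=1, subset_copies=6, subset_percent=10)

print(f"{shared:.2f} TB, {per_team:.2f} TB, {subsetted:.3f} TB")
```

Under these assumptions the first two values match the slide totals exactly, and the third comes out slightly above the rounded 10.4 TB shown on the slide.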
![Page 12: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/12.jpg)
Development
Test
Development
Test
Development
Test
How to protect sensitive customer data?
![Page 13: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/13.jpg)
Test Test Test
Development Development Development
![Page 14: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/14.jpg)
Subsetting: Advantages of subsetting data
• Minimize data usage
• Save on hardware & infra
• Reduce throughput times
• Efficient data management

Anonymizing: Advantages of scrambling & masking data
• Protect customer information
• Comply with legislation
• Prevent brand damage
• Maintain competitive advantages
![Page 15: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/15.jpg)
DBA Tools: $100 tools
ETL Suites: IBM, Informatica, Oracle
![Page 16: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/16.jpg)
DBA Tools ETL Suites?
![Page 17: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/17.jpg)
DBA Tools ETL Suites
- User Experience
- Default templates
- Easy to maintain
- Smart functionality
- Chain support
![Page 18: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/18.jpg)
DBA Tools ETL Suites
![Page 19: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/19.jpg)
Production: Source Database
Test/Development: Target Database
![Page 20: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/20.jpg)
Data model classification
Subset – Process data
Example: Customers, Orders, Contracts, Invoices, Transactions
Full – Master data
Example: Application data, configuration, master tables
Empty – Logging, non-relevant history
Example: Logging tables, temp tables
Determine data to be subsetted
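One way to picture the Subset / Full / Empty classification is as a mapping from table name to copy strategy. The sketch below is an illustration only: the table names, the `source.` and `target.` schema prefixes, and the `copy_statement` helper are assumptions, not DATPROF functionality.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    SUBSET = "subset"  # process data: copy only the rows that match a filter
    FULL = "full"      # master data: copy every row
    EMPTY = "empty"    # logging, history: keep the table but copy no rows


# Hypothetical table names, classified as on the slide.
CLASSIFICATION = {
    "customers": Strategy.SUBSET,
    "orders": Strategy.SUBSET,
    "configuration": Strategy.FULL,
    "logging": Strategy.EMPTY,
}


def copy_statement(table: str, where: Optional[str] = None) -> Optional[str]:
    """Return the SQL that fills the target table, or None if it stays empty."""
    strategy = CLASSIFICATION[table]
    if strategy is Strategy.EMPTY:
        return None
    sql = f"INSERT INTO target.{table} SELECT * FROM source.{table}"
    if strategy is Strategy.SUBSET:
        if not where:
            raise ValueError(f"subset table {table!r} needs a filter")
        sql += f" WHERE {where}"
    return sql


print(copy_statement("customers", where="country = 'Netherlands'"))
print(copy_statement("configuration"))
print(copy_statement("logging"))
```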
![Page 21: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/21.jpg)
![Page 22: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/22.jpg)
Chain of systems
Method for deriving consistent subsets from multiple systems
Production → Test/Development
Start filter: All customers from the Netherlands
Start filter: All orders from customers in the previous subset.
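The two start filters can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under assumptions: two tables in one in-memory database stand in for the two systems in the chain, and all table names, column names and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);

    INSERT INTO customers VALUES
        (1, 'Smith', 'Netherlands'),
        (2, 'Williams', 'Germany'),
        (3, 'Clark', 'Netherlands');
    INSERT INTO orders VALUES
        (10, 1, 'A'), (11, 2, 'B'), (12, 3, 'C'), (13, 3, 'D');

    -- Start filter of the first system: all customers from the Netherlands.
    CREATE TABLE subset_customers AS
        SELECT * FROM customers WHERE country = 'Netherlands';

    -- Start filter of the next system: all orders from customers in the previous subset.
    CREATE TABLE subset_orders AS
        SELECT * FROM orders
        WHERE customer_id IN (SELECT id FROM subset_customers);
    """
)

customer_ids = [row[0] for row in conn.execute("SELECT id FROM subset_customers ORDER BY id")]
order_ids = [row[0] for row in conn.execute("SELECT id FROM subset_orders ORDER BY id")]
print(customer_ids, order_ids)
```

Because the second filter is driven by the first subset, every order in the result still points at a customer that exists in the subset, which is what keeps the subsets consistent.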
![Page 23: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/23.jpg)
Import Meta data
Classification
Deployment
![Page 24: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/24.jpg)
Anonymization of sensitive data
![Page 25: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/25.jpg)
Personal data

Characteristics
- Bank account balance
- Debt
- Medication
- Illness
- Religion
- Political preference
- Salary
- Phone history
- Et cetera…

Identifying
- Name
- Date of birth
- Bank account number
- Social security number
- Address
- Insurance number
- Cellphone number
- Et cetera…

“Any information relating to an identified or identifiable natural person ("data subject")”
Source: Data Protection Directive - Directive 95/46/EC
![Page 26: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/26.jpg)
Techniques
![Page 27: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/27.jpg)
Shuffle: Shuffle values within same column
+ Conditional: Manipulate specified rows

| First name | Last name | Type |
|---|---|---|
| John | Clark | Customer |
| Max | Smith | Customer |
| Joe | Williams | Customer |
| | DATPROF | Company |
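A shuffle combined with a condition can be sketched as follows: only the rows that satisfy the condition take part in the shuffle, and all other rows keep their value. The function name, the row layout and the fixed seed are assumptions made for illustration.

```python
import random

rows = [
    {"first_name": "John", "last_name": "Smith", "type": "Customer"},
    {"first_name": "Max", "last_name": "Williams", "type": "Customer"},
    {"first_name": "Joe", "last_name": "Clark", "type": "Customer"},
    {"first_name": "", "last_name": "DATPROF", "type": "Company"},
]


def shuffle_column(rows, column, condition, seed=None):
    """Shuffle one column among the rows that satisfy `condition`; return new rows."""
    rng = random.Random(seed)
    selected = [i for i, row in enumerate(rows) if condition(row)]
    values = [rows[i][column] for i in selected]
    rng.shuffle(values)
    result = [dict(row) for row in rows]  # copy, so the input stays untouched
    for i, value in zip(selected, values):
        result[i][column] = value
    return result


masked = shuffle_column(rows, "last_name", lambda r: r["type"] == "Customer", seed=42)
for row in masked:
    print(row)
```

A plain shuffle may leave some values in their original row, so on small tables it is worth checking whether that is acceptable.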
![Page 28: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/28.jpg)
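Blank: Delete values from columns
Scramble: Replace existing characters

Columns: First name, Last name, Type, Comment, E-mail

| First name | Last name | Type |
|---|---|---|
| John | Smith | Customer |
| Max | Williams | Customer |
| Joe | Clark | Customer |
| | DATPROF | Company |

Comment values: “Brother of J. Clark”, “Has debt”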
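Blank and scramble can both be written as small value-level functions. The sketch below is one possible reading of "replace existing characters": letters become `x` or `X` so that length, case and punctuation survive, similar to the masked e-mail patterns in this deck. The function names and the sample e-mail address are made up.

```python
def blank(value: str) -> str:
    """Blank: delete the value, for example a free-text comment."""
    return ""


def scramble(value: str) -> str:
    """Scramble: replace every letter, keeping case, digits and punctuation."""
    out = []
    for ch in value:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        else:
            out.append(ch)  # digits, '@', '.', '_' and spaces stay as they are
    return "".join(out)


print(repr(blank("Brother of J. Clark")))
print(scramble("j.smith@Example.com"))  # prints x.xxxxx@Xxxxxxx.xxx
```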
![Page 29: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/29.jpg)
First day: Change dates to first day within same month and year

| Nr. | First name | Last name | Type | Co.. | E-mail | Date of Birth |
|---|---|---|---|---|---|---|
| 123 | John | Smith | Customer | | x.xxxxx@xxxx... | 01-02-1954 |
| 321 | Max | Williams | Customer | | Xxxxx_xxx@xx... | 01-11-1984 |
| 789 | Joe | Clark | Customer | | x_xx@XxxXxxx... | 01-03-1974 |
| 456 | | DATPROF | Company | | | |

| | Date of Birth | 1st day of month | 1st day of year |
|---|---|---|---|
| Postal code | 87% | 3.7% | 0.04% |

Source: research on anonymity by Prof. Dr. Latanya Sweeney (Harvard University)
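The "first day" technique maps directly onto `date.replace` in Python's standard library. In the sketch below the input date is made up, and the day-month-year format is an assumption based on the dates shown on the slide.

```python
from datetime import date, datetime


def first_day_of_month(d: date) -> date:
    """Keep month and year, move the day to the 1st."""
    return d.replace(day=1)


def first_day_of_year(d: date) -> date:
    """Keep only the year, move the date to 1 January."""
    return d.replace(month=1, day=1)


# Hypothetical original date of birth, written as day-month-year.
birth = datetime.strptime("17-02-1954", "%d-%m-%Y").date()

print(first_day_of_month(birth).strftime("%d-%m-%Y"))  # prints 01-02-1954
print(first_day_of_year(birth).strftime("%d-%m-%Y"))   # prints 01-01-1954
```

Month and year stay usable for age-based test cases, while the exact birthday no longer appears in the test data.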
![Page 30: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/30.jpg)
Look-up: Replace values with values from a lookup table

| Nr. | First name | Last name | Type | .. | E-mail | Date of birth |
|---|---|---|---|---|---|---|
| 123 | John | Smith | Customer | | x.xxxxx@xxxx... | 01-02-1954 |
| 321 | Max | Williams | Customer | | Xxxxx_xxx@xx... | 01-11-1984 |
| 789 | Joe | Clark | Customer | | x_xx@XxxXxxx... | 01-03-1974 |
| | | DATPROF | Company | | | |

Replacement first names: James, Adrian, Thomas

Reference data (First names): Chris, Thomas, James, Ruben, Adrian, Michael, David
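A look-up replacement needs a rule for choosing the replacement value. The sketch below uses a salted hash so that the same input always yields the same replacement; that rule, the `salt` parameter and the function name are assumptions, and the sketch does not reproduce the exact name assignment shown on the slide.

```python
import hashlib

# Reference data: the first names listed on the slide.
FIRST_NAMES = ["Chris", "Thomas", "James", "Ruben", "Adrian", "Michael", "David"]


def lookup_replace(value: str, reference: list, salt: str = "demo") -> str:
    """Pick a replacement from `reference`, deterministically for each input value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(reference)
    return reference[index]


for name in ["John", "Max", "Joe"]:
    print(name, "->", lookup_replace(name, FIRST_NAMES))
```

A deterministic choice keeps a person's replacement name identical across tables and runs. With a short reference list, several originals can map to the same replacement, and a value can even map to itself if it also occurs in the list.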
![Page 31: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/31.jpg)
Expression: Use custom made functions

| Nr. | First name | Last name | Type | Comment | E-mail | Date of birth |
|---|---|---|---|---|---|---|
| 123 | Thomas | Smith | Customer | Scrambled | [email protected] | 01-02-1954 |
| 321 | James | Williams | Customer | Scrambled | | 01-11-1984 |
| 789 | Adrian | Clark | Customer | Scrambled | | 01-03-1974 |
| 456 | | DATPROF | Company | | | |
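An expression rule derives a new value from other columns of the same row. The slide does not show the function that was used, so the sketch below is purely hypothetical: it rebuilds an e-mail address from the already masked first and last name, with `example.com` as a placeholder domain.

```python
def email_expression(first_name: str, last_name: str, domain: str = "example.com") -> str:
    """Build a synthetic e-mail address from the (already masked) name columns."""
    local = f"{first_name}.{last_name}".lower().replace(" ", "")
    return f"{local}@{domain}"


print(email_expression("Thomas", "Smith"))  # prints thomas.smith@example.com
```

Deriving the address from the masked names keeps the row internally consistent, which a character-level scramble of the original address would not.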
![Page 32: DATPROF Test data Management (data privacy & data subsetting) - English](https://reader035.fdocuments.net/reader035/viewer/2022062218/58edaa7c1a28ab46528b46db/html5/thumbnails/32.jpg)
1. Import Meta data
2. Define masking rules
3. Deployment