Fines in the Millions Levied Every Year Coming Soon! The Business Case for Synthetic Data Generation


James Chan

DevOps: Continuous Delivery

CA Technologies

Sr. Principal Consultant, Technical Sales

D04T26T

#CAWorld


Terms of this Presentation

For Informational Purposes Only

© 2015 CA. All rights reserved. All trademarks referenced herein belong to their respective companies.

The content provided in this CA World 2015 presentation is intended for informational purposes only and does not form any type of warranty. The information provided by a CA partner and/or CA customer has not been reviewed for accuracy by CA.

CA does not provide legal advice. Neither this document nor any CA software product referenced herein shall serve as a substitute for your compliance with any laws (including but not limited to any act, statute, regulation, rule, directive, policy, standard, guideline, measure, requirement, administrative order, executive order, etc. (collectively, “Laws”)) referenced in this document. You should consult with competent legal counsel regarding any Laws referenced herein.


Abstract

As data becomes ever more central to daily operations, pressure is mounting on organizations to become fully secure. Legislation is becoming increasingly stringent throughout the world. The common practice of using production data in non-production environments will soon risk fines in the millions of dollars.



Agenda

1 DATA BREACHES IN 2014-2015

2 APPLICATION QUALITY AND DATA

3 SYNTHETIC DATA GENERATION

4 GLOBAL TRENDS IN DATA REGULATION

5 THE GDPR AND HOW IT AFFECTS YOU

6 MASKING DATA FOR COMPLIANCE


Information collected from public sources to generate Gemalto report: http://www.gemalto.com/brochures-site/download-site/Documents/Gemalto_H1_2015_BLI_Report.pdf


The cost of a data breach is on the rise.

The average cost of a data breach rose 15 percent last year to $3.5 million.

Data breaches will cost $2.1 trillion in 2019.

The Target data breach (2013) cost shareholders $148 million.

1 Ponemon Institute, 2014

2 Juniper Research, 2015

3 Forbes, 2014


Application Quality and Data

[Diagram: a delivery pipeline across DEV, QA/Test, PRE-PROD and PRODUCTION. Labeled steps: Requirements, Design Spec, Release Plan, code commit, SCM, CI/Build, functional testing, integration testing, UAT, performance testing, deploy to pre-prod, deploy to production, Customer Experience, with a test data warehouse. Callout: SHIFT LEFT, find defects here … instead of here.]

If we’re going to test, we need data. So where do we get the data we need?

Production data is there, but it’s real data. It’s our personal data.

Synthetic data has no PII, but it’s traditionally been too difficult to implement (scripts, lack of common framework, adoption).


Using Production Data Risks Non-Compliance

Sensitive data is stored inconsistently enterprise-wide, in uncontrolled spreadsheets.

Fully masking data is highly complicated—the complex data first has to be fully profiled…

…Usually some complex relationships are left unmasked as a form of compromise.

Manual masking, in-house tools and ETL processes are slow and error-prone, while 58 percent of data breaches are caused by internal human error.

1 Infosec, 2014


Legislation is Becoming More Stringent Worldwide

“Dutch Parliament Adopts Data Breach Notification Obligation and Increases Fines”

“Reach of Nevada Personal Data Laws Extended”

“Data Breach Notification Bills Introduced in House and Senate”

“Australia’s New Mandatory Data Retention Law”

“House to Move on Student Data Privacy”

“Data Breach Provisions in Outsourcing Contracts”

“New Data Protection Powers Requested in Oregon”

“The Personal Data Notification & Protection Act Seeks Uniformity in Responses to Data Security Breaches”

“Processing Personal Data in Russia? Consider These Changes to Russian Law and How They May Impact Your Business”

“New European Data Protection Guidelines Published”

“ISO 27018 – Data Protection Standards for the Cloud”

“FTC Continues to Expand Its Role as All-Purpose Data Privacy and Security Regulator”

“FCC Cracks Down on Consumer Privacy Violations”

“Florida Law Requires Businesses to Ramp Up Data Protection or Face Steep Penalties”

“Delaware Data Disposal Law Requires Action by Affected Businesses”

“New Data Privacy Rules on Mobile Payments”

“African Union Adopts Convention on Cybersecurity and Personal Data Protection”

“China’s New Consumer Protection Law”

“Singapore’s Personal Data Protection Act Now in Force”

“Increased Enforcement of Data Protection Law Expected”


Draft European Union General Data Protection Regulation (GDPR)—To Be Enforced Soon

Proposed EU Data Protection Legislation

[Timeline graphic: 2016, 2017]


The GDPR Seeks To:

Unify legislation, so that a single set of rules applies across the EU.

Place responsibility on anyone processing EU citizens’ data, even if they are outside of the EU.

– “The controller shall adopt policies and implement measures to be able to demonstrate that the processing of personal data is performed in compliance with this Regulation.”


These responsibilities will likely include:

Having an independent data protection officer (DPO) if the organization has more than 250 employees

Conducting data protection impact assessments covering the lifecycle of personal data

Having data protection policies in place and being able to demonstrate that employees know them

Reporting data breaches no later than 72 hours after they occur


It will be harder to use production data for testing and development.

The GDPR will strengthen existing legislation forbidding the use of personal data for purposes other than those for which it was given.

Data can only be used if:

– Explicit or unambiguous consent has been given for its use for the specific purpose.

– It is necessary for legal purposes (e.g. to fulfil a contract, the subject’s vital interest).

– It is necessary for public interest or for a legitimate interest of the processor.

Data shall not be retained “beyond the minimum necessary, in terms of amount of the data and time of their storage” and shall not be made accessible to an indefinite number of individuals.


Organizations processing EU data must be able to satisfy:

The right to data portability: a citizen’s right to request a copy of data in a format usable by them

The right to erasure: for data to be forgotten unless there is a legitimate reason to keep it.


Organizations must be able to demonstrate compliance:

Each member state will establish a supervisory authority with investigative powers.

These authorities will be able to levy fines, which could be around 1 million € or 2% of annual turnover/revenue (whichever is higher), if full compliance cannot be demonstrated.
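As a rough illustration of the "whichever is highest" rule, the sketch below takes the larger of a fixed amount and a share of annual turnover. It assumes the share on this slide is 2 percent, as in the draft regulation at the time; the function name and the sample turnover figures are hypothetical, and none of this is legal guidance.

```python
from decimal import Decimal

FIXED_FINE_EUR = Decimal("1000000")  # "around 1 million €" from the slide
TURNOVER_RATE = Decimal("0.02")      # assumption: 2 percent of annual turnover


def estimated_max_fine(annual_turnover_eur: Decimal) -> Decimal:
    """Return the larger of the fixed fine and the turnover-based fine."""
    return max(FIXED_FINE_EUR, annual_turnover_eur * TURNOVER_RATE)


if __name__ == "__main__":
    # Hypothetical turnovers: one below and one above the 50 million € break-even point.
    for turnover in (Decimal("10000000"), Decimal("500000000")):
        fine = estimated_max_fine(turnover)
        print(f"turnover {turnover:,} EUR -> exposure {fine:,} EUR")
```

Under these assumptions the fixed amount dominates until turnover passes 50 million €, after which the exposure grows with revenue.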


Will you be compliant with upcoming legislation?

52% of senior European IT professionals said they are not ready for the General Data Protection Regulation.

35% admitted that they did not know whether their IT policies and processes were up to the necessary standards.

Only 11.1% of US organizations were fully PCI DSS compliant by their annual baseline assessment in 2013.

1 & 2 Ipswitch, 2014

3 Verizon, 2014


Part of the Solution: Efficient and Effective Data Masking

[Diagram: production data sources and files (XML Files, Excel Files, SQL Files, CSV Files, Fixed Definition Files, HTML Files, VSAM/ISAM, Swift, TXT Files) pass through Test Data Manager in three numbered stages. Labels: PII Discovery, Profiling, Validation, Data Subset, Native Engines, Audit, Seed Tables, Custom Masking Functions, Cross Referencing, Masked. Output: Secure Data Subsets delivered to Test/Dev Environments.]


Automatically discover sensitive data stored enterprise-wide.

Powerful algorithms discover all sensitive data, enterprise-wide

Support for every major database type, mainframe platforms and flat files

Simple, easy-to-use interface for selecting the columns or tables to mask and the masking routines

Multiple columns can be selected at once

XML Files, Excel Files, SQL Files, CSV Files, Fixed Definition Files, HTML Files, VSAM/ISAM, Swift, TXT Files
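The slide does not say how the discovery algorithms work. As a much simpler stand-in, the sketch below flags columns whose sampled values mostly match a known pattern. The patterns, the 80 percent threshold and all function names are assumptions for illustration, not the product's method.

```python
import re

# Hypothetical patterns; a real discovery tool would use many more signals.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "card_number": re.compile(r"^\d{13,19}$"),
}


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to cut false positives on card-like numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def matches(label: str, value: str) -> bool:
    """Check one value against one pattern, normalizing card separators first."""
    if label == "card_number":
        value = value.replace(" ", "").replace("-", "")
        return bool(PATTERNS[label].match(value)) and luhn_valid(value)
    return bool(PATTERNS[label].match(value))


def classify_column(values, threshold=0.8):
    """Return the first label matched by at least `threshold` of the values, else None."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return None
    for label in PATTERNS:
        hits = sum(1 for v in cleaned if matches(label, v))
        if hits / len(cleaned) >= threshold:
            return label
    return None


def discover(table):
    """Map column name -> detected label for a dict of column -> sampled values."""
    found = {}
    for column, values in table.items():
        label = classify_column(values)
        if label is not None:
            found[column] = label
    return found
```

Sampling values rather than trusting column names is the point of the sketch: a column called `notes` can still hold card numbers.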


Create millions of rows of referentially intact data in minutes.

Millions of rows of data masked in minutes using native masking engines

GT Fast Data Masker uses native database scripts to produce the highest possible performance when masking Oracle, MS SQL Server, Teradata and mainframe platforms.

Over 80 built-in data masking functions

These include substitution, randomization, hashing and seed data. Once created, masking rules can be stored and re-used from a central repository.

GT Fast Data Masker will only display suitable masking routines, based on the selected columns and tables.

Cross-referencing to manage data synchronization across systems

Deterministic masking functions and built-in cross-referencing ensure consistency. The referentially intact, realistic data can be injected into multiple systems at once.
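The slide names deterministic masking and cross-referencing but does not show how GT Fast Data Masker implements them. A minimal sketch of the general idea, assuming a keyed hash (HMAC-SHA256) drives both substitution from a seed list and surrogate key generation; the key, seed values and function names are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-managed-secret"  # hypothetical key, kept out of test environments
SEED_SURNAMES = ["Archer", "Bennett", "Castillo", "Dawson", "Ellis", "Fischer"]  # made-up seed data


def _digest(value: str) -> int:
    """Keyed hash of the normalized value; the same input always gives the same number."""
    mac = hmac.new(SECRET_KEY, value.strip().lower().encode("utf-8"), hashlib.sha256)
    return int.from_bytes(mac.digest(), "big")


def mask_surname(real_surname: str) -> str:
    """Substitution: deterministically pick a replacement from the seed list."""
    return SEED_SURNAMES[_digest(real_surname) % len(SEED_SURNAMES)]


def mask_customer_id(real_id: str, width: int = 8) -> str:
    """Hashing: derive a fixed-width numeric surrogate for a key column."""
    return str(_digest(real_id) % 10**width).zfill(width)


# Two "systems" that share a customer key.
customers = [{"id": "C-1001", "surname": "Smith"}]
orders = [{"order": 1, "customer_id": "C-1001"}, {"order": 2, "customer_id": "C-1001"}]

masked_customers = [
    {"id": mask_customer_id(c["id"]), "surname": mask_surname(c["surname"])} for c in customers
]
masked_orders = [
    {"order": o["order"], "customer_id": mask_customer_id(o["customer_id"])} for o in orders
]
```

Because the same input always maps to the same output, the masked orders still join to the masked customer without a lookup table. Truncating the hash to eight digits means two real keys can collide, so a real implementation would need a wider surrogate or a collision check.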

Number of Rows | Time To Mask
201,722,392 | 9 minutes, 42 seconds
453,877,152 | 11 minutes, 22 seconds
768,088,071 | 7 minutes, 57 seconds
17,422,541 | 1 minute
46,579,485 | 1 minute, 25 seconds
1,759,612 | 13 seconds
47,895 | 5 seconds
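To compare runs, the figures in the table can be converted to rows per second. Only the arithmetic is shown here; the row counts and times come from the table, and the helper name is hypothetical.

```python
def rows_per_second(rows: int, minutes: int, seconds: int) -> float:
    """Average masking throughput for one run."""
    return rows / (minutes * 60 + seconds)


# (rows, minutes, seconds) as listed in the table above.
RUNS = [
    (201_722_392, 9, 42),
    (453_877_152, 11, 22),
    (768_088_071, 7, 57),
    (17_422_541, 1, 0),
    (46_579_485, 1, 25),
    (1_759_612, 0, 13),
    (47_895, 0, 5),
]

if __name__ == "__main__":
    for rows, minutes, seconds in RUNS:
        rate = rows_per_second(rows, minutes, seconds)
        print(f"{rows:>13,} rows  {rate:>12,.0f} rows/s")
```

The slide does not state the platform, hardware or number of masked columns for each run, so the rates are not directly comparable with each other.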


Helps You Achieve Demonstrable Compliance

Fully auditable, centralized provisioning, reporting and tracking

Assists with requirements to demonstrate compliance with the EU GDPR, GLBA, HIPAA, PCI DSS, PIPEDA and more

Mask in place or in flight for secure virtualization and outsourcing

All GT Fast Data Masker processes are fully auditable, while data is provisioned from a central warehouse

GT Fast Data Masker provides a log of the mask


But remember…

Even completely masked data says a lot – either commercially sensitive functional requirements can be inferred from the data or the data will break a system; you can’t have it both ways!

Masking data still cannot prevent internal human error.

The only way to truly secure your data is to not let production data leave production environments in any form.

This can only be achieved using synthetic data generation.


Synthetic Data Generation

[Diagram labels: “Empty”; CA TDM + Required Data Characteristics; Ready for Testing!]

Provision fit-for-purpose data anytime and every time!

Provision data with or without access to production systems!
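The slide does not show how CA TDM generates data. As a minimal sketch of the idea, the code below builds referentially intact rows purely from declared characteristics, using only the Python standard library. The table layout, value lists and function name are hypothetical.

```python
import random
from datetime import date, timedelta

FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Sam", "Taylor"]  # made-up values
SURNAMES = ["Archer", "Bennett", "Castillo", "Dawson", "Ellis"]
EDGE_AMOUNTS = [0, 1, 999, 10_000, 99_999_999]  # boundary values chosen on purpose


def generate(n_customers: int, max_orders: int, seed: int = 42):
    """Build customers and orders from scratch; no production row is ever read."""
    rng = random.Random(seed)  # seeded so a test run can be reproduced
    customers, orders = [], []
    for cid in range(1, n_customers + 1):
        customers.append({
            "customer_id": cid,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(SURNAMES)}",
            "email": f"user{cid}@example.test",  # reserved test domain, not a real mailbox
        })
        # randint may return 0, so customers without any orders can appear.
        for _ in range(rng.randint(0, max_orders)):
            orders.append({
                "order_id": len(orders) + 1,
                "customer_id": cid,  # foreign key always points at an existing parent
                "placed_on": date(2015, 1, 1) + timedelta(days=rng.randrange(365)),
                "amount_cents": rng.choice(EDGE_AMOUNTS),
            })
    return customers, orders


if __name__ == "__main__":
    customers, orders = generate(n_customers=5, max_orders=3)
    print(len(customers), "customers,", len(orders), "orders")
```

Since every value is invented, nothing here needs masking, and the generator can target boundary cases that production data may not contain.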


Summary: A Few Words to Review

The Solution

Masking technologies will continue to be used to address compliance issues, despite the inherent flaws in the strategy. The most mature organizations will start leveraging synthetic data technologies to completely remove the need for sensitive data in lower environments.

Regulations

Legislation around data privacy is on the rise worldwide. Using real data for testing will soon have not only quality ramifications, but legal effects.

Maintaining Quality

Organizations leveraging production data for testing today will have to completely rethink the way they acquire the data they need to ensure application quality.


Q & A


Recommended Sessions

SESSION # | TITLE | DATE/TIME
DO4T16S | Case Study: Manheim Implements Test Data Management to Reduce Testing Time and Costs | 11/18/2015 at 04:30 pm
DO4T03S | Analyst View: Leading Your DevOps Enterprise Journey – Gene Kim | 11/19/2015 at 10:30 am
DO4T17S | T-Mobile’s DevOps and Continuous Delivery Journey – Building a Foundation for a Future Built for Agility | 11/19/2015 at 2:00 pm


Must See Demos

Test Data Manager, DevOps, Theater 4

DevOps Simulation, DevOps, Theater 3

Test Case Optimizer, DevOps, Theater 4

TDM Integrations, DevOps, Theater 4


Follow On Conversations At…

Smart Bar, Theaters 3 & 4

Tech Talks, Theaters 3 & 4


For More Information

To learn more, please visit:

http://cainc.to/Nv2VOe

CA World ’15