Page 1

© Copyright 2016 Dell. All rights reserved.

InfoArchive: Ensuring Big Data Compliance and Reducing Risk with Real Time Analytics

Tord Svensson Alan Hutcheson

Page 2

JOIN THE CONVERSATION! #MMTM16

Take the LEAP personality quiz and win!

Connect with us

ECD SERVICES

Genius Labs Garden Level

Foyer

Page 3

An Enterprise Information Archiving Platform that unlocks data of all types, trapped in siloed applications, lowering IT costs, preserving compliance and putting application data to work.

Leave No Application Data Behind

Page 4

Page 5

Conflict: Analytics vs Compliance

Access = ƒ(Compliance)

Without proper controls, the compliance, risk, and/or legal teams will block efforts to move data into the lake.

ANALYTICS: Access to as much data as possible

COMPLIANCE: Protect & control access to data

Page 6

Analytics Use Cases

HISTORICAL: Make better decisions based on understanding what happened

REAL TIME: Make better decisions based on what is happening

Page 7

Historical Analytics

[Diagram labels: Data Source (×3), Ingestion, Archive, Compliance Engine, WORM Storage, Access, Real Time Analytics, Historical Analytics, eDiscovery, Act (×2)]

Page 8

Compliance and Data Lake

Build compliance into the Data Lake?

Compliance Data

Application data streaming into data lake

Filter compliance data into the Data Lake

DATA LAKE

Page 9

Bringing Compliance to Analytics

InfoArchive

Hadoop

HDFS MapReduce

Big Data Analytics

Big Data Analytics

Application built using Hadoop & 3rd party tools

Customer needs compliant archive for application decommissioning, active archiving & data reuse

Customer needs scalable big data analytics

Page 10

InfoArchive and Hadoop

HDFS API

XML XML XML

XML XML XML

InfoArchive

Hadoop

Retention policy

Filter Transform Tokenize Mask

Real time access to the information

Information proactively pushed

Option 2: Rendition

Option 1: Direct Access

Hadoop

REST/JDBC
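The Filter, Transform, Tokenize and Mask steps named on this slide could look roughly like the following Python sketch. It is an illustration only, not InfoArchive's implementation: the record fields, the `prepare` function, the HMAC-based token scheme and the demo key are all assumptions introduced here.

```python
import hashlib
import hmac

# Hypothetical key for the demo; a real deployment would manage keys elsewhere.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Deterministic token: the same input always yields the same token,
    so tokenized columns can still be joined or grouped."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with 'X'."""
    keep = min(visible, len(value))
    return "X" * (len(value) - keep) + value[len(value) - keep:]


def prepare(records, allowed_regions):
    """Apply the four steps to each record before it leaves the archive."""
    prepared = []
    for record in records:
        if record["region"] not in allowed_regions:  # Filter
            continue
        prepared.append({
            "customer_token": tokenize(record["customer_id"]),  # Tokenize
            "region": record["region"].upper(),                 # Transform
            "phone": mask(record["phone"]),                     # Mask
        })
    return prepared


if __name__ == "__main__":
    sample = [
        {"customer_id": "GREAL", "region": "or", "phone": "5035557555"},
        {"customer_id": "OTHER", "region": "wa", "phone": "2065550100"},
    ]
    for row in prepare(sample, allowed_regions={"or"}):
        print(row)
```

Tokenizing with a keyed hash rather than masking keeps the identifier usable for analytics joins without exposing the original value; whether that trade-off is acceptable depends on the applicable compliance rules.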

Page 11

InfoArchive and Hadoop – Analytical Rendition

Analytical Rendition
• Partial masking and redaction
• Transformation to suitable format, e.g. JSON
• Retention policy is inherited from the source data

<?xml version="1.0" encoding="utf-8"?>
<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <ContactName>Howard Snyder</ContactName>
    <ContactTitle>Marketing Manager</ContactTitle>
    <Phone>(503) 555-7555</Phone>
    <SSNBR>123-45-4567</SSNBR>
    <FullAddress>
      <Address>2732 Baker Blvd.</Address>
      <City>Eugene</City>
      <Region>OR</Region>
      <PostalCode>97403</PostalCode>
      <Country>USA</Country>
    </FullAddress>
  </Customer>
  <Customer CustomerID="HUNGC">
    <CompanyName>Hungry Coyote Import Store</CompanyName>
    <ContactName>Yoshi Latimer</ContactName>
    <ContactTitle>Sales Representative</ContactTitle>
    <Phone>(503) 555-6874</Phone>
    <SSNBR>XXX-XX-XXXX</SSNBR>
    <FullAddress>
      <Address>City Center Plaza 516 Main St.</Address>
      <City>Elgin</City>
      <Region>OR</Region>
      <PostalCode>97827</PostalCode>
      <Country>USA</Country>
    </FullAddress>
  </Customer>
</Customers>

Delete

Mask
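As a rough illustration of an analytical rendition, the sketch below deletes one element, masks another and emits JSON. It is a minimal Python example under stated assumptions, not the product's rendition engine: the slide does not say which element the "Delete" callout points at, so treating `Phone` as the deleted element is a guess, and the names `DELETE_TAGS`, `MASK_TAGS` and `render` are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the slide's sample record.
SOURCE_XML = """<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <Phone>(503) 555-7555</Phone>
    <SSNBR>123-45-4567</SSNBR>
    <FullAddress>
      <City>Eugene</City>
      <Region>OR</Region>
    </FullAddress>
  </Customer>
</Customers>"""

# Hypothetical rendition rules (assumptions, not taken from the slide).
DELETE_TAGS = {"Phone"}
MASK_TAGS = {"SSNBR"}


def mask(value: str) -> str:
    """Replace every digit with 'X' and keep the separators."""
    return "".join("X" if ch.isdigit() else ch for ch in value)


def element_to_dict(elem):
    """Convert an element to a dict, applying the delete and mask rules."""
    result = dict(elem.attrib)
    for child in elem:
        if child.tag in DELETE_TAGS:
            continue  # redaction: the element is dropped entirely
        if len(child):  # element has nested children
            result[child.tag] = element_to_dict(child)
        elif child.tag in MASK_TAGS:
            result[child.tag] = mask(child.text or "")
        else:
            result[child.tag] = child.text
    return result


def render(xml_text: str) -> str:
    """Produce a JSON rendition of all Customer elements."""
    root = ET.fromstring(xml_text)
    customers = [element_to_dict(c) for c in root.findall("Customer")]
    return json.dumps(customers, indent=2)


if __name__ == "__main__":
    print(render(SOURCE_XML))
```

The sketch does not model retention: on the slide, the rendition inherits its retention policy from the source data, which would be handled by the archive rather than by the transformation itself.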

Page 12

Customer Use Cases

FSI
– Understanding customer interactions
– Compliance & Risk

Healthcare
– Understand treatment results to predict new approaches
– Predicting readmission rates

Page 13

Real Time Analytics

[Diagram labels: Data Source (×3), Ingestion, Archive, Compliance Engine, WORM Storage, Access, Real Time Analytics, Historical Analytics, eDiscovery, Act (×2)]

Page 14

Real Time Use Cases

INGESTION CENTRIC
• Scale-out ingestion processing
• Fault-tolerant processing
• Guaranteed data delivery

DOMAIN CENTRIC
• Digital Surveillance
• Fraud Prevention
• Data Loss Prevention
• Data Valuation
• Sentiment Analysis

DATA CENTRIC
• Metadata Collection
• Deduplication
• Automatic classification
• Data partitioning strategies
• Entity Extraction
• Index Management
• Update Elimination

COMPLIANCE CENTRIC
• Dodd-Frank, MiFID II…
• Right to be Forgotten
• Right to Access
• Retention Decisions
• Encryption Decisions
• Tokenization Decisions

Page 15

High Volume Ingestion Framework

Ingestion Framework

Microservices

Spring Cloud Data Flow

Cloud Foundry, Yarn, Mesos, Kubernetes

InfoArchive

Page 16

Adaptable

Ingestion Framework

Microservices

InfoArchive

Page 17

Real Example

Ingestion Framework

Extract

Microservices

InfoArchive

Enrich

Archive (dedup and package)

Confirm

LDAP
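The stages named on this slide (Extract, Enrich, Archive with dedup and packaging, Confirm) can be sketched as a chain of small functions. This is a simplified Python illustration, not the ingestion framework itself: the `user|message` input format, the in-memory `DIRECTORY` dict standing in for an LDAP lookup, and the package size are all assumptions made for the example.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for an LDAP directory lookup.
DIRECTORY = {"hsnyder": "Marketing", "ylatimer": "Sales"}


def extract(raw_lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Extract: parse 'user|message' lines into records."""
    for line in raw_lines:
        user, _, message = line.partition("|")
        yield {"user": user.strip(), "message": message.strip()}


def enrich(records: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Enrich: add the department found in the directory, if any."""
    for record in records:
        yield {**record, "department": DIRECTORY.get(record["user"], "unknown")}


def archive(records: Iterable[Dict[str, str]], package_size: int = 2) -> List[List[Dict[str, str]]]:
    """Archive: drop exact duplicates, then group records into packages."""
    seen = set()
    packages: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for record in records:
        digest = hashlib.sha256(repr(sorted(record.items())).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate record, already archived
        seen.add(digest)
        current.append(record)
        if len(current) == package_size:
            packages.append(current)
            current = []
    if current:
        packages.append(current)
    return packages


def confirm(packages: List[List[Dict[str, str]]]) -> Dict[str, int]:
    """Confirm: report what was archived."""
    return {"packages": len(packages), "records": sum(len(p) for p in packages)}


def run(raw_lines: Iterable[str]) -> Dict[str, int]:
    return confirm(archive(enrich(extract(raw_lines))))


if __name__ == "__main__":
    print(run(["hsnyder|hello", "hsnyder|hello", "ylatimer|order 42", "guest|hi"]))
```

In the deck each stage runs as its own microservice; plain generator functions are used here only to show how data flows from one stage to the next.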

Page 18

Real Example Continued

Ingestion Framework

Extract

Microservices

InfoArchive

Enrich

Archive (dedup and package)

Confirm

LDAP

Sentiment

CoreNLP

Page 19

The Big Picture

Many Frameworks

Microservices

...

Entity Recognition

Sentiment Analysis

Language Dependencies

Adaptive Learning

Classification

Topic Modeling

Similarity Calculations

Vectorization

Shallow Learning

Real-Time Processing

Text Analytics

Machine Learning

Natural Language Processing

Advanced Text Processing

Page 20

Conclusion & Call to Action

• InfoArchive
– enables both historical and real time analytics
– while remaining compliant

• Historical Analytics
– Direct or via rendition

• Real Time Analytics
– When data is ingested
– Enrich or take action

Feedback:
– What are your archiving use cases?
– What are your analytics use cases?

Page 21

MOMENTUM BARCELONA APP AND WIN!

http://bit.ly/mmtm16BCN

BEYOND SILOS Play the BEYOND Game and win a Raspberry Pi pre-loaded with InfoArchive

Page 22

LET US KNOW WHAT YOU THOUGHT

Take the Session Survey

1. Open the schedule with the Momentum App
2. Go to the session you attended
3. Open “Session Survey”
4. Answer the 4 questions and submit.

Thank you!

Page 23