© Copyright 2016 Dell. All rights reserved.
InfoArchive: Ensuring Big Data Compliance and Reducing Risk with Real Time Analytics
Tord Svensson Alan Hutcheson
JOIN THE CONVERSATION! #MMTM16
Take the LEAP personality quiz
and win!
Connect with us
ECD SERVICES
Genius Labs Garden Level
Foyer
An Enterprise Information Archiving Platform that unlocks data of all types, trapped in siloed applications, lowering IT costs, preserving compliance and putting application data to work.
Leave No Application Data Behind
Conflict: Analytics vs Compliance
Access = ƒ(Compliance)
Without proper controls, the compliance, risk, and/or legal teams will block efforts to move data into the lake.
ANALYTICS
COMPLIANCE
Access to as much data as possible
Protect & control access to data
?
Analytics Use Cases
HISTORICAL: Make better decisions based on understanding what happened
REAL TIME: Make better decisions based on what is happening
Historical Analytics
[Diagram: Data Sources; Ingestion; Archive with Compliance Engine and WORM Storage; Access; stages labeled Real Time Analytics, Historical Analytics, and eDiscovery; two "Act" callouts]
Compliance and Data Lake
Build compliance into the Data Lake?
Compliance Data
Application data streaming into data lake
Filter compliance data into the Data Lake
DATA LAKE
Bringing Compliance to Analytics
InfoArchive
Hadoop
HDFS MapReduce
Big Data Analytics
Big Data Analytics
Application built using Hadoop & 3rd party tools
Customer needs compliant archive for application decommissioning, active archiving & data reuse
Customer needs scalable big data analytics
InfoArchive and Hadoop
[Diagram: InfoArchive (XML data, retention policy) connected to Hadoop]
Option 1: Direct Access
– Real time access to the information
– REST/JDBC
Option 2: Rendition
– Information proactively pushed
– HDFS API
– Filter, Transform, Tokenize, Mask
InfoArchive and Hadoop – Analytical Rendition
Analytical Rendition
• Partial masking and redaction
• Transformation to suitable format, e.g. JSON
• Retention policy is inherited from the source data
<?xml version="1.0" encoding="utf-8"?>
<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <ContactName>Howard Snyder</ContactName>
    <ContactTitle>Marketing Manager</ContactTitle>
    <Phone>(503) 555-7555</Phone>
    <SSNBR>123-45-4567</SSNBR>
    <FullAddress>
      <Address>2732 Baker Blvd.</Address>
      <City>Eugene</City>
      <Region>OR</Region>
      <PostalCode>97403</PostalCode>
      <Country>USA</Country>
    </FullAddress>
  </Customer>
  <Customer CustomerID="HUNGC">
    <CompanyName>Hungry Coyote Import Store</CompanyName>
    <ContactName>Yoshi Latimer</ContactName>
    <ContactTitle>Sales Representative</ContactTitle>
    <Phone>(503) 555-6874</Phone>
    <SSNBR>XXX-XX-XXXX</SSNBR>
    <FullAddress>
      <Address>City Center Plaza 516 Main St.</Address>
      <City>Elgin</City>
      <Region>OR</Region>
      <PostalCode>97827</PostalCode>
      <Country>USA</Country>
    </FullAddress>
  </Customer>
</Customers>
Delete
Mask
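The rendition idea on this slide (masking, deletion, and transformation to JSON) can be sketched in a few lines of Python. This is only an illustration, not InfoArchive's rendition mechanism: the function names `mask_digits` and `render`, the flattened sample record, and the choice to mask `SSNBR` and delete `Phone` are all assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

# Flattened sample record (assumption: nested elements are left out for brevity).
SOURCE_XML = """<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <Phone>(503) 555-7555</Phone>
    <SSNBR>123-45-4567</SSNBR>
  </Customer>
</Customers>"""


def mask_digits(value):
    """Replace every digit with 'X' and keep separators such as '-'."""
    return "".join("X" if ch.isdigit() else ch for ch in value)


def render(xml_text, mask_fields=("SSNBR",), delete_fields=("Phone",)):
    """Build a list of JSON-ready dicts from the XML.

    Fields in delete_fields are dropped and fields in mask_fields are masked.
    Which fields get which treatment is a hypothetical policy.
    """
    root = ET.fromstring(xml_text)
    records = []
    for customer in root.findall("Customer"):
        record = {"CustomerID": customer.get("CustomerID")}
        for child in customer:
            if child.tag in delete_fields:
                continue  # "Delete": the field never reaches the rendition
            text = child.text or ""
            if child.tag in mask_fields:
                text = mask_digits(text)  # "Mask": keep the shape, hide the value
            record[child.tag] = text
        records.append(record)
    return records


if __name__ == "__main__":
    print(json.dumps(render(SOURCE_XML), indent=2))
```

The source archive stays untouched; only the derived copy handed to the analytics side is altered.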
Customer Use Cases
FSI
– Understanding customer interactions
– Compliance & Risk
Healthcare
– Understand treatment results to predict new approaches
– Predicting readmission rates
Real Time Analytics
[Diagram: Data Sources; Ingestion; Archive with Compliance Engine and WORM Storage; Access; stages labeled Real Time Analytics, Historical Analytics, and eDiscovery; two "Act" callouts]
Real Time Use Cases
INGESTION CENTRIC
• Scale-out ingestion processing
• Fault-tolerant processing
• Guaranteed data delivery
DOMAIN CENTRIC
• Digital Surveillance
• Fraud Prevention
• Data Loss Prevention
• Data Valuation
• Sentiment Analysis
DATA CENTRIC
• Metadata Collection
• Deduplication
• Automatic classification
• Data partitioning strategies
• Entity Extraction
• Index Management
• Update Elimination
COMPLIANCE CENTRIC
• Dodd-Frank, MiFID II…
• Right to be Forgotten
• Right to Access
• Retention Decisions
• Encryption Decisions
• Tokenization Decisions
High Volume Ingestion Framework
Ingestion Framework
Microservices
Spring Cloud Data Flow
Cloud Foundry, YARN, Mesos, Kubernetes
InfoArchive
Adaptable
Ingestion Framework
Microservices
InfoArchive
Real Example
Ingestion Framework
Microservices: Extract, Enrich, Archive (dedup and package), Confirm
LDAP
InfoArchive
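The four microservice stages named on this slide can be sketched as a chain of small Python functions. This is a minimal illustration under stated assumptions, not the presenters' implementation and not a Spring Cloud Data Flow or InfoArchive API: the stage order, the `sender|body` message format, the in-memory `DIRECTORY` dict standing in for an LDAP lookup, and every function name are hypothetical.

```python
import hashlib

# Hypothetical stand-in for an LDAP directory lookup.
DIRECTORY = {"jdoe": {"department": "Trading"}}


def extract(raw_messages):
    """Extract: split 'sender|body' strings (an assumed format) into records."""
    for raw in raw_messages:
        sender, _, body = raw.partition("|")
        yield {"sender": sender.strip(), "body": body.strip()}


def enrich(records, directory=DIRECTORY):
    """Enrich: attach directory attributes for known senders."""
    for rec in records:
        yield {**rec, **directory.get(rec["sender"], {})}


def archive(records):
    """Archive: drop exact duplicates by content hash and collect one package."""
    seen = set()
    package = []
    for rec in records:
        payload = f"{rec['sender']}\n{rec['body']}".encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()
        if digest in seen:
            continue  # duplicate content, already packaged
        seen.add(digest)
        package.append({**rec, "digest": digest})
    return package


def confirm(package):
    """Confirm: return the digests that were archived, as a receipt."""
    return [rec["digest"] for rec in package]


def run_pipeline(raw_messages):
    """Chain the stages: extract, enrich, then archive."""
    return archive(enrich(extract(raw_messages)))


if __name__ == "__main__":
    pkg = run_pipeline(["jdoe|Buy 100 shares", "jdoe|Buy 100 shares", "asmith|Hello"])
    print(len(pkg), "records archived;", len(confirm(pkg)), "confirmed")
```

Because each stage only consumes and produces plain records, a further stage could be inserted between any two without changing the others.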
Real Example Continued
Ingestion Framework
Microservices: Extract, Enrich, Archive (dedup and package), Confirm, Sentiment
LDAP
CoreNLP
InfoArchive
The Big Picture
Many Frameworks
Microservices
...
Entity Recognition
Sentiment Analysis
Language Dependencies
Adaptive Learning
Classification
Topic Modeling
Similarity Calculations
Vectorization
Shallow Learning
Real-Time Processing
Text Analytics
Machine Learning
Natural Language Processing
Advanced Text Processing
Conclusion & Call to Action
• InfoArchive
– enables both historical and real time analytics
– while remaining compliant
• Historical Analytics
– Direct or via rendition
• Real Time Analytics
– When data is ingested
– Enrich or take action
Feedback:
– What are your archiving use cases?
– What are your analytics use cases?
MOMENTUM BARCELONA APP AND WIN!
http://bit.ly/mmtm16BCN
BEYOND SILOS Play the BEYOND Game and win a Raspberry Pi pre-loaded with InfoArchive
LET US KNOW WHAT YOU THOUGHT
Take the Session Survey
1. Open the schedule with the Momentum App
2. Go to the session you attended
3. Open “Session Survey”
4. Answer the 4 questions and submit.
Thank you!