Governing Big Data What Strategy Works Best

Governing Big Data What Strategy Works Best 1.877.545.XACT www.xactdatadiscovery.com Nick Reizen, Vice President of eDiscovery Scott Polus, Director of Forensic Services with Bill Millican, Director of Sales & Operations XACT Data Discovery


Transcript of Governing Big Data What Strategy Works Best

Page 1: Governing Big Data What Strategy Works Best

Governing Big Data: What Strategy Works Best

1.877.545.XACT www.xactdatadiscovery.com

Nick Reizen, Vice President of eDiscovery
Scott Polus, Director of Forensic Services

with Bill Millican, Director of Sales & Operations

XACT Data Discovery

Page 2: Governing Big Data What Strategy Works Best

Why We’re Here

• What is happening with data growth?
• How does this impact your company?
• How does this impact YOU?
• Will the growth stop?
• What can we do?
• What should we do?
• What must we do?
• Why?

Page 3: Governing Big Data What Strategy Works Best

What We Hope to Learn

• The problem will not fix itself
• The problem will persist and increase in velocity as the mass increases
• There is a solution
• Time is of the essence
• What is Governance (different from management)
• The Strategy and Its Components

Page 4: Governing Big Data What Strategy Works Best

Statistics: How About Some

• By the year 2020 data will be gaining on us at a rate 4,300% greater than it is today
• Stored Customer Data
  – 2012: 1.2 ZB (zettabytes; 1 zettabyte = 1 trillion GB)
  – 2015: 7.9 ZB
  – 2020: 35 ZB
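To put the zettabyte figures above in terms of a yearly rate, here is a minimal sketch that derives the compound annual growth rate implied by the 2012 and 2020 numbers. It assumes a constant rate between the two endpoints, which is a simplification: the slide's 2015 figure of 7.9 ZB does not fall on that curve. The function and variable names are illustrative only.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Compound start_value forward at a fixed yearly rate."""
    return start_value * (1 + rate) ** years


# Figures quoted on the slide, in zettabytes (ZB).
ZB_2012 = 1.2
ZB_2020 = 35.0

# Assumption: growth is treated as constant over the eight years.
rate = compound_annual_growth_rate(ZB_2012, ZB_2020, years=2020 - 2012)
print(f"Implied growth rate: {rate:.1%} per year")

# Year-by-year projection under the constant-rate assumption.
for year in range(2012, 2021):
    print(year, f"{project(ZB_2012, rate, year - 2012):.1f} ZB")
```

Under that assumption the implied rate is roughly 52% a year, meaning the stored volume would more than double every two years.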

Page 5: Governing Big Data What Strategy Works Best

Statistics: How About Some

• From DataVersity: With numbers like these, when Big Data and Unstructured Data walk in the room, everyone else has less oxygen to breathe.
• From the Association of Certified Financial Crime Specialists: “However, in June of 2013 it surfaced that copies of the documents had already been moved from China to Las Vegas, and had been reviewed by the same attorneys who feigned ignorance to the judge.”

Page 6: Governing Big Data What Strategy Works Best

Statistics: How About Some

• Twitter: 200 million tweets per day = 46 MB per second
• FB: 640 million users, 50% log in daily
• FB: collects an average of 15 TB of data every day
• LinkedIn: 100 million users
• Yahoo Hadoop cluster is 82 PB = 40,000 servers
• 107 trillion e-mails were sent in 2010 (2014 = ?)
• Google has more than 50 billion pages in its index
• YouTube has 3 billion visitors per day with 48 hours of video uploaded per minute (May 2010; 2014 = ?)

Page 7: Governing Big Data What Strategy Works Best

Statistics: How About Some

• Global Mobile Data Traffic Forecast by Region (Cisco)

What we want to notice is not just the growth, but the compound nature of the growth.

Page 8: Governing Big Data What Strategy Works Best

More Statistics

• Amazon’s S3 Cloud Service had some 262 billion objects at the end of 2010 (2012 = 417 billion) with about 315,000 requests per second. What about 2014?
• 571 new websites are created every minute of the day
• The amount of data in the world today equates to:
  – Every US person tweeting 3 tweets per minute for 26,976 years
  – Every person in the world having more than 215 million high-resolution MRI scans A DAY
  – More than 200 billion HD movies, which would take a person 47 million years to watch

Page 9: Governing Big Data What Strategy Works Best

Statistics: How About Some

Multiples of bytes

SI decimal prefixes                 Binary usage    IEC binary prefixes
Name (Symbol)       Value                           Name (Symbol)       Value
kilobyte (kB)       10^3            2^10            kibibyte (KiB)      2^10
megabyte (MB)       10^6            2^20            mebibyte (MiB)      2^20
gigabyte (GB)       10^9            2^30            gibibyte (GiB)      2^30
terabyte (TB)       10^12           2^40            tebibyte (TiB)      2^40
petabyte (PB)       10^15           2^50            pebibyte (PiB)      2^50
exabyte (EB)        10^18           2^60            exbibyte (EiB)      2^60
zettabyte (ZB)      10^21           2^70            zebibyte (ZiB)      2^70
yottabyte (YB)      10^24           2^80            yobibyte (YiB)      2^80
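The difference between the decimal and binary columns grows with each step up the table, which matters when comparing vendor figures at zettabyte scale. The following sketch builds both sets of units and converts between them. The `UNITS` table and `convert` helper are names made up for this illustration, not part of any library.

```python
SI_SYMBOLS = ["kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
IEC_SYMBOLS = ["KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]

# Map each symbol to its size in bytes.
UNITS = {"B": 1}
for step, (si, iec) in enumerate(zip(SI_SYMBOLS, IEC_SYMBOLS), start=1):
    UNITS[si] = 10 ** (3 * step)   # decimal: powers of 1000
    UNITS[iec] = 2 ** (10 * step)  # binary: powers of 1024


def convert(value, from_unit, to_unit):
    """Convert a quantity between any two units in the UNITS table."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# One zettabyte expressed in gigabytes (the "1 trillion GB" from the earlier slide).
print(convert(1, "ZB", "GB"))

# How far the decimal and binary units drift apart at each step.
for si, iec in zip(SI_SYMBOLS, IEC_SYMBOLS):
    gap = UNITS[iec] / UNITS[si] - 1
    print(f"1 {iec} is {gap:.1%} larger than 1 {si}")
```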

Page 10: Governing Big Data What Strategy Works Best


Page 11: Governing Big Data What Strategy Works Best

Electronic Discovery Reference Model: a Viable Big Data Strategy in Disguise

• Structured vs. Unstructured Data (the IDC)

Again, what we want to notice is the exponential growth of what is Unstructured compared to that which is Structured.

This should sound an Alarm. A very loud ALARM.

This problem is very real: action is required NOW

Page 12: Governing Big Data What Strategy Works Best

Electronic Discovery Reference Model: a Viable Big Data Strategy in Disguise

Information Management: refers to the pre-discovery filing and retention of ESI, and is the new Information Governance

Identification

Preservation

Collection

Processing

Review

Analysis

Production


Page 13: Governing Big Data What Strategy Works Best

Electronic Discovery Reference Model


Page 14: Governing Big Data What Strategy Works Best

Information Management/Records Management

• Internal e-Discovery Preparedness & Response Team
• Identify & retain those who know: collaborate
• Perform inventory: find it, name it, categorize it
• Prepare location map: knowing where stuff is
• Advanced Inventory: making sure
• Procedures for processes
• Audit and Edit Policies and Supporting Manual (documentation)
• Validate retrieval and accuracy of data retrieved
• Making Sure: absolutely sure


Page 15: Governing Big Data What Strategy Works Best

Collection: an E-Discovery Term – Is it just E-Discovery?

• Collect what is most likely relevant or responsive: ESI is narrowed for many reasons, all of which must be documented and considered for reasonability
• Note: there may be ESI preserved but not further searched or collected because of inaccessibility (unreasonable or disproportionate cost and effort)
• Keyword search preparation, date ranges, file types
• In the Rule 26(f) conference, reach agreement on the “what”
• Deploy and collect data from all known repositories
  – (tapes, drives, portable storage devices, networks, etc.)
• Ensure use of chain of custody and authenticity protocols:
  – preserve metadata; MD5 or SHA-1 hash authentication, labeling, and identification
• Meticulous record keeping and documentation of procedures
  – Sources are required throughout the collection process.

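The hash authentication step mentioned above can be sketched as follows: compute MD5 and SHA-1 digests at collection time, store them with a custody record, and recompute them later to show the file has not changed. This is an illustration using Python's standard `hashlib` module, not the procedure of any particular forensic tool. The record fields (`source`, `collected_by`, and so on) are assumptions chosen for the example.

```python
import hashlib
from datetime import datetime, timezone


def hash_file(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Hash a file in chunks so large collections do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path, collected_by, source):
    """Build one chain-of-custody record (hypothetical field layout)."""
    return {
        "path": str(path),
        "source": source,
        "collected_by": collected_by,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "md5": hash_file(path, "md5"),
        "sha1": hash_file(path, "sha1"),
    }


def verify(path, entry):
    """Return True only if both recorded digests still match the file."""
    return (
        hash_file(path, "md5") == entry["md5"]
        and hash_file(path, "sha1") == entry["sha1"]
    )
```

A record would be created once at collection, for example `custody_entry("mailbox.pst", "analyst", "file server")`, and `verify` run again before processing or production. Any change to the file, even a single byte, makes `verify` return False.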

Page 16: Governing Big Data What Strategy Works Best

Processing

• Remove system files: stuff not needed (?)
• De-Duplication: how do you know (?)
• Date filter (dates could be a moving target)
• File type (do you know how many there can be?)
  – Upwards of 1,700 to 2,000. Nah? Yes.
• Keyword Searching: which “key” words
• Conceptual Searching: huh, what’s that (?)
• Process to Native or TIFF for Legal Team review
  – What does this mean (?)

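One common answer to "how do you know" a file is a duplicate is to compare content hashes rather than file names. The sketch below chains three of the filters listed above (file type, date range, de-duplication) over an in-memory list. The `Item` record and the `process` function are hypothetical names for this illustration. Real processing platforms apply more elaborate rules, for example de-duplicating e-mail by message fields.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    name: str
    modified: date
    content: bytes


def extension(name):
    """Return the lower-cased extension without the dot, or '' if none."""
    return name.rsplit(".", 1)[-1].lower() if "." in name else ""


def process(items, start, end, keep_extensions):
    """Apply file-type, date-range and exact-duplicate filters, in that order."""
    seen = set()
    kept = []
    for item in items:
        if extension(item.name) not in keep_extensions:
            continue  # system files and other unwanted types
        if not (start <= item.modified <= end):
            continue  # outside the agreed date range
        fingerprint = hashlib.sha1(item.content).hexdigest()
        if fingerprint in seen:
            continue  # same bytes already kept under another name
        seen.add(fingerprint)
        kept.append(item)
    return kept
```

Because the date range and file-type list are parameters, a change to the agreed scope (the "moving target" noted above) only means re-running the filter with new arguments.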

Page 17: Governing Big Data What Strategy Works Best

Processing

• Predictive Coding
  – How many of you know what this is?
  – How many of you are in some fashion using this approach?
  – How many of you know that some courts are requiring litigants to use predictive coding in the e-discovery process?


Page 18: Governing Big Data What Strategy Works Best

Review

• Review for relevance, confidentiality, and privilege, and related activity such as redaction
• 50%+ of E-Discovery expenses are in this phase
• Review in house
  – (Summation, Concordance)
• Review hosted
  – (Relativity, iCONECTnxt, FYI)


Page 19: Governing Big Data What Strategy Works Best

Inertia Wins the Battle: Bad Stuff Happens – to Those Who Refuse to Prepare

• Records Have Become Evidence
• The intricate attributes of data, information, content, records are now placed fully under the most powerful of legally litigated microscopes and it’s all DISCOVERABLE
• Privileged vs. Non-privileged: it is a slippery slope
• In the Discovery Process it’s all available
• Let’s read some more…….


Page 20: Governing Big Data What Strategy Works Best

Inertia Wins the Battle: Bad Stuff Happens – to Those Who Refuse to Prepare

• Scentsy Inc. v. B.R. Chase LLC:
  – …..the Court ordered that a defendant would be allowed to depose “appropriate individuals” to determine whether relevant information was destroyed, with Plaintiff to bear the costs.
  – …..moreover, while there was no way to know for sure if documents were deleted, the Court noted that the uncertainty was “caused by Scentsy’s inadequate retention policy coupled with its late and imprecise litigation hold.”
  – …..in the face of warnings given in this decision, constitutes the kind of willfulness or recklessness which may result in serious repercussions.


Page 21: Governing Big Data What Strategy Works Best

Inertia – Defeating It

• Ten Steps to Take: Beginning Today

• Step 1: Realize this is not a joke and it can, and probably will, happen to you

• Step 2: Get a good grasp of what is really going on in the market and in the world of BIG DATA.

• Step 3: Get up to speed on the world of Electronic Discovery and its relevance and dependency on Records Management and Information Governance


Page 22: Governing Big Data What Strategy Works Best

Inertia – Defeating It

• Ten Steps to Take: Beginning Today

• Step 4: Translate the Electronic Discovery conversation into the conversation about Data (Records / Information) Discovery.

• Step 5: Start, today, transforming your paradigm into knowing what your organization has, where it is stored, what it is called, who uses it and has access to it.

• Step 6: Begin to formulate questions about discovering the truth about your organization’s data – its records – its information – its content – its metadata: ALL OF IT


Page 23: Governing Big Data What Strategy Works Best

Inertia – Defeating It

• Ten Steps to Take: Beginning Today

• Step 7: Begin asking questions. I can almost guarantee you that no one in your organization (outside of the legal department working on a piece of litigation) is asking.

• Step 8: What questions:
  – a: where are the repositories
  – b: what are the file types
  – c: how much is there
  – d: who are the custodians
  – e: how does creation, receipt, destruction happen
  – f: and so on, and so on………

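Questions b and c in Step 8 (what file types exist, and how much of each) can be answered for a single repository with a short script. The sketch below walks a directory tree and tallies file counts and total bytes per extension. The `inventory` function name is made up for this illustration, and it assumes the repository is reachable as an ordinary file system path. Custodian information (question d) is stored differently on each platform and is left out.

```python
import os
from collections import Counter


def inventory(root):
    """Tally file count and total size in bytes per extension under root."""
    counts = Counter()
    sizes = Counter()
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            counts[ext] += 1
            sizes[ext] += size
    return counts, sizes


def report(root):
    """Print the file types under root, largest total size first."""
    counts, sizes = inventory(root)
    for ext, total in sizes.most_common():
        print(f"{ext:10} {counts[ext]:8} files {total:14} bytes")
```

Running `report` against each known repository and saving the output gives a first, rough location map to refine with the people who own the data.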

Page 24: Governing Big Data What Strategy Works Best

Inertia – Defeating It

• Ten Steps to Take: Beginning Today

• Step 9: Record Keeping – don’t forget who you are and what you are capable of accomplishing

• Step 10: Know what you’re after

• Great thought: what are you after? What are we after?

• How can we who implicitly understand the nuances, and all of the good, the bad and the ugly of Records Management and Information Governance………………


Page 25: Governing Big Data What Strategy Works Best

Governing Big Data: What Strategy Works Best

1.877.545.XACT www.xactdatadiscovery.com

Nick Reizen, Vice President of eDiscovery
Scott Polus, Director of Forensic Services

with Bill Millican, Director of Sales & Operations

XACT Data Discovery