This Webcast Will Begin Shortly - Webinars, Webcasts, LMS...

  • 1

    This Webcast Will Begin Shortly

    If you have any technical problems with the Webcast or the streaming audio, please contact us via email at:

    [email protected]

    Thank You!

    Electronic Data Discovery

    Technology & Terminology

    A Primer for In-House Counsel

    July 26, 2007

    Presented by the ACC Litigation Committee and Steptoe & Johnson LLP

    Association of Corporate Counsel

    www.acc.com

    Welcome!

  • 2

    Today’s Panel

    Stephanie Mendelsohn, Director of Corporate Records and Electronic Discovery, Genentech, Inc.

    José Ramón González-Magaz, Partner, Steptoe & Johnson LLP

    Mike Bergeron, Of Counsel, Steptoe & Johnson LLP

    Sonya Sigler, General Counsel, Cataphora, Inc.

    Bill Mooz, VP and General Counsel, Catalyst Repository Systems, Inc.

    Agenda

    I. Key Process Steps for Running an Electronic Discovery Project

    Identification, Preservation & Collection, Stephanie Mendelsohn, Genentech, Inc.

    Review of eDiscovery Data, José González-Magaz & Mike Bergeron, Steptoe & Johnson LLP

    II. Technology & Terminology

    Collection, Culling & Analysis, Sonya Sigler, Cataphora, Inc.

    Review & Production, Bill Mooz, Catalyst Repository Systems, Inc.

    III. Q&A with Panel

  • 3

    Identification, Preservation and Collection

    Stephanie Mendelsohn, Director of Corporate Records and Electronic Discovery, Genentech, Inc.

    Identify Data Sources

    Before any legal hold:
      Prepare for early attention to ESI:
        Describe infrastructure
        Identify most knowledgeable
        Identify data sources
      Document preservation and collection process

    After a legal hold is initiated:
      Identify custodians
      Identify data sources

  • 4

    Preservation Process

      Initiate legal hold to suspend routine disposition of documents and ESI.
      Engage in custodian interviews.
      Provide repositories as needed.
      Document, document, document.

    What must be preserved?

  • 5

    And Coming to a Phone Near You . . .

    Collection Process

      Who performs the collection?
      How is the collection performed?
      What is collected?
      Identifying any sources of data that are not reasonably accessible.
      Prioritizing reasonably accessible data sources.

  • 6

    Review of Electronic Discovery Data

    José Ramón González-Magaz, Partner, Steptoe & Johnson LLP

    Mike Bergeron, Of Counsel, Steptoe & Johnson LLP

    Reviewing the Data – Major Cost Factors

      Reviewing platform to be used.
      Native file review versus image-based review.
      Complexity of coding form.
      Degree of experience/specialization of the reviewers.
      Training of reviewers: Preparation of project manual.
      Scale of supervision needed.
      Quality control.

  • 7

    Training the Reviewers

      Techniques for reviewing documents/files/data efficiently and reliably.
      Detection of privileged materials and preparation of privilege log coding.
      Coding of data reviewed.
      Identification and handling of “close-call” data.
      Substantive parameters of the review.
      Use of the technology/equipment.

    Data Review Progress Reports

      Review rate assessment.
      Budget tracking.
      Substantive reports.
      Memorialization of project developments.

  • 8

    Review Rate Estimates

    Review rate*                          High             Average   Low
                                          (basic coding)             (advanced coding)
    Electronic                            800              600       400
    Scanned with objective coding**       300              200       100
    Scanned without objective coding**    500              400       300

    *Number of documents per 8-hour reviewer day

    **Objective coding = coding for Date, Document Type, Title, Author, Recipient, Copyee, etc.
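
    The review rate estimates above translate directly into staffing arithmetic. The sketch below is illustrative only: the rates come from the slide, while the function names, the 120,000-document project, and the team of 10 reviewers are hypothetical.

```python
import math

# Documents per 8-hour reviewer day, taken from the Review Rate Estimates slide.
REVIEW_RATES = {
    "electronic": {"high": 800, "average": 600, "low": 400},
    "scanned_with_objective_coding": {"high": 300, "average": 200, "low": 100},
    "scanned_without_objective_coding": {"high": 500, "average": 400, "low": 300},
}


def reviewer_days(doc_count, review_type, scenario="average"):
    """Whole reviewer days needed to review doc_count documents."""
    rate = REVIEW_RATES[review_type][scenario]
    return math.ceil(doc_count / rate)


def working_days(doc_count, review_type, reviewers, scenario="average"):
    """Working days needed when the reviewer days are split across a team."""
    return math.ceil(reviewer_days(doc_count, review_type, scenario) / reviewers)


# Hypothetical project: 120,000 electronic documents and 10 reviewers.
print(reviewer_days(120_000, "electronic"))      # 200
print(working_days(120_000, "electronic", 10))   # 20
```

    Running the same numbers under the "low" scenario gives a quick sense of how much an advanced coding form could stretch the schedule.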

    Quality Control of Data Review

      Confirm supervision given during the review.
      Ensure all data were reviewed.
      Random check of coding performed.
      Second level review of odd tags.
      Client confirmation that correct reviewing parameters were properly applied.

  • 9

    Collection, Culling & Analysis

    Sonya Sigler, General Counsel, Cataphora, Inc.

    Collection

    Collection Tools/Methods:
      Mirror Image of Hard Drives or Servers
      Self Selection
      Others

    Data Mapping Appliances (ESI blueprint):
      Kazeon
      Deep Dive

    Forensic Analysis:
      Deleted or Missing Data
      Not What You Expected

  • 10

    Collection Philosophy

    Narrow Based Collection:
      By Custodian - John Doe
      By Date Range - January 1, 2002 - July 31, 2006
      Documents Pulled by Keywords - fraud, invoice

    Broad Based Collection:
      Collect it ALL
      Cull After Collection
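
    A narrow based collection amounts to applying the custodian, date-range, and keyword criteria as filters at the point of collection. The following is a minimal sketch using the slide's own example criteria; the `Document` record, the `narrow_collect` function, and the sample documents are hypothetical and do not describe any particular collection tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    custodian: str
    created: date
    text: str


def narrow_collect(docs, custodians, start, end, keywords):
    """Keep documents that satisfy the custodian, date-range, and keyword criteria."""
    wanted_custodians = {c.lower() for c in custodians}
    wanted_keywords = [k.lower() for k in keywords]
    selected = []
    for doc in docs:
        if doc.custodian.lower() not in wanted_custodians:
            continue
        if not (start <= doc.created <= end):
            continue
        body = doc.text.lower()
        if any(k in body for k in wanted_keywords):
            selected.append(doc)
    return selected


docs = [
    Document("John Doe", date(2003, 5, 1), "Please resend the invoice for March."),
    Document("John Doe", date(2007, 1, 15), "Invoice attached."),        # outside date range
    Document("Jane Roe", date(2004, 2, 2), "Possible fraud reported."),  # other custodian
    Document("John Doe", date(2005, 9, 9), "Lunch on Friday?"),          # no keyword
]

hits = narrow_collect(docs, ["John Doe"], date(2002, 1, 1), date(2006, 7, 31),
                      ["fraud", "invoice"])
print(len(hits))  # 1
```

    A broad based collection would skip these filters up front and apply equivalent culling after everything has been collected.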

    Culling Goals

    Readily Accessible Data:
      Readily Accessible under FRCP 34

    Not Readily Accessible:
      Database data
      Source Code, etc.

    Reduce your Data Set
    Make it Manageable

  • 11

    De-duplication Methods

    MD5 hash values:
      Do I need to know what this is???

    De-duplication of Data Sets:
      Within custodian sets
      Across custodian sets
      Across all data sets

    Near Duplicates

    Know what is being done to your data:
      ALWAYS ask! Vendors need to explain this clearly.
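
    For readers who do want to know what an MD5 hash value is: it is a 128-bit fingerprint computed from a file's bytes, so two files with identical content produce the same value. The sketch below shows, under simplifying assumptions, how the choice between de-duplicating within custodian sets and across custodian sets changes what survives. The `dedupe` function, the scope names, and the sample files are hypothetical; vendors' actual processes may hash different parts of a document (for example, selected e-mail fields), which is one more reason to ask.

```python
import hashlib


def md5_of(content: bytes) -> str:
    """Return the MD5 digest of the content as a hexadecimal string."""
    return hashlib.md5(content).hexdigest()


def dedupe(files, scope="across_custodians"):
    """files: list of (custodian, name, content bytes). Returns (kept, dropped)."""
    seen = set()
    kept, dropped = [], []
    for custodian, name, content in files:
        digest = md5_of(content)
        # Within-custodian scope only treats copies held by the same person as duplicates.
        key = digest if scope == "across_custodians" else (custodian, digest)
        if key in seen:
            dropped.append((custodian, name))
        else:
            seen.add(key)
            kept.append((custodian, name))
    return kept, dropped


files = [
    ("doe", "memo.doc", b"Q3 forecast"),
    ("doe", "memo copy.doc", b"Q3 forecast"),
    ("roe", "fwd memo.doc", b"Q3 forecast"),
    ("roe", "notes.txt", b"Q3 forecast."),  # near duplicate: one extra character
]

kept_within, dropped_within = dedupe(files, scope="within_custodian")
kept_across, dropped_across = dedupe(files, scope="across_custodians")
print(len(kept_within), len(kept_across))  # 3 2
```

    The near duplicate survives both passes because a single changed character yields a different hash, which is why near-duplicate detection is a separate technique from hash-based de-duplication.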

    Duplicate Range

    [Chart: duplicate range of 25% to 90%; labels: Broad Based Collection, Restoring Back-Up Tapes]

  • 12

    Culling Methodologies

    Linguistic Methods (Word Based):
      Keyword
      Ontologies

    Statistical Methods (#s based):
      Topic Clustering:
        Statistical Similarity
        Counting #s of words, appearance together
      Latent Semantic Indexing

    Keyword Culling

    Con
      Over-inclusive: Disambiguate
      Under-inclusive
      Word must be present
      Hard to craft
      Ineffective with short messages, IMs

    Pro
      Word Stemming: Hous* - house, housemate, household.
      Easy to use/explain/agree
      Familiar
      Fast results
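
    The stemming example and the over- and under-inclusiveness problems can be seen in a few lines. This is a rough sketch that treats a trailing `*` as a wildcard and translates it into a regular expression; the helper names and sample sentences are hypothetical, and real search engines usually offer richer syntax.

```python
import re


def wildcard_to_regex(term):
    """Turn a term such as 'hous*' into a case-insensitive whole-word pattern."""
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return re.compile(rf"\b{stem}{suffix}\b", re.IGNORECASE)


def keyword_hits(text, terms):
    """Return every word in text matched by any of the terms."""
    hits = []
    for term in terms:
        hits.extend(m.group(0) for m in wildcard_to_regex(term).finditer(text))
    return hits


# The stem finds the intended words, and also an unintended one (over-inclusive).
print(keyword_hits("The household moved; my housemate sold the house in Houston.",
                   ["hous*"]))

# A message on the same subject without the word is missed (under-inclusive).
print(keyword_hits("They finally sold the home.", ["hous*"]))
```

    The first call returns "Houston" alongside the intended hits, and the second returns nothing, which mirrors the two "Con" entries on the slide.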

  • 13

    Linguistic Methods

    What are ontologies?
      Combines previous methods
      Built on continual improvement

    Review privileged information

    Production by Ontology:
      Automated Review
      Technology Assisted Review

    Statistical Methods

    Topical Clustering:
      Statistical similarity: Royalty, Disney, high
      Supervised clustering: Choosing the Topics to Cluster

    Latent Semantic Indexing:
      Searches By Concept: “Find Me More Like”
      Simplified Searches:
        Natural Language
        Entire Documents
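
    A "Find Me More Like" search ranks documents by their statistical similarity to a seed document. The sketch below uses plain word counts and cosine similarity, which is a much simpler stand-in and not Latent Semantic Indexing itself (LSI additionally reduces the term-document matrix to a smaller set of concepts). The function names and the tiny corpus are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count how often each word appears in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 when nothing is shared)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def more_like(seed, corpus, top_n=2):
    """Return ids of the top_n documents most similar to the seed text."""
    seed_vec = term_counts(seed)
    scored = [(cosine(seed_vec, term_counts(text)), doc_id)
              for doc_id, text in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_n] if score > 0]


corpus = {
    "A": "royalty payment due to the studio",
    "B": "royalty rates for the studio license",
    "C": "lunch menu for friday",
}

print(more_like("studio royalty rates", corpus))  # ['B', 'A']
```

    Because the comparison uses every word in the seed, an entire document can serve as the query, which is the idea behind the "Entire Documents" entry above.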

  • 14

    Analysis Methods

    Analysis

    Graphically Depicting Data and Connections in the Data:
      Closeness Analysis
      Map the Data Set
      Mindshare Analysis
      Tone Detection
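
    The slides do not spell out how these analyses are computed. As one plausible and deliberately simple reading of closeness analysis, the sketch below counts how many messages pass between each pair of people, which is the kind of figure a relationship map could be drawn from. The approach, the `pair_counts` function, and the sample messages are all assumptions made for illustration and should not be taken as any vendor's method.

```python
from collections import Counter


def pair_counts(messages):
    """messages: iterable of (sender, [recipients]). Count messages per unordered pair."""
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:
                counts[frozenset((sender, recipient))] += 1
    return counts


messages = [
    ("ann", ["bob"]),
    ("bob", ["ann", "cy"]),
    ("ann", ["bob"]),
    ("cy", ["ann"]),
]

counts = pair_counts(messages)
closest_pair, volume = counts.most_common(1)[0]
print(sorted(closest_pair), volume)  # ['ann', 'bob'] 3
```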

  • 15

    Closeness Analysis

    E-Mail Communications:

    Map The Entire Dataset – Up Front

    Green: Administration

    Red: Legal

    Blue: Accounting

  • 16

    Mindshare Analysis

    Tone Detection

  • 17

    Conclusion

    Don’t Be Afraid to Ask

    Educate Yourself:
      ACC Website
      Vendors’ Websites

    Review & Production

    Bill Mooz, Vice President and General Counsel, Catalyst Repository Systems, Inc.

  • 18

    Setting Up the Review: Tools

    Repository or Review Platform: System of hardware & software used to store and review discovery data.

    Enterprise Software: Software that runs on your hardware.

    Hosted Solution or Software as a Service: Review platform that runs on providers’ infrastructure that you access remotely.

    Web-Based: Access is via a browser.

    Terminal Service: A software layer that enables you to access a system remotely; requires additional hardware & software.

    Plug-Ins: Software loaded on user’s computer to access the review platform; generally not required with web-based systems.

    Setting Up the Review: Data

    Native Files: The form in which the document was generated originally. The default format for production under the new FRCP.

    Conversion: Converting native files into TIFF (Tagged Image File Format) or PDF (Portable Document Format) for review or production; may or may not be required.

    Metadata: Data about the document itself (e.g., date created, author, recipient, etc.). May be objective (residing in document itself) or subjective (identified by humans).

    Processing: Extracting metadata from native files to enable the review process.

    FTP: File Transfer Protocol, a way to send electronic data via the internet. Not effective for files greater than 2 gigabytes in size.
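
    To make the idea of processing concrete, the sketch below pulls objective metadata (author, recipient, date, subject) out of a single e-mail message using Python's standard `email` package. The message, the addresses, and the `extract_metadata` function are made up for illustration; commercial processing tools handle many more file types and fields.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message in standard Internet e-mail format.
RAW = """\
From: jdoe@example.com
To: jroe@example.com
Date: Mon, 02 Jan 2006 09:30:00 -0800
Subject: Q4 invoice

Please see the attached invoice.
"""


def extract_metadata(raw):
    """Return the objective metadata stored in the message headers."""
    msg = message_from_string(raw)
    return {
        "author": msg["From"],
        "recipient": msg["To"],
        "date_sent": parsedate_to_datetime(msg["Date"]).date().isoformat(),
        "subject": msg["Subject"],
    }


metadata = extract_metadata(RAW)
print(metadata["author"], metadata["date_sent"])
```

    Subjective metadata, such as a "responsive" or "privileged" designation, is not in the file at all; it would be added later by reviewers as extra fields alongside these values.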

  • 19

    Organizing the Review: Batching

    Labor Arbitrage: Moving tasks to lower-cost providers; typically involves using contract attorneys (on-shore or off-shore) to conduct first pass review.

    Batching: Putting documents in logical groups for assignment to reviewers.

    Concept Clustering: Using mathematical equations to sort documents into related groups.

    Fielded Search: Searching document sets by metadata or a combination of metadata and text terms.

    Filters/Navigators: Built-in tools for organizing search results into subcategories like date ranges, author, recipient, etc.
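
    Batching can be pictured as grouping documents by some shared attribute and then cutting each group into reviewer-sized assignments. The sketch below groups by author; the `make_batches` function, the field names, and the batch size of two are hypothetical choices for illustration.

```python
from itertools import groupby


def make_batches(docs, key, batch_size):
    """Group docs by key(doc), then split each group into batches of batch_size."""
    batches = []
    # groupby only merges adjacent items, so the documents are sorted by the key first.
    for group_key, group in groupby(sorted(docs, key=key), key=key):
        items = list(group)
        for start in range(0, len(items), batch_size):
            batches.append((group_key, items[start:start + batch_size]))
    return batches


docs = [
    {"id": 1, "author": "doe"},
    {"id": 2, "author": "roe"},
    {"id": 3, "author": "doe"},
    {"id": 4, "author": "doe"},
]

batches = make_batches(docs, key=lambda d: d["author"], batch_size=2)
for author, batch in batches:
    print(author, [d["id"] for d in batch])
```

    The same function could batch by date range or by concept cluster simply by passing a different `key`.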

    Organizing the Review: Folders & Forms

    Folders: Files for organizing documents. May be dynamic (auto-populating based upon criteria) or static. Security/access rights often administered at folder level.

    Review Forms: What the reviewer sees on the screen when reviewing documents. Will include a variety of fields to be coded, often with check-the-box capability. Forms are customized by case and even level of review (first-pass, second-pass, etc.).

    Fields: Document attributes that can be used to organize them. Examples include date, Bates number, author, hot doc, privilege, responsive, etc.

    Private Fields: Fields that are restricted to specific users; essential requirement for sharing a repository.

  • 20

    Conducting the Review

    Linear Review: The process of reviewing documents one-by-one. Can include multiple passes.

    Bulk Tagging: Marking multiple documents all at once, e.g., designating an entire folder of documents irrelevant with a single action.

    Redaction Tools: Tools that enable you to redact sensitive electronic documents, preserving the original for control purposes.

    Audit Trails: System-generated reports that enable you to review the actions of review teams or individual reviewers.
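
    Bulk tagging and audit trails fit together naturally: one action touches many documents, and the system records who took it. The following is a toy model under stated assumptions; the `ReviewSet` class, its fields, and the reviewer name are invented for illustration and do not reflect any specific review platform.

```python
from datetime import datetime, timezone


class ReviewSet:
    """A tiny in-memory stand-in for a review platform's tagging store."""

    def __init__(self, doc_ids):
        self.tags = {doc_id: set() for doc_id in doc_ids}
        self.audit_trail = []

    def bulk_tag(self, doc_ids, tag, reviewer):
        """Apply one tag to many documents and record who did it and when."""
        for doc_id in doc_ids:
            self.tags[doc_id].add(tag)
        self.audit_trail.append({
            "reviewer": reviewer,
            "action": f"tagged {len(doc_ids)} documents as {tag}",
            "at": datetime.now(timezone.utc).isoformat(),
        })


review = ReviewSet([101, 102, 103])
review.bulk_tag([101, 102], "irrelevant", reviewer="reviewer_a")

print(review.tags[101], review.tags[103])
print(review.audit_trail[0]["action"])
```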

    Multi-Language Reviews

    ASCII: American Standard Code for Information Interchange, the standard system for encoding characters in the English language for use by computers.

    UTF-8: Unicode Transformation Format, the new global standard for encoding characters in all languages, including those with more than 26 characters.

    CJK: Chinese, Japanese, and Korean; grouped here with Thai as languages that do not use spaces between individual characters or words.

    Tokenization: The process of putting white space between groups of characters in CJK documents to make them searchable.

    Language Packs: Upgrades that allow software to work with foreign languages. Essential that reviewers have them on their systems. Available at http://en.wikipedia.org/wiki/Help:Multilingual_support.
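
    The difference between ASCII and UTF-8, and the reason tokenization matters, shows up in a short experiment. The sample words and both helper functions below are chosen for illustration. The tokenizer simply puts a space between every character, which is a crude stand-in: production systems typically segment text using dictionaries or overlapping character pairs.

```python
def describe(word):
    """Return (character count, UTF-8 byte count, whether ASCII can encode it)."""
    utf8 = word.encode("utf-8")
    try:
        word.encode("ascii")
        ascii_ok = True
    except UnicodeEncodeError:
        ascii_ok = False
    return len(word), len(utf8), ascii_ok


def naive_tokenize(text):
    """Insert a space between every character so each one can be indexed on its own."""
    return " ".join(text)


samples = {
    "English": "contract",
    "Spanish": "contrase\u00f1a",  # "contraseña", written with an escape for the ñ
    "Japanese": "契約書",
}

for language, word in samples.items():
    print(language, describe(word))

print(naive_tokenize(samples["Japanese"]))
```

    Only the English sample fits in ASCII; the other two need UTF-8, and the Japanese sample takes three bytes per character.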

  • 21

    Productions

    Export: Get data out of a review platform for use elsewhere. Can export in multiple different formats.

    Conversion: Transforming a native file into another format, usually PDF or TIFF.

    Blowback: Print a set of data back to paper.

    Subcollection: A sub-set of documents on a repository that is made available to someone with a limited need-to-know. Used increasingly to produce documents to opposing parties, especially regulatory agencies.

    Privilege Logs: Typically handled by exporting a limited set of metadata for the documents designated as privileged.
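
    Exporting a privilege log in this way is essentially a filtered export of selected fields. The sketch below writes a CSV containing only the chosen log fields for documents flagged as privileged; the field names, the `privileged` flag, and the sample records are hypothetical, and the fields a real log requires depend on the case.

```python
import csv
import io

# Hypothetical set of fields to include in the log.
PRIVILEGE_LOG_FIELDS = ["bates_number", "date", "author", "recipient", "privilege_basis"]


def privilege_log(docs):
    """Return CSV text holding only the log fields for documents tagged privileged."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=PRIVILEGE_LOG_FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for doc in docs:
        if doc.get("privileged"):
            # extrasaction="ignore" drops fields (such as the text) not listed above.
            writer.writerow(doc)
    return out.getvalue()


docs = [
    {"bates_number": "ABC0001", "date": "2006-01-02", "author": "jdoe",
     "recipient": "outside counsel", "privilege_basis": "attorney-client",
     "privileged": True, "text": "Confidential legal advice."},
    {"bates_number": "ABC0002", "date": "2006-01-03", "author": "jroe",
     "recipient": "jdoe", "privilege_basis": "",
     "privileged": False, "text": "Lunch on Friday?"},
]

log = privilege_log(docs)
print(log)
```

    The document text never reaches the export, so the log describes the privileged document without disclosing its contents.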

    Q&A with Panel

    Stephanie Mendelsohn, Director of Corporate Records and Electronic Discovery, Genentech, Inc.
    [email protected]

    José Ramón González-Magaz, Partner, Steptoe & Johnson LLP
    202-429-8110 / [email protected]

    Mike Bergeron, Of Counsel, Steptoe & Johnson LLP
    301-610-2397 / [email protected]

    Sonya Sigler, General Counsel, Cataphora, Inc.
    650-622-9840 x604 / [email protected]

    Bill Mooz, VP and General Counsel, Catalyst Repository Systems, Inc.
    303-824-0842 / [email protected]

    Thank you for your time!

  • 22

    Thank you for attending another presentation from ACC’s Desktop Learning Webcasts

    Please be sure to complete the evaluation form for this program as your comments and ideas are helpful in planning future programs.

    You may also contact Sherrese Williams at [email protected]

    This and other ACC webcasts have been recorded and are available, for one year after the presentation date, as archived webcasts at www.webcasts.acc.com.

    You can also find transcripts of these programs in ACC’s Virtual Library at www.acc.com/vl