© 2008 Hitachi Data Systems Hitachi File Services and Content Awareness Michael Fahey Senior...
© 2008 Hitachi Data Systems
Hitachi File Services and Content Awareness
Michael Fahey, Senior Director, File Services
HCAP, HNAS, HDDS, HDD-MS
2
Market Growth Trending
Disk Space
3
A Growing Issue in the Environment
• Unstructured content growing faster than traditional information or structured content
– Database or transactional data
• 75% to 90% of data is unstructured
– Requires unique capabilities for archive management
Chart axis label: Exabytes
4
Changes in Legal Discovery
• Traditional archiving technologies cannot retrieve relevant content in a timely manner
– How long would it take to retrieve 10,000 or 100,000 items?
• Traditional storage technologies do not have the functionality required for archiving and discovery
• FRCP Rule 26(f): Within 100 days of a suit being filed, parties must meet and disclose any issues relating to disclosure or discovery of ESI
5
Litigation E-Discovery Timelines
Pre-Amendment E-Discovery Rules
Suit Filed → Motions → Negotiations → Discovery/Productions → Trial
December 1st Amendment E-Discovery Rules
Suit Filed → Motions → Negotiations → Discovery/Productions → Trial
Additional timeline labels: “Meet and Confer”, Summary Judgments, Settlements
Days and weeks versus months and quarters to discover
6
Discovery Defensibility
• Can you find all electronic documents?
• Do you know everywhere to look?
• Are there unknown copies that will emerge later?
• Does the other side have something we don’t?
• Will we find out the answers to any of these during the trial?
• Will we face sanctions for not finding it?
Diagram: places where a message can reside between Sender and Recipient
Mail Server (Exchange or Notes), Backups, File Servers, Admin Client, Client Workstation, Removable Media, Printer, Hard Disk, WWW, Network Storage, Databases
7
Different Schools of Thought
• “Delete every email after 30 days”
– Users aren’t supposed to keep PST files but do
• “If you want to keep something after 30 days then print it and save it”
– How do you ever find it?
• “Sure we archive, we save our backup tapes”
– Backups are not an archive
• “We maintain strict limits on file server capacity. Anything over the limit is deleted”
• “All employees who leave have their data deleted”
• “We don’t know what we have so we don’t delete anything”
– Sometimes this happens by itself or by a court order
• See HDS Whitepapers by Contoural
8
Six Critical Steps to Managing ESI*
1. Create an ESI Survey Data Map
• System description, scope, data flow, organization, data formats
2. Update records retention and deletion policy and then execute it
• Is it realistic?
• Is it flexible?
3. Effective litigation hold and discovery process
4. Delete documents the business does not need
5. Designate and prepare a Rule 30(b)(6) witness
• Someone that understands the data
• Typically from IT or the Business Unit
6. Audit your process and refresh your policies on a periodic basis
• See HDS Whitepapers by Contoural
9
Content Awareness
• What Do We Have?
– Some applications have built-in classification and search capability
• ECM
• Email Archiving
• Databases
– Many do not
• Email systems without archiving
• File servers
• How Do We Find It?
– How many places do we have to go to search across the enterprise?
– What if a common storage system could have full text search of everything?
10
Hitachi Content Archive Platform: How it Works
• Supports multiple applications and content types
• Embedded full-text indexing and search
• High-performance, scalable and secure storage
Diagram labels: Home Grown Application, Medical Imaging, File System, Document management, E-mail Archive Software, Discovery Module
11
Key Concept: Archive Object Management
Fixed content files contain reference content that must be preserved or retained
Metadata contains descriptive attributes about the file
Policy information is special metadata that governs file retention, disposition and protection, plus guarantees file authenticity over time
Users and applications write and read files; internally, they are managed as objects
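The object model on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (file content plus descriptive metadata plus policy metadata), not HCAP's internal format: the class names, the `ingest`, `is_authentic` and `may_delete` helpers, and the choice of SHA-256 for the authenticity check are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Policy:
    """Policy metadata (hypothetical fields): retention date and content digest."""
    retain_until: date
    sha256: str


@dataclass(frozen=True)
class ArchiveObject:
    """A fixed-content file managed together with its metadata and policy."""
    content: bytes
    metadata: dict
    policy: Policy


def ingest(content: bytes, metadata: dict, retain_until: date) -> ArchiveObject:
    # Record a digest at write time so authenticity can be checked later.
    digest = hashlib.sha256(content).hexdigest()
    return ArchiveObject(content, dict(metadata), Policy(retain_until, digest))


def is_authentic(obj: ArchiveObject) -> bool:
    # The content is authentic if it still matches the digest stored at ingest.
    return hashlib.sha256(obj.content).hexdigest() == obj.policy.sha256


def may_delete(obj: ArchiveObject, today: date) -> bool:
    # Disposition is allowed only after the retention date has passed.
    return today > obj.policy.retain_until
```

A caller would write a file through `ingest` and later consult `is_authentic` and `may_delete` before reading or disposing of it; the retention date is whatever the governing policy assigns.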
12
Discovery: Optional Advanced Search Capability
Navigators provide drill down by key terms, file type, and retention
View additional file system and archive metadata
Search Result Set
Support for:
• 370 File Formats
• 77 Languages
• Full-text, metadata and system data indexing
Set/Release Retention Hold
Export Results
13
Access Protocols
• NFS
– Compatibility interface, primarily for UNIX
– Mount by cluster reference
– High protocol overhead
– POSIX
• CIFS/SMB
– Compatibility interface, primarily for Windows
– Map network drive to cluster file system path (data or metadata)
• WebDAV
– Performance close to HTTP gateway
– Supports RFC 2518 compliant clients
– “MountPoint” just part of URL
• HTTP or HTTPS
– Fastest gateway
– Many good client libraries
– GET, PUT, EXISTS, DELETE operations
– Can specify metadata in URL
• SMTP
– Fastest gateway using batch mode
– Supports standard SMTP mail clients
– Ingestion only – no read
• NDMP
– Standard backup/restore gateway for the archive
– Data and metadata packaged into transportable objects
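As a rough illustration of the HTTP gateway operations listed on this slide, the sketch below builds (but does not send) requests with Python's standard library. Everything specific in it is an assumption: the host name `archive.example.com`, the object path, the `retention` query parameter, and the mapping of EXISTS onto the HTTP `HEAD` method are placeholders, since the slide does not give the real URL layout or parameter names.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical gateway address; the slide does not give a real one.
BASE_URL = "https://archive.example.com"

# Assumed mapping of the slide's operations onto HTTP methods.
# EXISTS is not an HTTP method, so HEAD is used here as a stand-in.
OPERATIONS = {"GET": "GET", "PUT": "PUT", "EXISTS": "HEAD", "DELETE": "DELETE"}


def build_request(operation, path, data=None, metadata=None):
    """Build an HTTP request for one archive operation without sending it."""
    url = BASE_URL + quote(path)
    if metadata:
        # "Can specify metadata in URL": sketched here as query parameters.
        url += "?" + urlencode(metadata)
    return Request(url, data=data, method=OPERATIONS[operation])


# Example: store a file with a (hypothetical) retention parameter.
put_request = build_request(
    "PUT", "/data/reports/q1.txt", data=b"hello", metadata={"retention": "2030-01-01"}
)
```

Passing a request built this way to `urllib.request.urlopen` would send it; that step is left out because the endpoint is invented.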
14
Diagram: Document management, Instant messaging, Data creation applications, Email server, Database server and Custom applications connected to HCAP
Accessibility: Standard and open gateways make access easy. Content searching makes access even more immediate.
Preservation and RAS with HA SAN: Heterogeneous cluster storing fixed content is self managing, scalable, and permanently available.
Open Standards: NFS, CIFS, HTTP, WebDAV, SMTP, NDMP
15
Key Features: Preserve and Protect
Protection
• Self-configuring and self-healing with automated policy enforcement, failover and ongoing integrity checks
• Ensures the specified number of replica copies is maintained to tolerate simultaneous points of failure, by maintaining 1 to 4 internal copies depending on the value of the data
• Policies that enforce document retention, authentication and file replication combine to secure valuable digital assets
• Keeps management “soft costs” to a minimum
Encryption of Data at Rest (optional service at installation)
• Protects content from being recovered from stolen media using patented “Secret Sharing” technology
• Transparently encrypts all content, metadata and search index
Shredding
• Ensures no trace of a file is recoverable from disk after deletion
• Complies with DOD specs
Replication
• Object based: Bi-directional
• Files, metadata and policies
• Options for compression and encryption in flight built into HCAP
WORM, Immutability and Retention
• Write once, read many file system where content is stored in an immutable format with the ability to set file-level retention
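The combination of replica copies, ongoing integrity checks and self-healing described on this slide can be illustrated with a small sketch. It is not HCAP's actual mechanism: the `digest` and `heal` helpers, the use of SHA-256, and the in-memory list standing in for replicas on separate nodes are all assumptions chosen to keep the example short.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content digest used for the integrity check (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def heal(replicas: list[bytes], expected: str) -> list[bytes]:
    """Replace every replica that fails the integrity check with an intact copy."""
    good = next((r for r in replicas if digest(r) == expected), None)
    if good is None:
        # With no intact copy left, nothing can be repaired.
        raise ValueError("no intact replica left")
    return [r if digest(r) == expected else good for r in replicas]


# Example: three copies, one of which has been silently corrupted.
original = b"contract"
replicas = heal([original, b"contrXct", original], digest(original))
```

Keeping more copies raises the number of simultaneous failures that can be tolerated, which is the trade-off behind the 1 to 4 internal copies mentioned on the slide.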
16
Key Features: Optimize
Duplicate Elimination
• Find and inspect duplicates
• Remove duplicates, but maintain integrity
File Compression
• Reduces the physical size of the data stored in Hitachi Content Archive Platform
• Enables greater storage efficiency and scalability, and improves TCO
Open Interoperability
• Supports UNIX and Microsoft file systems, can store standard file formats such as XML and HTML, and can leverage other Hitachi SAN storage platforms
Single Archive Name Space
• All objects are stored in a single, archive-wide global name space, a well understood paradigm
• Open and easy to navigate with standard tools and applications
Simple Monitoring and Management
• Can be configured and monitored through a Web-based interface and through SNMP
Scalability
• Provides enterprise scalability and upgrade paths to create a long term archive
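The "find duplicates" step of Duplicate Elimination can be sketched by grouping files on a content digest. This is a generic technique shown under assumptions, not a description of how HCAP does it: the `find_duplicates` helper and the file names are hypothetical, and a production system would typically confirm a match (for example with a byte-for-byte comparison) before removing anything.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose content has the same SHA-256 digest."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Only groups with more than one member are duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Example with hypothetical file names: two of the three files are identical.
duplicates = find_duplicates({"a.txt": b"x", "b.txt": b"y", "copy.txt": b"x"})
```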
Thank You
17