Storage Trends File and Object Based Storage and how NFS ...


  • 2014 IBM Corporation

    IBM Linux Technology Center

    Storage Trends File and Object Based Storage

    and how NFS-Ganesha can play

    Venkateswararao Jujjuri (JV)

    File systems and Storage Architect

    IBM Linux Technology Center

    [email protected] | [email protected]

    2014


  • IBM Linux Technology Center

    2014 IBM Corporation, LinuxCon 2014

    Outline

    * Data is Exploding
    * Storage Trends
    * Unstructured Data
    * Need for new solution
    * Object Store
    * File vs Object and Object details
    * Big question and answer
    * FOBS: File and Object Based Storage
    * Object Storage details and variations
    * NFS Evolution and pNFS and future
    * NFS-Ganesha
    * Conclusions

  • IBM Linux Technology Center


    Data is Exploding

    [Figure: infographic with the captions "We create" and "Growth will reach"; the figures themselves were not recovered.]

    Source: http://www-01.ibm.com/software/data/bigdata/what-is-big-data.html

    IDC says data will grow from 4.4 ZB today to 44 ZB by 2020.
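    As a rough check on the IDC projection, the implied compound annual growth rate can be computed directly. The sketch below assumes that "today" means 2014, the year of the talk, which gives a six-year horizon; that span is an assumption and not a figure from IDC. The function names are illustrative.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `rate` once per year for `years` years."""
    return start * (1.0 + rate) ** years


# Assumption: 4.4 ZB in 2014 growing to 44 ZB in 2020 (6 years).
implied = cagr(4.4, 44.0, 6)
print(f"Implied CAGR: {implied:.1%}")
```

    A tenfold increase over six years works out to a little under 47% per year under these assumptions.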

  • IBM Linux Technology Center


    Storage Trends

    * Data growth is around 70% per year, and most of it is unstructured.
    * Scale-out rather than scale-up.
    * Object is gaining a lot of traction, but file is not going away; NAS will stay a significant player.
    * Analysts predict NAS will grow at a CAGR of 25.44% over 2013-2018. (http://cti.tmcnet.com/news/2014/04/04/7762020.htm)
    * Unified storage: NAS, SAN, and Object.
    * Growth mantra: FOBS

    IDC Projections

    * Structured data will grow at a 21.8% CAGR.
    * Unstructured data will grow at a 61.7% CAGR.

    Market Needs and Adoption

    * 2000: Direct Attached Storage, SAN
    * 2010: Network Attached Storage (NFS, CIFS)
    * 2020: File and Object Based Storage

  • IBM Linux Technology Center


    Unstructured Data

    * Basically non-database data.
    * Usually generated on an event (Cheese... Click).
    * Typically no access or only read access (photos, X-rays, dental records).
    * Tough to interpret the content (a JPEG can be a silly picture or a blueprint).
    * Emails, instant messages, documents, spreadsheets, graphics, images, videos, social media, medical records, wearables, and on and on.
    * Explosive growth drives the search for cost effectiveness and manageability.

    Why not continue the file/NAS model?

    * Simple access model: no need for a heavy POSIX interface.
    * Scale-out: the hierarchical model is more of an overhead.
    * Context: it is difficult to build the context of an individual file (the entire path is needed).
    * Metadata is distributed, hence policies are complex and inefficient.
    * Loose/eventual consistency is often good enough.

  • IBM Linux Technology Center


    Need for a new Solution

    Requirements

    * Simple interface
    * Easy access, no need to traverse through dirs/subdirs
    * Context of the contents
    * Scale-out capabilities
    * Massive and cheap
    * Easy policies for ILM

  • IBM Linux Technology Center


    File vs Object (example: Penguin.jpg)

    File

    * FileName: Penguin.jpg
    * Times: atime, mtime, ctime, etc.
    * Owner, Group
    * Permissions: Unix style, ACLs, etc.

    Object

    * ObjectId, FileType, Times
    * Camera Info, Resolution, Owner Name, Location, Copyright
    * Orientation, YCbCr Positioning, Compression, Exposure Time
    * X-Resolution, Y-Resolution, Focal, Aperture, Flash, Focal Length
    * Color Space, Angle, Orientation, Preferred Display
    * Category, Importance, Tags, Version, Notes, Voice/comment

    Object: simply an abstract container where data and metadata are co-located.

  • IBM Linux Technology Center


    Objects

    * Rich metadata that co-exists with data; easy policies.
    * Addressed by a 128-bit ID.
    * Flat access.
    * Checksum is part of the metadata.
    * Multiple file types can be in one object (a wave and a JPEG).
    * Cost effective because of eventual consistency and the lack of POSIX complexity.
    * Scales well with off-the-shelf hardware.
    * Simple access protocol: RESTful API.
    * Suited for the unstructured data generated by the digital world.
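    The object properties listed on this slide (flat namespace, 128-bit ID, checksum held in the metadata, policies driven by metadata) can be illustrated with a small in-memory sketch. Everything here is hypothetical: the class names, the choice of a random UUID as the 128-bit ID, and SHA-256 as the checksum are assumptions made for illustration and are not taken from any particular object store.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Data and metadata kept together in one container."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """In-memory stand-in for an object store with a flat namespace."""

    def __init__(self):
        self._objects = {}  # object ID -> StoredObject, no directories

    def put(self, data: bytes, **metadata) -> uuid.UUID:
        object_id = uuid.uuid4()  # a UUID is 128 bits wide
        meta = dict(metadata)
        meta["checksum"] = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = StoredObject(data, meta)
        return object_id

    def get(self, object_id: uuid.UUID) -> StoredObject:
        obj = self._objects[object_id]
        # Verify integrity against the checksum stored in the metadata.
        if hashlib.sha256(obj.data).hexdigest() != obj.metadata["checksum"]:
            raise IOError("checksum mismatch")
        return obj

    def find(self, **criteria) -> list:
        """Policy-style lookup that matches on metadata alone."""
        return [oid for oid, obj in self._objects.items()
                if all(obj.metadata.get(k) == v for k, v in criteria.items())]


store = FlatObjectStore()
oid = store.put(b"...jpeg bytes...", file_type="jpeg", owner="jv", category="photo")
print(store.find(category="photo") == [oid])
```

    Because the metadata travels with the data, a policy such as "find every photo owned by jv" needs no path traversal, which is the contrast with the file model drawn earlier in the deck.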

  • IBM Linux Technology Center


    Big Question

    So...

    File and NAS are DEAD?

  • IBM Linux Technology Center


    ...and the Answer is...

    NO

    File and NAS will continue to grow.

    File and Object join hands together to keep the party on!

    FOBS: File and Object Based Storage

  • IBM Linux Technology Center


    FOBS: File and Object Based Storage

    * Object storage works best for WORM workloads, and not all data fits that tab.
    * Object is meant for low-cost mass storage which is not actively shared.
    * Use of traditional applications and file systems continues.
    * File fills the part of the spectrum that needs a rich set of security and consistency guarantees.
    * Object storage fills the space where file/NAS is weak.
    * After all, most object stores and structured data stores (databases) are created on file systems.

  • IBM Linux Technology Center


    FOBS: File and Object Based Storage

    [Figure: chart with regions labeled Object Store, File, Volume Based, and FOBS. Axes: Market share; Access/Update Frequency.]

    * Secure, Consistency, General Purpose, Performance, Legacy
    * WORM/Cold, Cost Effective, High Volume, Scalability, Manageability

  • IBM Linux Technology Center


    Object Storage

    Objects are broadly divided into two categories.

    Storage Devices

    * Move smarts into the device layer: NASD, OSD-1, OSD-2, OSD layer on FS, etc.
    * Access command set. Ex: SCSI model command set for OSD.
    * Custom OSD model: Lustre, Ceph
    * T10 OSD model: EXOFS, PanFS
    * pNFS support.
    * PBs of storage on 1000s of disks, 1000s of clients

    Web Services

    * Objects created on filesystems and accessed through the web.
    * REST model over the HTTP protocol. Operations: Get, Put, Post, Delete
    * Highly available.
    * Loosely consistent.
    * SWIFT, S3, Azure, etc.
    * Gaining tremendous popularity.

  • IBM Linux Technology Center


    Object Based Filesystem (Ex: Ceph)

    * Provides Posix-Compliant FS on top of Object-Based Ceph Storage Cluster

    * Files get mapped to Objects and MDS below librados

    * MDS stores all Filesystem Metadata (Directories, Owners, Access info etc)

    * Data directly stored on OSDs

    * Out of band IO: Metadata provides data location, and IO is directly to OSDs

    * Offers kernel mount or FUSE interface

    Source: http://ceph.com/docs/master/architecture/
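    The out-of-band I/O idea on this slide (metadata tells the client where data lives, then the client talks to the storage devices directly) can be sketched as a mapping from file byte ranges to objects. This is a simplified illustration, not Ceph's implementation: the fixed object size, the object naming scheme, and the CRC-based placement are all assumptions standing in for the real striping and placement logic.

```python
import zlib

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size, for illustration only


def locate(inode: int, offset: int, num_osds: int):
    """Map a file byte offset to (object name, offset in object, OSD index)."""
    index = offset // OBJECT_SIZE
    name = f"{inode:x}.{index:08x}"               # hypothetical naming scheme
    osd = zlib.crc32(name.encode()) % num_osds    # stand-in for real placement
    return name, offset % OBJECT_SIZE, osd


def extents(inode: int, offset: int, length: int, num_osds: int):
    """Split a byte range into per-object pieces a client could issue directly."""
    pieces = []
    end = offset + length
    while offset < end:
        name, obj_off, osd = locate(inode, offset, num_osds)
        chunk = min(OBJECT_SIZE - obj_off, end - offset)
        pieces.append((osd, name, obj_off, chunk))
        offset += chunk
    return pieces


# A 300-byte read that straddles an object boundary becomes two requests.
for piece in extents(0x1234, OBJECT_SIZE - 100, 300, num_osds=8):
    print(piece)
```

    The point of the sketch is that no data passes through the metadata service: once the client can compute or look up locations, each piece goes straight to the device that holds it.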

  • IBM Linux Technology Center


    Web Services Object Store - SWIFT

    Source: http://docs.openstack.org/training-guides/content/module003-ch007-cluster-architecture.html

    * Storage nodes hold the objects, stored as binary files on the filesystem with metadata stored in each file's extended attributes (xattrs).

    * Proxy nodes receive and process incoming requests and determine the correct storage server for each request.

    * All objects stored in Swift have a URL

    * All objects stored are replicated 3x in as-unique-as-possible zones.

    * Object data can be located anywhere in the cluster

    * Nodes/Disks can be added without downtime.
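    The placement behavior described above (a proxy decides where an object lives, and three replicas land in zones that are as distinct as possible) can be sketched with a deterministic hash. This is not Swift's ring algorithm; it uses rendezvous-style hashing as a simple stand-in, and the device map, zone names, and function name are hypothetical.

```python
import hashlib

# Hypothetical cluster map: device name -> zone.
DEVICES = {"d1": "z1", "d2": "z1", "d3": "z2", "d4": "z2", "d5": "z3", "d6": "z4"}


def place(account: str, container: str, obj: str, replicas: int = 3):
    """Pick `replicas` devices for an object, preferring distinct zones."""
    path = f"/{account}/{container}/{obj}"

    def score(device: str) -> str:
        # Rank devices by a hash of (path, device); stable for a given object.
        return hashlib.sha256(f"{path}:{device}".encode()).hexdigest()

    ranked = sorted(DEVICES, key=score)
    chosen, zones = [], set()
    for device in ranked:              # first pass: at most one device per zone
        if DEVICES[device] not in zones:
            chosen.append(device)
            zones.add(DEVICES[device])
        if len(chosen) == replicas:
            return chosen
    for device in ranked:              # fallback: reuse zones if there are too few
        if device not in chosen:
            chosen.append(device)
        if len(chosen) == replicas:
            break
    return chosen


print(place("acc", "cont", "obj"))
```

    Because the choice depends only on the object path and the device map, any proxy can compute the same answer without consulting a central directory.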

  • IBM Linux Technology Center


    NFS Evolution

    * NFS is extremely popular and widely used.
    * Stateless NFSv3 is very successful and is the de facto meaning of 'NFS'.
    * Stateful NFSv4 came out in 2003.
      - Adoption is slow but has been gaining momentum since NFSv4.1 came out.
      - Became a stepping stone towards NFSv4.1.
    * NFSv4.1, introduced in 2010, added enhancements and addressed NFSv4 deficiencies.
      - Improved performance: pNFS, Directory Delegations, Trunking
      - Robustness: Exactly Once Semantics
      - Security: Windows native ACL support, Kerberized Back Channel
    * For time-to-market reasons, a few players are skipping NFSv4.0 and moving directly to NFSv4.1. Ex: VMware, Microsoft.

  • IBM Linux Technology Center


    Parallel NFS - pNFS

    [Figure labels: File/Object Layout Driver, POSIX Interface, ExoFS, Control]

    * Removes IO bottlenecks and improves large file performance.

    * Load balancing

    * Scale-out model

    * Control Protocol is not standardized, vendor value-add.

    * Allows direct client access to the storage devices

    * Clients can do parallel IO across storage

    * Layouts can be leased, recalled, and revoked.
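    The parallel I/O model on this slide can be sketched as follows: a layout tells the client how a file is spread over the storage devices, and the client then reads the pieces directly and concurrently. The sketch simulates the devices in memory and assumes simple round-robin striping; the layout dictionary and class names are hypothetical and do not reflect the NFSv4.1 wire format.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_UNIT = 4  # bytes; tiny on purpose so the example is easy to follow


class DataServer:
    """Simulated storage device holding its share of the stripe units."""

    def __init__(self):
        self.units = {}

    def write(self, unit_no, data):
        self.units[unit_no] = data

    def read(self, unit_no):
        return self.units[unit_no]


def write_striped(data, servers):
    """Lay data out round-robin and return a layout describing the result."""
    count = (len(data) + STRIPE_UNIT - 1) // STRIPE_UNIT
    for n in range(count):
        chunk = data[n * STRIPE_UNIT:(n + 1) * STRIPE_UNIT]
        servers[n % len(servers)].write(n, chunk)
    return {"stripe_unit": STRIPE_UNIT, "unit_count": count, "servers": servers}


def read_parallel(layout):
    """Client side: fetch each stripe unit from its data server concurrently."""
    servers = layout["servers"]

    def fetch(n):
        return servers[n % len(servers)].read(n)

    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # pool.map returns results in request order, so a plain join suffices.
        return b"".join(pool.map(fetch, range(layout["unit_count"])))


servers = [DataServer() for _ in range(3)]
layout = write_striped(b"parallel NFS demo", servers)
print(read_parallel(layout))
```

    In a real deployment the layout would come from the metadata server and could be recalled at any time, so a client has to be prepared to return it and fall back to ordinary I/O through the server.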

  • IBM Linux Technology Center


    Parallel NFS - pNFS

    * Supported layout types are open-ended.

    * Supports three types of layouts
      - File (RFC 5661)
      - Block (RFC 5663)
      - Object (RFC 5664)
      - Future: Flexible File Layout (proposal) and others

    * File Layout
      - Files, NFS protocol
      - Default layout, with many implementations

    * Block Layout
      - SCSI blocks, iSCSI, FCP, etc.

    * Object Layout
      - OSD SCSI object protocol, OSD2
      - Few implementations: PanFS, Exofs (OSDFS)

    [Figure: layer stack, top to bottom: User Interface; NFSv4.1; pNFS Layouts (File, Obj, Block, Future); Network / IO stack]

  • IBM Linux Technology Center


    NAS and Object to become FOBS

    * Many traditional applications are written for POSIX access.
    * Object storage is different and foreign to traditional applications.
    * One solution is to create a filesystem layer on top of the object store.
      - Ex: The Maldivica storage connector creates a filesystem interface on top of a SWIFT object store, which can be exported via NFS/CIFS (NAS).
    * Another is to provide an object interface on NAS.
      - Ex: Calsoft integrates NAS with a modified OpenStack SWIFT and provides a SWIFT interface on NAS.

    Source: http://www.calsoftinc.com/OpenStack-Object-Storage-Swift.aspx# , http://maldivica.com/

  • IBM Linux Technology Center


    Swift-on-File

    * Ability to access the back-end using both the object interface and the file interface.
    * Swift-on-File stores objects following the same path hierarchy as that object's URL.
      - Object URL: https://swift.example.com/v1/acc/cont/obj
      - Swift: /mnt/sdb1/2/node/sdb2/objects/981/f79/f566bd022b9285b05e665fd7b843bf79/1401254393.89313.data
      - SoF: /mnt/gluster-vol/acc/cont/obj
    * Enables objects created using the Swift API to be accessed as files on a POSIX filesystem.
    * This opens up enormous possibilities, including using both NAS and RESTful interfaces to create and access the same data.
    * Use case: create video files using SWIFT, use file access to transcode them, and then use SWIFT to access them in a different codec.

    Source: https://github.com/swiftonfile/swiftonfile/blob/master/README.md
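    The URL-to-path correspondence shown on this slide can be written down as a small function. This is a sketch of the mapping only, assuming the `/v1/<account>/<container>/<object>` URL shape from the slide's example; the function name is hypothetical and the code is not taken from the Swift-on-File project.

```python
import posixpath
from urllib.parse import urlparse


def swift_url_to_path(url: str, mount_point: str) -> str:
    """Translate /v1/<account>/<container>/<object> into a path under mount_point."""
    parts = urlparse(url).path.lstrip("/").split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not an object URL: {url}")
    _version, account, container, obj = parts
    # Object names may themselves contain slashes; they become subdirectories.
    return posixpath.join(mount_point, account, container, obj)


print(swift_url_to_path("https://swift.example.com/v1/acc/cont/obj", "/mnt/gluster-vol"))
# prints /mnt/gluster-vol/acc/cont/obj
```

    A production mapping would also have to reject object names containing `..` or other components that could escape the mount point; the sketch leaves that out.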

  • IBM Linux Technology Center


    NFS for future

    * NFS Pathless Objects: filesystem objects which can be created, queried for, and destroyed without being associated with a pathname. (http://tools.ietf.org/html/draft-dipankar-nfsv4-pathless-objects-02)
    * Metastripe: RFCs are being proposed to stripe/scale metadata servers. (http://tools.ietf.org/html/draft-mbenjamin-nfsv4-pnfs-metastripe-01)
    * Ceph provides access to the back-end RADOS object store through the LIBRADOS API, an S3/Swift-compatible API, Block, and CephFS, which can be NFS exported, including pNFS. (http://ceph.com/docs/master/architecture/)
    * pNFS over Ceph: CohortFS with metastripe.
    * pNFS over Lustre: CEA, French Defense organization.
    * Possible to offer selectable consistency with an NFS-backed object store versus a web-based one.
    * OpenStack Manila project (https://wiki.openstack.org/wiki/Manila/)

  • IBM Linux Technology Center


    NFSv4.2

    Added advanced features take NFS into the advanced file sharing category.

    Performance:

    * Server Side Copy: removes one leg of the copy operation.
    * IO_ADVISE: the client advises the server on the application access pattern.
    * Application Data Blocks (ADB): ex: VM image file type.
    * Sparse file support.

    Security:

    * Labeled NFS: Mandatory Access Control based on system-wide policy.

    Scalability and QoS:

    * Space Reservation: reserve storage; useful in thin provisioning.
    * Hole Punching: return unused parts of the file back to the pool.
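    To make the sparse-file idea concrete, the sketch below creates a sparse file on a local filesystem and compares its logical size with the space actually allocated. It does not exercise NFS at all; it only shows the property that sparse-file support and hole punching are meant to preserve. It assumes a Unix-like system where `st_blocks` is reported in 512-byte units, and the helper name is hypothetical.

```python
import os
import tempfile


def make_sparse(path: str, logical_size: int, payload: bytes) -> os.stat_result:
    """Create a file of logical_size bytes whose only written data is a trailing payload."""
    with open(path, "wb") as f:
        f.seek(logical_size - len(payload))  # skip ahead, leaving a hole
        f.write(payload)
    return os.stat(path)


with tempfile.TemporaryDirectory() as tmp:
    info = make_sparse(os.path.join(tmp, "sparse.img"), 64 * 1024 * 1024, b"end")
    allocated = info.st_blocks * 512  # assumption: 512-byte block units
    print(f"logical size: {info.st_size} bytes, allocated: {allocated} bytes")
```

    On a filesystem that supports holes, the allocated figure is typically far smaller than the logical size; the exact number depends on the filesystem.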

  • IBM Linux Technology Center


    NFS-Ganesha

    * One of the mainstream NFS servers.
    * User-level NFS server suitable for enterprise applications.
    * Manageability and debuggability (http://tinyurl.com/kka8czz)
    * Can manage huge metadata and data caches.
    * Provision to exploit FS-specific features.
    * Can serve multiple types of file systems at the same time.
    * Can serve multiple protocols at the same time.
    * Can act as a proxy server and export a remote NFSv4 server.
    * Cluster friendly and cluster manager agnostic.
    * Easy recovery, failover, and failback implementation.
    * Multi-protocol support with common DLM (planned).
    * Small but growing community.
    * Active participants: IBM, Panasas, Redhat, CohortFS (LinuxBox), CES, Bull, + a few more.

  • IBM Linux Technology Center


    NFS-Ganesha

    * Supports many filesystems through the FSAL layer: VFS, GPFS, PanFS, Gluster, CEPH, Lustre, XFS, FUSE, Proxy, PT, etc.
    * NFSv3, NFSv4.0, NFSv4.1, and pNFS support; minimal NFSv4.2.
    * IBM, Redhat, LinuxBox, and Panasas have released or are releasing products based on NFS-Ganesha.
    * Releases 1.5, 2.0, and 2.1 are out; 2.2 is set to be GA'd by the end of October 2014.
    * Delegation, Statistics, Dynamic exports, LTTng support.
    * Supports file and object layouts of pNFS.
    * Cluster Manager Abstraction Layer (CMAL): clustered DRC, DLM, multi-protocol support.
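    The idea of serving several filesystem types at once through one abstraction layer can be sketched in a few lines. This is a conceptual illustration only: the real FSAL is a C interface with far more operations, and every class and method name below is hypothetical.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical back-end interface, loosely in the spirit of an FSAL."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class MemoryBackend(Backend):
    """Toy back-end standing in for one filesystem type."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class Server:
    """Routes each request to the back-end that owns the longest matching export."""

    def __init__(self):
        self._exports = {}

    def add_export(self, prefix: str, backend: Backend) -> None:
        self._exports[prefix.rstrip("/")] = backend

    def _resolve(self, path: str):
        matches = [p for p in self._exports if path == p or path.startswith(p + "/")]
        if not matches:
            raise FileNotFoundError(path)
        prefix = max(matches, key=len)
        return self._exports[prefix], path[len(prefix):] or "/"

    def write(self, path, data):
        backend, rel = self._resolve(path)
        backend.write(rel, data)

    def read(self, path):
        backend, rel = self._resolve(path)
        return backend.read(rel)


server = Server()
gpfs_like, ceph_like = MemoryBackend(), MemoryBackend()
server.add_export("/gpfs", gpfs_like)
server.add_export("/ceph", ceph_like)
server.write("/gpfs/report.txt", b"from export one")
server.write("/ceph/report.txt", b"from export two")
print(server.read("/ceph/report.txt"))
```

    The protocol-facing code never needs to know which back-end it is talking to, which is what lets one server process export different filesystems side by side.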

  • IBM Linux Technology Center


    Conclusions

    * Object storage is expanding, and file remains a very important part of the equation; the two are expected to play together.
    * FOBS is the future.
    * Unified storage: NAS, Object, SAN.
    * NFS is progressing as a protocol; NFSv4.1 and pNFS support is a must to be competitive in the market space.
    * pNFS has major advantages: scale-out metadata and data, parallel IO, and performance improvement.
    * NFSv4.2 is making NFS a preferred filesystem/access protocol for future storage needs.

  • IBM Linux Technology Center


    NFS-Ganesha links

    * NFS-Ganesha is available under the terms of the LGPLv3 license.
    * NFS-Ganesha project homepage on GitHub: https://github.com/nfs-ganesha/nfs-ganesha/wiki
    * GitHub: https://github.com/nfs-ganesha/nfs-ganesh
    * Download page: http://sourceforge.net/projects/nfs-ganesha/files
    * Mailing lists: [email protected] [email protected] [email protected]

  • IBM Linux Technology Center


    Legal Statements

    * This work represents the view of the author and does not necessarily represent the view of IBM.
    * IBM is a registered trademark of International Business Machines Corporation in the United States and/or other countries.
    * UNIX is a registered trademark of The Open Group in the United States and other countries.
    * Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
    * Other company, product, and service names may be trademarks or service marks of others.
    * CONTENTS are "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
    * This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. Author/IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

  • IBM Linux Technology Center


    THANK YOU!

    Q&A
