Introduction to Red Hat Gluster Storage (GlusterFS)
Red Hat Gluster Storage (GlusterFS)
Vikhyat Umrao
SME - RHGS (GlusterFS)
Red Hat - Software Engineering Group
Agenda
● What is Red Hat Gluster Storage?
● What is GlusterFS?
● Architecture
● Use Cases
What is Red Hat Gluster Storage?
● Red Hat Gluster Storage is a software-only, scale-out storage solution that provides flexible unstructured data storage for the enterprise.
● GlusterFS, a key building block of Red Hat Gluster Storage, is based on a stackable user-space design and can deliver exceptional performance for diverse workloads.
What is GlusterFS?
● A general-purpose, scale-out distributed file system.
● Aggregates storage exports over a network interconnect to provide a single unified namespace.
● The file system is stackable and runs entirely in user space.
● Layered on disk file systems that support extended attributes.
Typical GlusterFS Deployment
● Global namespace
● Scale-out storage building blocks
● Supports thousands of clients
● Access using GlusterFS native, NFS, SMB and HTTP protocols
● Linear performance scaling
GlusterFS Architecture – Foundations
● Software only, runs on commodity hardware
● No external metadata servers
● Scale-out with Elasticity
● Largely POSIX compliant
Concepts & Algorithms
GlusterFS concepts – Trusted Storage Pool
● Trusted Storage Pool (cluster) is a collection of storage servers.
● Trusted Storage Pool is formed by invitation – “probe” a new member from the cluster and not vice versa.
● Logical partition for all data and management operations.
● Membership information used for determining quorum.
● Members can be dynamically added and removed from the pool.
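The quorum bullet above can be illustrated with a minimal sketch. It assumes quorum means that strictly more than a given fraction of pool members are reachable; the function name `has_quorum` and the `ratio` parameter are hypothetical and are not part of any GlusterFS API.

```python
def has_quorum(pool_members, reachable, ratio=0.5):
    """Return True when the reachable fraction of pool members exceeds `ratio`.

    Assumption: quorum is a strict "more than `ratio`" rule over the
    membership list of the trusted storage pool.
    """
    if not pool_members:
        return False
    # Only count reachable nodes that are actually members of the pool.
    active = len(set(reachable) & set(pool_members))
    return active / len(pool_members) > ratio


pool = ["node1", "node2", "node3"]
print(has_quorum(pool, ["node1", "node2"]))  # two of three members reachable
print(has_quorum(pool, ["node1"]))           # one of three members reachable
```

With this rule, an even split (for example two of four members) does not meet quorum, which is why the membership list itself has to be known to every peer.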
GlusterFS concepts – Trusted Storage Pool
[Figure: a probe is sent between Node1 and Node2; once the probe is accepted, Node 1 and Node 2 are peers in a trusted storage pool.]
08/05/15 The Future of Gluster.org - John Mark Walker
GlusterFS concepts – Trusted Storage Pool
[Figure: a trusted storage pool of Node1, Node2 and Node3, with a Detach operation removing a node from the pool.]
GlusterFS concepts - Bricks
● A brick is the combination of a node and an export directory, e.g. hostname:/dir
● Each brick inherits the limits of the underlying filesystem
● No limit on the number of bricks per node
● Ideally, each brick in a cluster should be of the same size
[Figure: three storage nodes exporting 3, 5, and 3 bricks respectively (directories /export1 through /export5).]
GlusterFS concepts - Volumes
● A volume is a logical collection of bricks.
● A volume is identified by an administrator-provided name.
● A volume is a mountable entity, and the volume name is provided at the time of mounting.
● mount -t glusterfs server1:/<volname> /my/mnt/point
● Bricks from the same node can be part of different volumes
[Figure: Node1, Node2 and Node3 each export /export/brick1 and /export/brick2; bricks from the three nodes make up two volumes named music and Videos.]
Volume Types
● Type of a volume is specified at the time of volume creation
● Volume type determines how and where data is placed
● The following volume types are supported in GlusterFS:
a) Distribute
b) Stripe*
c) Replication
d) Distributed Replicate
e) Striped Replicate*
f) Distributed Striped Replicate*
* Volume types marked with an asterisk are in technology preview.
Distributed Volume
● Distributes files across various bricks of the volume.
● Directories are present on all bricks of the volume.
● A single brick failure makes the files stored on that brick unavailable.
● Removes the need for an external metadata server.
How does a distributed volume work?
● Uses Davies-Meyer hash algorithm.
● A 32-bit hash space is divided into N ranges for N bricks.
● At the time of directory creation, each brick is assigned a hash range for that directory.
● During file creation or retrieval, a hash is computed on the file name. This hash value is used to locate or place the file.
● Different directories on the same brick end up with different hash ranges.
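The placement steps above can be sketched roughly as follows. This is an illustration only: it uses `zlib.crc32` as a stand-in 32-bit hash rather than the Davies-Meyer hash named on the slide, splits the hash space evenly, and computes a single layout, whereas GlusterFS keeps a layout per directory. The function names and brick paths are hypothetical.

```python
import zlib

HASH_SPACE = 2**32  # 32-bit hash space, as stated on the slide


def split_hash_space(bricks):
    """Divide the 32-bit hash space into one contiguous range per brick."""
    n = len(bricks)
    layout = []
    for i, brick in enumerate(bricks):
        start = i * HASH_SPACE // n
        end = (i + 1) * HASH_SPACE // n - 1  # inclusive upper bound
        layout.append((start, end, brick))
    return layout


def name_hash(filename):
    """Stand-in 32-bit hash of the file name (NOT the Davies-Meyer hash)."""
    return zlib.crc32(filename.encode("utf-8")) & 0xFFFFFFFF


def locate(layout, filename):
    """Return the brick whose hash range contains the hash of the file name."""
    h = name_hash(filename)
    for start, end, brick in layout:
        if start <= h <= end:
            return brick
    raise ValueError("layout does not cover the hash space")


# Hypothetical bricks for one directory's layout.
bricks = [
    "server1:/export/brick1",
    "server2:/export/brick1",
    "server3:/export/brick1",
]
layout = split_hash_space(bricks)
print(locate(layout, "song.mp3"))
```

Because the brick is derived from the file name alone, any client holding the same layout finds the file without consulting a metadata server.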
Replicated Volume
● Synchronous replication of all directory and file updates.
● Provides high availability of data when node failures occur.
● Transaction driven for ensuring consistency.
● Changelogs maintained for reconciliation.
● Any number of replicas can be configured.
How does a replicated volume work?
[Figure: replicated volume diagrams]
Distributed Replicated Volume
● Distributes files across replicated bricks
– Number of bricks must be a multiple of the replica count
– Ordering of bricks in volume definition matters
● Scaling and high availability
● Reads get load balanced.
● Most preferred model of deployment currently.
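The two brick constraints above can be sketched as follows, assuming that consecutive bricks in the volume definition are grouped into replica sets. The helper names `replica_sets` and `hosts_are_distinct` and the brick paths are hypothetical, chosen only for illustration.

```python
def replica_sets(bricks, replica_count):
    """Group bricks into replica sets in the order they were listed."""
    if replica_count < 1:
        raise ValueError("replica count must be at least 1")
    if len(bricks) % replica_count != 0:
        raise ValueError("number of bricks must be a multiple of the replica count")
    return [
        bricks[i:i + replica_count]
        for i in range(0, len(bricks), replica_count)
    ]


def hosts_are_distinct(replica_set):
    """True when every brick in the set lives on a different host."""
    hosts = [brick.split(":", 1)[0] for brick in replica_set]
    return len(set(hosts)) == len(hosts)


# Good ordering: each replica set spans two different nodes.
good = ["node1:/export/brick1", "node2:/export/brick1",
        "node1:/export/brick2", "node2:/export/brick2"]
# Poor ordering: both copies of a file would land on the same node.
poor = ["node1:/export/brick1", "node1:/export/brick2",
        "node2:/export/brick1", "node2:/export/brick2"]

print([hosts_are_distinct(s) for s in replica_sets(good, 2)])
print([hosts_are_distinct(s) for s in replica_sets(poor, 2)])
```

The same four bricks give either node-level redundancy or none, depending only on the order in which they are listed, which is why ordering matters.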
Elastic Volume Management
Application-transparent operations that can be performed in the storage layer:
● Add bricks to a volume
● Remove bricks from a volume
● Rebalance data spread within a volume
● Replace a brick in a volume
● Performance / Functionality tuning
FUSE based native access
NFS access
libgfapi
● Exposes APIs for accessing Gluster volumes.
● Reduces context switches compared with FUSE access.
● QEMU, Samba, and NFS-Ganesha are integrated with libgfapi.
libgfapi vs FUSE – FUSE access
libgfapi vs FUSE – libgfapi access
Geo-Replication
● Geo-replication provides a distributed, continuous, asynchronous, and incremental replication service from one site to another over Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
● Geo-replication uses a master–slave model, where replication and mirroring occur between the following partners:
● Master – a Red Hat Storage volume.
● Slave – a Red Hat Storage volume. A slave volume can be either a local volume, such as localhost::volname, or a volume on a remote host, such as remote-host::volname.
Replicated Volumes vs Geo-replication
Snapshot
● The Red Hat Storage Snapshot feature enables you to create point-in-time copies of Red Hat Storage volumes, which you can use to protect data.
● Users can directly access the read-only snapshot copies to recover from accidental deletion, corruption, or modification of the data.
[Figure: a volume's bricks (Brick1, Brick2, ...) and their snapshot images (Brick1_s1, ...)]
● In the previous diagram, a Red Hat Gluster Storage volume consists of multiple bricks (Brick1, Brick2, etc.) spread across one or more nodes, and each brick is made up of an independent thin Logical Volume (LV).
● When a snapshot of a volume is taken, a snapshot of each LV is taken and another brick is created from it. Brick1_s1 is an identical image of Brick1.
● Similarly, identical images of each brick are created, and these newly created bricks combine to form a snapshot volume.
Red Hat Gluster Storage Console
● Red Hat Storage Console is a management infrastructure that enables you to create a powerful, scalable storage environment.
● It provides IT departments with the tools to meet the challenges of managing complex environments, and enables administrators to reduce the cost and complexity of large deployments.
● Red Hat Storage Console includes support for quickly creating and managing Red Hat Storage trusted storage pools and volumes.
Red Hat Gluster Storage Monitoring
● Monitoring of Red Hat Gluster Storage servers is built on the Nagios platform and covers the Red Hat Storage trusted storage pool, hosts, volumes, and services. You can monitor utilization and status, and receive alerts and notifications for status and utilization changes.
● Using Nagios, the physical resources, logical resources, and processes (CPU, Memory, Disk, Network, Swap, cluster, volume, brick, Host, nfs, shd, quotad, ctdb, smb, glusterd, quota, geo-replication, self-heal, and server quorum) can be monitored.
● You can view the utilization and status through the Nagios Server GUI or through the Trends tab of Red Hat Storage Console.
Red Hat Gluster Storage Monitoring
● Red Hat Gluster Storage trusted storage pool monitoring can be set up in one of the three deployment scenarios listed below:
● Nagios deployed on a Red Hat Storage node.
● Nagios deployed on a Red Hat Storage Console node.
● Nagios deployed on a Red Hat Enterprise Linux node.
Use Cases
● Media Storage
● Disaster Recovery (Geo-Replication)
● Efficient, cost-effective storage of virtual machine images
● Collaboration and controlled data access, anywhere and anytime
Q & A
Thank You