NetRAID for the Linux Kernel (UKUUG/LISA WCHAR 2004)

NetRAID

Peter T. Breuer

Failover Loves Mirroring

[Diagram labels: 12.2.1.3, 12.2.1.3; raid1, nbd, disk]

RAID1 + NBD

Resynchronization Snooze

[Diagram labels: 12.2.1.3 "ZZZ..."; raid1, nbd]

The Numbers Game

100BT LAN = 10MB/s

1TB mirror @ 40MB/s = 25000s

about 7 hours!

Temporary network outages = frequent

Permanent disk losses = infrequent

Adds up to a need for a changed paradigm
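The arithmetic on this slide can be checked with a short sketch. It assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes); the helper name `full_resync_seconds` is illustrative and not from the talk. Under those assumptions a full 1TB resync at 40MB/s takes 25000s, a little under seven hours, and four times as long if the link only sustains the 10MB/s quoted for a 100BT LAN.

```python
TB = 10**12  # assumption: decimal terabyte
MB = 10**6   # assumption: decimal megabyte


def full_resync_seconds(mirror_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to copy the whole mirror at a sustained transfer rate."""
    return mirror_bytes / rate_bytes_per_s


# Rates taken from the slide: the 40MB/s resync figure and the 10MB/s LAN figure.
for label, rate in [("40MB/s", 40 * MB), ("10MB/s (100BT LAN)", 10 * MB)]:
    secs = full_resync_seconds(1 * TB, rate)
    print(f"1TB at {label}: {secs:.0f}s = {secs / 3600:.1f} hours")
```

Whichever rate applies, the cost is paid in full after every dropout that triggers a complete resync, which is what motivates the change of approach.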

What's wrong with ordinary RAID1?

Full resync too slow over the net

Net dropouts too frequent

trigger full resync

Does not expect same disk to be restored

Treats network glitches as cable errors

Requires on-site administration!

Writes synchronous to both sides - too slow

Reads may be from the slow side

RAID vs netRAID

Classical

small disks

physically close

medium bandwidth

infrequent dropouts

permanent losses

admin on hand

netRAID

large disks

physically dispersed

low bandwidth

frequent dropouts

temporary losses

admin off-scene

Solutions

Replace drivers

Linux kernel NBD → ENBD

Linux kernel RAID1 → FR1

Replace problems

disk fail is permanent → disk fail is temporary

repair by insert new disk → repair by reinsert old

admin does repair → device repairs itself

cables never fail → cables often fail

ENBD

automatic reconnect after network outage

blocks rather than errors during a temporary outage

redundant channel connectivity

(partitionable)

accelerated - skips writes when the data is already equal on both sides

talks to soft RAID overlay driver

supports remote ioctls and removable devices
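The "skips writes" acceleration can be illustrated with a toy model. This is a sketch of one way such a check could work, not a description of ENBD's actual protocol: the class `RemoteDisk`, the function `mirrored_write`, and the use of SHA-256 digests are all assumptions made for the example.

```python
import hashlib


class RemoteDisk:
    """Hypothetical stand-in for the server side of a network block device."""

    def __init__(self, nblocks: int, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(nblocks)]
        self.bytes_received = 0  # payload that actually crossed the "wire"

    def digest(self, n: int) -> bytes:
        """Checksum of the block currently stored on the remote side."""
        return hashlib.sha256(self.blocks[n]).digest()

    def write(self, n: int, data: bytes) -> None:
        self.bytes_received += len(data)
        self.blocks[n] = data


def mirrored_write(remote: RemoteDisk, n: int, data: bytes) -> bool:
    """Send the block only if the remote copy differs; return True if sent."""
    if hashlib.sha256(data).digest() == remote.digest(n):
        return False  # both sides already equal: only a digest was exchanged
    remote.write(n, data)
    return True


remote = RemoteDisk(nblocks=8)
payload = b"x" * 4096
print(mirrored_write(remote, 3, payload))  # first write is sent
print(mirrored_write(remote, 3, payload))  # identical rewrite is skipped
print(remote.bytes_received)               # payload bytes transferred so far
```

The saving matters most during a resync, when much of what is rewritten is already identical on the remote side and only a short digest has to travel.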

FR1

full resync → intelligent partial resync

hot repair

automatic

asynchronous writes eliminate latency

read from fastest (not there yet)

retain state across reboots (Paul Clements)
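The idea behind an intelligent partial resync can be sketched with a per-block dirty map. This is a minimal illustration under stated assumptions, not FR1's kernel code: the class `MirrorWithBitmap` and its method names are hypothetical, and the two mirror halves are modelled as in-memory lists.

```python
class MirrorWithBitmap:
    """Toy two-way mirror that remembers which blocks the remote side missed."""

    def __init__(self, nblocks: int):
        self.local = [b""] * nblocks
        self.remote = [b""] * nblocks
        self.remote_up = True
        self.dirty = bytearray(nblocks)  # 1 = written while the remote was away

    def write(self, n: int, data: bytes) -> None:
        self.local[n] = data
        if self.remote_up:
            self.remote[n] = data
        else:
            self.dirty[n] = 1  # remember the block for the later partial resync

    def remote_lost(self) -> None:
        self.remote_up = False

    def remote_restored(self) -> int:
        """Hot repair: copy only the dirty blocks; return how many were copied."""
        copied = 0
        for n, flag in enumerate(self.dirty):
            if flag:
                self.remote[n] = self.local[n]
                self.dirty[n] = 0
                copied += 1
        self.remote_up = True
        return copied


m = MirrorWithBitmap(nblocks=1000)
m.write(0, b"before outage")
m.remote_lost()
m.write(5, b"a")
m.write(7, b"b")
m.write(5, b"c")            # rewriting a block does not add to the work
print(m.remote_restored())  # number of blocks copied, out of 1000
print(m.local == m.remote)
```

Because the same disk is expected to come back, only the marked blocks need to be copied; keeping the map on stable storage is what would let the state survive a reboot.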

FR1 intelligent resync

[Plot: resync, max 40MB/s]

ENBD performance measure (read)

[Plot: n = 1, 2, 4 channels]

ENBD performance measure (write)

[Plot: n = 1, 2, 4 channels]

netRAID1 nuances

With mirrored journal

must preserve write ordering!

immediate takeover - no fsck!

Without

3x faster!

needs fsck
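Why write ordering matters when the journal is mirrored can be shown with a small model of asynchronous mirroring. It is a sketch under stated assumptions: `OrderedAsyncMirror` is a hypothetical name, the disks are dictionaries, and a FIFO queue stands in for whatever mechanism the drivers actually use.

```python
from collections import deque


class OrderedAsyncMirror:
    """Toy mirror: local writes return at once, remote writes drain in order."""

    def __init__(self):
        self.local = {}
        self.remote = {}
        self.pending = deque()  # FIFO queue preserves submission order

    def write(self, block: int, data: bytes) -> None:
        self.local[block] = data
        self.pending.append((block, data))  # do not wait for the remote side

    def drain(self, max_writes=None) -> int:
        """Apply queued writes to the remote side, oldest first."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            block, data = self.pending.popleft()
            self.remote[block] = data
            sent += 1
        return sent


m = OrderedAsyncMirror()
m.write(0, b"journal: about to update block 9")
m.write(9, b"new data")
m.drain(max_writes=1)  # the link drops after a single write gets through
print(m.remote)        # journal record present, data block not yet written
```

With in-order delivery the remote side always holds a state the local disk actually passed through, so a journal replay is enough at takeover. If the queue could reorder, the data block might arrive without its journal record and a full fsck would be needed.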

Detecting failure

private or public connectivity test?

Summary

Component-based assembly

ENBD - remote network disk

FR1 - Fast RAID

FS - any file system

easier to parcel out development

more testing

easier to slip partial support into the kernel

FS agnostic

Work together for replication, failover, recovery

Bibliography

● Paul Clements & James E.J. Bottomley. High Availability Data Replication. Proc. Linux Symposium, July 2003, Ottawa, Ontario, Canada. http://archive.linuxsymposium.org/ols2003/Proceedings/All-Reprints/Reprint-Clements-OLS2003.pdf

● P.T. Breuer et al. The Network Block Device. Linux Journal, issue 73. http://www2.linuxjournal.com/lj-issues/issue73/3778.html
