Live migration of OpenZFS datasets with Zmotion
Key Numbers
• 930,000 customers worldwide
• 1,100 employees
• 18M hosted applications
• 3.5M domain names
• 220,000 servers
• 17 datacenters
• Network: 32 PoPs (Points of Presence), 4 Tbps bandwidth
• Regions: Western Europe, North America, Central Europe
ZFS @ OVH
• Every flavor of ZFS, mostly OpenZFS with illumos and ZoL
• Various I/O workloads: virtualization, web, email, databases, logs, backups…
• Designed our own dual-headed ZFS HA server
• Intensive use of custom ZFS properties to store configuration items
• Atypical usage such as OpenZFS over/under Ceph
Why do we migrate?
• Fragmented storage farms because of continuous allocation/deallocation of zpools
• Heavily fragmented and aged zpools
• Hardware issues or limits
• Switch to a new stack (ZFS/OS)
• ZFS bugs (it happens)
Let’s migrate!
[Diagram: web frontends, reached from the Internet, mount NFS over the OVH backbone from the ZFS SRC filer; the dataset is to be moved to the ZFS DST filer.]
1st try: « naive » approach
Timeline (SRC -> DST):
1. Snapshot source dataset
2. Create destination dataset
3. ZFS send/receive
4. Shutdown NFS service IP (DOWNTIME begins)
5. ZFS incremental send/receive
6. Up NFS service IP (DOWNTIME ends)
Result: all NFS clients crashed on the frontends. Need to umount/remount NFS.
Why NFS clients crashed?
• « stale NFS file handle » error
• The NFS ID (fh3_fsid) exposed by the server changed after the service IP migration
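[Diagram: the web frontends mount NFS from the ZFS SRC filer, then from the ZFS DST filer. The two filers expose different NFS IDs (fh3_fsid): 0x23456789 and 0x98765432. Inside the kernel, the ID passes through three layers: NFS (fh3_fsid), VFS (fsid), ZFS file system (fsid_guid).]
fh3_fsid (NFS) -> fsid (VFS) -> fsid_guid (ZFS)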
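The fsid_guid to fsid step of this chain can be illustrated with a small Python model. It is only a sketch, assuming the packing used by illumos when a ZFS file system is mounted (a 56-bit fsid_guid combined with an 8-bit file system type into two 32-bit words); the function name and the file system type value 2 are hypothetical. The point is that a different fsid_guid on the destination necessarily yields a different fsid, and therefore a different fh3_fsid for the NFS clients.

```python
def vfs_fsid_from_guid(fsid_guid, fstype):
    """Model of how a 56-bit ZFS fsid_guid and an 8-bit fs type could be
    packed into the two 32-bit words of a VFS fsid (assumed illumos layout)."""
    if fsid_guid < 0 or fsid_guid >> 56:
        raise ValueError("fsid_guid must fit in 56 bits")
    # Word 0: low 32 bits of the guid.
    val0 = fsid_guid & 0xFFFFFFFF
    # Word 1: high 24 bits of the guid, shifted left, with the fs type in the low byte.
    val1 = ((fsid_guid >> 32) << 8) | (fstype & 0xFF)
    return val0, val1


# Source guid taken from the DTrace output on the slide; fstype 2 is a made-up value.
src = vfs_fsid_from_guid(0x48a20330327752, 2)
# Guid found on the destination before it was overwritten (from the MDB slide).
dst = vfs_fsid_from_guid(0x2360eb4e33dbe, 2)
print([hex(v) for v in src], [hex(v) for v in dst])
```

With the two guids shown on the slides, the resulting word pairs differ, which matches the « stale NFS file handle » symptom seen by the clients.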
2nd try: « skeet-shoot » fsid_guid in RAM
Get source fsid_guid with DTrace
fbt::zfs_ioc_dataset_list_next:entry
{
    self->zfs_cmd = args[0];
}
fbt::dsl_dataset_fast_stat:entry
/self->zfs_cmd != NULL/
{
    printf("zc_name:-%s- guid:-%#lx-\n", stringof(self->zfs_cmd->zc_name), args[0]->ds_fsid_guid);
}
fbt::zfs_ioc_dataset_list_next:return
/self->zfs_cmd != NULL/
{
    self->zfs_cmd = NULL;
}
# zc_name:-foo/t- guid:-0x48a20330327752-
Set source fsid_guid with MDB
# Get memory address where fsid_guid is stored
fbt::dsl_dataset_sync:entry
{
    printf("fsid_guid:-%a- address:-%a-\n", args[0]->ds_fsid_guid, &args[0]->ds_fsid_guid);
}
# fsid_guid:-2360eb4e33dbe- address:-0xffffff025a70e620-
# mdb -kw
Loading modules: [ unix genunix dtrace zfs nfs … ]
> 0xffffff025a70e620/J
0xffffff025a70e620: 2360eb4e33dbe
> 0xffffff025a70e620/Z 48a20330327752
0xffffff025a70e620: 0x2360eb4e33dbe = 0x48a20330327752
Timeline (SRC -> DST):
1. Snapshot source dataset
2. Create destination dataset
3. ZFS send/receive
4. Get source fsid_guid with DTrace
5. Shutdown NFS service IP on src (DOWNTIME begins)
6. ZFS incremental send/receive
7. Set destination fsid_guid in RAM with MDB + zfs umount/mount to update VFS ID
8. Up NFS service IP on dest (DOWNTIME ends)
3rd try’s the charm: « YAZP! » (Yet Another ZFS Property)
fsid_guid: a new ZFS property
root@src_server# zfs get fsid_guid foo/t
NAME   PROPERTY   VALUE              SOURCE
foo/t  fsid_guid  25231704771932250  -
root@dst_server# zfs create -o fsid_guid=25231704771932250 foo/t
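The hand-off between these two commands is easy to script. The Python sketch below parses the tabular `zfs get` output shown on the slide and builds the matching `zfs create` command line. The helper names are hypothetical, and the fsid_guid property itself only exists on a ZFS build carrying the patch described in this talk.

```python
# Output of `zfs get fsid_guid foo/t` as shown on the slide.
SAMPLE = """NAME   PROPERTY   VALUE              SOURCE
foo/t  fsid_guid  25231704771932250  -
"""


def parse_zfs_get(output):
    """Parse single-row `zfs get` output into a dict keyed by column name."""
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected one header line and one data row")
    return dict(zip(lines[0].split(), lines[1].split()))


def create_command(row):
    """Build the destination-side `zfs create` command for the parsed row."""
    value = int(row["VALUE"])  # raises ValueError if the value is not a number
    return "zfs create -o {}={} {}".format(row["PROPERTY"], value, row["NAME"])


print(create_command(parse_zfs_get(SAMPLE)))
```

The sketch only builds the command string; running it on the destination filer is left to the caller.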
Zmotion timeline (SRC -> DST):
1. Snapshot source dataset
2. zfs get fsid_guid on source dataset
3. zfs create -o fsid_guid destination dataset
4. mbuffered ZFS send/receive
5. Incremental mbuffered ZFS send/receive (… n times)
6. Shutdown NFS service IP on src
7. Up NFS service IP on dest
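The sequence on this slide can be sketched as a dry-run planner in Python. This is not OVH's Zmotion tool, only an illustration built on several assumptions: the stream travels through `mbuffer` and `ssh`, snapshots are named `zmotion-N`, the last incremental pass runs after the NFS service IP is shut down on the source, and `zfs receive -F` into the pre-created dataset keeps its fsid_guid. The service IP steps are site-specific, so they appear as plain actions rather than commands.

```python
def zmotion_plan(dataset, dst_host, fsid_guid, rounds=2):
    """Return an ordered list of (host, command-or-action) steps. Nothing is executed."""
    if rounds < 1:
        raise ValueError("at least one incremental round is required")

    def snap(n):
        return "{}@zmotion-{}".format(dataset, n)

    # Receiving side of every stream: buffer, then receive on the destination.
    pipe = "mbuffer -q -m 1G | ssh {} zfs receive -F {}".format(dst_host, dataset)

    plan = [
        ("src", "zfs snapshot " + snap(0)),
        # fsid_guid was read beforehand on the source with `zfs get fsid_guid`.
        ("dst", "zfs create -o fsid_guid={} {}".format(fsid_guid, dataset)),
        ("src", "zfs send {} | {}".format(snap(0), pipe)),
    ]
    for n in range(1, rounds + 1):
        if n == rounds:
            # Final pass: stop client writes first so the last delta is complete.
            plan.append(("src", "ACTION: shut down NFS service IP"))
        plan.append(("src", "zfs snapshot " + snap(n)))
        plan.append(("src", "zfs send -i {} {} | {}".format(snap(n - 1), snap(n), pipe)))
    plan.append(("dst", "ACTION: bring up NFS service IP"))
    return plan


for host, step in zmotion_plan("foo/t", "dst_server", 25231704771932250):
    print(host, step)
```

Each earlier incremental round shrinks the delta left for the final pass, which is where any client-visible pause would come from.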
Tiny hack but HUGE benefits!
• Zmotion is a combination of fsid_guid patch and zfs send/receive orchestration
• Thousands of datasets already Zmotioned
• Makes ZFS a bit more « distributable »
• ZoL is not concerned: the nfsid parameter is present in the Linux NFS stack
Availability
• fsid_guid ZFS property patch
• illumos gate #6333
/ovh
/6333
Questions?
Thank you!
OVH Storage Team Francois Lesage @storagebits