Live migration of OpenZFS datasets with Zmotion

Page 1: Live migration of OpenZFS datasets with Zmotion

Live Migration of OpenZFS datasets with Zmotion

Page 2: Live migration of OpenZFS datasets with Zmotion

Key Numbers

• 930,000 customers worldwide

• 1,100 employees

• 18 M hosted applications

• 3.5 M domain names

Page 3: Live migration of OpenZFS datasets with Zmotion

NETWORK

• 32 PoP (Points of Presence)

• 4 Tbps bandwidth

• 220,000 servers

• 17 datacenters

• Western Europe, North America, Central Europe

Page 4: Live migration of OpenZFS datasets with Zmotion

ZFS @ OVH

• Every flavor of ZFS, mostly OpenZFS on illumos and ZoL

• Various I/O workloads: virtualization, web, emails, databases, logs, backups…

• Designed our own dual-headed ZFS HA server

• Heavy use of custom ZFS properties to store configuration items

• Atypical usage like OpenZFS over/under Ceph

Page 5: Live migration of OpenZFS datasets with Zmotion

Why do we migrate?

• Fragmented storage farms because of continuous allocations/deallocations of zpools

• Heavily fragmented and aged zpools

• Hardware issues or limits

• Switch to new stack (ZFS/OS)

• ZFS bugs (it happens)

Page 6: Live migration of OpenZFS datasets with Zmotion

Let’s migrate!

[Diagram: INTERNET, WEB FRONTENDS, OVH BACKBONE, NFS, ZFS SRC Filer, ZFS DST Filer]

Page 7: Live migration of OpenZFS datasets with Zmotion

1st try: « naive » approach

Page 8: Live migration of OpenZFS datasets with Zmotion

Timeline:

• Snapshot source dataset (SRC)

• Create destination dataset (DST)

• ZFS send/receive (SRC to DST)

• Shut down NFS service IP (SRC): DOWNTIME starts

• ZFS incremental send/receive (SRC to DST)

• Bring up NFS service IP (DST): DOWNTIME ends

All NFS clients crashed on the frontends. Need to umount/remount NFS.
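
The naive sequence can be sketched as a dry-run script that only prints the commands it would run. The dataset name `foo/t` and the host name `dst_server` are borrowed from later slides; the snapshot names `mig1`/`mig2`, the ssh transport, and the `service_ip_down`/`service_ip_up` placeholders are assumptions made for illustration, since the slides do not show how the service IP is moved. Here `zfs receive` creates the destination dataset.

```shell
#!/bin/sh
# Dry-run sketch of the naive migration: prints the steps, executes nothing.
# Snapshot names, the ssh transport and the service_ip_* placeholders are
# hypothetical; only the zfs subcommands are standard.
naive_plan() {
    ds=$1    # dataset to migrate, e.g. foo/t
    dst=$2   # destination filer, e.g. dst_server
    cat <<EOF
zfs snapshot ${ds}@mig1
zfs send ${ds}@mig1 | ssh ${dst} zfs receive ${ds}
service_ip_down   # placeholder: downtime starts on SRC
zfs snapshot ${ds}@mig2
zfs send -i ${ds}@mig1 ${ds}@mig2 | ssh ${dst} zfs receive -F ${ds}
service_ip_up     # placeholder: downtime ends on DST
EOF
}

naive_plan foo/t dst_server
```

The second snapshot is taken after the service IP is down so that the incremental stream carries every write made since the first transfer.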

Page 10: Live migration of OpenZFS datasets with Zmotion

Why did the NFS clients crash?

• « stale NFS file handle » error

• The NFS ID (fh3_fsid) exposed by the server changed after the service IP migration

[Diagram: WEB FRONTENDS mount over NFS; ZFS SRC Filer has NFS ID (fh3_fsid) 0x23456789, ZFS DST Filer has NFS ID (fh3_fsid) 0x98765432]

Page 11: Live migration of OpenZFS datasets with Zmotion

fh3_fsid (NFS) -> fsid (VFS) -> fsid_guid (ZFS)

[Diagram: kernel stack from NFS (fh3_fsid) through VFS (fsid) down to the ZFS file system (fsid_guid)]
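
A toy model may help show why a changed fsid breaks the clients. In this sketch a "file handle" is just `<fsid>:<inode>`, which is not the real NFS wire format, and `check_handle` is a hypothetical helper. The two fsid values are the ones from the previous slide; treating the first as the old (source) value and the second as the new (destination) value is an assumption based on their order.

```shell
#!/bin/sh
# Toy model: a "file handle" is "<fsid>:<inode>"; the server accepts it only
# if the embedded fsid matches the fsid it currently exposes.
# The handle layout and function name are illustrative, not the NFS wire format.
check_handle() {
    server_fsid=$1
    handle=$2
    handle_fsid=${handle%%:*}   # part before the first ':'
    if [ "$handle_fsid" = "$server_fsid" ]; then
        echo "OK"
    else
        echo "ESTALE"           # reported to the client as "stale NFS file handle"
    fi
}

handle="0x23456789:42"              # handle a client obtained before the migration
check_handle 0x23456789 "$handle"   # server still exposes the old fsid
check_handle 0x98765432 "$handle"   # server exposes a different fsid after the move
```

Because the fsid is derived from the dataset's fsid_guid, a freshly received dataset gets a new value and every handle already held by a client stops matching.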

Page 12: Live migration of OpenZFS datasets with Zmotion

2nd try: « skeet-shoot » fsid_guid in RAM

Page 13: Live migration of OpenZFS datasets with Zmotion

Get source fsid_guid with DTrace

fbt::zfs_ioc_dataset_list_next:entry
{
        self->zfs_cmd = args[0];
}

fbt::dsl_dataset_fast_stat:entry
/self->zfs_cmd != NULL/
{
        printf("zc_name:-%s- guid:-%#lx-\n",
            stringof(self->zfs_cmd->zc_name), args[0]->ds_fsid_guid);
}

fbt::zfs_ioc_dataset_list_next:return
/self->zfs_cmd != NULL/
{
        self->zfs_cmd = NULL;
}

# zc_name:-foo/t- guid:-0x48a20330327752-

Page 14: Live migration of OpenZFS datasets with Zmotion

Set source fsid_guid with MDB

# Get memory address where fsid_guid is stored
fbt::dsl_dataset_sync:entry
{
        printf("fsid_guid:-%a- address:-%a-\n",
            args[0]->ds_fsid_guid, &args[0]->ds_fsid_guid);
}
# fsid_guid:-2360eb4e33dbe- address:-0xffffff025a70e620-

# mdb -kw
Loading modules: [ unix genunix dtrace zfs nfs … ]
> 0xffffff025a70e620/J
0xffffff025a70e620: 2360eb4e33dbe
> 0xffffff025a70e620/Z 48a20330327752
0xffffff025a70e620: 0x2360eb4e33dbe = 0x48a20330327752
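
The hand-off between the two tools can be sketched in shell: take the guid printed by the first DTrace script and format the `mdb` write expression for the address printed by the second one. The helper names `guid_from_dtrace` and `mdb_write_expr` are hypothetical, and the sketch only prints the expression; it never touches kernel memory.

```shell
#!/bin/sh
# Extract the hex guid from a line printed by the DTrace script
# ("zc_name:-<dataset>- guid:-0x<hex>-") and print the mdb write expression.
# Helper names are hypothetical; nothing here writes to kernel memory.
guid_from_dtrace() {
    # $1: dataset name, stdin: DTrace output
    awk -v ds="$1" '
        index($0, "zc_name:-" ds "- ") == 1 {
            sub(/^.*guid:-0x/, ""); sub(/-$/, ""); print; exit
        }'
}

mdb_write_expr() {
    # $1: kernel address of ds_fsid_guid, $2: guid in hex without the 0x prefix
    printf '%s/Z %s\n' "$1" "$2"
}

guid=$(printf '%s\n' 'zc_name:-foo/t- guid:-0x48a20330327752-' | guid_from_dtrace foo/t)
mdb_write_expr 0xffffff025a70e620 "$guid"
```

With the values from the slides this prints the same `/Z` line that appears in the `mdb -kw` session above.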

Page 15: Live migration of OpenZFS datasets with Zmotion

Timeline:

• Get source fsid_guid with DTrace (SRC)

• Snapshot source dataset (SRC)

• Create destination dataset (DST)

• ZFS send/receive (SRC to DST)

• Shut down NFS service IP on src: DOWNTIME starts

• ZFS incremental send/receive (SRC to DST)

• Set destination fsid_guid in RAM with MDB + zfs umount/mount to update VFS ID (DST)

• Bring up NFS service IP on dest: DOWNTIME ends

Page 16: Live migration of OpenZFS datasets with Zmotion

3rd try’s the charm: « YAZP! » (Yet Another ZFS Property)

Page 17: Live migration of OpenZFS datasets with Zmotion

fsid_guid: a new ZFS property

root@src_server# zfs get fsid_guid foo/t
NAME   PROPERTY   VALUE              SOURCE
foo/t  fsid_guid  25231704771932250  -

root@dst_server# zfs create -o fsid_guid=25231704771932250 foo/t
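
Carrying the value from source to destination can be scripted. The sketch below reads the VALUE column from `zfs get` output shaped like the slide's and prints the matching `zfs create` command. The `fsid_guid` property exists only with the patch described in these slides, and the helper names `fsid_guid_value` and `create_cmd` are hypothetical.

```shell
#!/bin/sh
# Read the VALUE column of `zfs get fsid_guid <dataset>` output and print the
# matching `zfs create` command for the destination.
# fsid_guid requires the patch from the slides; helper names are hypothetical.
fsid_guid_value() {
    # $1: dataset, stdin: zfs get output (NAME PROPERTY VALUE SOURCE)
    awk -v ds="$1" '$1 == ds && $2 == "fsid_guid" { print $3; exit }'
}

create_cmd() {
    # $1: fsid_guid value, $2: dataset to create on the destination
    printf 'zfs create -o fsid_guid=%s %s\n' "$1" "$2"
}

sample='NAME   PROPERTY   VALUE              SOURCE
foo/t  fsid_guid  25231704771932250  -'

value=$(printf '%s\n' "$sample" | fsid_guid_value foo/t)
create_cmd "$value" foo/t
```

On a live system `zfs get -H -o value fsid_guid foo/t` would return the bare value and make the parsing step unnecessary.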

Page 18: Live migration of OpenZFS datasets with Zmotion

Zmotion

Timeline:

• zfs get fsid_guid on source dataset (SRC)

• Snapshot source dataset (SRC)

• zfs create -o fsid_guid destination dataset (DST)

• mbuffered ZFS send/receive (SRC to DST)

• Incremental mbuffered ZFS send/receive … n (SRC to DST)

• Shut down NFS service IP on src

• Bring up NFS service IP on dest
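
The whole Zmotion sequence can be sketched as a dry-run script that prints the commands for a given number of incremental rounds. Several points are assumptions not confirmed by the slides: the snapshot names `zm0`, `zm1`, …, the ssh transport, the 1G `mbuffer` size, the `service_ip_down`/`service_ip_up` placeholders, the use of `zfs receive -F` into the pre-created dataset, and the choice to take the service IP down just before the final round.

```shell
#!/bin/sh
# Dry-run sketch of the Zmotion sequence: prints the steps, executes nothing.
# Assumptions: snapshot names, ssh transport, mbuffer size, the service_ip_*
# placeholders, and `zfs receive -F` into the pre-created dataset.
zmotion_plan() {
    ds=$1; dst=$2; rounds=$3
    echo "guid=\$(zfs get -H -o value fsid_guid $ds)"
    echo "ssh $dst zfs create -o fsid_guid=\$guid $ds"
    echo "zfs snapshot $ds@zm0"
    echo "zfs send $ds@zm0 | mbuffer -q -m 1G | ssh $dst zfs receive -F $ds"
    i=1
    while [ "$i" -le "$rounds" ]; do
        if [ "$i" -eq "$rounds" ]; then
            echo "service_ip_down   # placeholder, on SRC before the last round"
        fi
        prev=$((i - 1))
        echo "zfs snapshot $ds@zm$i"
        echo "zfs send -i $ds@zm$prev $ds@zm$i | mbuffer -q -m 1G | ssh $dst zfs receive -F $ds"
        i=$((i + 1))
    done
    echo "service_ip_up     # placeholder, on DST"
}

zmotion_plan foo/t dst_server 2
```

Each incremental round run while the source is still serving shrinks the delta left for the final round, which is the only one that has to happen with the service IP down.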

Page 20: Live migration of OpenZFS datasets with Zmotion

Tiny hack but HUGE benefits!

• Zmotion is a combination of the fsid_guid patch and zfs send/receive orchestration

• Thousands of datasets already Zmotioned

• Makes ZFS a bit more « distributable »

• ZoL is not concerned: an nfsid parameter is present in the Linux NFS stack

Page 21: Live migration of OpenZFS datasets with Zmotion

Availability

• fsid_guid ZFS property patch

• illumos gate #6333

/ovh

/6333

Page 22: Live migration of OpenZFS datasets with Zmotion

Questions?

Page 23: Live migration of OpenZFS datasets with Zmotion

Thank you!

OVH Storage Team

Francois Lesage

@storagebits