Live migration of OpenZFS datasets with Zmotion

Posted on 15-Apr-2017

Key Numbers

• 930,000 customers worldwide

• 1,100 employees

• 18 M hosted applications

• 3.5 M domain names

• Network: 32 PoPs (Points of Presence), 4 Tbps bandwidth

• 220,000 servers

• 17 datacenters: Western Europe, North America, Central Europe

ZFS @ OVH

• Every flavor of ZFS, mostly OpenZFS on illumos and ZoL (ZFS on Linux)

• Various I/O workloads : virtualization, web, emails, databases, logs, backups…

• Designed our own dual headed ZFS HA server

• Intensive usage of custom ZFS properties to store configuration items

• Atypical usage such as OpenZFS over/under Ceph

Why do we migrate?

• Fragmented storage farms because of continuous allocations/deallocations of zpools

• Heavily fragmented and aged zpools

• Hardware issues or limits

• Switch to new stack (ZFS/OS)

• ZFS bugs (it happens)

Let’s migrate!

[Diagram: web frontends, reached from the Internet through the OVH backbone, mount NFS from the ZFS SRC filer; the migration target is the ZFS DST filer.]

1st try: « naive » approach

[Timeline diagram with SRC and DST lanes]

• Snapshot source dataset

• Create destination dataset

• ZFS send/receive

• ZFS incremental send/receive

• DOWNTIME: shut down NFS service IP, bring up NFS service IP

Result: all NFS clients crashed on the frontends. NFS had to be unmounted and remounted.
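
The naive sequence above can be sketched as a dry run that only prints the steps. This is a minimal illustration, not the team's tooling: `DATASET`, `DST_HOST`, the snapshot names and the `plan` helper are hypothetical, and the use of `ssh` as transport and of `zfs receive -F` to overwrite the pre-created destination are assumptions. Moving the service IP is site-specific and left as a placeholder.

```shell
# Dry run of the naive sequence: every step is printed, nothing is executed.
# DATASET, DST_HOST and the snapshot names are hypothetical placeholders.
DATASET="foo/t"
DST_HOST="dst_server"

# Print one planned step.
plan() {
    printf '%s\n' "$*"
}

naive_migration() {
    plan "zfs snapshot ${DATASET}@full"
    plan "ssh ${DST_HOST} zfs create ${DATASET}"
    # Assumption: -F lets the full stream overwrite the empty, pre-created destination.
    plan "zfs send ${DATASET}@full | ssh ${DST_HOST} zfs receive -F ${DATASET}"
    plan "zfs snapshot ${DATASET}@incr1"
    plan "zfs send -i ${DATASET}@full ${DATASET}@incr1 | ssh ${DST_HOST} zfs receive ${DATASET}"
    plan "(downtime) move the NFS service IP from source to destination"
}

naive_migration
```

Nothing in this sequence carries the file system ID across to the destination, which is why clients end up holding stale file handles.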

Why did the NFS clients crash?

• « Stale NFS file handle » error

• The NFS ID (fh3_fsid) exposed by the server changed after the service IP migration

[Diagram: the ZFS SRC filer and the ZFS DST filer expose different NFS IDs (fh3_fsid) to the web frontends: 0x23456789 and 0x98765432.]

[Diagram: kernel layers. NFS (fh3_fsid), VFS (fsid), file system: ZFS (fsid_guid).]

fh3_fsid (NFS) -> fsid (VFS) -> fsid_guid (ZFS)

2nd try: « skeet-shoot » fsid_guid in RAM

Get source fsid_guid with DTrace

    fbt::zfs_ioc_dataset_list_next:entry
    {
        self->zfs_cmd = args[0];
    }

    fbt::dsl_dataset_fast_stat:entry
    /self->zfs_cmd != NULL/
    {
        printf("zc_name:-%s- guid:-%#lx-\n",
            stringof(self->zfs_cmd->zc_name), args[0]->ds_fsid_guid);
    }

    fbt::zfs_ioc_dataset_list_next:return
    /self->zfs_cmd != NULL/
    {
        self->zfs_cmd = NULL;
    }

    # zc_name:-foo/t- guid:-0x48a20330327752-
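
DTrace prints the value in hexadecimal (`%#lx`), while the `zfs get fsid_guid` output shown later in the deck is decimal. A small helper for converting between the two notations may be handy when comparing values; the function names below are hypothetical, and the sketch assumes a shell whose `printf` accepts C-style `0x` constants (bash does).

```shell
# Convert a hexadecimal fsid_guid (with 0x prefix) to decimal.
hex_to_dec() {
    printf '%d\n' "$1"
}

# Convert a decimal fsid_guid to hexadecimal with a 0x prefix.
dec_to_hex() {
    printf '0x%x\n' "$1"
}

hex_to_dec 0x48a20330327752
# prints: 20444332900382546
```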

Set source fsid_guid with MDB

# Get memory address where fsid_guid is stored

    fbt::dsl_dataset_sync:entry
    {
        printf("fsid_guid:-%a- address:-%a-\n",
            args[0]->ds_fsid_guid, &args[0]->ds_fsid_guid);
    }

    # fsid_guid:-2360eb4e33dbe- address:-0xffffff025a70e620-

    # mdb -kw
    Loading modules: [ unix genunix dtrace zfs nfs … ]
    > 0xffffff025a70e620/J
    0xffffff025a70e620: 2360eb4e33dbe
    > 0xffffff025a70e620/Z 48a20330327752
    0xffffff025a70e620: 0x2360eb4e33dbe = 0x48a20330327752

[Timeline diagram with SRC and DST lanes]

• Get source fsid_guid with DTrace

• Snapshot source dataset

• Create destination dataset

• ZFS send/receive

• Set destination fsid_guid in RAM with MDB + zfs umount/mount to update VFS ID

• ZFS incremental send/receive

• DOWNTIME: shut down NFS service IP on src, bring up NFS service IP on dest

3rd try’s the charm: « YAZP! » (Yet Another ZFS Property)

fsid_guid: a new ZFS property

    root@src_server# zfs get fsid_guid foo/t
    NAME   PROPERTY   VALUE              SOURCE
    foo/t  fsid_guid  25231704771932250  -

    root@dst_server# zfs create -o fsid_guid=25231704771932250 foo/t
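
The two commands above can be chained so that the value never has to be copied by hand. The sketch below only builds and prints the destination-side command. `build_create_cmd` is a hypothetical helper, and the `fsid_guid` property exists only on systems carrying the patch described in this talk.

```shell
# Print the destination-side command for a given fsid_guid and dataset.
# Refuses anything that is not a plain decimal number.
build_create_cmd() {
    guid=$1
    dataset=$2
    case "$guid" in
        ''|*[!0-9]*) echo "invalid fsid_guid: $guid" >&2; return 1 ;;
    esac
    printf 'zfs create -o fsid_guid=%s %s\n' "$guid" "$dataset"
}

# On a patched source host the value could be read with:
#   guid=$(zfs get -H -o value fsid_guid foo/t)
guid=25231704771932250   # value from the example above
build_create_cmd "$guid" foo/t
# prints: zfs create -o fsid_guid=25231704771932250 foo/t
```

The numeric check matters because `zfs get` prints `-` for a property without a value, which should never reach `zfs create`.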

Zmotion

[Timeline diagram with SRC and DST lanes]

• zfs get fsid_guid source dataset

• Snapshot source dataset

• zfs create -o fsid_guid destination dataset

• mbuffered ZFS send/receive

• Incremental mbuffered ZFS send/receive (… n)

• Shutdown NFS service IP on src

• Up NFS service IP on dest
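
The repeated incremental transfers could be planned along the following lines. This is a dry-run sketch under stated assumptions, not the actual Zmotion orchestration: `plan_incrementals`, the `zm` snapshot prefix, the port and the mbuffer block and memory sizes are all hypothetical, and the full stream of the first snapshot is assumed to have been received already.

```shell
# Print the commands for a number of incremental transfers through mbuffer.
# Arguments: dataset, destination host, TCP port, number of rounds.
plan_incrementals() {
    dataset=$1; dst=$2; port=$3; rounds=$4
    i=1
    while [ "$i" -le "$rounds" ]; do
        prev=$((i - 1))
        # Receiver listens first, then the sender snapshots and streams the delta.
        echo "dst: mbuffer -s 128k -m 1G -I ${port} | zfs receive ${dataset}"
        echo "src: zfs snapshot ${dataset}@zm${i}"
        echo "src: zfs send -i ${dataset}@zm${prev} ${dataset}@zm${i} | mbuffer -s 128k -m 1G -O ${dst}:${port}"
        i=$((i + 1))
    done
}

# Assumes the full stream of foo/t@zm0 has already been received.
plan_incrementals foo/t dst_server 9090 2
```

Each round shrinks the delta left to transfer, so the last incremental before the service IP switch is as small as possible.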

Tiny hack but HUGE benefits!

• Zmotion is a combination of the fsid_guid patch and zfs send/receive orchestration

• Thousands of datasets already Zmotioned

• Makes ZFS a bit more « distributable »

• ZoL is not affected: an fsid parameter is already present in the Linux NFS stack

Availability

• fsid_guid ZFS property patch

• illumos gate #6333

/ovh

/6333

Questions?

Thank you!

OVH Storage Team, Francois Lesage, @storagebits