
    Ganeti walk-through

    Documents Ganeti version 2.10

    Contents

    Ganeti walk-through

    Introduction

    Cluster creation

    Running a burn-in

    Instance operations

    Creation

    Accessing instances

    Removal

    Recovering from hardware failures

    Recovering from node failure

    Re-adding a node to the cluster

    Disk failures

    Common cluster problems

    Instance status

    Unallocated DRBD minors

    Orphan volumes

    N+1 errors

    Network issues

    Migration problems

    In use disks at instance shutdown

    LUXI version mismatch

    Introduction

    This document serves as a more example-oriented guide to Ganeti; while the administration

    guide shows a conceptual approach, here you will find a step-by-step example of managing

    instances and the cluster.

    Our simulated, example cluster will have three machines, named node1, node2, node3. Note

    that in real life machines will usually have FQDNs but here we use short names for brevity.

    We will use a secondary network for replication data, 192.0.2.0/24 , with nodes having the

    last octet the same as their index. The cluster name will be example-cluster . All nodes have

    the same simulated hardware configuration, two disks of 750GB, 32GB of memory and 4

    CPUs.

    On this cluster, we will create up to seven instances, named instance1 to instance7.

    Cluster creation

    Follow the Ganeti installation tutorial document and prepare the nodes. Then it's time to


    initialise the cluster:

    $ gnt-cluster init -s 192.0.2.1 --enabled-hypervisors=xen-pvm example-cluster
    $

    The creation was fine. Let's check that the one node we have is functioning correctly:

    $ gnt-node list
    Node  DTotal DFree MTotal MNode MFree Pinst Sinst
    node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0

    $ gnt-cluster verify
    Mon Oct 26 02:08:51 2009 * Verifying global settings
    Mon Oct 26 02:08:51 2009 * Gathering data (1 nodes)
    Mon Oct 26 02:08:52 2009 * Verifying node status
    Mon Oct 26 02:08:52 2009 * Verifying instance status
    Mon Oct 26 02:08:52 2009 * Verifying orphan volumes
    Mon Oct 26 02:08:52 2009 * Verifying remaining instances
    Mon Oct 26 02:08:52 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 02:08:52 2009 * Other Notes
    Mon Oct 26 02:08:52 2009 * Hooks Results

    $

    Since this proceeded correctly, let's add the other two nodes:

    $ gnt-node add -s 192.0.2.2 node2
    -- WARNING --
    Performing this operation is going to replace the ssh daemon keypair
    on the target machine (node2) with the ones of the current one
    and grant full intra-cluster ssh root access to/from it

    Unable to verify hostkey of host xen-devi-5.fra.corp.google.com:
    f7:. Do you want to accept it?
    y/[n]/?: y
    Mon Oct 26 02:11:53 2009 Authentication to node2 via public key failed, trying pas
    root password:
    Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate
    $ gnt-node add -s 192.0.2.3 node3
    -- WARNING --
    Performing this operation is going to replace the ssh daemon keypair
    on the target machine (node3) with the ones of the current one
    and grant full intra-cluster ssh root access to/from it

    Mon Oct 26 02:12:43 2009 - INFO: Node will be a master candidate

    Checking the cluster status again:

    $ gnt-node list
    Node  DTotal DFree MTotal MNode MFree Pinst Sinst
    node1   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
    node2   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
    node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     0
    $ gnt-cluster verify
    Mon Oct 26 02:15:14 2009 * Verifying global settings
    Mon Oct 26 02:15:14 2009 * Gathering data (3 nodes)
    Mon Oct 26 02:15:16 2009 * Verifying node status
    Mon Oct 26 02:15:16 2009 * Verifying instance status
    Mon Oct 26 02:15:16 2009 * Verifying orphan volumes
    Mon Oct 26 02:15:16 2009 * Verifying remaining instances
    Mon Oct 26 02:15:16 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 02:15:16 2009 * Other Notes

    Running a burn-in

    A burn-in exercises the cluster by running a series of operations on a set of test
    instances. An excerpt of the burn-in output on our cluster:
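    The burn-in is driven by the burnin tool that ships with Ganeti. A minimal invocation
    sketch (the tool path, the OS name and the instance names below follow this
    walk-through's conventions and are assumptions that may differ on your installation):

    $ /usr/lib/ganeti/tools/burnin -o debootstrap -p instance{1..5}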

    - Failing over instances
      * instance instance1
      * instance instance5
      * Submitted job ID(s) 179, 180, 181, 182, 183
        waiting for job 179 for instance1
    - Migrating instances
      * instance instance1
        migration and migration cleanup
      * instance instance5
        migration and migration cleanup
      * Submitted job ID(s) 184, 185, 186, 187, 188
        waiting for job 184 for instance1
    - Exporting and re-importing instances
      * instance instance1
        export to node node3
        remove instance
        import from node3 to node1, node2
        remove export
      * instance instance5
        export to node node1
        remove instance
        import from node1 to node2, node3
        remove export
      * Submitted job ID(s) 196, 197, 198, 199, 200
        waiting for job 196 for instance1
    - Reinstalling instances
      * instance instance1
        reinstall without passing the OS
        reinstall specifying the OS
      * instance instance5
        reinstall without passing the OS
        reinstall specifying the OS
      * Submitted job ID(s) 203, 204, 205, 206, 207
        waiting for job 203 for instance1
    - Rebooting instances
      * instance instance1
        reboot with type 'hard'
        reboot with type 'soft'
        reboot with type 'full'
      * instance instance5
        reboot with type 'hard'
        reboot with type 'soft'
        reboot with type 'full'
      * Submitted job ID(s) 208, 209, 210, 211, 212
        waiting for job 208 for instance1
    - Adding and removing disks
      * instance instance1
        adding a disk
        removing last disk
      * instance instance5
        adding a disk
        removing last disk
      * Submitted job ID(s) 213, 214, 215, 216, 217


        waiting for job 213 for instance1
    - Adding and removing NICs
      * instance instance1
        adding a NIC
        removing last NIC
      * instance instance5
        adding a NIC
        removing last NIC
      * Submitted job ID(s) 218, 219, 220, 221, 222
        waiting for job 218 for instance1
    - Activating/deactivating disks
      * instance instance1
        activate disks when online
        activate disks when offline
        deactivate disks (when offline)
      * instance instance5
        activate disks when online
        activate disks when offline
        deactivate disks (when offline)
      * Submitted job ID(s) 223, 224, 225, 226, 227
        waiting for job 223 for instance1
    - Stopping and starting instances
      * instance instance1
      * instance instance5
      * Submitted job ID(s) 230, 231, 232, 233, 234
        waiting for job 230 for instance1
    - Removing instances
      * instance instance1
      * instance instance5
      * Submitted job ID(s) 235, 236, 237, 238, 239
        waiting for job 235 for instance1
    $

    You can see in the above what operations the burn-in does. Ideally, the burn-in log would

    proceed successfully through all the steps and end cleanly, without throwing errors.

    Instance operations

    Creation

    At this point, Ganeti and the hardware seem to be functioning correctly, so we'll follow up

    with creating the instances manually:

    $ gnt-instance add -t drbd -o debootstrap -s 256m instance1
    Mon Oct 26 04:06:52 2009 - INFO: Selected nodes for instance instance1 via ialloca
    Mon Oct 26 04:06:53 2009 * creating instance disks...
    Mon Oct 26 04:06:57 2009 adding instance instance1 to cluster config
    Mon Oct 26 04:06:57 2009 - INFO: Waiting for instance instance1 to sync disks.
    Mon Oct 26 04:06:57 2009 - INFO: - device disk/0: 20.00% done, 4 estimated seconds
    Mon Oct 26 04:07:01 2009 - INFO: Instance instance1's disks are in sync.
    Mon Oct 26 04:07:01 2009 creating os for instance instance1 on node node2


    Mon Oct 26 04:07:01 2009 * running the instance OS create scripts...
    Mon Oct 26 04:07:14 2009 * starting instance...
    $ gnt-instance add -t drbd -o debootstrap -s 256m -n node1:node2 instance2
    Mon Oct 26 04:11:37 2009 * creating instance disks...
    Mon Oct 26 04:11:40 2009 adding instance instance2 to cluster config
    Mon Oct 26 04:11:41 2009 - INFO: Waiting for instance instance2 to sync disks.
    Mon Oct 26 04:11:41 2009 - INFO: - device disk/0: 35.40% done, 1 estimated seconds
    Mon Oct 26 04:11:42 2009 - INFO: - device disk/0: 58.50% done, 1 estimated seconds
    Mon Oct 26 04:11:43 2009 - INFO: - device disk/0: 86.20% done, 0 estimated seconds
    Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 92.40% done, 0 estimated seconds
    Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 97.00% done, 0 estimated seconds
    Mon Oct 26 04:11:44 2009 - INFO: Instance instance2's disks are in sync.
    Mon Oct 26 04:11:44 2009 creating os for instance instance2 on node node1
    Mon Oct 26 04:11:44 2009 * running the instance OS create scripts...
    Mon Oct 26 04:11:57 2009 * starting instance...
    $

    The above shows one instance created via an iallocator script, and one being created with

    manual node assignment. The other three instances were also created and now it's time to

    check them:

    $ gnt-instance list
    Instance  Hypervisor OS          Primary_node Status  Memory
    instance1 xen-pvm    debootstrap node2        running   128M
    instance2 xen-pvm    debootstrap node1        running   128M
    instance3 xen-pvm    debootstrap node1        running   128M
    instance4 xen-pvm    debootstrap node3        running   128M
    instance5 xen-pvm    debootstrap node2        running   128M

    Accessing instances

    Accessing an instance's console is easy:

    $ gnt-instance console instance2
    [    0.000000] Bootdata ok (command line is root=/dev/sda1 ro)
    [    0.000000] Linux version 2.6
    [    0.000000] BIOS-provided physical RAM map:
    [    0.000000]  Xen: 0000000000000000 - 0000000008800000 (usable)
    [13138176.018071] Built 1 zonelists. Total pages: 34816
    [13138176.018074] Kernel command line: root=/dev/sda1 ro
    [13138176.018694] Initializing CPU#0
    Checking file systems...fsck 1.41.3 (12-Oct-2008)
    done.
    Setting kernel variables (/etc/sysctl.conf)...done.
    Mounting local filesystems...done.
    Activating swapfile swap...done.
    Setting up networking....
    Configuring network interfaces...done.
    Setting console screen modes and fonts.
    INIT: Entering runlevel: 2
    Starting enhanced syslogd: rsyslogd.
    Starting periodic command scheduler: crond.

    Debian GNU/Linux 5.0 instance2 tty1

    instance2 login:

    At this moment you can log in to the instance and, after configuring the network (and doing

    this on all instances), we can check their connectivity:


    $ fping instance{1..5}
    instance1 is alive
    instance2 is alive
    instance3 is alive
    instance4 is alive
    instance5 is alive
    $

    Removal

    Removing unwanted instances is also easy:

    $ gnt-instance remove instance5
    This will remove the volumes of the instance instance5 (including
    mirrors), thus removing all the data of the instance. Continue?
    y/[n]/?: y
    $

    Recovering from hardware failures

    Recovering from node failure

    We are now left with four instances. Assume that at this point, node3, which has one primary

    and one secondary instance, crashes:

    $ gnt-node info node3
    Node name: node3
      primary ip: 198.51.100.1
      secondary ip: 192.0.2.3
      master candidate: True
      drained: False
      offline: False
      primary for instances:
        - instance4
      secondary for instances:
        - instance1
    $ fping node3
    node3 is unreachable

    At this point, the primary instance of that node (instance4) is down, but the secondary instance (instance1) is not affected except that it has lost disk redundancy:

    $ fping instance{1,4}
    instance1 is alive
    instance4 is unreachable
    $

    If we try to check the status of instance4 via the instance info command, it fails because it

    tries to contact node3 which is down:

    $ gnt-instance info instance4
    Failure: command execution error:
    Error checking node node3: Connection failed (113: No route to host)

    $


    So we need to mark node3 as being offline, and thus Ganeti won't talk to it anymore:

    $ gnt-node modify -O yes -f node3
    Mon Oct 26 04:34:12 2009 - WARNING: Not enough master candidates (desired 10, new
    Mon Oct 26 04:34:15 2009 - WARNING: Communication failure to node node3: Connectio
    Modified node node3
     - offline -> True
     - master_candidate -> auto-demotion due to offline

    $

    And now we can failover the instance:

    $ gnt-instance failover instance4
    Failover will happen to image instance4. This requires a shutdown of
    the instance. Continue?
    y/[n]/?: y
    Mon Oct 26 04:35:34 2009 * checking disk consistency between source and target
    Failure: command execution error:
    Disk disk/0 is degraded on target node, aborting failover.
    $ gnt-instance failover --ignore-consistency instance4
    Failover will happen to image instance4. This requires a shutdown of
    the instance. Continue?
    y/[n]/?: y
    Mon Oct 26 04:35:47 2009 * checking disk consistency between source and target
    Mon Oct 26 04:35:47 2009 * shutting down instance on source node
    Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown instance instance4 on node
    Mon Oct 26 04:35:47 2009 * deactivating the instance's disks on source node
    Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown block device disk/0 on node
    Mon Oct 26 04:35:47 2009 * activating the instance's disks on target node
    Mon Oct 26 04:35:47 2009 - WARNING: Could not prepare block device disk/0 on node
    Mon Oct 26 04:35:48 2009 * starting the instance on the target node
    $

    Note that in our first attempt, Ganeti refused to do the failover since it wasn't sure about the

    status of the instance's disks. We pass the --ignore-consistency flag and then the

    failover succeeds:

    $ gnt-instance list
    Instance  Hypervisor OS          Primary_node Status  Memory
    instance1 xen-pvm    debootstrap node2        running   128M
    instance2 xen-pvm    debootstrap node1        running   128M
    instance3 xen-pvm    debootstrap node1        running   128M
    instance4 xen-pvm    debootstrap node1        running   128M
    $

    But at this point, both instance1 and instance4 are without disk redundancy:

    $ gnt-instance info instance1
    Instance name: instance1
    UUID: 45173e82-d1fa-417c-8758-7d582ab7eef4
    Serial number: 2
    Creation time: 2009-10-26 04:06:57
    Modification time: 2009-10-26 04:07:14
    State: configured to be up, actual state is up
      Nodes:
        - primary: node2
        - secondaries: node3
      Operating system: debootstrap
      Allocated network port: None
      Hypervisor: xen-pvm
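    Until node3 is repaired and re-added, both instances run from a single disk copy. One
    way to re-establish redundancy in the meantime is to move their DRBD secondaries onto
    the surviving nodes with replace-disks; a sketch (the target nodes below are illustrative
    and assume enough free disk and memory; -n/--new-secondary selects the new secondary
    node):

    $ gnt-instance replace-disks -n node1 instance1
    $ gnt-instance replace-disks -n node2 instance4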


    Disk failures

    A disk failure is simpler than a full node failure. First, a single disk failure should not cause

    data-loss for any redundant instance; only the performance of some instances might be

    reduced due to more network traffic.

    Let's take the cluster status in the above listing, and check what volumes are in use:

    $ gnt-node volumes -o phys,instance node2
    PhysDev   Instance
    /dev/sdb1 instance4
    /dev/sdb1 instance4
    /dev/sdb1 instance1
    /dev/sdb1 instance1
    /dev/sdb1 instance3
    /dev/sdb1 instance3
    /dev/sdb1 instance2
    /dev/sdb1 instance2

    $

    You can see that all instances on node2 have logical volumes on /dev/sdb1. Let's simulate a

    disk failure on that disk:

    $ ssh node2
    # on node2
    $ echo offline > /sys/block/sdb/device/state
    $ vgs
      /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
      /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
      /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
      Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
      Couldn't find all physical volumes for volume group xenvg.
      /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
      /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
      Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
      Couldn't find all physical volumes for volume group xenvg.
      Volume group xenvg not found
    $

    At this point, the node is broken and if we are to examine instance2 we get (simplified output

    shown):

    $ gnt-instance info instance2
    Instance name: instance2
    State: configured to be up, actual state is up
      Nodes:
        - primary: node1
        - secondaries: node2
      Disks:
        - disk/0: drbd8, size 256M
          on primary:   /dev/drbd0 (147:0) in sync, status ok
          on secondary: /dev/drbd1 (147:1) in sync, status *DEGRADED* *MISSING DISK*

    This instance has a secondary only on node2. Let's verify a primary instance of node2:

    $ gnt-instance info instance1
    Instance name: instance1
    State: configured to be up, actual state is up


      Nodes:
        - primary: node2
        - secondaries: node1
      Disks:
        - disk/0: drbd8, size 256M
          on primary:   /dev/drbd0 (147:0) in sync, status *DEGRADED* *MISSING DISK*
          on secondary: /dev/drbd3 (147:3) in sync, status ok
    $ gnt-instance console instance1

    Debian GNU/Linux 5.0 instance1 tty1

    instance1 login: root
    Last login: Tue Oct 27 01:24:09 UTC 2009 on tty1
    instance1:~# date > test
    instance1:~# sync
    instance1:~# cat test
    Tue Oct 27 01:25:20 UTC 2009
    instance1:~# dmesg|tail
    [5439785.235448] NET: Registered protocol family 15
    [5439785.235489] 802.1Q VLAN Support v1.8 Ben Greear
    [5439785.235495] All bugs added by David S. Miller
    [5439785.235517] XENBUS: Device with no driver: device/console/0
    [5439785.236576] kjournald starting. Commit interval 5 seconds
    [5439785.236588] EXT3-fs: mounted filesystem with ordered data mode.
    [5439785.236625] VFS: Mounted root (ext3 filesystem) readonly.
    [5439785.236663] Freeing unused kernel memory: 172k freed
    [5439787.533779] EXT3 FS on sda1, internal journal
    [5440655.065431] eth0: no IPv6 routers present
    instance1:~#

    As you can see, the instance is running fine and doesn't see any disk issues. It is now time

    to fix node2 and re-establish redundancy for the involved instances.

    Note: For Ganeti 2.0 we need to manually fix the volume group on node2 by running vgreduce --removemissing xenvg

    $ gnt-node repair-storage node2 lvm-vg xenvg
    Mon Oct 26 18:14:03 2009 Repairing storage unit 'xenvg' on node2 ...
    $ ssh node2 vgs
      VG    #PV #LV #SN Attr   VSize   VFree
      xenvg   1   8   0 wz--n- 673.84G 673.84G
    $

    This has removed the bad disk from the volume group, which is now left with only one PV.

    We can now replace the disks for the involved instances:

    $ for i in instance{1..4}; do gnt-instance replace-disks -a $i; done
    Mon Oct 26 18:15:38 2009 Replacing disk(s) 0 for instance1
    Mon Oct 26 18:15:38 2009 STEP 1/6 Check device existence
    Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node1
    Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node2
    Mon Oct 26 18:15:38 2009 - INFO: Checking volume groups
    Mon Oct 26 18:15:38 2009 STEP 2/6 Check peer consistency
    Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 consistency on node node1
    Mon Oct 26 18:15:39 2009 STEP 3/6 Allocate new storage
    Mon Oct 26 18:15:39 2009 - INFO: Adding storage on node2 for disk/0
    Mon Oct 26 18:15:39 2009 STEP 4/6 Changing drbd configuration
    Mon Oct 26 18:15:39 2009 - INFO: Detaching disk/0 drbd from local storage
    Mon Oct 26 18:15:40 2009 - INFO: Renaming the old LVs on the target node
    Mon Oct 26 18:15:40 2009 - INFO: Renaming the new LVs on the target node


    Mon Oct 26 18:15:40 2009 - INFO: Adding new mirror component on node2
    Mon Oct 26 18:15:41 2009 STEP 5/6 Sync devices
    Mon Oct 26 18:15:41 2009 - INFO: Waiting for instance instance1 to sync disks.
    Mon Oct 26 18:15:41 2009 - INFO: - device disk/0: 12.40% done, 9 estimated seconds
    Mon Oct 26 18:15:50 2009 - INFO: Instance instance1's disks are in sync.
    Mon Oct 26 18:15:50 2009 STEP 6/6 Removing old storage
    Mon Oct 26 18:15:50 2009 - INFO: Remove logical volumes for disk/0
    Mon Oct 26 18:15:52 2009 Replacing disk(s) 0 for instance2
    Mon Oct 26 18:15:52 2009 STEP 1/6 Check device existence
    Mon Oct 26 18:16:01 2009 STEP 6/6 Removing old storage
    Mon Oct 26 18:16:01 2009 - INFO: Remove logical volumes for disk/0
    Mon Oct 26 18:16:02 2009 Replacing disk(s) 0 for instance3
    Mon Oct 26 18:16:02 2009 STEP 1/6 Check device existence
    Mon Oct 26 18:16:09 2009 STEP 6/6 Removing old storage
    Mon Oct 26 18:16:09 2009 - INFO: Remove logical volumes for disk/0
    Mon Oct 26 18:16:10 2009 Replacing disk(s) 0 for instance4
    Mon Oct 26 18:16:10 2009 STEP 1/6 Check device existence
    Mon Oct 26 18:16:18 2009 STEP 6/6 Removing old storage
    Mon Oct 26 18:16:18 2009 - INFO: Remove logical volumes for disk/0

    $

    At this point, all instances should be healthy again.

    Note: Ganeti 2.0 doesn't have the -a option to replace-disks, so for it you have to run the

    loop twice, once over primary instances with argument -p and once over secondary instances

    with argument -s, but otherwise the operations are similar:

    $ gnt-instance replace-disks -p instance1
    $ for i in instance{2..4}; do gnt-instance replace-disks -s $i; done

    Common cluster problems

    There are a number of small issues that might appear on a cluster that can be solved easily

    as long as the issue is properly identified. For this exercise we will consider the case of

    node3, which was broken previously and re-added to the cluster without reinstallation.

    Running cluster verify on the cluster reports:

    $ gnt-cluster verify
    Mon Oct 26 18:30:08 2009 * Verifying global settings
    Mon Oct 26 18:30:08 2009 * Gathering data (3 nodes)
    Mon Oct 26 18:30:10 2009 * Verifying node status
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 0 is in use
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 1 is in use
    Mon Oct 26 18:30:10 2009 * Verifying instance status
    Mon Oct 26 18:30:10 2009 - ERROR: instance instance4: instance should not run on
    Mon Oct 26 18:30:10 2009 * Verifying orphan volumes
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-7916
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af
    Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-7916
    Mon Oct 26 18:30:10 2009 * Verifying remaining instances
    Mon Oct 26 18:30:10 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 18:30:10 2009 * Other Notes
    Mon Oct 26 18:30:10 2009 * Hooks Results


    $

    Instance status

    As you can see, instance4 has a copy running on node3, because we forced the failover

    when node3 failed. This case is dangerous as the instance will have the same IP and MAC

    address, wreaking havoc on the network environment and on anyone who tries to use it.

    Ganeti doesn't directly handle this case. It is recommended to log on to node3 and run:

    $ xm destroy instance4
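    On hosts running the newer xl toolstack instead of xm, the equivalent would presumably be:

    $ xl destroy instance4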

    Unallocated DRBD minors

    There are still unallocated DRBD minors on node3. Again, these are not handled by Ganeti

    directly and need to be cleaned up via DRBD commands:

    $ ssh node3
    # on node 3
    $ drbdsetup /dev/drbd0 down
    $ drbdsetup /dev/drbd1 down
    $

    Orphan volumes

    At this point, the only remaining problem should be the so-called orphan volumes. This can

    also happen in the case of an aborted disk replacement, or a similar situation where Ganeti was not able to recover automatically. Here you need to remove them manually via LVM commands:

    $ ssh node3
    # on node3
    $ lvremove xenvg
    Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d
      Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data" successfully rem
    Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d
      Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta" successfully rem
    Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da
      Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data" successfully rem
    Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da
      Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta" successfully rem
    node3#

    At this point cluster verify shouldn't complain anymore:

    $ gnt-cluster verify
    Mon Oct 26 18:37:51 2009 * Verifying global settings
    Mon Oct 26 18:37:51 2009 * Gathering data (3 nodes)
    Mon Oct 26 18:37:53 2009 * Verifying node status
    Mon Oct 26 18:37:53 2009 * Verifying instance status
    Mon Oct 26 18:37:53 2009 * Verifying orphan volumes
    Mon Oct 26 18:37:53 2009 * Verifying remaining instances
    Mon Oct 26 18:37:53 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 18:37:53 2009 * Other Notes
    Mon Oct 26 18:37:53 2009 * Hooks Results


    $

    N+1 errors

    Since redundant instances in Ganeti have a primary/secondary model, each node needs to

    keep enough memory free so that, if one of its peer nodes fails, all the instances that have

    the failed node as primary and this node as secondary can be relocated to it. More

    specifically, if instance2 has node1 as primary and node2 as secondary (and node1 and

    node2 do not have any other instances in this layout), then node2 must have enough free

    memory so that if node1 fails, we can fail over instance2 without any other operations (to

    reduce the downtime window). Let's increase the memory of the current instances to 4G,

    and add three new instances, two on node2:node3 with 8GB of RAM and one on

    node1:node2, with 12GB of RAM (numbers chosen so that we run out of memory):

    $ gnt-instance modify -B memory=4G instance1
    Modified instance instance1
     - be/maxmem -> 4096
     - be/minmem -> 4096
    Please don't forget that these parameters take effect only at the next start of the
    $ gnt-instance modify
    $ gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap insta
    $ gnt-instance add -t drbd -n node2:node3 -s 512m -B memory=8G -o debootstrap insta
    $ gnt-instance add -t drbd -n node1:node2 -s 512m -B memory=8G -o debootstrap insta
    $ gnt-instance reboot --all
    The reboot will operate on 7 instances.
    Do you want to continue?
    Affected instances:
      instance1
      instance2
      instance3
      instance4
      instance5
      instance6
      instance7
    y/[n]/?: y
    Submitted jobs 677, 678, 679, 680, 681, 682, 683
    Waiting for job 677 for instance1...
    Waiting for job 678 for instance2...
    Waiting for job 679 for instance3...
    Waiting for job 680 for instance4...
    Waiting for job 681 for instance5...
    Waiting for job 682 for instance6...
    Waiting for job 683 for instance7...

    $

    We rebooted the instances for the memory changes to take effect. Now the cluster looks

    like:

    $ gnt-node list
    Node  DTotal DFree MTotal MNode MFree Pinst Sinst
    node1   1.3T  1.3T  32.0G  1.0G  6.5G     4     1
    node2   1.3T  1.3T  32.0G  1.0G 10.5G     3     4
    node3   1.3T  1.3T  32.0G  1.0G 30.5G     0     2
    $ gnt-cluster verify
    Mon Oct 26 18:59:36 2009 * Verifying global settings


    Mon Oct 26 18:59:36 2009 * Gathering data (3 nodes)
    Mon Oct 26 18:59:37 2009 * Verifying node status
    Mon Oct 26 18:59:37 2009 * Verifying instance status
    Mon Oct 26 18:59:37 2009 * Verifying orphan volumes
    Mon Oct 26 18:59:37 2009 * Verifying remaining instances
    Mon Oct 26 18:59:37 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 18:59:37 2009 - ERROR: node node2: not enough memory to accommodate in
    Mon Oct 26 18:59:37 2009 * Other Notes
    Mon Oct 26 18:59:37 2009 * Hooks Results

    $

    The cluster verify error above shows that if node1 fails, node2 will not have enough memory

    to failover all primary instances on node1 to it. To solve this, you have a number of options:

    try to manually move instances around (but this can become complicated for any

    non-trivial cluster)

    try to reduce the minimum memory of some instances on the source node of the N+1

    failure (in the example above node1): this will allow it to start and be failed

    over/migrated with less than its maximum memory

    try to reduce the runtime/maximum memory of some instances on the destination node

    of the N+1 failure (in the example above node2) to create additional available node

    memory (check the Ganeti administrator's guide for what Ganeti will and won't

    automatically do in regards to instance runtime memory modification)

    if Ganeti has been built with the htools package enabled, you can run the hbal tool,

    which will try to compute an automated cluster solution that complies with the N+1 rule

    (see the sketch after this list)
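    As a sketch of the last option, hbal can be pointed at the running cluster (this assumes
    the htools are installed; -L talks to the local Luxi interface and -C only prints the
    suggested gnt-* commands instead of executing anything):

    $ hbal -L -C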

    Network issues

    In case a node has problems with the network (usually the secondary network, as problems with the primary network will render the node unusable for Ganeti commands), it will show up

    in cluster verify as:

    $ gnt-cluster verify
    Mon Oct 26 19:07:19 2009 * Verifying global settings
    Mon Oct 26 19:07:19 2009 * Gathering data (3 nodes)
    Mon Oct 26 19:07:23 2009 * Verifying node status
    Mon Oct 26 19:07:23 2009 - ERROR: node node1: tcp communication with node 'node3':
    Mon Oct 26 19:07:23 2009 - ERROR: node node2: tcp communication with node 'node3':
    Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node1':
    Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node2':
    Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node3':
    Mon Oct 26 19:07:23 2009 * Verifying instance status
    Mon Oct 26 19:07:23 2009 * Verifying orphan volumes
    Mon Oct 26 19:07:23 2009 * Verifying remaining instances
    Mon Oct 26 19:07:23 2009 * Verifying N+1 Memory redundancy
    Mon Oct 26 19:07:23 2009 * Other Notes
    Mon Oct 26 19:07:23 2009 * Hooks Results
    $

    This shows that both node1 and node2 have problems contacting node3 over the secondary

    network, and node3 has problems contacting them. From this output it can be deduced that,

    since node1 and node2 can communicate between themselves, node3 is the one having problems, and you need to investigate its network settings/connection.
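    A quick manual test of the secondary network can confirm this; a sketch run from one of
    the healthy nodes, using the secondary addresses of the example cluster:

    $ ssh node1
    # on node1
    $ ping -c 3 192.0.2.3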


    Migration problems

    Since live migration can sometimes fail and leave the instance in an inconsistent state,

    Ganeti provides a --cleanup argument to the migrate command that does:

    check on which node the instance is actually running (has the command failed before

    or after the actual migration?)

    reconfigure the DRBD disks accordingly

    It is always safe to run this command as long as the instance has good data on its primary

    node (i.e. not showing as degraded). If so, you can simply run:

    $ gnt-instance migrate --cleanup instance1
    Instance instance1 will be recovered from a failed migration. Note
    that the migration procedure (including cleanup) is **experimental**
    in this version. This might impact the instance if anything goes
    wrong. Continue?
    y/[n]/?: y
    Mon Oct 26 19:13:49 2009 Migrating instance instance1
    Mon Oct 26 19:13:49 2009 * checking where the instance actually runs (if this hangs,
    Mon Oct 26 19:13:49 2009 * instance confirmed to be running on its primary node (no
    Mon Oct 26 19:13:49 2009 * switching node node1 to secondary mode
    Mon Oct 26 19:13:50 2009 * wait until resync is done
    Mon Oct 26 19:13:50 2009 * changing into standalone mode
    Mon Oct 26 19:13:50 2009 * changing disks into single-master mode
    Mon Oct 26 19:13:50 2009 * wait until resync is done
    Mon Oct 26 19:13:51 2009 * done
    $

    In use disks at instance shutdown

    If you see something like the following when trying to shut down an instance or deactivate

    its disks:

    $ gnt-instance shutdown instance1
    Mon Oct 26 19:16:23 2009 - WARNING: Could not shutdown block device disk/0 on node

    It most likely means something is holding open the underlying DRBD device. This can be

    bad if the instance is not running, as it might mean that there was concurrent access from

    both the node and the instance to the disks, but not always (e.g. you could only have had the partitions activated via kpartx).

    To troubleshoot this issue you need to follow standard Linux practices, and pay attention to

    the hypervisor being used:

    check if (in the above example) /dev/drbd0 on node2 is being mounted somewhere

    (cat /proc/mounts)

    check if the device is not being used by device mapper itself: dmsetup ls and look for

    entries of the form drbd0pX, and if so remove them with either kpartx -d or dmsetup

    remove

    For Xen, check if it's not using the disks itself:
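    One way to check is to list the block devices that dom0's blkback currently has attached;
    a sketch (this assumes the xenstore-ls utility from the Xen toolstack is available):

    $ xenstore-ls /local/domain/0/backend/vbd | grep -e "domain =" -e physical-device

    The physical-device entries are major:minor pairs in hexadecimal (DRBD's major number
    147 appears as 93), which can be compared against the output of ls -l /dev/drbd*.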
