Managing and Monitoring SUSE Enterprise Storage

Tim Serong
Senior Clustering Engineer
tserong@suse.com

Eric Jackson
Senior Software Developer, Distributed Storage
[email protected]

SUSE Enterprise Storage...in 30 seconds or less

• Massively scalable

• No bottlenecks or single points of failure

• Object storage, block storage

• Based on Ceph

• Data stored redundantly

• Lots of disks (OSDs) in lots of storage nodes

• A few monitor (MON) nodes

• All on commodity hardware

Deployment

DIY Deployment

• Boot a bunch of nodes

• Install SUSE Linux Enterprise Server 12

• Add SUSE Enterprise Storage

• Run ceph-deploy

# ceph-deploy new node1 node2 node3
# ceph-deploy mon create-initial
# ceph-deploy osd prepare node1:sdb
# ceph-deploy osd prepare node1:sdc
# ceph-deploy calamari --master node0 \
    connect node1 node2 node3 …
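
The command sequence above follows a regular pattern, so it can be generated for a larger cluster. The following is a minimal sketch, assuming only the `ceph-deploy` subcommands shown on the slide; the function name and its parameters are hypothetical, and the sketch only builds the command lines rather than running them.

```python
def ceph_deploy_commands(mons, osd_disks, calamari_master):
    """Build the ceph-deploy invocations from the slide as argument lists.

    mons            -- hostnames of the initial monitor nodes
    osd_disks       -- mapping of hostname -> list of disks to prepare as OSDs
    calamari_master -- hostname of the Calamari node
    """
    cmds = [
        ["ceph-deploy", "new", *mons],
        ["ceph-deploy", "mon", "create-initial"],
    ]
    # One "osd prepare" call per disk, in the node:disk form used on the slide.
    for node, disks in osd_disks.items():
        for disk in disks:
            cmds.append(["ceph-deploy", "osd", "prepare", f"{node}:{disk}"])
    # Connect every known node to Calamari, without listing a node twice.
    all_nodes = list(dict.fromkeys([*mons, *osd_disks]))
    cmds.append(
        ["ceph-deploy", "calamari", "--master", calamari_master,
         "connect", *all_nodes]
    )
    return cmds


if __name__ == "__main__":
    plan = ceph_deploy_commands(
        mons=["node1", "node2", "node3"],
        osd_disks={"node1": ["sdb", "sdc"]},
        calamari_master="node0",
    )
    for cmd in plan:
        print("# " + " ".join(cmd))
```

Printing the plan first and running it by hand keeps the order of the steps visible, which matters because the monitors must exist before OSDs are prepared.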

Or, Take a Crowbar to it

• Same technology as SUSE OpenStack Cloud

• Install one admin node

• PXE boot everything else

• Click to deploy

iSCSI

Configuring iSCSI today

• RBD mapped devices

• targetcli
  - interactive
  - command line

• iblock backstore

Configuring iSCSI challenges

• Simple configuration requires a dozen steps

• Sometimes command order matters

• Locally saved configuration

• Synchronizing redundancy across gateways

• Experimentation can be cumbersome

lrbd

• Uses rbd and targetcli

• Configuration stored in Ceph

• Synchronization is automatic

• Experimentation is quick

• Configuration format is JSON

• Command line options, man pages
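
The idea of one JSON configuration stored centrally, with each gateway applying only its own part, can be sketched as follows. This is an illustration under stated assumptions, not the lrbd schema: the field names, the IQN, the host names and the addresses are all hypothetical, and the samples shipped with the package are the authoritative reference for the real format.

```python
import json

# Hypothetical shared configuration; in lrbd's case the JSON lives in Ceph,
# here it is an inline string so the sketch is self-contained.
SHARED_CONFIG = json.loads("""
{
  "targets": [
    {"target": "iqn.2015-09.example:demo",
     "hosts": [{"host": "igw1", "portal": "portal1"},
               {"host": "igw2", "portal": "portal2"}]}
  ],
  "portals": [
    {"name": "portal1", "addresses": ["192.168.124.91"]},
    {"name": "portal2", "addresses": ["192.168.124.92"]}
  ]
}
""")


def portals_for(config, hostname):
    """Return {target: [addresses]} that the given gateway should serve."""
    addresses = {p["name"]: p["addresses"] for p in config["portals"]}
    result = {}
    for target in config["targets"]:
        for host in target["hosts"]:
            if host["host"] == hostname:
                result.setdefault(target["target"], []).extend(
                    addresses[host["portal"]]
                )
    return result


if __name__ == "__main__":
    for gateway in ("igw1", "igw2"):
        print(gateway, portals_for(SHARED_CONFIG, gateway))
```

Because every gateway reads the same document and filters by its own hostname, there is no per-gateway file to keep in sync by hand.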

Lrbd information

• GitHub: https://github.com/SUSE/lrbd

• Wiki tutorial: https://github.com/SUSE/lrbd.wiki

• 30 configuration samples

- /usr/share/doc/packages/lrbd/samples

Monitoring

Is it Working?

# ceph status

    cluster 565bbaaf-11e9-4105-934a-6b468f0b7b7e
     health HEALTH_OK
     monmap e1: 1 mons at {node1=192.168.124.81:6789/0}
            election epoch 1, quorum 0 node1
     osdmap e12: 2 osds: 2 up, 2 in
      pgmap v118: 64 pgs, 1 pools, 1024 kB data, 3 objects
            74192 kB used, 38817 MB / 38889 MB avail
                  64 active+clean

# ceph status

    cluster 565bbaaf-11e9-4105-934a-6b468f0b7b7e
     health HEALTH_WARN 33 pgs degraded; 35 pgs stuck...
     monmap e1: 3 mons at {ceph2=...,ceph3=...,ceph4=...},
            election epoch 22, quorum 0,1,2 ceph2,ceph3,...
     osdmap e411: 52 osds: 52 up, 52 in
      pgmap v1014: 4288 pgs, 4 pools, 0 bytes data, ...
            2466 MB used, 12659 GB / 12662 GB avail
                  33 active+degraded
                   2 active+remapped
                4253 active+clean

# ceph status

    cluster c9d3ae97-2f4c-4d91-a3f7-ff42bce754df
     health HEALTH_WARN 2174 pgs backfill; 367 pgs backfilling; 3271 pgs degraded; 23 pgs down; 57 pgs peering; 35 pgs recovering; 188 pgs recovery_wait; 227 pgs stale; 26 pgs stuck inactive; 3065 pgs stuck unclean; recovery 2519083/14004502 objects degraded (17.988%); 1/148 in osds are down
     monmap e3: 3 mons at {a001=172.16.25.1:6789/0,a002=172.168.25.2:6789/0,a003=172.16.25.3:6789/0}, election ...
     osdmap e51357: 168 osds: 147 up, 148 in
      pgmap v7243525: 20480 pgs, 5 pools, 1183 GB data, 5609 objects
            3330 GB used, 88846 GB / 92177 GB avail
            2519083/14004502 objects degraded (17.988%)
                   9 inactive
               16755 active+clean
                  10 degraded+remapped
                   9 active+degraded+remapped
                 361 active+degraded+remapped+backfilling
                  39 stale+active+degraded+remapped+wait_backfill
                  30 peering
                 154 active+recovery_wait
                 171 stale+active+clean
                  22 active+recovery_wait+degraded+remapped
                  14 active+remapped+wait_backfill
                  23 down+peering
                   3 stale+active+degraded+remapped+backfilling
                   3 stale+active+recovery_wait
                   2 active+remapped
                  19 degraded
                 673 active+degraded
                   3 active+remapped+backfilling
                   4 remapped+peering
                   6 active+recovery_wait+remapped
                   3 active+recovery_wait+degraded
                  11 stale+active+degraded
                2121 active+degraded+remapped+wait_backfill
                  35 active+recovering

• What do we care about, right now?
  ‒ Overall cluster health
  ‒ MON quorum
  ‒ OSD status
  ‒ PG status
  ‒ Disk used/free
  ‒ Is anything dead?
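
The items in this list can all be read out of the plain-text `ceph status` output. The following is a rough sketch, assuming the output layout of the healthy example in this deck; the function name, the returned keys and the regular expressions are illustrative, and the text layout may differ between Ceph releases.

```python
import re

# Sample text in the layout of the healthy `ceph status` example.
SAMPLE = """\
    cluster 565bbaaf-11e9-4105-934a-6b468f0b7b7e
     health HEALTH_OK
     monmap e1: 1 mons at {node1=192.168.124.81:6789/0}
            election epoch 1, quorum 0 node1
     osdmap e12: 2 osds: 2 up, 2 in
      pgmap v118: 64 pgs, 1 pools, 1024 kB data, 3 objects
            74192 kB used, 38817 MB / 38889 MB avail
                  64 active+clean
"""


def summarize(status_text):
    """Extract health, MON quorum, OSD counts and PG states from the text."""
    health = re.search(r"health (HEALTH_\w+)", status_text).group(1)
    mons = int(re.search(r"(\d+) mons at", status_text).group(1))
    quorum = re.search(r"quorum ([\d,]+)", status_text).group(1).split(",")
    total, up, in_ = map(
        int, re.search(r"(\d+) osds: (\d+) up, (\d+) in", status_text).groups()
    )
    # PG state lines consist of just a count and a '+'-joined state name.
    pg_states = {
        state: int(count)
        for count, state in re.findall(
            r"^[ \t]+(\d+) ([a-z_+]+)[ \t]*$", status_text, re.MULTILINE
        )
    }
    return {
        "health": health,
        "mons": mons,
        "quorum_size": len(quorum),
        "osds_total": total,
        "osds_up": up,
        "osds_in": in_,
        "pg_states": pg_states,
        "all_pgs_clean": set(pg_states) == {"active+clean"},
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

Comparing `osds_up` with `osds_total`, and `quorum_size` with `mons`, answers "is anything dead?" at a glance.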

Is it Working Well?

• What do we care about, long term?
  ‒ What's CPU bound?
  ‒ What's disk bound?
  ‒ What's network bound?

Enter Calamari and Romana

• Ceph GUI, included with SUSE Enterprise Storage

• Calamari is the backend (REST API)

• Romana is the frontend (GUI)

• Provides monitoring and some management

Cluster Status

Cluster Performance

Management

When Everything is Fine

• OSD management

• Pool / Placement Group management

• iSCSI targets

• Users, authentication

• Adding new nodes, disks, etc.

Cluster Settings

OSD Management

Pool / Placement Group Management

CLI Tools for...

• iSCSI targets

• Users, authentication

• Adding new nodes, disks, etc.

When Everything is Not Fine

• Dead disks

• Dead nodes

• Half the building is on fire...

Calamari will tell you about it...

...and help you find the problem

Again, Other Tools for...

• Redeploying

• Adding new nodes

• Replacing OSDs

Beneath the Surface

Calamari Consists Of...

• Romana (the frontend)

• Calamari (the backend REST API)

• Salt (communication and minor configuration)

• Graphite (metrics, graphs)
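
Since the metrics end up in Graphite, they can also be pulled out directly through Graphite's render endpoint. The sketch below only builds the request URL; the base URL and the metric path are hypothetical, as the actual location and metric names depend on how Calamari and the collectors are set up.

```python
from urllib.parse import urlencode


def graphite_render_url(base_url, target, since="-1h"):
    """Build a Graphite render URL that returns datapoints as JSON."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


if __name__ == "__main__":
    # Both arguments are placeholders, not values taken from a real cluster.
    print(graphite_render_url("http://graphite.example",
                              "servers.node1.cpu.total.user"))
```

Fetching that URL with any HTTP client returns one series per matching metric, which is enough to feed an external dashboard or alerting script.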

Usually...

# zypper in romana
# calamari-ctl initialize
[INFO] Loading configuration…
[INFO] Starting/enabling salt...
...
Username (leave blank to use 'root'):
...
# ceph-deploy calamari --master node0 \
    connect node1 node2 node3 …

In Practice, Right Now

• Calamari Node:
  ‒ Romana, Calamari, Salt Master, Graphite/Carbon

• Storage Nodes:
  ‒ Salt Minion, Diamond

Variations

• More Salt with your Calamari?

• Longer metric retention
  /etc/carbon/storage-schemas.conf

• BYO Graphite

• Less Salt with your Calamari?
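
Longer retention costs disk space on the Calamari node, and the cost per metric can be estimated before editing `storage-schemas.conf`. The sketch below assumes the Whisper on-disk layout of a 16-byte header, 12 bytes per archive and 12 bytes per datapoint; the retention strings are examples only, not Calamari's shipped defaults.

```python
# Seconds per unit suffix accepted in a retention definition.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_retention(spec):
    """Turn '60s:1d' into (seconds_per_point, number_of_points)."""
    precision, duration = spec.split(":")
    step = int(precision[:-1]) * UNITS[precision[-1]]
    span = int(duration[:-1]) * UNITS[duration[-1]]
    return step, span // step


def whisper_file_size(retentions):
    """Estimate the size in bytes of one Whisper file for a retentions line."""
    archives = [parse_retention(r.strip()) for r in retentions.split(",")]
    points = sum(count for _, count in archives)
    return 16 + 12 * len(archives) + 12 * points


if __name__ == "__main__":
    for line in ("60s:1d,15m:7d", "60s:7d,15m:30d,1h:1y"):
        print(f"{line}: {whisper_file_size(line)} bytes per metric")
```

Multiplying the per-metric size by the number of metrics each storage node reports gives a first estimate of the extra space a longer retention needs.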

Alternatives

• Intel Virtual Storage Manager

• InkScope

• openATTIC

Virtual Storage Manager

• Intel

• OpenStack Horizon interface

• Django, Python

• Released 2.0 beta 1 on August 21

• Packages incomplete at OBS

- https://build.opensuse.org/project/show/home:swiftgist:vsm

InkScope

• Orange Labs

• AngularJS and Python REST API

• RADOS Gateway user management

• Packages at OBS

- https://build.opensuse.org/package/show/home:swiftgist/inkscope

Images from https://github.com/inkscope/inkscope/tree/master/screenshots

openATTIC

• IT-Novum GmbH

• AngularJS and Python REST API

• Released 2.0.1 on July 21

• Complete Linux storage management system

• Demo

- http://demo.openattic.org/openattic/

Questions?

Thank you.

For more information about SUSE Enterprise Storage:
http://suse.com/storage

Corporate Headquarters
Maxfeldstrasse 5
90409 Nuremberg
Germany

+49 911 740 53 0 (Worldwide)
www.suse.com

Join us on:
www.opensuse.org

Unpublished Work of SUSE LLC. All Rights Reserved.
This work is an unpublished work and contains confidential, proprietary and trade secret information of SUSE LLC. Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.

General Disclaimer
This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for SUSE products remains at the sole discretion of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.