Ceph Day Bring Ceph To Enterprise


Transcript of Ceph Day Bring Ceph To Enterprise

Page 1: Ceph Day Bring Ceph To Enterprise

Bring Ceph to Enterprise: Set up a 50T mobile cluster in 30 min

Alex Lau (劉俊賢), Software Consultant, [email protected]

Page 2: Ceph Day Bring Ceph To Enterprise

How to access Ceph storage? Introduction of iSCSI

[Diagram: heterogeneous OS access to a remote Ceph cluster (data encrypted at rest) through the RADOS gateway (RESTful API) and iSCSI, alongside the block storage, file system, and object storage interfaces; monitor nodes and a management node are shown.]

Page 3: Ceph Day Bring Ceph To Enterprise

SUSE Enterprise Storage 3

The first commercially available iSCSI access to SES3. It allows clients to access Ceph storage remotely over the TCP/IP-based iSCSI protocol. SES3 provides an iSCSI target driver on top of RBD (RADOS Block Device), so any iSCSI initiator can access SES3 over the network.

Page 4: Ceph Day Bring Ceph To Enterprise

iSCSI Architecture Technical Background

Protocol: ‒ Block storage access over TCP/IP

‒ Initiators: the clients that access the iSCSI target over TCP/IP

‒ Targets: the servers that provide access to a local block device

SCSI and iSCSI: ‒ iSCSI encapsulates SCSI commands and responses

‒ Each iSCSI TCP packet carries a SCSI command

Remote access (see the sketch below): ‒ iSCSI initiators can access a remote block device like a local disk

‒ Attach and format with XFS, Btrfs, etc.

‒ Booting directly from an iSCSI target is supported
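A minimal sketch of what initiator-side access looks like on Linux with open-iscsi; the portal address and target IQN are made-up examples, and the commands are run here through Python only for illustration.

```python
# Sketch: discover and log in to an iSCSI target from a Linux initiator using
# open-iscsi; the LUN then appears as a local block device (/dev/sdX).
# The portal address and target IQN below are hypothetical examples.
import subprocess

PORTAL = "192.168.100.10:3260"                   # hypothetical gateway portal
TARGET = "iqn.2016-06.org.example:ses3-demo"     # hypothetical target IQN

# 1. Discover targets exported by the portal.
subprocess.run(["iscsiadm", "-m", "discovery", "-t", "sendtargets",
                "-p", PORTAL], check=True)

# 2. Log in; the kernel creates a block device for each LUN.
subprocess.run(["iscsiadm", "-m", "node", "-T", TARGET,
                "-p", PORTAL, "--login"], check=True)

# 3. The device can now be formatted (e.g. XFS) and mounted like a local disk.
```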

Page 5: Ceph Day Bring Ceph To Enterprise

Before iSCSI RBD support

[Diagram: an initiator system connects over the public network to a target system that exposes an RBD block device through LIO as iSCSI, in front of OSD1-OSD4.]

Page 6: Ceph Day Bring Ceph To Enterprise

Before iSCSI support, what's wrong? Missing features

LIO over RBD: ‒ It doesn't support "atomic compare and write"

‒ It doesn't support "persistent group reservations"

iSCSI: ‒ Active/Active multipath (MPIO) is not supported

‒ Supporting all of these at the block layer requires a different approach

Page 7: Ceph Day Bring Ceph To Enterprise

Benefits of the iSCSI LIO gateway for RBD

Multi-platform access to Ceph: ‒ Clients don't need to be part of the cluster, similar to radosgw access

Standard iSCSI interface: ‒ Most operating systems support iSCSI

‒ open-iscsi is available in most Linux distributions

LIO (Linux IO Target): ‒ In-kernel SCSI target implementation

Flexible configuration: ‒ The targetcli utility is available along with lrbd

Page 8: Ceph Day Bring Ceph To Enterprise

Configuring the RBD iSCSI gateway: Introduction of lrbd

Easy setup: ‒ Packages bundled with iSCSI support since SES 2.0

‒ Multi-node configuration supported with targetcli

Technical background: ‒ JSON configuration format (see the illustrative sketch after the links below)

‒ Targets, Portals, Pools, Auth

‒ Configuration state is stored in the Ceph cluster

Related links: ‒ https://github.com/swiftgist/lrbd

‒ https://github.com/swiftgist/lrbd/wiki
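The sketch below only indicates the general shape of the JSON that lrbd consumes, with the four sections the slide lists (auth, portals, targets, pools); field names, IQNs, and addresses are illustrative placeholders, and the lrbd wiki linked above documents the authoritative schema.

```python
# Indicative sketch of an lrbd-style configuration, built as a Python dict
# and dumped as JSON. All names/addresses are hypothetical examples.
import json

config = {
    "auth": [
        {"target": "iqn.2016-06.org.example:demo", "authentication": "none"}
    ],
    "portals": [
        {"name": "portal-east", "addresses": ["192.168.100.10"]}   # gateway IP (example)
    ],
    "targets": [
        {"target": "iqn.2016-06.org.example:demo",
         "hosts": [{"host": "igw1", "portal": "portal-east"}]}      # gateway host (example)
    ],
    "pools": [
        {"pool": "rbd",
         "gateways": [{"target": "iqn.2016-06.org.example:demo",
                       "tpg": [{"image": "demo-image"}]}]}
    ],
}

print(json.dumps(config, indent=2))   # lrbd itself keeps this state in the Ceph cluster
```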

Page 9: Ceph Day Bring Ceph To Enterprise


iSCSI Gateway Optimizations

Efficient handling of certain SCSI operations: ‒ Offload RBD image IO to the OSDs

‒ Avoid locking on the iSCSI gateway nodes

Compare and Write: ‒ New cmpext OSD operation to handle RBD data comparison

‒ Dispatched as a compound cmpext+write OSD request

Write Same: ‒ New writesame OSD operation to expand duplicate data at the OSD

Reservations: ‒ State stored as an RBD image extended attribute

‒ Updated using a compound cmpxattr+setxattr OSD request

Page 10: Ceph Day Bring Ceph To Enterprise

Multiple Path Support with iSCSI on RBD

[Diagram: an iSCSI initiator with multiple paths to two iSCSI gateway nodes, each running the RBD module and exporting the same RBD image; the gateways sit on the public network in front of OSD1-OSD4, which communicate over the cluster network.]

Page 11: Ceph Day Bring Ceph To Enterprise

How to manage storage growth and the costs of Ceph?

Easily scale and manage data storage

Control storage growth and manage costs

Support today's investment and adapt to the future

Page 12: Ceph Day Bring Ceph To Enterprise

Introduction to openATTIC

Easily scale and manage data storage

Page 13: Ceph Day Bring Ceph To Enterprise

SUSE Enterprise Storage Management Vision

Open source: ‒ An alternative to proprietary storage management systems

Enterprise: ‒ Works as expected with a traditional unified storage interface, e.g. NAS, SAN

SDS support: ‒ Provides initial Ceph setup, management, and monitoring to ease complicated scale-out scenarios

It will be available in the next SES release, or download it now at

https://build.opensuse.org/package/show/filesystems:openATTIC/openattic

Page 14: Ceph Day Bring Ceph To Enterprise

openATTIC Features: Existing capability

Modern Web UI

RESTful API ‒ Software Defined Storage

Unified Storage ‒ NAS (NFS, CIFS, HTTP)

‒ SAN (iSCSI, Fibre Channel)

Volume Mirroring ‒ DRBD

File Systems ‒ LVM, XFS, ZFS, Btrfs, ext3/4

Monitoring ‒ Nagios / Icinga built in

‒ Ceph Management (WIP)

Page 15: Ceph Day Bring Ceph To Enterprise

openATTIC Architecture: Technical Detail

Backend: ‒ Python (Django)

‒ Django REST Framework

‒ Nagios / Icinga & PNP4Nagios

‒ Linux tools ‒ LVM, LIO, DRBD

‒ Ceph API ‒ librados, librbd

Web Frontend: ‒ AngularJS

‒ Bootstrap

‒ REST API

Automated Test Suites: ‒ Python unit tests

‒ Gatling ‒ RESTful API

‒ Protractor / Jasmine ‒ Web UI tests
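A minimal sketch of the Ceph API layer openATTIC builds on, using the Python bindings for librados and librbd; it assumes a reachable cluster, a standard /etc/ceph/ceph.conf with an admin keyring, and an existing pool named "rbd".

```python
# Connect to the cluster, list pools, and list RBD images in one pool.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    print("cluster FSID:", cluster.get_fsid())
    print("pools:", cluster.list_pools())

    ioctx = cluster.open_ioctx("rbd")        # pool name is an assumption
    try:
        print("images:", rbd.RBD().list(ioctx))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```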

Page 16: Ceph Day Bring Ceph To Enterprise

openATTIC Architecture: High-Level Overview

[Diagram: a Web UI / REST client talks HTTP to the openATTIC RESTful API (Django, backed by PostgreSQL and a NoDB layer); openATTIC reaches the Ceph storage cluster via librados/librbd and drives Linux OS tools through systemd, D-Bus, and the shell.]

Page 17: Ceph Day Bring Ceph To Enterprise

openATTIC Development: Current Status

- Create and map RBDs as block devices (volumes) (see the sketch below)
- Pool management Web UI (table view)
- OSD management Web UI (table view)
- RBD management Web UI (add/delete, table view)
- Monitor cluster health and performance
- Support for managing Ceph with Salt integration (WIP)
- Role management: node, monitor, storage, CephFS, iSCSI, radosgw
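A hedged sketch of the first item, creating an RBD volume with the same Python bindings; the pool name, image name, and size are examples. Mapping the image as a block device on a client is typically done with the rbd CLI (e.g. "rbd map <pool>/<image>").

```python
# Create a 10 GiB RBD image in the 'rbd' pool and list the pool's images.
import rados
import rbd

GiB = 1024 ** 3

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")            # assumes the 'rbd' pool exists
try:
    rbd.RBD().create(ioctx, "demo-volume", 10 * GiB)   # example name and size
    print(rbd.RBD().list(ioctx))
finally:
    ioctx.close()
    cluster.shutdown()
```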

Page 18: Ceph Day Bring Ceph To Enterprise

Volume Management

Page 19: Ceph Day Bring Ceph To Enterprise

Pool Listing

Page 20: Ceph Day Bring Ceph To Enterprise

OSD Listing

Page 21: Ceph Day Bring Ceph To Enterprise

RBD Listing

Page 22: Ceph Day Bring Ceph To Enterprise

oA Ceph Roadmap: the future is in your hands

- Ceph cluster status dashboard incl. performance graphs
- Extend pool management
- OSD monitoring/management
- RBD management/monitoring
- CephFS management
- RGW management (users, buckets, keys)
- Deployment, remote configuration of Ceph nodes (via Salt)
- Public roadmap on the openATTIC wiki to solicit community feedback: http://bit.ly/28PCTWf

Page 23: Ceph Day Bring Ceph To Enterprise

How does Ceph control storage cost?

Control storage growth and manage costs

Page 24: Ceph Day Bring Ceph To Enterprise

Minimal recommendations

OSD storage node: ‒ 2GB RAM per OSD

‒ 1.5GHz CPU core per OSD

‒ 10GbE public and backend networks

‒ 4GB RAM for a cache tier (see the sizing sketch below)

MON monitor node: ‒ 3 MONs minimum

‒ 2GB RAM per node

‒ SSD for the system OS

‒ MON and OSD should not be virtualized

‒ Bonded 10GbE
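A rough sizing helper that simply encodes the minimums above (2GB RAM and one 1.5GHz core per OSD, plus 4GB RAM when a cache tier is used); purely illustrative, not an official sizing tool.

```python
# Estimate per-node minimums for an OSD storage node from the OSD count.
def osd_node_minimums(osds_per_node: int, cache_tier: bool = False) -> dict:
    ram_gb = 2 * osds_per_node + (4 if cache_tier else 0)   # 2GB/OSD (+4GB cache tier)
    return {
        "ram_gb": ram_gb,
        "cpu_cores_1_5ghz": osds_per_node,                  # one 1.5GHz core per OSD
        "network": "10GbE public + backend",
    }

# Example: a 12-OSD node acting as a cache tier -> 28GB RAM, 12 cores.
print(osd_node_minimums(12, cache_tier=True))
```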

Page 25: Ceph Day Bring Ceph To Enterprise

SUSE Storage Pricing

[Chart: relative price positioning of JBOD storage, entry-level disk array, mid-range array, mid-range NAS, fully featured NAS device, high-end disk array, and SUSE Enterprise Storage.]

Page 26: Ceph Day Bring Ceph To Enterprise

Use storage with multiple tiers

[Diagram: write applications (e.g. video recording, lots of IoT data) land on a write tier (hot pool), and read applications (e.g. video streaming, big data analysis) are served from a read tier (hot pool); both sit in front of a normal tier (cold pool) in the SUSE Enterprise Storage cluster.]
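One common way to build such a hot/cold split is Ceph cache tiering. A minimal sketch follows, assuming both pools already exist with suitable CRUSH rules; the pool names are placeholders, and the ceph CLI is driven through Python only for illustration.

```python
# Attach a hot (e.g. SSD-backed) pool as a writeback cache tier in front of a cold pool.
import subprocess

def ceph(*args):
    subprocess.run(["ceph", *args], check=True)

COLD, HOT = "cold-pool", "hot-pool"     # hypothetical pool names

ceph("osd", "tier", "add", COLD, HOT)                 # attach the cache tier
ceph("osd", "tier", "cache-mode", HOT, "writeback")   # absorb writes in the hot tier
ceph("osd", "tier", "set-overlay", COLD, HOT)         # redirect client IO through it
```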

Page 27: Ceph Day Bring Ceph To Enterprise

How to create multiple price points?

$1000 = 1000G at 2000MB/s rw; 4 PCIe drives = $4000 = 8000MB/s rw; 4T storage, 400,000 IOPS, $4 per GB

$250 = 1000G at 500MB/s rw; 16 drives = $4000 = 8000MB/s rw; 16T storage, 100,000 IOPS, $1 per GB

$250 = 8000G at 150MB/s rw; 16 drives = $4000 = 2400MB/s rw; 128T storage, 2,000 IOPS, $0.1 per GB

Page 28: Ceph Day Bring Ceph To Enterprise

Control Costs

How does EC reduce storage cost?

Replication pool (SES Ceph cluster): each object is stored as multiple full copies.

Erasure coded pool (SES Ceph cluster): each object is split into data chunks plus parity chunks (e.g. data, data, data, data, parity, parity).

Multiple copies of stored data: • 300% cost of the data size • Low latency, faster recovery

Single copy with parity: • 150% cost of the data size • The data/parity ratio is a trade-off against CPU
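A worked example of the quoted overheads: 3-way replication stores every byte three times (300%), while an erasure-coded pool with k data chunks and m parity chunks stores (k + m) / k of the data size; k=4, m=2 is just one profile that yields 150%.

```python
# Raw-capacity overhead as a percentage of the stored data size.
def replication_overhead(copies: int) -> float:
    return copies * 100.0

def ec_overhead(k: int, m: int) -> float:
    return (k + m) / k * 100.0

print(replication_overhead(3))   # 300.0 -> 3x raw capacity per byte stored
print(ec_overhead(4, 2))         # 150.0 -> the 150% figure above
print(ec_overhead(5, 2))         # 140.0 -> the EC 5+2 profile used on the next slide
```

With EC 5+2, 336T of raw HDD yields roughly 336 * 5/7 ≈ 240T usable, which matches the "about 250T" figure on the next slide.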

Page 29: Ceph Day Bring Ceph To Enterprise

Public Cloud Setup

H270-H70 at $40,000:
- 48 cores * 8 = 384 cores
- 32G * 32 = 1T memory
- 1T * 16 = 16T SSD
- 40GbE * 8

R120-T30 at $5,700 * 7:
- 48 cores * 7 = 336 cores
- 8 * 16G * 7 = 896G memory
- 1T * 2 * 7 = 14T SSD
- 8T * 6 * 7 = 336T HDD
- 40GbE * 7
- 10GbE * 14

1,000 customers running $5 web hosting = $5,000; 8 months = $40,000

EC 5+2 gives about 250T; 2,500 customers with 100GB of $2 storage = $5,000; 8 months = $40,000

Page 30: Ceph Day Bring Ceph To Enterprise

For developers?

[Diagram: a three-node mobile cluster (MON1 + OSD1-4, MON2 + OSD5-8, MON3 + OSD9-12) connected over a dual 1G network.]

Per node: $300 box; 3 * 6T HDD at $220 each = $660; 512G SSD = $150.

Page 31: Ceph Day Bring Ceph To Enterprise

Pros and cons of this mobile cluster

Price: ‒ Around $3,200, versus expensive laptops

Size:

‒ At 50T and 20kg it is mobile enough to demo a usable cluster

‒ Real HDDs are better for presenting a storage solution

Benchmark:

‒ Aside from networking capability, all features and requirements of a Ceph cluster are met

Features:

‒ A great fit for developers and testers to run software-based tests that can't be done in VMs

Page 32: Ceph Day Bring Ceph To Enterprise

How does the DevOps story fit? Introducing Salt

Support today's investment and adapt to the future

Page 33: Ceph Day Bring Ceph To Enterprise

Salt-enabled Ceph: Existing capability

sesceph ‒ A Python API library that helps deploy and manage Ceph

‒ Already upstream in Salt, available in the next release

‒ https://github.com/oms4suse/sesceph

python-ceph-cfg ‒ A Python Salt module that uses sesceph to deploy

‒ https://github.com/oms4suse/python-ceph-cfg

Both libraries already ship with SES 3.0

Page 34: Ceph Day Bring Ceph To Enterprise

Why Salt? Existing capability

Product setup

‒ SUSE OpenStack Cloud, SUSE Manager, and SUSE Enterprise Storage all come with Salt enabled

Parallel execution (see the sketch below)

‒ E.g. compared to ceph-deploy when preparing OSDs

Customized Python modules

‒ Continuous development on the Python API, easy to manage

Flexible configuration

‒ Jinja2 + YAML by default (stateconf)

‒ pydsl if you prefer Python directly; JSON, pyobjects, etc.
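A minimal sketch of the parallel-execution point using Salt's Python client, which runs the same function on every matching minion at once instead of node by node; it assumes a working salt-master with accepted minion keys, and the target patterns are examples.

```python
# Run functions on many minions in parallel from the Salt master.
import salt.client

local = salt.client.LocalClient()

# Ping every minion in parallel.
print(local.cmd("*", "test.ping"))

# Inspect disks on all storage nodes (target pattern is a placeholder).
print(local.cmd("osd-node*", "cmd.run", ["lsblk -d -o NAME,SIZE,ROTA"]))
```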

Page 35: Ceph Day Bring Ceph To Enterprise

Create a cluster with a single stage file

https://github.com/AvengerMoJo/Ceph-Saltstack/blob/master/stages/ses/ceph/ceph_create.sls

This showcases a simple way to create a cluster with a single stage file.

You can easily customize it to create your own.

Page 36: Ceph Day Bring Ceph To Enterprise

Quick deployment example

Git repo for fast deployment and benchmarking: https://github.com/AvengerMoJo/Ceph-Saltstack

Demo recording: https://asciinema.org/a/4hmdsrksn0fd8fgpssdgqsjdb

1) Salt setup
2) Git clone and copy the modules into the Salt _modules directory
3) saltutil.sync_all to push them to all minion nodes
4) ntp_update all nodes
5) Create new MONs and create keys
6) Clean disk partitions and prepare OSDs
7) Update the crushmap (sketched below)
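A rough sketch of driving these steps from the Salt master: saltutil.sync_all, state.apply, and test.ping are standard Salt functions, while the Ceph-specific module and state names are placeholders for the custom modules shipped in the Ceph-Saltstack repository above.

```python
# Drive the deployment sequence from the Salt master (illustrative only).
import salt.client

local = salt.client.LocalClient()

local.cmd("*", "test.ping")               # 1) verify Salt is set up on all minions
# 2) git clone the repo and copy its modules into the master's _modules directory
local.cmd("*", "saltutil.sync_all")       # 3) push the custom modules to all minions
local.cmd("*", "state.apply", ["ntp"])    # 4) time sync (state name is an example)
# 5)-7) create MONs/keys, clean and prepare OSD disks, and update the crushmap
#        via the repo's custom modules, e.g. local.cmd("mon*", "<custom>.mon_create")
```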

Page 37: Ceph Day Bring Ceph To Enterprise

Reduce storage costs and management with SUSE Enterprise Storage

Manage Less

Adapt Quickly

Control Costs

Page 38: Ceph Day Bring Ceph To Enterprise
Page 39: Ceph Day Bring Ceph To Enterprise

Scale storage from terabytes to hundreds of petabytes without downtime

SOCIAL MEDIA · BUSINESS OPERATIONS · MOBILE DATA · CUSTOMER DATA · 100% UPTIME