Accelerate Ceph By SPDK on AArch64


Transcript of Accelerate Ceph By SPDK on AArch64

Page 1: Accelerate Ceph By SPDK on AArch64

© 2018 Arm Limited

Jun He, [email protected]

Tone Zhang, [email protected]

2018/3/9

Accelerate Ceph By SPDK on AArch64

Page 2: Accelerate Ceph By SPDK on AArch64


SPDK

Page 3: Accelerate Ceph By SPDK on AArch64


What’s SPDK?

• Storage Performance Development Kit

• A set of tools and libraries for building high-performance, scalable, user-mode storage applications

• Designed for new storage hardware (NVMe); can achieve millions of IOPS per core, with better tail latency (a minimal user-mode sketch follows below)

[Architecture diagram]
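To make "user-mode storage application" concrete, here is a minimal sketch modelled on SPDK's hello_world example: it initializes the SPDK environment and enumerates NVMe controllers entirely from user space. The calls used are public SPDK API functions; the application name and the omission of error handling and actual I/O are simplifications for illustration, not how any particular product does it.

```c
/* Minimal sketch: probe NVMe controllers from user space with SPDK.
 * Modelled loosely on SPDK's hello_world example; not production code. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#include "spdk/env.h"
#include "spdk/nvme.h"

/* Called for every NVMe controller found; return true to attach to it. */
static bool
probe_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
         struct spdk_nvme_ctrlr_opts *opts)
{
        printf("Probing %s\n", trid->traddr);
        return true;
}

/* Called once a controller is attached; list its active namespaces. */
static void
attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
          struct spdk_nvme_ctrlr *ctrlr,
          const struct spdk_nvme_ctrlr_opts *opts)
{
        uint32_t nsid;

        for (nsid = 1; nsid <= spdk_nvme_ctrlr_get_num_ns(ctrlr); nsid++) {
                struct spdk_nvme_ns *ns = spdk_nvme_ctrlr_get_ns(ctrlr, nsid);
                if (ns != NULL && spdk_nvme_ns_is_active(ns)) {
                        printf("  namespace %u: %ju sectors\n", nsid,
                               (uintmax_t)spdk_nvme_ns_get_num_sectors(ns));
                }
        }
}

int
main(void)
{
        struct spdk_env_opts opts;

        spdk_env_opts_init(&opts);
        opts.name = "probe_example";      /* arbitrary application name */
        if (spdk_env_init(&opts) < 0) {
                fprintf(stderr, "spdk_env_init failed\n");
                return 1;
        }

        /* Scan the local PCIe bus for NVMe controllers (UIO/VFIO bound). */
        if (spdk_nvme_probe(NULL, NULL, probe_cb, attach_cb, NULL) != 0) {
                fprintf(stderr, "spdk_nvme_probe failed\n");
                return 1;
        }
        return 0;
}
```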

Page 4: Accelerate Ceph By SPDK on AArch64


SPDK on AArch64

• Several Arm-related patches have been merged
  • Memory barrier (illustrated below)
  • VA address space

• 17.10 release verified
  • Kernel 4.11, 48-bit/42-bit VA, 4 KB page size
  • UIO/VFIO
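To give a flavor of the memory-barrier work mentioned above, the snippet below shows the kind of AArch64 barrier mapping a DPDK/SPDK port relies on: full-system `dsb` for ordering against device (MMIO) memory, inner-shareable `dmb` for ordering between cores. The macro names and the `publish_and_ring` helper are purely illustrative; this is a sketch of the concept, not the patch that was actually merged.

```c
/* Illustrative sketch of AArch64 memory barriers of the kind a DPDK/SPDK
 * port uses instead of x86 fences; names are made up for this example. */
#include <stdint.h>

/* Full-system barrier: also orders accesses to device (MMIO) memory. */
#define aarch64_mb()      __asm__ volatile("dsb sy"   ::: "memory")

/* Inner-shareable barriers: sufficient for ordering between CPU cores. */
#define aarch64_smp_mb()  __asm__ volatile("dmb ish"   ::: "memory")
#define aarch64_smp_rmb() __asm__ volatile("dmb ishld" ::: "memory")
#define aarch64_smp_wmb() __asm__ volatile("dmb ishst" ::: "memory")

/* Example: publish a queue entry before ringing a doorbell register. */
static inline void
publish_and_ring(volatile uint32_t *entry, volatile uint32_t *doorbell,
                 uint32_t value)
{
        *entry = value;
        aarch64_mb();        /* make the entry visible before the MMIO write */
        *doorbell = 1;
}
```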

Page 5: Accelerate Ceph By SPDK on AArch64


SPDK Performance on AArch64

• SPDK perf
  • UIO, 4 KB page size

• FIO

[Charts: RandRead and RandWrite results (IOPS, bandwidth, latency), Kernel vs SPDK]

FIO configuration: direct=1, bs=4096, rwmixread=50, iodepth=32, ramp=30s, run_time=180s, jobs=1

Page 6: Accelerate Ceph By SPDK on AArch64


SPDK Performance on AArch64

• FIO

[Charts: RandRW read and RandRW write results (IOPS, bandwidth, latency), Kernel vs SPDK]

FIO configuration: direct=1, bs=4096, rwmixread=50, iodepth=32, ramp=30s, run_time=180s, jobs=1

Page 7: Accelerate Ceph By SPDK on AArch64


What’s next?

• Optimization with ASIMD and Crypto extensions

• Tuning with different page sizes (16 KB/64 KB)

• Cache strategy improvement for better read/write performance

Page 8: Accelerate Ceph By SPDK on AArch64


Ceph

Page 9: Accelerate Ceph By SPDK on AArch64


What’s Ceph?

• Ceph is a unified, distributed storage system designed for excellent performance, reliability and scalability

• Ceph provides the following services (a minimal librados example follows below)
  • Object storage
  • Block storage
  • File system

• Backend storage types
  • FileStore
  • BlueStore
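For example, the object-storage service can be driven directly from C through the public librados API. The sketch below connects to a cluster and writes one object; the configuration path, the pool name "testpool", and the object name "greeting" are assumptions made for this example, not anything from the slides.

```c
/* Minimal librados sketch: connect to a Ceph cluster and write one object.
 * Assumes /etc/ceph/ceph.conf exists and a pool named "testpool" is
 * already created; both are assumptions for illustration only. */
#include <stdio.h>
#include <rados/librados.h>

int
main(void)
{
        rados_t cluster;
        rados_ioctx_t io;
        const char data[] = "hello ceph";

        if (rados_create(&cluster, NULL) < 0 ||
            rados_conf_read_file(cluster, "/etc/ceph/ceph.conf") < 0 ||
            rados_connect(cluster) < 0) {
                fprintf(stderr, "cannot connect to cluster\n");
                return 1;
        }

        if (rados_ioctx_create(cluster, "testpool", &io) < 0) {
                fprintf(stderr, "cannot open pool\n");
                rados_shutdown(cluster);
                return 1;
        }

        /* Write the whole buffer as object "greeting". */
        if (rados_write_full(io, "greeting", data, sizeof(data)) < 0) {
                fprintf(stderr, "write failed\n");
        }

        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        return 0;
}
```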

Page 10: Accelerate Ceph By SPDK on AArch64


BlueStore

BlueStore is a new storage backend for Ceph.

• Full data built-in compression

• Full data checksum

• Better performance
  • Bypasses the file system and writes all data to the raw device via the asynchronous libaio infrastructure (see the libaio sketch below)
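To give a feel for that asynchronous path, the sketch below submits one direct write to a raw block device with the libaio API (io_setup, io_submit, io_getevents). It is a simplified illustration of the kernel interface BlueStore's block-device layer builds on, not BlueStore's actual code; /dev/nvme0n1 and the 4 KB size are placeholders, and writing to a raw device is destructive, so only try this on a scratch device.

```c
/* Simplified libaio sketch: one asynchronous O_DIRECT write to a raw device.
 * Build with: gcc -o aio_write aio_write.c -laio
 * WARNING: overwrites data on the target device; use a scratch device. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        io_context_t ctx = 0;
        struct iocb cb, *cbs[1] = { &cb };
        struct io_event ev;
        void *buf;
        int fd;

        /* O_DIRECT bypasses the page cache, so the buffer must be aligned. */
        fd = open("/dev/nvme0n1", O_WRONLY | O_DIRECT);
        if (fd < 0 || posix_memalign(&buf, 4096, 4096) != 0) {
                perror("setup");
                return 1;
        }
        memset(buf, 0xab, 4096);

        if (io_setup(128, &ctx) < 0) {             /* create the AIO context */
                perror("io_setup");
                return 1;
        }

        io_prep_pwrite(&cb, fd, buf, 4096, 0);     /* 4 KB write at offset 0 */
        if (io_submit(ctx, 1, cbs) != 1) {         /* queue it, returns fast */
                perror("io_submit");
                return 1;
        }

        io_getevents(ctx, 1, 1, &ev, NULL);        /* reap the completion */
        printf("write completed, res=%ld\n", (long)ev.res);

        io_destroy(ctx);
        close(fd);
        free(buf);
        return 0;
}
```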

Page 11: Accelerate Ceph By SPDK on AArch64


Ceph on AArch64

• Has already been integrated with OpenStack

• Has been validated and released by the Linaro SDI team

• Has contributed many patches to fix functional faults and improve performance

• Has validated “Ceph + SPDK” on top of NVMe devices

• Tuned Ceph performance on AArch64

Page 12: Accelerate Ceph By SPDK on AArch64


Ceph + SPDK on AArch64

• Dependencies
  • NVMe device
  • SPDK/DPDK
  • BlueStore

• Enabled SPDK in Ceph on AArch64

• Extended the virtual address map from 47 to 48 bits in DPDK

Page 13: Accelerate Ceph By SPDK on AArch64


Ceph + SPDK on AArch64

• BlueStore can utilize SPDK
  • Replace the kernel driver with the SPDK userspace NVMe driver
  • Abstract BlockDevice on top of the SPDK NVMe driver (see the I/O sketch after the diagram)

[Diagram: Ceph RBD, Object, and CephFS services over FileStore and BlueStore; within BlueStore, RocksDB (metadata) sits on BlueRocksEnv and BlueFS over a BlockDevice, backed by either the kernel NVMe driver or the SPDK NVMe driver on the NVMe device]
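To make the SPDK side of the diagram concrete, the fragment below sketches how a user-space block-device layer can issue a write through the SPDK NVMe driver: allocate an I/O queue pair, submit the command, and poll for completion. It uses public SPDK calls (spdk_nvme_ctrlr_alloc_io_qpair, spdk_nvme_ns_cmd_write, spdk_nvme_qpair_process_completions) but is not Ceph's NVMEDEVICE implementation; the controller and namespace are assumed to come from a probe/attach step like the earlier sketch, and the 4 KB block size and LBA 0 are placeholders.

```c
/* Fragment: issuing one write through the SPDK userspace NVMe driver.
 * `ctrlr` and `ns` are assumed to come from a probe/attach step as in the
 * earlier probe sketch; this is illustrative, not Ceph's NVMEDEVICE code. */
#include <stdbool.h>
#include <stdio.h>

#include "spdk/env.h"
#include "spdk/nvme.h"

static bool io_done;

static void
write_complete(void *arg, const struct spdk_nvme_cpl *cpl)
{
        io_done = true;      /* runs from the completion poll below */
}

static int
write_one_block(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_ns *ns)
{
        struct spdk_nvme_qpair *qpair;
        void *buf;

        io_done = false;

        /* Per-thread I/O queue pair: no locks, no kernel involvement. */
        qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
        /* DMA-able buffer sized to one 4 KB block (assumed block size). */
        buf = spdk_dma_zmalloc(4096, 4096, NULL);
        if (qpair == NULL || buf == NULL) {
                return -1;
        }

        /* Queue a write of 1 logical block at LBA 0, then poll until the
         * completion callback fires. */
        if (spdk_nvme_ns_cmd_write(ns, qpair, buf, 0, 1,
                                   write_complete, NULL, 0) != 0) {
                return -1;
        }
        while (!io_done) {
                spdk_nvme_qpair_process_completions(qpair, 0);
        }

        spdk_dma_free(buf);
        spdk_nvme_ctrlr_free_io_qpair(qpair);
        return 0;
}
```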

Page 14: Accelerate Ceph By SPDK on AArch64


Ceph + SPDK Performance Test on AArch64

Test case

• Ceph cluster
  • Two OSDs, one MON, no MDS or RGW
  • One NVMe card per OSD
  • CPU: 2.4 GHz, multi-core

• Client
  • CPU: 2.0 GHz, multi-core

• Test tool
  • fio (v2.2.10)

• Test cases
  • Sequential write with different block sizes (4 KB, 8 KB, and 16 KB)
  • 1 and 2 fio streams

[Diagram: Ceph cluster with OSD1, OSD2, and MON; separate client node]

Page 15: Accelerate Ceph By SPDK on AArch64


Write performance result (1 stream)

[Charts: IOPS and latency (msec) for 4 KB, 8 KB, and 16 KB block sizes, Kernel NVMe vs SPDK, each with 1, 2, and 4 cores]

1 fio stream, FIO configuration: bs=4K/8K/16K, rw=write, iodepth=384, run_time=40s, jobs=1, ioengine=rbd

Page 16: Accelerate Ceph By SPDK on AArch64


Write performance result (2 streams)

[Charts: IOPS for 4 KB, 8 KB, and 16 KB block sizes, Kernel NVMe vs SPDK, each with 1, 2, and 4 cores]

2 fio streams, FIO configuration: bs=4K/8K/16K, rw=write, iodepth=384, run_time=40s, jobs=1, ioengine=rbd

Page 17: Accelerate Ceph By SPDK on AArch64


Performance improvement

SPDK accelerated Ceph in the following ways:

• More IOPS

• Lower latency

• Linear scaling with the number of CPU cores

Page 18: Accelerate Ceph By SPDK on AArch64


What’s next?

• Continue improving Ceph performance on top of SPDK

• Enable NVMe-oF and RDMA

• Enable zero-copy in Ceph

• Simplify the locking in Ceph to improve the OSD daemon performance

• Switch PAGE_SIZE to 16 KB and 64 KB to improve memory performance

• Modify NVMEDEVICE to improve its performance with different PAGE_SIZE values

Page 19: Accelerate Ceph By SPDK on AArch64


Thank You

© 2018 Arm Limited