Sizing Your Couchbase Cluster_Tokyo_14

How Many Nodes? Properly Sizing your Couchbase Cluster Perry Krug Sr. Solutions Architect

Transcript of Sizing Your Couchbase Cluster_Tokyo_14

Page 1: Sizing Your Couchbase Cluster_Tokyo_14

How Many Nodes? Properly Sizing your Couchbase Cluster

Perry Krug

Sr. Solutions Architect

Page 2: Sizing Your Couchbase Cluster_Tokyo_14

Size Couchbase Server

Sizing == performance

• Serve reads out of RAM
• Enough IO for writes and disk operations
• Mitigate inevitable failures

[Diagram: Reading Data. The application server asks Couchbase Server "Give me document A"; the server replies "Here is document A".]

[Diagram: Writing Data. The application server asks Couchbase Server "Please store document A"; the server replies "OK, I stored document A".]

Page 3: Sizing Your Couchbase Cluster_Tokyo_14

Scaling out permits matching of aggregate flow rates so queues do not grow

[Diagram: three application servers, each connected over the network to a Couchbase Server node.]
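The flow-rate idea on this slide can be sketched as a small calculation. This is only an illustration, assuming each node drains a fixed number of operations per second and that load spreads evenly; the function names and the sample rates are hypothetical, not figures from the talk.

```python
import math


def nodes_to_keep_queue_flat(incoming_ops_per_sec, per_node_drain_ops_per_sec):
    """Smallest node count whose combined drain rate covers the incoming rate."""
    if per_node_drain_ops_per_sec <= 0:
        raise ValueError("per-node drain rate must be positive")
    return max(1, math.ceil(incoming_ops_per_sec / per_node_drain_ops_per_sec))


def queue_growth_per_sec(incoming_ops_per_sec, per_node_drain_ops_per_sec, nodes):
    """Operations added to the queue each second; 0 means the queue does not grow."""
    return max(0, incoming_ops_per_sec - per_node_drain_ops_per_sec * nodes)


# Hypothetical workload: 50,000 ops/sec arriving, 12,000 ops/sec drained per node.
print(nodes_to_keep_queue_flat(50_000, 12_000))
print(queue_growth_per_sec(50_000, 12_000, 4))
```

With four nodes the queue in this made-up workload still grows; a fifth node brings the aggregate drain rate above the arrival rate.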

Page 4: Sizing Your Couchbase Cluster_Tokyo_14

5 Factors of Sizing

Page 5: Sizing Your Couchbase Cluster_Tokyo_14

How many nodes?

5 Key Factors determine number of nodes needed:

1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety

(per-bucket, multiple buckets aggregate)

[Diagram: application user, web application server, Couchbase Servers.]

Page 6: Sizing Your Couchbase Cluster_Tokyo_14

RAM sizing

1) Total RAM:
• Managed document cache:
  - Working set
  - Metadata
  - Active + replicas
• Index caching (I/O buffer)

Keep working set in RAM for best read performance

[Diagram: Reading Data. The application server asks the server "Give me document A" and receives "Here is document A".]
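One way to turn the RAM factors above (working set, metadata, active plus replica copies) into a node count is sketched below. It is loosely modeled on the style of calculation in the Couchbase sizing documentation, but the per-item metadata size, headroom, and high water mark defaults are assumptions for illustration; check the documentation for your version before relying on them.

```python
import math


def ram_nodes_needed(item_count, avg_key_bytes, avg_value_bytes, replicas,
                     working_set_ratio, per_node_quota_bytes,
                     metadata_bytes_per_item=56,   # assumed per-item overhead
                     headroom=0.30,                # assumed spare capacity
                     high_water_mark=0.85):        # assumed usable share of quota
    """Estimate how many nodes are needed to hold metadata plus working set in RAM."""
    copies = 1 + replicas
    metadata = item_count * (metadata_bytes_per_item + avg_key_bytes) * copies
    dataset = item_count * avg_value_bytes * copies
    working_set = dataset * working_set_ratio
    required = (metadata + working_set) * (1 + headroom) / high_water_mark
    return math.ceil(required / per_node_quota_bytes)


# Hypothetical bucket: 10M items, 44-byte keys, 1,000-byte values,
# one replica, 20% working set, 4 GB (4e9 bytes) of quota per node.
print(ram_nodes_needed(10_000_000, 44, 1_000, 1, 0.20, 4_000_000_000))
```

Because the estimate is per bucket, a cluster with several buckets would sum the required RAM across buckets before dividing by the per-node quota.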

Page 7: Sizing Your Couchbase Cluster_Tokyo_14

Working set depends on your application

• Late stage social game: many users no longer active; few logged in at any given time. working/total set = .01
• Ad network: any cookie can show up at any time. working/total set = 1
• Business application: users logged in during the day; the day moves around the globe. working/total set = .33

[Diagram: one Couchbase Server per application type.]
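To see how much the application type matters, the three ratios from this slide can be applied to the same dataset. A minimal sketch; the 300 GB dataset size is a made-up figure and the profile names are just labels for the slide's three examples.

```python
# Working-set-to-total ratios quoted on the slide.
WORKING_SET_RATIOS = {
    "late_stage_social_game": 0.01,
    "ad_network": 1.0,
    "business_application": 0.33,
}


def working_set_gb(total_dataset_gb, profile):
    """RAM needed to keep the working set resident, ignoring metadata and replicas."""
    return total_dataset_gb * WORKING_SET_RATIOS[profile]


for name in WORKING_SET_RATIOS:
    # Hypothetical 300 GB dataset for every profile.
    print(name, working_set_gb(300, name))
```

The same dataset needs roughly a hundred times more cache for the ad network than for the late stage social game, which is why the working set has to be estimated per application.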

Page 8: Sizing Your Couchbase Cluster_Tokyo_14

RAM Sizing - View/Index cache (disk I/O)

• File system cache availability for the index has a big impact on performance:
• Test runs based on 10 million items with a 16GB bucket quota and either 4GB or 8GB of system RAM available for indexes
• Performance results show that doubling system cache availability:
  - Cuts query latency in half
  - Increases throughput by 50%
• Leave RAM free by setting quotas below total system RAM
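The advice to leave RAM outside the quota can be expressed as simple arithmetic. This is a sketch under assumptions: the 2 GB operating system reserve is a hypothetical placeholder, and the function name is invented for illustration.

```python
def bucket_quota_gb(system_ram_gb, fs_cache_reserve_gb, os_reserve_gb=2.0):
    """RAM left for bucket quotas after reserving file system cache and OS memory.

    os_reserve_gb is an assumed placeholder, not a documented figure.
    """
    quota = system_ram_gb - fs_cache_reserve_gb - os_reserve_gb
    if quota <= 0:
        raise ValueError("reserves leave no RAM for the bucket quota")
    return quota


# Matches the shape of the test on the slide: 8 GB kept free for index caching
# and a 16 GB bucket quota, on a hypothetical 26 GB machine.
print(bucket_quota_gb(26, 8))
```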

Page 9: Sizing Your Couchbase Cluster_Tokyo_14

Disk Sizing: Space and I/O

2) Disk
• Sustained write rate
• Rebalance capacity
• Backups
• XDCR
• Views/Indexes
• Compaction
• Total dataset: (active + replicas + indexes)
• Append-only

[Diagram: Writing Data. The list above is bracketed by two labels, I/O and Space. The application server asks the server "Please store document A" and receives "OK, I stored document A".]

Page 10: Sizing Your Couchbase Cluster_Tokyo_14

Disk Sizing: Space and I/O

• Disk writes are buffered
  - Bursts of data expand the disk write queue
  - Sustained writes need corresponding throughput
• Disk throughput is affected by disk speed
  - SSD > 10K RPM > EBS
  - SSDs give a huge boost to write throughput and startup/warmup times
  - RAID can provide redundancy and increase throughput
• Throughput = read/write + compaction + indexing + XDCR
• Couchbase Server 2.1 introduces multiple disk threads
  - Default is 3 (1 writer / 2 readers), max is 8 combined
• Best to configure different paths for data and indexes
• Plan on about 3x space (append-only, compaction, backups, etc.)
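The two rules of thumb on this slide, about 3x space and throughput as a sum of competing activities, can be sketched as follows. The function names and sample figures are hypothetical, and `index_gb` is assumed to be the total index size across the cluster.

```python
def disk_space_needed_gb(active_gb, replicas, index_gb, overhead_factor=3.0):
    """Total dataset (active + replicas + indexes) times the ~3x planning factor."""
    total_dataset = active_gb * (1 + replicas) + index_gb
    return total_dataset * overhead_factor


def required_disk_throughput(read_write, compaction, indexing, xdcr):
    """Throughput = read/write + compaction + indexing + XDCR (same unit for all)."""
    return read_write + compaction + indexing + xdcr


# Hypothetical cluster: 100 GB active data, one replica, 20 GB of indexes.
print(disk_space_needed_gb(100, 1, 20))
# Hypothetical per-node load in MB/s for each activity.
print(required_disk_throughput(40, 15, 10, 5))
```

Dividing the space figure by usable disk per node, and comparing the throughput figure with what one disk can sustain, gives a disk-driven lower bound on node count.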

Page 11: Sizing Your Couchbase Cluster_Tokyo_14

CPU sizing

3) CPU
• Disk writing
• Views/compaction/XDCR
• RAM r/w performance not impacted

• Min. production requirement: 4 cores + 1 per bucket + 1 per design document + 1 per XDCR stream
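The minimum core count stated above is a straightforward sum. A small sketch, with a hypothetical function name and example counts:

```python
def min_production_cores(buckets, design_docs, xdcr_streams):
    """4 base cores + 1 per bucket + 1 per design document + 1 per XDCR stream."""
    return 4 + buckets + design_docs + xdcr_streams


# Hypothetical deployment: 2 buckets, 3 design documents, 1 XDCR stream.
print(min_production_cores(2, 3, 1))
```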

Page 12: Sizing Your Couchbase Cluster_Tokyo_14

Network sizing

4) Network
• Client traffic
• Replication (writes)
• Rebalancing
• XDCR

[Diagram labels: "Reads + Writes"; "Replication (multiplies writes) and Rebalancing".]
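A rough steady-state bandwidth estimate follows from the traffic types listed above. This is a sketch under simplifying assumptions: every item has the same size, each write is sent once per replica and once per XDCR stream, and rebalance traffic is left out because it is not continuous. All names and numbers are hypothetical.

```python
def network_bytes_per_sec(reads_per_sec, writes_per_sec, avg_item_bytes,
                          replicas, xdcr_streams=0):
    """Client traffic plus replication and XDCR traffic, in bytes per second."""
    client = (reads_per_sec + writes_per_sec) * avg_item_bytes
    replication = writes_per_sec * avg_item_bytes * replicas   # writes multiplied
    xdcr = writes_per_sec * avg_item_bytes * xdcr_streams
    return client + replication + xdcr


# Hypothetical workload: 10,000 reads/s, 2,000 writes/s, 1,000-byte items,
# one replica, one outbound XDCR stream.
print(network_bytes_per_sec(10_000, 2_000, 1_000, replicas=1, xdcr_streams=1))
```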

Page 13: Sizing Your Couchbase Cluster_Tokyo_14

Network Considerations

• Low latency, high throughput (LAN) - within cluster

• Eliminate router hops:
  - Between cluster nodes
  - Between clients and cluster
• Check who else is sharing the network
• Increase bandwidth by:
  - Adding more nodes (will scale linearly)
  - Upgrading routers/switches/NICs/etc.

Page 14: Sizing Your Couchbase Cluster_Tokyo_14

Data Distribution

• 5) Data Distribution / Safety (assuming one replica):

• 1 node = Single point of failure

• 2 nodes = +Replication

• 3+ nodes = Best for production
  - Autofailover
  - Upgradability
  - Further scalability
• Note: Many applications will need more than 3 nodes

Servers fail; be prepared. The more nodes, the less impact a failure will have.
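The closing point, that more nodes means less impact per failure, can be made concrete. A minimal sketch, assuming data and load are spread evenly across nodes; the function names are hypothetical.

```python
def share_affected_by_one_failure(nodes):
    """Fraction of active data served by any single node."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return 1 / nodes


def survivor_load_factor(nodes):
    """How much each surviving node's load is multiplied after one node fails."""
    if nodes < 2:
        raise ValueError("no survivors with fewer than two nodes")
    return nodes / (nodes - 1)


for n in (2, 3, 4, 10):
    print(n, share_affected_by_one_failure(n), survivor_load_factor(n))
```

In a two node cluster the survivor takes on double the load, while in a ten node cluster each survivor picks up only about 11% more.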

Page 15: Sizing Your Couchbase Cluster_Tokyo_14

How many nodes recap

5 Key Factors determine number of nodes needed:

1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety

[Diagram: application user, web application server, Couchbase Servers.]
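Since every factor has to be satisfied at once, one reasonable way to combine them is to take the largest per-factor node count. This is an illustrative sketch rather than a formula from the talk; the floor of 3 nodes reflects the "3+ nodes = Best for production" guidance, and the inputs are assumed to come from separate per-factor estimates.

```python
def nodes_needed(ram_nodes, disk_nodes, cpu_nodes, network_nodes,
                 min_for_safety=3):
    """The cluster must satisfy the most demanding of the five factors."""
    return max(ram_nodes, disk_nodes, cpu_nodes, network_nodes, min_for_safety)


# Hypothetical per-factor estimates: RAM is the bottleneck here.
print(nodes_needed(ram_nodes=5, disk_nodes=3, cpu_nodes=2, network_nodes=2))
```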

Page 16: Sizing Your Couchbase Cluster_Tokyo_14

Hardware Minimums

RAM: At least ~4GB (highly dependent on data set)

Disk: Fastest “local” storage available
- SSD is better
- RAID 0 or 10, not 5

CPU (minimums): 4 cores + 1 per bucket + 1 per design document + 1 per XDCR stream

Hardware requirements/recommendations are the intersection of what’s needed and what’s available.

Page 17: Sizing Your Couchbase Cluster_Tokyo_14

Effects of…

Page 18: Sizing Your Couchbase Cluster_Tokyo_14

Views/Indexes

• Effect on scale/sizing:
  - Increase the CPU and disk IO requirements
    • More complex views require more CPU
    • More view output requires more disk IO
  - More RAM should be left out of the quota for better IO caching
• Indication:
  - Indexes significantly behind data writes (or growing delays)
• What to do:
  - Make sure you follow best practices in view writing
  - Add more nodes to distribute processing “work”
  - Look into SSDs

Page 19: Sizing Your Couchbase Cluster_Tokyo_14

XDCR

• Effect on scale/sizing:
  - XDCR is CPU intensive
  - Disk IO will double
  - Memory needs to be sized accordingly (bi-directional may mean more data)
• Indication:
  - A rising XDCR queue on source
• What to do:
  - More nodes on source and destination will drain queue faster (scales linearly)
  - Tune replication streams according to CPU availability

Page 20: Sizing Your Couchbase Cluster_Tokyo_14

As your workload grows…

• Effects on scale/sizing:
  - More reads:
    • Individual documents will not be impacted (static working set)
    • Views may require faster disks, more disk IO caching
  - More writes will increase disk IO needs
• Indications:
  - Cache miss ratio rising
  - Growing disk write queue / XDCR queue
  - Compaction not keeping up
• What to do:
  - Revise sizing calculations and add more nodes if needed

Most applications don’t need to scale the number of nodes based upon normal workload variation.

Page 21: Sizing Your Couchbase Cluster_Tokyo_14

As your dataset grows…

• Effects on scale/sizing:
  - Your RAM needs will grow:
    • Metadata needs increase with item count
    • Is your working set increasing?
  - Your disk space will likely grow (duh?)
• Indications:
  - Dropping resident ratio
  - Rising ejections/cache miss ratio
• What to do:
  - Revise sizing calculations, add more nodes
  - Remove unneeded data

This is the most common need for scaling and will most likely result in needing more nodes.

Page 22: Sizing Your Couchbase Cluster_Tokyo_14

Rebalancing

• Yes, a rebalance uses resources, but on a “properly” sized cluster it should not affect performance:
  - Distribution of data and work across all nodes
  - Managed caching layer separates RAM-based performance from IO utilization
  - Rebalance automatically manages working set in RAM
  - Rebalance automatically throttles itself if needed
  - Can be stopped midway without endangering data or progress
• Proper sizing includes not maxing out all resources: leave some headroom in preparation

Page 23: Sizing Your Couchbase Cluster_Tokyo_14

Monitor and Grow

Page 24: Sizing Your Couchbase Cluster_Tokyo_14

What to Monitor

• Application
  - Ops/sec (breakdown of reads/writes/deletes/expirations)
  - Latency at client
• RAM
  - Cache miss ratio
  - Resident item ratio
• Disk
  - Disk write queue (proxy for IO capacity)
  - Space (compaction and failed-compaction frequency)
• XDCR/Indexing/Compaction progress
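A simple check over the RAM and disk metrics listed above might look like the sketch below. The dictionary keys and every threshold are hypothetical placeholders chosen for illustration; they are not Couchbase stat names or recommended limits, and real alerting would more likely watch trends than fixed values.

```python
# Hypothetical limits; tune them against your own baseline.
DEFAULT_LIMITS = {
    "cache_miss_ratio_max": 0.01,
    "resident_item_ratio_min": 0.20,
    "disk_write_queue_max": 1_000_000,
}


def sizing_warnings(stats, limits=DEFAULT_LIMITS):
    """Return a list of human-readable warnings for metrics outside their limits."""
    warnings = []
    if stats["cache_miss_ratio"] > limits["cache_miss_ratio_max"]:
        warnings.append("cache miss ratio high: working set may not fit in RAM")
    if stats["resident_item_ratio"] < limits["resident_item_ratio_min"]:
        warnings.append("resident item ratio low: consider more RAM or nodes")
    if stats["disk_write_queue"] > limits["disk_write_queue_max"]:
        warnings.append("disk write queue large: disk IO may be undersized")
    return warnings


sample = {"cache_miss_ratio": 0.05, "resident_item_ratio": 0.15,
          "disk_write_queue": 2_500_000}
for line in sizing_warnings(sample):
    print(line)
```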

Page 25: Sizing Your Couchbase Cluster_Tokyo_14
Page 26: Sizing Your Couchbase Cluster_Tokyo_14

Adding Capacity

Couchbase scales out linearly:

Need more RAM? Add nodes…

Need more Disk IO or space? Add nodes…

Monitor sizing parameters and growth to know when to add more nodes

Couchbase also makes it easy to scale up by swapping in larger nodes to replace smaller ones, without any disruption
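Knowing when to add nodes comes down to projecting growth against capacity. The sketch below assumes linear growth, which is a simplification; the function name and the sample figures are hypothetical.

```python
def days_until_full(used_gb, capacity_gb, growth_gb_per_day):
    """Days before usage reaches capacity at a constant growth rate.

    Returns None when usage is flat or shrinking, and 0.0 when already full.
    """
    if growth_gb_per_day <= 0:
        return None
    remaining = capacity_gb - used_gb
    if remaining <= 0:
        return 0.0
    return remaining / growth_gb_per_day


# Hypothetical cluster: 60 GB of a 100 GB RAM quota used, growing 0.5 GB per day.
print(days_until_full(60, 100, 0.5))
```

The same projection works for disk space or any other monitored parameter, as long as the capacity figure already includes the headroom you want to keep.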

Page 27: Sizing Your Couchbase Cluster_Tokyo_14

Sizing is tricky business…

Work with the Couchbase Team

Validate your “on-paper” numbers with testing

Constantly monitor production

Page 28: Sizing Your Couchbase Cluster_Tokyo_14

Dive in…

Gather your workload and dataset requirements:

Item counts and sizes, read/write/delete ratios

Review our documentation and formulas

Test, deploy, monitor… rinse and repeat

Page 29: Sizing Your Couchbase Cluster_Tokyo_14

Want more?

Lots of details and best practices in our documentation:

http://www.couchbase.com/docs/

And my sizing blog:

http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster

Page 30: Sizing Your Couchbase Cluster_Tokyo_14

Thank you

Couchbase NoSQL Document Database

[email protected] @couchbase

Page 31: Sizing Your Couchbase Cluster_Tokyo_14

Appendix