Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

77
Elasticsearch & Docker Rafał Kuć – Sematext Group, Inc. @ kucrafal @ sematext sematext.com Running High Performance Fault Tolerant Elasticsearch Clusters On Docker

Transcript of Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Page 1: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Elasticsearch & Docker

Rafał Kuć – Sematext Group, Inc.@kucrafal @sematext sematext.com

Running High PerformanceFault Tolerant

Elasticsearch Clusters On Docker

Page 2: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Next 20 minutes

metrics

Page 3: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

You Are Probably Familiar With That

Development

Page 4: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

You Are Probably Familiar With That

Development Test

Page 5: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

You Are Probably Familiar With That

Development Test QA

Page 6: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

You Are Probably Familiar With That

Development Test QA

Production enviroment

Page 7: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

And Problems That Come With It

Resources not utilized

Page 8: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

And Problems That Come With It

Resources not utilized

OverprovisonedServers

Page 9: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

And Problems That Come With It

Resources not utilized

OverprovisonedServers

≠ ≠

Page 10: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

The solution

Development Test QA Production

Page 11: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Why Docker?

Light weight

Based onOpen Standards

Secure

Page 12: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Traditional Virtual Machine

Page 13: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Traditional Virtual Machine

Page 14: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Traditional Virtual Machine

Page 15: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Traditional Virtual Machine

Page 16: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Traditional Virtual Machine

Page 17: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Traditional Virtual Machine

Page 18: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Traditional Virtual MachineContainer

Page 19: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Docker Engine

Traditional Virtual MachineContainer

Page 20: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Docker Engine

Libraries Libraries

Application 1 Application 2

Traditional Virtual MachineContainer

Page 21: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Why Elasticsearch?

{ JSON }

Distributed by design

http://www.dailypets.co.uk/2007/06/17/kittens-rest-at-half-time/

Indices Aggs Admin

Monitor Search

Index

Page 22: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Running Offical Elasticsearch Container

$ docker run -d elasticsearch

Page 23: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Running Offical Elasticsearch Container

$ docker run -d elasticsearch:latest

Page 24: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Running Offical Elasticsearch Container

$ docker run -d elasticsearch:latest$ docker run -d --name es1 elasticsearch

Page 25: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Running Offical Elasticsearch Container

$ docker run -d elasticsearch:latest$ docker run -d --name es1 elasticsearch

$ docker run -d --name es1 -e ES_HEAP_SIZE=1g elasticsearch

Page 26: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Running Offical Elasticsearch Container

$ docker run -d elasticsearch:latest$ docker run -d --name es1 elasticsearch

$ docker run -d --name es1 -e ES_HEAP_SIZE=1g elasticsearch$ docker run -d --name es1 elasticsearch -Dnode.name bbuzz

Page 28: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Container Constraints

$ docker run -d -m 2G elasticsearch

$ docker run -d -m 2G --memory-swappiness=0 elasticsearch

$ docker run -d --cpuset-cpus="1,3" elasticsearch

http://docs.docker.com/engine/reference/run/

Page 29: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Container Constraints

$ docker run -d -m 2G elasticsearch

$ docker run -d -m 2G --memory-swappiness=0 elasticsearch

$ docker run -d --cpuset-cpus="1,3" elasticsearch

http://docs.docker.com/engine/reference/run/

$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch

Page 30: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Constraints - Good Practices

Limit container memory

Page 31: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Constraints - Good Practices

Limit container memory

Account for I/O cache when giving memory

Page 32: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Constraints - Good Practices

Limit container memory

Account for I/O cache when giving memory

Limit amount of CPU cores

Page 33: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Constraints - Good Practices

Limit container memory

Account for I/O cache when giving memory

Limit amount of CPU cores

Remember about JVM GC

Page 34: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

Page 35: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

$ docker build -t bbuzz/example .

Page 36: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

$ docker build -t bbuzz/example .

Sending build context to Docker daemon 5.12 kBStep 1 : FROM elasticsearch ---> 1e23f30a3667Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/ ---> 015f12adfd2aRemoving intermediate container de560c6ae0d1Successfully built 015f12adfd2a

Page 37: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

Page 38: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch $ docker run -d --link es1 elasticsearch -Ddiscovery.zen.ping.unicast.hosts=es1

Page 39: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

Add network.publish_host when building own container

$ docker run -d --link es1 elasticsearch -Ddiscovery.zen.ping.unicast.hosts=es1

Page 40: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

Add network.publish_host when building own containerAdd discovery.zen.ping.unicast.hosts when building own

container

$ docker run -d --link es1 elasticsearch -Ddiscovery.zen.ping.unicast.hosts=es1

Page 41: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Network - Good Practices

Separate network for Elasticsearch cluster

Page 42: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h esnode1 elasticsearch

Page 43: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h esnode1 elasticsearch

Expose only needed ports

Page 44: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h esnode1 elasticsearch

Expose only needed ports

Elasticsearch data & client nodes point to masters only

Page 45: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Storage

By default in /usr/share/elasticsearch/data

Page 46: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Storage

By default in /usr/share/elasticsearch/data

By default not persisted

Page 47: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Storage

By default in /usr/share/elasticsearch/data

By default not persisted

$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch

Page 48: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Dealing With Storage

$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch

By default in /usr/share/elasticsearch/data

By default not persisted

Use data only Docker volumes

Permissions

Page 49: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Only Docker Volumes

Bypasses Union File System

Page 50: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Page 51: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

Page 52: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch

Permissions

Page 53: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch

$ docker run --volumes-from esdata elasticsearch

Page 54: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Highly Available Cluster

Master only

Master only

Master only

Data only

Data only

Data only

Data only

Data only

Data only

Client only

Client only

Page 55: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Highly Available Cluster

Master only

Master only

Master only

Data only

Data only

Data only

Data only

Data only

Data only

Client only

Client only

minimum_master_nodes = N/2 + 1

Page 56: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Master Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=true -Dnode.data=false -Dnode.client=false

Page 57: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Client Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=false -Dnode.client=true

Page 58: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Data Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=true -Dnode.client=false

Page 59: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

Page 60: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Multiple Tiers

$ docker run -d elasticsearch -Dnode.tag=hot

Page 61: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Multiple Tiers

curl -XPUT 'localhost:9200/data_2016-06-05' -d '{ "settings": { "index.routing.allocation.include.tag" : "hot" }}'

Page 62: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2016-06-05

data_2016-06-05

Page 63: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2016-06-05

data_2016-06-05

Page 64: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Health & Shards

https://sematext.com/spm/

Page 65: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics - CPU

https://sematext.com/spm/

Page 66: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Memory Usage

https://sematext.com/spm/

Page 67: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – I/O Usage

https://sematext.com/spm/

Page 68: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – JVM Heap

https://sematext.com/spm/

Page 69: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Garbage Collector

https://sematext.com/spm/

Page 70: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Request Rate & Latency

https://sematext.com/spm/

Page 71: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics - Caches

https://sematext.com/spm/

Page 72: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Indexing

https://sematext.com/spm/

Page 73: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top Metrics – Refresh Time

https://sematext.com/spm/

Page 74: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Top 10 Metrics – Merge Time

https://sematext.com/spm/

Page 75: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

Short summary

http://www.soothetube.com/2013/12/29/thats-all-folks/

Page 76: Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker

We Are Hiring !

Dig Search ?Dig Analytics ?Dig Big Data ?Dig Performance ?Dig Logging ?Dig working with and in open – source ?We’re hiring world – wide !

http://sematext.com/about/jobs.html