Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Page 1: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

MAKING YOUR ELASTIC CLUSTER PERFORM

Created by @jettroCoenradie

Page 2: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

WHY USE ELASTICSEARCH

Page 7: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

START WITH ELASTICSEARCH

Page 10: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

 curl 'localhost:9200?pretty'

 {
   "name" : "Tatterdemalion",
   "cluster_name" : "elasticsearch",
   "version" : {
     "number" : "2.3.1",
     "build_hash" : "bd980929010aef404e7cb0843e61d0665269fc39",
     "build_timestamp" : "2016-04-04T12:25:05Z",
     "build_snapshot" : false,
     "lucene_version" : "5.5.0"
   },
   "tagline" : "You Know, for Search"
 }

Page 11: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

 

curl -XPOST 'localhost:9200/conferences/conference/1?pretty' -d '
{
  "name": "Codemotion Amsterdam",
  "location": "Kromhouthal"
}'

Page 12: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

THAT WAS EASY!

Page 14: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

DESIGN YOUR CLUSTER

How to install and configure?

How many nodes?

What hardware?

Page 15: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

INSTALLATION

Just download, unzip and run

Use package manager: yum, apt

Use ansible, chef or puppet

Page 16: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

CONFIGURATION

/etc/default/elasticsearch

/etc/elasticsearch/elasticsearch.yml

Page 17: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

/ETC/DEFAULT/ELASTICSEARCH

# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=2g
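
The heap rule in the comment above (half of the RAM, never more than 31g) can be written as a small helper. This is only a sketch: the function name `recommended_heap_gb` is made up for illustration and is not part of Elasticsearch. The other half of the memory is left to the operating system's file cache, which Lucene relies on.

```python
def recommended_heap_gb(total_ram_gb: float, cap_gb: float = 31.0) -> float:
    """Return half of the machine's RAM, capped at cap_gb (the 31g limit from the slide)."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, cap_gb)


# A 4 GB machine ends up with the 2g heap used on the slide;
# a 128 GB machine is capped instead of getting 64g.
for ram in (4, 16, 128):
    print(ram, "GB RAM ->", recommended_heap_gb(ram), "GB heap")
```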

Page 18: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

/ETC/ELASTICSEARCH/ELASTICSEARCH.YML

 cluster.name: playground
 node.name: node-1
 discovery.zen.ping.unicast.hosts: ["node-1", "node-2", "node-3"]
 discovery.zen.minimum_master_nodes: 2
 path.repo: /opt/es_snapshots/
 script.inline: true
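
The value 2 for `discovery.zen.minimum_master_nodes` follows from the usual quorum rule for the three hosts listed: more than half of the master-eligible nodes must be present. A minimal sketch of that arithmetic (the function name is illustrative, and it assumes all three nodes are master-eligible):

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Three master-eligible nodes give the value 2 used in the configuration above.
for nodes in (1, 2, 3, 5):
    print(nodes, "nodes ->", minimum_master_nodes(nodes))
```

With two nodes the quorum is also 2, so losing either node stops master election; that is one reason to prefer an odd number of master-eligible nodes.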

Page 19: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

How many nodes do I need?

Page 20: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Development / non-critical

Page 21: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Small production

Page 22: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Large production

Page 23: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

What hardware do I need?

Page 24: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

HARDWARE

Prefer cores over clock speed

Choose between 8 and 64 GB of memory

Prefer SSD

Page 25: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

DESIGN YOUR INDICES

How many shards?

How many replicas?

Time based indices?

Page 26: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

What does an index look like?

Page 28: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

How many shards do I need?

Amount of docs or terms

Indexing speeds

Not bigger than 50 GB

Why not a lot of shards?

Start small and test
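
As a starting point for such a test, the 50 GB bound above gives a lower limit on the number of primary shards. The helper below is a back-of-the-envelope sketch with an invented name; it ignores document counts and indexing speed, which the slide also lists as inputs, so treat the result as a first guess to validate by measuring.

```python
import math


def minimum_primary_shards(expected_index_gb: float, max_shard_gb: float = 50.0) -> int:
    """Smallest shard count that keeps every shard at or below max_shard_gb."""
    if expected_index_gb < 0 or max_shard_gb <= 0:
        raise ValueError("sizes must be non-negative and the limit positive")
    return max(1, math.ceil(expected_index_gb / max_shard_gb))


# 120 GB of data needs at least three shards of 40 GB each.
for size in (10, 50, 120):
    print(size, "GB ->", minimum_primary_shards(size), "primary shard(s)")
```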

Page 29: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

How many replicas do I need?

Page 30: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Should I use Types?

Page 31: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Should I use Aliases?

Page 32: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Working with time based indices?

Option to change shards per time period

Use index templates
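
A sketch of how the two points fit together: each period gets its own index name, and a template applies the shard settings to every index matching the prefix. The helper names are hypothetical; the body uses the `template` key of the Elasticsearch 2.x index template format and would be sent with `PUT /_template/<name>`.

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Build a per-day index name such as 'topbeat-2016.05.04'."""
    return f"{prefix}-{day:%Y.%m.%d}"


def index_template(prefix: str, shards: int, replicas: int) -> dict:
    """Template body applied to every new index whose name starts with the prefix."""
    return {
        "template": f"{prefix}-*",
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
    }


print(daily_index_name("topbeat", date(2016, 5, 4)))
print(index_template("topbeat", shards=1, replicas=1))
```

Changing the shard count in the template only affects indices created afterwards, which is what makes it possible to change the number of shards per time period.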

Page 33: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

DESIGN YOUR MAPPING

Do I need a mapping?

What do analyzers do?

Do I need an analyzer?

Page 34: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

The default uses dynamic type mapping

Make your mapping explicit: Date, Geo_point, long

Disable dynamic type mapping

 PUT /_settings
 {
   "index.mapper.dynamic": false
 }

Page 35: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

A mapping is persistent: you can add new fields, but you cannot change existing ones.

Page 36: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Use multi field mapping: name

PUT /conferences
{
  "mappings": {
    "conference": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

Page 37: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

An analyzer creates terms out of data

Has three components:

Character filter - replace & with and

Tokenizer - on whitespace, regexp, ngrams

Filters - ascii folding, language specific, lowercase, stop words
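
The three stages can be imitated in a few lines of plain Python. This is a toy model to show the order of the steps, not how Lucene implements them; the stop word list and function names are invented for the example.

```python
import unicodedata

STOP_WORDS = {"the", "in", "a"}  # tiny illustrative list, not a real language list


def char_filter(text: str) -> str:
    """Character filter: rewrite the raw text before tokenizing (& becomes 'and')."""
    return text.replace("&", " and ")


def tokenize(text: str) -> list:
    """Tokenizer: split on whitespace."""
    return text.split()


def ascii_fold(token: str) -> str:
    """Token filter: strip accents by removing combining marks after decomposition."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text: str) -> list:
    """Run the character filter, the tokenizer and then the token filters in order."""
    tokens = tokenize(char_filter(text))
    tokens = [ascii_fold(token).lower() for token in tokens]
    return [token for token in tokens if token not in STOP_WORDS]


print(analyze("Café & Bar in Amsterdam"))
```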

Page 38: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

CHOOSE THE RIGHT ANALYZER FOR THE JOB

Custom using tokenizer and filters combinations

Use the multi field approach for special analyzers.

Do not analyze if you don't need it.

Page 39: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

INDEXING DOCUMENTS

How to improve indexing performance?

Page 40: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

What happens when we index a document?

Page 47: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

TIPS

DISABLE OR DECREASE REFRESH RATE

curl -XPUT 'http://localhost:9200/meetups/_settings' -d '{ "index" : { "refresh_interval" : "-1" } }'

Page 48: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

INDEX WITHOUT REPLICAS

curl -XPUT 'http://localhost:9200/meetups/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'
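
Both tips are meant for the duration of a large import and should be undone afterwards. The sketch below only builds the two settings bodies, to be sent with PUT to `/meetups/_settings` before and after the import; the function names are hypothetical, and the restore defaults ("1s" and one replica) are the Elasticsearch defaults, so pass your own values if the index used different ones.

```python
import json


def bulk_load_settings() -> dict:
    """Settings to apply before a large import: no refresh, no replicas."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval: str = "1s", replicas: int = 1) -> dict:
    """Settings to apply after the import to return to normal operation."""
    return {"index": {"refresh_interval": refresh_interval, "number_of_replicas": replicas}}


print(json.dumps(bulk_load_settings()))
print(json.dumps(restore_settings()))
```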

Page 49: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

USE BULK

A bulk request should be 5-15 MB at most

Round robin requests over nodes
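
One possible way to apply both points on the client side: cut the documents into bulk bodies that stay under a byte limit, then hand the bodies to the nodes in turn. The sketch does not send anything; the function names, the `meetup` type and the node URLs are assumptions for the example.

```python
import itertools
import json
from typing import Iterable, Iterator, List, Tuple


def bulk_bodies(index: str, doc_type: str, docs: Iterable[dict],
                max_bytes: int = 5 * 1024 * 1024) -> Iterator[str]:
    """Yield newline-delimited _bulk bodies, each at most max_bytes when possible."""
    lines: List[str] = []
    size = 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index, "_type": doc_type}})
        entry = action + "\n" + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Start a new body when adding this entry would pass the limit.
        if lines and size + entry_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += entry_size
    if lines:
        yield "".join(lines)


def assign_nodes(bodies: Iterable[str], nodes: List[str]) -> Iterator[Tuple[str, str]]:
    """Pair each bulk body with the next node URL, round robin."""
    return zip(itertools.cycle(nodes), bodies)


docs = [{"n": i} for i in range(5)]
nodes = ["http://node-1:9200", "http://node-2:9200"]
for node, body in assign_nodes(bulk_bodies("meetups", "meetup", docs, max_bytes=150), nodes):
    print(node, len(body.encode("utf-8")), "bytes")
```

A single document larger than the limit still gets a body of its own, so the limit is a target rather than a guarantee.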

Page 50: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

QUERYING DOCUMENTS

How to make queries faster?

Page 51: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET 'http://localhost:9200/_search'

Page 57: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET 'http://localhost:9200/meetups/_search?q=venue.city:amsterdam%20AND%20description:elasticsearch&pretty'

Page 58: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET "http://localhost:9200/meetups/_search" -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "venue.city": "amsterdam" } },
        { "match": { "description": "elasticsearch" } }
      ]
    }
  }
}'

Page 59: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

How to make a query faster?

Page 60: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET "http://localhost:9200/meetups/_search" -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "description": "elasticsearch" } }
      ],
      "filter": {
        "term": { "venue.city.raw": "Amsterdam" }
      }
    }
  }
}'

Page 61: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Query context               Filter context
How well does it match?     Does it match?
Calculates score            true/false
Not-cacheable               Cacheable

Page 62: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Use filter context if you do not need a score

Don't ask for hits if you do not use them

Request only the fields that you need
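
The three tips can be combined in one request body. The builder below is a sketch with a hypothetical name and parameters; the field names (`description`, `venue.city.raw`) are the ones used in the meetups examples, and the `_source` form with `include` follows the 2.x syntax of this deck.

```python
from typing import Optional, Sequence


def filtered_search(text: str, city: str,
                    source_fields: Optional[Sequence[str]] = None,
                    want_hits: bool = True) -> dict:
    """Score on the text match, filter on the exact city, and trim what is returned."""
    body = {
        "query": {
            "bool": {
                # Query context: contributes to the score.
                "must": [{"match": {"description": text}}],
                # Filter context: yes/no only, no score.
                "filter": {"term": {"venue.city.raw": city}},
            }
        }
    }
    if not want_hits:
        body["size"] = 0  # for example when only aggregations are needed
    elif source_fields:
        body["_source"] = {"include": list(source_fields)}
    return body


print(filtered_search("elasticsearch", "Amsterdam", source_fields=["name", "venue.*"]))
```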

Page 63: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET "http://localhost:9200/meetups/_search" -d'
{
  "_source": {
    "include": ["venue.*", "group.name", "name"]
  },
  "query": {
    "simple_query_string": {
      "query": "elastic OR elasticsearch"
    }
  }
}'

Page 64: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Use the Profile API to learn about query performance

"profile": true

Page 65: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

ANALYTICS FROM DOCUMENTS

Why use not_analyzed fields?

Page 66: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Aggregations, maybe the reason why elasticsearch became so popular

Page 67: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

curl -XGET "http://localhost:9200/meetups/_search" -d'
{
  "size": 0,
  "aggs": {
    "byCity": {
      "terms": {
        "field": "venue.city.raw",
        "size": 10
      }
    }
  }
}'

Page 68: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

"buckets": [
  { "key": "Amsterdam", "doc_count": 8 },
  { "key": "Ede", "doc_count": 1 },
  { "key": "Leidschendam", "doc_count": 1 },
  { "key": "Rotterdam", "doc_count": 1 }
]

Page 70: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

Inverted index not suitable for aggregations

Page 71: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

DOC_VALUES

Stored on disk during indexing

All fields except analyzed strings

Page 72: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

FIELDDATA

For analyzed strings

Stored in the heap
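
Why the inverted index is a poor fit for aggregations can be shown with a toy model. The inverted index answers "which documents contain this term?", while an aggregation asks "which value does this document have?" for every hit; doc_values and fielddata both provide that second, document-to-value view. The data and function names below are invented for the illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List

# Hypothetical documents: doc id -> value of venue.city.raw
CITIES = {1: "Amsterdam", 2: "Ede", 3: "Amsterdam", 4: "Rotterdam"}


def build_inverted_index(values: Dict[int, str]) -> Dict[str, List[int]]:
    """term -> sorted doc ids, the structure used for searching."""
    index = defaultdict(list)
    for doc_id in sorted(values):
        index[values[doc_id]].append(doc_id)
    return dict(index)


def terms_aggregation(doc_values: Dict[int, str], hits: Iterable[int]) -> Counter:
    """Count values for the matching documents using doc id -> value lookups."""
    return Counter(doc_values[doc_id] for doc_id in hits)


print(build_inverted_index(CITIES))
print(terms_aggregation(CITIES, [1, 3, 4]))
```

Answering the aggregation from the inverted index alone would mean walking every term and intersecting its document list with the hits, which is the expensive operation the column-style structure avoids.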

Page 73: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

MONITORING THE CLUSTER

How can I see what elastic is doing?

What numbers are important?

Page 74: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

GET /_cluster/health

{
  "cluster_name": "playground",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 55,
  "active_shards": 55,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 16,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 77.46478873239437
}
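
A yellow status means every primary shard is active but some replicas are not assigned. On a single node that is expected, because a replica is never placed on the same node as its primary. The sketch below (hypothetical helper name) derives the inactive shard count and the percentage from the counters in the response above; the formula is an assumption that happens to reproduce the reported 77.46.

```python
def shard_summary(health: dict) -> dict:
    """Summarise how many shards are not active and what share is active."""
    active = health["active_shards"]
    inactive = health["initializing_shards"] + health["unassigned_shards"]
    total = active + inactive
    return {
        "status": health["status"],
        "inactive_shards": inactive,
        "active_percent": 100.0 * active / total if total else 100.0,
    }


# Counters copied from the cluster health response above.
HEALTH = {
    "status": "yellow",
    "active_shards": 55,
    "initializing_shards": 0,
    "unassigned_shards": 16,
}
print(shard_summary(HEALTH))
```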

Page 75: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

GET /_cluster/health?level=indices

"indices": {
  "conferences": {
    "status": "yellow",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "active_primary_shards": 5,
    "active_shards": 5,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 5
  }
}

Page 76: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

indices - field_data, filter_cache
os - cpu, memory, load
process - file descriptors, cpu, memory
jvm - memory, garbage collection
thread_pool - threads, rejected
fs - disk space

GET /_nodes/stats?human
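
Of these numbers, rejected requests in a thread pool are a direct sign that a node cannot keep up. A small sketch of pulling them out of a parsed node stats response: the helper name and the sample values are made up, and the sample only contains the parts of the response that the function reads.

```python
def rejected_by_pool(node_stats: dict) -> dict:
    """Map 'node name/pool' to the rejected count, for pools with rejections only."""
    result = {}
    for node in node_stats.get("nodes", {}).values():
        name = node.get("name", "?")
        for pool, stats in node.get("thread_pool", {}).items():
            if stats.get("rejected", 0) > 0:
                result[f"{name}/{pool}"] = stats["rejected"]
    return result


# Hypothetical, heavily trimmed response of GET /_nodes/stats.
SAMPLE_STATS = {
    "nodes": {
        "node-id-1": {
            "name": "node-1",
            "thread_pool": {
                "bulk": {"threads": 4, "queue": 12, "rejected": 37},
                "search": {"threads": 7, "queue": 0, "rejected": 0},
            },
        }
    }
}
print(rejected_by_pool(SAMPLE_STATS))
```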

Page 77: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

_CAT API

GET /_cat/health?v

epoch      timestamp cluster    status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1462882362 14:12:42  playground yellow          1         1     56  56    0    0

GET /_cat/indices?v

health status index                     pri rep docs.count docs.deleted store.size pri.store.size
green  open   gridshore-logs-2016.01.19   5   0       1007            0      1.2mb
green  open   .kibana                     1   0        100            1    100.2kb
yellow open   topbeat-2016.05.04          5   1     170264            0     44.4mb
green  open   meetups-20160509113909      1   0         11            0     67.3kb

GET /_cat/fielddata?v

id                     host      ip        node    total
aqj9L-DPR86J8CgYitcHsA 127.0.0.1 127.0.0.1 node-JC    0b

Page 79: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

CLUSTER LOGS

How to configure what is logged?

Page 80: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

LOGGING

Can be changed dynamically

PUT /_cluster/settings
{
  "transient": {
    "logger.discovery": "DEBUG"
  }
}

Page 81: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

SLOWLOG

PUT /meetups/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.debug": "500ms",
  "index.indexing.slowlog.threshold.index.info": "5s"
}

PUT /_cluster/settings
{
  "transient": {
    "logger.index.search.slowlog": "DEBUG",
    "logger.index.indexing.slowlog": "WARN"
  }
}
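
The thresholds work per level: an operation is written to the slowlog at the most severe level whose threshold it exceeds, and the logger level then decides whether that entry is actually emitted. The sketch below models only the first half; the function name and the example thresholds are hypothetical and this is not Elasticsearch's own code.

```python
from typing import Dict, Optional

# Checked from most to least severe.
LEVELS = ("warn", "info", "debug", "trace")


def slowlog_level(took_seconds: float, thresholds: Dict[str, float]) -> Optional[str]:
    """Return the most severe level whose threshold was exceeded, or None."""
    for level in LEVELS:
        limit = thresholds.get(level)
        if limit is not None and took_seconds > limit:
            return level
    return None


# Hypothetical thresholds for one phase: warn at 10s, info at 5s, debug at 500ms.
QUERY_THRESHOLDS = {"warn": 10.0, "info": 5.0, "debug": 0.5}
for took in (0.1, 0.7, 6.0, 12.0):
    print(took, "s ->", slowlog_level(took, QUERY_THRESHOLDS))
```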

Page 82: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

[2016-05-11 16:25:02,105][DEBUG][index.search.slowlog.query]
[meetups-20160509113909]took[518.5micros],took_millis[0],
types[], stats[], search_type[QUERY_AND_FETCH], total_shards[1]
, source[{"size":0,"aggs":{"byCity":{"terms":{"field":
"venue.city.raw","size":10}}}}], extra_source[],

Page 83: Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam 2016

QUESTIONS?

Twitter: @jettroCoenradie

Github: https://github.com/jettro

Blog: https://amsterdam.luminis.eu/news/

Licence: http://creativecommons.org/licenses/by-nc-sa/3.0/