Scaling real-time search and analytics with Elasticsearch

Transcript
Page 1: Scaling real-time search and analytics with Elasticsearch

Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.

Clinton Gormley @clintongormley

Scaling real-time search and analytics with Elasticsearch

Page 3: Scaling real-time search and analytics with Elasticsearch

elasticsearch.org/guide

Page 4: Scaling real-time search and analytics with Elasticsearch

elasticsearch

Page 5: Scaling real-time search and analytics with Elasticsearch

elasticsearch
• real-time

Page 6: Scaling real-time search and analytics with Elasticsearch

elasticsearch
• real-time
• distributed

Page 7: Scaling real-time search and analytics with Elasticsearch

elasticsearch
• real-time
• distributed
• search

Page 8: Scaling real-time search and analytics with Elasticsearch

elasticsearch
• real-time
• distributed
• search
• analytics

Page 9: Scaling real-time search and analytics with Elasticsearch

how to use it?

Page 11: Scaling real-time search and analytics with Elasticsearch

how does it work?

Page 12: Scaling real-time search and analytics with Elasticsearch

step 1: making text searchable

Page 14: Scaling real-time search and analytics with Elasticsearch

where content like “%darling%buds%”

Page 15: Scaling real-time search and analytics with Elasticsearch

slow & inflexible

Page 17: Scaling real-time search and analytics with Elasticsearch

[Table: columns Term, Doc 1, Doc 2, Doc 3; terms: breathe, brings, buds, but, by, can, …, damasked, darling, date, day, deaf, death, declines, delight]

sorted list of unique terms

Page 18: Scaling real-time search and analytics with Elasticsearch

[Table: columns Term, Doc 1, Doc 2, Doc 3; terms: breathe, brings, buds, but, by, can, …, damasked, darling, date, day, deaf, death, declines, delight]

where they occur

Page 21: Scaling real-time search and analytics with Elasticsearch

inverted index
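A minimal sketch of the idea on this slide: a sorted list of unique terms, each pointing at the documents where it occurs. The sample documents and the function names `build_inverted_index` and `search_all` are illustrative assumptions, not anything from Elasticsearch or Lucene, and real analysis (stemming, stopwords) is left out.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of doc ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Sorting the terms mirrors the "sorted list of unique terms" on the slide.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def search_all(index, *terms):
    """Return ids of documents that contain every query term."""
    sets = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical sample documents, used only for illustration.
docs = {
    1: "rough winds do shake the darling buds of may",
    2: "the darling child",
    3: "buds in spring",
}
index = build_inverted_index(docs)
print(search_all(index, "darling", "buds"))  # prints [1]
```

Looking up a term is a dictionary access instead of a scan over every document, which is what the `like “%darling%buds%”` approach has to do.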

Page 22: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies

Page 23: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance

Page 24: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length

Page 25: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length » doc weight

Page 26: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length » doc weight
• term positions

Page 27: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length » doc weight
• term positions » word proximity

Page 28: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length » doc weight
• term positions » word proximity
• char offsets

Page 29: Scaling real-time search and analytics with Elasticsearch

inverted index
• term frequencies » relevance
• text length » doc weight
• term positions » word proximity
• char offsets » highlighting
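As a rough illustration of why the index stores more than doc ids, the sketch below keeps term positions per document. The number of positions gives the term frequency, and adjacent positions give word proximity. The names `build_positional_index` and `phrase_match` and the sample documents are assumptions made for this example only.

```python
from collections import defaultdict


def build_positional_index(docs):
    """term -> {doc_id: [positions]}; len(positions) is the term frequency."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_match(index, first, second):
    """Doc ids where `second` occurs immediately after `first`."""
    hits = []
    for doc_id, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(doc_id, []))
        if any(pos + 1 in following for pos in positions):
            hits.append(doc_id)
    return sorted(hits)


# Hypothetical sample documents.
docs = {1: "the darling buds of may", 2: "buds for my darling"}
index = build_positional_index(docs)
print(phrase_match(index, "darling", "buds"))  # prints [1]
print(len(index["buds"][1]))                   # term frequency of "buds" in doc 1: 1
```

Both documents contain both words, but only document 1 has them next to each other, which a plain term-to-doc mapping could not tell apart.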

Page 30: Scaling real-time search and analytics with Elasticsearch

inverted index
not just for text

Page 31: Scaling real-time search and analytics with Elasticsearch

inverted index
numbers, dates, bools, enums, geopoints, geoshapes, etc

Page 32: Scaling real-time search and analytics with Elasticsearch

step 2: analytics

Page 33: Scaling real-time search and analytics with Elasticsearch

for search
map values → doc_ids

Page 34: Scaling real-time search and analytics with Elasticsearch

for search
map values → doc_ids

for analytics
map doc_ids → values

Page 35: Scaling real-time search and analytics with Elasticsearch

uninvert the index

Page 36: Scaling real-time search and analytics with Elasticsearch

uninvert the index
cache values in memory, called “fielddata”
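A small sketch of what "uninverting" means: the search structure maps values to doc_ids, and analytics needs the reverse, doc_ids to values, held in memory. The functions `uninvert` and `term_counts` and the sample data are hypothetical and only illustrate the direction of the mapping, not how Elasticsearch stores fielddata internally.

```python
from collections import Counter


def uninvert(inverted):
    """Turn value -> [doc_ids] into doc_id -> [values] (a toy 'fielddata')."""
    fielddata = {}
    for value, doc_ids in inverted.items():
        for doc_id in doc_ids:
            fielddata.setdefault(doc_id, []).append(value)
    return fielddata


def term_counts(fielddata, matching_doc_ids):
    """Count values across only the documents matched by a query."""
    counts = Counter()
    for doc_id in matching_doc_ids:
        counts.update(fielddata.get(doc_id, []))
    return counts


# Hypothetical inverted index for a "colour" field.
inverted = {"red": [1, 3], "blue": [2], "green": [3]}
fielddata = uninvert(inverted)
print(fielddata[3])                    # prints ['red', 'green']
print(term_counts(fielddata, [1, 3]))  # counts for the docs a query matched
```

Because the counts are computed over the matching doc_ids only, the analytics follow whatever query the user ran.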

Page 37: Scaling real-time search and analytics with Elasticsearch

uninvert the index
data access from RAM: very fast

Page 38: Scaling real-time search and analytics with Elasticsearch

on-the-fly analytics
in the context of a user’s query

Page 39: Scaling real-time search and analytics with Elasticsearch

on-the-fly analytics
relevant analytics for each user

Page 40: Scaling real-time search and analytics with Elasticsearch

calculate metrics
count, min, max, sum, avg, percentiles, cardinality, stddev, variance, sum of squares

Page 41: Scaling real-time search and analytics with Elasticsearch

grouped by
popular terms, significant terms, ranges, dates, geolocation, etc

Page 42: Scaling real-time search and analytics with Elasticsearch

grouped by
groups can contain subgroups … which contain subgroups, etc
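One way to picture groups containing subgroups is a nested aggregation request body. The sketch below builds one as a plain Python dict in the shape of the Elasticsearch aggregations DSL (`terms` buckets with a nested `avg` metric). The field names `category`, `brand`, and `price` and the aggregation names are hypothetical, and the body is only printed, not sent to a cluster.

```python
import json

# Hypothetical fields: "category", "brand", "price".
request = {
    "size": 0,  # return only aggregation results, no search hits
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},      # group: popular terms
            "aggs": {
                "by_brand": {
                    "terms": {"field": "brand"},  # subgroup inside each group
                    "aggs": {
                        # metric calculated per subgroup
                        "avg_price": {"avg": {"field": "price"}}
                    },
                }
            },
        }
    },
}

print(json.dumps(request, indent=2))
```

Each level of `aggs` nests inside the bucket above it, so the average is computed per brand within each category.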

Page 43: Scaling real-time search and analytics with Elasticsearch

step 3: building the inverted index

Page 44: Scaling real-time search and analytics with Elasticsearch

inverted index

Page 45: Scaling real-time search and analytics with Elasticsearch

immutable

Page 46: Scaling real-time search and analytics with Elasticsearch

immutable
• cache friendly

Page 47: Scaling real-time search and analytics with Elasticsearch

immutable
• cache friendly
• reads from RAM

Page 48: Scaling real-time search and analytics with Elasticsearch

immutable
• cache friendly
• reads from RAM
• fielddata never changes

Page 49: Scaling real-time search and analytics with Elasticsearch

immutable
• cache friendly
• reads from RAM
• fielddata never changes
• compressible

Page 50: Scaling real-time search and analytics with Elasticsearch

immutable
• cache friendly
• reads from RAM
• fielddata never changes
• compressible
• no locking

Page 51: Scaling real-time search and analytics with Elasticsearch

but, immutable…

Page 52: Scaling real-time search and analytics with Elasticsearch

step 4: dynamic inverted index

Page 53: Scaling real-time search and analytics with Elasticsearch

[Diagram: in-memory buffer, commit, segment]

Page 54: Scaling real-time search and analytics with Elasticsearch

[Diagram: commit point, segment, searchable]

Page 55: Scaling real-time search and analytics with Elasticsearch

[Diagram: commit point, segment, searchable, commit]

Page 59: Scaling real-time search and analytics with Elasticsearch

lucene commit

Page 60: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment

Page 61: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment
• write new commit point

Page 62: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment
• write new commit point
• fsync

Page 63: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment
• write new commit point
• fsync
• clear buffer

Page 64: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment
• write new commit point
• fsync
• clear buffer
• reopen index

Page 65: Scaling real-time search and analytics with Elasticsearch

lucene commit
• write new segment
• write new commit point
• fsync ← expensive!
• clear buffer
• reopen index

Page 66: Scaling real-time search and analytics with Elasticsearch

step 5: near real-time search

Page 67: Scaling real-time search and analytics with Elasticsearch

[Diagram: in-memory buffer, flush, segment]

Page 68: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable]

Page 69: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, flush]

Page 73: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, commit]

Page 74: Scaling real-time search and analytics with Elasticsearch

[Diagram: commit point, segment, searchable]

Page 75: Scaling real-time search and analytics with Elasticsearch

lucene flush

Page 76: Scaling real-time search and analytics with Elasticsearch

lucene flush
• write new segment
• clear buffer
• reopen index

Page 77: Scaling real-time search and analytics with Elasticsearch

lucene flush
• write new segment
• clear buffer
• reopen index
• no fsync

Page 78: Scaling real-time search and analytics with Elasticsearch

lucene flush
• write new segment
• clear buffer
• reopen index
• no fsync → lightweight

Page 79: Scaling real-time search and analytics with Elasticsearch

but…
data not safe until fsync’ed!

Page 80: Scaling real-time search and analytics with Elasticsearch

step 6: don’t lose data

Page 81: Scaling real-time search and analytics with Elasticsearch

step 6: don’t lose data → transaction log

Page 82: Scaling real-time search and analytics with Elasticsearch

[Diagram: in-memory buffer, flush, segment, translog]

Page 83: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, translog]

Page 84: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, flush, translog]

Page 86: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, translog, commit]

Page 87: Scaling real-time search and analytics with Elasticsearch

[Diagram: segment, searchable, translog, commit point]

Page 88: Scaling real-time search and analytics with Elasticsearch

elasticsearch “refresh”
• lucene “flush”
• makes changes searchable
• lightweight

Page 89: Scaling real-time search and analytics with Elasticsearch

elasticsearch “flush”
• lucene “commit”
• clears transaction log
• persists changes
• heavy
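The difference between the two operations can be sketched with a toy model. `ToyShard` below is a hypothetical class written for this transcript, not Elasticsearch or Lucene code: `refresh` turns the in-memory buffer into a searchable segment without touching the transaction log, while `flush` stands in for the expensive commit and clears the log.

```python
class ToyShard:
    """Toy model of buffer, segments, and translog. Not real Lucene behaviour."""

    def __init__(self):
        self.buffer = []     # indexed but not yet searchable
        self.segments = []   # searchable, immutable tuples of docs
        self.translog = []   # operations since the last commit
        self.committed = 0   # number of segments covered by the commit point

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)  # kept so changes survive until the next commit

    def refresh(self):
        """Lightweight: write a new segment and make it searchable; no fsync."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        """Heavy: persist everything (stands in for fsync) and clear the translog."""
        self.refresh()
        self.committed = len(self.segments)
        self.translog = []

    def search(self, word):
        return [doc for seg in self.segments for doc in seg if word in doc.split()]


shard = ToyShard()
shard.index("darling buds")
print(shard.search("buds"))  # prints [] because nothing is searchable before a refresh
shard.refresh()
print(shard.search("buds"))  # prints ['darling buds']
print(len(shard.translog))   # prints 1: searchable, but not yet committed
shard.flush()
print(len(shard.translog))   # prints 0
```

In this model a document becomes visible after `refresh` but is only covered by a commit point after `flush`. That gap is what the transaction log protects.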

Page 90: Scaling real-time search and analytics with Elasticsearch

refresh every second

Page 91: Scaling real-time search and analytics with Elasticsearch

near real-time search!

Page 92: Scaling real-time search and analytics with Elasticsearch

near real-time search!
near real-time analytics!

Page 93: Scaling real-time search and analytics with Elasticsearch

but…

Page 94: Scaling real-time search and analytics with Elasticsearch

too many segments
• slow searches
• poor term frequencies
• poor compression

Page 95: Scaling real-time search and analytics with Elasticsearch

step 7: reduce segments

Page 96: Scaling real-time search and analytics with Elasticsearch

searchable

Page 108: Scaling real-time search and analytics with Elasticsearch

merge process
• many small → one big
• removes deleted docs
• runs in background
• throttled
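A rough sketch of the first two bullets: several small segments are rewritten into one larger segment, and documents marked as deleted are left out of the result. Segments are modelled here as plain dicts and `merge_segments` is a hypothetical helper. Background scheduling and throttling are not modelled.

```python
def merge_segments(segments, deleted_ids):
    """Merge many small segments into one, dropping deleted docs."""
    merged = {}
    for segment in segments:  # oldest segment first
        for doc_id, doc in segment.items():
            if doc_id not in deleted_ids:
                merged[doc_id] = doc
    return merged


# Hypothetical segments: doc_id -> document text.
segments = [{1: "a", 2: "b"}, {3: "c"}, {4: "d"}]
big = merge_segments(segments, deleted_ids={2})
print(big)  # prints {1: 'a', 3: 'c', 4: 'd'}
```

The inputs are never modified, which fits the immutable-segment model: the merged segment is written as a new one and the small ones can be discarded afterwards.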

Page 109: Scaling real-time search and analytics with Elasticsearch

but…

Page 110: Scaling real-time search and analytics with Elasticsearch

[Image: “Any wonder it broke down” by Brian Snelson is licensed under CC BY 2.0]

Page 111: Scaling real-time search and analytics with Elasticsearch

sometimes you need another truck

Page 112: Scaling real-time search and analytics with Elasticsearch

step 8: scale out, not up

Page 113: Scaling real-time search and analytics with Elasticsearch

shard your data

Page 114: Scaling real-time search and analytics with Elasticsearch

shard your data
transparent in elasticsearch

Page 115: Scaling real-time search and analytics with Elasticsearch

many segments

Page 116: Scaling real-time search and analytics with Elasticsearch

[Diagram: many segments → one shard]

Page 117: Scaling real-time search and analytics with Elasticsearch

[Diagram: many segments, one shard, many shards]

Page 118: Scaling real-time search and analytics with Elasticsearch

[Diagram: many segments, one shard, many shards, one index]

Page 119: Scaling real-time search and analytics with Elasticsearch

“node”
running instance of elasticsearch ≈ one server

Page 120: Scaling real-time search and analytics with Elasticsearch

“shard”
bucket of data
lives on one node
physical worker unit

Page 121: Scaling real-time search and analytics with Elasticsearch

“index”
logical namespace
points to one or more shards

Page 122: Scaling real-time search and analytics with Elasticsearch

“index”
logical namespace
points to one or more shards

shard = hash(_id) % no_of_shards
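The routing formula can be tried out directly. In this sketch `zlib.crc32` is only a stand-in for the hash function, chosen because it is deterministic across runs. It is an assumption and not the hash Elasticsearch uses, so the shard numbers it produces will not match the slides or a real cluster. The function name `route` is also hypothetical.

```python
import zlib


def route(doc_id, no_of_shards):
    """shard = hash(_id) % no_of_shards, with crc32 as a stand-in hash."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % no_of_shards


# The same _id always lands on the same shard, so a GET knows where to look.
for doc_id in ("1", "2", "3"):
    print(doc_id, "-> shard_%d" % route(doc_id, 3))
```

Since the result depends on `no_of_shards`, changing that number would send existing ids to different shards under this formula.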

Page 123: Scaling real-time search and analytics with Elasticsearch

PUT doc _id:1
hash(1) % 3 ⇒ shard_2
[Diagram: node_A holds shard_0; node_B holds shard_1; node_C holds shard_2]

Page 124: Scaling real-time search and analytics with Elasticsearch

GET doc _id:2
hash(2) % 3 ⇒ shard_0
[Diagram: node_A holds shard_0; node_B holds shard_1; node_C holds shard_2]

Page 125: Scaling real-time search and analytics with Elasticsearch

Search all docs
shard = hash(_id) % no_of_shards
[Diagram: node_A holds shard_0; node_B holds shard_1; node_C holds shard_2]

Page 126: Scaling real-time search and analytics with Elasticsearch

step 9: scaling elastically

Page 127: Scaling real-time search and analytics with Elasticsearch

start small
[Diagram: node_A holds shard_0, shard_1, shard_2]

Page 128: Scaling real-time search and analytics with Elasticsearch

add more nodes
[Diagram: node_A holds shard_0, shard_1, shard_2; node_B and node_C are empty]

Page 129: Scaling real-time search and analytics with Elasticsearch

shards migrate
[Diagram: node_A holds shard_0, shard_1, shard_2; node_B holds shard_1; node_C holds shard_2]

Page 130: Scaling real-time search and analytics with Elasticsearch

rebalanced
[Diagram: node_A holds shard_0; node_B holds shard_1; node_C holds shard_2]

Page 131: Scaling real-time search and analytics with Elasticsearch

add new index
[Diagram: node_A holds shard_0, shard_1; node_B holds shard_1, shard_2; node_C holds shard_0, shard_2]

Page 132: Scaling real-time search and analytics with Elasticsearch

but…

Page 133: Scaling real-time search and analytics with Elasticsearch

but…
more hardware?

Page 134: Scaling real-time search and analytics with Elasticsearch

but…
more hardware?
more hardware failure

Page 135: Scaling real-time search and analytics with Elasticsearch

at 3am on sunday…
[Diagram: node_A holds shard_0, shard_1; node_B holds shard_1, shard_2; node_C holds shard_0, shard_2]

Page 136: Scaling real-time search and analytics with Elasticsearch

boom!

node_A

shard_0

shard_1

node_B

shard_1

shard_2

node_C

shard_0

shard_2

Page 137: Scaling real-time search and analytics with Elasticsearch

step 10: add redundancy

Page 138: Scaling real-time search and analytics with Elasticsearch

for every shard…

make a copy

Page 139: Scaling real-time search and analytics with Elasticsearch

“primary shard”

main shard

Page 140: Scaling real-time search and analytics with Elasticsearch

“replica shard(s)”

copy of primary shard
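
A minimal sketch of how primaries and replicas might be declared when an index is created. `number_of_shards` and `number_of_replicas` are the Elasticsearch index settings for this; the helper functions and the counts chosen here are assumptions for illustration only.

```python
import json


def index_settings(primaries, replicas_per_primary):
    """Build a create-index body with the two shard-related settings."""
    return {
        "settings": {
            "number_of_shards": primaries,
            "number_of_replicas": replicas_per_primary,
        }
    }


def total_shard_copies(body):
    """Each primary plus its replicas: primaries * (1 + replicas)."""
    settings = body["settings"]
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


body = index_settings(primaries=3, replicas_per_primary=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(body))
```

Three primaries with one replica each gives six shard copies in total, which matches the P0 to P2 and R0 to R2 layout on the following slides.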

Page 141: Scaling real-time search and analytics with Elasticsearch

one node

node_A

P0

P1

P2

Page 142: Scaling real-time search and analytics with Elasticsearch

add a node

node_A

P0

P1

P2

node_B

Page 143: Scaling real-time search and analytics with Elasticsearch

add a node

node_A

P0

P1

P2

node_B

R0

R1

R2

Page 144: Scaling real-time search and analytics with Elasticsearch

redundancy

node_A

P0

P1

P2

node_B

R0

R1

R2

Page 145: Scaling real-time search and analytics with Elasticsearch

add a node

node_A

P0

P1

P2

node_B

R0

R1

R2

node_C

Page 146: Scaling real-time search and analytics with Elasticsearch

add a node

node_A

P0

P1

P2

node_B

R0

R1

R2

R1

node_C

P0

Page 147: Scaling real-time search and analytics with Elasticsearch

rebalanced

node_A

P0

P1

P2

node_B

R0

R1

R2

node_C

P0

Page 148: Scaling real-time search and analytics with Elasticsearch

lose a node

node_A

P0

P1

P2

node_B

R0

R2

R1

node_C

P0

Page 149: Scaling real-time search and analytics with Elasticsearch

replica ⇒ primary

node_A

P0

P1

P2

node_B

R0

R2

Page 150: Scaling real-time search and analytics with Elasticsearch

replica ⇒ primary

node_A

P0

P1

P2

node_B

P0

R2

Page 151: Scaling real-time search and analytics with Elasticsearch

allocate replicas

node_A

P0

P1

P2

node_B

P0

R2

R0

R1

Page 152: Scaling real-time search and analytics with Elasticsearch

rebalanced

node_A

P0

P1

P2

node_B

P0

R2

R0

R1
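
The "replica ⇒ primary" step from the slides above can be sketched as a small simulation. The three-node layout below is a hypothetical one chosen for clarity, not the exact diagram from the slides, and `promote_replicas` is an illustrative helper rather than anything Elasticsearch exposes. The sketch stops at promotion and does not allocate replacement replicas.

```python
def promote_replicas(cluster, dead_node):
    """Return the layout after dead_node disappears.

    cluster maps node name -> list of (shard_id, role), role "P" or "R".
    For every primary that was lost, one surviving replica is promoted.
    """
    survivors = {node: list(shards) for node, shards in cluster.items()
                 if node != dead_node}
    for shard_id, role in cluster[dead_node]:
        if role != "P":
            continue  # a lost replica needs no promotion
        for shards in survivors.values():
            if (shard_id, "R") in shards:
                shards[shards.index((shard_id, "R"))] = (shard_id, "P")
                break
    return survivors


# Hypothetical layout: every primary has its replica on a different node.
cluster = {
    "node_A": [(0, "P"), (1, "R")],
    "node_B": [(1, "P"), (2, "R")],
    "node_C": [(2, "P"), (0, "R")],
}

after = promote_replicas(cluster, "node_A")
print(after)
```

After node_A is lost, the replica of shard 0 on node_C takes over as primary, so every shard still has exactly one primary.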

Page 153: Scaling real-time search and analytics with Elasticsearch

primary shard

• just a role
• receives doc changes first
• forwards new doc to replicas in parallel
• number of primaries fixed
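
Why the number of primaries is fixed can be illustrated with the routing rule: a document's shard is derived as `hash(routing) % number_of_primary_shards`. The sketch below uses `zlib.crc32` as a stand-in hash, which is an assumption (Elasticsearch uses its own hash function), and the document IDs are made up.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """Pick a shard from the routing value (by default the document ID)."""
    # crc32 is a stand-in hash for illustration only.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["doc-%d" % i for i in range(100)]  # hypothetical IDs

with_3 = {doc_id: shard_for(doc_id, 3) for doc_id in doc_ids}
with_4 = {doc_id: shard_for(doc_id, 4) for doc_id in doc_ids}

# Documents whose shard would change if the primary count went from 3 to 4.
moved = [doc_id for doc_id in doc_ids if with_3[doc_id] != with_4[doc_id]]
print(len(moved), "of", len(doc_ids), "documents would be looked up on the wrong shard")
```

Because the divisor is part of the lookup, changing the number of primaries would send requests for existing documents to shards that do not hold them.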

Page 154: Scaling real-time search and analytics with Elasticsearch

replica shard

• copy of primary shard
• serves read/search requests
• number of replicas can be changed
• more replicas → more read throughput

*if you have more hardware*
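
Since the replica count can be changed on a live index, a request for doing so might be built as below. The `_settings` endpoint and the `number_of_replicas` setting belong to Elasticsearch; the helper name and the index name `my_index` are hypothetical, and the sketch only constructs the request without sending it.

```python
import json


def replica_update_request(index, replicas):
    """Return (method, path, body) for changing an index's replica count."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    path = "/%s/_settings" % index
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", path, body


# "my_index" is a made-up index name for this example.
method, path, body = replica_update_request("my_index", 2)
print(method, path)
print(body)
```

The extra replicas only add read throughput when they can be placed on additional nodes, which is the caveat the slide makes.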

Page 155: Scaling real-time search and analytics with Elasticsearch

but…

Page 156: Scaling real-time search and analytics with Elasticsearch

but…

who controls all this?

Page 157: Scaling real-time search and analytics with Elasticsearch

step 11: the master node

Page 158: Scaling real-time search and analytics with Elasticsearch

“Master Yoda” by Gonzalo Martín is licensed under CC BY-SA 2.0

Page 159: Scaling real-time search and analytics with Elasticsearch

“node”

running instance of elasticsearch

Page 160: Scaling real-time search and analytics with Elasticsearch

“node”

running instance of elasticsearch

node_A

Page 161: Scaling real-time search and analytics with Elasticsearch

“cluster”

one or more nodes with the same cluster name working together

Page 162: Scaling real-time search and analytics with Elasticsearch

“cluster”

node_A node_B

Page 163: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

discover a cluster

with multicast/unicast

Page 165: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

request routing

send request to any node

Page 166: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

request routing

forwards to correct node

Page 167: Scaling real-time search and analytics with Elasticsearch

how?

Page 168: Scaling real-time search and analytics with Elasticsearch

how?

every node knows where every document is

Page 169: Scaling real-time search and analytics with Elasticsearch

cluster state

every node knows where every document is

Page 170: Scaling real-time search and analytics with Elasticsearch

cluster state

cluster level information

Page 171: Scaling real-time search and analytics with Elasticsearch

cluster state

cluster level information

indices ⇔ shards ⇔ nodes
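
The indices ⇔ shards ⇔ nodes mapping can be pictured as a nested lookup table that every node holds a copy of. The structure below is a simplified assumption for illustration: the real cluster state carries much more (settings, mappings, node details), and the index name `blogs` and both helper functions are hypothetical.

```python
# Simplified stand-in for the routing part of the cluster state.
cluster_state = {
    "blogs": {
        0: {"primary": "node_A", "replicas": ["node_B"]},
        1: {"primary": "node_B", "replicas": ["node_C"]},
        2: {"primary": "node_C", "replicas": ["node_A"]},
    }
}


def nodes_holding(state, index, shard_id):
    """index + shard -> nodes: where a request for this shard can be sent."""
    entry = state[index][shard_id]
    return [entry["primary"]] + entry["replicas"]


def shards_on(state, node):
    """node -> shards: the reverse direction of the same mapping."""
    found = []
    for index, shards in state.items():
        for shard_id, entry in sorted(shards.items()):
            if entry["primary"] == node:
                found.append((index, shard_id, "P"))
            elif node in entry["replicas"]:
                found.append((index, shard_id, "R"))
    return found


print(nodes_holding(cluster_state, "blogs", 1))
print(shards_on(cluster_state, "node_A"))
```

Because any node can answer both questions locally, a request sent to any node can be forwarded to a node that holds the right shard.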

Page 172: Scaling real-time search and analytics with Elasticsearch

cluster state

can only be updated by the master node

Page 173: Scaling real-time search and analytics with Elasticsearch

node_A

master node

elected when cluster forms

Page 174: Scaling real-time search and analytics with Elasticsearch

node_A node_B

master node

Page 175: Scaling real-time search and analytics with Elasticsearch

node_A node_B

master node

node_C

Page 176: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

master node

Page 177: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

master node

just a role

Page 178: Scaling real-time search and analytics with Elasticsearch

node_A node_B node_C

master node

re-elected if master fails

Page 179: Scaling real-time search and analytics with Elasticsearch

node_B node_C

master node

node_A

re-elected if master fails

Page 180: Scaling real-time search and analytics with Elasticsearch

node_B node_C

master node

re-elected if master fails
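
Re-election can be sketched with a toy rule: the surviving node with the lowest name becomes master. That rule is an assumption chosen to keep the example deterministic and is not a description of the election protocol Elasticsearch actually runs; both function names are hypothetical.

```python
def elect_master(live_nodes):
    """Toy rule: the node that sorts first among the live nodes wins."""
    if not live_nodes:
        raise ValueError("no nodes left to elect from")
    return min(live_nodes)


def master_after_failure(nodes, master, failed):
    """Keep the current master unless it is the node that failed."""
    live = [node for node in nodes if node != failed]
    return master if master != failed else elect_master(live)


nodes = ["node_A", "node_B", "node_C"]
master = elect_master(nodes)

# The master itself fails, so the remaining nodes elect a new one.
new_master = master_after_failure(nodes, master, "node_A")

# A non-master fails, so nothing changes.
unchanged = master_after_failure(nodes, master, "node_C")

print(master, new_master, unchanged)
```

The point carried over from the slides is that "master" is a role any node can take on, not a special machine.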

Page 181: Scaling real-time search and analytics with Elasticsearch

master node

only manages cluster level changes

Page 182: Scaling real-time search and analytics with Elasticsearch

master node

not doc-level get/put/search

Page 183: Scaling real-time search and analytics with Elasticsearch

the result?

Page 184: Scaling real-time search and analytics with Elasticsearch

distributed real-time

search & analytics

Page 185: Scaling real-time search and analytics with Elasticsearch

which works in the same way on your laptop…

Page 186: Scaling real-time search and analytics with Elasticsearch

…as on your 1,000-node cluster

Page 187: Scaling real-time search and analytics with Elasticsearch

who is using it?

Page 188: Scaling real-time search and analytics with Elasticsearch

• full text search
• highlighted search snippets
• search-as-you-type
• did-you-mean suggestions

Page 189: Scaling real-time search and analytics with Elasticsearch

• combine visitor logs with social network data

• real-time feedback to editors

Page 190: Scaling real-time search and analytics with Elasticsearch

• combines full text search with geolocation

• uses more-like-this to find related questions and answers

Page 191: Scaling real-time search and analytics with Elasticsearch

• search repositories, users, issues, pull requests

• search 130 billion lines of code

• track all alerts, events, logs

Page 192: Scaling real-time search and analytics with Elasticsearch

• index and analyse 5TB of log data every day

Page 193: Scaling real-time search and analytics with Elasticsearch

thank you

@clintongormley

elasticsearch.org/downloads

elasticsearch.com/support

elasticsearch.com/jobs