Elasticsearch in production - 2013.berlinbuzzwords.de

Elasticsearch in production Alex Brasetvik @alexbrasetvik


Page 2:

How marketing thinks our users feel

Page 3:

How we developers sometimes feel

Page 4:

Who?

Co-founder of Found AS

7+ years of search, 2+ Elasticsearch

We manage hundreds of Elasticsearch clusters

… on Amazon's cloud

Page 5:

Agenda

Memory (and stability)

Security (and multi-tenancy)

Networking (and reliability)

Client (and resiliency)

Page 6:

Memory

Search engines crave memory

Caches, caches, caches

Field- and filter caches

Page cache

Index building

Page 7:

PostgreSQL

Verifies resource usage

Safe >>> fast

Uses disk if necessary

Page 8:

Elasticsearch trusts you

Built for speed

It'll jump if you ask it to

What could possibly go wrong?

Page 9:

OutOfMemoryError

Woah there

I ate all the memories

Your cluster may or may not work any more

Page 10:

May or may not work?

What else was happening at the time?

Corrupt cluster state, crashed Netty, …

In short: Don't end up there

Page 11:

Warning signs?

Monitor cache sizes and heap space

Outgrowing page cache: gradual slowdown

Outgrowing heap space: sudden crash
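
One way to act on these warning signs is to poll node statistics and flag nodes whose heap usage is getting close to the limit. The following is a minimal sketch, not a monitoring tool: it assumes a stats payload shaped like the output of the node stats API (`jvm.mem.heap_used_in_bytes` and `heap_max_in_bytes` per node; the exact field names depend on the Elasticsearch version), and the 85% threshold and the sample data are hypothetical.

```python
def heap_warnings(node_stats, threshold=0.85):
    """Return (node_name, heap_fraction) pairs for nodes at or above the threshold.

    `node_stats` is ASSUMED to look like:
    {"nodes": {node_id: {"name": ..., "jvm": {"mem": {
        "heap_used_in_bytes": ..., "heap_max_in_bytes": ...}}}}}
    """
    warnings = []
    for node_id, node in node_stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        fraction = mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        if fraction >= threshold:
            # Fall back to the node id if the node has no name.
            warnings.append((node.get("name", node_id), round(fraction, 2)))
    return sorted(warnings)


# Hypothetical sample payload, for illustration only.
sample = {
    "nodes": {
        "id1": {"name": "node-a",
                "jvm": {"mem": {"heap_used_in_bytes": 900, "heap_max_in_bytes": 1000}}},
        "id2": {"name": "node-b",
                "jvm": {"mem": {"heap_used_in_bytes": 400, "heap_max_in_bytes": 1000}}},
    }
}
print(heap_warnings(sample))
```

Because outgrowing the heap ends in a sudden crash rather than a gradual slowdown, a check like this is only useful if it fires well before the heap is full.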

Page 12:

Understand the memory profile

Test realistically

Bound cache sizes and flush thresholds

v0.90+ gets you further, with field filters, etc.

Page 13:

Large heaps are expensive to garbage collect

Keep heap < 32GiB (But test!)

Lots of page cache is good, though!
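
The trade-off between heap and page cache can be sketched as a small sizing helper. This is an illustration under stated assumptions, not a recommendation from the slides: the 50% heap share is a common rule of thumb introduced here, the 31 GiB cap is simply one value below the 32 GiB limit, and the function name is hypothetical. As the slide says, test with realistic data.

```python
GIB = 1024 ** 3


def suggest_heap_bytes(total_ram_bytes, heap_cap_bytes=31 * GIB, heap_share=0.5):
    """Give the JVM at most `heap_share` of RAM, capped below 32 GiB.

    Whatever is not given to the heap stays available to the operating
    system's page cache. Both defaults are ASSUMPTIONS, not measured values.
    """
    return min(int(total_ram_bytes * heap_share), heap_cap_bytes)


for ram_gib in (8, 64, 128):
    heap_gib = suggest_heap_bytes(ram_gib * GIB) // GIB
    print(f"{ram_gib} GiB RAM: {heap_gib} GiB heap, "
          f"{ram_gib - heap_gib} GiB left for page cache")
```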

Page 14:

Security

Elasticsearch trusts everyone

Not its job to do auth(z)

You're the gatekeeper

Page 15:

_search

Read only?

Limit indexes / wrap with filters?

Protect the field caches
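
Limiting indexes and wrapping queries in filters could be done in a thin layer in front of `_search`. The sketch below assumes the `filtered` query syntax of the 0.90 era; the index whitelist, the `tenant_id` field, the page-size cap and the function name are all hypothetical. Only the caller's query is passed through, so facets, sorts and script fields (which are what load the field caches) are dropped.

```python
import copy

ALLOWED_INDEXES = {"products", "articles"}  # hypothetical whitelist
MAX_PAGE_SIZE = 100                         # hypothetical cap


def restrict_search(index, body, tenant_id):
    """Reject unknown indexes and wrap the caller's query in a mandatory filter."""
    if index not in ALLOWED_INDEXES:
        raise PermissionError(f"index not allowed: {index}")
    user_query = copy.deepcopy(body.get("query", {"match_all": {}}))
    return {
        "query": {
            "filtered": {
                "query": user_query,
                # The filter is added by the gatekeeper, never by the caller.
                "filter": {"term": {"tenant_id": tenant_id}},
            }
        },
        # Everything except the query and a capped size is discarded.
        "size": min(int(body.get("size", 10)), MAX_PAGE_SIZE),
    }


request = {"query": {"match": {"title": "shoes"}}, "size": 5000,
           "facets": {"tags": {"terms": {"field": "tags"}}}}
print(restrict_search("products", request, "tenant-1"))
```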

Page 16:

Arbitrary code execution

Elasticsearch has powerful scripting

Not sandboxed

On by default
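
If untrusted request bodies must be accepted at all, a gatekeeper can at least refuse anything that looks like a script. The following is a heuristic sketch, not a complete defence: it assumes scripts are introduced by keys named `script`, or starting with `script_` or ending in `_script` (for example `script_fields` or a `_script` sort), and the function name is hypothetical. Substring matching is avoided deliberately, since an ordinary field name such as `description` contains the letters "script".

```python
def contains_script(node):
    """Recursively check a parsed JSON body for script-like keys."""
    if isinstance(node, dict):
        for key, value in node.items():
            name = str(key).lower()
            if name == "script" or name.startswith("script_") or name.endswith("_script"):
                return True
            if contains_script(value):
                return True
        return False
    if isinstance(node, list):
        return any(contains_script(item) for item in node)
    return False  # strings, numbers, booleans and null cannot carry a key


print(contains_script({"query": {"term": {"description": "fast"}}}))
print(contains_script({"query": {"match_all": {}},
                       "script_fields": {"x": {"script": "1 + 1"}}}))
```

A check like this is only one layer. Running the node somewhere it cannot do damage, as the next slide suggests, still applies.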

Page 17:

Any website can reach your machine

http://127.0.0.1:9200/_search?callback=capture&source=…

Run in a virtual machine

Page 18:

Networking

Elasticsearch is distributed

Easy (for a distributed system)

Supports many usage patterns.

Page 19:

Quite common topology

High availability, right?

Page 20:

Obey or risk split brains …

… and irrecoverable data-loss
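
The rule to obey here is presumably the majority requirement from the closing summary ("Have a majority of nodes"). The arithmetic behind it can be sketched as follows. The function name is hypothetical, and mapping the result to a setting such as `discovery.zen.minimum_master_nodes` is an assumption about the Elasticsearch versions of that era.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Smallest node count that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (2, 3, 5):
    quorum = minimum_master_nodes(n)
    print(f"{n} nodes: quorum {quorum}, tolerates {n - quorum} node failure(s)")
```

With two nodes the quorum is two, so losing either node stops the cluster from electing a master. A third node is the smallest setup that can lose one node and still hold a majority.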

Page 22:

Stormy clouds

Zone vs instance failure

Thundering herds

Optimizing MTTR is not HA

Page 23:

Client considerations

Idempotent/retry-able requests

Use a connection pool.

_bulk / _msearch
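
These client-side points can be sketched together: build a `_bulk` body in which every action carries an explicit `_id`, so that resending the same body overwrites the same documents instead of creating duplicates, and wrap the send in a bounded retry. This is a minimal sketch under assumptions: the helper names, the retry count and the backoff values are hypothetical, `send` stands in for whatever HTTP client and connection pool is in use, and the `_type` field reflects the bulk format of that era.

```python
import json
import random
import time


def build_bulk_body(index, doc_type, docs):
    """Serialize (doc_id, source) pairs as newline-delimited _bulk actions."""
    lines = []
    for doc_id, source in docs:
        # An explicit _id makes the index action safe to repeat.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type,
                                           "_id": doc_id}}))
        lines.append(json.dumps(source))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


def with_retries(send, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call send() until it succeeds, backing off with jitter between tries."""
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt * random.uniform(0.5, 1.5))


body = build_bulk_body("logs", "event", [("1", {"msg": "hello"}), ("2", {"msg": "world"})])
print(body, end="")
```

A caller would pass a function that posts `body` to `_bulk` as `send`. Retrying is only safe because the body is idempotent.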

Page 24:

Have enough memory

Have a majority of nodes

Don't allow arbitrary search requests

Use retryable requests