What's New in Elasticsearch 2.0?

Ryan Ernst Elastic Engineering

www.elastic.co Copyright Elastic 2015 Copying, publishing and/or distributing without written permission is strictly prohibited

About Elastic

• Founded: July 2012
• Renamed Elasticsearch → Elastic: Mar 2015
• Headquarters: Amsterdam and Mountain View, CA
• Develops Elasticsearch, Logstash, Kibana, Beats
• Provides:
  • Training (public and onsite)
  • Development and production support
  • Hosted Elasticsearch (Found)
  • Commercial plugins: Marvel, Shield, Watcher

The Elastic Stack

• Ingest: Logstash, Beats
• Store, Index, & Analyze: Elasticsearch
• User Interface: Kibana
• Plugins: Monitoring, Security, Alerting
• Hosted Service: Found (Elasticsearch as a Service)

Elasticsearch 2.0!

• Very large release
  • >2,500 pull requests
  • 469 contributors
• Four themes

Four Main Themes in 2.0

• Simplification
  • Removing, deprecating features
  • Query DSL / Doc improvements
• Security
  • Always high on customer wish lists
• Resiliency
  • Started in 1.x, but ongoing
• Features
  • Pipeline aggs
  • Compression

Theme 1: Simplification

Removed Entirely

• Rivers - use Logstash or create your own ingestion layer
• Facets - replaced by aggregations
• _shutdown API - use platform-specific services
• Support for Thrift and Memcached protocols
• Bulk UDP - use the standard bulk API, or use UDP to send documents to Logstash first

Moved to Plugins

• Delete by query
  • Problematic, not a "core" feature
• Types:
  • murmur3
  • _size
• Multicast discovery
  • Unicast was always recommended in production

Mappings

• Conflicting field mappings
• Fields cannot be referenced by short name
• Type name prefix removed
• Field names cannot contain dots
• Type names cannot start with a dot
• Types may no longer be deleted
• index_analyzer is removed
• _analyzer field is removed
• Date format changes
• ... and more ...

Conflicting Mappings

PUT my_index
{
  "mappings": {
    "type_one": {
      "properties": {
        "name": { "type": "string" }
      }
    },
    "type_two": {
      "properties": {
        "name": { "type": "string", "analyzer": "english" }
      }
    }
  }
}

What is the mapping for name? The two types define the same field differently, which leads to unexpected results. This is not allowed in Elasticsearch 2.0.
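A conflict like the one above can be spotted before upgrading by comparing field definitions across types. The following is a minimal sketch, not the check Elasticsearch itself performs: `find_mapping_conflicts` is a hypothetical helper, it only looks at top-level fields (no nested objects), and it treats any difference in the definition as a conflict.

```python
def find_mapping_conflicts(mappings):
    """Return {field: {type_name: definition}} for fields defined differently across types."""
    seen = {}
    for type_name, type_mapping in mappings.items():
        for field, definition in type_mapping.get("properties", {}).items():
            seen.setdefault(field, {})[type_name] = definition

    conflicts = {}
    for field, by_type in seen.items():
        definitions = list(by_type.values())
        # A field conflicts when any type defines it differently from the first one seen.
        if any(d != definitions[0] for d in definitions[1:]):
            conflicts[field] = by_type
    return conflicts


# The mappings from the slide: "name" is mapped differently in the two types.
mappings = {
    "type_one": {"properties": {"name": {"type": "string"}}},
    "type_two": {"properties": {"name": {"type": "string", "analyzer": "english"}}},
}

conflicts = find_mapping_conflicts(mappings)
print(sorted(conflicts))
```

For real clusters, the Migration plugin mentioned later in the deck is the better tool; this sketch only illustrates what "conflicting" means.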

Ambiguous Mappings in < 2.0

PUT my_index
{
  "mappings": {
    "name": {
      "properties": {
        "title": { "type": "string" },
        "name": {
          "properties": {
            "title": { "type": "string" }
          }
        }
      }
    }
  }
}

What does name refer to? name.title? name.name.title?

Refactored Mappings in 2.0

PUT my_index
{
  "mappings": {
    "name": {
      "properties": {
        "title": { "type": "string" },
        "name": {
          "properties": {
            "title": { "type": "string" }
          }
        }
      }
    }
  }
}

In 2.0 the two fields are referenced by their full paths: title and name.title. name.name.title is not a thing.

Analyzer Mappings

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": { "type": "string", "analyzer": "my_analyzer" }
      }
    }
  }
}

There are some changes in how field-specific analyzers are now set. This format, which sets both search and index analyzers, is still acceptable in 2.0.

Analyzer Mappings

• Before 2.0:
  • analyzer - sets index and search analyzers
  • search_analyzer - sets search analyzer
  • index_analyzer - sets index analyzer
• Starting with 2.0:
  • analyzer - sets index and search analyzers
  • search_analyzer - overrides search analyzer
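One way to read the rules above is as a rewrite of a 1.x field mapping into the 2.0 form. The sketch below is illustrative only: `migrate_field_analyzers` is a hypothetical helper, and the `default_analyzer="standard"` fallback is an assumption about what a field searched with before when neither analyzer nor search_analyzer was set.

```python
def migrate_field_analyzers(field, default_analyzer="standard"):
    """Return a copy of a 1.x field mapping that uses only the 2.0 analyzer settings."""
    migrated = dict(field)
    index_analyzer = migrated.pop("index_analyzer", None)
    if index_analyzer is None:
        # Nothing to do: analyzer / search_analyzer keep their meaning in 2.0.
        return migrated

    # Before 2.0 the search side came from search_analyzer, else analyzer,
    # else (assumption) the default analyzer.
    search = field.get("search_analyzer", field.get("analyzer", default_analyzer))

    # In 2.0, analyzer sets both sides and search_analyzer overrides the search side.
    migrated["analyzer"] = index_analyzer
    if search != index_analyzer:
        migrated["search_analyzer"] = search
    else:
        migrated.pop("search_analyzer", None)
    return migrated


old_field = {"type": "string", "index_analyzer": "my_ngrams", "search_analyzer": "standard"}
new_field = migrate_field_analyzers(old_field)
print(new_field)
```

The analyzer names here (`my_ngrams`) are made up for the example.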

Query and Filter Execution Changes

• Before 2.0
  • Queries:
    • Typically contribute to scoring
    • No caching
  • Filters:
    • Don't contribute to scoring
    • Can be cached

Query and Filter Execution Changes

Before 2.0:

{
  "filtered": {
    "query": { query definition },
    "filter": { filter definition }
  }
}

After 2.0:

{
  "bool": {
    "must": { query definition },
    "must_not": { query definition },
    "should": { query definition },
    "filter": { filter definition }
  }
}
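The move from filtered to bool is mechanical for the simple case: the scoring part goes under must and the non-scoring part under filter. A minimal sketch, assuming a top-level filtered query with at most one query and one filter (`filtered_to_bool` is a hypothetical helper, and the field names in the example are made up):

```python
def filtered_to_bool(query):
    """Rewrite a top-level {"filtered": ...} query as an equivalent {"bool": ...} query."""
    if "filtered" not in query:
        return query  # already in another form; leave untouched

    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]     # contributes to scoring
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]  # does not contribute to scoring
    return {"bool": bool_query}


old_query = {
    "filtered": {
        "query": {"match": {"title": "elasticsearch"}},
        "filter": {"term": {"status": "published"}},
    }
}
new_query = filtered_to_bool(old_query)
print(new_query)
```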

Query and Filter Execution Changes

Two-Phase Query Execution

• Approximation phase
  • Quickly iterates over a superset of the matching documents
• Verification phase
  • Checks whether a document in this superset actually matches the query

Query and Filter Execution Changes

Two-Phase Query Execution Example

{
  "bool": {
    "must": [
      { "match_phrase": { "body": "quick fox" } },
      { "match_phrase": { "body": "brown dog" } }
    ]
  }
}

• Approximation phase
  • All docs with "quick", "fox", "brown", and "dog"
• Verification phase
  • Actual phrase matching
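The two phases can be imitated on a handful of in-memory documents. This is a toy sketch of the idea only, not how Lucene implements it: `approximate` and `verify` are hypothetical helpers, tokenization is a plain whitespace split, and the documents are invented.

```python
def approximate(docs, phrases):
    """Phase 1: ids of documents containing every term of every phrase (a superset of matches)."""
    terms = {term for phrase in phrases for term in phrase.split()}
    return [doc_id for doc_id, text in docs.items() if terms <= set(text.split())]


def verify(text, phrases):
    """Phase 2: check that each phrase really occurs as consecutive terms."""
    tokens = text.split()

    def has_phrase(phrase):
        words = phrase.split()
        return any(
            tokens[i:i + len(words)] == words
            for i in range(len(tokens) - len(words) + 1)
        )

    return all(has_phrase(phrase) for phrase in phrases)


docs = {
    1: "the quick fox met the brown dog",
    2: "the quick brown fox and the lazy dog",
    3: "a quick fox",
}
phrases = ["quick fox", "brown dog"]

candidates = approximate(docs, phrases)  # cheap: term presence only
matches = [doc_id for doc_id in candidates if verify(docs[doc_id], phrases)]
print(candidates, matches)
```

Document 2 contains all four terms, so it survives the approximation phase, but neither phrase occurs in it, so the more expensive verification phase rejects it.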

Query and Filter Execution Changes

Query Caching

• Fully automatic
• Keeps track of the 256 most recently used queries
• Only caches those that appear 5 times or more
• Does not cache on segments that have fewer than 10,000 documents or less than 3% of the documents of the index
• More efficient query cache (roaring bitmaps)
• Non-scoring components are cacheable
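The caching rules listed for 2.0 amount to a small decision procedure. The sketch below models only the numbers from the slide (256, 5, 10,000, 3%); `ToyCachingPolicy` is a hypothetical class for illustration and makes no claim about how the real cache tracks usage internally.

```python
from collections import deque

HISTORY_SIZE = 256         # most recently used queries that are tracked
MIN_FREQUENCY = 5          # occurrences in the history before a query is cached
MIN_SEGMENT_DOCS = 10_000  # smaller segments are not cached
MIN_SEGMENT_RATIO = 0.03   # segments below 3% of the index are not cached


class ToyCachingPolicy:
    def __init__(self):
        # Old entries fall off automatically once 256 newer ones arrive.
        self.history = deque(maxlen=HISTORY_SIZE)

    def on_use(self, query_key):
        """Record one use of a query (identified by any hashable key)."""
        self.history.append(query_key)

    def should_cache(self, query_key, segment_docs, index_docs):
        """Decide whether to cache this query on a segment of the given size."""
        if segment_docs < MIN_SEGMENT_DOCS:
            return False
        if segment_docs < MIN_SEGMENT_RATIO * index_docs:
            return False
        return self.history.count(query_key) >= MIN_FREQUENCY


policy = ToyCachingPolicy()
for _ in range(5):
    policy.on_use("status:published")

print(policy.should_cache("status:published", segment_docs=50_000, index_docs=100_000))
```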

Theme 2: Security

Security Enhancements

• Elasticsearch now binds to local interfaces ONLY
• Unicast discovery is now the default
• Makes Elasticsearch more secure by default
• Protects Elasticsearch in the wild (don't do that!)
• Security Manager
  • Prevents access to resources outside of Elasticsearch even if the Elasticsearch process is compromised
  • All resources that Elasticsearch can access are defined on node startup

Plugins

• Isolated from each other (separate class loaders)
• Extendable security policy (2.2)
  • Warns user on install when any additional permissions are requested
• Shared setup to allow common build and test
  • Maven parent POM in 2.x
  • Gradle plugin in 3.x
• Plugin descriptor
  • Contains version of Elasticsearch built against

Theme 3: Resiliency

Durability of Transaction Log

• Before 2.0, the transaction log was fsynced every 5 sec
• The transaction log is now fsynced after each operation
  • Configurable
  • On SSDs, bulk indexing is about 7% - 10% slower than with async translog flushes

Index operations are now durable by default!
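For anyone who prefers the old trade-off, the "configurable" part refers to index-level translog settings. The sketch below builds a settings body using `index.translog.durability` and `index.translog.sync_interval`; the helper name is hypothetical, and whether a given setting can be changed on a live index should be checked against the documentation for the version in use.

```python
import json


def translog_settings(durability="request", sync_interval=None):
    """Build an index settings body controlling how the translog is fsynced."""
    if durability not in ("request", "async"):
        raise ValueError("durability must be 'request' or 'async'")
    settings = {"index.translog.durability": durability}
    if sync_interval is not None:
        # Only relevant for async: how often the translog is fsynced in the background.
        settings["index.translog.sync_interval"] = sync_interval
    return settings


# Roughly the pre-2.0 behaviour: fsync in the background every 5 seconds.
body = translog_settings("async", "5s")
print(json.dumps(body, indent=2))
```

With `async`, operations acknowledged since the last fsync can be lost if the node crashes, which is the risk the new default removes.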

Multiple data path striping

Take advantage of striping in path.data configuration:

Multiple data path striping

Before Elasticsearch 2.0:

Multiple data path striping

Now safer in Elasticsearch 2.0!

Cluster State Diffs

• Before 2.0, the entire cluster state was shipped to every node on every change
• Starting with 2.0, only changes are sent
  • This can be a massive improvement on clusters with large cluster states!
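The saving from sending diffs is easy to see with a toy model in which the cluster state is a flat dictionary. This is an illustration of the general idea only: `state_diff` and `apply_diff` are hypothetical helpers, and the real cluster state is a nested structure with its own diff format.

```python
def state_diff(old, new):
    """Describe how to get from `old` to `new` using only the entries that changed."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}


def apply_diff(old, diff):
    """Rebuild the new state on a node that already holds `old`."""
    state = {k: v for k, v in old.items() if k not in diff["removed"]}
    state.update(diff["changed"])
    return state


# A state with many indices where a single setting changes.
old_state = {f"index_{i}": {"replicas": 1} for i in range(1000)}
new_state = dict(old_state)
new_state["index_42"] = {"replicas": 2}

diff = state_diff(old_state, new_state)
print(len(new_state), len(diff["changed"]), len(diff["removed"]))
```

Only one entry travels over the wire instead of a thousand, and the receiving node can still reconstruct the full state.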

Non-Ambiguous Setting Units

curl -XPUT "localhost:9200/test/_settings" -d '{
  "index": {
    "refresh_interval": "5"
  }
}'

5 what??

Settings now require units (when appropriate)
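A client can catch the unit-less value before it ever reaches the cluster. The sketch below is a simple guard, not Elasticsearch's own parser: the accepted suffixes (ms, s, m, h, d) are an assumed subset chosen for illustration, and both helper names are hypothetical.

```python
import re

# Assumed subset of time-unit suffixes, for illustration only.
TIME_VALUE = re.compile(r"^\d+(ms|s|m|h|d)$")


def has_time_unit(value):
    """True if the value is a whole number followed by one of the assumed unit suffixes."""
    return bool(TIME_VALUE.match(str(value)))


def refresh_interval_settings(value):
    """Build the settings body from the slide, refusing ambiguous values like "5"."""
    if not has_time_unit(value):
        raise ValueError(f"refresh_interval {value!r} has no time unit, e.g. '5s'")
    return {"index": {"refresh_interval": value}}


print(has_time_unit("5"), has_time_unit("5s"))
print(refresh_interval_settings("5s"))
```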

Doc Values by Default

• Fielddata was a common culprit in OOMs
• Doc Values: Lucene data structure (disk-based)
• Dramatic heap memory reduction by default
  • Values for sorting, aggs, etc. are moved onto disk
  • Let the OS deal with it!
• Indexed, not_analyzed fields now use doc values
  • Only for indices created with 2.0
  • Reindex required for older data

Previous Resiliency Improvements

• Sync-flush (1.6)
• Async shard allocation (1.6)
• Delayed allocation (1.7)
  • Better handling of nodes leaving/rejoining
• Resiliency page contains the latest information:
  • https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html

Theme 4: Features

Pipeline Aggregations

• Derivatives
• Moving average
• Holt-Winters (prediction / anomaly detection)
• Stats: min/max/avg
• Time-series math
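Pipeline aggregations work on the output of other aggregations rather than on documents, which is what `buckets_path` expresses. A minimal sketch of a request body with a derivative and a moving average over a monthly sum follows; the function name, the aggregation names, and the `date` and `price` fields are assumptions made for the example.

```python
import json


def monthly_sales_with_pipelines(date_field="date", value_field="price"):
    """Search body: monthly sum, plus its derivative and moving average as pipeline aggs."""
    return {
        "size": 0,  # only the aggregation results are of interest
        "aggs": {
            "sales_per_month": {
                "date_histogram": {"field": date_field, "interval": "month"},
                "aggs": {
                    # Regular metric aggregation computed per bucket.
                    "sales": {"sum": {"field": value_field}},
                    # Pipeline aggregations that read the "sales" metric of each bucket.
                    "sales_deriv": {"derivative": {"buckets_path": "sales"}},
                    "sales_moving_avg": {"moving_avg": {"buckets_path": "sales"}},
                },
            }
        },
    }


body = monthly_sales_with_pipelines()
print(json.dumps(body, indent=2))
```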

Index Compression

• 10-30% reduction in index size
• Some indexing/merging impact
• Dynamic setting - could be set before optimization for time-based indices

Upgrading to Elasticsearch 2.0

• Major version upgrade!!!
  • No rolling upgrades
  • One way - no way to downgrade back to 1.x
• Take a snapshot (and test restore) before proceeding
• Test! Test! Test!
• Use the Migration plugin
  • Site plugin for 1.x that checks for potential issues
  • https://github.com/elastic/elasticsearch-migration

Thank you!