Transit from SQL to Elastic Search - Meetupfiles.meetup.com/19156515/ElasticSearch_Session1.pdf ·...

Transit from SQL to Elastic Search By Manjunathan Raman

WHY ELASTIC SEARCH

Highly scalable open-source full-text search and analytics engine

Product Search, Log Analysis (ELK), Search As you Type, Did you mean

Usability

Schema-less

Default Lucene Standard Analyzer

Fuzzy, facets/aggregations, histogram, filter cache, date range, geo distance, boost, doc_values, paging

Being fast

Relevance tuning

Percolate search


Document Mapping

PUT my_index
{
  "mappings": {
    "user": {
      "_all": { "enabled": false },
      "properties": {
        "title": { "type": "string" },
        "name": { "type": "string" },
        "age": { "type": "integer" }
      }
    },
    "blogpost": {
      "properties": {
        "title": { "type": "string" },
        "body": { "type": "string" },
        "user_id": { "type": "string", "index": "not_analyzed" },
        "created": { "type": "date", "format": "strict_date_optional_time||epoch_millis" }
      }
    }
  }
}


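Field Types

A simple type like string, date, long, double, boolean or ip.

A type which supports the hierarchical nature of JSON, such as object or nested.

Or a specialised type like geo_point, geo_shape, or completion.

The "index" option of a string field can also be set to "not_analyzed" (index the exact value as a single term, without analysis) or "no" (do not index the field, so it is not searchable).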
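As a rough illustration of the three "index" settings, the sketch below shows which terms would end up in the index for the same string value. The `simple_analyze` helper is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters; it is not the real Lucene Standard Analyzer.

```python
import re


def simple_analyze(text):
    """Hypothetical stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def index_terms(value, index="analyzed"):
    """Return the terms that would be stored for a string field under each setting."""
    if index == "analyzed":
        return simple_analyze(value)   # several lowercased terms
    if index == "not_analyzed":
        return [value]                 # the exact value as one term
    if index == "no":
        return []                      # nothing indexed, so nothing to search
    raise ValueError("unknown index setting: %r" % index)


for setting in ("analyzed", "not_analyzed", "no"):
    print(setting, index_terms("Quick Brown", setting))
```

Under these assumptions, an exact-term lookup for "Quick Brown" only succeeds against the not_analyzed form, which is why identifier-like fields such as user_id are usually mapped that way.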


Analyzer


Inverted Index


Query Related

Index

curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}'

Get

curl -XGET 'http://localhost:9200/twitter/tweet/1'

Delete

curl -XDELETE 'http://localhost:9200/twitter/tweet/1'


Update

curl -XPUT 'localhost:9200/test/type1/1' -d '{
  "counter" : 1,
  "tags" : ["red"]
}'

Multi Get

curl 'localhost:9200/_mget' -d '{
  "docs" : [
    { "_index" : "test", "_type" : "type", "_id" : "1" },
    { "_index" : "test", "_type" : "type", "_id" : "2" }
  ]
}'

Bulk API

The bulk API makes it possible to perform many index/delete operations in a single API call.
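A minimal sketch of what a bulk request body looks like, assuming the newline-delimited JSON format used by the `_bulk` endpoint: one action line per operation, followed by a source line for index operations, with a trailing newline at the end. The `build_bulk_body` helper and the sample documents are hypothetical; nothing is sent to a cluster here.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into newline-delimited JSON."""
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":          # delete operations carry no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"      # the body must end with a newline


operations = [
    ("index", {"_index": "test", "_type": "type1", "_id": "1"},
     {"counter": 1, "tags": ["red"]}),
    ("delete", {"_index": "test", "_type": "type1", "_id": "2"}, None),
]
body = build_bulk_body(operations)
print(body, end="")
```

The resulting string could then be posted to `localhost:9200/_bulk` in the same way as the curl examples above.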


Query

{
  "fields" : ["tpnb", "name"],
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : [
            { "term" : { "store_price.store" : 2396 } },
            { "term" : { "tpnb" : 50006459 } },
            { "term" : { "price" : "1.65" } }
          ]
        }
      }
    }
  }
}
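For readers coming from SQL, the query above reads roughly like `SELECT tpnb, name ... WHERE store_price.store = 2396 AND tpnb = 50006459 AND price = '1.65'`. The sketch below builds the same body from a dictionary of equality conditions. The `and_terms_query` helper is hypothetical, and it keeps the `filtered` and `fields` forms exactly as they appear on the slide, which belong to older Elasticsearch releases.

```python
import json


def and_terms_query(conditions, fields):
    """Build a bool/must filter of term clauses, one per (field, value) pair."""
    must = [{"term": {field: value}} for field, value in conditions.items()]
    return {
        "fields": fields,
        "query": {"filtered": {"filter": {"bool": {"must": must}}}},
    }


body = and_terms_query(
    {"store_price.store": 2396, "tpnb": 50006459, "price": "1.65"},
    ["tpnb", "name"],
)
print(json.dumps(body, indent=2))
```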


{
  "query" : {
    "nested" : {
      "path" : "stores",
      "query" : {
        "filtered" : {
          "filter" : {
            "bool" : {
              "must" : [
                { "term" : { "stores.store" : 2396 } },
                { "term" : { "stores.price" : "1.73" } }
              ]
            }
          }
        }
      }
    }
  },
  "sort" : [
    {
      "stores.availability" : {
        "order" : "asc",
        "mode" : "min",
        "nested_filter" : {
          "bool" : {
            "must" : [
              { "term" : { "stores.store" : 2396 } },
              { "term" : { "stores.price" : "1.73" } }
            ]
          }
        }
      }
    },
    "popularity"
  ],
  "fields" : ["tpnb"],
  "aggs" : {
    "storesprice" : {
      "nested" : { "path" : "stores" },
      "aggs" : {
        "min_price" : { "min" : { "field" : "stores.price" } }
      }
    }
  }
}
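The reason for wrapping the conditions in a nested query can be sketched without a cluster. With plain object arrays, each field is flattened into an independent list of values, so two conditions may be satisfied by two different stores; with the nested type, both conditions must hold inside the same stores entry. The sample product and both helper functions below are made up for illustration.

```python
product = {
    "tpnb": 50006459,
    "stores": [
        {"store": 2396, "price": "1.65"},
        {"store": 2400, "price": "1.73"},
    ],
}


def flattened_match(doc, store, price):
    """Plain object array: fields are flattened, so matches can cross entries."""
    stores = [s["store"] for s in doc["stores"]]
    prices = [s["price"] for s in doc["stores"]]
    return store in stores and price in prices


def nested_match(doc, store, price):
    """Nested type: both conditions must hold within one stores entry."""
    return any(s["store"] == store and s["price"] == price for s in doc["stores"])


print(flattened_match(product, 2396, "1.73"))  # True: store and price come from different entries
print(nested_match(product, 2396, "1.73"))     # False: no single entry has both
```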


Distributed Storage

An index should be sharded in proportion to its anticipated growth. As more nodes are added to an Elasticsearch cluster, it does a good job of reallocating and moving shards around. As such, Elasticsearch is very easy to scale out.

Each shard contains multiple "segments", where a segment is an inverted index.

While you are indexing documents, Elasticsearch collects them in memory.

Then, every second or so, it writes a new small segment to disk and "refreshes" the search so that the new documents become visible.
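The buffer, segment and refresh cycle described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: the `ToyShard` class is hypothetical, terms are produced by lowercasing and splitting on whitespace, and deletes, segment merging and durability are ignored entirely.

```python
from collections import defaultdict


class ToyShard:
    def __init__(self):
        self.buffer = []     # documents indexed since the last refresh
        self.segments = []   # each segment is a small inverted index: term -> set of doc ids

    def index(self, doc_id, text):
        """Collect the document in memory; it is not searchable yet."""
        self.buffer.append((doc_id, text))

    def refresh(self):
        """Turn the buffered documents into a new immutable segment."""
        if not self.buffer:
            return
        segment = defaultdict(set)
        for doc_id, text in self.buffer:
            for term in text.lower().split():
                segment[term].add(doc_id)
        self.segments.append(dict(segment))
        self.buffer = []

    def search(self, term):
        """Look the term up in every segment and union the hits."""
        hits = set()
        for segment in self.segments:
            hits |= segment.get(term.lower(), set())
        return hits


shard = ToyShard()
shard.index(1, "trying out Elasticsearch")
print(shard.search("elasticsearch"))  # set(): not refreshed yet
shard.refresh()
print(shard.search("elasticsearch"))  # {1}
```

The gap between the two lookups is the "near real time" behaviour: a document is stored as soon as it is indexed, but only becomes searchable after the next refresh.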


Key Points

Write once

Query then fetch

Query vs. filter, and term vs. match

Nested, Inner Hits

Concurrency Control by document version

You can also specify the consistency level of index-operations, in terms of how many replicas must acknowledge the operation before returning

Elasticsearch has a concept of "query time" joining with parent/child-relations, and "index time" joining with nested types.

Index Vs Type

Alias – Filtered, Routing, multiple indices (array or pattern)

.scripts (Groovy) - evaluate custom expressions, or use a function score query

Mustache Template

PostMan, Head plugin
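The "Concurrency Control by document version" point in the list above amounts to optimistic locking: a writer states which version it read, and the write is rejected if the document has changed since. The in-memory `ToyStore` and `VersionConflict` names below are hypothetical and only illustrate the idea; they are not an Elasticsearch client API.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class ToyStore:
    def __init__(self):
        self.docs = {}  # doc id -> (version, source)

    def get(self, doc_id):
        return self.docs[doc_id]

    def index(self, doc_id, source, expected_version=None):
        """Write the document, bumping its version; reject stale writers."""
        current_version = self.docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                "expected version %s, current version is %s"
                % (expected_version, current_version)
            )
        self.docs[doc_id] = (current_version + 1, source)
        return current_version + 1


store = ToyStore()
store.index("1", {"counter": 1})                      # creates version 1
store.index("1", {"counter": 2}, expected_version=1)  # succeeds, now version 2
try:
    store.index("1", {"counter": 5}, expected_version=1)  # second writer is stale
except VersionConflict as exc:
    print("conflict:", exc)
```

A writer that hits the conflict would typically re-read the document, reapply its change and retry.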


Elastic Search - Limits

Architect your application with the right set of databases

Relational No SQL – Graph Oriented Database

Non-relational > Denormalized > Document Oriented

Isolation > Transaction > ACID > Distributed Transactions

Nearly Real Time

Robust – costly query – cancel

Split Brain – Data Loss

Security


Different Models - Legacy


Different Model – with Elasticsearch


Java Clients for Elastic Search

The Native Client – the application's node client joins the Elasticsearch cluster and knows the cluster state, so it needs fewer hops to get a document from a specific shard on a node.

Jest – lightweight client that uses the Elasticsearch REST API

Spring Data Elasticsearch – has a similar feel to other Spring Data projects; one step further than Jest, you can annotate data objects with @Id, @Field, @Document


References

https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

https://www.elastic.co/blog/found-elasticsearch-as-nosql

https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up

https://qbox.io/blog/optimizing-search-results-in-elasticsearch-with-scoring-and-boosting

https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch


http://stackoverflow.com/questions/15426441/understanding-segments-in-elasticsearch


Quiz

What query language does Elasticsearch use?

A. SQL  B. Query DSL  C. Query SSL  D. ElasticClient

What filter can be used to combine multiple filters?

A. Term  B. Range  C. Exists  D. Bool

"Synonym" is an example of:

A. Tokenizer  B. Analyzer  C. Token Filter  D. Character Filter


Quiz Answers

B. Query DSL - the Query DSL is Elasticsearch's native, JSON-based domain-specific query language

D. Bool, which combines multiple filters using must/should

C. Token filter; a synonym token filter maps tokens to their configured synonyms

E.g. "synonyms": ["british,english", "hen,chicken"]
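To show where a synonym token filter sits relative to the other quiz options, here is a minimal sketch of an analysis chain: a character filter, then a tokenizer, then token filters. All function names are hypothetical and the regular expressions are crude stand-ins for the real components. Each comma-separated rule is treated as a group of equivalent terms, and the flat output list ignores token positions.

```python
import re

SYNONYM_RULES = ["british,english", "hen,chicken"]


def build_synonym_map(rules):
    """Map every word in a comma-separated rule to the whole group."""
    mapping = {}
    for rule in rules:
        group = [word.strip() for word in rule.split(",")]
        for word in group:
            mapping[word] = group
    return mapping


def analyze(text, synonym_map):
    text = re.sub(r"<[^>]+>", " ", text)   # character filter: strip HTML-like tags
    tokens = re.findall(r"\w+", text)      # tokenizer: split into word tokens
    tokens = [t.lower() for t in tokens]   # token filter: lowercase
    expanded = []
    for token in tokens:                   # token filter: expand synonyms
        expanded.extend(synonym_map.get(token, [token]))
    return expanded


print(analyze("<b>British</b> hen", build_synonym_map(SYNONYM_RULES)))
# ['british', 'english', 'hen', 'chicken']
```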