Elasticsearch in Zalando

Elasticsearch Meetup Alaa Elhadba [email protected]

Page 1: Elasticsearch in Zalando

Elasticsearch Meetup

Alaa Elhadba

Page 2: Elasticsearch in Zalando

CAP: Consistency (C), Availability (A), Partition tolerance (P)

Page 3: Elasticsearch in Zalando

REST

Distributed

Scalable

Queryable

Page 4: Elasticsearch in Zalando

Search Engine

Page 5: Elasticsearch in Zalando
Page 6: Elasticsearch in Zalando

Red basketball tshirt...

Page 7: Elasticsearch in Zalando

Search Engine

Red basketball tshirt...

Page 13: Elasticsearch in Zalando

Search Engine

Page 19: Elasticsearch in Zalando
Page 20: Elasticsearch in Zalando
Page 21: Elasticsearch in Zalando

Search Engine

Red basketball tshirt...

Page 23: Elasticsearch in Zalando
Page 24: Elasticsearch in Zalando

Feature Extraction

Page 25: Elasticsearch in Zalando

Extracting features

It’s all about... TOKENS

This parka is crafted from sturdy cotton in classic army green and comes with a removable wool gilet that we've printed in leopard and can be worn inside or as an outer. The parka features all the essentials: a quilted hood, a drawstring waist, a fishtail, and utilitarian pockets. Detailed with silky padded sleeves for extra warmth and superb comfort.

?

Page 26: Elasticsearch in Zalando

Analyzer: Char Filters -> Tokenizer -> Token Filters

Page 27: Elasticsearch in Zalando

Mapping char filter

& => and, ph => f
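
As a rough illustration of what the mapping char filter does before tokenization, the two rules on this slide can be simulated in plain Python. This is a simplified sketch, not Elasticsearch's implementation: the function name and the sample sentence are made up for the example, and rules are applied one after another instead of in a single longest-match pass.

```python
# Simplified simulation of a mapping char filter (not the Elasticsearch code).
# The rules come from the slide; the helper name and sample text are hypothetical.
MAPPINGS = {"&": "and", "ph": "f"}


def apply_mapping_char_filter(text, mappings):
    """Replace every occurrence of each source string with its target."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text


filtered = apply_mapping_char_filter("phone & photo", MAPPINGS)
print(filtered)  # fone and foto
```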

Page 28: Elasticsearch in Zalando

HTML strip filter

<b> Elasticsearch </b> -> Elasticsearch

Page 29: Elasticsearch in Zalando

White space tokenizer

Elasticsearch is an awesome technology

Page 30: Elasticsearch in Zalando

Pattern Tokenizer

Pattern: [^\\w]+

Input: foo,bar baz

Output: “foo”, “bar”, “baz”
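
The pattern tokenizer example can be reproduced with Python's `re` module. This is only a sketch of the idea (split on every run of non-word characters); the helper name is hypothetical, and the double backslash on the slide is assumed to be JSON escaping of the single-backslash regex used here.

```python
import re


def pattern_tokenize(text, pattern=r"[^\w]+"):
    """Split text on the separator pattern and drop empty tokens."""
    return [token for token in re.split(pattern, text) if token]


tokens = pattern_tokenize("foo,bar baz")
print(tokens)  # ['foo', 'bar', 'baz']
```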

Page 31: Elasticsearch in Zalando

Stemmer Token Filter

Playing, Played, Player => play

Page 32: Elasticsearch in Zalando

Shingle Token Filter

Please divide this sentence into shingles

"please divide", "divide this", "this sentence", "sentence into", "into shingles"
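
The two-word shingles on this slide can be generated with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it lowercases and splits on whitespace, and it emits only the two-word shingles, whereas the real filter can also emit single-word tokens depending on its settings.

```python
def shingles(text, size=2):
    """Return all runs of `size` consecutive lowercase words."""
    words = text.lower().split()
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


result = shingles("Please divide this sentence into shingles")
for shingle in result:
    print(shingle)
```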

Page 33: Elasticsearch in Zalando

Stop Token Filter

a, about, above, after, again, against, all, am, an, and, any, are, aren't, as, at, be

Page 34: Elasticsearch in Zalando

Synonyms Token Filter

- america, usa
- british, english
- blue, duke blue, jade blue
- cuisine, food

Page 36: Elasticsearch in Zalando

<p> in the U.S.A. anyone can become president. that’s the problem </p>

Analyzer

Char Filters: HTML strip filter

Tokenizer: White space

Token Filters: Stop, Stemmer, Synonyms
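
A pipeline like the one on this slide might be declared as a custom analyzer in the index settings. The sketch below builds such a settings body as a Python dict. The analyzer and filter names (`product_text`, `english_stop`, `english_stemmer`, `product_synonyms`) are hypothetical, the `lowercase` filter is an addition not shown on the slide, and the single synonym rule is borrowed from the synonyms slide. No claim is made that this exact configuration reproduces the token output shown in the talk.

```python
import json

# Hypothetical names throughout; only the component types (html_strip,
# whitespace, stop, stemmer, synonym) come from the slide.
ANALYSIS_SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "english_stop": {"type": "stop", "stopwords": "_english_"},
                "english_stemmer": {"type": "stemmer", "language": "english"},
                "product_synonyms": {
                    "type": "synonym",
                    "synonyms": ["america, usa"],
                },
            },
            "analyzer": {
                "product_text": {
                    "type": "custom",
                    "char_filter": ["html_strip"],
                    "tokenizer": "whitespace",
                    # "lowercase" is an assumption added for this sketch.
                    "filter": [
                        "lowercase",
                        "english_stop",
                        "english_stemmer",
                        "product_synonyms",
                    ],
                }
            },
        }
    }
}

print(json.dumps(ANALYSIS_SETTINGS, indent=2))
```

A body like this would be sent when creating the index; the filter order in the list is the order in which tokens pass through the filters.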

Page 37: Elasticsearch in Zalando
Page 38: Elasticsearch in Zalando

Input: <p> in the U.S.A. anyone can become president. that’s the problem </p>

Output tokens: { “america”, “anyone”, “become”, “president”, “problem” }

Page 39: Elasticsearch in Zalando

Extracting features

It’s all about... TOKENS

fishtail, utilitarian pockets, parka, military, army green, army, green, wool, hoodie, silky sleeves, 100% cotton, winter, leopard, jacket, coat, winter, coat, tiger, warm, casual, hiking, …

This parka is crafted from sturdy cotton in classic army green and comes with a removable wool gilet that we've printed in leopard and can be worn inside or as an outer. The parka features all the essentials: a quilted hood, a drawstring waist, a fishtail, and utilitarian pockets. Detailed with silky padded sleeves for extra warmth and superb comfort.

Page 40: Elasticsearch in Zalando

Design for user expectations

Acronyms: “I.B.M”, “Wi-Fi”, “U.S.A”, “IT”, “AFAIK”, “LOL”

Telephone Numbers: (+49) 152-02434977, (0049)15202434977, 015202434977, 1(800)867-5209

Names: “John Smith”, “John A. Smith”, “John Adam Smith”, “John S.”
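
To make the telephone-number case concrete, here is a small sketch of how the three German variants above could be normalized to one searchable form before indexing and at query time. The function name and the default country code of 49 are assumptions for this example; a production system would more likely rely on a dedicated phone-number library.

```python
import re


def normalize_phone(raw, default_country_code="49"):
    """Reduce a phone number to digits with a leading country code.

    Assumption for this sketch: numbers starting with a single 0 are
    national numbers belonging to `default_country_code`.
    """
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if digits.startswith("00"):      # international prefix, e.g. 0049...
        return digits[2:]
    if digits.startswith("0"):       # national format, e.g. 0152...
        return default_country_code + digits[1:]
    return digits                    # already starts with a country code


variants = ["(+49) 152-02434977", "(0049)15202434977", "015202434977"]
normalized = {normalize_phone(v) for v in variants}
print(normalized)
```

All three spellings collapse to the same digit string, so a user typing any of them would match the same indexed value.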

Page 41: Elasticsearch in Zalando

Tailored analysis per field
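
Per-field analysis is typically expressed in the index mapping, where each text field names its own analyzer. The sketch below shows one possible mapping body as a Python dict. The field names (`name`, `description`, `sku`) are hypothetical; `standard` and `english` are built-in Elasticsearch analyzers, and `keyword` fields are indexed without analysis.

```python
import json

# Hypothetical product fields, each analyzed differently.
PRODUCT_MAPPING = {
    "mappings": {
        "properties": {
            # Short titles: plain word splitting and lowercasing.
            "name": {"type": "text", "analyzer": "standard"},
            # Long free text: stop-word removal and stemming.
            "description": {"type": "text", "analyzer": "english"},
            # Identifiers: exact match only, no analysis.
            "sku": {"type": "keyword"},
        }
    }
}

print(json.dumps(PRODUCT_MAPPING, indent=2))
```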

Page 42: Elasticsearch in Zalando

The Art of Ranking

Page 43: Elasticsearch in Zalando

Ranking

● Filtering

● Boosting

● Scoring

Page 44: Elasticsearch in Zalando

Ranking

User Query: white sneakers

Color, Category

Page 45: Elasticsearch in Zalando

Ranking

User Query

Color, Category, Recency, Availability, Location, Business Value

Page 48: Elasticsearch in Zalando

Ranking

User Query

Color, Category, Recency, Availability, Location, Business Value

Boosting, Boosting, Score Func., Filtering, Score Func., Filtering
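
One way filtering and boosting could be combined for a query such as "white sneakers" is a `bool` query: the text match is required, a filter removes documents without affecting the score, and `should` clauses add score for matching signals. The sketch below is an assumption about how this might look, not the query used at Zalando. The field names (`name`, `available`, `color`, `category`) and the boost values are hypothetical.

```python
import json


def build_ranked_query(text, color, category):
    """Build a bool query body with one filter and two boosted signals."""
    return {
        "query": {
            "bool": {
                # Required full-text match on the user query.
                "must": [{"match": {"name": text}}],
                # Filtering: excludes documents, contributes no score.
                "filter": [{"term": {"available": True}}],
                # Boosting: optional clauses that raise the score on match.
                "should": [
                    {"term": {"color": {"value": color, "boost": 2.0}}},
                    {"term": {"category": {"value": category, "boost": 3.0}}},
                ],
            }
        }
    }


query = build_ranked_query("white sneakers", "white", "sneakers")
print(json.dumps(query, indent=2))
```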

Page 50: Elasticsearch in Zalando

Boosting

Adding Scores: Total score = Base score + Additive Score

Multiplying Scores: Total score = Base score X Multiplicative Score
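
The difference between the two formulas can matter for the final order of results. The toy example below uses made-up numbers: document A has the higher base score, and only document B receives a boost of 1.5. All names and values are hypothetical.

```python
def rank(base_scores, boosts, combine, neutral):
    """Combine base scores with boosts and return doc ids, best first."""
    totals = {
        doc: combine(score, boosts.get(doc, neutral))
        for doc, score in base_scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


base_scores = {"A": 2.0, "B": 1.0}  # hypothetical relevance scores
boosts = {"B": 1.5}                 # only B is boosted

# Additive: A = 2.0 + 0 = 2.0, B = 1.0 + 1.5 = 2.5
additive_order = rank(base_scores, boosts, lambda s, b: s + b, neutral=0.0)

# Multiplicative: A = 2.0 * 1 = 2.0, B = 1.0 * 1.5 = 1.5
multiplicative_order = rank(base_scores, boosts, lambda s, b: s * b, neutral=1.0)

print(additive_order)        # ['B', 'A']
print(multiplicative_order)  # ['A', 'B']
```

With these numbers the additive boost lifts B above A, while the multiplicative boost scales B in proportion to its weaker base score and leaves A on top.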

Page 51: Elasticsearch in Zalando

Scoring in Elasticsearch

Function Score Query

● weight

● field_value_factor

● random_score

● Decay functions

● script_score
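
As a sketch of how some of these functions fit together, the following builds a `function_score` query body with a filtered `weight`, a `field_value_factor`, and a `gauss` decay on a date. The field names (`name`, `available`, `popularity`, `created_at`) and all numeric values are assumptions chosen for illustration.

```python
import json


def build_function_score_query(text):
    """Build a function_score query body combining three score functions."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"name": text}},
                "functions": [
                    # weight: constant factor for documents matching the filter.
                    {"filter": {"term": {"available": True}}, "weight": 2},
                    # field_value_factor: score derived from a numeric field.
                    {
                        "field_value_factor": {
                            "field": "popularity",
                            "modifier": "log1p",
                            "missing": 1,
                        }
                    },
                    # Decay function: newer documents score higher.
                    {
                        "gauss": {
                            "created_at": {
                                "origin": "now",
                                "scale": "30d",
                                "decay": 0.5,
                            }
                        }
                    },
                ],
                # How the function results are combined with each other...
                "score_mode": "sum",
                # ...and how that result is combined with the query score.
                "boost_mode": "multiply",
            }
        }
    }


body = build_function_score_query("white sneakers")
print(json.dumps(body, indent=2))
```

Here `score_mode` and `boost_mode` correspond to the adding and multiplying choices discussed for boosting.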

Page 52: Elasticsearch in Zalando

Search Engine

Data ingestion & enrichment

Data retrieval & ranking

Page 53: Elasticsearch in Zalando
Page 54: Elasticsearch in Zalando

Elasticsearch in Zalando

Page 55: Elasticsearch in Zalando

Shop The Look

Page 57: Elasticsearch in Zalando

Product Service

● Fetch articles by sku

● Fetch articles by urlkey

● Fetch articles by family_sku
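
The three lookups above are exact-match retrievals, which could be expressed as `term` queries on non-analyzed fields. The helper below is a hypothetical sketch; it assumes the fields `sku`, `urlkey`, and `family_sku` are indexed as keyword fields, which the slides do not state.

```python
ALLOWED_KEYS = {"sku", "urlkey", "family_sku"}


def build_lookup_query(key, value):
    """Build an exact-match query body for one of the supported keys."""
    if key not in ALLOWED_KEYS:
        raise ValueError(f"unsupported lookup key: {key}")
    return {"query": {"term": {key: value}}}


# "ABC123" is a made-up identifier for illustration.
lookup = build_lookup_query("sku", "ABC123")
print(lookup)
```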

Page 58: Elasticsearch in Zalando

Key-Value Store?

Page 62: Elasticsearch in Zalando

Catalog use-case

Example: 100 Articles X 5 Colors X 1000 RPS = 500,000 RPS

Page 63: Elasticsearch in Zalando

Catalog use-case

Key-Value Store

Example: 100 Articles X 5 Colors X 1000 RPS = 500,000 RPS
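
The arithmetic behind this example can be written out as a short calculation. The interpretation in the comments (each catalog request fans out into one lookup per article and color) is an assumption; the slide gives only the numbers.

```python
# Numbers from the slide; the variable names are an interpretation.
articles_per_request = 100   # articles shown per catalog request
colors_per_article = 5       # color variants fetched per article
catalog_rps = 1000           # catalog requests per second

# Each catalog request fans out into one lookup per article and color.
lookups_per_request = articles_per_request * colors_per_article
lookup_rps = lookups_per_request * catalog_rps

print(lookups_per_request)  # 500
print(lookup_rps)           # 500000
```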

Page 64: Elasticsearch in Zalando

Product Service

Page 65: Elasticsearch in Zalando

Auto Scaling

shards_per_node: 3

Page 66: Elasticsearch in Zalando

Auto Scaling

shards_per_node: 1
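
The slides show only the two values of `shards_per_node`. One reading is that lowering the number of shards allowed per node forces the same shards to spread over more nodes. The sketch below works through that reading with a hypothetical total of 6 shards; whether the slide's setting corresponds to Elasticsearch's `index.routing.allocation.total_shards_per_node` is also an assumption.

```python
import math


def nodes_needed(total_shards, shards_per_node):
    """Minimum number of nodes needed to place all shards under the limit."""
    if shards_per_node < 1:
        raise ValueError("shards_per_node must be at least 1")
    return math.ceil(total_shards / shards_per_node)


TOTAL_SHARDS = 6  # hypothetical; not given on the slides

packed = nodes_needed(TOTAL_SHARDS, shards_per_node=3)
spread = nodes_needed(TOTAL_SHARDS, shards_per_node=1)

print(packed)  # 2
print(spread)  # 6
```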

Page 67: Elasticsearch in Zalando

The New PDP

Reviews, Shop The Look, Products

Page 68: Elasticsearch in Zalando

Elasticsearch Express

● Easy deployment across multiple AZs

● Start serving data in less than 10 minutes

● Full data availability guarantee on each AZ

● Role separation of nodes

● Stable master election

● No manual configuration on AWS

● Automatic data backups in S3 bucket

● ES Monitoring dashboard template

Page 69: Elasticsearch in Zalando
Page 70: Elasticsearch in Zalando

www.search-relevancy-workshop.com

A hands-on workshop for building killer search applications with Elasticsearch.

?