NoSQL Riak MongoDB Elasticsearch - All The Same?

Transcript of NoSQL Riak MongoDB Elasticsearch - All The Same?

Page 1: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB, Elasticsearch, Riak – all the same?

Eberhard Wolff Freelancer

Head Technology Advisory Board adesso AG

http://ewolff.com

Page 2: NoSQL Riak MongoDB Elasticsearch - All The Same?

Eberhard Wolff - @ewolff

Reading sample: http://bit.ly/CD-Buch

Page 3: NoSQL Riak MongoDB Elasticsearch - All The Same?

Modeling: Relational Databases vs. JSON

Page 4: NoSQL Riak MongoDB Elasticsearch - All The Same?

Financial System

•  Different financial products
•  Mapping objects / database
•  Inheritance

Page 5: NoSQL Riak MongoDB Elasticsearch - All The Same?

E/R Model

[Diagram: entities Asset, Stock, Zero Bond, Option, Country, Currency]

•  > 20 database tables
•  Up to 25 attributes

Page 6: NoSQL Riak MongoDB Elasticsearch - All The Same?

JOINs :-(

Page 7: NoSQL Riak MongoDB Elasticsearch - All The Same?

Get all assets with interest rate x

Page 9: NoSQL Riak MongoDB Elasticsearch - All The Same?

JSON

Page 10: NoSQL Riak MongoDB Elasticsearch - All The Same?

[Diagram: Asset (Type, ID) with the variants Zero Bond, Fixed Rate Bond, Stock and Option; fields shown include Interest Rate, Preferred, Underlying asset, Country, Price and Currency]

Page 11: NoSQL Riak MongoDB Elasticsearch - All The Same?

{ "ID" : "42", "type" : "Fixed Rate Bond", "Country" : "DE", "Currency" : "EUR", "ISIN" : "DE0001141562", "Interest Rate" : "2.5" }
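As a rough illustration of why the single-document model simplifies the earlier query ("get all assets with interest rate x"), here is a minimal Python sketch. Only the first document comes from the slide; the second document and the function name `assets_with_interest_rate` are assumptions made for this example.

```python
import json

# First document: taken from the slide. Second document: hypothetical.
ASSETS = [
    json.loads(
        '{ "ID" : "42", "type" : "Fixed Rate Bond", "Country" : "DE", '
        '"Currency" : "EUR", "ISIN" : "DE0001141562", "Interest Rate" : "2.5" }'
    ),
    {"ID": "43", "type": "Stock", "Country": "US", "Currency": "USD"},
]


def assets_with_interest_rate(documents, rate):
    """Return all documents whose "Interest Rate" field equals rate.

    Documents without the field (e.g. stocks) are skipped. No JOIN is
    needed because each asset is one self-contained document.
    """
    return [
        doc for doc in documents
        if "Interest Rate" in doc and float(doc["Interest Rate"]) == rate
    ]


matches = assets_with_interest_rate(ASSETS, 2.5)
```

Note that the slide stores the rate as a string, so the sketch converts it before comparing.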

Page 12: NoSQL Riak MongoDB Elasticsearch - All The Same?

All stores in this presentation support JSON

Page 13: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling Relational Databases

Page 14: NoSQL Riak MongoDB Elasticsearch - All The Same?

Larger Server

[Diagram: a DB server and a larger DB server]

•  Expensive server
•  Limited

Page 15: NoSQL Riak MongoDB Elasticsearch - All The Same?

Common Storage

[Diagram: four DB servers sharing one storage]

•  Expensive storage
•  Limited
•  e.g. Oracle RAC

Page 16: NoSQL Riak MongoDB Elasticsearch - All The Same?

Replication

[Diagram: four replicated DB servers]

•  Cheap server
•  Almost unlimited
•  Inconsistent data
•  Conflict resolution or read only

Page 17: NoSQL Riak MongoDB Elasticsearch - All The Same?

Replication

[Diagram: four replicated DB servers]

•  MySQL Master-Slave
•  Oracle Advanced Replication

Page 18: NoSQL Riak MongoDB Elasticsearch - All The Same?

Network Failure

•  Either answer and provide potentially outdated data
•  or don’t answer, i.e. only ever provide up-to-date data

Page 19: NoSQL Riak MongoDB Elasticsearch - All The Same?

CAP

•  Consistency
•  Availability
•  Network Partition Tolerance
•  If the network fails: provide a potentially incorrect answer or none at all?
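The choice a replica faces during a network failure can be sketched in a few lines of Python. This is a toy model, not the behaviour of any particular database; the class `Replica` and the flag `prefer_availability` are hypothetical names introduced for this illustration.

```python
class Replica:
    """Toy replica that must decide what to do while partitioned."""

    def __init__(self, value, prefer_availability):
        self.value = value                      # last value this replica saw
        self.prefer_availability = prefer_availability
        self.connected = True                   # can it reach the other replicas?

    def read(self):
        if self.connected or self.prefer_availability:
            # Availability first: answer, even though the value may be outdated.
            return self.value
        # Consistency first: refuse to answer rather than risk stale data.
        raise RuntimeError("partitioned: cannot guarantee up-to-date data")


available = Replica("v1", prefer_availability=True)
consistent = Replica("v1", prefer_availability=False)
available.connected = False
consistent.connected = False
```

With both replicas cut off, `available.read()` still returns its possibly stale value, while `consistent.read()` raises an error.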

Page 20: NoSQL Riak MongoDB Elasticsearch - All The Same?

BASE

•  Basically Available
•  Soft State
•  Eventually (= in the end) consistent
•  i.e. give a potentially incorrect answer

Page 21: NoSQL Riak MongoDB Elasticsearch - All The Same?

BASE and Relational DBs

•  Very limited
•  Stand by
•  Read only replica
•  No truly distributed DB

Page 22: NoSQL Riak MongoDB Elasticsearch - All The Same?

Relational & BASE

•  Most relational operations cover multiple tables
•  Needs locks across multiple servers
•  Not realistically possible

Page 23: NoSQL Riak MongoDB Elasticsearch - All The Same?

NoSQL & BASE

•  Typical operation covers one data structure
•  …that contains more information
•  No complex locking
•  More sophisticated BASE

Page 24: NoSQL Riak MongoDB Elasticsearch - All The Same?

Naïve View on NoSQL

Page 25: NoSQL Riak MongoDB Elasticsearch - All The Same?

Key / Value Stores

•  Map key to value
•  For simple data structures
•  Retrieval only by key
•  Easy scalability
•  Only for simple applications

Key | Value
42 | Some data

Page 26: NoSQL Riak MongoDB Elasticsearch - All The Same?

Document Oriented

•  Documents, e.g. JSON
•  Complex structures & queries
•  Still great scalability
•  For more complex applications

{ "author":{ "name":"Eberhard Wolff", "email":"[email protected]" }, "title": "Continuous Delivery" }
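The difference between this "naïve" document view and the key / value view from the previous slide can be sketched in plain Python. The store below is an in-memory dictionary, not a real database client; the second document and the helper names `get_by_key` and `find` are assumptions made for illustration.

```python
# Documents stored under a key. The second entry is hypothetical.
STORE = {
    "42": {"author": {"name": "Eberhard Wolff"}, "title": "Continuous Delivery"},
    "43": {"author": {"name": "Jane Doe"}, "title": "Some Other Book"},
}


def get_by_key(key):
    """Key / value access: the only way in is the key."""
    return STORE[key]


def find(path, expected):
    """Document-style query: match on a (possibly nested) field.

    path is a dotted field name such as "author.name".
    """
    results = []
    for document in STORE.values():
        current = document
        for part in path.split("."):
            current = current.get(part) if isinstance(current, dict) else None
        if current == expected:
            results.append(document)
    return results


by_key = get_by_key("42")
by_author = find("author.name", "Eberhard Wolff")
```

Both calls return the same document, but only `find` works without knowing the key in advance.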

Page 27: NoSQL Riak MongoDB Elasticsearch - All The Same?

Graph, Column Oriented…

Page 28: NoSQL Riak MongoDB Elasticsearch - All The Same?

Educated View on NoSQL

Page 29: NoSQL Riak MongoDB Elasticsearch - All The Same?

Key / value
Document-based
Search engine

All the same?

Page 30: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB

elasticsearch

Riak

Page 32: NoSQL Riak MongoDB Elasticsearch - All The Same?

What is Riak?

•  Key / value
•  Truly distributed database

Page 33: NoSQL Riak MongoDB Elasticsearch - All The Same?

Riak: Technologies

•  Erlang
•  Open Source (Apache 2.0)
•  Company: Basho

Page 34: NoSQL Riak MongoDB Elasticsearch - All The Same?

More indices

•  Allows secondary indices
•  Riak Search 2.0: Solr integration
•  Solr: Lucene-based search engine
•  API compatible with Solr
•  Key / value or document based?

Page 35: NoSQL Riak MongoDB Elasticsearch - All The Same?

More Features

•  Map/reduce
•  Scans all datasets
•  Can store large binary objects
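To make the "scans all datasets" point concrete, here is a minimal map/reduce sketch in plain Python. It does not use Riak's API; the helper `map_reduce`, the sample records and the job `count_by_currency` are all hypothetical and only show the shape of the computation.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run map_fn over every record, group the emitted pairs by key,
    then reduce each group to a single value."""
    grouped = defaultdict(list)
    for record in records:                 # every record is visited once
        for key, value in map_fn(record):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


# Hypothetical sample data.
ASSETS = [
    {"ID": "42", "Currency": "EUR"},
    {"ID": "43", "Currency": "USD"},
    {"ID": "44", "Currency": "EUR"},
]


def count_by_currency(assets):
    # Map: emit (currency, 1) per asset. Reduce: sum the ones.
    return map_reduce(assets, lambda asset: [(asset["Currency"], 1)], sum)


counts = count_by_currency(ASSETS)
```

Because the map step touches every record, the cost grows with the size of the whole data set, unlike a lookup by key or index.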

Page 36: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling Riak

•  Based on the Dynamo paper
•  Well understood
•  …and battle-tested at Amazon

Page 37: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling Riak

Server A: Shard1, Shard3, Shard4
Server B: Shard2, Shard1, Shard4
Server C: Shard3, Shard2, Shard1
Server D: Shard4, Shard2, Shard3

Page 39: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling Riak

Server A: Shard1, Shard3, Shard4
Server B: Shard2, Shard1, Shard4
Server C: Shard3, Shard2, Shard1
Server D: Shard4, Shard2, Shard3

New Server
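One way to picture how replicas are spread over servers, and what happens when a new server joins, is a consistent-hashing ring in the spirit of the Dynamo paper. The sketch below is a simplification under stated assumptions: each server gets a single position on the ring (real systems typically use many partitions or virtual nodes per server), and the names `Ring`, `ring_position` and `servers_for` are hypothetical.

```python
import hashlib
from bisect import bisect_right


def ring_position(name):
    """Map a server name or key to a position on the hash ring."""
    return int(hashlib.sha256(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, servers, replicas=3):
        self.replicas = replicas
        # Servers sorted by their position on the ring.
        self.points = sorted((ring_position(s), s) for s in servers)

    def servers_for(self, key):
        """Return the servers holding the replicas of key: the first
        `replicas` servers found walking clockwise from the key."""
        positions = [position for position, _ in self.points]
        start = bisect_right(positions, ring_position(key)) % len(self.points)
        count = min(self.replicas, len(self.points))
        return [
            self.points[(start + i) % len(self.points)][1]
            for i in range(count)
        ]


before = Ring(["Server A", "Server B", "Server C", "Server D"])
after = Ring(["Server A", "Server B", "Server C", "Server D", "New Server"])
```

In this model, adding "New Server" only inserts one more stop on the ring: for any key, the replica holders after the change are the previous holders with at most one of them replaced by the new server, so most data stays where it is.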

Page 40: NoSQL Riak MongoDB Elasticsearch - All The Same?

Tuning BASE

•  N: nodes holding a replica
•  R: nodes read from
•  W: nodes written to
•  Trade-off
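The trade-off between N, R and W can be expressed as a simple overlap rule: if R + W > N, every set of R nodes that is read shares at least one node with every set of W nodes that acknowledged the write. The sketch below states that rule in Python for an idealized model without node failures or hinted writes; the function name `read_sees_latest_write` is an assumption for this example.

```python
def read_sees_latest_write(n, r, w):
    """Return True if any R nodes read must overlap any W nodes written,
    given N replicas (idealized quorum model)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


strict = read_sees_latest_write(n=3, r=2, w=2)   # quorum reads and writes
fast = read_sees_latest_write(n=3, r=1, w=1)     # fastest, but may read stale data
```

Lower R and W mean faster and more available operations; higher values buy stronger consistency at the price of latency and availability.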

Page 41: NoSQL Riak MongoDB Elasticsearch - All The Same?

Is it bulletproof?

Page 42: NoSQL Riak MongoDB Elasticsearch - All The Same?

Jepsen

•  Test suite for network failures etc.
•  https://aphyr.com/tags/jepsen
•  Riak succeeds
•  …if tuned correctly
•  …might still need to merge versions
•  https://aphyr.com/posts/285-call-me-maybe-riak

Page 43: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB

elasticsearch

Riak

Page 46: NoSQL Riak MongoDB Elasticsearch - All The Same?

What is MongoDB?

•  Document-oriented
•  MMAPv1: memory-mapped files + journal
•  New in 3.0: WiredTiger for complex loads

Humongous

Page 47: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB: Technologies

•  C++
•  Open Source (AGPL)
•  Company: MongoDB, Inc.

Page 48: NoSQL Riak MongoDB Elasticsearch - All The Same?

More Features

•  Can store large binary objects
•  Its own full text search

Page 49: NoSQL Riak MongoDB Elasticsearch - All The Same?

More Features

•  Map / Reduce
•  JavaScript
•  Aggregation framework

Page 50: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling MongoDB

Shard 1: Replica 1, Replica 2, Replica 3
Shard 2: Replica 1, Replica 2, Replica 3

Page 51: NoSQL Riak MongoDB Elasticsearch - All The Same?

Availability

Shard 1: Replica 1, Replica 2, Replica 3
Shard 2: Replica 1, Replica 2, Replica 3

Page 52: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling MongoDB

Shard 1: Replica 1, Replica 2, Replica 3
Shard 2: Replica 1, Replica 2, Replica 3
Shard 3: Replica 1, Replica 2, Replica 3

Page 53: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling MongoDB

Shard 1: Replica 1, Replica 2, Replica 3
Shard 2: Replica 1, Replica 2, Replica 3

?

Page 54: NoSQL Riak MongoDB Elasticsearch - All The Same?

Tuning BASE

•  Write concerns
•  How many nodes should acknowledge the write?
•  Read from primary
•  …or also secondaries
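The question "how many nodes should acknowledge the write?" can be modelled without a running database. The following Python sketch mimics the idea of a write concern given either as a number or as "majority"; it is not MongoDB driver code, and the function names `required_acks` and `write_acknowledged` are hypothetical.

```python
def required_acks(write_concern, replica_count):
    """Number of replicas that must confirm a write before it counts
    as acknowledged."""
    if write_concern == "majority":
        return replica_count // 2 + 1
    if isinstance(write_concern, int) and 0 <= write_concern <= replica_count:
        return write_concern
    raise ValueError(f"unsupported write concern: {write_concern!r}")


def write_acknowledged(write_concern, acks_received, replica_count):
    """True once enough replicas have confirmed the write."""
    return acks_received >= required_acks(write_concern, replica_count)


majority_of_three = required_acks("majority", 3)
```

A higher write concern makes an acknowledged write more durable but slower, and the write cannot be acknowledged at all while too few replicas are reachable.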

Page 55: NoSQL Riak MongoDB Elasticsearch - All The Same?

Jepsen

•  Mongo loses writes
•  A bug – might still be there
•  Also: non-acknowledged writes might still survive
•  …and overwrite other data
•  https://aphyr.com/posts/284-call-me-maybe-mongodb

Page 56: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB

elasticsearch

Riak

Page 59: NoSQL Riak MongoDB Elasticsearch - All The Same?

Database = Storage + Search

Page 60: NoSQL Riak MongoDB Elasticsearch - All The Same?

elasticsearch = Storage + Search

Page 61: NoSQL Riak MongoDB Elasticsearch - All The Same?

What is elasticsearch?

•  Search Engine
•  Also stores original documents
•  Based on Lucene Search Library
•  Easy scaling

Page 62: NoSQL Riak MongoDB Elasticsearch - All The Same?

elasticsearch: Technologies

•  Java
•  REST
•  Open Source (Apache)
•  Backed by company elasticsearch

Page 63: NoSQL Riak MongoDB Elasticsearch - All The Same?

elasticsearch Internals

•  Append only file
•  Many benefits
•  But not too great for updates
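Why an append-only layout is awkward for updates can be shown with a toy store in Python. This is a sketch of the general idea only, not of Lucene's or elasticsearch's actual file format; the class `AppendOnlyStore` and its methods are hypothetical.

```python
class AppendOnlyStore:
    """Toy store that never modifies an entry once it is written."""

    def __init__(self):
        self.log = []      # (doc_id, document) entries, append only
        self.latest = {}   # doc_id -> index of the live entry in the log

    def index(self, doc_id, document):
        # An update is just another append; the old entry stays behind.
        self.log.append((doc_id, document))
        self.latest[doc_id] = len(self.log) - 1

    def get(self, doc_id):
        return self.log[self.latest[doc_id]][1]

    def dead_entries(self):
        """Entries superseded by a later version of the same document."""
        return len(self.log) - len(self.latest)


store = AppendOnlyStore()
store.index("42", {"title": "Draft"})
store.index("42", {"title": "Continuous Delivery"})
```

After the second call the store returns the new version, but the old one still occupies space until some later clean-up pass removes it, which is why frequent updates are comparatively expensive.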

Page 64: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scaling elasticsearch

[Diagram: three servers holding Shard 1, Shard 2 and Shard 3 together with Replica 1, Replica 2 and Replica 3]

Page 65: NoSQL Riak MongoDB Elasticsearch - All The Same?

Tuning BASE

•  Write acknowledge: 1, majority, all
•  Including indexing
•  Read from primary
•  …or also secondaries

Page 66: NoSQL Riak MongoDB Elasticsearch - All The Same?

Jepsen

•  Loses data even if just one node is partitioned (June 2014)
•  Actively worked on
•  It’s a search engine…
•  https://aphyr.com/posts/317-call-me-maybe-elasticsearch
•  http://www.elasticsearch.org/guide/en/elasticsearch/resiliency/current/

Page 67: NoSQL Riak MongoDB Elasticsearch - All The Same?

Scenarios

elasticsearch

Page 68: NoSQL Riak MongoDB Elasticsearch - All The Same?

Search

•  Powerful query language
•  Configurable index
•  Text analysis
•  Stop words
•  Stemming
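Text analysis with stop words and stemming can be approximated in a few lines of Python. The sketch is deliberately naive: the stop-word list is a tiny illustrative sample, and `naive_stem` only strips two English suffixes, which is far cruder than the analyzers a real search engine ships with. All names here are hypothetical.

```python
import re

# Tiny illustrative stop-word list, not a real analyzer configuration.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "for"}


def naive_stem(token):
    """Strip a trailing "ing" or "s" if enough of the word remains."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, tokenize, drop stop words, then stem each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(token) for token in tokens if token not in STOP_WORDS]


indexed_terms = analyze("Searching the documents")
query_terms = analyze("search document")
```

Because the indexed text and the query pass through the same analysis, both end up as the same terms, so the query matches even though the surface forms differ.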

Page 69: NoSQL Riak MongoDB Elasticsearch - All The Same?

Facets

•  Number of hits by category
•  Useful for statistics
•  & Big Data
•  Statistical facet (+ computation)
•  Range facets etc.
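What the three kinds of facet compute can be sketched over a list of search hits in plain Python. This is not the elasticsearch API; the sample hits and the function names `term_facet`, `statistical_facet` and `range_facet` are assumptions for this illustration.

```python
from collections import Counter

# Hypothetical search hits.
HITS = [
    {"type": "Fixed Rate Bond", "Interest Rate": 2.5},
    {"type": "Zero Bond", "Interest Rate": 1.5},
    {"type": "Stock"},
    {"type": "Stock"},
]


def term_facet(hits, field):
    """Number of hits per distinct value of field."""
    return Counter(hit[field] for hit in hits if field in hit)


def statistical_facet(hits, field):
    """Simple statistics over a numeric field (needs at least one value)."""
    values = [hit[field] for hit in hits if field in hit]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


def range_facet(hits, field, ranges):
    """Number of hits whose field value falls into each [low, high) range."""
    return {
        (low, high): sum(
            1 for hit in hits if field in hit and low <= hit[field] < high
        )
        for low, high in ranges
    }


by_type = term_facet(HITS, "type")
rate_stats = statistical_facet(HITS, "Interest Rate")
rate_ranges = range_facet(HITS, "Interest Rate", [(0, 2), (2, 5)])
```

All three are aggregations over the result set rather than over individual documents, which is what makes facets useful for statistics.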

Page 70: NoSQL Riak MongoDB Elasticsearch - All The Same?

MongoDB

elasticsearch

Riak

Page 72: NoSQL Riak MongoDB Elasticsearch - All The Same?

Conclusion

•  Relational databases might be BASE
•  NoSQL embraces BASE better
•  Key / Value, Document stores and search engine: very similar features
•  Care about scaling
•  Care about resilience

Page 74: NoSQL Riak MongoDB Elasticsearch - All The Same?

Thank You!