NoSQL into E-Commerce: lessons learned

Lessons Learned MongoDB into eCommerce websites

NoSQL into E-Commerce: lessons learned, by Aurélien Foucret of the Smile group - http://www.smile.fr

Transcript of NoSQL into E-Commerce: lessons learned

Page 1: NoSQL into E-Commerce: lessons learned

Lessons Learned

MongoDB into eCommerce websites

Page 2: NoSQL into E-Commerce: lessons learned

WHO ARE WE ?

Page 3: NoSQL into E-Commerce: lessons learned

SMILE IN A NUTSHELL

• SMILE IS THE BIGGEST PLAYER IN OPEN SOURCE IN EUROPE
  ◦ 700+ employees, 17 offices, 45+ M€ turnover in 2012, 30% growth per year
  ◦ Office in Brussels since 2012 with a local team of experts for your projects

• MULTI-TECHNOLOGIES, A UNIQUE EXPERTISE
  ◦ CMS, E-Commerce, Portal, ECM/DMS, ERP, BI, System/Infrastructure, Custom dev

Page 4: NoSQL into E-Commerce: lessons learned

OUR EXPERTISE AND OUR CONVICTIONS IN OUR FREE WHITE PAPERS

Page 5: NoSQL into E-Commerce: lessons learned

MONGOGENTO PRODUCT STORAGE INTO MONGODB

Page 6: NoSQL into E-Commerce: lessons learned

WHY NOSQL?

• Open source software does not meet the performance needs of our clients out of the box, especially when dealing with huge product catalogs (millions of products)

• The main bottleneck encountered is the database:
  ◦ Avoid specialization
  ◦ Least scalable component of the LAMP architecture
  ◦ Requires a complex data model when dealing with heterogeneous products

Page 7: NoSQL into E-Commerce: lessons learned

THE ROAD TO NOSQL

2008
• First Magento release: poor performance

2009
• Smile provides the first integration of SolR into Magento
• Magento later does the same in its Enterprise edition (v. 1.8)

2012
• First prototype of Magento / MongoDB integration

2013
• MongoGento is now in production and will be opened to the community
• Several improvements to come by the end of the year

Page 8: NoSQL into E-Commerce: lessons learned

WHY MONGODB AMONG OTHER DATABASES?

• MongoDB is a general purpose document database:
  ◦ More versatile document selection API
  ◦ Update API allows partial document updates and advanced operations (inc, push, …)

• MongoDB is popular:
  ◦ More developers with MongoDB skills
  ◦ Ecosystem: hosting, SaaS, …

• Well documented

Page 9: NoSQL into E-Commerce: lessons learned

HOW IS MAGENTO STORING PRODUCTS?

• Magento uses the EAV model to fit products into an RDBMS:

Product main table

id   sku
1    345678909876
2    786576809080
3    978786798979
…    …

Attribute values table

id   attribute_id   product_id   store_id   value
1    price          1            0          10.00
2    name           1            0          Product name
3    name           1            1          Nom du produit
…    …              …            …          …
40   price          2            0          20.00
41   name           2            0          Other product
42   name           2            1          Autre produit

Page 10: NoSQL into E-Commerce: lessons learned

THE EAV MODEL: PROS AND CONS

• Pros:
  ◦ You can add or remove attributes or locales without altering tables, which avoids the downtime such operations require when dealing with a lot of products.

• Cons:
  ◦ Very slow reads: many joins are required to get a single product or a product list (worse, most of them are left joins).
  ◦ Slow writes: many inserts for one product, with a lot of checks done by the RDBMS (foreign keys, indexes, transactional logic).
  ◦ The attribute values tables tend to grow much faster than the number of products (on average, twenty times faster).

Page 11: NoSQL into E-Commerce: lessons learned

« The EAV model fits better into a document database »

Page 12: NoSQL into E-Commerce: lessons learned

FIRST VERSION: STORE EVERYTHING INTO MONGODB

• Pros:
  ◦ 1 product = 1 document (reads and writes are very fast)
  ◦ Very flexible model

• Cons:
  ◦ All foreign keys on the product have to be removed
  ◦ Some attributes are used to compute indexes (sale price, …): a lot of Magento has to be rewritten or will be broken

Product document example:

{
  _id: 1,
  attr_0: { name: "Product name", price: 10.00 },
  attr_1: { name: "Nom du produit" }
}
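To make the mapping concrete, here is a minimal sketch in plain JavaScript (in-memory data, no MongoDB driver) of how rows shaped like the attribute values table could be folded into one document per product. The helper name eavToDocuments and the sample rows are assumptions made for illustration; this is not MongoGento's actual code.

```javascript
// Sample rows shaped like the attribute values table (illustrative data only).
const eavRows = [
  { attribute_id: "price", product_id: 1, store_id: 0, value: 10.0 },
  { attribute_id: "name", product_id: 1, store_id: 0, value: "Product name" },
  { attribute_id: "name", product_id: 1, store_id: 1, value: "Nom du produit" },
  { attribute_id: "price", product_id: 2, store_id: 0, value: 20.0 },
];

// Hypothetical helper: fold EAV rows into one document per product,
// grouping attribute values under an "attr_<store_id>" key.
function eavToDocuments(rows) {
  const docs = new Map();
  for (const row of rows) {
    if (!docs.has(row.product_id)) {
      docs.set(row.product_id, { _id: row.product_id });
    }
    const doc = docs.get(row.product_id);
    const scope = `attr_${row.store_id}`;
    if (!doc[scope]) doc[scope] = {};
    doc[scope][row.attribute_id] = row.value;
  }
  return [...docs.values()];
}

const productDocs = eavToDocuments(eavRows);
console.log(JSON.stringify(productDocs, null, 2));
```

Reading a product then becomes a single lookup by _id, with no join on the attribute values.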

Page 13: NoSQL into E-Commerce: lessons learned

THE SOLUTION: HYBRID MODEL MONGODB / MYSQL?

• Keep RDBMS storage for:
  ◦ The entity main table
  ◦ Attributes related to indexes (price, name, …)

• Put everything else into MongoDB (90% of the attributes, such as description, color, …)

• On product loading:
  ◦ Load from the RDBMS, then enrich from MongoDB

• On product list loading:
  ◦ Load the filtered product list from MongoDB
  ◦ Load the filtered product list from MySQL
  ◦ Merge both product lists (intersection)
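The two loading paths can be sketched as follows, in plain JavaScript with hard-coded inputs. The id lists, the function names and the sample attributes are hypothetical; a real implementation would obtain them from MySQL and MongoDB queries.

```javascript
// Hypothetical results: product ids returned by each store for its own filters.
const idsFromMySql = [1, 2, 3, 5, 8]; // e.g. filtered on price (kept in the RDBMS)
const idsFromMongo = [2, 3, 4, 8, 9]; // e.g. filtered on color (kept in MongoDB)

// Product list loading: keep only the ids present in both lists,
// preserving the order of the first list.
function intersectProductIds(primaryIds, otherIds) {
  const other = new Set(otherIds);
  return primaryIds.filter((id) => other.has(id));
}

// Product loading: start from the RDBMS row, then enrich it with
// the attributes stored in the MongoDB document.
function enrichProduct(sqlRow, mongoDoc) {
  return { ...mongoDoc, ...sqlRow };
}

const listIds = intersectProductIds(idsFromMySql, idsFromMongo);
const product = enrichProduct(
  { id: 2, sku: "786576809080", price: 20.0 },
  { color: "red", description: "Other product" }
);
console.log(listIds, product);
```

Using a Set keeps the intersection linear in the size of both lists, which matters once the filtered lists hold many thousands of ids.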

Page 14: NoSQL into E-Commerce: lessons learned

LESSONS LEARNED

Page 15: NoSQL into E-Commerce: lessons learned

LESSON N°1: LEARNING CURVE

• It is very pleasant to work with MongoDB

• The learning curve is gentle: the document model is a quite natural way for developers to work

• Most of the work will be "unlearning" the way you build your models with an RDBMS, in order to take full advantage of the document model (nested document vs. reference to a document)

• There are often several ways to do something. Some are better than others, and only experience will tell you which is the right one.

Page 16: NoSQL into E-Commerce: lessons learned

LESSON N°2: STORE DATA THE WAY YOU WILL QUERY IT

• Be pragmatic about data normalisation

• The read / write ratio should drive the way you store data

• Example: storing the comments of a product

{
  _id: 174474747,
  content_text: "My Super Comment",
  user_id: 346568794,
  product_id: 87687
}

To be efficient, this solution needs indexes on both product_id and user_id:
• Indexes slow down writes
• All indexes have to fit into RAM, so limit the number of indexes

Page 17: NoSQL into E-Commerce: lessons learned

LESSON N°2: STORE DATA THE WAY YOU WILL QUERY IT

• You can avoid indexes by adding a new collection:

{ _id: "user_346568794", comment_ids: ["174474747"] }
{ _id: "product_87687", comment_ids: ["174474747"] }

Pros:
• You don't need secondary indexes anymore
• Very fast to read the comments for a product or by a user

Cons:
• You have to write three documents for one comment
• Data duplication
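As a rough illustration of this trade-off, the sketch below uses two in-memory Maps as stand-ins for the comments collection and the lookup collection (an assumption: no real MongoDB driver is involved, and the function names are hypothetical). One comment costs three writes, and reading the comments of a product only uses lookups by _id.

```javascript
// In-memory stand-ins for two collections, keyed by _id.
const comments = new Map();
const commentLookups = new Map();

// Mirrors an upsert that pushes a comment id onto comment_ids.
function appendCommentId(lookupId, commentId) {
  const doc = commentLookups.get(lookupId) || { _id: lookupId, comment_ids: [] };
  doc.comment_ids.push(commentId);
  commentLookups.set(lookupId, doc);
}

// Three writes for one comment: the comment itself plus two lookup documents.
function addComment(comment) {
  const id = String(comment._id);
  comments.set(id, comment);
  appendCommentId(`user_${comment.user_id}`, id);
  appendCommentId(`product_${comment.product_id}`, id);
}

// Reading by product needs only _id lookups, no index on product_id.
function commentsForProduct(productId) {
  const lookup = commentLookups.get(`product_${productId}`);
  return lookup ? lookup.comment_ids.map((id) => comments.get(id)) : [];
}

addComment({
  _id: 174474747,
  content_text: "My Super Comment",
  user_id: 346568794,
  product_id: 87687,
});
console.log(commentsForProduct(87687));
```

The cost is visible in addComment: if one of the three writes fails, the lookup documents and the comments can drift apart, which is the data duplication drawback listed above.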

Page 18: NoSQL into E-Commerce: lessons learned

LESSON N°3: MANAGE TRANSACTIONS

• MongoDB does not support multi-document transactions …

• … but single-document modifications are atomic

• The question is: how can I modify my document model to avoid transactions?

• If transactions are really needed, there are alternatives you can implement on your own:
  ◦ Two-phase commits: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
  ◦ Optimistic locking (can be implemented on the client side)
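Client-side optimistic locking can be sketched like this, in plain JavaScript against an in-memory Map. The version field and the function names are assumptions made for the example. With a real database, the check and the write in updateIfVersionMatches would have to be a single conditional update filtered on both _id and version, so that the atomicity of single-document modifications does the work.

```javascript
// In-memory stand-in for a collection; each document carries a version counter.
const store = new Map([[1, { _id: 1, price: 10.0, version: 1 }]]);

// Apply the changes only if nobody has modified the document since it was read.
function updateIfVersionMatches(id, expectedVersion, changes) {
  const current = store.get(id);
  if (!current || current.version !== expectedVersion) return false;
  store.set(id, { ...current, ...changes, version: expectedVersion + 1 });
  return true;
}

// On conflict, re-read the document and recompute the changes.
function updateWithRetry(id, computeChanges, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = { ...store.get(id) };
    if (updateIfVersionMatches(id, snapshot.version, computeChanges(snapshot))) {
      return true;
    }
  }
  return false;
}

// Two clients read the same version; only the first write is accepted.
const clientA = { ...store.get(1) };
const clientB = { ...store.get(1) };
const firstWrite = updateIfVersionMatches(1, clientA.version, { price: 12.0 });
const staleWrite = updateIfVersionMatches(1, clientB.version, { price: 8.0 });
const retried = updateWithRetry(1, (doc) => ({ price: doc.price - 1 }));
console.log(firstWrite, staleWrite, retried, store.get(1));
```

The stale write is rejected instead of silently overwriting the first one, and the retry works from the fresh state.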

Page 19: NoSQL into E-Commerce: lessons learned

LESSON N°3: MANAGE TRANSACTIONS

• You can avoid transactions by avoiding « get for set » operations. Use MongoDB update operators instead.

• Example: append a category to a product and update its price

BAD:

// Read-modify-write: any change made by another client between
// findOne() and save() is overwritten.
$product = $collection->findOne(array('_id' => $productId));
$product['price'] = 10.00;
$product['category_ids'][] = 3;
$collection->save($product);

GOOD:

// A single update applied atomically on the server.
$updateCond = array('_id' => $productId);
$updateData = array(
    '$set'  => array('price' => 10.00),
    '$push' => array('category_ids' => 3),
);
$collection->update($updateCond, $updateData);
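The difference between the two approaches shows up as soon as two clients interleave. The following sketch simulates it in plain JavaScript with in-memory Maps; findOne, save and applyUpdate are simplified stand-ins written for this example, not driver calls, and applyUpdate only models the $set and $push operators.

```javascript
function newStore() {
  return new Map([[42, { _id: 42, price: 12.0, category_ids: [1] }]]);
}

// "Get for set": each client receives its own copy and later saves the whole document.
function findOne(store, id) {
  return JSON.parse(JSON.stringify(store.get(id)));
}
function save(store, doc) {
  store.set(doc._id, doc);
}

// Simplified stand-in for an update using $set and $push.
function applyUpdate(store, id, update) {
  const doc = store.get(id);
  Object.assign(doc, update.$set || {});
  for (const [field, value] of Object.entries(update.$push || {})) {
    (doc[field] = doc[field] || []).push(value);
  }
}

// BAD: both clients read before either one writes.
const badStore = newStore();
const copyA = findOne(badStore, 42);
const copyB = findOne(badStore, 42);
copyA.category_ids.push(3);
save(badStore, copyA);
copyB.price = 10.0;
save(badStore, copyB); // overwrites client A's category

// GOOD: each client sends only the change it wants to make.
const goodStore = newStore();
applyUpdate(goodStore, 42, { $push: { category_ids: 3 } });
applyUpdate(goodStore, 42, { $set: { price: 10.0 } });

console.log(badStore.get(42), goodStore.get(42));
```

In the first scenario category 3 is lost; in the second, both changes survive because neither client rewrites fields it did not intend to touch.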

Page 20: NoSQL into E-Commerce: lessons learned

LESSON N°4: NOSQL = NOT ONLY SQL

• Keep in mind that NoSQL is about database specialization:
  ◦ Most of the time you will have several databases in your project
  ◦ Use the best one available for the task you have to perform

• Avoid spreading related data across several databases:
  ◦ Hard to back up (synchronization)
  ◦ Not very performant

• Stay pragmatic about what you put into MongoDB:
  ◦ Target the performance killers in existing projects
  ◦ If you are trying to reproduce complex transactions, you are probably on the wrong track
  ◦ SQL does not have to be thrown away

Page 21: NoSQL into E-Commerce: lessons learned

LESSON N°5: USE A SEARCH ENGINE

• Index your documents into a search engine (SolR, ElasticSearch, …)

• Search engines are more efficient when it comes to filtering collections of documents

• Most of the time you will do it anyway, because you will need full-text search, faceting, …

• Use:
  ◦ Data import handlers for SolR
  ◦ The MongoDB River plugin for ElasticSearch

Page 22: NoSQL into E-Commerce: lessons learned

LESSON N°6: NEAR REAL-TIME PROCESSING IS A BIG WIN

• Do I really need:
  ◦ Data to be computed in real time?
  ◦ Data to be always fresh? What is an acceptable delay?

• Example: top ten product sales by category.
  ◦ Calculated at page load?
  ◦ Calculated every time someone buys something?
  ◦ Calculated every 5 minutes by a batch process?

• The tools:
  ◦ Use (incremental) MapReduce to batch analysis operations involving large datasets
  ◦ Use tailable cursors for background stream processing
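The batch option can be illustrated with a small sketch in plain JavaScript. It is not MongoDB's MapReduce API: it only mimics the incremental idea, where each run folds the orders received since the previous run into running totals, and pages read a precomputed ranking. The data shapes and function names are assumptions made for the example.

```javascript
// Running totals: category -> (product id -> units sold).
const salesByCategory = new Map();

// Incremental step: fold only the new orders into the existing totals.
function foldOrders(orders) {
  for (const { category, product_id, qty } of orders) {
    if (!salesByCategory.has(category)) salesByCategory.set(category, new Map());
    const totals = salesByCategory.get(category);
    totals.set(product_id, (totals.get(product_id) || 0) + qty);
  }
}

// Ranking computed by the batch job, not at page load.
function topProducts(category, limit = 10) {
  const totals = salesByCategory.get(category) || new Map();
  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([product_id, qty]) => ({ product_id, qty }));
}

// First run, then a later run that only sees the new orders.
foldOrders([
  { category: "shoes", product_id: 1, qty: 2 },
  { category: "shoes", product_id: 2, qty: 5 },
]);
foldOrders([{ category: "shoes", product_id: 1, qty: 4 }]);

const topShoes = topProducts("shoes", 2);
console.log(topShoes);
```

If a delay of a few minutes is acceptable, the page only reads the stored ranking, and the cost of computing it no longer depends on traffic.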

Page 23: NoSQL into E-Commerce: lessons learned

LESSON N°7: SCHEMALESS DATABASES HAVE THE FASTEST GROWTH RATE

• MongoDB consumes a lot more storage than an RDBMS:
  ◦ The document structure is stored with each document (in an RDBMS it is stored once for the table)
  ◦ Documents are padded (a small amount of free space is added at the end) => better performance during updates, more space used on disk

• To achieve the best performance, the whole dataset + indexes have to fit into RAM

• Don't hesitate to experiment with sharding, but do it carefully: the shard key cannot be updated, so pay attention to what you choose.

Page 24: NoSQL into E-Commerce: lessons learned

LESSON N°8: DON'T USE REPLICATION AS A BACKUP

• Replication is about hardware failures

• Backup is about human failures:

db.catalog_product_entity.drop();

• Backup: full backup + oplog (point-in-time recovery)

• It is difficult to back up a sharded cluster or a hybrid MySQL / MongoDB application (synchronization)

• You can avoid having to recover from backups:
  ◦ Never use delete operations: mark data as deleted and filter it out instead
  ◦ Use a versioning system instead of updating data in place
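Soft deletion and versioning can be combined in an append-only history, sketched here in plain JavaScript with an in-memory array. The field names (version, deleted) and the functions are assumptions made for illustration, not a MongoGento feature.

```javascript
// Append-only history: nothing is ever updated in place or removed.
const history = [];

function latestVersion(productId) {
  const versions = history.filter((v) => v.product_id === productId);
  return versions.length ? versions[versions.length - 1] : null;
}

function appendVersion(productId, data, deleted = false) {
  const version = history.filter((v) => v.product_id === productId).length + 1;
  history.push({ product_id: productId, version, deleted, data });
  return version;
}

// "Delete" by appending a version flagged as deleted.
function markDeleted(productId) {
  const last = latestVersion(productId);
  return appendVersion(productId, last ? last.data : null, true);
}

// Reads filter out products whose latest version is flagged as deleted.
function findProduct(productId) {
  const last = latestVersion(productId);
  return last && !last.deleted ? last.data : null;
}

// Recover from a mistake by re-appending the last live version.
function restore(productId) {
  const live = history.filter((v) => v.product_id === productId && !v.deleted);
  return live.length ? appendVersion(productId, live[live.length - 1].data) : null;
}

appendVersion(1, { price: 10.0 });
appendVersion(1, { price: 12.0 });
markDeleted(1);
const afterDelete = findProduct(1);
const restoredVersion = restore(1);
const afterRestore = findProduct(1);
console.log(afterDelete, restoredVersion, afterRestore);
```

This protects against application-level mistakes only; dropping the whole collection still requires a real backup.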

Page 25: NoSQL into E-Commerce: lessons learned

MONGOGENTO PRESENT AND FUTURE

Page 26: NoSQL into E-Commerce: lessons learned

MONGOGENTO: WHAT DOES IT DO?

• Manages product attributes and media galleries:
  ◦ Product import performance: x5
  ◦ Frontend / Admin performance: x2
  ◦ Benchmarks (French): http://www.ecommerce-performances.com/mongogento.php

• Not many Magento features are broken, and the broken ones were not usable with huge catalogs anyway

• June 2013: goes live in production

• Jan. 2014: first open source release
  ◦ You can fork it on GitHub: https://github.com/Smile-SA/mongogento

Page 27: NoSQL into E-Commerce: lessons learned

MONGOGENTO: THE ROADMAP

• More features to be added:
  ◦ Fix broken Magento features: catalog rules support, sitemap
  ◦ Cart storage
  ◦ Media assets storage (GridFS)

• Search engine integration (ElasticSearch) with unique features:
  ◦ Behavioral data processing
  ◦ Mahout integration

• Magento Community Edition support

Page 28: NoSQL into E-Commerce: lessons learned

http://be.smile.eu

Q/A ?