Using Couchbase and Elasticsearch as data layers

SIZMEK MDX NXT
USING COUCHBASE AND ELASTICSEARCH AS DATA LAYER
TAL MAAYANI, YUVAL PERRY

HIGH LEVEL MISSION

Building a scalable, highly available, large and fast ad management platform

MAIN TECHNOLOGIES USED

AWS deployment

Microservices architecture

RxJava

Couchbase

Elasticsearch

RabbitMQ

Consul

WHY NOSQL?

NoSQL Document Database   | Relational Database
Unstructured data         | Structured data
Memory-first approach     | Disk-first approach
No transactions           | Transactional
Scales horizontally       | Scales vertically

NoSQL DBs allow:
• Fast reads and writes
• Holding a variety of data models
• Large data volumes
• Cloud-friendly deployment
• No single point of failure

Still need to take care of a transactionless, eventually consistent data source

WHY COUCHBASE?

JSON support
Indexing and querying
Cross data center replication
Incremental MapReduce

OUR DATA LAYER

[Diagram: a generic data access layer exposing Query, Get(Id), and Save/Update operations; data is stored in Couchbase, queried via N1QL, and replicated to Elasticsearch through XDCR]

DEMO – SIZMEK COUCHBASE ADMINISTRATOR TOOL

• In-house development tool that allows performing ES queries as well as N1QL queries

• Usage:
  • Data investigation
  • Data migration

HOW WE MAINTAIN ATOMICITY ON A TRANSACTIONLESS DATA SOURCE

Transaction manager service – maintains flow state across changes to multiple entities

Provides atomicity & tracking

Example: Save smart version ad flow

[Flow diagram: Dynamic Campaign Optimization calls the Transaction Manager, which coordinates Asset Mgmt (upload ad assets) and the Ad Service (create ad, create smart version); tracked states: 1. Assets created, 2. Ad created, 3. Smart version created]
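A minimal sketch of this pattern, assuming a Couchbase bucket (Java SDK 2.x) as the state store; the AssetService and AdService interfaces and the txn:: key prefix are hypothetical placeholders, not Sizmek's actual API:

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

// Flow states from the slide: assets created -> ad created -> smart version created
enum FlowState { STARTED, ASSETS_CREATED, AD_CREATED, SMART_VERSION_CREATED, FAILED }

// Hypothetical downstream services involved in the flow
interface AssetService { void uploadAdAssets(String adId); }
interface AdService { void createAd(String adId); void createSmartVersion(String adId); }

class SaveSmartVersionTransaction {
    private final Bucket bucket;        // Couchbase bucket acting as the flow-state store
    private final AssetService assets;
    private final AdService ads;

    SaveSmartVersionTransaction(Bucket bucket, AssetService assets, AdService ads) {
        this.bucket = bucket;
        this.assets = assets;
        this.ads = ads;
    }

    void run(String flowId, String adId) {
        saveState(flowId, FlowState.STARTED);
        try {
            assets.uploadAdAssets(adId);
            saveState(flowId, FlowState.ASSETS_CREATED);         // 1. Assets created
            ads.createAd(adId);
            saveState(flowId, FlowState.AD_CREATED);             // 2. Ad created
            ads.createSmartVersion(adId);
            saveState(flowId, FlowState.SMART_VERSION_CREATED);  // 3. Smart version created
        } catch (RuntimeException e) {
            saveState(flowId, FlowState.FAILED);                 // recorded state enables tracking and compensation
            throw e;
        }
    }

    private void saveState(String flowId, FlowState state) {
        // Upsert a small state document keyed by the flow id
        bucket.upsert(JsonDocument.create("txn::" + flowId,
                JsonObject.create().put("state", state.name())));
    }
}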

ELASTICSEARCH – CONSISTENCY PROBLEM, AND HOW TO OVERCOME IT IN AUTOMATIC TESTING

The problem

In a clustered Elasticsearch environment, a document update is not automatically reflected on all nodes.

This causes inconsistent results in automated testing.

Example

Change campaign name from A to B.

An automated test verifies that the change actually took place by getting the entity and verifying its name.

Possible Solutions

• Wait a few seconds before checking for the updated status

• Use the Elasticsearch refresh API to force an in-memory index update (see the sketch below)
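A minimal sketch of the second option, assuming the Elasticsearch Java transport client; the helper class name and where it is called from are illustrative:

import org.elasticsearch.client.Client;

class EsTestHelper {
    // Test-only helper: force a refresh so a just-written document becomes searchable
    // immediately, instead of sleeping for a few seconds and hoping the index caught up.
    static void refreshIndex(Client client, String index) {
        client.admin().indices().prepareRefresh(index).get();
    }
}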

NAME UNIQUENESS IMPLEMENTATION
HOW TO IMPLEMENT UNIQUE CONSTRAINTS USING COUCHBASE

Problem

Maintain unique entity name

Real use case

Keep advertisement names unique system-wide

Possible Solution

Input: entity to save

1. Save a uniqueness document (key: entity name, value: entity id)
2. If the save failed (name already taken), return an error
3. If the save succeeded, save the entity
4. If saving the entity fails, delete the uniqueness document

Still need to take care of orphan uniqueness documents
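A minimal sketch of this reservation approach, assuming the Couchbase Java SDK 2.x; the class name and the name:: key prefix are illustrative. insert() is atomic per key, so only one caller can reserve a given name:

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.error.DocumentAlreadyExistsException;

class NameUniquenessGuard {
    private final Bucket bucket;

    NameUniquenessGuard(Bucket bucket) { this.bucket = bucket; }

    // Returns true if the name was reserved, false if it is already taken
    boolean reserve(String entityName, String entityId) {
        try {
            bucket.insert(JsonDocument.create("name::" + entityName,
                    JsonObject.create().put("entityId", entityId)));
            return true;
        } catch (DocumentAlreadyExistsException e) {
            return false;  // name already taken -> caller returns an error
        }
    }

    // Compensation step: remove the uniqueness document if saving the entity fails
    void release(String entityName) {
        bucket.remove("name::" + entityName);
    }
}

If release() itself fails, an orphan uniqueness document remains, which is the cleanup concern noted above.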

N1QL EXAMPLES

• Use the Query Workbench Developer Preview

• Example queries

1. select mvbucket.`key` from mvbucket where payload._type = 'AdSmartVersion' and payload.createdOn is not missing

2. select * from mvbucket where payload._type = 'AdSmartVersion' and payload.masterAdId = 1073741825 and payload.createdOn between 1349057369158 and 1449057369158

3. select payload.masterAdId, count(1) from mvbucket where payload._type = 'AdSmartVersion' and payload.createdOn between 1349057369158 and 1449057369158 group by payload.masterAdId

4. select payload.masterAdId, count(1) as count from mvbucket where payload._type = 'AdSmartVersion' group by payload.masterAdId
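The same data can also be queried from code. A minimal sketch, assuming Couchbase Java SDK 2.2 or later (which introduced N1qlQuery); it runs a parameterized form of query #2 above:

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.N1qlQueryRow;

class SmartVersionQueries {
    // Parameterized version of example query #2, run through the SDK instead of the workbench
    static void printSmartVersions(Bucket bucket, long masterAdId, long from, long to) {
        N1qlQueryResult result = bucket.query(N1qlQuery.parameterized(
                "select * from mvbucket "
                        + "where payload._type = 'AdSmartVersion' "
                        + "and payload.masterAdId = $1 "
                        + "and payload.createdOn between $2 and $3",
                JsonArray.from(masterAdId, from, to)));
        for (N1qlQueryRow row : result) {
            System.out.println(row.value());  // each row is the matching document's JSON
        }
    }
}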

COUCHBASE JAVA CLIENT 2
NOTES ON THE JAVA CLIENT

• Built-in support for JSON documents

• Supports counters

• Asynchronous client using RxJava
  • Allows exploiting the reactive business logic already in use
  • Parallel, efficient processing

• Inherent error handling – for example, retry getting a document with an exponential backoff:

import java.util.concurrent.TimeUnit;
import com.couchbase.client.core.BackpressureException;
import com.couchbase.client.core.time.Delay;
import com.couchbase.client.java.util.retry.RetryBuilder;
import rx.Observable;

Observable
    .from(docIds)
    .flatMap(id -> bucket.async().get(id)
        .retryWhen(RetryBuilder
            .anyOf(BackpressureException.class)                    // retry only on backpressure
            .delay(Delay.exponential(TimeUnit.MILLISECONDS, 100))  // exponential delay between attempts
            .max(10)                                               // give up after 10 retries
            .build()))
    .subscribe();

OUR USE OF ELASTICSEARCH – QUERY ENGINE

• Free text search – user boolean queries
• Data filtering – data grid filtering
• Grouping – data grid grouping
• Authorization – filter documents according to user permissions
• Batch processing – internal services that use scan and scroll to operate on large data sets (see the sketch below)
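A minimal sketch of the scroll-based batch processing, assuming the Elasticsearch 2.x Java transport client; the class name, page size, and scroll timeout are illustrative:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

class BatchProcessor {
    // Scroll through a large result set page by page instead of fetching it in one query
    static void processAll(Client client, String index) {
        SearchResponse resp = client.prepareSearch(index)
                .setQuery(QueryBuilders.matchAllQuery())
                .setScroll(TimeValue.timeValueMinutes(1))  // keep the scroll context alive between pages
                .setSize(500)                              // page size per shard
                .get();
        while (resp.getHits().getHits().length > 0) {
            for (SearchHit hit : resp.getHits().getHits()) {
                // process hit.getSourceAsString() ...
            }
            resp = client.prepareSearchScroll(resp.getScrollId())
                    .setScroll(TimeValue.timeValueMinutes(1))
                    .get();
        }
    }
}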

ELASTICSEARCH – SOME BEST PRACTICES

• Carefully maintain the index schema

• Avoid using Dynamic mapping

• Data type collisions

• Large data sets – do not save data that is not used

• Build a static schema from the data model (see the sketch after this slide)
  • Updating a searchable field in the data model triggers a build of a new index
  • Some schema changes require re-indexing, e.g. adding a mandatory field or changing an enumeration value

• Inconsistency – updated data does not appear immediately in query results
  • The overall system design must be aware of this limitation

• Throttling – must control the number of writes
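A minimal sketch of building a static schema instead of relying on dynamic mapping, assuming the Elasticsearch 2.x Java transport client; the index name, type name, and fields are illustrative ('string' with 'not_analyzed' is the 2.x-era equivalent of a keyword field):

import org.elasticsearch.client.Client;

class IndexSchema {
    // Create the index with an explicit mapping generated from the data model,
    // so dynamic mapping never guesses field types and causes data type collisions
    static void createCampaignIndex(Client client) {
        String mapping = "{ \"properties\": {"
                + "   \"name\":      { \"type\": \"string\", \"index\": \"not_analyzed\" },"
                + "   \"createdOn\": { \"type\": \"long\" }"
                + " } }";
        client.admin().indices().prepareCreate("campaigns")
                .addMapping("campaign", mapping)
                .get();
    }
}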

COUCHBASE 4.1 – OUR USAGE

• Use optimistic locking – update operations are done through an updater lambda function (see the sketch after this list)

• N1QL
  • Does not meet performance requirements for large data sets with ORDER BY queries
  • Took more than 5 seconds to query 250 entities
  • Used for business logic where no sorting is required
  • Used when consistency is important

• XDCR
  • Customized plugin to index required entities
  • Added support for parent-child relationships in Elasticsearch
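A minimal sketch of the optimistic-locking update, assuming the Couchbase Java SDK 2.x; the class name is illustrative, and the updater lambda is modeled as a plain UnaryOperator rather than the actual internal interface:

import java.util.function.UnaryOperator;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.error.CASMismatchException;

class OptimisticUpdater {
    // Read-modify-write with CAS: replace() fails if another writer changed the
    // document since it was read, in which case we re-read and re-apply the updater
    static void update(Bucket bucket, String id, UnaryOperator<JsonObject> updater) {
        while (true) {
            JsonDocument current = bucket.get(id);
            JsonDocument updated = JsonDocument.create(
                    id, updater.apply(current.content()), current.cas());
            try {
                bucket.replace(updated);
                return;
            } catch (CASMismatchException e) {
                // concurrent update detected; loop and retry with a fresh copy
            }
        }
    }
}

For example: update(bucket, docId, payload -> payload.put("name", "B")) renames an entity while preserving concurrent writers' changes.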

QUESTIONS AND ANSWERS

Thank you