Couchbase 104 Webinar


Transcript of Couchbase 104 Webinar

Page 1: Couchbase 104 Webinar
Page 2: Couchbase 104 Webinar

Couchbase 104: Views and Indexing

Todd Greenstein, Solution Architect

Page 3: Couchbase 104 Webinar

[Agenda]

[1] What are Views
[2] Architecture in Couchbase
[3] Uses
[4] View Anatomy (Map/Reduce)
[5] Examples

Page 4: Couchbase 104 Webinar

Indexing Subsystem

• Storing data and Indexing data are separate systems in all databases

• In explicit schema scenarios (RDBMS), Indexes are optimized based on the data type(s)

• In flexible schema scenarios Map-Reduce is used to create indexes

Page 5: Couchbase 104 Webinar

Map Reduce

• Map-Reduce is a technique designed for dealing with Big Data and processing in parallel in distributed systems

• Map-Reduce is also specifically designed for dealing with unstructured or semi-structured data

• Map functions identify data within collections, process it, and output transformed values

• Reduce functions take the output of Map functions and perform numeric aggregate calculations on them
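
As a minimal illustration of the two phases described above, the following plain JavaScript sketch maps a small in-memory collection to key/value pairs and then reduces the values per key. The sample documents and the helper names `mapDoc`, `reduceSum`, and `mapReduce` are hypothetical and are not taken from the webinar.

```javascript
// Hypothetical semi-structured documents; not every document has every field.
const docs = [
  { type: "order", vendor: "acme", total: 20 },
  { type: "order", vendor: "acme", total: 5 },
  { type: "order", vendor: "globex", total: 7 },
  { type: "note", text: "no total here" },
];

// Map: pick out the documents of interest and output [key, value] pairs.
function mapDoc(doc) {
  return doc.type === "order" ? [[doc.vendor, doc.total]] : [];
}

// Reduce: numeric aggregate over the values that share a key.
function reduceSum(values) {
  return values.reduce((a, b) => a + b, 0);
}

// Runs map over every document, groups the output by key, then reduces.
function mapReduce(input, map, reduce) {
  const groups = new Map();
  for (const doc of input) {
    for (const [key, value] of map(doc)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const result = {};
  for (const [key, values] of groups) result[key] = reduce(values);
  return result;
}

const totals = mapReduce(docs, mapDoc, reduceSum);
console.log(totals); // { acme: 25, globex: 7 }
```

Because each document is mapped independently, the map phase could in principle be spread across nodes, which is the property the slide refers to.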

Page 6: Couchbase 104 Webinar

Couchbase Map Reduce

• In Couchbase, Map-Reduce is specifically used to create Indexes

• Map functions are applied to JSON documents and they output or "emit" data that is organized in an Index form

Page 7: Couchbase 104 Webinar

[Architecture]

Page 8: Couchbase 104 Webinar

Storage to Index

Page 9: Couchbase 104 Webinar

Views: Eventual Consistency

Time 1

Time 2

Page 10: Couchbase 104 Webinar

Buckets >> Design Documents >> Views

[Diagram: a Couchbase Bucket contains Design Document 1 and Design Document 2; each design document contains several Views]

• Indexers are allocated per design document

• All views in a design document are updated at the same time

• Views can only access data in the bucket namespace

Page 11: Couchbase 104 Webinar

Eventually indexed Views – Data flow

[Diagram labels: App Server, Couchbase Server Node, Managed Cache, Replication Queue (to other node), Disk Queue, Disk, View engine; Doc 1 is shown at each stage]

Page 12: Couchbase 104 Webinar

Index building details

• All the views within a design document are incrementally updated when the view is accessed or auto-indexing kicks in

• The entire view is recreated if the view definition has changed

• The index information stored on disk consists of the combination of both the key and value information defined within your view

Page 13: Couchbase 104 Webinar

Automatic Index Updates

• Views can be updated automatically according to:
  – Update Interval (time period); default 5000 milliseconds
  – Update Documents (number of changed documents); default 5000 changes
  – Replica Update Interval for index replicas

• Can be set globally

• Can be set on individual design documents
  – Configured using REST API

• Operate independently of manual updates

• Only updates with eligible documents
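
The interplay of the two thresholds and the two configuration levels can be sketched as a toy model. This is not Couchbase code: the names `DEFAULTS`, `effectiveSettings`, and `shouldUpdate` are hypothetical, and the model assumes that a check runs once per update interval, that it triggers an update only when enough documents have changed, and that design-document settings override global ones.

```javascript
// Defaults quoted on the slide.
const DEFAULTS = { updateIntervalMs: 5000, updateMinChanges: 5000 };

// Assumption: design-document settings take precedence over global settings.
function effectiveSettings(globalSettings = {}, designDocSettings = {}) {
  return { ...DEFAULTS, ...globalSettings, ...designDocSettings };
}

// Assumption: called once per update interval with the number of
// documents changed since the last index update.
function shouldUpdate(changedDocs, settings) {
  return changedDocs >= settings.updateMinChanges;
}

const settings = effectiveSettings(
  { updateIntervalMs: 10000 }, // hypothetical global override
  { updateMinChanges: 100 }    // hypothetical per-design-document override
);
console.log(settings.updateIntervalMs);   // 10000
console.log(shouldUpdate(99, settings));  // false
console.log(shouldUpdate(100, settings)); // true
```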

Page 14: Couchbase 104 Webinar

Manual Index Updates

• All view queries support the ‘stale’ parameter

• Allows the application to control updates
  – Has performance considerations

• Update request for one view updates all views within the corresponding design document

• Can be triggered through UI or REST API as well

• Only updates with eligible documents

Page 15: Couchbase 104 Webinar

Understanding Stale

stale = UPDATE_AFTER (default if nothing is specified)
  – always get fastest response
  – can take two queries to read your own writes

stale = OK
  – auto update will trigger eventually
  – might not see your own writes for a few minutes
  – least frequent updates -> least resource impact

stale = FALSE
  – Use with Persistence observe if data needs to be included in view results
  – BUT be aware of the delay it adds; only use when really required
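
A hedged sketch of how the stale setting might be passed on a view query URL. The helper `viewQueryUrl` is hypothetical, as are the host, bucket, design document, and view names; the port 8092 and the path layout are assumptions about the view REST endpoint and should be checked against the server documentation for the version in use.

```javascript
// Values as written on the query string (lowercase).
const STALE_VALUES = ["update_after", "ok", "false"];

// Hypothetical helper: builds a view query URL with an explicit stale setting.
function viewQueryUrl({ host, bucket, designDoc, view, stale = "update_after" }) {
  if (!STALE_VALUES.includes(stale)) {
    throw new Error(`unsupported stale value: ${stale}`);
  }
  // Assumed endpoint layout: /<bucket>/_design/<designDoc>/_view/<view>
  const url = new URL(`http://${host}:8092/${bucket}/_design/${designDoc}/_view/${view}`);
  url.searchParams.set("stale", stale);
  return url.toString();
}

console.log(
  viewQueryUrl({
    host: "localhost",
    bucket: "inventory",
    designDoc: "items",
    view: "by_vendor",
    stale: "false",
  })
);
// http://localhost:8092/inventory/_design/items/_view/by_vendor?stale=false
```

Defaulting to `update_after` in the helper mirrors the default named on the slide; an application would pass `false` only for the queries that really need to see the latest writes.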

Page 16: Couchbase 104 Webinar

Replica Indexes

• In addition to data replicas, optionally create replicas for indexes
  – Build an index using the data in replica vBuckets

• Enabled per bucket (see bucket config) or per design document (REST API)

• Implies additional CPU and I/O overhead
  – Each node must update index for active data and replica data

• In the event of failover:
  – Without replica indexes, queries may be out of date for longer while rebuilt
  – Replica indexes enabled if present and queries remain consistent
  – Replica indexes are rebuilt on replica nodes

Page 17: Couchbase 104 Webinar

[Uses]

Page 18: Couchbase 104 Webinar

Use Cases

• Iteration
• Retrieval (UUID/GUID)
• Ranges (Parameters)
• Aggregating (Built in Reduces)
• Statistical Analysis

Page 19: Couchbase 104 Webinar

[View Anatomy]

Page 20: Couchbase 104 Webinar

Map() Function => Index

function(doc, meta) {
  emit(doc.username, doc.email);
}

[Annotations: doc = JSON doc; meta = doc metadata; emit() = create row; doc.username = indexed key; doc.email = output value(s)]

Every Document passes through View Map() functions
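
To make the flow concrete, here is a small simulation of documents passing through the map function outside of Couchbase. The `emit` collector, the `currentId` bookkeeping, and the sample documents are hypothetical stand-ins for what the view engine does internally, and plain string comparison stands in for the server's key collation.

```javascript
// Collects index rows; stands in for the view engine's emit().
const indexRows = [];
let currentId = null;
function emit(key, value) {
  indexRows.push({ id: currentId, key, value });
}

// The map function from the slide.
function map(doc, meta) {
  emit(doc.username, doc.email);
}

// Hypothetical stored documents with their metadata.
const stored = [
  { meta: { id: "u::2" }, doc: { username: "zoe", email: "zoe@example.com" } },
  { meta: { id: "u::1" }, doc: { username: "adam", email: "adam@example.com" } },
];

// Every document passes through the map function.
for (const { meta, doc } of stored) {
  currentId = meta.id;
  map(doc, meta);
}

// Keep rows ordered by key so lookups and range scans are possible.
indexRows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
console.log(indexRows.map((row) => row.key)); // [ 'adam', 'zoe' ]
```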


Page 21: Couchbase 104 Webinar

Single Element Keys (Text Key)

function(doc, meta) {
  emit(doc.email, doc.points);
}

[Annotation: doc.email = text key]

meta.id   doc.email          doc.points
u::[email protected]           1000
u::35     [email protected]   1200
u::20     [email protected]   900

Page 22: Couchbase 104 Webinar

Compound Keys (Array)

function(doc, meta) {
  emit(dateToArray(doc.timestamp), 1);
}

[Annotation: dateToArray(doc.timestamp) = array key]

Array Based Index Keys get sorted as Strings, but can be grouped by array elements

meta.id   dateToArray(doc.timestamp)   value
u::20     [2012,10,9,18,45]            1
u::1      [2012,9,26,11,15]            1
u::35     [2012,8,13,2,12]             1
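
Grouping by array elements can be sketched in plain JavaScript using the three rows from the table. The helper `groupByLevel` is hypothetical; it sums the emitted values per key prefix, which is similar in spirit to querying a reduced view with a group level, but it is not the server's implementation.

```javascript
// Rows from the table: [year, month, day, hour, minute] keys, value 1 each.
const dateRows = [
  { id: "u::20", key: [2012, 10, 9, 18, 45], value: 1 },
  { id: "u::1", key: [2012, 9, 26, 11, 15], value: 1 },
  { id: "u::35", key: [2012, 8, 13, 2, 12], value: 1 },
];

// Hypothetical helper: sums values per key prefix of the given length.
function groupByLevel(rows, level) {
  const totals = new Map();
  for (const { key, value } of rows) {
    const prefix = JSON.stringify(key.slice(0, level));
    totals.set(prefix, (totals.get(prefix) || 0) + value);
  }
  return [...totals].map(([prefix, total]) => ({
    key: JSON.parse(prefix),
    value: total,
  }));
}

console.log(groupByLevel(dateRows, 1)); // [ { key: [ 2012 ], value: 3 } ]
console.log(groupByLevel(dateRows, 2).length); // 3 (one group per month)
```

Emitting 1 as the value is what makes this work as a counter: level 1 gives documents per year, level 2 per month, and so on down the array.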

Page 23: Couchbase 104 Webinar

Reduce Functions

• Optional

• Operate on View Value emitted by map function

• Built-in reduce
  – _count – provides a count of rows (per key when grouped)
  – _sum – provides a sum total of values
  – _stats – provides statistics (sum, count, min, max, sum of squares) of values

• Automatically re-reduces up B-tree

• Results stored pre-computed for fast access

• Custom reductions possible if needed but less performant
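
The re-reduce step is easiest to see with a custom count written in the `(key, values, rereduce)` shape that view reduce functions use. This is an illustrative sketch only: the point values come from the earlier Single Element Keys table, the split into two partial batches is hypothetical, and in practice the built-in `_count` and `_sum` would be preferred.

```javascript
// Custom count: on the first pass it counts emitted values; on re-reduce it
// adds up partial counts produced lower in the tree.
function countReduce(key, values, rereduce) {
  if (rereduce) {
    return values.reduce((a, b) => a + b, 0);
  }
  return values.length;
}

// Custom sum: the same logic is valid for raw values and for partial sums.
function sumReduce(key, values, rereduce) {
  return values.reduce((a, b) => a + b, 0);
}

// Hypothetical split of the emitted values across two index nodes.
const batchA = [1000, 1200];
const batchB = [900];

const partialCounts = [
  countReduce(null, batchA, false),
  countReduce(null, batchB, false),
];
const partialSums = [
  sumReduce(null, batchA, false),
  sumReduce(null, batchB, false),
];

console.log(countReduce(null, partialCounts, true)); // 3
console.log(sumReduce(null, partialSums, true)); // 3100
```

A count that ignored the `rereduce` flag would return 2 here (the number of partial results) rather than 3, which is the usual pitfall when writing custom reduces.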

Page 24: Couchbase 104 Webinar

View Query Parameters

Page 25: Couchbase 104 Webinar

[Examples]

Page 26: Couchbase 104 Webinar

Sample Inventory Bucket

{"item":"KS-00114629 Dark Grey Glasses","vendor":"kate spade","category":"glasses"}

{"item":"IH-00884501 Triple ring Shoulder bag","vendor":"iron heart","category":"handbag","discount":"15"}