Netflix's Big Leap from Oracle to Cassandra

NETFLIX’S BIG LEAP FROM ORACLE TO C* ROOPA TANGIRALA Engineering Manager Netflix

Transcript of Netflix's Big Leap from Oracle to Cassandra

Page 1: Netflix's Big Leap from Oracle to Cassandra

NETFLIX’S BIG LEAP FROM

ORACLE TO C*

ROOPA TANGIRALA, Engineering Manager

Netflix

Page 2: Netflix's Big Leap from Oracle to Cassandra

WHO AM I?

Engineering Manager @ Netflix

Twitter - @roopatangirala

Email [email protected]

LinkedIn - https://www.linkedin.com/pub/roopa-tangirala/3/960/2b

Page 3: Netflix's Big Leap from Oracle to Cassandra

OVERVIEW

• Brief History

• Set up

• Migration Strategy

• Migration Challenges

• Examples of real use cases

• Lessons learnt

Page 4: Netflix's Big Leap from Oracle to Cassandra

1997

Page 5: Netflix's Big Leap from Oracle to Cassandra

DATACENTER

Page 6: Netflix's Big Leap from Oracle to Cassandra

BACKEND

Page 7: Netflix's Big Leap from Oracle to Cassandra

ORACLE DATAMODEL LIMITATIONS

Page 8: Netflix's Big Leap from Oracle to Cassandra

NO HORIZONTAL SCALING

Page 9: Netflix's Big Leap from Oracle to Cassandra

LICENSE COST

Page 10: Netflix's Big Leap from Oracle to Cassandra

EVERY TWO WEEKS!!

Page 11: Netflix's Big Leap from Oracle to Cassandra

2008

Page 12: Netflix's Big Leap from Oracle to Cassandra

MOVE TO CLOUD

Page 13: Netflix's Big Leap from Oracle to Cassandra

REQUIREMENTS

• HIGHLY AVAILABLE

• MULTI DATACENTER SUPPORT

• PREDICTABLE PERFORMANCE AT SCALE

Page 14: Netflix's Big Leap from Oracle to Cassandra

WHY C* ?

• Massively scalable architecture

• Multi-datacenter, multi-directional replication

• Linear scale performance

• Transparent fault detection and recovery

• Flexible, dynamic schema data

• Guaranteed data safety

• Tunable data consistency

Page 15: Netflix's Big Leap from Oracle to Cassandra

BACKEND NOW

Page 16: Netflix's Big Leap from Oracle to Cassandra

MICRO SERVICES

• Horizontal, Homogeneous, Commoditized

Page 17: Netflix's Big Leap from Oracle to Cassandra

DOWNTIME

Page 18: Netflix's Big Leap from Oracle to Cassandra

ALMOST DAILY PUSHES

Page 19: Netflix's Big Leap from Oracle to Cassandra

ACTIVE ACTIVE

Page 20: Netflix's Big Leap from Oracle to Cassandra

GLOBAL PRESENCE

Page 21: Netflix's Big Leap from Oracle to Cassandra

MIGRATION STRATEGY

Page 22: Netflix's Big Leap from Oracle to Cassandra

BABY STEPS

Page 23: Netflix's Big Leap from Oracle to Cassandra

NEW FEATURES FIRST TO CLOUD

Page 24: Netflix's Big Leap from Oracle to Cassandra

DATA MODEL REVIEW

ORACLE          CASSANDRA
DB/Schema       keyspace
Table           column family
Row             Row
column          column
• name          • name
• value         • value
                • timestamp

Page 25: Netflix's Big Leap from Oracle to Cassandra

SCHEMALESS DESIGN

• Row-oriented

• Number of columns/Names can differ

Row xyz: name = Paul, zip = 95123

Row abc: name = Adam, zip = 94538, sex = Male

Row cde: name = XYZ
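A minimal sketch of what "number of columns/names can differ" means in practice, using a plain Python dictionary as a stand-in for a column family. It reads the slide's example as three rows keyed xyz, abc and cde (an interpretation of the slide); the `users` name and the `columns_of` helper are hypothetical and not part of any Cassandra API.

```python
# Toy model of a column family: row key -> {column name: value}.
# Each row carries only the columns that were actually written to it.
users = {
    "xyz": {"name": "Paul", "zip": "95123"},
    "abc": {"name": "Adam", "zip": "94538", "sex": "Male"},
    "cde": {"name": "XYZ"},
}


def columns_of(row_key):
    """Return the sorted column names present in one row (empty if no such row)."""
    return sorted(users.get(row_key, {}))
```

Calling `columns_of("abc")` and `columns_of("cde")` returns different column sets for rows in the same column family, which a fixed relational table would not allow.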

Page 26: Netflix's Big Leap from Oracle to Cassandra

UNDERSTAND WRITE PATH

client

Commit log (Disk)

Memtable (memory)

sstable sstable sstable

Flush
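The write path on this slide can be sketched as a toy, single-node model: append to the commit log, update the memtable, and flush the memtable to an immutable sstable when it fills up. The `ToyNode` class, the `memtable_limit` parameter and the in-memory lists standing in for disk are all assumptions for illustration; real Cassandra also handles replication, commit log recycling and compaction, none of which is modeled here.

```python
class ToyNode:
    """Single-node sketch of the write path: commit log -> memtable -> sstable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in memory: key -> (timestamp, value)
        self.sstables = []     # flushed, immutable tables, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value, timestamp):
        # 1. Append to the commit log first so the write survives a crash.
        self.commit_log.append((key, value, timestamp))
        # 2. Update the memtable, keeping the newest timestamp per key.
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        # 3. Flush once the memtable reaches its size limit.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a new sstable and start a fresh memtable."""
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}
```

Writes never modify an existing sstable in this sketch, which is why they stay cheap: each write is one append plus one in-memory update.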

Page 27: Netflix's Big Leap from Oracle to Cassandra

UNDERSTAND READ PATH

client

memtable

sstable

sstable

sstable

Row cache/key cache
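A rough sketch of the read path follows, assuming each stored cell is a `(timestamp, value)` pair and that the newest timestamp wins when the same key appears in the memtable and in several sstables. The `read` function and its arguments are hypothetical; only the row cache is modeled, and the key cache (which speeds up locating a key inside an sstable) is left out.

```python
def read(key, memtable, sstables, row_cache):
    """Return the newest value for key, or None if it was never written."""
    # A row cache hit avoids touching the memtable and sstables at all.
    if key in row_cache:
        return row_cache[key]

    # Otherwise collect every version of the key from memory and from disk.
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for table in sstables:
        if key in table:
            candidates.append(table[key])
    if not candidates:
        return None

    # Merge the versions: the cell with the highest timestamp wins.
    _, value = max(candidates, key=lambda cell: cell[0])
    row_cache[key] = value
    return value
```

The loop over `sstables` is the part that gets slower as more sstables accumulate for a key, which is why the number of sstables shows up later in the deck as a cause of read latency.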

Page 28: Netflix's Big Leap from Oracle to Cassandra

LOGIC IN APPLICATION

• Stored procedures

• Functions

• Triggers

• Referential integrity constraints

Page 29: Netflix's Big Leap from Oracle to Cassandra

DATACENTER AWS

ORACLE

C*

APP

DUAL WRITES

READS

CONSISTENCY CHECKER

WRITES

READS

WRITES

FORKLIFT

Page 30: Netflix's Big Leap from Oracle to Cassandra

MAIN APPROACH

• DUAL WRITES

• FORKLIFT OLD DATASET

• CONSISTENCY CHECKER

• MIGRATE READS TO C* FIRST

• SWITCH OFF WRITES TO DC
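The five steps above can be sketched with two dictionaries standing in for Oracle and C*. Everything here (`MigratingStore`, `forklift`, `find_mismatches`, the two boolean switches) is a hypothetical illustration of the sequence, not Netflix's actual tooling.

```python
class MigratingStore:
    """Application-side wrapper that dual-writes while the migration is in flight."""

    def __init__(self, oracle, cassandra):
        self.oracle = oracle
        self.cassandra = cassandra
        self.read_from_cassandra = False  # flipped when reads migrate to C*
        self.write_to_oracle = True       # flipped off as the final step

    def write(self, key, value):
        if self.write_to_oracle:
            self.oracle[key] = value
        self.cassandra[key] = value       # dual write

    def read(self, key):
        source = self.cassandra if self.read_from_cassandra else self.oracle
        return source.get(key)


def forklift(oracle, cassandra):
    """Copy the old dataset without overwriting rows already dual-written."""
    for key, value in oracle.items():
        cassandra.setdefault(key, value)


def find_mismatches(oracle, cassandra):
    """Consistency checker: keys whose value differs between the two stores."""
    return sorted(k for k in oracle if cassandra.get(k) != oracle[k])
```

A run would start dual writes, call `forklift` once, repeat `find_mismatches` until it returns an empty list, then set `read_from_cassandra = True` and finally `write_to_oracle = False`.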

Page 31: Netflix's Big Leap from Oracle to Cassandra

Relationships – Better in App Layer

Page 32: Netflix's Big Leap from Oracle to Cassandra

ORACLE SEQUENCE

• USE UUID for Unique keys

• Distributed Sequence Generator for Ordering

• C* counters – not so much
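As an illustration of the first two bullets, the sketch below uses Python's standard `uuid.uuid4()` for unique keys and a Snowflake-style generator for roughly time-ordered ids. The talk does not say how its distributed sequence generator was built, so the `SequenceGenerator` class, its bit layout (timestamp, 10-bit node id, 12-bit counter) and the injectable `clock` parameter are assumptions.

```python
import threading
import time
import uuid


def new_unique_key():
    """Random UUID: unique without any coordination, but not ordered."""
    return str(uuid.uuid4())


class SequenceGenerator:
    """Sketch of an ordered id: millisecond timestamp | node id | counter."""

    def __init__(self, node_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= node_id < 1024:
            raise ValueError("node_id must fit in 10 bits")
        self.node_id = node_id
        self.clock = clock          # injectable for testing (assumption)
        self.lock = threading.Lock()
        self.last_ms = -1
        self.counter = 0

    def next_id(self):
        with self.lock:
            # Never move backwards, even if the wall clock does.
            now = max(self.clock(), self.last_ms)
            if now == self.last_ms:
                self.counter += 1
                if self.counter >= 4096:   # the counter field is 12 bits
                    now += 1
                    self.counter = 0
            else:
                self.counter = 0
            self.last_ms = now
            return (now << 22) | (self.node_id << 12) | self.counter
```

Ids from one generator are strictly increasing; ids from different nodes are ordered only to millisecond precision, which is the usual trade-off for avoiding a central sequence like Oracle's.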

Page 33: Netflix's Big Leap from Oracle to Cassandra

Heavy Transactional Use Case

RDBMS

Page 34: Netflix's Big Leap from Oracle to Cassandra

CHALLENGES

Page 35: Netflix's Big Leap from Oracle to Cassandra

SECURITY

Page 36: Netflix's Big Leap from Oracle to Cassandra

DENORMALIZE

DENORMALIZE

DENORMALIZE

DENORMALIZE

DENORMALIZE

Page 37: Netflix's Big Leap from Oracle to Cassandra

Roman Riding

Page 38: Netflix's Big Leap from Oracle to Cassandra

Model Around Queries

Page 39: Netflix's Big Leap from Oracle to Cassandra

Engineering Effort

Page 40: Netflix's Big Leap from Oracle to Cassandra

Know limitations

Page 41: Netflix's Big Leap from Oracle to Cassandra

SOURCE OF TRUTH

Page 42: Netflix's Big Leap from Oracle to Cassandra

REAL EXAMPLES

Page 43: Netflix's Big Leap from Oracle to Cassandra

API

• High concurrency

• Range scans

• ~1MB of data

• Caused heap issues at Cassandra level

• Very high read latency

Page 44: Netflix's Big Leap from Oracle to Cassandra

Range scan replaced with inverted index

0000/odp/{ui}/pathEvaluator_2 active Scripts_1 Text datafalseallocation

0000/xbox/{ui}/pathEvaluator_1 active Scripts_1 Text datafalseallocation

0000/tvui/{ui}/pathEvaluator_1 active Scripts_1 Text datafalseallocation

active_Scripts_idx

1,2

Idx_1 /tvui/{ui}/pathEvaluator 2/odp/{ui}/pathEvaluator 2/odp/{ui}/pathEvaluator 2/odp/{ui}/pathEvaluator

scripts

client

1

2

Page 45: Netflix's Big Leap from Oracle to Cassandra

Inverted Index considerations

• Column name can be used as a row key placeholder

• Hotspots!!

• Sharding
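The three considerations above can be sketched together: the index row stores script row keys as its column names, and the index is split across several rows so that no single row becomes a hotspot. The shard count, the CRC32-based shard choice and the helper names are assumptions for illustration; dictionaries stand in for the `scripts` and `active_Scripts_idx` column families.

```python
import zlib

NUM_SHARDS = 4   # hypothetical shard count
scripts = {}     # row key -> columns
index = {}       # index row key -> {column name (a script row key): ""}


def shard_row_key(index_name, script_key):
    """Pick the index row for a script so entries spread across shards."""
    shard = zlib.crc32(script_key.encode("utf-8")) % NUM_SHARDS
    return f"{index_name}_{shard}"


def add_script(script_key, columns):
    scripts[script_key] = columns
    if columns.get("active"):
        row = shard_row_key("active_Scripts_idx", script_key)
        # The column name is the row key of the indexed script.
        index.setdefault(row, {})[script_key] = ""


def active_scripts():
    """Read every index shard, then fetch the rows it points to by key."""
    keys = []
    for shard in range(NUM_SHARDS):
        keys.extend(index.get(f"active_Scripts_idx_{shard}", {}))
    return {key: scripts[key] for key in sorted(keys)}
```

Reading now costs a fixed number of index row lookups plus direct key reads, instead of a range scan over the whole `scripts` column family.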

Page 46: Netflix's Big Leap from Oracle to Cassandra

VIEWING HISTORY

Page 47: Netflix's Big Leap from Oracle to Cassandra

Growth of Viewing History


Page 48: Netflix's Big Leap from Oracle to Cassandra

Problem

Growth Pattern: “Hockey-stick”

Retention Policy: Retain forever

Access Pattern: Retrieve all for a customer

Scaling and performance challenges as the data grows


Page 49: Netflix's Big Leap from Oracle to Cassandra

Goals

Small Storage Footprint

Consistent Read/Write Performance

Infinite Scalability


Page 50: Netflix's Big Leap from Oracle to Cassandra

Old Data Model


Page 51: Netflix's Big Leap from Oracle to Cassandra

Old Data Model - Pros

Simple Wide Rows

No additional processing

Fast Write

Fast Read for Majority


Page 52: Netflix's Big Leap from Oracle to Cassandra

Old Data Model - Cons

Read latency increases as number of viewing records increases

Cassandra Internal

• Number of SSTables

• Compaction

• Read Repair

Lesson learned: Wide Row Limitation


Page 53: Netflix's Big Leap from Oracle to Cassandra

New Data Model

Split Viewing History into two Column Families

1. Recent

• Small records with frequent updates

• Cassandra tuning: compaction, read repair, etc.

2. Compressed

• Large historical records with rare updates

• Rollup

• Compression

• Cassandra: rare compaction, no read repair
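A minimal sketch of the split, assuming a rollup is triggered once a customer's recent records reach a threshold and that the historical records are stored as one zlib-compressed JSON blob. The threshold, the JSON encoding and all function names are hypothetical; the slides do not give the actual rollup trigger or format.

```python
import json
import zlib

ROLLUP_THRESHOLD = 3   # hypothetical: roll up after this many recent records
recent = {}            # customer id -> list of small, frequently updated records
compressed = {}        # customer id -> one compressed blob of older records


def _load_compressed(customer_id):
    blob = compressed.get(customer_id)
    return json.loads(zlib.decompress(blob)) if blob else []


def rollup(customer_id):
    """Merge recent records into the compressed history and clear them."""
    history = _load_compressed(customer_id) + recent.pop(customer_id, [])
    compressed[customer_id] = zlib.compress(json.dumps(history).encode("utf-8"))


def record_view(customer_id, record):
    rows = recent.setdefault(customer_id, [])
    rows.append(record)
    if len(rows) >= ROLLUP_THRESHOLD:
        rollup(customer_id)


def viewing_history(customer_id):
    """Full history: compressed records first, then the recent ones."""
    return _load_compressed(customer_id) + recent.get(customer_id, [])
```

Most writes touch only the small `recent` row, while the large blob is rewritten only at rollup time, which matches the "rare updates" property listed for the compressed column family.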


Page 54: Netflix's Big Leap from Oracle to Cassandra

New Data Model cont’d


Page 55: Netflix's Big Leap from Oracle to Cassandra

RESULTS


Page 56: Netflix's Big Leap from Oracle to Cassandra

Think Data Archival

Data stores grow exponentially

Have a process in place to archive data

• Moving to a separate column family

• Moving to a separate cluster (non SSD)

• Setting the right expectations for latencies on historical data

Cassandra TTLs
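The two archival ideas on this slide can be sketched with plain dictionaries: moving old records to a separate store, and letting records expire after a time to live. The function names, the record layouts and the explicit `now` argument are assumptions; in Cassandra itself a TTL is attached at write time and expiry is handled by the database rather than by application code like this.

```python
def archive_old(live, archive, cutoff):
    """Move records written before cutoff from live to archive; return the count.

    Both stores map key -> (written_at, value).
    """
    old_keys = [key for key, (written_at, _) in live.items() if written_at < cutoff]
    for key in old_keys:
        archive[key] = live.pop(key)
    return len(old_keys)


def put_with_ttl(store, key, value, ttl_seconds, now):
    """Store a value together with its expiry time."""
    store[key] = (value, now + ttl_seconds)


def get_live(store, key, now):
    """Return the value unless its TTL has elapsed, in which case drop it."""
    entry = store.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if now >= expires_at:
        del store[key]
        return None
    return value
```

Archiving keeps the data readable at a slower tier, while a TTL discards it, so the choice depends on whether the retention policy is "retain forever" or not.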

Page 57: Netflix's Big Leap from Oracle to Cassandra

Cinematch Rating Service

• First Model

• Second Model

[Diagram: example rows keyed by Movie_id (12345 and 4355), stored as a single BLOB per movie in one model and as individual numeric column/value pairs in the other]

Page 58: Netflix's Big Leap from Oracle to Cassandra

WHAT WORKED?

[Diagram: example rows keyed 12345 and 4355, with Name and BLOB columns]

Page 59: Netflix's Big Leap from Oracle to Cassandra

BLOB vs COLUMN/VALUE

Try the column/value approach first and check whether it satisfies the avg/95th/99th percentile latency requirements

Column/value pros:

• Write payload can be smaller

• Query by specific columns

• Read path does not require reading the entire row

Blob considerations:

• Read percentage is very high (90s)

• Read latencies are very important

• All the data is read most of the time
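The trade-off can be made concrete with a toy row stored both ways. JSON stands in for whatever blob encoding is actually used, and the helper names are hypothetical; the returned numbers are serialized payload sizes in characters, used only to compare the two models.

```python
import json


def update_column_row(row, column, value):
    """Column/value model: the write carries only the changed column."""
    row[column] = value
    return len(json.dumps({column: value}))


def update_blob_row(row, column, value):
    """Blob model: read, modify and rewrite the whole serialized document."""
    doc = json.loads(row["blob"]) if "blob" in row else {}
    doc[column] = value
    row["blob"] = json.dumps(doc)
    return len(row["blob"])


def read_column(row, column):
    """Column/value model: fetch one column without reading the entire row."""
    return row.get(column)


def read_blob(row):
    """Blob model: a single cell read returns all of the data at once."""
    return json.loads(row["blob"])
```

With a 50-column row, `update_column_row` sends a payload of a few characters while `update_blob_row` rewrites the full document, which is why the blob model suits data that is read far more often than it is written and is almost always read in full.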

Page 60: Netflix's Big Leap from Oracle to Cassandra

LESSONS LEARNT

• Get the model right

• Baby Steps

• Huge Engineering effort

• Think Compression

• Performance test

• DB just for Data

Page 61: Netflix's Big Leap from Oracle to Cassandra

WE ARE HIRING !! – CHECK OUT JOBS.NETFLIX.COM