C* Summit EU 2013: Using Cassandra in a Telco Storage System

19
#CASSANDRA EU CASSANDRASUMMITE U Using Cassandra in a Telco Storage System Pavel Pontryagin, senior developer

description

Speaker: Pavel Pontryagin, Senior Engineer at Peter-Service Video: http://www.youtube.com/watch?v=Sf81x5V8xKY&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=7 Data volume grows and in telecommunication area it is painful to support and scale RDBMS systems. This presentation shows how we switched from SQL to NoSQL. This will be an overview of aspects: * how we model schema for call data using NoSQL vs SQL. * what hardware architecture we use * NoSQL vs SQL insert-select performance * how we store graph data using C*.

Transcript of C* Summit EU 2013: Using Cassandra in a Telco Storage System

Page 1: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

Using Cassandra in a Telco Storage System

Pavel Pontryagin, senior developer

Page 2: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

* What hardware architecture we use

* How we model schema for call, client and graph data using NoSQL vs SQL

* NoSQL vs SQL insert-select performance

What we will discuss today is how we switched from SQL to NoSQL

Page 3: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

SQL

What and how much is Telco data

Page 4: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

* What is CDR? Under the hood.

Page 5: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

500 millions/day 100 millions/day

Scaling data in a traditional schema

Page 6: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

* Speed up processing *  Increase reliability * Decrease costs * Unify units  * Add new functionality

Unanswered questions. How to:

Solution for us is

Switch to Cassandra

Page 7: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL

How to load, model and query data

Page 8: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

The schema is

Local Storage

SSTable Loader

Local Storage

SSTable Loader

Local Storage

SSTable Loader

CDR files CDR files CDR files

Thrift Thrift Thrift 62Gb RAM 12 x CPU 2.50GHz 10Tb array

Page 9: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

SQL data Query

SELECT started, numA, numB FROM events_1003_main WHERE started > sysdate -1 and numA = ‘765-23-14’;

Data modeling. Calls

started numA numB Event type

Lac Cell IMSI IMEI

20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1

Page 10: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL data “raw” data CALLS_201301 index cf CALLS_MSISDN_201301 Query

SliceQuery<String, Composite, String> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_MSISDN_201301”); //define CF

sliceQuery.setKey(“201301.765-23-18”); //Query given msidn for given date Composite start = new Composite(); start.addComponent(0, “201301”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “201303”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, String>> qr = sliceQuery.execute(); //get the result

uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140

201301.765-­‐23-­‐14   {2013-­‐10-­‐15  21/04/58:uuid1}  numB  

201301.765-­‐23-­‐18   {2013-­‐10-­‐15  21/04/58:uuid1}  numA.IMSI1.IMEI1  

Page 11: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

… where num like ‘%123’ Using SolR

Clone Keep using

+SolR indexing

Page 12: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL data “raw” data CALLS_201301 index cf CALLS_LAC_CELL_201301 Query

SliceQuery<String, Composite, Date> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_LAC_CELL_201301”); //define CF

sliceQuery.setKey(“2013.05.01 15:15”); //Query given time slice Composite start = new Composite(); start.addComponent(0, “2901”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “2901”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, Date >> qr = sliceQuery.execute(); //get the result

uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140

2013.05.01 15:15 {2901:35140: uuid1} 20131015210458

Page 13: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

SQL data Query SELECT distinct dn.msisdn, tl.name, t.full FROM SUBS_DATA.DCT_NUMS dn, SUBS_DATA.SUBS_NUMS_HIST snh, subs_data.clnt_attrs_hist cah, SUBS_DATA.DCT_NAMES t, dicts.telcos tl Where (dn.num_id = subn_num_id) and (cah.clnt_clnt_id = snh.clnt_clnt_id) and (t.name_id = cah.reg_name_id) and (tl.telco_id = dn.tlco_tlco_id) and (snh.started > to_date('10.09.2012', 'DD.MM.YYYY')) and (snh.started < to_date('17.09.2013', 'DD.MM.YYYY')) order by dn.msisdn;

Data modeling. Subscribers

started numA numB Event type

Lac Cell IMSI IMEI

20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1

Page 14: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL “raw” subscribers data first slice update index cf SUBS_NAME

uuid1 Pavel, 765-23-14, address1, document1, 2, 21-01-2013, 1(active_status)

uuid2 Vadim, 765-23-16, address2, document2, 2, 21-01-2013, 1(active_status)

Pavel {21-01-2013: document1: 1} {21-01-2014: document1: 0} uuid1 uuid3

Vadim {21-01-2013: document2: 1}

uuid2

uuid3 Pavel, 765-23-14, address1, document1, 2, 21-01-2014, 0(active_status)

Page 15: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

Data modeling. Graph data

Page 16: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL graph data First record from the first switch Second record from the second switch CALLS_EDGES column family

2013/01/12.msisdn1 {VOICE: msisdn2: 1:uuid1} {VOICE: msisdn2: 2:uuid2} OUT OUT

2013/01/12.msisdn2 {VOICE: msisdn1: 1:uuid1} {VOICE: msisdn1: 2:uuid2} IN IN

uuid started numA numB Event type

subsciber

uuid1 2013/01/12 msisdn1 msisdn2 VOICE 1

uuid started numA numB Event type

subsciber

uuid2 2013/01/12 msisdn1 msisdn2 VOICE 1

Page 17: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

*  Loading speed *  Query speed *  Fun

Summary

Write,  rec/sec  

Oracle 10 000 Cassandra 5 000 per node

seconds  

Oracle > 1, depends on plans, statistics etc. Cassandra < 1, stable

Page 18: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

#CASSANDRAEU

CASSANDRASUMMITEU

*  How to model "traditional" data *  How to add full-text search to intensive data load *  A couple of hardware issues

What we discussed today…

Page 19: C* Summit EU 2013: Using Cassandra in a Telco Storage System

#CASSANDRAEU

CASSANDRASUMMITEU

THANK YOU

[email protected]

Pavel Pontryagin, senior developer