Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
C* Summit EU 2013: Using Cassandra in a Telco Storage System
-
Upload
planet-cassandra -
Category
Technology
-
view
667 -
download
0
description
Transcript of C* Summit EU 2013: Using Cassandra in a Telco Storage System
#CASSANDRAEU
CASSANDRASUMMITEU
Using Cassandra in a Telco Storage System
Pavel Pontryagin, senior developer
#CASSANDRAEU
CASSANDRASUMMITEU
* What hardware architecture we use
* How we model schema for call, client and graph data using NoSQL vs SQL
* NoSQL vs SQL insert-select performance
What we will discuss today is how we switched from SQL to NoSQL
#CASSANDRAEU
CASSANDRASUMMITEU
SQL
What and how much is Telco data
#CASSANDRAEU
CASSANDRASUMMITEU
* What is CDR? Under the hood.
#CASSANDRAEU
CASSANDRASUMMITEU
500 millions/day 100 millions/day
Scaling data in a traditional schema
#CASSANDRAEU
CASSANDRASUMMITEU
* Speed up processing * Increase reliability * Decrease costs * Unify units * Add new functionality
Unanswered questions. How to:
Solution for us is
Switch to Cassandra
#CASSANDRAEU
CASSANDRASUMMITEU
noSQL
How to load, model and query data
#CASSANDRAEU
CASSANDRASUMMITEU
The schema is
Local Storage
SSTable Loader
Local Storage
SSTable Loader
Local Storage
SSTable Loader
CDR files CDR files CDR files
Thrift Thrift Thrift 62Gb RAM 12 x CPU 2.50GHz 10Tb array
#CASSANDRAEU
CASSANDRASUMMITEU
SQL data Query
SELECT started, numA, numB FROM events_1003_main WHERE started > sysdate -1 and numA = ‘765-23-14’;
Data modeling. Calls
started numA numB Event type
Lac Cell IMSI IMEI
20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1
#CASSANDRAEU
CASSANDRASUMMITEU
noSQL data “raw” data CALLS_201301 index cf CALLS_MSISDN_201301 Query
SliceQuery<String, Composite, String> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_MSISDN_201301”); //define CF
sliceQuery.setKey(“201301.765-23-18”); //Query given msidn for given date Composite start = new Composite(); start.addComponent(0, “201301”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “201303”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, String>> qr = sliceQuery.execute(); //get the result
uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140
201301.765-‐23-‐14 {2013-‐10-‐15 21/04/58:uuid1} numB
201301.765-‐23-‐18 {2013-‐10-‐15 21/04/58:uuid1} numA.IMSI1.IMEI1
#CASSANDRAEU
CASSANDRASUMMITEU
… where num like ‘%123’ Using SolR
Clone Keep using
+SolR indexing
#CASSANDRAEU
CASSANDRASUMMITEU
noSQL data “raw” data CALLS_201301 index cf CALLS_LAC_CELL_201301 Query
SliceQuery<String, Composite, Date> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_LAC_CELL_201301”); //define CF
sliceQuery.setKey(“2013.05.01 15:15”); //Query given time slice Composite start = new Composite(); start.addComponent(0, “2901”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “2901”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, Date >> qr = sliceQuery.execute(); //get the result
uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140
2013.05.01 15:15 {2901:35140: uuid1} 20131015210458
#CASSANDRAEU
CASSANDRASUMMITEU
SQL data Query SELECT distinct dn.msisdn, tl.name, t.full FROM SUBS_DATA.DCT_NUMS dn, SUBS_DATA.SUBS_NUMS_HIST snh, subs_data.clnt_attrs_hist cah, SUBS_DATA.DCT_NAMES t, dicts.telcos tl Where (dn.num_id = subn_num_id) and (cah.clnt_clnt_id = snh.clnt_clnt_id) and (t.name_id = cah.reg_name_id) and (tl.telco_id = dn.tlco_tlco_id) and (snh.started > to_date('10.09.2012', 'DD.MM.YYYY')) and (snh.started < to_date('17.09.2013', 'DD.MM.YYYY')) order by dn.msisdn;
Data modeling. Subscribers
started numA numB Event type
Lac Cell IMSI IMEI
20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1
#CASSANDRAEU
CASSANDRASUMMITEU
noSQL “raw” subscribers data first slice update index cf SUBS_NAME
uuid1 Pavel, 765-23-14, address1, document1, 2, 21-01-2013, 1(active_status)
uuid2 Vadim, 765-23-16, address2, document2, 2, 21-01-2013, 1(active_status)
Pavel {21-01-2013: document1: 1} {21-01-2014: document1: 0} uuid1 uuid3
Vadim {21-01-2013: document2: 1}
uuid2
uuid3 Pavel, 765-23-14, address1, document1, 2, 21-01-2014, 0(active_status)
#CASSANDRAEU
CASSANDRASUMMITEU
Data modeling. Graph data
#CASSANDRAEU
CASSANDRASUMMITEU
noSQL graph data First record from the first switch Second record from the second switch CALLS_EDGES column family
2013/01/12.msisdn1 {VOICE: msisdn2: 1:uuid1} {VOICE: msisdn2: 2:uuid2} OUT OUT
2013/01/12.msisdn2 {VOICE: msisdn1: 1:uuid1} {VOICE: msisdn1: 2:uuid2} IN IN
uuid started numA numB Event type
subsciber
uuid1 2013/01/12 msisdn1 msisdn2 VOICE 1
uuid started numA numB Event type
subsciber
uuid2 2013/01/12 msisdn1 msisdn2 VOICE 1
#CASSANDRAEU
CASSANDRASUMMITEU
* Loading speed * Query speed * Fun
Summary
Write, rec/sec
Oracle 10 000 Cassandra 5 000 per node
seconds
Oracle > 1, depends on plans, statistics etc. Cassandra < 1, stable
#CASSANDRAEU
CASSANDRASUMMITEU
#CASSANDRAEU
CASSANDRASUMMITEU
* How to model "traditional" data * How to add full-text search to intensive data load * A couple of hardware issues
What we discussed today…