Leveraging Customer Behavioral Data to Drive...

1 @arnon86 S7456 Leveraging Customer Behavioral Data to Drive Revenue the GPU way

Transcript of Leveraging Customer Behavioral Data to Drive...

Page 1: Leveraging Customer Behavioral Data to Drive Revenue · on-demand.gputechconf.com/gtc/2017/presentation/s7456-arnon-shimoni...

1 @arnon86 S7456

Leveraging Customer Behavioral Data

to Drive Revenue

the GPU way

Page 2:

Hi! Arnon Shimoni

Senior Solutions Architect

I like hardware & parallel / concurrent stuff

In my 4th year at SQream Technologies

Send gifs to @arnon86 or [email protected]

Page 3:

tl;dr

• GPUs are good number crunchers, which makes them good for data processing

• SQream DB with GPUs is fast

• Rethink current solutions, the GPU can help

• Simple hardware is good enough; let’s avoid throwing lots of hardware at problems. No need to shovel money at them!

Page 4:

SQream DB – an SQL database powered by GPUs

Fast
• Columnar storage
• Always-on compression
• 2 TB / hour / GPU ingest speed

Scalable
• 10 TB to 1 PB with ease

SQL Database
• Familiar ANSI SQL
• Standard connectors (ODBC, JDBC)

Extensible for AI
• Python, Jupyter, etc.
• Data science

Powered by GPUs
• Massively parallel engine
• Relies on GPUs for power, not RAM

Page 5:

This story starts at MWC last year

That’s my ear!

Page 6:

SQream knows telecoms

We’ve helped operators with

• Better analysis of network events

• Speeding up CDR preparations

• More history with security management (SIEM)

• And now – customer behaviour

Page 7:

There is a lot of data about customers in telecoms

• Where and when they wake up and where they spend their days (daily grinders)

• When and where they were Instagramming (when and where data was used)

• How frustrated they got (what the network experience was in each location)

• What modes of transport they use

• How close they are to competitor locations

But are operators actually using this data? Are they getting anything actionable?

Are they looking at the entire customer base, and not just a single customer?

Page 8:

“You know, Telefonica has this multi-million dollar product based on Hadoop for selling this customer behaviour data to 3rd party companies.

Have you thought about maybe getting the same solution for your company, but much simpler?”

Page 9:

“Oh, and we’ll do it for you with a single machine”

Page 10:

Why their current setup wasn’t good enough for this

• Data scientists and BI professionals have only short windows of time to run queries, because of overloaded systems

• Windows cut even shorter due to long overnight loading

• Queries take hours, and iterations become painful

Long queries → Coffee breaks → Bathroom breaks → Unhappy managers → Unhappy everyone

Page 11:

Databases that displease data scientists

• When data scientists or BI professionals want to ask questions that no one has asked before, these systems tend to ‘break’ and not deliver what’s expected

• They’re just not designed for ad-hoc querying

• Legacy databases require indexing and a lot of manual tuning

• Newer databases like Vertica also require creating projections, which is time-consuming and inflexible

• Distributed databases don’t perform well when JOIN operations are necessary

• In-memory databases are very painful on the wallet if you need more than a couple of terabytes

Page 12:

Picking the wrong databases will cause pain!

Just some of what we saw

• Cloudera – for the BI team

• Teradata – for the marketing team

• Oracle Exadata – transactional, for CDR collection and customer records

• Vertica, Netezza – for financial

• Lots of Greenplum – to collect from many sources, for marketing and BI

Page 13:

Chanel says racks are fashionable. Our customers think otherwise

Page 14:

SQream DB software in a standard 2U server

Configured with 96 GB RAM and a single Tesla K80

for a $4,000 total investment.

Designed to handle ~40 TB of telecom data
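
As a rough back-of-the-envelope check, the ingest speed quoted earlier in the deck (2 TB / hour / GPU) can be combined with the ~40 TB figure to estimate a full load time. This is only a sketch: it assumes the quoted rate is sustained for the whole load and scales linearly with GPU count, neither of which the slides state, and the function name is made up for illustration.

```python
def estimated_load_hours(data_tb, tb_per_hour_per_gpu=2.0, gpus=1):
    """Estimate hours needed to ingest `data_tb` terabytes.

    Assumes a constant per-GPU ingest rate and linear scaling across GPUs
    (both are simplifying assumptions, not claims from the talk).
    """
    if data_tb < 0 or tb_per_hour_per_gpu <= 0 or gpus < 1:
        raise ValueError("data_tb must be >= 0, rate > 0 and gpus >= 1")
    return data_tb / (tb_per_hour_per_gpu * gpus)


if __name__ == "__main__":
    # ~40 TB at 2 TB/hour on one GPU
    print(estimated_load_hours(40))
```

Under these assumptions, loading 40 TB on a single GPU would take on the order of 20 hours.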

Page 15:

Sample dashboards generated

Dashboard showing 3G/4G data throughput throughout the day (Morning, Lunch, Evening, Night, …).

• Larger circles represent more data throughput.

• Colour becomes darker as the day progresses.

• Dark-outline circles mean more night-time traffic.

Dashboard aggregates directly off SQream DB, with no intermediate steps.

Represents a 3-table join (3.3B rows ⋈ 40M rows ⋈ 300K rows)
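
A three-table join feeding a dashboard like this might look roughly like the sketch below. It uses Python's built-in sqlite3 module on a tiny in-memory dataset purely for illustration, not SQream DB itself. All table and column names (events, subscribers, cells, part_of_day, bytes) are hypothetical; the talk does not show the actual schema or query.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the fact and dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cells (cell_id INTEGER PRIMARY KEY, cell_name TEXT);
        CREATE TABLE subscribers (subscriber_id INTEGER PRIMARY KEY, network_type TEXT);
        CREATE TABLE events (subscriber_id INTEGER, cell_id INTEGER,
                             part_of_day TEXT, bytes INTEGER);
    """)
    conn.executemany("INSERT INTO cells VALUES (?, ?)",
                     [(1, "North"), (2, "South")])
    conn.executemany("INSERT INTO subscribers VALUES (?, ?)",
                     [(10, "3G"), (11, "4G")])
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", [
        (10, 1, "Morning", 100),
        (10, 1, "Morning", 50),
        (11, 1, "Night", 300),
        (11, 2, "Morning", 200),
    ])
    return conn


def throughput_by_cell_and_daypart(conn):
    """Join the large events table to two smaller tables and sum throughput."""
    query = """
        SELECT c.cell_name, e.part_of_day, s.network_type,
               SUM(e.bytes) AS total_bytes
        FROM events e
        JOIN subscribers s ON s.subscriber_id = e.subscriber_id
        JOIN cells c ON c.cell_id = e.cell_id
        GROUP BY c.cell_name, e.part_of_day, s.network_type
        ORDER BY c.cell_name, e.part_of_day, s.network_type
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in throughput_by_cell_and_daypart(build_demo_db()):
        print(row)
```

Each result row would correspond to one circle on the dashboard: a location, a part of the day, and the total throughput that sets the circle's size.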

Page 17:

Saving hours on reporting with SQream DB

Augmenting legacy MPP with a faster, easier-to-use GPU-powered analytics database

• Data sources: CDR 4G, CDR 3G, non-CDR

• Legacy MPP (80 nodes): ETL process and aggregations take 5 hours

• SQream DB: direct loading at a 2 TB/h ingest rate, 20 minutes

• Output: dozens of reports

• 15x faster

Page 18:

The cost of performance

Legacy MPP: 80 nodes – 5 full racks, 960 CPU cores, 5.12 TB RAM

SQream DB v1.9.6: HP DL380g9 with NVIDIA Tesla K80, 96 GB RAM + 6 TB storage

• ETL time: 300 m vs. 20 m (15x faster)

• Reporting time: 120 m vs. 10 m (12x faster)

• TCO w/ license: $10,000,000 vs. $200,000 (50x more cost effective)
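
The ratios quoted on this slide follow directly from the figures shown. A small sketch of the arithmetic, with dictionary keys chosen here only for illustration:

```python
# Figures as shown on the slide (minutes and US dollars).
legacy = {"etl_minutes": 300, "reporting_minutes": 120, "tco_usd": 10_000_000}
sqream = {"etl_minutes": 20, "reporting_minutes": 10, "tco_usd": 200_000}


def improvement_ratios(before, after):
    """Return before/after for every metric the two systems share."""
    return {key: before[key] / after[key] for key in before if key in after}


ratios = improvement_ratios(legacy, sqream)

if __name__ == "__main__":
    for metric, ratio in ratios.items():
        print(f"{metric}: {ratio:.0f}x")
```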

Page 19:

That wasn’t an anomaly

We’ve done it against Netezza, Teradata, Oracle, Vertica, and even Hadoop-based systems.

Netezza: 8 full 42U racks, 56 S-Blades, 7 TB RAM

SQream DB v1.9.7: Dell C4130 with 4x NVIDIA Tesla K80, 512 GB RAM + iSCSI JBOD (20 TB)

• Average query time (seconds): 33.70 (Netezza) vs. 31.70 (SQream DB)

• Processing units (S-Blades / GPUs): 56 vs. 4

• Compression ratio: 4.0 vs. 4.7

• Cost of ownership ($): 12,000,000 vs. 500,000
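
To put the chart values side by side, the sketch below reads the first set of numbers as Netezza and the second as SQream DB. That mapping is an interpretation of the slide layout (56 matches the S-Blade count, 4 matches the K80 count), and the class and field names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemFigures:
    avg_query_seconds: float
    processing_units: int
    compression_ratio: float
    cost_of_ownership_usd: int


# Values as they appear on the slide; the system mapping is an assumption.
netezza = SystemFigures(33.70, 56, 4.0, 12_000_000)
sqream_db = SystemFigures(31.70, 4, 4.7, 500_000)


def compare(baseline, candidate):
    """Express each baseline figure as a multiple of the candidate's."""
    return {
        "query_time": baseline.avg_query_seconds / candidate.avg_query_seconds,
        "processing_units": baseline.processing_units / candidate.processing_units,
        "cost": baseline.cost_of_ownership_usd / candidate.cost_of_ownership_usd,
    }


summary = compare(netezza, sqream_db)

if __name__ == "__main__":
    for metric, ratio in summary.items():
        print(f"{metric}: {ratio:.2f}x")
```

Read this way, average query times are close to parity, while the number of processing units and the cost of ownership differ by 14x and 24x respectively.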

Page 20:

Find out more about SQream’s high performance GPU-driven database software

www.sqream.com or [email protected]