2 CONFIDENTIAL | A few words about Ustream Ustream data infrastructure before the BI team Current...

Data infrastructure 30.03.2015



Agenda

• A few words about Ustream

• Ustream data infrastructure before the BI team

• Current data infrastructure

• Big Data lessons learned & some future plans


Ever heard of Ustream?

• World’s leading live video service provider

• 80+ million monthly users

• SaaS company

• Founded in 2007

• 250 employees around the world: San Francisco, Budapest, Tokyo, and Seoul


You have probably seen a few of our streams…


Data infrastructure – before BI team

Ustream databases


Data infrastructure in general

DWH (MySQL)

ETL (Kettle)

Hadoop (Amazon S3+EMR)

Ustream DBs

Tableau

Ustream Media Server


Big Data infrastructure & data flow

Ustream Media Servers (meta)

Content Delivery Servers (data)

Redis

Log files

Realtime reports

Local backup


Compromises in the architecture

• The architecture seems ad hoc and heterogeneous:
- Yes, it is.
- Important: there is no magic, but the problem is still solved.
- It was the fastest way to do the task with limited resources.

• It’s a partial solution for Ustream’s needs:
- Financial, marketing, and similar reports go the traditional way.
- There is no reason to put a small amount of data into Hadoop…


Lessons learned

• Big Data is not a buzzword for us anymore
- Hadoop has some tricks, but you can easily use it in production
- Amazon EMR is a great place to learn

• Short time-to-market, but with compromises
- Small investment, still acceptable results

• Key factors:
- Strong sponsorship and trust from management
- Dedicated resources for research and development
- User expectations had to be managed


Future plans

• Click-stream, buffering, usage pattern analysis

• Change in logging methods
- Use Kafka for log shipping instead of log files
- Merge logs into one to understand usage patterns

• Better self-serve interface needed
- New version of Hue is promising
- Tableau comes with direct Hadoop connector
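
The plan to merge logs into one for usage pattern analysis can be illustrated with a minimal Python sketch. It is not Ustream's implementation and does not use Kafka: plain in-memory lists stand in for the log streams. The record shape `(timestamp, session_id, event_name)`, the event names `buffer_start` and `buffer_end`, and the sample data are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (timestamp_seconds, session_id, event_name).
Event = Tuple[float, str, str]


def merge_logs(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge several time-ordered event streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda event: event[0])


def buffering_seconds(events: Iterable[Event]) -> Dict[str, float]:
    """Sum, per session, the time between buffer_start and buffer_end events."""
    started: Dict[str, float] = {}
    totals: Dict[str, float] = defaultdict(float)
    for timestamp, session, name in events:
        if name == "buffer_start":
            started[session] = timestamp
        elif name == "buffer_end" and session in started:
            totals[session] += timestamp - started.pop(session)
    return dict(totals)


# Made-up sample data; each source is already sorted by timestamp.
player_log = [(1.0, "s1", "play"), (4.0, "s1", "buffer_start"), (6.5, "s1", "buffer_end")]
cdn_log = [(2.0, "s2", "play"), (3.0, "s2", "buffer_start"), (3.5, "s2", "buffer_end")]

merged = list(merge_logs(player_log, cdn_log))
totals = buffering_seconds(merged)
print(totals)
```

Once the sources are merged into a single timeline, per-session metrics such as buffering time need only one pass over the data, which is the practical benefit of merging before analysis.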


Q & A

Time to ask…


Thanks for your attention!

[email protected]