Insight dataengineering henok_rehearsaldemo
![Page 1: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/1.jpg)
Where is my tweet?
Henok Mengistu
Insight Data Engineering Fellow
Silicon Valley, Summer 2016
![Page 2: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/2.jpg)
Motivation
![Page 3: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/3.jpg)
Motivation
But this number doesn't show how the tweet spreads out.
![Page 4: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/4.jpg)
But a re-tweet graph could show it.
![Page 6: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/6.jpg)
Under the hood
![Page 7: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/7.jpg)
Engineering Challenges
● Stitching the different components
● Re-tweets could arrive out of order
– Spark can't sort across a data stream
– The driver node should collect and sort re-tweets
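The collect-and-sort step can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline from the talk: the lists stand in for micro-batches that a Spark job might bring to the driver (for example with `RDD.collect()`), and the field names `user`, `retweeted_from`, and `created_at` are hypothetical.

```python
from operator import itemgetter

# Hypothetical micro-batches of re-tweets, in arrival order.
# Field names and timestamps are assumptions made for this sketch.
batches = [
    [{"user": "carol", "retweeted_from": "bob", "created_at": 1030}],
    [{"user": "bob", "retweeted_from": "alice", "created_at": 1010},
     {"user": "dave", "retweeted_from": "bob", "created_at": 1045}],
    [{"user": "erin", "retweeted_from": "alice", "created_at": 1020}],
]


def collect_and_sort(batches):
    """Gather re-tweets from every batch in one place, then sort by timestamp."""
    collected = [rt for batch in batches for rt in batch]
    return sorted(collected, key=itemgetter("created_at"))


ordered = collect_and_sort(batches)
order_of_users = [rt["user"] for rt in ordered]
print(order_of_users)
```

Sorting in one place is simple, but it requires the collected re-tweets to fit in the driver's memory, which is a reasonable assumption only for a single tweet's re-tweet set.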
![Page 8: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/8.jpg)
● I am Henok
– Originally from Ethiopia
– Currently a PhD student at the University of Wyoming
● Working on Evolutionary Computation
● I was also working as a teaching assistant
– I like soccer, but not skiing
![Page 9: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/9.jpg)
Thank you!
![Page 10: Insight dataengineering henok_rehearsaldemo](https://reader031.fdocuments.net/reader031/viewer/2022030313/58eca1d61a28aba6268b46d3/html5/thumbnails/10.jpg)
Queries
● On the re-tweet graph
– Who are my audiences?
● Geographically, social groups
– Betweenness centrality
● Who is relevant to spreading my tweet?
● Identify influential followers
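As a rough illustration of the betweenness centrality query, the sketch below uses Brandes' algorithm on a small directed, unweighted re-tweet graph. The slides do not say how the score was computed, so the algorithm choice, the edge direction (from the user who was re-tweeted to the re-tweeter), and the user names are all assumptions.

```python
from collections import defaultdict, deque


def betweenness_centrality(edges):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes)."""
    adj = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        adj[src].add(dst)
        nodes.update((src, dst))

    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {n: [] for n in nodes}
        sigma = dict.fromkeys(nodes, 0)
        dist = dict.fromkeys(nodes, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical re-tweet edges: (re-tweeted user, re-tweeter).
retweet_edges = [("alice", "bob"), ("bob", "carol"),
                 ("bob", "dave"), ("alice", "erin")]
scores = betweenness_centrality(retweet_edges)
most_influential = max(scores, key=scores.get)
print(most_influential, scores[most_influential])
```

In this toy graph only `bob` sits between the original author and other re-tweeters, so `bob` is the only user with a non-zero score.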