Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark, March 18, 2016, Jongwook Woo, PhD


Page 1: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

March 18, 2016

Jongwook Woo, PhD

Page 2: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Content

Hadoop and Spark

IBM DashDB

Conclusion

Page 3: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Hadoop and Spark

Environment Systems

Azure HDInsight Spark: 8 nodes

40 cores: 2.4 GHz Intel Xeon

Memory, each node: 28 GB

Data source: keyword ‘alphago’ from Twitter via Apache NiFi

Data size: 63,193 tweets

Real-time data collection period: 03/12 – 03/17/2016

No data collected on 03/13

Page 4: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Top 10 Countries that Tweet “Alphago”

Legend: Positive, Negative

Page 5: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Top 10 Countries: Number of Tweets per Country

USA: > 11,000

Japan: > 9,000

Korea: > 1,900

Russia, UK: > 1,600

Thailand, France: > 1,000

Netherlands, Spain, Ukraine: > 600
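Per-country totals like these come from a group-and-count over the collected tweets. The following is only a minimal sketch in plain Python, not the Spark job the deck ran. The `country` field name and the sample records are assumptions made for illustration; the deck does not show how the country was derived for each tweet.

```python
from collections import Counter

def top_countries(tweets, n=10):
    """Return the n most frequent countries as (country, count) pairs."""
    # Skip tweets without a country value; the "country" key is hypothetical.
    counts = Counter(t["country"] for t in tweets if t.get("country"))
    return counts.most_common(n)

# Made-up sample records for illustration, not data from the deck.
sample = [
    {"text": "alphago wins", "country": "USA"},
    {"text": "alphago again", "country": "Japan"},
    {"text": "go alphago", "country": "USA"},
    {"text": "alphago", "country": None},
]
print(top_countries(sample, n=2))
```

On a cluster the same idea would typically be a group-by on the country column followed by a descending sort on the count.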

Page 6: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Top 10 Countries: Sentiment

Legend: Positive, Negative

Page 7: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Top 10 Countries

Most Tweeted Countries

All countries show more positive tweets

Korea, Japan, USA

Country   Positive   Negative
USA       5070       3567
Japan     8118       217…
Korea     1053       407…
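To make "more positive tweets" concrete, the positive share can be computed from a row's two counts. This small sketch uses only the USA row, because the Japan and Korea negative counts are truncated in the source. The function name `positive_share` is introduced here for illustration.

```python
def positive_share(positive, negative):
    """Fraction of sentiment-labeled tweets that are positive."""
    total = positive + negative
    if total == 0:
        raise ValueError("no labeled tweets")
    return positive / total

# USA row from the table: 5070 positive, 3567 negative.
usa_share = positive_share(5070, 3567)
print(f"USA positive share: {usa_share:.1%}")
```

For the USA row this works out to roughly 59% positive among the tweets labeled positive or negative.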

Page 8: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Daily Tweets in 03/12 – 03/17/2016

[Chart: Alphago vs Lee Sedol, daily tweet counts for 3/12/2016 through 3/17/2016; y-axis from 0 to 50,000]

Game 3: Mar 12

Game 4: Mar 13 (Lee Se-Dol win)

Game 5: Mar 15
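A daily series like this can be built by bucketing each tweet's timestamp by calendar day. The sketch below is plain Python and makes two assumptions: that timestamps arrive in the classic Twitter `created_at` string format, and that days are counted in the timestamp's own UTC offset. The sample values are invented.

```python
from collections import Counter
from datetime import datetime

# Assumed input format, e.g. "Sat Mar 12 05:10:00 +0000 2016".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def daily_counts(created_at_values):
    """Count tweets per calendar day, keyed by ISO date string."""
    counts = Counter()
    for value in created_at_values:
        day = datetime.strptime(value, TWITTER_TIME_FORMAT).date()
        counts[day.isoformat()] += 1
    # Sort by date so the result reads like the chart's x-axis.
    return dict(sorted(counts.items()))

# Made-up timestamps for illustration, not data from the deck.
sample = [
    "Sat Mar 12 05:10:00 +0000 2016",
    "Sat Mar 12 09:45:12 +0000 2016",
    "Mon Mar 14 01:02:03 +0000 2016",
]
print(daily_counts(sample))
```

A day with no collected tweets is simply absent from the result rather than reported as zero.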

Page 9: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

N-gram words: 3 words in a row right after the Go champion's name, “sedol” and “se-dol”

sedol

se-dol

3-grams             Frequency
Again-to-win        1,187
Is-something-I’ll   369
Is-something-i      199
In-go-tournament    168
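The 3-gram counts above depend on taking the three words that directly follow each occurrence of the name. This is a minimal plain-Python sketch of that idea, not the deck's actual Spark code. The tokenization rule (lowercase; letters, digits, apostrophes, and hyphens count as word characters) and the sample sentences are assumptions.

```python
import re
from collections import Counter

NAMES = {"sedol", "se-dol"}

def trigrams_after_name(texts):
    """Count the three words that directly follow 'sedol' or 'se-dol'."""
    counts = Counter()
    for text in texts:
        # Simplified tokenizer: an assumption, not the deck's method.
        words = re.findall(r"[a-z0-9'-]+", text.lower())
        for i, word in enumerate(words):
            following = words[i + 1:i + 4]
            # Only count a full run of three following words.
            if word in NAMES and len(following) == 3:
                counts["-".join(following)] += 1
    return counts

# Made-up sentences for illustration, not tweets from the data set.
sample = [
    "Lee Sedol again to win against AlphaGo",
    "Can Lee Se-dol again to win?",
    "Sedol lost",
]
print(trigrams_after_name(sample).most_common(1))
```

Because the sketch lowercases everything, variants that differ only in case would be merged into one entry.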

Page 10: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Sentiment Map of Alphago

Legend: Positive, Negative

Page 12: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Twitter Analysis using IBM DashDB

Environment:

DashDB and Tweets Services of IBM Bluemix

Load existing data

Period: through March 16, 2016

Authors and Followers of the Tweets

Page 13: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Top 10 Tweet Countries With Hashtag “#Alphago”

United States: > 10,000

Japan: > 8,000

Korea: > 1,800

Page 14: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Hashtags Frequency

Page 15: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Sentiment at #Alphago

Page 16: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Gender Counts of Who Tweets

[Chart categories: female, male, unknown]

Page 17: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Tweet Counts per Month

[Chart: tweet counts per month; x-axis labels Aug-2014, Feb-2015, Feb-2016, Jan-2015, Jan-2016, Mar-2016; y-axis from 0 to 12,000]

Page 18: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Daily Tweets During Games

[Chart: daily tweet counts for 3/9/2016 through 3/16/2016; y-axis from 0 to 3,500]

Game 1: Mar 9

Game 2: Mar 10

Game 3: Mar 12

Game 4: Mar 13 (Lee Se-Dol win)

Game 5: Mar 15

Page 19: Alphago vs Lee Se-Dol: Tweeter Analysis using Hadoop and Spark

Conclusion

Analyzed tweets containing “Alphago”

USA and Japan dominate the tweets

More than Korea

European countries as well

More positive tweets

Alphago and Lee Sedol both become popular