Smalltalk and Big Data - Avi Bryant


Transcript of Smalltalk and Big Data - Avi Bryant

Page 1: Smalltalk and Big Data - Avi Bryant

Smalltalk and Big Data

Avi Bryant, Twitter

Page 2: Smalltalk and Big Data - Avi Bryant
Page 3: Smalltalk and Big Data - Avi Bryant

Smalltalk and Big Data

Avi Bryant, Twitter

and the web

Page 4: Smalltalk and Big Data - Avi Bryant

Smalltalk and Big Data

Avi Bryant, Twitter

and the web

and stuff

Page 5: Smalltalk and Big Data - Avi Bryant

2004

Page 6: Smalltalk and Big Data - Avi Bryant

2004-2011

Page 7: Smalltalk and Big Data - Avi Bryant

View

Controller

Model

Page 8: Smalltalk and Big Data - Avi Bryant

Web Client

Web Server

Storage

Page 9: Smalltalk and Big Data - Avi Bryant

Web Client

HTML

GET/POST

HTML

GET/POST

HTML

GET/POST

HTML

GET/POST

Page 10: Smalltalk and Big Data - Avi Bryant

Page 11: Smalltalk and Big Data - Avi Bryant

Web Client

HTML+JS

GET/POST

JSON

XHR

JSON

XHR

JSON

XHR

Page 12: Smalltalk and Big Data - Avi Bryant
Page 13: Smalltalk and Big Data - Avi Bryant
Page 14: Smalltalk and Big Data - Avi Bryant

Ten days to implement the lexer, parser, bytecode emitter, interpreter, built-in classes, and decompiler.

Ten days without much sleep to build JS from scratch, "make it look like Java" (I made it look like C), and smuggle in its saving graces: first class functions (closures came later but were part of the plan), Self-ish prototypes (one per instance, not many as in Self).

I'll do better in the next life.

— Brendan Eich

Page 15: Smalltalk and Big Data - Avi Bryant
Page 16: Smalltalk and Big Data - Avi Bryant

Lars Bak

Page 17: Smalltalk and Big Data - Avi Bryant
Page 18: Smalltalk and Big Data - Avi Bryant

150M+ active users

Page 19: Smalltalk and Big Data - Avi Bryant
Page 20: Smalltalk and Big Data - Avi Bryant
Page 21: Smalltalk and Big Data - Avi Bryant
Page 22: Smalltalk and Big Data - Avi Bryant
Page 23: Smalltalk and Big Data - Avi Bryant

Web Client

Web Server

Storage

Page 24: Smalltalk and Big Data - Avi Bryant

Web Server

• Continuation-based flow control

Page 25: Smalltalk and Big Data - Avi Bryant
Page 26: Smalltalk and Big Data - Avi Bryant

Web Server

• Continuation-based flow control
• HTML generation

Page 27: Smalltalk and Big Data - Avi Bryant

Web Server

• Continuation-based flow control
• HTML generation
• Stateful UI components

Page 28: Smalltalk and Big Data - Avi Bryant

Web Server

• Continuation-based flow control
• HTML generation
• Stateful UI components
• Callbacks with unique IDs
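
The "callbacks with unique IDs" bullet can be sketched roughly as follows. This is a minimal illustration in Python, not Seaside's actual implementation: the `CallbackRegistry` class and its method names are hypothetical, and only the `_k` parameter name is borrowed from the URL shown later in the deck.

```python
import secrets


class CallbackRegistry:
    """Hypothetical per-session registry mapping unguessable IDs to closures."""

    def __init__(self):
        self._callbacks = {}

    def register(self, callback):
        # Each rendered link gets a fresh random ID (8 hex characters here).
        key = secrets.token_hex(4)
        while key in self._callbacks:
            key = secrets.token_hex(4)
        self._callbacks[key] = callback
        return key

    def render_link(self, label, callback):
        # The ID travels in the URL; the closure itself stays on the server.
        key = self.register(callback)
        return f'<a href="?_k={key}">{label}</a>'

    def handle(self, key):
        # An unknown ID means the request did not come from a page we rendered.
        callback = self._callbacks.get(key)
        if callback is None:
            raise KeyError("unknown callback id")
        return callback()


counter = {"value": 0}


def increment():
    counter["value"] += 1
    return counter["value"]


registry = CallbackRegistry()
link = registry.render_link("++", increment)
key = link.split("_k=")[1].split('"')[0]  # what the browser would send back
registry.handle(key)
```

The point of the design is that the page never names an action directly. It only carries an opaque ID that the server resolves to whatever closure was registered when the page was rendered.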

Page 29: Smalltalk and Big Data - Avi Bryant


Page 30: Smalltalk and Big Data - Avi Bryant

CSRF

<img src="http://mail.google.com/mail/?logout" />

Page 31: Smalltalk and Big Data - Avi Bryant

CSRF

<img src="http://mail.google.com/mail/?logout" />

http://mail.google.com/mail/?logout&token=ab4367de

Page 32: Smalltalk and Big Data - Avi Bryant

CSRF

<img src="http://mail.google.com/mail/?logout" />

http://mail.google.com/mail/?logout&token=ab4367de

http://mail.google.com/seaside/mail?_k=ab4367de
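
The slides above contrast a bare logout URL, which any third-party page can trigger by embedding it as an image, with a URL that carries a per-session token the attacker cannot guess. A minimal sketch of that check, assuming hypothetical helper names `new_session_token` and `is_request_allowed` (neither comes from the talk or from any real mail service):

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlparse


def new_session_token():
    # Random per-session value; an attacker's page cannot guess it.
    return secrets.token_hex(4)


def is_request_allowed(url, session_token):
    """Accept a state-changing request only if it carries the session's token."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    supplied = query.get("token", [""])[0]
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied, session_token)


token = new_session_token()
forged = "http://mail.google.com/mail/?logout"
genuine = f"http://mail.google.com/mail/?logout&token={token}"
```

Here `is_request_allowed(forged, token)` is false and `is_request_allowed(genuine, token)` is true. A framework that already routes every action through an unguessable per-session key gets the same protection as a side effect.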

Page 33: Smalltalk and Big Data - Avi Bryant

Burn the disk packs

• No components, continuations, or canvas
• JSON builder w/ callbacks

Page 34: Smalltalk and Big Data - Avi Bryant

Web Client

Web Server

Storage

Page 35: Smalltalk and Big Data - Avi Bryant

Storage

=~

Page 36: Smalltalk and Big Data - Avi Bryant

Stone

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Page 37: Smalltalk and Big Data - Avi Bryant

MySQL

Memcache

Ruby Ruby

Memcache

Ruby Ruby

Memcache

Ruby Ruby

Memcache

Ruby Ruby

Page 38: Smalltalk and Big Data - Avi Bryant

Gem+SPC+Stone = Transparent Management

Ruby+Memcache+MySQL = Explicit Management

Storage
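
To make "explicit management" concrete, here is a small cache-aside sketch in which the application itself coordinates cache and database. The two dictionaries are stand-ins for Memcache and MySQL, and the `ExplicitStore` class is a hypothetical name used only for this illustration.

```python
class ExplicitStore:
    """Cache-aside access: the application coordinates cache and database."""

    def __init__(self):
        self.database = {}  # stand-in for MySQL
        self.cache = {}     # stand-in for Memcache
        self.database_reads = 0

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall through to the database and remember the result.
        self.database_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        self.database[key] = value
        # Forgetting this invalidation is the classic stale-cache bug.
        self.cache.pop(key, None)


store = ExplicitStore()
store.write("user:1", "avi")
first = store.read("user:1")   # misses the cache, reads the database
second = store.read("user:1")  # served from the cache
store.write("user:1", "avi b")
third = store.read("user:1")   # cache was invalidated, so this is fresh
```

Every read and write path has to remember the cache. In the transparent arrangement on the slide, that bookkeeping is the storage system's job rather than the application's.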

Page 39: Smalltalk and Big Data - Avi Bryant


Page 40: Smalltalk and Big Data - Avi Bryant


Page 41: Smalltalk and Big Data - Avi Bryant

MySQL

Memcache

Ruby Ruby

Memcache

Ruby Ruby

Memcache

Ruby Ruby

Memcache

Ruby Ruby

MySQL MySQL MySQL

Page 42: Smalltalk and Big Data - Avi Bryant

Sharding?

OOCL: 3B objects, 500GB data

Page 43: Smalltalk and Big Data - Avi Bryant

Sharding?

OOCL: 3B objects, 500GB data

= 3 weeks of tweets
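
Sharding, as raised on the slide above, means routing each key to one of several independent databases. A minimal sketch of hash-based routing follows. The shard count of 3 and the `shard_for`, `put`, and `get` helpers are assumptions for illustration, and plain dictionaries stand in for the database servers.

```python
import zlib

SHARD_COUNT = 3  # assumption: three independent database servers


def shard_for(key, shard_count=SHARD_COUNT):
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % shard_count


shards = [dict() for _ in range(SHARD_COUNT)]


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    # A lookup touches exactly one shard; cross-shard queries need extra work.
    return shards[shard_for(key)].get(key)


for user_id in range(1000):
    put(f"user:{user_id}", user_id)
```

Single-key reads and writes stay cheap under this scheme. Anything that spans keys (joins, global counts) now has to visit every shard.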

Page 44: Smalltalk and Big Data - Avi Bryant

Stone Slave

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Shared Page Cache

Gem Gem

Stone Stone Stone

Page 45: Smalltalk and Big Data - Avi Bryant

Web Client

Web Server

Online Storage Offline Storage

Page 46: Smalltalk and Big Data - Avi Bryant

Offline Storage

15TB

Page 47: Smalltalk and Big Data - Avi Bryant

Offline Storage

15TB

Thu

Page 48: Smalltalk and Big Data - Avi Bryant

Offline Storage

15TB

Mon

15TB

Tue

15TB

Wed

15TB

Thu

15TB

Fri

15TB

Sat

15TB

Sun

15TB

Mon

15TB

Tue

15TB

Wed

15TB

Thu

15TB

Fri

15TB

Sat

15TB

Sun

15TB

Mon

15TB

Tue

15TB

Wed

15TB

Thu

15TB

Fri

15TB

Sat

15TB

Sun

Page 49: Smalltalk and Big Data - Avi Bryant

Hadoop

Page 50: Smalltalk and Big Data - Avi Bryant


Page 51: Smalltalk and Big Data - Avi Bryant


Page 52: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0

tweets.tsv

tweets.tsv/part1

tweets.tsv/part2

Page 53: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0, tweets.tsv/part1

tweets.tsv

tweets.tsv/part1, tweets.tsv/part2

tweets.tsv/part2, tweets.tsv/part0
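
The layout on this slide, where each part lives on two of three machines, can be reproduced with a simple wrap-around placement. This is only a sketch of the idea: the `place_parts` and `readable_parts` functions are hypothetical, and real HDFS block placement follows its own policy that is not modelled here.

```python
def place_parts(parts, node_count, replicas=2):
    """Put each part on `replicas` neighbouring nodes, wrapping around."""
    nodes = [[] for _ in range(node_count)]
    for index, part in enumerate(parts):
        for offset in range(replicas):
            nodes[(index - offset) % node_count].append(part)
    return nodes


def readable_parts(placement, failed_node):
    """Parts still reachable when one node is down."""
    survivors = [held for i, held in enumerate(placement) if i != failed_node]
    return {part for held in survivors for part in held}


parts = ["tweets.tsv/part0", "tweets.tsv/part1", "tweets.tsv/part2"]
placement = place_parts(parts, node_count=3)
```

With two copies of every part, losing any single node still leaves the whole file readable, which `readable_parts(placement, n)` confirms for each `n`.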

Page 54: Smalltalk and Big Data - Avi Bryant
Page 55: Smalltalk and Big Data - Avi Bryant

MAP REDUCE

Page 56: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0

tweets.tsv

tweets.tsv/part1

tweets.tsv/part2

grep smalltalk tweets.tsv > st.tsv

grep smalltalk tweets.tsv/part0 > st.tsv/part0

grep smalltalk tweets.tsv/part1 > st.tsv/part1

grep smalltalk tweets.tsv/part2 > st.tsv/part2

MAP

Page 57: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0, st.tsv/part0

tweets.tsv

tweets.tsv/part1, st.tsv/part1

tweets.tsv/part2, st.tsv/part2

grep smalltalk tweets.tsv > st.tsv

grep smalltalk tweets.tsv/part0 > st.tsv/part0

grep smalltalk tweets.tsv/part1 > st.tsv/part1

grep smalltalk tweets.tsv/part2 > st.tsv/part2

MAP

Page 58: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0, st.tsv/part0

tweets.tsv

tweets.tsv/part1, st.tsv/part1

tweets.tsv/part2, st.tsv/part2

wc -l st.tsv > count.tsv

REDUCE

sum > count.tsv/part0

wc -l st.tsv/part0

wc -l st.tsv/part1

wc -l st.tsv/part2
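
The two steps on these slides can be mimicked in a few lines: the map step filters each part independently (the per-part `grep`), and the reduce step counts lines per part and sums the partial counts. The sample rows below are made up, and `grep_part` and `count_lines` are hypothetical helper names.

```python
PARTS = {
    "tweets.tsv/part0": ["1\tlearning smalltalk today", "2\tlunch"],
    "tweets.tsv/part1": ["3\tsmalltalk images are neat"],
    "tweets.tsv/part2": ["4\tcoffee", "5\tsqueak is a smalltalk", "6\tsmalltalk!"],
}  # made-up sample rows


def grep_part(lines, needle):
    # MAP: runs on one part at a time, like `grep smalltalk tweets.tsv/partN`.
    return [line for line in lines if needle in line]


def count_lines(lines):
    # Like `wc -l` on one part.
    return len(lines)


st_parts = {
    name.replace("tweets", "st"): grep_part(lines, "smalltalk")
    for name, lines in PARTS.items()
}
partial_counts = [count_lines(lines) for lines in st_parts.values()]
total = sum(partial_counts)  # REDUCE: combine the per-part counts
```

No map call looks at more than one part, so the parts can be processed on whichever machines already hold them. Only the small per-part counts have to travel to the reducer.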

Page 59: Smalltalk and Big Data - Avi Bryant

count-words st.tsv/* | sort | sum > count.tsv

squeak 3, smalltalk 5, visualworks 10
squeak 6, smalltalk 4, visualworks 7
squeak 1, visualworks 3

squeak 1, squeak 3, squeak 6, smalltalk 4, smalltalk 5, visualworks 3, visualworks 7, visualworks 10

squeak 10, smalltalk 9, visualworks 20
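
The `count-words | sort | sum` pipeline can be traced with the numbers from this slide. In the sketch below, sorting brings equal words next to each other and a grouped sum collapses each run. The variable names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

# Per-part (word, count) pairs, copied from the slide.
emitted = [
    ("squeak", 3), ("smalltalk", 5), ("visualworks", 10),
    ("squeak", 6), ("smalltalk", 4), ("visualworks", 7),
    ("squeak", 1), ("visualworks", 3),
]

# sort: bring equal keys next to each other.
shuffled = sorted(emitted, key=itemgetter(0))

# sum: collapse each run of equal keys into one total.
totals = {
    word: sum(count for _, count in group)
    for word, group in groupby(shuffled, key=itemgetter(0))
}
```

The resulting totals match the last row of the slide: squeak 10, smalltalk 9, visualworks 20.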

Page 60: Smalltalk and Big Data - Avi Bryant

tweets.tsv/part0, st.tsv/part0

tweets.tsv

tweets.tsv/part1, st.tsv/part1

tweets.tsv/part2, st.tsv/part2

count-words st.tsv | sort | sum

REDUCE

sum > count.tsv/part2

count-words st.tsv/part0

count-words st.tsv/part1

count-words st.tsv/part2

sum > count.tsv/part1

sum > count.tsv/part0

(word, count)
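
When the reduce itself is spread over several machines, each (word, count) pair has to be routed so that all counts for one word reach the same reducer. A rough sketch of that partitioning, assuming three reducers and a CRC-based partition function (both are choices made for this example, not details from the talk):

```python
import zlib
from collections import defaultdict

REDUCERS = 3  # assumption: one reducer per output part


def partition(word, reducers=REDUCERS):
    # Deterministic, so every mapper sends a given word to the same reducer.
    return zlib.crc32(word.encode("utf-8")) % reducers


def run_reduce(map_outputs, reducers=REDUCERS):
    buckets = [defaultdict(int) for _ in range(reducers)]
    for output in map_outputs:
        for word, count in output:
            buckets[partition(word, reducers)][word] += count
    return {f"count.tsv/part{i}": dict(b) for i, b in enumerate(buckets)}


map_outputs = [
    [("squeak", 3), ("smalltalk", 5), ("visualworks", 10)],
    [("squeak", 6), ("smalltalk", 4), ("visualworks", 7)],
    [("squeak", 1), ("visualworks", 3)],
]
result = run_reduce(map_outputs)
merged = {w: c for part in result.values() for w, c in part.items()}
```

Each word ends up in exactly one output part, so concatenating the parts gives the full answer without any further merging.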

Page 61: Smalltalk and Big Data - Avi Bryant

MAP REDUCE

MAP

REDUCE

MAP

REDUCE REDUCE

MAP

MAP

REDUCE

Page 62: Smalltalk and Big Data - Avi Bryant

MAP REDUCE

MAP

REDUCE

MAP

REDUCE REDUCE

MAP

MAP

REDUCE

Join

Group & Count

Page 63: Smalltalk and Big Data - Avi Bryant

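// Nested inside the job class; needs java.io.IOException, java.util.StringTokenizer,
// org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text,
// and org.apache.hadoop.mapreduce.Mapper.
public static class TokenizerMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  // Reused output objects, as in the stock Hadoop WordCount example.
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(Object key, Text value, Context context)
      throws IOException, InterruptedException {
    // Emit (token, 1) for every whitespace-separated token in the line.
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}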

:(

Page 64: Smalltalk and Big Data - Avi Bryant

:/

Page 65: Smalltalk and Big Data - Avi Bryant

users := 'users.csv' loadFromHadoop.
users_1825 := users select: [:ea | ea age between: 18 and: 25].
joined := users_1825 joinedWith: pages by: ...

?

!HadoopCollection categoriesFor: 'map/reduce'!
map: mapBlock thenReduce: reduceBlock
...
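
The Smalltalk sketch above imagines treating a Hadoop dataset like an ordinary collection. A rough Python analogue is shown below, running purely in memory with no cluster involved. The `LocalCollection` class, its `select` and `map_then_reduce` methods, and the sample rows are all hypothetical and serve only to show the shape of such an API.

```python
from collections import defaultdict


class LocalCollection:
    """In-memory stand-in for the HadoopCollection idea; no cluster involved."""

    def __init__(self, rows):
        self.rows = list(rows)

    def select(self, predicate):
        # Analogue of `select:`; on a cluster this could be a map-only job.
        return LocalCollection(row for row in self.rows if predicate(row))

    def map_then_reduce(self, map_fn, reduce_fn):
        # Analogue of `map:thenReduce:`. map_fn yields (key, value) pairs;
        # reduce_fn folds the list of values collected for each key.
        grouped = defaultdict(list)
        for row in self.rows:
            for key, value in map_fn(row):
                grouped[key].append(value)
        return {key: reduce_fn(values) for key, values in grouped.items()}


users = LocalCollection([
    {"name": "a", "age": 17, "city": "Vancouver"},
    {"name": "b", "age": 21, "city": "Vancouver"},
    {"name": "c", "age": 25, "city": "Ghent"},
    {"name": "d", "age": 40, "city": "Ghent"},
])  # made-up rows

users_1825 = users.select(lambda ea: 18 <= ea["age"] <= 25)
per_city = users_1825.map_then_reduce(lambda ea: [(ea["city"], 1)], sum)
```

The appeal is that lightweight blocks (lambdas here) carry the per-record logic while the collection decides how and where to run them.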

Page 66: Smalltalk and Big Data - Avi Bryant

Doesn’t Need

Raw performance

Extensive libraries

Concurrency/Async IO

Wide industry acceptance

Fast startup time

Page 67: Smalltalk and Big Data - Avi Bryant

Doesn’t Need

Raw performance

Extensive libraries

Concurrency/Async IO

Wide industry acceptance

Fast startup time

Should Have

Lightweight functions/blocks

Dynamic OO

Process migration

Good debugging

Page 68: Smalltalk and Big Data - Avi Bryant


Page 69: Smalltalk and Big Data - Avi Bryant

Doesn’t Need

Raw performance

Extensive libraries

Concurrency/Async IO

Wide industry acceptance

Fast startup time (JVM integration)

Page 70: Smalltalk and Big Data - Avi Bryant

Doesn’t Need

Raw performance

Extensive libraries

Concurrency/Async IO

Wide industry acceptance

Fast startup time

Should Have

Lightweight functions/blocks

Dynamic OO

Process migration

Good debugging

(JVM integration)

Page 71: Smalltalk and Big Data - Avi Bryant

?