Get Real About Big Data


description

This presentation was used at the Big Data Day at the Computer History Museum on June 7.

Transcript of Get Real About Big Data

Page 1: Get Real About Big Data

The Team

• Jim Blomo (@JimBlomo)
  – Big Data @ Yelp
  – Amazon, Pbworks
  – Lecturer @ UC Berkeley
  – He likes distributed systems, startups, fitness, and whatever else you've got.

• Dave Mariani (@Dmariani, Klout 144)
  – Big Data @ Yahoo!
  – Blue Lithium, MindeShare
  – Klout: 30B calls/month, Yahoo!: 20TB/Day across multiple 4,000-node Hadoop clusters

Page 5: Get Real About Big Data

www.crunchbase.sisense.com

Page 6: Get Real About Big Data

!@SiSense

Page 7: Get Real About Big Data

1

DATA SKILLS

Page 8: Get Real About Big Data

Little Bit of That…

Page 9: Get Real About Big Data

Little Bit of This…

Page 10: Get Real About Big Data

Favorite Data Scientist Hire?

Page 11: Get Real About Big Data

Source: Drew Conway

Little Bit of This…

Page 12: Get Real About Big Data

Little Bit of This…

http://bit.ly/dssurvey

Page 13: Get Real About Big Data

2

STARTING FROM SCRATCH?

Page 14: Get Real About Big Data

Starting from SCRATCH?

Page 15: Get Real About Big Data

Starting from CRAP?

Page 17: Get Real About Big Data

3

SAME DIFF?

Page 18: Get Real About Big Data

WHAT’S WHAT?

Page 19: Get Real About Big Data

Big different from…?

Page 21: Get Real About Big Data

4

FEATURE FEST?

Page 22: Get Real About Big Data

What feature…?

Page 23: Get Real About Big Data

What feature…?

Page 26: Get Real About Big Data

TRY IT @ WWW.SISENSE.COM