2.3 Methods for Big Data
What is Big Data?
Summarizing Big Data
Slide 3
The Flood of Big Data
90% of all data created by humankind has been created in the last 2 years.
Slide 4
Data Creation
[Figure: Data Flow a Decade Ago vs. Data Flow Now; labels: Marketing, Survey]
Slide 5
What Exactly is BIG DATA?
- BIG DATA refers to a collection of tools, techniques and technologies that make it possible to work with data at any scale.
- BIG DATA is less about size and more about flow and velocity.
Slide 6
The 3 Vs of BIG DATA
1. Volume: larger than conventional databases can handle
2. Velocity: high rate at which data is generated, processed and analyzed in real time
3. Variety: data formats are unstructured and inconsistent
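To make the velocity point concrete, here is a minimal sketch of summarizing data as it arrives, in a single pass and without storing the stream. It uses Welford's one-pass method for the mean and variance. The class name `RunningStats`, the helper `summarize`, and the numeric readings are illustrative assumptions, not part of the slides.

```python
class RunningStats:
    """One-pass mean and sample variance (Welford's method)."""

    def __init__(self):
        self.n = 0          # number of values seen so far
        self.mean = 0.0     # running mean
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the summary."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have arrived."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(stream):
    """Consume any iterable of numbers and return its running summary."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats


# Hypothetical readings standing in for a high-velocity data feed.
stats = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(stats.n, stats.mean, stats.variance)
```

Because each value is used once and then discarded, memory use stays constant however fast or long the stream runs.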
Slide 7
Volume
Slide 8
- Walmart collects more than 2.5 petabytes of data every hour from its customer transactions.
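As a rough sense of scale, the hourly figure can be converted into per-second and per-day rates. This is a back-of-the-envelope sketch that assumes decimal units (1 petabyte = 10^15 bytes) and a constant rate over the hour; the variable names are illustrative.

```python
PETABYTE = 10**15   # assumption: decimal (SI) petabyte, in bytes
GIGABYTE = 10**9    # assumption: decimal (SI) gigabyte, in bytes

bytes_per_hour = 2.5 * PETABYTE             # figure quoted on the slide
bytes_per_second = bytes_per_hour / 3600    # assumes a constant rate
petabytes_per_day = 2.5 * 24

print(f"{bytes_per_second / GIGABYTE:.0f} GB per second")
print(f"{petabytes_per_day:.0f} PB per day")
```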
Slide 9
Velocity
- Twitter
Slide 10
Variety: Data formats are Unstructured and Inconsistent
Slide 11
Big Data Technologies
- http://aws.amazon.com/big-data/
- https://cloud.google.com/products/bigquery/
- https://support.google.com/fusiontables/answer/2571232
- http://www.microsoft.com/en-us/server-cloud/solutions/big-data.aspx
Word walls, word clouds, correlation wheels, heat maps, fusion tables, NoSQL, networks
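Of the summaries listed above, a word cloud is the simplest to illustrate: it is a picture of word frequencies, with more frequent words drawn larger. The sketch below computes only those frequencies, using the Python standard library. The function name `word_frequencies` and the sample sentence are illustrative assumptions, and the tokenization (lowercase runs of letters and apostrophes) is deliberately simplistic.

```python
import re
from collections import Counter


def word_frequencies(text, top=5):
    """Return the `top` most frequent words in `text` with their counts."""
    # Simplistic tokenizer: lowercase, keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top)


# Hypothetical sample text; a real word cloud would scale each word by its count.
sample = "Big data is big. Data flows fast."
for word, count in word_frequencies(sample, top=3):
    print(word, count)
```

A drawing library would then map each count to a font size; that rendering step is left out here.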
Slide 12
Correlation Wheel (sort of)
- http://www.bytemuse.com/post/nfl-football-schedule/