Big Data Science -...

Page 1: Big Data Science - download.minoc.comdownload.minoc.com/2014/49/philippevanimpe-brusselsdatascience.pdfEye-opening facts about Big Data 1. Every 2 days we create as much information

Big Data Science

@datasciencebe 1

Our mission is to educate, inspire, empower scholars and professionals to apply big data sciences to address humanity’s grand challenges.

• Monthly Meetups

• Hackathons for Good

• Hands-On Trainings

• MOOC coaching

• Resource matching

DEFINITIONS

Value

VOLUME (amount & complexity)

VARIETY (type & sources)

VERACITY (quality & trust)

VELOCITY (speed)

VOLUME AND FLOWS OF (BIG) DATA

[Figure: axes run from LOW to HIGH; axis label: INFORMATION AND KNOWLEDGE]

255 MILLION SITES (2011)

2 BILLION PEOPLE (2011)

1 TRILLION DEVICES (2013)

[Figure: DATA VOLUMES plotted against TIME. Source: IDC]

Eye-opening facts about Big Data

1. Every 2 days we create as much information as we did from the beginning of time until 2003. [Source]

2. Over 90% of all the data in the world was created in the past 2 years. [Source]

3. It is expected that by 2020 the amount of digital information in existence will have grown from 3.2 zettabytes today to 40 zettabytes. [Source]

4. The total amount of data being captured and stored by industry doubles every 1.2 years [Source]

5. Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278 thousand Tweets, and upload 200 thousand photos to Facebook. [Source]

6. Google alone processes on average over 40 thousand search queries per second, making it over 3.5 billion in a single day. [Source]

7. Around 100 hours of video are uploaded to YouTube every minute and it would take you around 15 years to watch every video uploaded by users in one day. [Source]

8. Facebook users share 30 billion pieces of content between them every day. [Source]

9. If you burned all of the data created in just one day onto DVDs, you could stack them on top of each other and reach the moon – twice. [Source]

10. AT&T is thought to hold the world’s largest volume of data in one unique database – its phone records database is 312 terabytes in size, and contains almost 2 trillion rows. [Source]

11. 570 new websites spring into existence every minute of every day. [Source]

12. 1.9 million IT jobs will be created in the US by 2015 to carry out big data projects. Each of those will be supported by 3 new jobs created outside of IT – meaning a total of 6 million new jobs thanks to big data. [Source]

13. Today’s data centres occupy an area of land equal in size to almost 6,000 football fields. [Source]

14. Between them, companies monitoring Twitter to measure “sentiment” analyze 12 terabytes of tweets every day. [Source]

15. The amount of data transferred over mobile networks increased by 81% to 1.5 exabytes (1.5 billion gigabytes) per month between 2012 and 2014. Video accounts for 53% of that total. [Source]

16. The NSA is thought to analyze 1.6% of all global internet traffic – around 30 petabytes (30 million gigabytes) every day [Source]

17. The value of the Hadoop market is expected to soar from $2 billion in 2013 to $50 billion by 2020, according to market research firm Allied Market Research. [Source]

18. The number of bits of information stored in the digital universe is thought to have exceeded the number of stars in the physical universe in 2007. [Source]

19. This year, there will be over 1.2 billion smartphones in the world (which are stuffed full of sensors and data collection features), and the growth is predicted to continue. [Source]

20. The boom of the Internet of Things will mean that the number of devices that connect to the Internet will rise from about 13 billion today to 50 billion by 2020. [Source]

21. 12 million RFID tags – used to capture data and track movement of objects in the physical world – had been sold by 2011. By 2021, it is estimated that number will have risen to 209 billion as the Internet of Things takes off. [Source]

22. Big data has been used to predict crimes before they happen – a “predictive policing” trial in California was able to identify areas where crime will occur three times more accurately than existing methods of forecasting. [Source]

23. By better integrating big data analytics into healthcare, the industry could save $300bn a year, according to a recent report – that’s the equivalent of reducing the healthcare costs of every man, woman and child by $1,000 a year. [Source]

24. Retailers could increase their profit margins by more than 60% through the full exploitation of big data analytics. [Source]

25. The big data industry is expected to grow from US$10.2 billion in 2013 to about US$54.3 billion by 2017. [Source]
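
Several of the facts above carry their own arithmetic (facts 6, 12, 15, 16 and 23). The short Python sketch below re-derives those figures as a sanity check. It assumes decimal SI units (1 GB = 10**9 bytes) and a 24-hour day; the slides do not state either, and the variable names are illustrative only.

```python
# Sanity checks on a few of the figures quoted in the list above.
# Assumptions (not stated in the slides): decimal SI units and a 24-hour day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

GB = 10**9   # gigabyte in bytes (decimal)
PB = 10**15  # petabyte in bytes (decimal)
EB = 10**18  # exabyte in bytes (decimal)

# Fact 6: 40 thousand search queries per second, expressed per day.
queries_per_day = 40_000 * SECONDS_PER_DAY

# Fact 15: 1.5 exabytes per month, expressed in gigabytes.
# Integer arithmetic (15 * EB // 10) avoids floating-point rounding.
mobile_gb_per_month = 15 * EB // 10 // GB

# Fact 16: 30 petabytes per day, expressed in gigabytes.
nsa_gb_per_day = 30 * PB // GB

# Fact 12: 1.9 million IT jobs, each supported by 3 jobs outside IT.
it_jobs = 1_900_000
supporting_jobs = 3 * it_jobs

# Fact 23: $300bn saved per year at $1,000 per person per year.
people_covered = 300_000_000_000 // 1_000

print(f"Queries per day:       {queries_per_day:,}")
print(f"Mobile GB per month:   {mobile_gb_per_month:,}")
print(f"GB analysed per day:   {nsa_gb_per_day:,}")
print(f"Jobs outside IT:       {supporting_jobs:,}")
print(f"People at $1,000 each: {people_covered:,}")
```

Under these assumptions the quoted conversions hold: about 3.5 billion queries a day, 1.5 billion gigabytes a month, and 30 million gigabytes a day. The jobs figure comes out at 5.7 million outside IT, which suggests the quoted 6 million is that number rounded up rather than a total including the IT jobs themselves.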

Big Data Science use cases:

How Data Is Used To Change Our World

• Understanding and Targeting Customers

• Understanding and Optimizing Business Processes

• Personal Quantification and Performance Optimization

• Improving Healthcare and Public Health

• Improving Sports Performance

• Improving Science and Research

• Optimizing Machine and Device Performance

• Improving Security and Law Enforcement

• Improving and Optimizing Cities and Countries

• Financial Trading

“The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it is going to be a hugely important skill in the next decades.”

Hal Varian - Chief Economist, Google

The main challenge is changing the organization so that it more readily challenges the rationale for decisions, uses data to back up the discussion, and generates explanations.

How to become a data scientist in Belgium

• Learn and share

• Follow an academic track at the universities

• Join the on-topic meetup groups

• Participate in the hands-on work groups

• Join the local LinkedIn groups

• Follow the latest MOOCs

• Join a team and follow the MOOCs together

• Compete on Kaggle

• Practice your skills by doing Data4Good

Data Science competitions

Notable Current and Recent Competitions

• GE NFL $10 Million Head Health Challenge, for more accurate diagnoses of mild brain injury and prognosis for recovery following acute and/or repetitive injuries.

• GE Hospital Quest on Kaggle. Your challenge: Contribute to the design of the ultimate patient experience. Prize Pool: $100,000

• GE Flight Quest on Kaggle. Your challenge: Develop a usable and scalable algorithm that delivers a real-time flight profile to the pilot, helping them make flights more efficient and reliably on time. Prize Pool: $250,000

• Heritage Health Data Analysis Prize ($3M), can administrative health care data be used to accurately predict which patients will be admitted to the hospital?

Analytics, Data Mining Competition Platforms

• Kaggle, the leading platform for data prediction competitions

• CrowdANALYTIX, converts business challenges into analytics competitions

• DrivenData: Data Science Competitions for Social Good

• InnoCentive, mainly focusing on life sciences, but with other interesting competitions as well

• TunedIT, education, research and industrial contests.

Learning by doing data4good

Latest technical MOOCs on Data Science

• Coursera:
  • The Data Scientist’s Toolbox (Johns Hopkins)
  • R Programming (Johns Hopkins)
  • Regression Models (Johns Hopkins)
  • Statistical Inference (Johns Hopkins)
  • Getting and Cleaning Data (Johns Hopkins)
  • Practical Machine Learning (Johns Hopkins)
  • Exploratory Data Analysis (Johns Hopkins)
  • * Machine Learning (Stanford)
  • Computational Methods for Data Analysis (University of Washington)

• edX:
  • Introduction to Big Data with Apache Spark (BerkeleyX)
  • Data, Analytics and Learning (UT Arlington)
  • Data analysis to the MAX() (DelftX)
  • Foundations of Data Analysis (UTAustinX)
  • Big Data and Social Physics (MITx)

• Udacity:
  • Data Analyst Nanodegree
  • Data Visualization and D3.js
  • Intro to Hadoop and MapReduce
  • Real-Time Analytics with Apache Storm
  • Intro to Data Science

Just join one of our MOOC groups and achieve your certification in a group.

Who is Hiring Big Data Science professionals

More jobs here

Some predictions and trends for 2015

• 2014 was big for big data, and 2015 is going to be even bigger. Many predictions are circulating about big data technologies in 2015; here is a look at some of the key predictions from big data experts around the globe.

• Customers will gain more control over the use of their data – with more and more wearable technology enabling users to create their own data. @SOURCE

• Big data has already made inroads to the cloud, but in 2015, more companies will take big data analytics to the cloud to meet their new demands more effectively in real-time. @SOURCE

• 2015 is expected to be the year when more emphasis is placed on data collection, especially sensor-based collection. The key factor now is getting the exact data with the least time, cost and risk. Data will increasingly be collected via mobile apps. @SOURCE

• From optimizing workflow to tracking overall employee happiness, we are likely to see an increased use of HR analytics in 2015. @SOURCE

• In 2015, IT will embrace self-service big data, giving business users self-service access to big data. @SOURCE

• Hadoop, NoSQL and in-memory databases will graduate from mostly experimental pilots to standard components of enterprise data management, taking their place alongside relational databases. @SOURCE

• Companies will extend their security and data governance practices to identify and remediate suspicious behaviours and “low and slow” advanced-persistent-threat attacks that might otherwise be undetectable. @SOURCE

• Data visualization and storytelling will become a necessity, as every analyst today needs to be a storyteller. @SOURCE

• In 2014, we saw marketers and sellers begin to analyse social data in earnest. In 2015, they will turn social intelligence into smarter strategies and start to take advantage of social data. @SOURCE

• Text analysis will gain increasing traction, with Web data, documents and images, with companies finally able to tackle unstructured data in meaningful ways. @SOURCE

Summary

Don’t just collect data for data’s sake. Big Data needs to solve a real business issue and bring added value, for instance a better customer experience, increased customer value, reduced costs...

Brussels Data Science Community | @datasciencebe