What is big data?



Slides from the course on big data by C. Levallois at EMLYON Business School, intended for business students. See the online video that accompanies these slides: The 5 driving forces behind big data.


<ul><li> 1. MK99 Big Data: Big data &amp; cross-platform analytics. MOOC lectures, Pr. Clement Levallois </li></ul> <p>
2. What is big data? You should have watched the video clip "What is data?" first.

3. Big data is a mess, but we can find 5 key points that help us understand what it is.

4. Key point 1: Data gets generated in bigger volumes because of the digitalization of the economy.
Data generated by a movie-goer in a movie theater:
1. On the box office ticket: movie title, date, price
Data generated by a movie-goer watching through Netflix:
1. Login to Netflix: age, name, gender, location + preferences for movie genres?
2. Browsing / purchasing history for movies
3. Movie title, date and price for the movie
4. Data on movie paused / interrupted?
5. Comments / ratings posted
6. Follow / Friends activities
7. If Netflix account connected to FB: personal info, etc.

5. Another look at the digitalization of the economy: products and distribution channels. (Figure: before vs. after.) Source: B4B by Wood, Hewlin &amp; Lah (2013)

6. Key point 2: Low prices made data a cheaper commodity.
- Larger storage and processing power (computers!): prices for hard drives and processors are regularly halved. Personal anecdote: in 1994 my parents paid ~1,600 for a computer with a *450 MB* hard drive.
- Quicker data communication (Internet! Optical fiber!): broadband connections become mainstream.
- More powerful, free software for analysis. Ex: Excel can now handle about 1,000,000 rows x 16,000 columns (1,048,576 x 16,384); before Excel 2007 the limit was 65,536 rows x 256 columns. New open source software and packages such as R provide free solutions for analysis.
=&gt; In practice, this means that SMEs can afford to generate, store and explore datasets.

7. Key point 3: More stuff counts as data now.
In the 1990s: standardized texts and numbers.
Typical query: "In my database, find all customers living in the city of Paris who bought at least one product last month."
In the 2010s: standardized texts and numbers, plus:
- Places (info about distance, proximity, etc.)
- Networks (info about who is connected with whom)
- Free text (semantics and meaning)
Example of a query: "In my database, find all customers living between Dijon and Paris, who made a negative comment about one of our products and who are popular in their network of friends."
This is much richer than the 1990s query, because it deals with geographical distances, opinions and connections between things: these are not simple operations on text and numbers!
Today, space, semantics and networks count as data with business relevance, which needs to be stored and queried. Big players in these new databases are Neo4j, CartoDB and MongoDB. Search also for SPARQL.

8. Key point 4: Growing expectations about the value of large datasets.
With big data, you hope you can:
- Detect things before they are reported (crimes, epidemics, changes in consumer tastes)
- Have a 360° view of each person in your database (customer, patient, citizen)
- Create the perfect response to that (personalized products and services)

9. Key point 5: More data to come!
- Internet of Things (connected objects): connected camera, phone, toothbrush, watch, shoes, car, scale, aircon, jewelry, etc. -&gt; All these objects generate data about speed, temperature, behavior, etc.
- Open data movement: governments, cities, NGOs and firms open up their data to users.
- Quantified self movement: people wear connected objects (bracelets, shoes, phones, etc.) to track their biometrics, and possibly share them.

10. Next steps: watch the video clip on 2 popular expressions, "the cloud" and "Hadoop". Continue the readings for week 1.

11. This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com). Contact Clement Levallois (levallois [at] em-lyon.com) for more information. </p>
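To make the contrast between the two queries on slide 7 concrete, here is a minimal Python sketch. It is an illustration, not part of the course material. The `Customer` record, the sample customers, the 10% detour rule used to interpret "between Dijon and Paris", and the `min_friends` threshold for "popular" are all assumptions made for this example. It also assumes that each comment has already been classified as negative or not. A real system would more likely delegate this work to a spatial, graph or document database such as those named on the slide.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt

# Approximate city-centre coordinates (latitude, longitude) in degrees.
PARIS = (48.8566, 2.3522)
DIJON = (47.3220, 5.0415)


@dataclass
class Customer:
    """Hypothetical customer record, for illustration only."""
    name: str
    city: str
    location: tuple            # (latitude, longitude)
    bought_last_month: bool
    negative_comment: bool     # assumes sentiment was classified beforehand
    friends: set = field(default_factory=set)


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def query_1990s(customers):
    """Plain text and number filters: city equals Paris, bought last month."""
    return [c.name for c in customers
            if c.city == "Paris" and c.bought_last_month]


def is_between(point, start, end, slack=1.1):
    """Assumed reading of 'between': going from start to end via the point
    is at most 10% longer than the direct route."""
    detour = haversine_km(start, point) + haversine_km(point, end)
    return detour <= slack * haversine_km(start, end)


def query_2010s(customers, min_friends=3):
    """Combines place (distance), free text (opinion) and network (friends)."""
    return [c.name for c in customers
            if is_between(c.location, DIJON, PARIS)
            and c.negative_comment
            and len(c.friends) >= min_friends]


# Made-up sample data.
CUSTOMERS = [
    Customer("Alice", "Paris", PARIS, True, False, {"Bob"}),
    Customer("Bob", "Auxerre", (47.7982, 3.5738), False, True,
             {"Alice", "Chloe", "David"}),
    Customer("Chloe", "Marseille", (43.2965, 5.3698), True, True,
             {"Alice", "Bob", "David"}),
    Customer("David", "Sens", (48.1977, 3.2837), False, False,
             {"Alice", "Bob", "Chloe"}),
]

if __name__ == "__main__":
    print(query_1990s(CUSTOMERS))
    print(query_2010s(CUSTOMERS))
```

The first query only compares stored values. The second needs a distance computation, a judgment about the meaning of a text, and a count of connections, which is why slide 7 treats places, free text and networks as new kinds of data.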