What Is a Data Scientist? The Need to Define Skill and Knowledge Levels

Big data, Big innovation: The next innovation comes from big data. December 6, 2012. BrainPad Inc., 佐藤 洋行. What Is a Data Scientist? - The Need to Define Skill and Knowledge Levels -

description

These are the slides from a talk given by one of our employees at the 2nd Data Scientist Workshop, hosted by EMC Japan on Thursday, December 6, 2012.

Transcript of What Is a Data Scientist? The Need to Define Skill and Knowledge Levels

  • 1. Big data, Big innovation. What Is a Data Scientist? - The Need to Define Skill / Knowledge Levels -. December 6, 2012

  • 3. BrainPad Inc. (3655); 5-2-5, KN; March 18, 2004; June 30; 326,618,926 (September 30, 2012); 117 (December 2012)
  • 5. ASP, CRM, ASP
  • 6. [Chart: values by year, 2006 to 2012; data labels 40, 130, 368, 420, 647, 906, 1,347, 1,946; axis 0 to 2,000]
  • 9. GETTING CONTROL OF BIG DATA, Harvard Business Review, October 2012. Source: Harvard Business School Publishing, Oct 2012
  • 10. "Data is king at Amazon", Matt Round (2004). Source: Harvard Business School Publishing, Oct 2012
  • 11. 2/3
  • 12. October 2012, GETTING CONTROL OF BIG DATA. Source: Harvard Business School Publishing, Oct 2012
  • 13. 4/5
  • 14. Davenport, Harris and Morison (2010). DELTA, 5 elements: Data, Enterprise, Leader, Target, Analyst
  • 19. "The sexy job in the next 10 years will be statisticians." Hal Varian (Google), New York Times, August 5, 2009, "For Today's Graduate, Just One Word: Statistics"
  • 20. "If sexy means having rare qualities that are much in demand, data scientists are already there." Thomas H. Davenport and D.J. Patil (HBS), Harvard Business Review, October 2012, "Data Scientist: The Sexiest Job of the 21st Century"
  • 21. Data Science Journal, April 2002: "Development of the web-based NIST X-ray Photoelectron Spectroscopy (XPS) Database"
  • 22. 2002, 2009, October 2012
  • 23. 2018. Source: McKinsey Global Institute, May 2011
  • 24. 2002, 2009, October 2012
  • 25. Davenport and Patil (2012). Source: Harvard Business School Publishing, Oct 2012
  • 26. Davenport and Patil (2012). Source: Harvard Business School Publishing, Oct 2012
  • 28. Davenport and Patil (2012). Source: Harvard Business School Publishing, Oct 2012
  • 29. Davenport and Patil (2012). Source: Harvard Business School Publishing, Oct 2012
  • 31. CRISP-DM (Cross-Industry Standard Process for Data Mining). DaimlerChrysler, NCR, OHRA, SPSS. 6 phases: 1 Business Understanding, 2 Data Understanding, 3 Data Preparation, 4 Modeling, 5 Evaluation, 6 Deployment
  • 32. KDD process (The Knowledge Discovery in Databases process). Fayyad et al. (1996). 5 steps: 1 Selection, 2 Pre-processing, 3 Transformation, 4 Data Mining, 5 Interpretation/Evaluation
  • 33. SEMMA (Sample, Explore, Modify, Model and Assess). SAS. 5 steps
  • 34. Azevedo and Santos (2008), Proceedings of the IADIS. KDD process, SEMMA, CRISP-DM:
      KDD                        SEMMA       CRISP-DM
      ---                        ---         Business Understanding
      Selection                  Sample      Data Understanding
      Pre processing             Explore     Data Understanding
      Transformation             Modify      Data Preparation
      Data mining                Model       Modeling
      Interpretation/Evaluation  Assessment  Evaluation
      ---                        ---         Deployment
  • 35. Azevedo and Santos (2008), Proceedings of the IADIS. KDD process, SEMMA
  • 36. Azevedo and Santos (2008). KDD process, SEMMA
  • 37. Davenport, Harris and Morison (2010)
  • 38. "What is data science?" Mike Loukides (2010), O'Reilly Media, Inc.
  • 39. 1996, 2010
  • 41. Business Understanding
  • 42. Data Understanding: H/W, Excel
  • 43. Data Preparation: SQL, DataBase, Hadoop, DB
  • 44. Modeling: H/W
  • 45. Evaluation
  • 46. Deployment 1
  • 48. Davenport and Patil (2012). Source: Harvard Business School Publishing, Oct 2012
  • 49. Davenport, Harris and Morison (2010)
  • 57. facebook: http://www.facebook.com/DataScientist.jp
  • 58. https://www.facebook.com/groups/datascientist.jp/