Ds01 data science
Uploaded by dotnetcampus
![Page 1: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/1.jpg)
Previously known as
Think Big. Move Fast.
![Page 2: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/2.jpg)
![Page 3: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/3.jpg)
SolidQ
• Born in 2002 in USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Book Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• www.solidq.com
![Page 4: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/4.jpg)
Davide Mauri
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
![Page 5: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/5.jpg)
Data Science: Renaissance 2.0
![Page 6: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/6.jpg)
“Companies are collecting mountains of information about
you, to predict how likely you are to buy a product,
and using that knowledge to craft a marketing message
precisely calibrated to get you to do so”
![Page 7: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/7.jpg)
Data Science
• Extraction of knowledge from data
• So, what’s new?
• Nothing. Except that it’s now economical and fast.
• It’s now applicable to everything. And we have a lot of data produced every day that can be used to extract knowledge
![Page 8: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/8.jpg)
Data Science
Data → Information → Knowledge → Decisions
![Page 9: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/9.jpg)
Data Science
• A Sum Of
  • Statistics
  • Mathematics
  • Machine Learning
  • Data Mining
  • Computer Programming
  • Data Engineering
  • Visualization
  • Data Warehousing
  • High Performance Computing
• To support (Informed) Decision Making
  • Data-Driven Decisions
![Page 10: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/10.jpg)
Data Scientist
• IBM
  • A data scientist represents an evolution from the business or data analyst role.
  • The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math.
  • What sets the data scientist apart is strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge.
  • It's almost like a Renaissance individual who really wants to learn and bring change to an organization.
![Page 11: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/11.jpg)
• Algorithms are the new gatekeepers
• They decide
  • What we find
  • What we see
  • What we buy
![Page 12: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/12.jpg)
Modern Data Environment
Master Data
EDW, Data Mart
Big Data
Unstructured Data
BI Environment
Analytics Environment
Structured Data
![Page 13: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/13.jpg)
Big Data
The 3 V
No, the 4 V!!!
No, no, the 5 V!!!!!
![Page 14: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/14.jpg)
http://www.ibmbigdatahub.com/infographic/four-vs-big-data
![Page 15: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/15.jpg)
Big Data
• Volume, Velocity, Variety, Veracity… V<your-v-here>
• Data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time
• Grid Computing, Parallel Computing needed
  • keep processing time reasonable
  • provide scalability
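The point about parallel computing can be illustrated with a small split/process/merge sketch. It is not taken from the deck: the function names are hypothetical, and a thread pool from the Python standard library stands in for a real grid or Hadoop cluster. In CPython, threads do not speed up pure-Python CPU work, so this only shows the shape of the pattern, not its performance.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def split_into_chunks(lines, n_chunks):
    """Partition the input so each worker gets roughly the same share."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def parallel_word_count(lines, n_workers=4):
    """Run the map step on every chunk concurrently, then merge (reduce)."""
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical input; a real job would read from distributed storage.
sample_lines = ["big data big", "data science", "big"]
word_totals = parallel_word_count(sample_lines, n_workers=2)
```

Because each chunk is processed independently, adding workers (or machines) is what keeps elapsed time tolerable as the input grows.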
![Page 16: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/16.jpg)
Big Data Data
• Paradigm: “Store Now, Figure Out Later”
  • Data is the new resource. Never throw it away!
• Unstructured Data
  • Text Files
  • Images
  • Sounds
• Structured/Semi-Structured Data
  • Sensors
  • Transactions
  • Logs
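One way to read “Store Now, Figure Out Later” is as schema-on-read: keep the raw payload untouched and decide how to interpret it only when a question comes up. The sketch below assumes CSV and JSON-lines payloads; the sample data, field names and the `read_raw` helper are hypothetical.

```python
import csv
import io
import json


def read_raw(raw_text, fmt):
    """Interpret raw stored text only at read time (schema-on-read)."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw_text)))
    if fmt == "jsonl":
        return [json.loads(line) for line in raw_text.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt}")


# Hypothetical raw payloads, kept exactly as they arrived.
sensor_csv = "sensor_id,temperature\ns1,21.5\ns2,19.0\n"
app_log_jsonl = '{"level": "error", "msg": "timeout"}\n{"level": "info", "msg": "ok"}\n'

readings = read_raw(sensor_csv, "csv")
events = read_raw(app_log_jsonl, "jsonl")

# The questions are decided later, long after the data was stored.
average_temperature = sum(float(r["temperature"]) for r in readings) / len(readings)
error_count = sum(1 for e in events if e["level"] == "error")
```

Nothing about the stored text had to change to answer either question, which is the practical argument for never throwing raw data away.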
![Page 17: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/17.jpg)
Data Storage
• RDBMS
  • SQL Server
• Hadoop
  • HDInsight
  • Hortonworks Data Platform
• Distributed File (Eco)System
  • CSV
  • JSON
  • *.*
![Page 18: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/18.jpg)
Data Storage
• Hadoop Ecosystem
http://hortonworks.com/hadoop-modern-data-architecture/
![Page 19: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/19.jpg)
Data Science & Big Data
• Data Science != Big Data
• Data Science is not only for Big Data
• Data Science can be applied to Big Data
• Data Science starts from Small Data
  • 1) find the algorithm that extracts knowledge
  • 2) measure the algorithm's results in terms of probability
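Step 2 can be made concrete on a small labeled sample. The sketch below is one possible reading of "in terms of probability", not the method from the talk: it reports accuracy together with a Wilson score interval, so the result on small data comes with an explicit uncertainty range. The labels and predictions are made up for illustration.

```python
import math


def accuracy(predictions, labels):
    """Fraction of predictions that match the known labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wilson score)."""
    if n == 0:
        raise ValueError("need at least one observation")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical small-data experiment: 10 labeled examples.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
predictions = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

observed = accuracy(predictions, labels)
n_correct = sum(1 for p, y in zip(predictions, labels) if p == y)
low, high = wilson_interval(n_correct, len(labels))
```

With only ten examples the interval is wide, which is a useful reminder that a promising result on small data still needs more evidence before it is scaled up.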
![Page 20: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/20.jpg)
Machine Learning
• Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. (Wikipedia)
  • For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders.
• Flavors
  • Supervised
  • Unsupervised
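The spam example is a supervised problem: the system learns from messages that already carry a label. A minimal sketch of that idea follows, using a tiny naive Bayes classifier written with the standard library only. The choice of naive Bayes, the class name and the four training messages are assumptions for illustration; the slide does not name an algorithm, and a real filter would need far more data and both labels present in training.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = Counter()
        self.vocabulary = set()

    def train(self, messages):
        """messages: iterable of (text, label) pairs, label in {'spam', 'ham'}."""
        for text, label in messages:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.message_counts[label] += 1
            self.vocabulary.update(words)

    def _log_score(self, words, label):
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        total_words = sum(self.word_counts[label].values())
        for word in words:
            if word not in self.vocabulary:
                continue  # words never seen in training carry no evidence
            count = self.word_counts[label][word] + 1  # add-one smoothing
            score += math.log(count / (total_words + len(self.vocabulary)))
        return score

    def classify(self, text):
        """Return the label with the higher log-probability score."""
        words = text.lower().split()
        return max(("spam", "ham"), key=lambda label: self._log_score(words, label))


# Hypothetical labeled training set (the "supervision").
training_messages = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "ham"),
    ("project status meeting", "ham"),
]

spam_filter = NaiveBayesSpamFilter()
spam_filter.train(training_messages)
first_label = spam_filter.classify("win cheap money")
second_label = spam_filter.classify("status meeting at noon")
```

An unsupervised method would receive the same messages without the labels and could only group similar messages together, leaving the meaning of each group to the analyst.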
![Page 21: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/21.jpg)
Data Analysis
• Common Data Scientist Tools
  • R
  • Weka
  • Octave
  • Scikit-Learn
• Common Data Scientist Languages
  • Python
  • Scala
  • F#
![Page 22: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/22.jpg)
![Page 23: Ds01 data science](https://reader034.fdocuments.net/reader034/viewer/2022052522/554eaec0b4c905977e8b4e86/html5/thumbnails/23.jpg)
Resources
• https://www.coursera.org/
  • Data Scientist Specialization
• https://www.khanacademy.org/
  • Math
• http://www.osservatori.net/business_intelligence
  • Italian Big Data Market Analysis Resources
• http://www.solidq.com/consulting/
  • Data Science Services
  • Big Data / Business Intelligence / Data Warehousing