What is Data Science?


By: Ahmed Banafa, Distinguished Tenured Staff | Faculty | SME | E-Learning Expert | Four-time winner of instructor of the year award

What is Data Science?

Data science is the study of where information comes from, what it represents and how it can be turned into a valuable resource in the creation of business and IT strategies. Mining large amounts of structured and unstructured data to identify patterns can help an organization rein in costs, increase efficiencies, recognize new market opportunities and increase the organization's competitive advantage.

Theories behind Data Science

Database Theory: the study of how to organize data so that storing and retrieving it is efficient. Data can be categorized into objects, objects can be grouped into collections, and both objects and collections can have relationships with one another and with themselves. The one thing you need to know about this theory is that the way you organize your data determines how much effort it takes to get answers from it (queries).
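The point about organization and query effort can be illustrated with a minimal Python sketch. The `orders` records and the helper names below are hypothetical and chosen only for illustration. The same question is answered twice: once by scanning an unorganized list, and once through a collection grouped by customer.

```python
from collections import defaultdict

# Hypothetical objects: each order is a small record (illustrative data only).
orders = [
    {"order_id": 1, "customer": "alice", "total": 30.0},
    {"order_id": 2, "customer": "bob", "total": 12.5},
    {"order_id": 3, "customer": "alice", "total": 7.5},
]


def total_by_scan(orders, customer):
    """Unorganized collection: every query walks the whole list."""
    return sum(o["total"] for o in orders if o["customer"] == customer)


def build_index(orders):
    """Organize once: group orders by customer (a relationship between objects)."""
    index = defaultdict(list)
    for o in orders:
        index[o["customer"]].append(o)
    return index


def total_by_index(index, customer):
    """Organized collection: the query touches only the matching orders."""
    return sum(o["total"] for o in index.get(customer, []))


index = build_index(orders)
print(total_by_scan(orders, "alice"))   # 37.5
print(total_by_index(index, "alice"))   # 37.5
```

Both calls return the same answer. The difference is the work done per query: the scan reads every order each time, while the grouped collection pays the organizing cost once and then reads only the relevant orders.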

Agile Manifesto: a set of principles intended to produce high-quality outputs in environments subject to high levels of change and ambiguity. Agile methods cope with rapid change and ambiguity by adopting an iterative development process. The Agile Manifesto looks to remove cultural barriers between developer, client and end user, and focuses on using the latest technology to make things as simple as possible, but not simpler. The one thing you need to know about this set of principles is that all things change, and the longer you take to test your solution in the live environment, the higher the risk of failure.


Spiral Dynamics: a theory of human development and behavior that seeks to explain why people do what they do. The theory describes two mental states, one of “facts” and one of “values”. The one thing you need to know about this theory is that our facts and our values come from the data that is presented to us.

How to Do Data Science?

The three components involved in data science are organizing, packaging and delivering data (the OPD of data):

Organizing is where the physical location and structure of the data are planned and executed.

Packaging is where the prototypes are built, the statistics are computed and the visualizations are created.

Delivering is where the story gets told and the value is obtained.

However, what separates data scientists from all other existing roles is that they also need to maintain a continual awareness of What, How, Who and Why.
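The organizing, packaging and delivering steps described above can be sketched as a tiny Python pipeline. This is only an illustration under stated assumptions: the `RAW_CSV` sample data and the function names `organize`, `package` and `deliver` are hypothetical, and real projects would typically read from files or databases and produce charts or reports rather than a single sentence.

```python
import csv
import io
import statistics

# Hypothetical raw input; in practice this would come from a file or database.
RAW_CSV = """region,sales
north,120
south,80
north,100
south,60
"""


def organize(raw_text):
    """Organizing: give the raw data a structure (region -> list of sales)."""
    by_region = {}
    for row in csv.DictReader(io.StringIO(raw_text)):
        by_region.setdefault(row["region"], []).append(float(row["sales"]))
    return by_region


def package(by_region):
    """Packaging: compute a summary statistic (the mean) for each region."""
    return {region: statistics.mean(values) for region, values in by_region.items()}


def deliver(summary):
    """Delivering: turn the numbers into a short, readable message."""
    best = max(summary, key=summary.get)
    return f"{best} has the highest average sales ({summary[best]:.1f})"


print(deliver(package(organize(RAW_CSV))))
# north has the highest average sales (110.0)
```

Keeping the three stages as separate functions mirrors the OPD split: the structure of the data, the analysis and the final message can each be changed without rewriting the others.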

What does a data scientist do?

A data scientist represents an evolution from the business or data analyst role. The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math. What sets the data scientist apart is strong business sense, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. Good data scientists will not just address business problems; they will pick the right problems that have the most value to the organization.

Whereas a traditional data analyst may look only at data from a single source – a CRM system, for example – a data scientist will most likely explore and examine data from multiple disparate sources. The data scientist will sift through all incoming data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing business problem. A data scientist does not simply collect and report on data, but also looks at it from many angles, determines what it means, then recommends ways to apply the data.

Data scientists are inquisitive: exploring, asking questions, doing “what if” analysis, questioning existing assumptions and processes. Armed with data and analytical results, a top-tier data scientist will then communicate informed conclusions and recommendations across an organization’s leadership structure.
