Data science vs. Data scientist by Jothi Periasamy


DATA SCIENCE vs. DATA SCIENTIST

A READINESS AND ASSESSMENT

CREATE TALENT & TRANSFORM MALAYSIA TO DIGITAL ECONOMY

Jothi Periasamy


Contents

Objective
Data science - defined
Data science vs. enterprise data science
Data scientist competency
Data scientist competency development approach
Appendix
THE institute of enterprise analytics (TIOEA)
Our enterprise data science learning


[Diagram labels: Strategy, Implementation, Executive, Developers]

Objective

Review the key functions of data science and how data science differs from traditional business intelligence (BI)

Understand the key competency areas, skills, roles, responsibilities and deliverables of a data scientist


What is data science?

Data science is not new. Data science is just modernizing existing reporting solutions, analytics solutions, data warehousing solutions, business intelligence solutions and even data management solutions.

So data science is … new thinking, new thoughts, new ideas, new data sources, new data formats/structures, new data architecture, new data processing mechanisms, new innovation on data, and new ways of solving problems. That's all.

Traditional Approach to Data & Analytics

Data Source & Format: ERP, CRP, Oracle, SAP, MS SQL, etc.; Tables; Files
Data Structure: Structured Data; ER Model (Entity Relationship); MDM Model (Multi-Dimensional)
Data Access: SQL; Filters & Aggregate Functions; Business Rules and Formulas, etc.
Analytics: Reports; Dashboards; Data Analysis

Analytics Transformation

Modern Approach to Data & Analytics = Data Science

Data Source & Format (New Data Source & Format): ERP, CRP, Oracle, SAP, MS SQL, etc.; Social (Web, LinkedIn, Twitter, FB); Streaming
Data Structure (New Data Architecture): Structured Data; Unstructured & Semi-Structured; Machine Data
Data Access (New Analytics Architecture): Parallel Processing; Distributed Computing; In-memory Analytics
Analytics (New Analytics Techniques): Predictive Analytics (Linear Regression); Data Mining; Clustering, Segmentation, etc.
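The difference between the two "Analytics" rows above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SALES` figures and the function names `traditional_report`, `fit_line` and `predict` are hypothetical and do not come from the presentation. The first function summarizes past data with SQL aggregate functions, as a report would; the second pair fits a linear regression line and uses it to estimate a future value.

```python
import sqlite3

# Hypothetical monthly revenue figures (month, revenue), invented for illustration.
SALES = [(1, 100.0), (2, 110.0), (3, 120.0), (4, 130.0)]


def traditional_report(rows):
    """Traditional approach: load a table and summarize it with SQL aggregates."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (month INTEGER, revenue REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    total, average = con.execute(
        "SELECT SUM(revenue), AVG(revenue) FROM sales"
    ).fetchone()
    con.close()
    return total, average


def fit_line(rows):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(rows, x):
    """Modern approach: use the fitted line to estimate a value not yet observed."""
    slope, intercept = fit_line(rows)
    return slope * x + intercept
```

With the sample data, `traditional_report(SALES)` describes what already happened (total 460.0, average 115.0), while `predict(SALES, 5)` estimates month 5 at about 140.0. A real project would normally use a statistics or machine learning library rather than hand-written least squares.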


What is data science? Continued …

Like computer science, social science, life science and other sciences, data science is also a science: it extracts hidden value from any data by applying scientific, statistical, mathematical and computing techniques to it.

As you can see, data science brings all the sciences together, since data is everywhere.

[Diagram: data from Computer Science, Social Science, Life Science, Medical Science, Material Science … (for example Social Data, Medical Data, Pharmacy Data) feeds into Data Science, where a Model and an Algorithm produce Measurable Hidden Values]


Data science offers a powerful and new approach to making data discoveries by combining aspects of statistics, computer science, applied mathematics, and visualization. Data science can turn the vast amounts of data the digital age generates into new insights and new knowledge.

Big Data

Data: Structured; Unstructured & Semi-Structured; Machine
Examples: Financial & Billing; Customer Behavior; Cell Phone Call Record

Apply scientific, statistical, and mathematical techniques

Techniques: Linear Regression; Time Series & Neural Network; Clustering … much more
Outcomes: Predictive Analytics; Advanced Analytics; Data Discovery … much more
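Clustering, one of the techniques named on this slide, can be sketched as follows. This is a minimal k-means on one-dimensional data, written under stated assumptions: the `CALL_MINUTES` values, the function name `kmeans_1d` and the way the starting centers are chosen are all hypothetical choices made for illustration, not something taken from the presentation.

```python
# Hypothetical monthly call minutes per customer, invented for illustration.
CALL_MINUTES = [10, 12, 11, 200, 210, 205]


def kmeans_1d(values, k, iterations=20):
    """Plain k-means on one-dimensional data with deterministic starting centers."""
    ordered = sorted(values)
    # Start from k evenly spaced points of the sorted data (an illustrative choice).
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centers = [float(ordered[round(i * step)]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, clusters
```

Calling `kmeans_1d(CALL_MINUTES, 2)` separates the light callers from the heavy callers, which is the kind of customer segmentation the slide refers to. Real call records would have many more dimensions and would normally be clustered with a library implementation.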


Typical Data Science Project Scope

Based on the projects that I have been involved in, the scope and focus of a data scientist role differ, so it is critical to understand the different focus areas and deliverables of a data science project.

Data Science Project Scope: Research & Development (Data Scientist)
Data Scientist Deliverables:
  Developing new models
  Developing new algorithms
  New analytics techniques & innovation
  New data product or platform development
  New analytics product or platform development
  etc.

Data Science Project Scope: Enterprise & Industry (Enterprise Data Scientist)
Data Scientist Deliverables:
  Solving business problems
  Target marketing & reduce marketing spend
  Consistent customer experience across all channels – create personalized customer experience … etc.
  Modernizing existing business intelligence solutions & data solutions


Enterprise Data Science Framework

On an enterprise data science project, an enterprise data scientist is expected to know the industry and its associated business processes very well in order to lead, guide and deliver the project. The following are the core enterprise data science building blocks.

Industry: Oil & Gas; Media; Telecommunication; Power & Utility; Retail; etc.
Business Process: Finance; Customer; Marketing; Human Resource; Supply Chain
Data: Unstructured Data; Semi-Structured Data; Structured Data; Machine Data
Analytics / Data Science Techniques: Linear Regression; Time Series; Clustering; Neural Network; Association; etc.
Measurable Business Values: Reduced 2% Cost; Increased 5% Revenue; etc.
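Association, one of the techniques named in the framework, is usually measured with support and confidence. The sketch below shows both measures for a retail setting; the `BASKETS` data and the function names `support` and `confidence` are hypothetical and were invented for illustration only.

```python
# Hypothetical retail shopping baskets, invented for illustration.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)
```

Here `support(BASKETS, {"bread", "milk"})` is 0.5 (two of four baskets), and `confidence(BASKETS, {"bread"}, {"milk"})` is two thirds, meaning that two of the three baskets with bread also contain milk. Rules with high support and confidence are candidates for the kind of target marketing mentioned earlier.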


Data Scientist Key Competency Area

There are many different skills required to become a data scientist, but these are our key observations on the skills required to deliver a data science project. Please note, we didn't list specific skills under each area. For example, under Data, we will have data management, data governance, data quality, data modeling, data architecture, data integration, data mapping, etc.

Data Science Project Scope: Research & Development
  Less focus on industry skill
  Less focus on business process skill
  Deeper focus on data skills
  Deeper focus on analytical skills
  Less focus on communication and people skills
  Deeper technology skills

Data Science Project Scope: Enterprise & Industry
  Deep focus on industry skill
  Deep focus on business process skill
  Data skills
  Analytical skills
  Strong communication and people skills
  Technical skills

Scale for each scope: Entry Level (Basic Skill) to Senior Level (Deeper Skill)

Profiles: Enterprise Analytics Transformation Leader; Industry Expert; Technology Thought Leader; PhDs; Academia


Data Scientist Competency Development Approach

Who may be the best fit for a data scientist?

I found that upskilling industry professionals who have prior experience in BI, data and data warehousing would be a faster, more stable and more sustainable approach to delivering and supporting an enterprise data science project.

Upskill on Industry and Business Process
Upskill on Advanced Analytics and Data Science Techniques


Key Takeaways

Key roles, responsibilities and deliverables of a data scientist on an enterprise data science project

Based on our industry experience, some of the key characteristics of a data scientist on an enterprise analytics transformation initiative are as follows.

Enterprise Analytics Transformation Initiative or Enterprise Data Science Project

Data Scientist Key Roles & Responsibilities
  Visionary
  Domain Expert
  Innovator
  Transformation Leader
  Change Agent
  Data Expert
  Analytical Thinker
  Technology Thought Leader

Data Scientist Key Deliverables
  Business Case
  Strategy and Roadmap
  Standards, Policies and Guidelines
  Data Management Framework
  Modern Enterprise Data Architecture – Big Data Lake
  Modern Enterprise Analytics Architecture - Enterprise Data Science
  Plan of Action – Tactical Level Execution Plan
  User Adoption
  Tools, Templates and Accelerators


Key Takeaways

Other roles and responsibilities that may be involved in an Enterprise Analytics Transformation Initiative or Enterprise Data Science Project

Please note, these roles are not mandatory. They may or may not exist and are subject to change, depending on project scope and objectives.

Enterprise Analytics Transformation Initiative or Enterprise Data Science Project

Other Roles & Responsibilities
  Data Engineer
  Data Architect
  Data Modeler
  ETL Developer
  Information Modeler
  Information Security Expert
  Data Analyst
  Data Visualization Engineer
  etc.

Other Deliverables
  Data Provisioning Functional and Technical Components
  Data Modeling
  Information Modeling
  Data Visualization Components
  etc.


Appendix


THE institute of enterprise analytics (TIOEA)

TIOEA: Industry Use Case; Research and Innovation; Consulting and Implementation; Training & Talent Development; Thought Leadership; Tools and Templates

Create talent and jobs
Simplify data science learning and empower learners with industry use cases and pre-packaged business content
Be a thought leader and governance model for enterprise data science implementation
Accelerate enterprise data science implementation with a proven innovation lab, tools, templates, standards, policies and guidelines

Role-based Learning TIOEA

We provide practical coaching and on-the-job learning experience.

Audience: Freshers; Experienced; Executives

Enterprise Data Science for Executives
Enterprise Big Data for Executives
Enterprise HADOOP for Executives

Enterprise Data Science for Architects
Enterprise Big Data for Architects
Enterprise HADOOP for Architects

Enterprise Data Science for Developers
Enterprise Big Data for Developers
Enterprise HADOOP for Developers

Learn to build
Learn to deliver
Learn to lead

CAP (Certified Analytics Professional)

Learning Roadmap

Functional Learning

Tracks: Industry Use Case; Strategy & Roadmap; Data; Analytics; User Experience

Problem Statement
Business Needs & Challenges
Business Impacts and Benefits
Implementation Methodology
Implementation Options & Plan
Deliverables and Milestones
Data Governance
Data Management
Data Sources & Data Format
Data Modeling
Data Integration
Design and Leading Practices
Data Science Overview
Data Science vs. Enterprise Data Science
Predictive Analytics & Advanced Analytics
Traditional “BI” vs. Data Science
Analytics Techniques
Design and Leading Practices
Data Visualization
Self Servicing and Data Analysis
Reporting and
Insights and Improved Decision Making
Deployment - Desktop, Mobile and Cloud
Design and Leading Practices

Our Enterprise Data Science Lab (HADOOP + SAP HANA + Oracle 12C + Analytics Tools + Open Source Technologies)

Learning Roadmap

Technical Learning

Tracks: Industry Use Case; “R” Programming; Python Programming; Machine Learning; Enterprise HADOOP

Problem Statement
Business Needs & Challenges
Business Impacts and Benefits
Implementation Methodology
Implementation Options & Plan
Deliverables and Milestones
HADOOP Overview
HADOOP Architecture
HADOOP Core Components
Data Management on HADOOP
Analytics & Application on HADOOP
HADOOP Ecosystem and Total Cost of Ownership
Enterprise Data Science Overview
Data Science vs. Enterprise Data Science
Predictive Analytics & Advanced Analytics
Analytics on “R” Overview
Analytics on “Python” Overview
Analytics on “Natural Language Processing”
Data Visualization
Traditional “BI” vs. Data Science
Self Servicing and Data Analysis
Insights and Improved Decision Making
Deployment on Desktop, Mobile and Cloud
Change Management and Training

Our Enterprise Data Science Lab (HADOOP + SAP HANA + Oracle 12C + Analytics Tools + Open Source Technologies)

Big Data Enabling Technologies
Cloud Technologies Overview
SAP Analytics Tools Overview
Oracle Analytics Tools Overview
Open Source Technologies Overview
Data Management Technologies Overview
Data Management
Implementation Overview


Thank You !!