Implementing R in the old economy

© 2010 – 2016 eoda GmbH

Implementing R in old economy companies: How to master the road to production

Oliver Bracht | Chief Data Scientist

© 2010 – 2016 eoda GmbH | www.eoda.de

Introduction | Guerilla | Proof of Concept | Production

About eoda

Interdisciplinary team: Statisticians | Engineers | Economists | Sociologists | …

Based in Kassel, Germany

Data Science Consulting, Training, Support, Software and Analytic Services with a focus on R

Introduction of R

Core | Data Science Lab | Pioneer | Requirement

Introduction of R: Core

• Driver: Founder and Top-Management

• Root: Center

• Data Sources: All available

• Motivation: Corporate objective

• Mindset: A natural thing

• Time-to-production: Already there

Introduction of R: Data Science Lab

• Driver: Management

• Root: Top

• Data Sources: Sandbox

• Motivation: Research

• Mindset: Let's try

• Time-to-production: Ready when ready

Introduction of R: Requirement

• Driver: Business lines or management

• Root: Top

• Data Sources: Production / Business

• Motivation: Solving current business problems

• Mindset: Reduce risk by planning / ROI

• Time-to-production: Short

Introduction of R: Pioneer

• Driver: Single business unit(s)

• Root: Bottom

• Data Sources: CSV copies

• Motivation: Pioneer's conviction

• Mindset: Yes, we can

• Time-to-production: Long

Introduction of R

Core | Data Science Lab | Pioneer | Requirement

New Economy | Enterprises | Old Economy

The pioneer's road to production

Level 1: Guerilla. Stakeholders: Data Science

Level 2: Proof of Concept. Stakeholders: Data Science + Business

Level 3: Production. Stakeholders: Data Science + Business + IT

Level 1: Guerilla

• The company is not using any data science tools so far, besides Excel

• A single person or small group starts using R for certain tasks

• Often interns or new entrants

• Software decision is made independently of the IT department

• Non-strategic decision

• Colleagues, supervisors and management are excited

Level 1: Guerilla

Success Factors

• Involve the IT department as early as possible

• Try to enlarge the internal Data Science team

  • by training (internal, on-site, online)

  • by hiring

• Get to the next level as soon as possible

Level 1: Guerilla

Risks

• IT department raises difficulties

• Getting lost in ad-hoc requests

• Becoming everybody's problem solver

Level 2: Proof of Concept

• Use Case Evaluation

  • Data Availability

  • Analytical Complexity

  • Business Value

• Select the most promising use cases for implementation

• Plan to accomplish more than one use case
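The three evaluation criteria above can be turned into a simple ranking. The following is a minimal sketch in Python (the same logic carries over to R); the 1 to 5 scale, the equal weighting, the `UseCase` class and the example use cases are all assumptions made for illustration and are not part of the talk.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    data_availability: int      # 1 (poor) to 5 (data ready to use)
    analytical_complexity: int  # 1 (simple) to 5 (very complex)
    business_value: int         # 1 (low) to 5 (high)


def score(use_case: UseCase) -> int:
    """Higher is better; complexity counts against a use case."""
    return (use_case.data_availability
            + use_case.business_value
            + (6 - use_case.analytical_complexity))


def rank(use_cases: list[UseCase]) -> list[UseCase]:
    """Return use cases ordered from most to least promising."""
    return sorted(use_cases, key=score, reverse=True)


# Hypothetical candidates, for illustration only.
candidates = [
    UseCase("churn prediction", 4, 3, 5),
    UseCase("sensor anomaly detection", 2, 5, 4),
    UseCase("sales forecast", 5, 2, 2),
]

# Keep more than one use case, as the slide recommends.
shortlist = rank(candidates)[:2]
```

In practice the weights would be agreed with the business side; the point of the sketch is only that the selection becomes explicit and repeatable.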

Level 2: Proof of Concept

Success Factors

• Make sure that implementation is possible in principle

• Keep your target in mind, don't get lost in details

• Involve business as much as possible

• Don't scare business with statistical terms

• Keep analytic approach and business demand in line

Level 2: Proof of Concept

Risks

• Promising a specific outcome

• Sticking to your original ideas when they no longer fit

• Showing preliminary results to business people

• Investing too much energy in optimization and performance

• Communication gaps between Data Science and Business

Level 3: Production

• Moving Proof-of-Concept approaches into Production

• Shift from laboratory to automation

• Shift from hacking to programming

Level 3: Production

Diagram labels: Input | Analytic | Output | Business Implementation | Technical Implementation

Level 3: Production

Diagram labels: Human / Machine | Analytic | Human / Machine | Business Implementation | Technical Implementation

Level 3: Production

Success Factors

• Focus on software development skills

• Pay close attention to performance

• Put a first stable version on the live system before adding new features

• Open a backlog for future improvements

Level 3: Production

Success Factors

• Versioning

• Dependency Management

• Package Building

• Testing

• Profiling

• Failure Control

• Documentation

• Staging

• Deployment

Level 3: Production

Risks

• Underestimating the effort for maintenance and bug fixing

• Debugging is harder on production systems

• Applying code changes without testing them on staging systems

• "Unknown" input
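The "unknown input" risk can be reduced by checking every incoming record before it reaches the analytic code. The following is a minimal sketch in Python (the same idea applies in R); the column names, the expected types and the set of known states are hypothetical and stand in for whatever schema the production model was trained on.

```python
# Hypothetical schema: column name -> expected Python type.
EXPECTED_COLUMNS = {"machine_id": str, "temperature": float, "state": str}

# Hypothetical category levels that were present in the training data.
KNOWN_STATES = {"idle", "running", "maintenance"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    for column, expected_type in EXPECTED_COLUMNS.items():
        if column not in record or record[column] is None:
            problems.append(f"missing value: {column}")
        elif not isinstance(record[column], expected_type):
            problems.append(f"unexpected type for {column}")
    state = record.get("state")
    if isinstance(state, str) and state not in KNOWN_STATES:
        problems.append(f"unknown state: {state}")
    return problems


# Illustrative records only.
good = {"machine_id": "M-17", "temperature": 71.5, "state": "running"}
bad = {"machine_id": "M-18", "temperature": None, "state": "rebooting"}

print(validate_record(good))
print(validate_record(bad))
```

Rejected records can then be logged and routed to a person instead of failing silently inside the model.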

Conclusion

Data Science

Statistics, Methods | Domain knowledge | Software Development

Guerilla & Proof-of-Concept

Statistics, Methods | Domain knowledge | Software Development

Production

Statistics, Methods | Domain knowledge | Software Development

@eodaGmbH

blog.eoda.de

eoda GmbH

Universitätsplatz 12

34127 Kassel, Germany

www.eoda.de | [email protected]

+49 561 202724-40

The Data Science Specialists.