A Machine Learning based approach for the Competitor ... · information analysis in the automotive...

26
Chair of Software Engineering for Business Information Systems (sebis) Faculty of Informatics Technische Universität München wwwmatthes.in.tum.de A Machine Learning based approach for the Competitor Information Analysis in the Automotive Industry Roman Pass, 24.04.2017, Munich

Transcript of A Machine Learning based approach for the Competitor ... · information analysis in the automotive...

Page 1: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Chair of Software Engineering for Business Information Systems (sebis) Faculty of InformaticsTechnische Universität Münchenwwwmatthes.in.tum.de

A Machine Learning based approach for the CompetitorInformation Analysis in the Automotive IndustryRoman Pass, 24.04.2017, Munich

Page 2: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

1. Introduction Competitor Analysis Problems Research questions

2. Requirements

3. System Design

4. Evaluation System Usability

Feedback

5. Future Work

Outline

© sebis161208 Roman Pass Final Presentation 2

Page 3: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Introduction

© sebis240417 Roman Pass Final Presentation 3

Competitor Analysis (CA)

Identify Strength, Weakness, Opportunities, Threats (SWOT)

Identify All competitors Basic competitors Primary competitors

Support decisions in management and development

→ Spread Competitive Intelligence (CI) within the company

Page 4: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Introduction

© sebis240417 Roman Pass Final Presentation 4

Competitive Intelligence Process

Source: Rainer Michaeli. Competitive Intelligence: Strategische Wettbewerbsvorteile erzielen durch systematische Konkurrenz-, Markt- und Technologieanalysen. Springer-Verlag Berlin Heidelberg, 2006.

Requirements Gathering

PlanningData

AcquisitionProcess

DataAnalysis and Interpretation

Reporting

50 % of process time

Page 5: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Introduction

© sebis240417 Roman Pass Final Presentation 5

Analyst

DB

WordTXT

Internal sources

Excel pptx patents

External sources

Competitive Intelligence Process – Data Acquisition

Page 6: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Introduction

© sebis240417 Roman Pass Final Presentation 6

Too many information sources for manual analysis

Long duration and high occurence of analysis

Destinguish between relevant and irrelevant information

Retrieve processed information

Competitive Intelligence Process – Problems

Page 7: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Introduction

© sebis240417 Roman Pass Final Presentation 7

How to improve the existing competitor information analysis process in theautomotive industry?

What kind of documents and document sources are used for competitor information analysis in the automotive domain?

Which categories/topics do these documents belong to?

Which supervised classification algorithm is suitable for automatic document classification to support competitor information analysis in the automotive domain?

Research Questions

Page 8: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

1. Introduction Competitor Analysis Problems Research questions

2. Requirements

3. System Design

4. Evaluation System Usability

Feedback

5. Future Work

Outline

© sebis161208 Roman Pass Final Presentation 8

Page 9: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Requirements

© sebis240417 Roman Pass Final Presentation 9

One stream of information

Search engine as UI

Automated import of data from external sources

Knowledge management system with the support for storing both the internal and external data

Topic structure within knowledge management system

Page 10: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Automatic Document Classification

© sebis240417 Roman Pass Final Presentation 10

Labels defined by department's topics Prognosis Exhibitions General Information Initial Evaluation Press Evaluation Garbage

Training documents assigned manually to each topic

Supervised Machine Learning (ML) algorithms compared Support Vector Machine (SVM): low prediction accuracy K-Nearest Neighbours (k-NN): low prediction performance Naive Bayes (NB): good prediction performance and accuracy

Page 11: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Automatic Document Classification

© sebis240417 Roman Pass Final Presentation 11

Naive Bayes

Training Phase: n-Fold cross validation (n=10)

Different data split ratios (training : test) were tested

Page 12: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

1. Introduction Competitor Analysis Problems Research questions

2. Requirements

3. System Design

4. Evaluation System Usability

Feedback

5. Future Work

Outline

© sebis161208 Roman Pass Final Presentation 12

Page 13: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

System Design

© sebis240417 Roman Pass Final Presentation 13

Page 14: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

© sebis240417 Roman Pass Final Presentation 14

C O M P E T I • C O R T E XDemo

Page 15: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

1. Introduction Competitor Analysis Problems Research questions

2. Requirements

3. System Design

4. Evaluation System Usability

Feedback

5. Future Work

Outline

© sebis161208 Roman Pass Final Presentation 15

Page 16: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Evaluation

© sebis161208 Matthes English Master Slide Deck 16

A subset of internal documents and messages of external sources used

9 test subjects (managers and analysts)

Introduce system to every person individually

Test the system

Evaluation questionnaire (System Usability and Feedback)

Page 17: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Evaluation

© sebis161208 Matthes English Master Slide Deck 17

0% 100%Min 68% 82,78%

System Usability Scale (SUS)

Ten-item scale [1]

Illustrating subjective assessments of system usability

Positional value for each question (1 – 5)

Calculating SUS-score (0% – 100%)

➔ Exceeds the recommended value of 68% [2]

Source: [1] John Brooke. Sus: A quick and dirty usability scale, 1996.[2] John Brooke. Sus: A retrospective. Journal of Usability Studies, 8(2):29–40, 2013.

Page 18: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Feedback (1)

© sebis161208 Matthes English Master Slide Deck 18

What do you like most about the application?➢ Usability, simplicity & attractiveness of user views (intuitive system handling)

What should be improved in the current system?➢ Transparency in results' relevance

Did the result of your query contain relevant documents?

Page 19: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Feedback (2)

© sebis161208 Matthes English Master Slide Deck 19

Could you retrieve your uploaded file?+ PDF, Word, Powerpoint- Excel

Assess whether the efficiency of the information acquisition can be optimized by using the system

+ 7 subjects agree, if document pool is extended and external message currentness is granted

- 1 subject: Only a subset of internal information sources can be included

Page 20: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

1. Introduction Competitor Analysis Problems Research questions

2. Requirements

3. System Design

4. Evaluation System Usability

Feedback

5. Future Work

Outline

© sebis161208 Roman Pass Final Presentation 20

Page 21: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Future Work

© sebis161208 Matthes English Master Slide Deck 21

Extend information sources → improve automatic document classification

Provide individuality (user accounts)

Dictionary Synonyms Translations

Provide Optical Character Recognition (OCR) for scanned documents

Page 22: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Technische Universität MünchenFaculty of InformaticsChair of Software Engineering for Business Information Systems

Boltzmannstraße 385748 Garching bei München

Tel +49.89.289.Fax +49.89.289.17136

wwwmatthes.in.tum.de

Roman PassB.Eng.

17132

[email protected]

Page 23: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

© sebis161208 Matthes English Master Slide Deck 23

Backup

Page 24: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Use Case

© sebis240417 Roman Pass Final Presentation 24

Page 25: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

CompetiCortex Layers

© sebis240417 Roman Pass Final Presentation 25

Page 26: A Machine Learning based approach for the Competitor ... · information analysis in the automotive domain? Which categories/topics do these documents belong to? Which supervised classification

Comparison Between Classification Algorithms

© sebis240417 Roman Pass Final Presentation 26