ERP Centric Data Mining and KD
Copyrights 2002 ERP Data Mining & Knowledge Discovery webcast searchsap.com Sept 10, 2002
ERP Centric Data Mining and Knowledge Discovery
Naeem Hashmi
Chief Technology Officer
Information Frameworks
e-mail: [email protected]
Web: http://infoframeworks.com
Webcast - searchsap.com, September 10, 2002
About the Speaker
Naeem Hashmi
• Founder and CTO of Information Frameworks; an author, speaker, and world-renowned expert on emerging Information Architectures, Integration, and Business Intelligence Technologies.
• Author of the best-selling book SAP Business Information Warehouse for SAP, 2000.
• Technical Editor, SAP BW Certification Guide, authored by Catherine Roze, 2002.
• Contributing Author, SAP BW Handbook, 2002.
• Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences, including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining, and the Data Warehouse Institute.
• 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object, and Client/Server Technologies; and Strategic Consulting.
• Email: [email protected] URL: http://infoframeworks.com Tel: 603-432-4550
Agenda
• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP Centric Data Mining
• Q&A
What is Data Mining and Knowledge Discovery?
• Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores to extract data patterns, models, and rules.
• Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends, and patterns. Data mining is an integral part of the knowledge discovery process.
Data Mining and Statistics?
• DM sounds very similar to regression analysis, but its approach and purpose are quite different
– Statistical methods test a hypothesis on a data set
– Data Mining starts from the data sets to construct a hypothesis
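The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the column layout, and the helper name churn_rate_gap are hypothetical, and no significance test is shown. The first call checks a hypothesis the analyst chose in advance; the second part scans every attribute and lets the data propose the hypothesis.

```python
from statistics import mean

# Hypothetical toy records: (age_group, region, churned)
records = [
    ("young", "north", 1), ("young", "south", 1), ("young", "north", 1),
    ("old", "north", 0), ("old", "south", 0), ("old", "south", 1),
]

def churn_rate_gap(rows, column, value):
    """Churn rate of rows where rows[column] == value, minus the rate of the rest."""
    inside = [r[2] for r in rows if r[column] == value]
    outside = [r[2] for r in rows if r[column] != value]
    return mean(inside) - mean(outside)

# Statistical approach: the analyst states the hypothesis first
# ("region north churns differently"), then checks it against the data.
hypothesis_gap = churn_rate_gap(records, column=1, value="north")

# Data mining approach: scan every attribute/value pair and let the data
# propose the candidate hypothesis with the largest gap.
candidates = {
    (col, val): churn_rate_gap(records, col, val)
    for col in (0, 1)
    for val in {r[col] for r in records}
}
best = max(candidates, key=lambda k: abs(candidates[k]))
print("pre-stated hypothesis gap:", hypothesis_gap)
print("strongest pattern found in column index:", best[0])
```

In this toy data the pre-stated hypothesis about region shows no gap, while the scan surfaces the age group column as the attribute worth investigating.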
Data Mining - Present State
Application Domains
• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)
Source: http://www.kdnuggets.com/polls/
Data Mining Methodologies
CRISP-DM: CRoss Industry Standard Process for Data Mining
Six-step process:
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Source: http://www.kdnuggets.com/polls/
Source: http://www.crisp-dm.org/
Data Mining Process
CRoss Industry Standard Process (CRISP) for Data Mining
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Callout (Data Understanding, Data Preparation, Data Warehouse): Initially will take about 60% to 80% of the data mining project time
Source: http://www.crisp-dm.org/
Data Mining - Tools and Data Formats
Domains
• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)
Data Formats
• Flat files: 57%
• Proprietary: 37%
• DBMS: 27%
Source: http://www.kdnuggets.com/polls/
Data Mining Technology
Techniques
• Visualization: Use human pattern recognition capabilities
• Statistics: Applying statistical techniques to predict
• Decision Trees: Building scripts based on historic data
• Association Rules (Rule Induction): Reasoning from specific facts to reach a hypothesis
• Clustering: Refers to finding and visualizing groups of facts that were not previously known
• Neural Networks: Learning how to solve problems based on examples
• K-Nearest Neighbor: Classification by looking at similar data
• Genetic Algorithms: Survival of the fittest …
Usage
• Discover
• Understand
• Predict
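As one concrete instance of the techniques listed, K-Nearest Neighbor ("classification by looking at similar data") can be sketched as below. This is a minimal sketch, not any vendor's implementation; the customer attributes, the labels, and the function name knn_classify are hypothetical.

```python
from collections import Counter
from math import dist

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (monthly_spend, support_calls) -> outcome
train = [
    ((20.0, 5.0), "left"), ((25.0, 4.0), "left"), ((22.0, 6.0), "left"),
    ((80.0, 1.0), "stayed"), ((75.0, 0.0), "stayed"), ((90.0, 2.0), "stayed"),
]
prediction = knn_classify(train, (24.0, 5.0))
print(prediction)
```

In practice the attributes would usually be scaled first so that one large-valued column does not dominate the distance.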
Data Mining Models
Two Types of Data Mining Models
Prediction Models: Prediction and Classification
• Regression algorithms (predict numerical outcome): Neural Networks, Rule Induction
• Classification algorithms (predict symbolic outcome): CHAID, discriminant analysis
Descriptive Models: Grouping and Associations
• Clustering/Grouping algorithms: K-means, Kohonen, Factor Analysis
• Association algorithms: Apriori, Sequence
Traditional DM Vendors
• SPSS Clementine
• SAS Enterprise Miner
• IBM Intelligent Miner
• Salford CART/MARTS
• …more
Database Vendors – DM within the Products
• Data Mining Engine in Oracle 9i
– Oracle 9i consists of key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
• IBM Intelligent Miner into DB2
• TeraMiner into Teradata
• Microsoft – SQL Server 2000
• When you implement DM functionality in a DBMS, you are limited to a specific database engine, which is not very flexible in a typical enterprise application landscape (a heterogeneous environment).
Data Mining Standards
• PMML - Predictive Model Markup Language
• OLE DB for Data Mining
• Java Data Mining API
• Other data exchange standards for analytics that need data mining extensions
– CWM: Common Warehouse Metamodel
– XML/A: XML for Analysis
– CPEX: Customer Profile EXchange
– xCIL: Extensible Customer Information Language
Enterprise Applications Landscape
• ERP Solutions
– Oracle
– PeopleSoft
– SAP
• ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as Customer Relationship Management, Business Intelligence, and Enterprise Portals
• Siebel
• Oracle Business Intelligence Solution
• PeopleSoft Enterprise Performance Management
• SAP Business Information Warehouse
Oracle Business Intelligence Solution
Business Processes (Pre-Built Portlets)
• Response to Lead (27)
• Lead to Quote (56)
• Quote to Order (15)
• Order to Cash (34)
• Demand to Build (40)
• Procure to Pay (28)
• Revenue to Compensation (29)
• Expiration to Renewal (33)
• Issue to Resolution (51)
• HR Family (43)
Oracle 9i DM Integration
• Oracle Marketing Online for Campaign Management
• Oracle9iAS Personalization
• iStore
• more to come…
Oracle 9i Business Intelligence
• Oracle9iDS Warehouse Builder
• Oracle9iAS Discoverer
• Oracle9iDS Reports
• Oracle9iAS Portal
• Oracle9iAS Clickstream Intelligence
• Oracle9iAS Personalization
• Oracle9i Data Mining
• Oracle9iDS Business Intelligence Beans
Source: Oracle
PeopleSoft Business Intelligence Solution
Customer Profitability
Finance
Workforce Analytics
Supply Chain Management Process
Workforce Rewards
Enrollment Management
Retail Merchandise
Project Analysis
Student Administration
Balanced Scorecard
Employee Scorecard
Customer Scorecard
Vendor Scorecard
Enterprise Performance Management (EPM)
Courtesy: eBusiness Advantage Inc. (www.ebizadvan.com)
CRM Prospect Analysis
CRM Marketing Analysis
CRM Sales Effectiveness
CRM Service Effectiveness
Data Mining Capabilities
No word on PeopleSoft data mining tools or technologies for predictive analytics, whether home grown, acquired, or third-party products. No response from PeopleSoft contacts.
SAP Business Intelligence Solution
+420 InfoCubes
+1700 Queries
Source: SAP
SAP CRM
Campaign management
Opportunity analytics
Customer behavior modeling
SAP SCM
Demand planning
Spend optimization
SCOR KPIs
SAP Financials, Human Capital Management
SEM
Balanced scorecard
Planning
Economic profit
Benchmarking
Employee turnover & retention
Corporate investment management
Closed loop platform capabilities
Drill-through (report-report i/f)
Remote cubes (read through)
Real-time data warehousing
Data mining
Write back to operational system
SAP Portals
E-commerce analysis
SAP Markets, Procurement
Bidding, pattern-based offering
Activity reporting, service analytics
90 ODS Objects
Business Information Warehouse
CRM Vendors – Data Mining Integration
• Oracle CRM
– Pre 9i: Darwin
– Post 9i: ODM
• RightPoint and E.piphany
• SPSS and Siebel
• SAP CRM
– Native Data Mining built into SAP BW - database independent
– IBM Intelligent Miner interface with SAP BW
• PeopleSoft CRM
– No official data mining product or vendor solution
– Waiting for their response on what they have
SAP BW 3.0b Data Mining Implementation
• Currently for Customer Subject Area
• Algorithms Supported
– Decision Trees
– Scoring
– Clustering/Segmentation
– Association
• Data Mining process
– Model definition
– Training the model
– Performing prediction using the training results
– Uploading the results back into BW
– Utilizing the mining results (on the operational side)
– SAPGUI is the interface to the data mining modeling and analysis
No Extensive Data Staging
Modeling a Decision Tree
Create a mining model
• Model columns
• Specifying the column parameters
• Specifying the values in case the original values in the column are to be treated differently
• Indicating the prediction column
• Indicating the key column
• The nature of the column content
• Data type of the column
Source: SAP
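The column settings named on this slide can be pictured as a small data structure. The sketch below is hypothetical and is not an SAP BW API: the class name ModelColumn, the field names, and the rule that a model has exactly one key column and one prediction column are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelColumn:
    """One column of a mining model (hypothetical structure, not an SAP API)."""
    name: str
    content_type: str            # nature of the column content, e.g. "discrete"
    data_type: str               # data type of the column, e.g. "CHAR" or "NUM"
    is_key: bool = False         # identifies each case
    is_prediction: bool = False  # the column the tree should predict
    # original value -> value to use instead, for values treated differently
    value_overrides: Dict[str, str] = field(default_factory=dict)

def validate(columns: List[ModelColumn]) -> None:
    """Assumed rule for this sketch: exactly one key and one prediction column."""
    keys = [c.name for c in columns if c.is_key]
    targets = [c.name for c in columns if c.is_prediction]
    if len(keys) != 1:
        raise ValueError(f"expected exactly one key column, found {keys}")
    if len(targets) != 1:
        raise ValueError(f"expected exactly one prediction column, found {targets}")

churn_model = [
    ModelColumn("customer_id", "key", "CHAR", is_key=True),
    ModelColumn("profession", "discrete", "CHAR", value_overrides={"": "UNKNOWN"}),
    ModelColumn("status", "discrete", "CHAR", is_prediction=True),
]
validate(churn_model)
```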
Modeling a Decision Tree
Specify Model Parameters
• Use a portion (%) of the data for training or the whole data set for training
• Size of the window (such as 10%)
• The number of repeats with different samples
• Stop training when the no. of cases under the given node is less than/equal to the specified value
• Stop training when the accuracy is greater than or equal to the expected accuracy
• If the tree is too big, prune the tree without violating the expected accuracy
• Use the information gain threshold to check the relevance
Source: SAP
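Two of these parameters, the information gain threshold and the stopping conditions, can be illustrated with a generic textbook (ID3-style) calculation. The slides do not document how SAP BW computes these internally, so the sketch below is an assumption; the function names, toy rows, and threshold values are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, column):
    """Entropy reduction obtained by splitting on the values of one column."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[column], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def should_stop(labels, min_cases, expected_accuracy):
    """Stop when the node is small enough or already accurate enough."""
    majority = Counter(labels).most_common(1)[0][1]
    return len(labels) <= min_cases or majority / len(labels) >= expected_accuracy

# Hypothetical cases: (profession, region) and the outcome to predict
rows = [("labourer", "north"), ("labourer", "south"),
        ("clerk", "north"), ("clerk", "south")]
labels = ["left", "left", "stayed", "stayed"]

gain_profession = information_gain(rows, labels, column=0)
gain_region = information_gain(rows, labels, column=1)
threshold = 0.1  # assumed information gain threshold
relevant = [name for name, gain in
            (("profession", gain_profession), ("region", gain_region))
            if gain >= threshold]
print(relevant)
```

Here only the profession column passes the threshold, so it would be the only candidate for a split under this sketch.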
Modeling a Decision Tree
Create a training source and map the model columns
• BW Query
• Runtime parameters for query
• Model columns
• Selected source columns
• Mapping between model column and source column
Source: SAP
SAP BW Data Mining – Process Steps
• Create a mining model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
Viewing Decision Tree Training Results
• This decision tree predicts whether the customer has left or is still “on board”
• The chance of a customer leaving is 70.7% if the profession is “LABOURER”
• Chart shows the distribution at the selected node
• 28/41 customers are likely to leave
• 13/41 customers are likely to stay
• Out of a total of 705 cases, 41 cases are covered under this node
Source: SAP
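The node figures quoted on this slide can be turned into shares with simple arithmetic. The sketch below uses only the counts given on the slide; the variable names are illustrative.

```python
# Counts quoted on the slide for the selected node
total_cases = 705
node_cases = 41
likely_to_leave = 28
likely_to_stay = 13

coverage = node_cases / total_cases        # share of all cases under this node
leave_share = likely_to_leave / node_cases
stay_share = likely_to_stay / node_cases

print(f"node coverage: {coverage:.1%}")
print(f"likely to leave: {leave_share:.1%}")
print(f"likely to stay: {stay_share:.1%}")
```

With these counts the node covers about 5.8% of all cases, and 28 of 41 works out to about 68.3%. That is close to, but not the same as, the 70.7% quoted on the slide; the screenshot is not available in this transcript, so the two figures may refer to different nodes or views.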
Data Mining – Decision Trees
Results uploaded in BW, then BEX for further analysis
Source: SAP
Data Mining – Association
• Create an Association model
• Define model columns
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
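What "training" an association model produces can be sketched with the two standard measures, support and confidence. The code below is a simplified illustration that only examines pairs of items; it is not a full Apriori implementation and not SAP BW's algorithm. The baskets, thresholds, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical purchase baskets
baskets = [
    {"printer", "paper", "ink"},
    {"printer", "ink"},
    {"paper", "ink"},
    {"printer", "paper"},
    {"printer", "ink", "cable"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def pair_rules(transactions, min_support, min_confidence):
    """Rules lhs -> rhs over item pairs that meet both thresholds."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # infrequent pairs cannot yield a usable rule
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found

rules = pair_rules(baskets, min_support=0.5, min_confidence=0.7)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support {sup:.2f}, confidence {conf:.2f}")
```

A rule such as "ink -> printer" could then be applied to new baskets, which corresponds to the prediction step in the list above.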
Data Mining – Cluster Analysis
• Create a Cluster model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
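Training a cluster model can be illustrated with K-means, one of the clustering algorithms named earlier in the deck. The slides do not say which clustering algorithm SAP BW uses, so this is a generic sketch; the points, the fixed starting centroids, and the function name k_means are assumptions.

```python
from math import dist
from statistics import mean

def k_means(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = per-axis mean of the cluster; keep the old one if empty
        centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (orders_per_year, average_order_value)
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
# Fixed starting centroids keep the sketch deterministic (an assumption;
# real implementations usually choose starting points differently).
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Predicting with the trained model then amounts to assigning a new case to the nearest centroid.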
Viewing Cluster Analysis Results
Source: SAP
Viewing Cluster Analysis Results
Results uploaded in BW, then BEX for further analysis
Source: SAP
SAP Data Mining
• Good attempt to implement a few Data Mining algorithms
• Very traditional Data Mining approach
• Requires a well-versed statistician or Data Mining expert to model and interpret the results
• Source: BEX Query – a big limitation in DM
• Weak visualization
• BEX for additional discovery - slicing and dicing
SAP BW - IBM Intelligent Miner
IBM Intelligent Miner is designed to:
• Copy data from SAP BW to IBM Intelligent Miner
– Results of reports in BW – Modeling in Business Explorer Analyzer
– Data direct from InfoCubes (for cross-selling analysis)
– Descriptions, hierarchies
• Results data from IBM IM back into SAP BW
– Results of segmentation can be loaded as master data or hierarchies
• Data transport is designed through Wizards in SAP BW
– Possible to get a good view of Intelligent Miner results from SAP BW
ERPs and Data Mining: The Good and the Bad News
• Good News
– Known Business Processes
– Few data Sources
– Improved Data Quality
– Metadata Integration
– Near real-time data mining
– Closed-loop Knowledge Discovery
– Consistent Infrastructure
• Bad News
– Complex Data Structures
– Performance
– Availability
– Very few Data Mining algorithms - Today
CRISP-DM
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Data Mining Process and ERP Data Mining
Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Callout (Data Understanding, Data Preparation, Deployment, Business Understanding): Will reduce data mining project time up to 50%
Good News for Future Business Applications
Source: http://www.crisp-dm.org/
INFORMATION FRAMEWORKS
Technology/Solution Assessment
Product Strategy Solution Strategy
Product Positioning Competitive Analysis
Software product architecture Marketing Strategy
Product Performance and Benchmarking Consulting
Hardware Configuration
Market Research Market Assessment
Competitive Analysis Technology due
Seminars, Webinars, Keynotes
Panel Moderator, Publications
Hands-on training, Conferences
Executive and Senior IT Management Consulting
Enterprise Information Architectures (EIA) Business Case Development
Information Architecture Application
Deployment Architectures implementation
Legacy Application Migration Strategies
ERP Application deployment strategies
Enterprise Applications Integration (EAI)
Architectures, Service Modeling and design, EAI technology assessment
Tools and Technology Assessment
Vendor Selection and Assessment
Conference Room Pilot implementation
Business Intelligence and Portals
Architectures, Methodologies
Tool/technology/Vendor assessment and selection
Data Warehouse, Data Marts, Analytics, Information Delivery
Deployment Architectures
Business Intelligence and eBusiness Integration architectures
Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and knowledge Transfer
KNOWLEDGE TRANSFER
INFORMATION TECHNOLOGY ORGANIZATION
SOFTWARE AND SOLUTION VENDORS
INFORMATION TECHNOLOGY INVESTORS
http://infoframeworks.com
Questions
Naeem Hashmi
Chief Technology Officer
September 10, 2002
Email: [email protected] Site: http://infoframeworks.com
Tel: 603-432-4550