Group 7_DMBD Term Project
Transcript of Group 7_DMBD Term Project
8/10/2019 Group 7_DMBD Term Project
By Nancy, Subhash Rajeev, Vishnu Poduval, Vishal Wagh, Eswar Sunil Kumar and Aaron Ernest
Objectives of the study
Business Objective
To gauge the buying behavior of customers from an e-commerce point of view, and thereby to identify the major issues that prevent different classes of users from using the internet for making purchases
Factors Under Consideration
General demographics of the users
Technology demographics of the users
Internet shopping habits
Web and Internet usage habits
Source of database
GVUC's 8th WWW User Survey (run from October 10, 1997 to November 16, 1997)
Special pointers to the survey provided by Yahoo, Netscape and WebTV
Mining
Main challenge faced - Branching - Categorization of the data
General: Gender; Primary Language; Registered to Vote; Important Issue Facing the Internet
Technology: Connection Speed & Upgrades; Email Accounts; Equipment Owned; Frequency of Switching Browsers
Internet Usage: Indispensable Technologies; Frequency of Use; Navigation Services; Years on Internet
Privacy: Cookie Privacy; Internet Laws; Content Providers have right to resell user information
E-commerce: Reasons for Using the Web for Personal Shopping; Time Spent Searching (Personal Shopping); Success Rate of Personal Shopping
Which are … who have … online?
What are t… for people …
What is th… reasons wi… classificat… geography
Is there a … spent on th… habits?
How much … matter in t… customers
Dataset: numeric, stored in an ASCII file
No. of records: 10044. Fields: 60
After cleaning, no. of records: 7290; no. of training records: 1822; no. of test records: 5…
The training d… to develop the … later tested on …
… fields are highl… each other. In most of such … them have bee…
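The training/test sampling described above can be sketched as a simple random split. This is a minimal illustration, not the team's actual procedure: the function name `split_records`, the fixed seed, and the assumption that every cleaned record not used for training is held out for testing are all hypothetical.

```python
import random

def split_records(records, n_train, seed=0):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    return shuffled[:n_train], shuffled[n_train:]

# Stand-in for the 7290 cleaned survey records (row indices only, no real data).
cleaned = range(7290)
train, test = split_records(cleaned, n_train=1822)

# Assumption: all remaining cleaned records form the test set.
print(len(train), len(test))
```

With these counts, the training sample is roughly a quarter of the cleaned data.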
Overview of Dataset
Variables (Demographic Segmentation)    Variables (Psychographic Segmentation)
Age                                     Years on Internet
Country                                 Major Occupation
Gender                                  Who_pays_for_access
Education_Attainment                    Willingness_to_Pay_Fees
Race                                    Most_important_Issue_Facing_the_Internet
Major Geographical Location             Sexual Preference
Marital Status                          Primary_Place_of_WWW Access
Household Income
The entire … recoded as … index to th… in an addit… file
Methodology
Data cleaning
Sampling: training and test
Demographic segmentation models trained
C 5.0 with balancing
C 5.0 with balancing and boosting
C 5.0 with balancing, boosting and misclassification costs
Psychographic segmentation models trained
C 5.0 balanced
C 5.0 boosted
C 5.0 with misclassification costs
Neural network
Boosted neural network
Testing of models
Interpreting results of the most accurate and suitable model
Target vari… Whether u… online
Variable ty…
Distributio… 78% Yes, 2…
Factor of ba…
Misclassific… demograph…
No. of trial…
Other models were used as well; only these have been represented in the PPT for having the most … confusion matrices and lift curves
Sampling m… Training … Testing D…
Evaluation … Gains cha… matrices
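Balancing matters here because the target is skewed toward "Yes". The slides do not say how the balancing was done, so the sketch below shows only one common option, downsampling every class to the size of the smallest one. The function `balance_by_downsampling` and the toy rows (a made-up sample with a 78% majority class) are illustrative assumptions, not the survey data or the tool's own balancing step.

```python
import random
from collections import Counter

def balance_by_downsampling(rows, label_of, seed=0):
    """Return a shuffled sample in which every class has as many rows as the rarest class."""
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    smallest = min(len(group) for group in groups.values())
    rng = random.Random(seed)
    balanced = []
    for group in groups.values():
        # Draw without replacement so no row is duplicated.
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy rows: (label, row id), skewed 78 to 22.
rows = [("yes", i) for i in range(78)] + [("no", i) for i in range(22)]
balanced = balance_by_downsampling(rows, label_of=lambda row: row[0])
counts = Counter(label for label, _ in balanced)
print(counts["yes"], counts["no"])
```

Downsampling discards majority-class rows; the alternative, duplicating minority-class rows, keeps all data at the cost of repeated records.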
Methodology
Models used
Demographic    Psychographic
Results
Demographic
C 5.0 with balancing
C 5.0 with balancing, boosting and misclassification costs
C 5.0 with balancing and boosting
C 5.0 with balancing, boosting and misclassification costs has the highest model accuracy for te…
It also has the highest number of true positives
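The two quantities compared on this slide, model accuracy and the number of true positives, both come from a confusion matrix. The sketch below shows how they are derived; the labels are invented for illustration and the helper names `confusion_counts` and `accuracy` are hypothetical, so none of the numbers correspond to the slide's results.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a binary target."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Share of all cases that were classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())

# Invented example labels: does the user buy online?
actual    = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]

matrix = confusion_counts(actual, predicted)
print(matrix, accuracy(matrix))
```

A model can raise its true-positive count simply by predicting "yes" more often, which is why accuracy and true positives are worth reading together.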
Results
Demographic
C 5.0 with balancing
C 5.0 with balancing, boosting and misclassification costs
C 5.0 with balancing and boosting
C 5.0 with balancing, boosting and misclassification costs has the highest amount of lift
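Lift compares how concentrated the positive cases are among the model's highest-scored records with how common they are overall. The following is a minimal sketch of that calculation under stated assumptions: the function `lift_at`, the scores, and the 0/1 labels are all made up for illustration and are not taken from the lift curves on the slide.

```python
def lift_at(scores, labels, fraction):
    """Lift of the top `fraction` of records ranked by score (labels are 0 or 1)."""
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate

# Invented model scores and true outcomes (1 = buys online).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

for fraction in (0.2, 0.5, 1.0):
    print(fraction, lift_at(scores, labels, fraction))
```

A lift of 1.0 means the model does no better than picking records at random; over the full dataset the lift is always 1.0.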
Results
Demographic - Balanced C 5.0 model with boosting and misclassification penalties
Results
Psychographic
Normal C 5.0 with balancing
C 5.0 with boosting
C 5.0 with misclassification penalties
Simple …
Neural netw…
Results
Psychographic
Normal C 5.0 with balancing
C 5.0 with boosting
C 5.0 with misclassification penalties
Simple neural network
Neural network with …
Results
Psychographic - Simple neural network
Time spent on the internet appears to be a crucial factor in the decision to purchase online
Interpretation and Concluding Remarks
Thus we can see that age, years spent on the internet and primary place of internet access are the most important characteristics that decide whether users buy online
Surprisingly, purchasing power (monthly income) and occupation had much lower importance in predicting whether a user would purchase online
However, since this data is from 1997, a time when users were new to the internet, it makes sense that familiarity with the medium was necessary in order to reach consumers
E-commerce shoppers during this era were early adopters in the product lifecycle; they were technology savvy, already used to the internet, and trusted it as well.
E-commerce websites ther… only advertise online since the cost of advertising as ag… since their buyers were ma… familiar with th…