Data mining example

Artificial intelligence algorithms performance analysis. Aamir Khan, [email protected], IBA Karachi, Pakistan

Transcript of Data mining example

Page 1: Data mining example

Artificial intelligence algorithms performance analysis

Aamir Khan [email protected]

IBA Karachi, Pakistan

Page 2: Data mining example

Data to be analyzed

Page 3: Data mining example

Task:

• Use KNIME to perform classification on the given dataset. The dataset is taken from the UCI Machine Learning Repository and uses US census data to predict whether a person's income exceeds $50K/yr. Details of the dataset can be found at http://archive.ics.uci.edu/ml/datasets/Adult

• You have to experiment with the different classification approaches discussed in the course (i.e. Decision Tree, Naive Bayes, Neural Networks) using different sets of attributes. You may first need to do some data pre-processing to clean your data.

• You are required to submit a report describing the different experiments that you conducted (along with screenshots of Weka/KNIME wherever appropriate) and their results.

Page 4: Data mining example

Naïve Bayes Workflow

Page 5: Data mining example

String Manipulation for all three, due to errors in the data
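
The slide does not say which errors were corrected. As a hedged illustration only, the sketch below assumes two quirks often seen in the raw Adult files: fields padded with spaces after each comma, and class labels in the test file ending with a period. The function name `clean_row` is hypothetical and the sample row is made up.

```python
def clean_row(line):
    """Split one comma-separated record and tidy its fields."""
    # Remove the padding spaces around every field.
    fields = [field.strip() for field in line.split(",")]
    # Assumption: only the class label (last field) may carry a trailing period.
    fields[-1] = fields[-1].rstrip(".")
    return fields


# Made-up sample record in the style of the Adult test file.
sample = "39, State-gov, Bachelors, Never-married,  <=50K."
cleaned = clean_row(sample)
print(cleaned)
```

In KNIME the same kind of fix would be done with string functions in the String Manipulation node rather than in Python.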

Page 6: Data mining example

Naïve Bayes Learner
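
To make the idea behind the learner concrete, here is a minimal sketch of a Naive Bayes classifier for categorical attributes with add-one (Laplace) smoothing. It is not the KNIME node's implementation; the function names and the tiny training set are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    distinct_values = defaultdict(set)   # attribute index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values


def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(distinct_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data: (education, marital status) -> income class.
rows = [
    ("Bachelors", "Married"),
    ("HS-grad", "Never-married"),
    ("Masters", "Married"),
    ("HS-grad", "Married"),
    ("Bachelors", "Never-married"),
]
labels = [">50K", "<=50K", ">50K", "<=50K", "<=50K"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("Masters", "Married")))
print(predict(model, ("HS-grad", "Never-married")))
```

The sketch handles nominal attributes only; numeric attributes such as age would need a separate treatment, for example a per-class Gaussian.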

Page 7: Data mining example

Normalizer
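
The slide does not state which normalization mode was chosen. Assuming min-max scaling to the range [0, 1], the transformation can be sketched as follows; the function name and the sample ages are illustrative, not taken from the dataset.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Made-up ages, used only to show the effect of the scaling.
ages = [17, 39, 50, 90]
scaled = min_max_normalize(ages)
print(scaled)
```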

Page 8: Data mining example

Confusion Matrix and Accuracy of Naïve Bayes
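
For reference, accuracy is the share of records on the diagonal of the confusion matrix. The sketch below shows that calculation on made-up labels; it does not reproduce the numbers from the screenshot, and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Correct predictions (actual == predicted) divided by all predictions."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


# Made-up labels for five records.
actual = ["<=50K", "<=50K", ">50K", ">50K", "<=50K"]
predicted = ["<=50K", ">50K", ">50K", "<=50K", "<=50K"]

matrix = confusion_matrix(actual, predicted)
print(matrix)
print(accuracy(matrix))
```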

Page 9: Data mining example

Decision Tree Workflow

Page 10: Data mining example

Decision tree learner
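
The configuration used in the screenshot is not given in the transcript. Assuming the Gini index as the split quality measure, a candidate split can be scored as in the sketch below; the function names and labels are illustrative only.

```python
from collections import Counter


def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0 means a pure node."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def split_gini(groups):
    """Size-weighted Gini impurity of the child nodes produced by a split."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * gini_impurity(group) for group in groups)


# Made-up node with two records of each class.
parent = ["<=50K", "<=50K", ">50K", ">50K"]
perfect_split = [["<=50K", "<=50K"], [">50K", ">50K"]]

print(gini_impurity(parent))
print(split_gini(perfect_split))
```

A tree learner using this measure prefers the split with the lowest weighted impurity, so the perfect split here would be chosen over leaving the node unsplit.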

Page 11: Data mining example

Confusion matrix and accuracy for Decision tree

Page 12: Data mining example

ANN workflow

Page 13: Data mining example

Column filter for ANN

Page 14: Data mining example

Page 15: Data mining example

Confusion matrix and Accuracy for ANN

Page 16: Data mining example

Conclusion

• The results of our workflows show that, with our configurations, the given data is classified with an accuracy of 76.5% by the ANN, 76.4% by Naïve Bayes and 83.2% by the Decision tree.

• Hence the Decision tree gives the best result on the provided data.