Project MLExAI: Machine Learning Experiences in AI

15
Project MLExAI Project MLExAI Machine Learning Experiences Machine Learning Experiences in AI in AI http://uhaweb.hartford.edu/compsci/ccli http://uhaweb.hartford.edu/compsci/ccli Ingrid Russell, University of Hartford Ingrid Russell, University of Hartford Zdravko Markov, Central Connecticut State University Zdravko Markov, Central Connecticut State University Todd Neller, Gettysburg College Todd Neller, Gettysburg College

Transcript of Project MLExAI: Machine Learning Experiences in AI

Page 1: Project MLExAI: Machine Learning Experiences in AI

Project MLExAIProject MLExAIMachine Learning Experiences in AIMachine Learning Experiences in AI

http://uhaweb.hartford.edu/compsci/cclihttp://uhaweb.hartford.edu/compsci/ccli

Ingrid Russell, University of HartfordIngrid Russell, University of Hartford

Zdravko Markov, Central Connecticut State UniversityZdravko Markov, Central Connecticut State University

Todd Neller, Gettysburg CollegeTodd Neller, Gettysburg College

Page 2: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Project GoalProject Goal

The project goal is to develop a framework for teaching core AI The project goal is to develop a framework for teaching core AI topics with a unifying theme of machine learning. A suite of hands-topics with a unifying theme of machine learning. A suite of hands-on term-long projects are developed, each involving the design and on term-long projects are developed, each involving the design and implementation of a learning system that enhances a commonly-implementation of a learning system that enhances a commonly-deployed application. deployed application.

Page 3: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Project ObjectivesProject Objectives

Enhance the student learning experience in the AI course by Enhance the student learning experience in the AI course by implementing a unifying theme of machine learning to tie together implementing a unifying theme of machine learning to tie together the diverse topics in the AI course.the diverse topics in the AI course.

Increase student interest and motivation to learn AI by providing a Increase student interest and motivation to learn AI by providing a framework for the presentation of the major AI topics that framework for the presentation of the major AI topics that emphasizes the strong connection between AI and computer emphasizes the strong connection between AI and computer science.science.

Highlight the bridge that machine learning provides between AI Highlight the bridge that machine learning provides between AI technology and modern software engineering. technology and modern software engineering.

Introduce students to an increasingly important research area, thus Introduce students to an increasingly important research area, thus motivating them to pursue more advanced courses in machine motivating them to pursue more advanced courses in machine learning and to pursue undergraduate research projects in this area. learning and to pursue undergraduate research projects in this area.

Page 4: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Features of MLExAI ProjectsFeatures of MLExAI Projects

Teaching AI with hands-on experiments.Teaching AI with hands-on experiments.Common features in different AI fields are unified through the theme Common features in different AI fields are unified through the theme of machine learning. of machine learning. Emphasis on application of ideas through implementation.Emphasis on application of ideas through implementation.Varying levels of mathematical sophistication with implementation of Varying levels of mathematical sophistication with implementation of concepts being central to the learning process.concepts being central to the learning process.Design and implementation of learning systems.Design and implementation of learning systems.Practical approach that includes real-world applications.Practical approach that includes real-world applications.Easily adaptable and customizable.Easily adaptable and customizable.Various emphases, backgrounds and prerequisites that can serve Various emphases, backgrounds and prerequisites that can serve different goals within the general framework of teaching AI.different goals within the general framework of teaching AI.

Page 5: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

MLExAI ProjectsMLExAI Projects

Web Document ClassificationWeb Document Classification: Investigates the : Investigates the process of tagging web pages using a topic process of tagging web pages using a topic directory structure and applies machine learning directory structure and applies machine learning techniques for automatic tagging.techniques for automatic tagging.Data Mining for Web User Profiling Using Data Mining for Web User Profiling Using Decision Tree LearningDecision Tree Learning: Focuses on the use of : Focuses on the use of decision tree learning to create models of web decision tree learning to create models of web users.users.Character Recognition Using Neural NetworksCharacter Recognition Using Neural Networks: : Involves the development of a character Involves the development of a character recognition system based on a neural network recognition system based on a neural network model.model.

Page 6: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

MLExAI ProjectsMLExAI Projects

Explanation-Based Learning and the N-Puzzle ProblemExplanation-Based Learning and the N-Puzzle Problem: : Involves the application of explanation-based learning to Involves the application of explanation-based learning to improve the performance of uninformed search improve the performance of uninformed search algorithms when solving the N-Puzzle problem.algorithms when solving the N-Puzzle problem.Reinforcement Learning for the jeopardy Dice Game Reinforcement Learning for the jeopardy Dice Game “Pig”“Pig”: Students model the game and several variants, : Students model the game and several variants, implementing dynamic programming and value iteration implementing dynamic programming and value iteration algorithms to compute optimal play.algorithms to compute optimal play.Getting a Clue with Boolean SatisfiabilityGetting a Clue with Boolean Satisfiability: We use SAT : We use SAT solvers to deduce card locations in the popular board solvers to deduce card locations in the popular board game Clue, illustrating principles of knowledge game Clue, illustrating principles of knowledge representation and reasoning, including resolution representation and reasoning, including resolution theorem proving.theorem proving.

Page 7: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Sample Project: Sample Project: Web Document ClassificationWeb Document Classification

GoalGoalTo investigate the process of tagging web To investigate the process of tagging web pages using the topic directory structure pages using the topic directory structure and apply machine learning techniques for and apply machine learning techniques for automatic tagging.automatic tagging.

Page 8: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Web Document ClassificationWeb Document Classification

Project PhasesProject Phases

Data collection – collecting a set of 100 web documents grouped by Data collection – collecting a set of 100 web documents grouped by topic. Will serve as our training set.topic. Will serve as our training set.

Feature extraction and data preparation – web documents will be Feature extraction and data preparation – web documents will be represented by feature vectors, which in turn are used to form a represented by feature vectors, which in turn are used to form a training data set for the Machine Learning stage.training data set for the Machine Learning stage.

Machine learning – applying learning algorithms to create models of Machine learning – applying learning algorithms to create models of the data sets. Using these models the accuracy of the initial topic the data sets. Using these models the accuracy of the initial topic structure is evaluated and new web documents are classified into structure is evaluated and new web documents are classified into existing topics.existing topics.

Analysis – identifying relations between approaches used in the Analysis – identifying relations between approaches used in the project and AI areas of search and knowledge representation and project and AI areas of search and knowledge representation and reasoning (KR&R) reasoning (KR&R)

Page 9: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Phase I: Data CollectionPhase I: Data Collection

Topic 1Topic 1 Topic 2Topic 2 Topic 3Topic 3Computers: Artificial Intelligence: Computers: Artificial Intelligence: Machine LearningMachine Learning Topic 1Topic 1 Topic 2 Topic 4Topic 2 Topic 4Computers: Artificial Intelligence: Computers: Artificial Intelligence: AgentsAgents Topic 1Topic 1 Topic 5 Topic 5 Topic 6 Topic 6Computers: Algorithms: Computers: Algorithms: Sorting and SearchingSorting and Searching Topic 1Topic 1 Topic 7 Topic 8Topic 7 Topic 8Computers: Multimedia: Computers: Multimedia: MPEGMPEG Topic 1 Topic 9Topic 1 Topic 9

ComputersComputers: : HistoryHistory

Page 10: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Phase II: Feature Extraction and Phase II: Feature Extraction and Data CollectionData Collection

Words Frequency Words Frequency Words Frequency Words Frequency Words Frequencyand 10 the 97 the 36 2 60 the 65you 6 of 45 of 31 the 39 of 32to 5 and 34 to 17 data 20 in 27is 5 is 31 in 16 of 20 is 26in 5 to 27 n 13 The 18 a 24

code 4 1 26 Flashsort 12 1 17 n 20Visual 4 a 25 is 12 a 16 to 18

are 4 data 23 a 9 and 15 quicksort 18for 4 63 21 k 9 for 15 i 16

section 4 in 21 A 9 l 14 case 15of 4 27 20 The 8 sort 13 element 15

Basic 4 64 20 L 7 to 12 j 13The 4 9 20 time 7 gap 12 x 12a 4 number 19 be 6 is 11 sequence 11

algorithms 4 14 19 permutation 6 with 11 and 11C 3 72 19 and 6 table 10 elements 11

that 3 58 19 elements 6 ordered 9 worst 10sorting 2 sort 18 cycle 6 sequences 8 are 9

dictionaries 2 be 18 m 6 0 7 by 8this 2 pass 15 vector 5 gapidx 7 log 8

Source 2 The 15 for 5 input 7 that 8may 2 it 15 by 5 count 7 part 8be 2 It 15 algorithm 5 random 6 than 8

This 2 with 14 an 5 factor 6 The 8language 2 performance 12 in-situ 5 are 6 equal 7source 2 sorting 12 are 4 comb 6 it 7

structures 2 algorithm 12 new 4 N 6 as 7files 2 then 11 algorithm 4 tooth 5 time 7with 2 sorted 11 that 4--------------------------------------------------------------------------------5 average 7

document 2 that 11 Neubert 4 beach 5 lo 6programming 2 are 11 page 3 it 5 be 6

I 2 algorithms 10 find 3 in 5 comparison 6data 2 will 10 sorted 3 roughly 5 or 6the 2 elements 10 sorting 3 be 5 two 6If 2 through 9 small 3 switchCnt 4 Figure 6

operations 1 sort 9 leaders 3 current 4 recursion 6just 1 for 9 codes 3 from 4 parts 6

including 1 can 8 O 3 pre-defined 4 hi 6thanks 1 its 8 step 3 end 4 all 5

go 1 small 8 number 3 domain 4 sorted 5high-level 1 exchanges 8 array 3 83 4 first 5

restrictions 1 all 7 length 3 array 4 int 5whose 1 on 7 bits 3 than 4 2 5Hash 1 element 7 with 3 items 4 on 5

nervous 1 one 7 at 3 turtles 4 algorithm 5send 1 Data 7 tag 3 as 4 less 4

appreciated 1 selection 7 element 3 cliff 4 O 4presents 1 sub-list 7 - 3 on 3 one 4

freely 1 different 7 which 3 this 3 with 4notation 1 by 7 Karl-Dietrich 3 can 3 partitioning 4format 1 program 6 position 3 element 3 this 4read 1 quick 6 class 3 22 3 while 4

familiar 1 items 6 auxiliary 3 chosen 3 R 4

51 2 3 4

Page 11: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Phase III: Machine LearningPhase III: Machine Learning

Page 12: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Phase III: Machine LearningPhase III: Machine Learning

Actual Page Classification

Pre

dic

ted

Pa

ge

Cla

ssifi

catio

n

===== Stratified Cross-Validation =====

Correctly Classified Inst.(101) 87.069 %Incorrectly Classified Inst.(15) 12.931 %Kappa statistic 0.8379Mean absolute error 0.0633Root mean squared error 0.2245Relative absolute error 19.7946 %Root relative squared error 56.1278 %Total Number of Instances 116

Actual Page Classification

Pre

dic

ted

Pa

ge

Cla

ssifi

catio

n

===== Stratified Cross-Validation =====

Correctly Classified Inst.(101) 87.069 %Incorrectly Classified Inst.(15) 12.931 %Kappa statistic 0.8379Mean absolute error 0.0633Root mean squared error 0.2245Relative absolute error 19.7946 %Root relative squared error 56.1278 %Total Number of Instances 116

Error Analysis PlotResults

Page 13: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Preliminary ExperiencesPreliminary Experiences

Preliminary results were positive and showed that Preliminary results were positive and showed that students had good experiences in the classes. students had good experiences in the classes.

While covering the main AI topics, the course provided While covering the main AI topics, the course provided students with an introduction to and an appreciation of students with an introduction to and an appreciation of an increasingly important area in AI, Machine Learning. an increasingly important area in AI, Machine Learning.

Using a unified theme proved to be helpful and Using a unified theme proved to be helpful and motivating for the students. Students saw how simple motivating for the students. Students saw how simple search programs evolve into more interesting ones, and search programs evolve into more interesting ones, and finally into a learning framework with interesting finally into a learning framework with interesting theoretical and practical properties. theoretical and practical properties.

Page 14: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

Preliminary Experiences: Student QuotesPreliminary Experiences: Student Quotes

Working on the project was a great experience. I Working on the project was a great experience. I was able to see how various AI concepts tie was able to see how various AI concepts tie together in developing a machine learning system. together in developing a machine learning system. I was amazed by the wide range of applications of I was amazed by the wide range of applications of machine learning in various aspects of our lives. machine learning in various aspects of our lives.

I liked acquiring knowledge about machine learning I liked acquiring knowledge about machine learning techniques and being able to implement a system techniques and being able to implement a system and see it work. This gave me a concrete and see it work. This gave me a concrete understanding of the concepts.understanding of the concepts.

The project was really neat. I was challenged to The project was really neat. I was challenged to strengthen my deductive reasoning skills by strengthen my deductive reasoning skills by formalizing the process by which I derive solutions. formalizing the process by which I derive solutions. The problems associated with “satisfiability” are fun The problems associated with “satisfiability” are fun to work out, but they also provide me with an to work out, but they also provide me with an intellectual challenge.intellectual challenge.

I liked the fact that the project I worked on I liked the fact that the project I worked on pertained to the Internet and web document pertained to the Internet and web document classification. It presented a useful real world classification. It presented a useful real world application of machine learning. Often, examples in application of machine learning. Often, examples in the book or other projects lack real world the book or other projects lack real world usefulness.usefulness.

Page 15: Project MLExAI: Machine Learning Experiences in AI

NSF CCLI Showcase, March 1-5, 2006. Houston, TXNSF CCLI Showcase, March 1-5, 2006. Houston, TX

AcknowledgementAcknowledgement

This work is supported in part by National This work is supported in part by National Science Foundation grant DUE CCLI-A&I Award Science Foundation grant DUE CCLI-A&I Award Number 0409497. Number 0409497.