1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

24
1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011

Transcript of 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

Page 1: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

1

CSC 9010 Spring 2011. Paula Matuszek

CSC 9010

ANN Lab

Paula Matuszek

Spring, 2011

Page 2: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

2

CSC 9010 Spring 2011. Paula Matuszek

Knowledge-Based Systems and Artificial Neural Nets• Knowledge Representation means:

– Capturing human knowledge

– In a form computer can reason about

• Is this relevant to ANNs?

• Consider the question we’ve been modeling all semester:– what class should I take?

• What would be involved in representing this as a neural net?

Page 3: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

3

CSC 9010 Spring 2011. Paula Matuszek

Knowledge for ANNs• What human knowledge do we still need to

capture?– Input variables– Output variables– Training and test cases

• Let’s look at some.– http://www.aispace.org/neural/

Page 4: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

4

CSC 9010 Spring 2011. Paula Matuszek

Example from AISpace• Mail

– Examples– Set properties– Initialize the parameters– Solve– How do we use it? Calculate output

Page 5: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

5

CSC 9010 Spring 2011. Paula Matuszek

Our “Which class to take” problem• Inputs?

• Outputs?

• Sample data

Page 6: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

6

CSC 9010 Spring 2011. Paula Matuszek

Some Examples• Example 1:

– 3 inputs, 1 output, all binary

• Example 2:– same inputs, output inverted

Page 7: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

7

CSC 9010 Spring 2011. Paula Matuszek

Getting the right inputs• Example 3

– Same inputs as 1 and 2– Same output as 1– Outcomes reversed for half the cases

Page 8: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

8

CSC 9010 Spring 2011. Paula Matuszek

Getting the right inputs• Example 3

– Same inputs as 1 and 2– Same output as 1– Outcomes reversed for half the cases

• Network is not converging

• The output here cannot be predicted from these inputs.

• Whatever is determining whether to take the class, we haven’t captured it

Page 9: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

9

CSC 9010 Spring 2011. Paula Matuszek

Representing non-numeric values• Example 4

– Required is represented as “yes” or “no”

Page 10: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

10

CSC 9010 Spring 2011. Paula Matuszek

Representing non-numeric values• Example 4

– Required is represented as “yes” or “no”

• Actual model still uses 1 and 0; transformation is done by applet.

Page 11: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

11

CSC 9010 Spring 2011. Paula Matuszek

More non-numeric values• Example 5

• Workload is low, med, high: text values but they can be ordered.

Page 12: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

12

CSC 9010 Spring 2011. Paula Matuszek

More non-numeric values• Example 5

• Workload is low, med, high: text values but they can be ordered.

• Applet asks us to assign values. 1, 0.5, 0 is typical.

Page 13: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

13

CSC 9010 Spring 2011. Paula Matuszek

Unordered values• Example 6

– Input variables here include professor– Non-numeric, can’t be ordered.

Page 14: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

14

CSC 9010 Spring 2011. Paula Matuszek

Unordered values• Example 6

– Input variables here include professor– Non-numeric, can’t be ordered.

• Still need numeric values

• Solution is to treat n possible values as n separate binary values

• Again, applet does this for us

Page 15: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

15

CSC 9010 Spring 2011. Paula Matuszek

Variables with more values• Example 7

– GPA and number of classes taken are integer values

• Takes considerably longer to solve

• Looks for a while like it’s not converging

Page 16: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

16

CSC 9010 Spring 2011. Paula Matuszek

And Reals• Example 8

– GPA is a real.

• Takes about 20,000 steps to converge

Page 17: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

17

CSC 9010 Spring 2011. Paula Matuszek

And multiple outputs• Small Car database from AIspace

• For any given input case, you will get a value for each possible outcome.

• Typical for, for instance, character recognition.

Page 18: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

18

CSC 9010 Spring 2011. Paula Matuszek

Training and Test Cases• The basic training approach will fit the training

data as closely as possible.

• But we really want something that will generalize to other cases

• This is why we have test cases.– The training cases are used to compute the weights

– The test cases tell us how well they generalize

• Both training and test cases should represent the overall population as well as possible.

Page 19: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

19

CSC 9010 Spring 2011. Paula Matuszek

Representative Training Cases• Example 9

– Training cases and test cases are similar

Page 20: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

20

CSC 9010 Spring 2011. Paula Matuszek

Representative Training Cases• Example 9

– Training cases and test cases are similar– (actually identical...)

• Training error and test error are comparable

Page 21: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

21

CSC 9010 Spring 2011. Paula Matuszek

Non-Representative Training Cases• Example 10

– Training cases and test cases represent different circumstances

– We’ve missed including any cases involving Lee in the training

• Training error goes down, but test error goes up.

• In reality these are probably bad training AND test cases; neither seems representative.

Page 22: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

22

CSC 9010 Spring 2011. Paula Matuszek

So:• Getting a good ANN still involves

understanding your domain and capturing knowledge about it– choosing the right inputs and outputs

– choosing representative training and test sets

– Beware “convenient” training sets

• You can represent any kind of variable: numeric or not, ordered or not.

• Not every set of variables and training cases will produce something that can be trained.

Page 23: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

23

CSC 9010 Spring 2011. Paula Matuszek

Once it’s trained...• When your ANN is trained, you can feed it a specific

set of inputs and get one or more outputs.

• These outputs are typically interpreted as some decision:– take the class

– this is probably a “5”

• The network itself is black box.

• If the situation changes the ANN should be retrained– new variables

– new values for some variables

– new patterns of cases

Page 24: 1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011.

24

CSC 9010 Spring 2011. Paula Matuszek

One last note• These have all been simple cases, as

examples

• Most of my examples could in fact be predicted much more easily and cleanly with a decision tree, or even a couple of IF statements

• A more typical use for any connectionist system has many more inputs and many more training cases