Encoding, Validation and Verification Chapter 1. Introduction This presentation covers the...

Encoding, Validation and Verification

Chapter 1

Introduction

• This presentation covers the following:

– Data encoding
– Data validation
– Data verification

Data encoding

• This is a method of changing the way we represent data.

• We do this to standardise the data we are dealing with.

• The original data is not stored...only the representation of it.

Data encoding

• Some codes are easier to work out than others:

MON TUE WED JAN FEB MAR

• For some, you will need a key.

VXCORBLA FDFOCGRE

VX = Vauxhall    FD = Ford
COR = Corsa      FOC = Focus
BLA = Black      GRE = Green
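A key like this can be applied mechanically. The sketch below is only an illustration: it assumes every code is laid out as 2 letters for the make, 3 for the model and 3 for the colour, and the function name `decode` is made up for this example.

```python
# Key taken from the slide. The fixed 2+3+3 layout is an assumption.
KEY = {
    "VX": "Vauxhall", "FD": "Ford",
    "COR": "Corsa", "FOC": "Focus",
    "BLA": "Black", "GRE": "Green",
}

def decode(code):
    """Split a code into make, model and colour, then look each part up in the key."""
    parts = [code[:2], code[2:5], code[5:8]]
    return " ".join(KEY[part] for part in parts)

print(decode("VXCORBLA"))
print(decode("FDFOCGRE"))
```

Without the key, a code such as VXCORBLA is much harder to read than MON or JAN.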

Take note:

• Create your own encoded data with a key so someone else will understand how it works.

• Look at someone else’s encoded data. Are there any limitations with their code? Could it cause any problems or confusion?

• Explain to that person why you think it is either fine or needs some improvement.

Problems with encoding

• If you encode data it may become less accurate.

• You may end up limiting the possible number of data entries.

• For example, cars come in lots of different colours, but if you limit the choices to Red, Blue, Black, Silver, etc., you prevent the actual colours from being entered.

– Star Silver and Lightning Silver are different...but encoding may regard them both as silver.

– This would be inaccurate and the validity of your data could be questioned.
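The loss of detail can be shown in a few lines. This is a sketch only: the three-letter codes and the function name `encode_colour` are hypothetical, and only the two silver names come from the slide.

```python
# Hypothetical colour codes. Two different paint names share one code.
COLOUR_CODES = {
    "Star Silver": "SIL",
    "Lightning Silver": "SIL",
    "Black": "BLA",
}

def encode_colour(name):
    """Return the short code stored in place of the full colour name."""
    return COLOUR_CODES[name]

first = encode_colour("Star Silver")
second = encode_colour("Lightning Silver")
print(first, second, first == second)
```

Once both cars are stored as SIL, there is no way to get the original colour names back from the stored data.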

Problems with encoding

• Asking people questions often returns different responses.

– “Did you enjoy the race?”
– “It was good”
– “It was alright...got a bit boring in places”
– “Fantastic...I am glad he won”.

• Responses can be similar but not always the same. This means that we sometimes have to apply a judgement on how best to collect the response.

• If we had a scale from 1-4 (1=good, 4=rubbish) then where would we put the comments?

• Again, if more than one person is collecting the data, will their judgements be the same?

Problems with encoding

• Another problem occurs when you come across some data that won’t fit in with your encoding system.

• This means that you have to re-encode your data, which takes time and can also lead to mistakes being made.

• If inaccuracies do occur how do you know if that data is incorrect? People might still assume the data is fine which could lead to more problems!

Encoding = Good Stuff!

• Computers have a limited storage capacity.

• If you encode data you can reduce the amount of storage space needed. When you are dealing with thousands of records the space saved is huge!

• Also, it can be quicker to enter coded data. It doesn’t have to be less accurate either.
– For example, M = Male, F = Female.

• A computer can also carry out validation checks on the encoded data to make sure it is valid.
– For example, if it is not M or F then there must be a mistake.
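The M/F example can be sketched as follows. The function names are made up for illustration; the codes themselves come from the slide.

```python
# Codes from the slide: M = Male, F = Female.
GENDER_CODES = {"Male": "M", "Female": "F"}

def encode_gender(value):
    """Store a one-letter code instead of the full word."""
    return GENDER_CODES[value]

def is_valid_gender_code(code):
    """Anything other than M or F must be a mistake."""
    return code in GENDER_CODES.values()

print(encode_gender("Female"))
print(is_valid_gender_code("M"))
print(is_valid_gender_code("Q"))
```

Storing one letter instead of a word of up to six letters is where the space saving comes from when there are thousands of records.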

Take note:

• What is meant by encoding data?

• Describe three advantages of encoding data.

• Describe three disadvantages of encoding data.

• Give an example of how data can be encoded.

• Give two situations where the encoding of data is appropriate. For each situation, explain why data needs to be encoded.

Validation

• Validating data can be done using the following methods:
– Range check
– Type check
– Presence check
– Length check
– Lookup check
– Picture check
– Check digit

Range Check

• A range check is very simple.

• It sets a lower and an upper boundary, and a value is only accepted if it falls between them.

• For instance, 0-100. The number 50 would be accepted as it falls within the boundaries, but the number 101 would exceed the boundary and thus be rejected.
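A minimal sketch of the 0-100 example, assuming both boundaries are themselves allowed values (the slide does not say); the function name `range_check` is illustrative.

```python
def range_check(value, lower=0, upper=100):
    """Accept the value only if it lies between the boundaries (inclusive)."""
    return lower <= value <= upper

print(range_check(50))   # within the boundaries
print(range_check(101))  # exceeds the upper boundary
```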

Type Check

• This check prevents data of the wrong type from being submitted.

• For example, entering the word “two” into a field which was expecting a numerical value would return an error as “two” is in text format.
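One way to sketch a type check is to try converting the entry to a number and treat a failure as an error. This assumes the field expects a whole number; the function name `type_check` is illustrative.

```python
def type_check(text):
    """Return True if the entry can be read as a whole number."""
    try:
        int(text)
        return True
    except ValueError:
        return False

print(type_check("2"))
print(type_check("two"))
```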

Presence Check

• You come across these all the time on websites which ask for certain information to be included.

• The system will insist that you enter these pieces of data before proceeding to the next section.
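A presence check could look something like the sketch below. It assumes the form arrives as a dictionary of field names and typed values; the field names and the function name `missing_fields` are hypothetical.

```python
# Hypothetical list of fields the form insists on.
REQUIRED_FIELDS = ["name", "email"]

def missing_fields(form):
    """Return the required fields that were left empty or not supplied at all."""
    return [field for field in REQUIRED_FIELDS if not form.get(field)]

print(missing_fields({"name": "Ann", "email": ""}))
print(missing_fields({"name": "Ann", "email": "ann@example.com"}))
```

The user would only be allowed to move on to the next section when the returned list is empty.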

Length Check

• Length checks prevent more characters being entered than are allowed.

• The word “shoe” has a length of 4.

• If we set the limit to 4 then “shoes” wouldn’t be allowed.
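The “shoe” example as a short sketch; the function name `length_check` is illustrative and the limit of 4 comes from the slide.

```python
def length_check(text, max_length=4):
    """Accept the entry only if it has no more characters than the limit."""
    return len(text) <= max_length

print(len("shoe"))
print(length_check("shoe"))
print(length_check("shoes"))
```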

Lookup Check

• A lookup check takes a value and compares it to a set of values in another table.

• If a match is made then a result is returned.

• If no match is made then an error is returned.

• An example of this would be entering a student’s test score into a field and the system returning the student’s grade.
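A lookup check can be sketched with a small table held in a dictionary. The grade table below is invented for illustration (a quiz marked out of 5), and the function name `lookup_grade` is hypothetical.

```python
# Hypothetical lookup table: score out of 5 -> grade.
GRADE_TABLE = {5: "A", 4: "B", 3: "C", 2: "D", 1: "E", 0: "U"}

def lookup_grade(score):
    """Return the matching grade, or raise an error if the score is not in the table."""
    if score not in GRADE_TABLE:
        raise ValueError(f"No match found for score {score!r}")
    return GRADE_TABLE[score]

print(lookup_grade(4))
```

Calling `lookup_grade(9)` would raise a `ValueError`, which is the “no match” case described above.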

Picture Check

• Also known as an Input Mask or Format Check.

• This type of check ensures data is entered in a predefined way.

• A good example of this is when dealing with dates.

• There are many ways to submit a date:
– 01/Jan/2008
– 01/01/2008
– 1/1/08
– Etc.

• A Picture check will define how the date must be entered.
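As a sketch, a picture check for dates can be written as a pattern. This assumes the required format is DD/MM/YYYY, which is only one of the options listed; the names `DATE_PICTURE` and `picture_check` are illustrative.

```python
import re

# Assumed picture: two digits, slash, two digits, slash, four digits.
DATE_PICTURE = re.compile(r"\d{2}/\d{2}/\d{4}")

def picture_check(text):
    """Return True only if the whole entry matches the picture."""
    return DATE_PICTURE.fullmatch(text) is not None

print(picture_check("01/01/2008"))
print(picture_check("1/1/08"))
print(picture_check("01/Jan/2008"))
```

Note that this only checks the shape of the entry. A value such as 99/99/2008 has the right shape and would still pass.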

Check Digit

• A check digit is a value which is worked out by performing a calculation on a number and is then added to the end of that number.

• ISBN numbers have check digits.

• The ISBN for the textbook is:
– 978-0-340-95828-5

• The check digit is 5.

• Before 2007, when ISBNs had 10 digits, the check digit was calculated using modulus-11.

• New 13-digit ISBNs are calculated using the modulus-10 method.

Modulus-11

Remove the check digit. Then write out the numbers in a table like this. The code starts at 2 and increments by 1, going from right to left.

ISBN     0   3   4   0   9   5   8   2   8
Code    10   9   8   7   6   5   4   3   2

Multiply each number by the code below it.

ISBN     0   3   4   0   9   5   8   2   8
Result   0  27  32   0  54  25  32   6  16

Add up all the numbers. 0+27+32+0+54+25+32+6+16 = 192

Divide the total by 11. 192/11 = 17 remainder 5

Take the remainder from 11. Check Digit = 11 - 5 = 6

If the remainder is 0 the check digit is 0. If the remainder is 1 then the check digit is X.
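The divide-by-11 steps above can be sketched in a few lines. The function name `isbn10_check_digit` is made up for this example, and it expects the nine digits as a string with the check digit already removed.

```python
def isbn10_check_digit(first_nine):
    """Work out the check digit for the first nine digits of a 10-digit ISBN."""
    weights = range(10, 1, -1)  # 10, 9, 8, ..., 2 from left to right
    total = sum(int(digit) * weight for digit, weight in zip(first_nine, weights))
    remainder = total % 11
    if remainder == 0:
        return "0"
    if remainder == 1:
        return "X"
    return str(11 - remainder)

print(isbn10_check_digit("034095828"))  # the worked example: total 192, remainder 5
```

For the worked example this gives 6, matching the hand calculation.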

Modulus-10

Remove the check digit. Then write out the numbers in a table like this. From right to left, alternate the weighting code between 3 and 1.

ISBN     9   7   8   0   3   4   0   9   5   8   2   8
Code     1   3   1   3   1   3   1   3   1   3   1   3

Multiply each number by the code below it.

ISBN     9   7   8   0   3   4   0   9   5   8   2   8
Result   9  21   8   0   3  12   0  27   5  24   2  24

Add up all the numbers. 9+21+8+0+3+12+0+27+5+24+2+24 = 135

Divide the total by 10. 135/10 = 13 remainder 5

Take the remainder from 10. Check Digit = 10 - 5 = 5

If the remainder is 0 the check digit is 0. A 13-digit ISBN never uses X as a check digit.
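The 13-digit steps above (weights alternating 1 and 3, then divide by 10) can be sketched the same way. The function name `isbn13_check_digit` is illustrative; it expects the first twelve digits as a string, and the result is always a single digit from 0 to 9.

```python
def isbn13_check_digit(first_twelve):
    """Work out the check digit for the first twelve digits of a 13-digit ISBN."""
    total = 0
    for position, digit in enumerate(first_twelve):
        weight = 1 if position % 2 == 0 else 3  # 1, 3, 1, 3, ... from the left
        total += int(digit) * weight
    remainder = total % 10
    return str((10 - remainder) % 10)  # a remainder of 0 gives a check digit of 0

print(isbn13_check_digit("978034095828"))  # the worked example: total 135, remainder 5
```

For the worked example this gives 5, matching the hand calculation.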

Take note:

• In a spreadsheet, try creating a working Check Digit Checker.

• The spreadsheet should be able to calculate a check digit using the ISBN number and then compare the result with the actual check digit.

• It should say whether it is valid or not.

• To work out a remainder use the =MOD() function.

Take note:

• Use modulus-11 on these ISBN numbers.

• For numbers with incorrect digits, replace them with correct ones.

– 1-854-87918-9
– 0-552-77109-X
– 0-330-28414-3
– 0-330-34742-X
– 0-330-35183-3

Verification

• Verification does not make sure that data is correct. It makes sure that data hasn’t been changed in any way while being entered or copied.

• There are two ways of carrying out verification checks:
– Double Entry
– Manual verification

Double Entry

• Basically, this means entering the data twice.

• For example, some websites ask you to type in your email address twice. This lowers the risk of entering an address incorrectly.

• If the emails do not match the website will ask you to check them.

• However, if you enter the email address incorrectly both times and make the same mistake, then the website will miss the mistake!
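Double entry comes down to comparing the two entries. A minimal sketch, with a made-up function name and example addresses:

```python
def double_entry_check(first_entry, second_entry):
    """Return True only if both entries are exactly the same."""
    return first_entry == second_entry

# One typo is caught because the two entries differ.
print(double_entry_check("ann@example.com", "ann@exmaple.com"))

# The same typo made twice is missed because the two entries still match.
print(double_entry_check("ann@exmaple.com", "ann@exmaple.com"))
```

The second call shows the weakness described above: the check only compares the entries with each other, not with the real address.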

Manual verification

• This is like proofreading. A person may read data from a paper source and then type it into a computer system.

• Humans aren’t very reliable and often make mistakes.

• Common mistakes include:
– Transcription errors
– Transposition errors

Transcription Errors

• This may involve pressing the wrong key accidentally.

• For example:
– Surname: Mouse typed as Mowse or Mouce

Transposition Errors

• This is where two characters have been accidentally reversed.

• For example:
– Surname: Mouse typed as Muose or Moues
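To make the definition concrete, the sketch below tests whether a typed value differs from the original only by one pair of neighbouring characters being swapped. The function name `is_transposition` is made up for illustration, and it is not part of any verification method described here.

```python
def is_transposition(original, typed):
    """True if typed is the original with exactly one adjacent pair of characters swapped."""
    if len(original) != len(typed) or original == typed:
        return False
    # Positions where the two strings disagree.
    diffs = [i for i in range(len(original)) if original[i] != typed[i]]
    if len(diffs) != 2 or diffs[1] != diffs[0] + 1:
        return False
    first, second = diffs
    return original[first] == typed[second] and original[second] == typed[first]

print(is_transposition("Mouse", "Muose"))
print(is_transposition("Mouse", "Moues"))
print(is_transposition("Mouse", "Mowse"))  # a wrong key, not a swap
```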

Accuracy

• Just because we make use of validation and verification checks doesn’t mean data is accurate.

• For example, a wrong number could still pass a range check if it falls within the boundaries, and a presence check can be passed simply because someone pressed the space bar in the field.
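Both cases can be shown with two deliberately simple checks. This is a sketch only: the function names, the 0-100 boundaries and the exam marks are assumptions made for the example.

```python
def presence_check(text):
    """Naive presence check: anything typed at all counts as present."""
    return len(text) > 0

def range_check(value, lower=0, upper=100):
    """Accept any value within the boundaries (inclusive)."""
    return lower <= value <= upper

# A single space passes the presence check even though nothing useful was entered.
print(presence_check(" "))

# Suppose the real mark was 51 but 15 was typed. It is wrong, yet still in range.
print(range_check(15))
```

Both calls report the data as valid, yet neither entry is accurate.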

Take note:

• Describe two methods of verification.

• Give two disadvantages of double entry verification.

• Give one advantage of manual verification.

• Explain why verification and validation cannot ensure that data is entered accurately, and explain why they are still useful despite these problems.