IT Applications Theory Slideshows Data Validation By Mark Kelly mark@vceit Vceit
-
Upload
eliana-allen -
Category
Documents
-
view
36 -
download
1
description
Transcript of IT Applications Theory Slideshows Data Validation By Mark Kelly mark@vceit Vceit
Data Validation• The step taken after data input.• Ensures data are reasonable.• Cannot check for accuracy of data.
• Helps prevents processing of bad data to create bad information.
• GIGO = garbage in, garbage out
Validation
• E.g. a form says a VCE student was born in England.
• Validation can tell it’s reasonable - but it still may be complete rubbish.
• Another form that says the student was born in 45.67 is clearly not reasonable.
What’s reasonable?
• The main validation checks:–Type check–Range check–Existence check (or presence check)
Type check
• A value should be of the right type– Names should be text– Dates should be dd/mm/yyyy format– Number of pets should be numeric– A photo field should contain a JPG image rather
than a spreadsheet– A credit card number field rejects the letter “O”
typed instead of a zero
Range check
• Data should be within a certain range– Kindergarten students’ ages should be between 3
and 6– Number of pets should be >= 0
• Data must exist in a limited list of options– State of residence must be in (Vic, SA, Qld, WA,
ACT, Tas, NT). Sex must be M or F.
Range check
• Data must be unique– e.g. a phone number, customer ID
• Data must be of a minimum, maximum or fixed length – e.g. credit card number has 16 digits
• Data must have a given format– E.g. user ID must have 3 letters and 2 digits– Date must be in dd-mm-yyyy format
Existence check
• Is the data present?• In a pizza delivery list, leaving off the
customer’s name may not be vital, but leaving off the address is.
• Existence check can reject records that lack vital data.
Warning!
• Don’t make validation so strict that valid data are rejected.
• Data can be unexpected but accurate!– kindergarten child could be 8 years old in rare
circumstances– mandating the entry of a phone number would
wrongly reject people with no phone
• Validation should reject absurd data, not slightly odd data!
Bad validation• American websites that insist on a 5 digit zip
code (postcode) – even if you’re not American!• Databases that reject addresses with no Street/
Avenue/Road/Lane (etc) value: some addresses don’t have them.
• Testing for existence in a limited list when the list is not actually limited. (e.g. Title must be one of:Mr, Miss, Mrs, Dr, Sir, Madam’)
Manual & Electronic• Some data needs a human’s common sense to
detect unreasonable entries– Spelling of names– Entry of Adam Sandler in a database of talented
actors– “Jane Smith” registered as a male
• Other checks are better done electronically– Faster, more accurate to get a database to check
for missing ID number values
Some validation techniquesCheck digits • Used for numerical data. An extra digit is
added to a number which is calculated from the digits.
• The computer checks this calculation when data are entered, e.g., The ISBN for a book. The last digit is a check digit calculated using a modulus 11 method.
• Detects mis-typing.
Some validation techniquesSpelling and grammar check • Looks for spelling and grammatical errors. Consistency ChecksChecks fields to ensure data in these fields
corresponds, e.g., If Title = "Mr.", then Gender = "M".
To prevent invalid data entry
Drop-down menus or lists• Force data to be entered from a pre-made
selection– States of Australia– Months of the year
Helping users• Give examples or advice about expected data
formats…
Date (dd/mm/yyyy)Enter your 10-character regocode (case sensitive) __ __ __ __ - __ __ __ __ - __ __
Note how this form clearly indicates which fields are
required and which are optional
Calendar Control• Date cannot be invalid since it must be chosen from a set of valid dates• Can still be inaccurate however!
Calendar control helps users
• User does not have to guess correct date format expected
• 23/08/09?• 23-08-2009?• 23 Aug 09?• 08/23/09 (US format)?• 092308 (Asian format)?
Calendar control helps organisations
• Date data is not ambiguous• Does 02/05/06 entered mean:– 2 May 2006?– 5 Feb 2006?– 6 May 2002?
A bad date could be prevented with good validation