Finding Information A337/A523
What are some of the possible problems with finding information?
Information often lacks STRUCTURE
ASSOCIATION between the identifying information (i.e., labels) and the actual information is not always obvious
CONSISTENCY is not always present. E.g.,
317-274-0185
(317)274-0185
3172740185
May later need to MANIPULATE data (filtering, sorting, etc.)
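The consistency problem above can be illustrated with a minimal Python sketch. It assumes 10-digit US phone numbers and an arbitrary target format of NNN-NNN-NNNN; the function name `normalize_phone` is hypothetical and not part of the course material.

```python
import re

def normalize_phone(raw: str) -> str:
    """Return a 10-digit US phone number as NNN-NNN-NNNN (assumed target format)."""
    digits = re.sub(r"\D", "", raw)  # drop every character that is not a digit
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {raw!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

# The three variants shown on the slide
variants = ["317-274-0185", "(317)274-0185", "3172740185"]
normalized = {normalize_phone(v) for v in variants}
print(normalized)  # {'317-274-0185'}
```

Once every value shares one format, filtering and sorting treat the three entries as the same number.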
Typical “Office” Applications
Word Processing
Spreadsheet
Database Management System (DBMS)
Spreadsheets and DBMSes
Columns (labels)
Rows (“instance” or record)
Intersection (value)
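As a rough illustration of columns, rows, and intersections, the sketch below models a tiny table in Python. The names, the department values, and the second phone number are made up for the example; `cell` is a hypothetical helper.

```python
# Columns are the labels; each row is one "instance" or record.
columns = ["name", "department", "phone"]
rows = [
    ["Pat Example", "Accounting", "317-274-0185"],
    ["Lee Sample", "Finance", "317-555-0100"],
]

def cell(row_index: int, column: str) -> str:
    """Return the value at the intersection of a row and a labeled column."""
    return rows[row_index][columns.index(column)]

print(cell(0, "phone"))       # 317-274-0185
print(cell(1, "department"))  # Finance
```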
Spreadsheets
Tables in MS Excel
DBMSes
Tables in MS Access
A table is one of many objects in a database
Easier to associate tables than in a spreadsheet (where it requires a lookup function, e.g., VLOOKUP)
Tables have several unique properties we’ll discuss later
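To show what associating tables looks like in a DBMS, here is a minimal sketch that uses Python's built-in `sqlite3` module in place of MS Access. The table names, column names, and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (1, 'Accounting'), (2, 'Finance');
    INSERT INTO employees VALUES (10, 'Pat Example', 1), (11, 'Lee Sample', 2);
""")

# The JOIN associates the two tables through the shared dept_id column,
# the role a VLOOKUP formula would play in a spreadsheet.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    JOIN departments AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
conn.close()

print(joined)  # [('Pat Example', 'Accounting'), ('Lee Sample', 'Finance')]
```

The association is declared once in the query, not repeated as a formula in every row.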
ERP Systems
Centralized database eliminates the need to associate data located on separate systems
Data Quality: What is Dirty Data?
Dirty data occurs when the UPC on a package doesn't match the item.
Causes?
Vendor: unique product code and cost
Retailer: unique product code and price
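The cause above (vendor and retailer each keeping their own product codes) can be sketched as a simple check. All codes, item descriptions, and the cross-reference table below are invented for illustration; `find_mismatches` is a hypothetical helper, not a method described in the source.

```python
# Each party identifies items by its own code (made-up sample data).
vendor_items = {"V-100": "garden hose 50ft", "V-200": "sprinkler"}
retailer_items = {"R-900": "garden hose 50ft", "R-901": "lawn chair"}

# Hypothetical cross-reference saying which retailer code stands for which vendor code.
cross_reference = {"V-100": "R-900", "V-200": "R-901"}

def find_mismatches(vendor_items, retailer_items, cross_reference):
    """Return cross-reference entries where the two codes describe different items."""
    mismatches = []
    for vendor_code, retailer_code in cross_reference.items():
        vendor_desc = vendor_items.get(vendor_code)
        retailer_desc = retailer_items.get(retailer_code)
        if vendor_desc != retailer_desc:
            mismatches.append((vendor_code, retailer_code, vendor_desc, retailer_desc))
    return mismatches

mismatches = find_mismatches(vendor_items, retailer_items, cross_reference)
print(mismatches)  # [('V-200', 'R-901', 'sprinkler', 'lawn chair')]
```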
Data Quality: What is Dirty Data?
Potential Problems?
Inventory, reorder
Profit per unit, net profit
Customer satisfaction, repeat business
Angry Bloggers
Solution: Same code for vendor and retailer
Data Integrity: Wal-Mart's Dirty Secret
Extract, Transform, Load (ETL)
From Computerworld QuickStudy
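A minimal sketch of the three ETL stages, assuming a small in-memory CSV as the source and an in-memory SQLite table as the destination. The sample names, the `contacts` table, and the phone format chosen in the transform step are assumptions for the example, not taken from the QuickStudy.

```python
import csv
import io
import re
import sqlite3

# Made-up source data with the inconsistent phone formats from the earlier slides
RAW_CSV = """name,phone
Pat Example,317-274-0185
Lee Sample,(317)274-0185
Kim Demo,3172740185
"""

def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: rewrite every phone number as NNN-NNN-NNNN."""
    cleaned = []
    for record in records:
        digits = re.sub(r"\D", "", record["phone"])
        cleaned.append((record["name"], f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO contacts VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
phones = [row[0] for row in conn.execute("SELECT DISTINCT phone FROM contacts")]
print(phones)  # ['317-274-0185']
```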