L10 - The GIS Database – Part 1
Chapter 8
Entity
• Bangor, Penobscot County, Maine, United States
– Centroid: 44.801° N, 68.778° W
– Area: 34.4 square miles
– Elevation: 158 feet
– Population: 31,473
Standard GIS Data Model
Linked spatial and attribute data.
What is a database
A database is any organized collection of data. Some examples of databases you may encounter in your daily life are:
– a telephone book
– T.V. Guide
– an airline reservation system
– motor vehicle registration records
– papers in your filing cabinet
– files on your computer hard drive
Data vs. information: What is the difference?
• What is data?
– Data can be defined in many ways. Information science defines data as unprocessed information.
• What is information?
– Information is data that has been organized and communicated in a coherent and meaningful manner.
– Data is converted into information, and information is converted into knowledge.
– Knowledge is information evaluated and organized so that it can be used purposefully.
Database Definitions
What is a database? It’s an organized collection of data.
What is a database management system? A database management system (DBMS) provides the software tools you need to organize that data in a flexible manner. It includes tools to add, modify, or delete data from the database, ask questions (or queries) about the data stored in the database, and produce reports summarizing selected contents.
What is the ultimate purpose of a database management system? It is to transform data into information, information into knowledge, and knowledge into action.
Data → Information → Knowledge → Action
Features of a DBMS
• Database management systems provide features to maintain the database:
– Data independence
– Integrity and security
– Transaction management
– Concurrency control
– Backup and recovery
– A language for creating and querying the database
– A language for writing application programs
Selecting a Database Management System
Database management systems (or DBMSs) can be divided into two categories: desktop databases and server databases.
• Generally speaking, desktop databases are oriented toward single-user applications and reside on standard personal computers (hence the term desktop).
• Server databases contain mechanisms to ensure the reliability and consistency of data and are geared toward multi-user applications.
Database Terminology
Relational Databases
• The relational database model is the dominant model in both the corporate and GIS worlds, due to its flexibility, organization, and functioning.
• It was defined by Edgar F. Codd (1970).
• It can accommodate a wide range of data types.
• It is not necessary to know beforehand the types of processing that will be performed on the database.
Relational Database Terminology
• Each table contains the data for a single entity.
• Each instance of an entity is a row/record/tuple in a table.
• Columns contain attributes/fields that describe the entity.
– All values of an attribute must come from the same domain (text, integer, date).
– Column order has no significance.
• Tables are related through keys.
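The terminology above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the table name `city` and its column names are illustrative assumptions, and the single row reuses the Bangor figures from the opening slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# One table per entity; each column is an attribute with a declared domain.
conn.execute(
    """
    CREATE TABLE city (
        name       TEXT,     -- attribute (field)
        county     TEXT,
        area_sq_mi REAL,
        population INTEGER
    )
    """
)

# Each row (record, tuple) is one instance of the entity.
conn.execute(
    "INSERT INTO city (name, county, area_sq_mi, population) VALUES (?, ?, ?, ?)",
    ("Bangor", "Penobscot", 34.4, 31473),
)

# Column order has no significance: attributes are requested by name.
row = conn.execute("SELECT population, name FROM city").fetchone()
print(row)
```

Note that SQLite treats declared column types loosely, so a server DBMS would typically enforce the domain of each attribute more strictly than this sketch does.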
Attributes
• An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
• Domain – the set of permitted values for each attribute.
• Attribute types:
– Simple and composite attributes
– Single-valued and multi-valued attributes
   • E.g. multivalued attribute: phone-numbers
– Derived attributes
   • Can be computed from other attributes
   • E.g. age, given date of birth
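As a small illustration of a derived attribute, the sketch below computes age from a stored date of birth rather than storing age itself. The function name `age_on` and the sample dates are assumptions made for the example.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


day_before_birthday = age_on(date(1970, 6, 15), date(2024, 6, 14))
on_birthday = age_on(date(1970, 6, 15), date(2024, 6, 15))
print(day_before_birthday, on_birthday)
```

Because the value is recomputed on demand, it can never disagree with the date of birth it is derived from.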
Keys
• A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
• A candidate key of an entity set is a minimal super key.
– customer-id is a candidate key of customer
– account-number is a candidate key of account
• Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
Keys
• A composite key is a key with more than one attribute.
• A foreign key is an attribute that is a key of one or more relations other than the one in which it appears.
Keys
[Figure: example tables with primary key and foreign key columns labeled]
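The key definitions above can be sketched with sqlite3. The table layout follows the customer/account names used on the slides, but the column types, the sample rows, and the helper `try_insert` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript(
    """
    CREATE TABLE customer (
        customer_id   TEXT PRIMARY KEY,          -- primary key
        customer_name TEXT NOT NULL
    );
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        REAL NOT NULL
    );
    CREATE TABLE depositor (
        customer_id    TEXT REFERENCES customer(customer_id),     -- foreign key
        account_number TEXT REFERENCES account(account_number),   -- foreign key
        PRIMARY KEY (customer_id, account_number)                 -- composite key
    );
    -- Hypothetical sample rows.
    INSERT INTO customer VALUES ('192-83-7465', 'Johnson');
    INSERT INTO account VALUES ('A-101', 500.0);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101');
    """
)


def try_insert(sql):
    """Return True if the statement is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with the same primary key value is rejected.
duplicate_ok = try_insert("INSERT INTO customer VALUES ('192-83-7465', 'Smith')")
# A depositor row pointing at a customer that does not exist is rejected.
orphan_ok = try_insert("INSERT INTO depositor VALUES ('000-00-0000', 'A-101')")
print(duplicate_ok, orphan_ok)
```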
Relational Algebra
• Codd’s specification of a relational database relied on relational algebra.
• Relational algebra takes tables/relations as inputs and returns tables as outputs.
• The algebra combines or splits tables by rows or by columns to generate either a subset of a table or an expanded table.
Fundamental building blocks
Tables are the fundamental building blocks of any database.
Consider a table that holds the employee information for an organization: characteristics like name, date of birth, and title. Each column of the table corresponds to a specific employee characteristic (or attribute, in database terms). Each row corresponds to one particular employee and contains his or her information.
Relational Algebra
• Five basic operators
– select: σ
– project: π
– union: ∪
– difference: −
– Cartesian product: ×
• The operators take one or two relations as inputs and produce a new relation as a result.
Relational Algebra
• Derived relational operators
– Intersection
– Divide (not used very often)
– Join
• These can be expressed using different combinations of the fundamental operators.
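A minimal sketch of the five basic operators, and of intersection derived from difference, is given below. It models a relation as a Python set of rows; that representation, the function names, and the sample rows are assumptions of this example and not how any particular DBMS implements the algebra.

```python
# A relation is modelled as a set of rows; each row is a frozenset of
# (attribute, value) pairs so that rows are hashable and duplicates vanish.

def relation(*rows):
    return {frozenset(row.items()) for row in rows}


def select(r, predicate):
    """Keep the rows for which predicate(row_as_dict) is true."""
    return {row for row in r if predicate(dict(row))}


def project(r, attributes):
    """Keep only the named columns; duplicate rows collapse."""
    return {frozenset((a, v) for a, v in row if a in attributes) for row in r}


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def product(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return {row_r | row_s for row_r in r for row_s in s}


def intersection(r, s):
    """Derived operator: r ∩ s = r − (r − s)."""
    return difference(r, difference(r, s))


# Hypothetical sample relations.
r = relation({"id": 1, "name": "Tom"}, {"id": 2, "name": "John"})
s = relation({"id": 2, "name": "John"}, {"id": 3, "name": "Ann"})
t = relation({"job": "Pilot"}, {"job": "Mechanic"})

print(len(product(r, t)))
print(intersection(r, s) == relation({"id": 2, "name": "John"}))
print(len(project(union(r, s), {"name"})))
```

Every function takes one or two relations and returns a relation, so the operators can be nested freely, as in the last line.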
Select Operation – Example
Relation r
A  B  C   D
α  α  1   7
α  β  5   7
β  β  12  3
β  β  23  10
σ A=B ∧ D>5 (r)
A  B  C   D
α  α  1   7
β  β  23  10
Select from relation r where A=B AND D>5
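The select example can be reproduced in a few lines of Python. In this sketch the C and D values come from the slide, while the A and B values (α and β) are assumed placeholders chosen to be consistent with the listed result; the helper name `select` is also an assumption.

```python
# Relation r as a list of rows. C and D come from the example; the A and B
# values are assumed placeholders, picked to be consistent with the result.
r = [
    {"A": "α", "B": "α", "C": 1, "D": 7},
    {"A": "α", "B": "β", "C": 5, "D": 7},
    {"A": "β", "B": "β", "C": 12, "D": 3},
    {"A": "β", "B": "β", "C": 23, "D": 10},
]


def select(relation, predicate):
    """Relational select: return the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


# Select from r where A = B AND D > 5.
result = select(r, lambda row: row["A"] == row["B"] and row["D"] > 5)
for row in result:
    print(row["A"], row["B"], row["C"], row["D"])
```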
Database Queries
• Queries may be made of one table or several tables at the same time.
• In many systems querying is facilitated by icons, menus, or query by example (QBE, a graphical query language).
SQL
• DDL – Data Definition Language; used to create and manage the database.
• DML – Data Manipulation Language; used to query and update the database.
SQL
• SQL: a widely used non-procedural language
– E.g. find the name of the customer with customer-id 192-83-7465:
    SELECT customer.customer_name
    FROM customer
    WHERE customer.customer_id = '192-83-7465';
– E.g. find the balances of all accounts held by the customer with customer-id 192-83-7465:
    SELECT account.balance
    FROM depositor, account
    WHERE depositor.customer_id = '192-83-7465'
      AND depositor.account_number = account.account_number;
• Application programs generally access databases through one of:
– Language extensions that allow embedded SQL
– An application program interface (e.g. ODBC/JDBC) that allows SQL queries to be sent to a database
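As one possible illustration of an application program sending SQL through an API, the sketch below runs the two example queries with Python's sqlite3 module. The column types and all sample rows are hypothetical, and the hyphenated names are written with underscores so that they are valid SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);

    -- Hypothetical sample rows.
    INSERT INTO customer VALUES ('192-83-7465', 'Johnson'), ('019-28-3746', 'Smith');
    INSERT INTO account VALUES ('A-101', 500.0), ('A-201', 900.0), ('A-215', 700.0);
    INSERT INTO depositor VALUES
        ('192-83-7465', 'A-101'), ('192-83-7465', 'A-201'), ('019-28-3746', 'A-215');
    """
)

customer_id = "192-83-7465"

# Query 1: the name of the customer with the given id.
names = conn.execute(
    "SELECT customer.customer_name FROM customer WHERE customer.customer_id = ?",
    (customer_id,),
).fetchall()

# Query 2: balances of all accounts held by that customer.
balances = conn.execute(
    """
    SELECT account.balance
    FROM depositor, account
    WHERE depositor.customer_id = ?
      AND depositor.account_number = account.account_number
    ORDER BY account.balance
    """,
    (customer_id,),
).fetchall()

print(names)
print(balances)
```

Passing the id as a `?` parameter, rather than pasting it into the SQL string, lets the API handle quoting.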
Attribute Queries
The ArcGIS Attribute Query Interface
When a table (e.g. the States table) is open, use its Options menu:
– Related tables
– Select by attributes
– Switch selection
– Clear selection
– Zoom to selected
– Delete selected
When no table is open: Selection -> Select by Attributes from the menu bar
A Spatial Query in SQL
SELECT city.name, city.geometry
FROM city, county
WHERE county.name = 'Penobscot' AND
city.geometry INSIDE county.geometry AND
city.population > 30000;
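To make the logic of the spatial query concrete, here is a rough pure-Python sketch that treats each city as a point and the county as a polygon. The coordinates are toy values in arbitrary units, the cities other than Bangor are invented, and the ray-casting function stands in for whatever INSIDE predicate a spatial database would provide.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point falls inside the polygon ring.

    Points lying exactly on the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical geometries (a square county and three point cities).
county = {"name": "Penobscot", "geometry": [(0, 0), (10, 0), (10, 10), (0, 10)]}
cities = [
    {"name": "Bangor", "geometry": (4, 3), "population": 31473},
    {"name": "Smalltown", "geometry": (6, 7), "population": 5000},
    {"name": "Farville", "geometry": (15, 3), "population": 90000},
]

# Combine the spatial condition with the attribute condition, as in the WHERE clause.
selected = [
    city["name"]
    for city in cities
    if point_in_polygon(city["geometry"], county["geometry"])
    and city["population"] > 30000
]
print(selected)
```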
Spatial Selection
Table Joins
• Table joins depend on the data, not the attribute name: the joined fields must hold matching values, but they do not need to share a name.
• There are many different types of table joins.
• Tables can be joined regardless of the relationship EXCEPT:
– When joining to the feature attribute table, the relationship must be 1:1 or M:1.
– Other relationships must use a relate.
One-to-One Join
Employee-id  Job
1            Digislave
2            Useless Supervisor
Employee-id  Name
1            Tom
2            John
Join Employee-id to Employee-id
After join
Employee-id  Job                 Name
1            Digislave           Tom
2            Useless Supervisor  John
A join does not permanently alter the table structure.
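The one-to-one join above can be sketched in sqlite3 as follows. The table names `job` and `person` are assumptions, since the slide does not name the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE job (employee_id INTEGER PRIMARY KEY, job TEXT);
    CREATE TABLE person (employee_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO job VALUES (1, 'Digislave'), (2, 'Useless Supervisor');
    INSERT INTO person VALUES (1, 'Tom'), (2, 'John');
    """
)

# Join Employee-id to Employee-id.
joined = conn.execute(
    """
    SELECT job.employee_id, job.job, person.name
    FROM job
    JOIN person ON job.employee_id = person.employee_id
    ORDER BY job.employee_id
    """
).fetchall()
print(joined)

# The join is computed at query time; the stored table keeps its two columns.
job_columns = [info[1] for info in conn.execute("PRAGMA table_info(job)")]
print(job_columns)
```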
Many-to-One Join
Symbol  Description
Qa      Quaternary Alluvium
Qe      Quaternary Eolian
Pa      Permian Abo
Polygon ID  Symbol
1           Qa
2           Qa
3           Pa
4           Qe
After Join on Symbol
Polygon ID  Symbol  Description
1           Qa      Quaternary Alluvium
2           Qa      Quaternary Alluvium
3           Pa      Permian Abo
4           Qe      Quaternary Eolian
http://courses.washington.edu/gis250/lessons/introduction_gis/relational_database_model.html
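A many-to-one join of the geology example might look like the sketch below in sqlite3. The table names `unit` and `polygon` are assumptions; the rows are the ones shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE unit (symbol TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE polygon (polygon_id INTEGER PRIMARY KEY, symbol TEXT);
    INSERT INTO unit VALUES
        ('Qa', 'Quaternary Alluvium'), ('Qe', 'Quaternary Eolian'), ('Pa', 'Permian Abo');
    INSERT INTO polygon VALUES (1, 'Qa'), (2, 'Qa'), (3, 'Pa'), (4, 'Qe');
    """
)

# Many polygons share one description row, so the description repeats.
rows = conn.execute(
    """
    SELECT polygon.polygon_id, polygon.symbol, unit.description
    FROM polygon
    LEFT JOIN unit ON polygon.symbol = unit.symbol
    ORDER BY polygon.polygon_id
    """
).fetchall()
for row in rows:
    print(row)
```

A LEFT JOIN is used here so that every polygon stays in the result even if its symbol had no matching description.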
Relate in a GIS
Relational Database Modeling
• Entity Relationship Model – the result is a diagram of all of the entities, their attributes, and the relationships between entities.
– Each entity becomes a table.
– Each relationship becomes a table.
• Database Normalization – the result is a number of entity tables. We still need to create the relationship tables.
Entity Relationship Diagram
[Figure: legend showing the ENTITY, RELATIONSHIP, and ATTRIBUTE symbols]
• Rectangles represent entity sets.
• Diamonds represent relationship sets.
• Ellipses represent attributes.
– Double ellipses represent multivalued attributes.
– Dashed ellipses denote derived attributes.
– Underline indicates primary key attributes (will study later).
• Lines link attributes to entity sets and entity sets to relationship sets.
Entity Relationship Diagram
Student – Enrolls In – Courses
Student attributes: ID, Name, Address, Phone, Major, Advisor, Credits, GPA
Courses attributes: Number, Name, Credits, Time, Room, Instructor
Types of Relationships between Entities
• 1:1 – one faculty member is assigned to one office.
• 1:M (M:1) – one faculty member teaches many courses.
• M:N – many students take many courses.
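An M:N relationship is usually stored as its own table. The sketch below shows one way to do that for students and courses in sqlite3; the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (crn INTEGER PRIMARY KEY, title TEXT);
    -- The M:N relationship becomes its own table with a composite key.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        crn        INTEGER REFERENCES course(crn),
        PRIMARY KEY (student_id, crn)
    );
    -- Hypothetical sample rows.
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO course VALUES (101, 'GIS'), (102, 'Databases');
    INSERT INTO enrollment VALUES (1, 101), (1, 102), (2, 101);
    """
)

# One student takes many courses ...
courses_of_ann = conn.execute(
    """
    SELECT course.title FROM enrollment
    JOIN course ON course.crn = enrollment.crn
    WHERE enrollment.student_id = 1 ORDER BY course.title
    """
).fetchall()

# ... and one course is taken by many students.
students_in_gis = conn.execute(
    """
    SELECT student.name FROM enrollment
    JOIN student ON student.student_id = enrollment.student_id
    WHERE enrollment.crn = 101 ORDER BY student.name
    """
).fetchall()
print(courses_of_ann, students_in_gis)
```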
Orono Charter Airlines
Orono Charter Airlines is an aircraft charter company that supplies on-demand charter flight services to corporate customers, using a fleet of four different aircraft. For each plane, they need to keep track of the number of passengers it can hold, the charge per seat/mile, the manufacturer, model name, total miles flown, and date of last annual maintenance. Aircraft are always identifiable by a unique registration number. For each flight, the company must keep track of the date, pilot and copilot's names, the plane used, the destination, distance, air time, ground time, and fuel and oil used. In addition to pilots and copilots, the company has customer service personnel, mechanics, and maintenance personnel. In addition to the normal employee information, they must keep additional information on the pilots, such as their license number, their aircraft rating, and when they had their last medical.
Entity Sets
• Aircraft
• Customer
• Charter
• Pilot
• Employee
• Model
Relationship Sets
Customer (1) – Requests – (M) Charter
Adding the Attributes
Customer (1) – Requests – (M) Charter
Customer attributes: LicenseNum, Name, StreetAddr, CityAddr, State, Zip, Phone, CreditCardType, CreditCardNum
Charter attributes: Date, PlaneRegNum, PilotName, CoPilotName, Destination, Distance, Airtime, GroundTime, FuelUsed, OilUsed
Orono Airlines
[Figure: entity relationship diagram for Orono Airlines]
Entity sets: Charter, Customer, Pilot, Employee, Aircraft, Model
Relationship sets: Requests (Customer 1 : M Charter), Pilots, CoPilots, Is A, Flies, References
Attributes:
– Customer: LicenseNo, CustomerName, CustomerAddress, CreditCard, CreditCardNumber
– Model: ModelName, Manufacturer, Address, Phone, ContactPerson
– Aircraft: RegistrationNo, MilesFlown, DateLastMaintenance, NoPassengers, Charge_Seat_Mile
– Charter: Date, PlaneRegistrationNo, PilotName, CopilotName, Destination, Distance, AirTime, GroundTime, FuelUsed, OilUsed
– Pilot: SSNo, LicenseNo, Rating, MedicalDate
– Employee: SSNo, Name, Address, Phone, JobTitle, Salary
Database Normalization
• Start with a single relational schema containing all attributes and decompose it.
• The goal is to decrease the chance of inconsistent information in the database.
• There is a hierarchy of rules for decomposing the table.
• In most cases, only the first three are used.
Normalization Rules
• Begin with an unnormalized user view.
• To put all tables in first normal form (1NF), remove repeating groups.
• To put all tables in second normal form (2NF), remove partial dependencies.
• To put all tables in third normal form (3NF) remove all transitive dependencies.
Unnormalized Table
Student_ID  Name   Major  CRN  Title   Instructor  Instructor_Office  Grade
3421        Jones  SIS    333  Aaaaaa  Holden      125 Bd             A
                          222  Bbbbbb  Taylor      222 M              C
8725        Dow    BIO    555  Bbbbbb  Taylor      222 M              B
                          666  Cccccc  Allen       351 B              B-
                          777  Dddddd  Kemp        317 Nv             C+
…
Remove repeating groups: this creates two tables.
Student Table
Student_ID  Name   Major
3421        Jones  SIS
8725        Dow    BIO
…
Student_course Table
Student_ID  CRN  Title   Instructor  Instructor_Office  Grade
3421        333  Aaaaaa  Holden      125 Bd             A
3421        222  Bbbbbb  Taylor      222 M              C
8725        555  Bbbbbb  Taylor      222 M              B
8725        666  Cccccc  Allen       351 B              B-
8725        777  Dddddd  Kemp        317 Nv             C+
Remove partial dependencies.
This refers only to tables having a composite key.
You need to identify attributes that require only part of the composite key, and remove them.
Registration Table
Student_ID  CRN  Grade
3421        333  A
3421        222  C
8725        555  B
8725        666  B-
8725        777  C+
Course-Instructor Table
CRN  Title   Instructor  Instructor_Office
333  Aaaaaa  Holden      125 Bd
222  Bbbbbb  Taylor      222 M
555  Bbbbbb  Taylor      222 M
666  Cccccc  Allen       351 B
777  Dddddd  Kemp        317 Nv
To put a table in third normal form, remove transitive dependencies.
A transitive dependency exists when a non-key attribute depends on another non-key attribute rather than directly on the key. Here Instructor_Office depends on Instructor, not on CRN.
Course Table
CRN  Title   Instructor
333  Aaaaaa  Holden
222  Bbbbbb  Taylor
555  Bbbbbb  Taylor
666  Cccccc  Allen
777  Dddddd  Kemp
Instructor Table
Instructor  Instructor_Office
Holden      125 Bd
Taylor      222 M
Allen       351 B
Kemp        317 Nv
Unnormalized: (Student_ID, Name, Major, CRN, Title, Instructor, Instructor_Office, Grade)
Student: (Student_ID, Name, Major)
Registration: (Student_ID, CRN, Grade)
Course: (CRN, Title, Instructor)
Instructor: (Instructor, Instructor_Office)
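The decomposition can be checked with a short Python sketch: project the flat rows onto the four normalized tables, then rejoin them through their keys and confirm that nothing was lost. The rows are those of the worked example; the use of dictionaries keyed by each table's key is an assumption of this sketch, not part of the normalization procedure itself.

```python
# Flat (first normal form) rows from the worked example:
# (student_id, name, major, crn, title, instructor, instructor_office, grade)
flat = [
    (3421, "Jones", "SIS", 333, "Aaaaaa", "Holden", "125 Bd", "A"),
    (3421, "Jones", "SIS", 222, "Bbbbbb", "Taylor", "222 M", "C"),
    (8725, "Dow", "BIO", 555, "Bbbbbb", "Taylor", "222 M", "B"),
    (8725, "Dow", "BIO", 666, "Cccccc", "Allen", "351 B", "B-"),
    (8725, "Dow", "BIO", 777, "Dddddd", "Kemp", "317 Nv", "C+"),
]

# Projection onto a subset of columns; dicts keyed by the table's key
# drop the duplicate rows automatically.
student = {sid: (name, major) for sid, name, major, *_ in flat}
registration = {(sid, crn): grade for sid, _, _, crn, _, _, _, grade in flat}
course = {crn: (title, instr) for _, _, _, crn, title, instr, _, _ in flat}
instructor = {instr: office for _, _, _, _, _, instr, office, _ in flat}

# Rejoin the four tables through their keys to rebuild the flat rows.
rebuilt = []
for (sid, crn), grade in registration.items():
    name, major = student[sid]
    title, instr = course[crn]
    rebuilt.append((sid, name, major, crn, title, instr, instructor[instr], grade))

print(len(student), len(registration), len(course), len(instructor))
print(sorted(rebuilt) == sorted(flat))
```

Each fact is now stored once (for example, Taylor's office appears in a single Instructor row), which is what reduces the chance of inconsistent information.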
When NOT to Normalize
• Relates in large tables require large amounts of computer time to process.
– For example, in an application where speed is more important than database structure.
• If you are creating a very simple database that will be used for a short time and then discarded.