L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States...

57
L10 - The GIS Database – Part 1 Chapter 8

Transcript of L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States...

Page 1: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

L10 - The GIS Database – Part 1

Chapter 8

Page 2: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Entity

• Bangor – Penobscot County,

Maine, United States

– Centroid - 44.801N , -68.778W

– Area 34.4 square miles

– Elevation – 158 feet

– Population 31,473

Page 3: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Standard GIS Data Model

Linked spatial and attribute data.

Page 4: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

What is a database

A database is any organized collection of data. Some examples of databases you may encounter in your daily life are: – a telephone book – T.V. Guide – airline reservation system – motor vehicle registration records – papers in your filing cabinet – files on your computer hard drive. 

Page 5: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Data vs. information:What is the difference?

• What is data?– Data can be defined in

many ways. Information science defines data as unprocessed information.

• What is information?– Information is data that

have been organized and communicated in a coherent and meaningful manner.

– Data is converted into information, and information is converted into knowledge.

– Knowledge; information evaluated and organized so that it can be used purposefully.

Page 6: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Database Definitions

What is a database? It’s an organized collection of data.

What is a database management system?A database management system (DBMS) which provides you with the software tools you need to organize that data in a flexible manner. It includes tools to add, modify or delete data from the database, ask questions (or queries) about the data stored in the database and produce reports summarizing selected contents.

Page 7: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

What is the ultimate purpose of a database management

system?

DataData InformationInformation KnowledgeKnowledge ActionAction

Is to transform

Page 8: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Features of a DBMS

• Database Management Systems provide features to maintain database:– Data independence– Integrity and security– Transaction management– Concurrency control– Backup and recovery– Provides a language for the creation and querying of

the database.– A language for writing application programs

Page 9: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Selecting a Database Management System

Database management systems (or DBMSs) can be divided into two categories -- desktop databases and server databases.  

• Generally speaking, desktop databases are oriented toward single-user applications and reside on standard personal computers (hence the term desktop). 

• Server databases contain mechanisms to ensure the reliability and consistency of data and are geared toward multi-user applications.

Page 10: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Database Terminology

Page 11: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Databases

• The relational database model is the most dominant model in both the corporate and GIS world, due to its flexibility, organization, and functioning..

• It was defined by Edgar F. Codd (1970).• It can accommodate a wide range of data types.• It is not necessary to know beforehand the types

of processing that will be performed on the database.

Page 12: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Database Terminology

• Each table contains the data for a single entity.• Each instance of an entity is a row/record/tuple

in a table.• Columns contain attributes/fields that describe

the entity.– Attributes must be from the same domain (text,

integer, date).– Column order has no significance.

• Tables are related through keys.

Page 13: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Attributes

• An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set.Domain – the set of permitted values for each attribute

• Attribute types:– Simple and composite attributes.– Single-valued and multi-valued attributes

• E.g. multivalued attribute: phone-numbers

– Derived attributes• Can be computed from other attributes• E.g. age, given date of birth

Page 14: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Keys

• A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.

• A candidate key of an entity set is a minimal super key– Customer-id is candidate key of customer– account-number is candidate key of account

• Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.

Page 15: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Keys

• A composite key is a key with more than one attribute.

• A foreign key is an attribute that is a key of one or more relations other than the one in which it appears.

Page 16: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Keys

Primary Key Foreign Key

Primary Key Foreign Key

Primary Key Foreign Key

Page 17: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Algebra

• Codd’s specification of a relational database relied on relational algebra.

• Relational algebra takes tables/relations as inputs and returns tables as outputs.

• The algebra combines or splits tables by rows or by columns to generate either a subset of tables or an expanded tables.

Page 18: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Tables comprise the fundamental building blocks of any database. 

The table above contains the employee information for an organization -- characteristics like name, date of birth and title.  Examine the construction of the table and you'll find that each column of the table corresponds to a specific employee characteristic (or attribute in database terms).  Each row corresponds to one particular employee and contains his or her information. 

Fundamental building blocks

Page 19: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Algebra

• Five basic operators

– select: – project: – union: – difference: – – Cartesian product: x

• The operators take one or two relations as inputs and produce a new relation as a result.

Page 20: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Algebra

• Derived Relational operators– Intersection – Divide (not used very often)– Join

• These can be expressed using different combinations of the fundamental operators.

Page 21: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Select Operation – Example

Relation r A B C D

1

5

12

23

7

7

3

10A=B ^ D > 5 (r)

A B C D

1

23

7

10

Select from relation r where A=B AND D>5

Page 22: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Database Queries

• Queries may be made of one table or several tables at the same time.

• In many systems querying is facilitated by icons, or menus, or queries by example (QBE – a graphical query language ).

Page 23: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

SQL

• DDL – Data Definition Language; used to create and manage the database.

• DDM – Data Manipulation Language; used to query the database.

Page 24: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

SQL• SQL: widely used non-procedural language

– E.g. find the name of the customer with customer-id 192-83-7465select customer.customer-namefrom customerwhere customer.customer-id = ‘192-83-7465’

– E.g. find the balances of all accounts held by the customer with customer-id 192-83-7465

select account.balancefrom depositor, accountwhere depositor.customer-id = ‘192-83-7465’ and depositor.account-number = account.account-number

• Application programs generally access databases through one of– Language extensions to allow embedded SQL– Application program interface (e.g. ODBC/JDBC) which allow SQL queries

to be sent to a database

Page 25: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Attribute Queries

Page 26: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.
Page 27: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.
Page 28: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

The ArcGIS Attribute Query Interface

State’s Table is Open

Page 29: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.
Page 30: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

OptionsRelated tablesSelect by attributesSwitch selectionClear selectionZoom to selectedDelete Selected

Table is Open

Page 31: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

No Table is Open

Selection->Select by Attributes from the Menu Bar

Page 32: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

A Spatial Query in SQL

SELECT city.name, city.geometry

FROM city, county

WHERE county.name=‘Penobscot’ AND

city.geometry INSIDE county.geometry

city.population>30000;

Page 33: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Spatial Selection

Page 34: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Table Joins

• Table joins depend on the data not the attribute name.

• There are many different types of table joins.

• Tables can be joined regardless of the relationship EXCEPT:– When joining to the feature attribute table, the

relationship must be 1:1 or M:1– Other relationships must use the relate.

Page 35: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

One-to-One JoinEmployee-id Job

1 Digislave

2 Useless Supervisor

Employee-id name

1 Tom

2 John

Join Employee-id to Employee-id

After join

Employee-id Job Name

1 Digislave Tom

2 Useless Supervisor John

A join does not permanently alter the table structure

Page 36: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Many-to-One JoinSymbol Description

Qa Quaternary Alluvium

Qe Quaternary Eolian

Pa Permian Abo

Polygon Id Symbol

1 Qa

2 Qa

3 Pa

4 Qe

Polygon ID Symbol Description

1 Qa Quaternary Alluvium

2 Qa Quaternary Alluvium

3 Pa Permian Abo

4 Qe Quaternary Eolian

After Join on Symbol

Page 37: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

http://courses.washington.edu/gis250/lessons/introduction_gis/relational_database_model.html

Relate in a GIS

Page 38: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relational Database Modeling

• Entity Relationship Model – the result is a diagram of all of the entities, their attributes, and the relationships between entities– Each entity becomes a table.– Each relationship becomes a table

• Database Normalization – The result is a number of entity tables. We still need to create the relationship tables.

Page 39: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Entity Relationship Diagram

ENTITY

RELATIONSHIP

ATTRIBUTE

Rectangles represent entity sets.

Diamonds represent relationship sets.

Lines link attributes to entity sets and entity sets to relationship sets.

Ellipses represent attributes

Double ellipses represent multivalued attributes.

Dashed ellipses denote derived attributes.

Underline indicates primary key attributes (will study later)

Page 40: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Entity Relationship Diagram

Student Enrolls In Courses

ID

Name Address

Phone

Major

Advisor Credits

GPA

Number

Name

Credits

Time

Room

Instructor

Page 41: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Types of Relationships between Entities

• 1:1 – one faculty member is assigned to one office.

• 1:M (M:1) – one faculty member teaches many courses.

• M:N – many students take many courses.

Page 42: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Orono Charter Airlines Orono Charter Airlines is an aircraft charter company that

supplies on-demand charter flight services to corporate customers, using a fleet of four different aircraft. For each plane, they need to keep track of the number of passengers it can hold, the charge per seat/mile, the manufacturer, model name, total miles flown, and date of last annual maintenance. Aircraft are always identifiable by a unique registration number. For each flight, the company must keep track of the date, pilot and copilot’s names, the plane used, the destination, distance, air time, ground time and fuel and oil used. In addition to pilots and copilots, the company has customer service personnel, mechanics and maintenance personnel. In addition to the normal employee information, they must keep additional information on the pilots, such as their license number, their aircraft rating and when they had their last medical.

Page 43: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Entity Sets

• Aircraft

• Customer

• Charter

• Pilot

• Employee

• Model

Page 44: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Relationship Sets

Customer CharterRequests1 M

Page 45: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Adding the Attributes

Customer CharterRequests1 M

LicenseNum

Name

StreetAddr

CityAddr

State

Zip

Phone

CreditCardType

CreditCardNum

Date

PlaneRegNum

PilotName

CoPilotName

Destination

Distance

Airtime

GroundTime

FuelUsed

OilUsed

Page 46: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

CharterCustomer Requests

Pilot

Pilots

Is A

Employee

AircraftFlies

References

Model

1 M M 1

M

1

1

1

M

1

CoPilots

1

M

LicenseNoCustomerName

CustomerAddressCreditCard

CreditCardNumber

ModelNameManufacturer

AddressPhone

ContactPerson

RegistrationNoMilesFlown

DateLastMaintenanceNoPassengers

Charge_Seat_Mile

DatePlaneRegistrationNo

PilotNameCopilotNameDestination

DistanceAirTime

GroundTimeFuelUsedOilUsed

SSNoLicenseNo

RatingMedicalDate

SSNoName

AddressPhone

JobTitleSalary

Orono Airlines

Page 47: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Database Normalization

• Start with a single relational schema containing all attributes and decompose it.

• The goal is to decrease the chance of inconsistent information in the database.

• There is a hierarchy of rules for decomposing the table.

• In most cases, only the first three are used.

Page 48: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Normalization Rules

• Begin with an unnormalized user view.

• To put all tables in first normal form (1NF), remove repeating groups.

• To put all tables in second normal form (2NF), remove partial dependencies.

• To put all tables in third normal form (3NF) remove all transitive dependencies.

Page 49: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Student_ID

Name Major CRN Title Instructor Instructor

_Office

Grade

3421 Jones

SIS 333

222

Aaaaa

Bbbbb

Holden

Taylor

134 BD

222 MA

C

8725 Dow BIO 555

666

777

Bbbbb

Ccccc

Dddd

Taylor

Allen

Kemp

222 M

351 B

317 Nv

B

B-

C+

Unnormalized Table

Page 50: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Remove repeating groups – Creates two tables

StudentID Name Major

3421 Jones SIS

8725 Dow BIO

Student Table

Page 51: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Student_course Table

Student_ID

CRN Title Instructor Instructor

_Office

Grade

3421 333 Aaaaaa Holden 125 Bd A

3421 222 Bbbbbb Taylor 222 M C

8725 555 Bbbbbb Taylor 222 M B

8725 666 Cccccc Allen 351 B B-

8725 777 Dddddd Kemp 317 Nv C+

Page 52: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Remove partial dependencies.

This refers only to tables having a composite key.

You need to identify attributes that require only part of the composite key, and remove them.

Student_ID CRN Grade

3421 333 A

3421 222 C

8725 555 B

8725 666 B-

8725 777 C+

Registration Table

Page 53: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

CRN Title Instructor Instructor

_Office

333 Aaaaaa Holden 125 Bd

222 Bbbbbb Taylor 222 M

555 Bbbbbb Taylor 222 M

666 Cccccc Allen 351 B

777 Dddddd Kemp 317 Nv

Course-Instructor Table

Page 54: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

To put a table in third normal form remove transitive dependencies

A transitive dependency is an attribute that depends only on another attribute, not a key.

CRN Title Instructor

333 Aaaaaa Holden

222 Bbbbbb Taylor

555 Bbbbbb Taylor

666 Cccccc Allen

777 Dddddd Kemp

Course Table

Page 55: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Instructor Instructor

_Office

Holden 125 Bd

Taylor 222 M

Taylor 222 M

Allen 351 B

Kemp 317 Nv

Instructor Table

Page 56: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

Student_ID

Name Major CRN Title Instructor Instructor

_Office

Grade

StudentID Name Major

Student_ID CRN Grade

CRN Title Instructor

Instructor Instructor

_Office

Page 57: L10 - The GIS Database – Part 1 Chapter 8. Entity Bangor –Penobscot County, Maine, United States –Centroid - 44.801N, - 68.778W –Area 34.4 square miles.

When NOT to Normalize

• Relates in large tables require large amounts of computer time to process.– If you have an application that speed is more

important than database structure.

• If you are creating a very simple data base that will be used for a short time and then discarded.