Post on 27-May-2015
Attribute Data Models
GIS Database Management System
GE 517, Engr. Ablao
Introduction
GIS involves both spatial and attribute data.
  Spatial – geometry of map features
  Attribute – characteristics of the map features
Attribute data are normally stored in tables.
  Record or tuple – row
  Field or item – column
  Attribute – intersection of row and column
Data models relate spatial & attribute data.
8/20/2010 GE 517 Geographic Information System
Spatial data (left) are linked to attribute data (right) by the label ID.
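The link by label ID can be sketched in a few lines. This is a minimal illustration, not the storage format of any particular GIS: the dictionaries `spatial` and `attributes`, the function `describe`, and all values are hypothetical.

```python
# Hypothetical data: geometry and attributes are kept in separate structures
# and are connected only by a shared label ID.
spatial = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],   # polygon vertices for feature 1
    2: [(4, 0), (8, 0), (8, 3), (4, 3)],   # polygon vertices for feature 2
}
attributes = {
    1: {"land_use": "Residential", "hectares": 1.2},
    2: {"land_use": "Commercial", "hectares": 1.5},
}

def describe(label_id):
    """Return the geometry and the attribute record that share one label ID."""
    return {"id": label_id, "geometry": spatial[label_id], **attributes[label_id]}

feature = describe(2)
```

Neither structure stores a copy of the other; the label ID is the only thing they have in common.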
File Structures (File-based datasets)
Simple list
  Simplest file structure
  Unordered/unstructured
  Arrangement is by whichever comes first
Ordered sequential files
  Simple lists that are arranged according to some order (e.g., alphabetical order)
Indexed files
  An index to the directory is needed for more efficient searches that find entries matching given criteria
  Can be developed as direct files or inverted files
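The difference between searching a simple list and an ordered sequential file can be sketched as follows. The place names and function names are hypothetical, and binary search is only one way an ordered file might be exploited.

```python
from bisect import bisect_left

# Simple list: entries are stored in whatever order they arrived.
simple_list = ["Tarlac", "Ilocos Norte", "Pampanga", "San Juan", "Pangasinan"]

def find_unordered(names, target):
    """Simple list: scan entries one by one until the target is found."""
    for position, name in enumerate(names):
        if name == target:
            return position
    return -1

# Ordered sequential file: the same entries arranged alphabetically.
ordered_file = sorted(simple_list)

def find_ordered(names, target):
    """Ordered file: binary search, halving the search range at each step."""
    position = bisect_left(names, target)
    if position < len(names) and names[position] == target:
        return position
    return -1
```

The scan may have to read every entry, while the search on the ordered file needs only about log2(n) comparisons.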
Indexed Files
Direct Indexed Files
  Records are used to provide access to other pertinent information
Indirect Indexed Files
  Index is based on possible search criteria, not on the entities themselves
  Attributes are the primary search criteria and the entities rely on them for selection
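One possible reading of the two index types, sketched with hypothetical parcel records: the direct index is keyed on the entities themselves, while the indirect index is keyed on an attribute used as a search criterion. The names `direct_index` and `indirect_index` are illustrative only.

```python
from collections import defaultdict

# Hypothetical parcel records stored in arrival order (a simple list).
records = [
    {"pin": "P101", "zoning": "Residential"},
    {"pin": "P102", "zoning": "Commercial"},
    {"pin": "P103", "zoning": "Commercial"},
    {"pin": "P104", "zoning": "Residential"},
]

# Direct index: built on the entities themselves (PIN -> position in the file).
direct_index = {record["pin"]: position for position, record in enumerate(records)}

# Indirect index: built on a search criterion (zoning -> positions of matches).
indirect_index = defaultdict(list)
for position, record in enumerate(records):
    indirect_index[record["zoning"]].append(position)

commercial = [records[p]["pin"] for p in indirect_index["Commercial"]]
```

With the indirect index, a query such as "all commercial parcels" never has to scan the records that do not match.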
Flat file database
Contains all data in a large file
Software could only operate on one file at a time
Format is very inflexible with respect to the modification of the database structure
[Figure: Flat file database]
Database
An integrated set of data on a particular subject
Collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion
Requires a more elaborate structure called a database structure or database management system (DBMS)
• A DBMS manages attribute data in separate tables
Significance of Database
Most GIS activities consist of storing entity and attribute data so that we can retrieve any combination of these objects.
Each graphical feature must be stored explicitly with its attributes so that their combined search becomes faster.
Advantages of Database over File-based datasets
Collecting data at a single location reduces redundancy and duplication
Lower maintenance cost due to better organization and decreased data duplication
Multiple applications can use the same data and can evolve separately over time
User knowledge can be transferred between applications more easily because the database remains constant
Data sharing is facilitated, with a corporate view provided to data managers and users
Security and standards for data and data access can be established and enforced
Types of Database Structure
1. Hierarchical Data Structures
2. Network Systems
3. Relational Database Structures
Hierarchical Data Structure
‘one-to-many’ or ‘parent-child’ relationship
Implies that each element has a direct relationship to a number of symbolic children
Each child is capable of having the same direct relationship with his/her own offspring, and so on.
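A parent-child structure can be sketched as nested records, where each parent holds its own children. The region, province, and town names below are hypothetical, as is the helper `path_to`.

```python
# Hypothetical hierarchy: each parent holds a list of its children.
tree = {
    "name": "Region",
    "children": [
        {"name": "Province A", "children": [
            {"name": "Town A1", "children": []},
            {"name": "Town A2", "children": []},
        ]},
        {"name": "Province B", "children": [
            {"name": "Town B1", "children": []},
        ]},
    ],
}

def path_to(node, target, trail=()):
    """Walk down from the root and return the branch that leads to target."""
    trail = trail + (node["name"],)
    if node["name"] == target:
        return trail
    for child in node["children"]:
        found = path_to(child, target, trail)
        if found:
            return found
    return None
```

Every lookup starts at the root and follows a single branch downward; there is no direct route from one town to a town under another province.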
[Figure: Hierarchical database]
Hierarchical Data Structure
Advantages:
Simple and straightforward data access since parent and children are directly linked
Easy to search since structure is well defined
Relatively easy to expand by adding new branches and formulating new decision rules
Hierarchical Data Structure
Disadvantages:
Confined to queries along one branch only
Difficult restructuring to allow other possible search criteria
Creates large index files
Redundant entries for searching
Network Systems
‘many-to-many’ relationship
Each individual data item is linked directly to anywhere in the database using pointers, without the parent-child relationship.
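A many-to-many structure with explicit pointers might look like the sketch below. The owners, parcels, and the helper `co_owners` are hypothetical; sets of keys stand in for the pointers.

```python
# Hypothetical many-to-many links between owners and parcels, stored as
# explicit pointers (here, sets of keys) in both directions.
owner_to_parcels = {
    "Owner A": {"P1"},
    "Owner B": {"P1", "P2"},
    "Owner C": {"P2"},
}

# Build the reverse pointers: parcel -> owners.
parcel_to_owners = {}
for owner, parcels in owner_to_parcels.items():
    for parcel in parcels:
        parcel_to_owners.setdefault(parcel, set()).add(owner)

def co_owners(owner):
    """Follow pointers owner -> parcels -> owners, with no parent-child path."""
    linked = set()
    for parcel in owner_to_parcels[owner]:
        linked |= parcel_to_owners[parcel]
    return linked - {owner}
```

Note that every link had to be stored explicitly, once in each direction, which is where the storage cost of pointers comes from.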
[Figure: Network database]
[Figure: Network Systems]
Network Systems
Advantages:
Less rigid compared to hierarchical structure
Can handle many-to-many relationships
Allows much greater flexibility
Reduced redundancy of data
Network Systems
Disadvantages:
In a very complex GIS, the number of pointers can become large, thus requiring a lot of storage space
Linkages between data must still be explicitly defined using pointers
Numerous possible linkages can become extremely tangled, resulting in confusion and incorrect linkages
Not recommended for novice users
Relational Database Management Systems (RDBMS)
Data are stored as ordered records or rows of attribute values called tuples
Tuples are grouped with corresponding data rows in a form called relations
Each column represents data for a single attribute for the entire dataset
A key represents one or more attributes whose values can uniquely identify a record in a table.
A key common to two tables can establish a connection between records in the tables.
Primary key – a column which is used to define the search strategy or criterion
Foreign key – column in the second table to which the primary key is linked
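Primary and foreign keys can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under assumed names: the tables `zone` and `parcel` and their columns are hypothetical, loosely modeled on a parcel dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE zone (zone_code INTEGER PRIMARY KEY, zoning TEXT)")
conn.execute(
    "CREATE TABLE parcel ("
    " pin TEXT PRIMARY KEY,"
    " hectares REAL,"
    " zone_code INTEGER REFERENCES zone (zone_code))"
)

conn.executemany("INSERT INTO zone VALUES (?, ?)",
                 [(1, "Residential"), (2, "Commercial")])
conn.execute("INSERT INTO parcel VALUES ('P101', 1.2, 1)")

# A parcel that points at a zone code missing from the zone table is rejected.
try:
    conn.execute("INSERT INTO parcel VALUES ('P999', 3.0, 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

parcel_count = conn.execute("SELECT COUNT(*) FROM parcel").fetchone()[0]
```

Here `zone.zone_code` is the primary key and `parcel.zone_code` is the foreign key linked to it, so the database itself guards the connection between the two tables.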
[Figure: Relational database]
Relational Database Management Systems (RDBMS)
Advantages:
Allows data to be collected in reasonably simple tables, keeping the organization simple as well
Capable of doing relational joins, as long as there is at least one column common to the tables to be joined
Allows greatest flexibility, both in design and querying
Normalization of relational database
Normalization is a process of decomposition, taking a table with all the attribute data and breaking it down to small tables while maintaining the necessary linkages between them.
Normalization is designed to avoid redundant data in tables, to ensure that attribute data in separate tables can be maintained and updated separately and can be linked when necessary, and to facilitate a distributed database.
Normalization slows down data access.
PIN   Owner   Address               Sale date   Hectares  Zone code  Zoning
P101  Gloria  101 Pampanga St.      01-20-2001  1.2       1          Residential
      Erap    202 San Juan St.
P102  Fidel   303 Pangasinan St.    06-30-1992  1.5       2          Commercial
      Cory    404 Tarlac St.
P103  Ferdie  505 Ilocos Norte St.  06-30-1965  2.1       2          Commercial
P104  Dado    606 Pampanga St.      06-30-1961  0.8       1          Residential
Unnormalized table
PIN   Owner   Address               Sale date   Hectares  Zone code  Zoning
P101  Gloria  101 Pampanga St.      01-20-2001  1.2       1          Residential
P101  Erap    202 San Juan St.      01-20-2001  1.2       1          Residential
P102  Fidel   303 Pangasinan St.    06-30-1992  1.5       2          Commercial
P102  Cory    404 Tarlac St.        06-30-1992  1.5       2          Commercial
P103  Ferdie  505 Ilocos Norte St.  06-30-1965  2.1       2          Commercial
P104  Dado    606 Pampanga St.      06-30-1961  0.8       1          Residential
First Normal Form
Second Normal Form
Normalized Form
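One way the first-normal-form rows might be decomposed is sketched below. The split into `parcels`, `owners`, and `zones` is an assumption made for illustration and is not necessarily the decomposition shown on the original slides.

```python
# Rows of the first-normal-form table: one owner per row.
first_normal_form = [
    ("P101", "Gloria", "101 Pampanga St.", "01-20-2001", 1.2, 1, "Residential"),
    ("P101", "Erap", "202 San Juan St.", "01-20-2001", 1.2, 1, "Residential"),
    ("P102", "Fidel", "303 Pangasinan St.", "06-30-1992", 1.5, 2, "Commercial"),
    ("P102", "Cory", "404 Tarlac St.", "06-30-1992", 1.5, 2, "Commercial"),
    ("P103", "Ferdie", "505 Ilocos Norte St.", "06-30-1965", 2.1, 2, "Commercial"),
    ("P104", "Dado", "606 Pampanga St.", "06-30-1961", 0.8, 1, "Residential"),
]

parcels, owners, zones = {}, [], {}
for pin, owner, address, sale_date, hectares, zone_code, zoning in first_normal_form:
    parcels[pin] = (sale_date, hectares, zone_code)   # facts about the parcel only
    owners.append((pin, owner, address))              # facts about each owner
    zones[zone_code] = zoning                         # zoning depends on zone code

# The PIN and zone code act as keys that link the small tables back together.
zoning_of_p103 = zones[parcels["P103"][2]]
```

After the split, the sale date and area of P101 are stored once instead of twice, and the zoning label is stored once per zone code.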
Data Storage in a DBMS
Object classes/layers are stored in database tables
Each layer is stored as a single database table in a database management system
Rows contain objects, while columns contain attributes/properties of the objects
Basic Database Functions/Operations
Join
  Tables are joined together using common row/column values or keys
  After joining two or more tables, a new table is created which contains all the values of the joined tables
Database tables can be joined together to create new relations, or views of the database.
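A join on a common key, saved as a view, can be sketched with Python's built-in `sqlite3` module. The table, column, and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (pin TEXT PRIMARY KEY, hectares REAL, zone_code INTEGER);
    CREATE TABLE zone (zone_code INTEGER PRIMARY KEY, zoning TEXT);
    INSERT INTO parcel VALUES ('P101', 1.2, 1), ('P102', 1.5, 2);
    INSERT INTO zone VALUES (1, 'Residential'), (2, 'Commercial');
    -- The join on the shared zone_code key is saved as a view of the database.
    CREATE VIEW parcel_zoning AS
        SELECT parcel.pin, parcel.hectares, zone.zoning
        FROM parcel JOIN zone ON parcel.zone_code = zone.zone_code;
""")

joined = conn.execute("SELECT * FROM parcel_zoning ORDER BY pin").fetchall()
```

Each row of `joined` combines values from both tables, matched on `zone_code`.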
Basic Database Functions/Operations
Link
  Tables are linked using common row/column values or keys
  Unlike joining, linking tables does not result in a new table. The original tables are retained, but accessing one enables the user to also access a table linked to it
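Linking can be sketched as a lookup that follows the key at access time, leaving both tables as they are. The dictionaries and the function `linked_zone` are hypothetical stand-ins for database tables.

```python
# Two hypothetical tables kept separate; only the shared key connects them.
parcel_table = {
    "P101": {"hectares": 1.2, "zone_code": 1},
    "P102": {"hectares": 1.5, "zone_code": 2},
}
zone_table = {
    1: {"zoning": "Residential"},
    2: {"zoning": "Commercial"},
}

def linked_zone(pin):
    """Reach the linked zone record through the key; no new table is built."""
    return zone_table[parcel_table[pin]["zone_code"]]

zoning = linked_zone("P102")["zoning"]
```

The parcel table never gains a zoning column; the zone record is simply reached through the parcel's `zone_code`.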
Database Design
Involves three stages: conceptual, logical, and physical
Involves six practical steps (see Figure)
Stages of Database Design
Conceptual Model
  User View
  Objects and Relationships
  Geographic Representation
Logical Model
  Geographic Database Types
  Geographic Database Structure
Physical Model
  Database Schema
Conceptual Model
Steps involved are:
1. Model the user’s view
  Identifying organizational functions, determining data requirements of these functions, organizing data into groups for data management
  May be presented using a report with tables
2. Define objects and their relationships
  Specification of object types/classes and functions, and their relationships
  May be presented using diagrams
3. Select geographic representation
  Choosing between the types of discrete objects (point, line, or polygon) or field to represent the data
  Selection has a critical impact on the database use
  Although it is possible to switch between representations later on, it would be computationally expensive and would lead to information loss
Logical Model
Steps involved are:
1. Match to geographic database types
  Matching of object types to be studied to specific data types supported by the GIS
2. Organize geographic database structure
  Defining topological associations, specifying rules and relationships, and assigning coordinate systems
Physical Model
Step involved is:
Define database schema
  Definition of the actual physical database schema that will hold the database data values
  Usually created using the DBMS software’s data definition language (e.g., SQL)
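A schema definition in SQL's data definition language might look like the sketch below, run here through Python's built-in `sqlite3` module. The table name, field names, and types are hypothetical choices for a parcel layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for a parcel layer; names and types are illustrative.
conn.execute("""
    CREATE TABLE parcel (
        pin        TEXT PRIMARY KEY,   -- string field
        sale_date  TEXT,               -- SQLite has no date type; text is one option
        hectares   REAL,               -- floating-point field
        zone_code  INTEGER             -- integer field
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, pk).
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(parcel)")]
```

The `CREATE TABLE` statement is the data definition step; reading the schema back confirms the field names and data types that the database will hold.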
Attribute data entry
Field definition
Attribute data entry
Attribute data verification
Creation of new attribute data
Field definition
Definition of (a) field name, (b) data type, (c) data width, and (d) number of decimal places.
Data type may be (a) numeric (integer or floating-point), (b) string, (c) Boolean, or (d) date.
Consider measurement scale of data.
Attribute data entry
Akin to digitizing for spatial data entry
Attribute data need to be entered by typing
  Given: map with 2,000 polygons and 10 fields
  Time: At 10 seconds per value, it takes 55 hours, 33 minutes, 20 seconds (2.3 days) to enter 20,000 values
Best to determine if an organization has attribute data in digital format (e.g., NSO)
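The typing-time estimate can be checked with a few lines of arithmetic. The variable names are illustrative, and the figure in days assumes continuous typing with no breaks.

```python
polygons, fields, seconds_per_value = 2_000, 10, 10

values = polygons * fields                      # 20,000 values to type
total_seconds = values * seconds_per_value      # 200,000 seconds
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)
days = total_seconds / 86_400                   # continuous days, no breaks
```

This gives 55 hours, 33 minutes, and 20 seconds, or about 2.3 days.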
Attribute data verification
In this step:
  Ensure attribute data are properly linked to spatial data
  Verify the accuracy of attribute data
May be difficult due to observation errors, out-of-date data, and data entry errors
To check for errors:
  Table may be printed for manual verification
  Computer programs may be written to automate the task
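An automated check might look like the sketch below. The label IDs, attribute rows, and validity rules (positive area, known zone codes) are all hypothetical; real rules depend on the dataset.

```python
# Hypothetical label IDs and attribute records for a small layer.
spatial_ids = {1, 2, 3, 4}
attribute_rows = {
    1: {"hectares": 1.2, "zone_code": 1},
    2: {"hectares": -1.5, "zone_code": 2},   # negative area: entry error
    3: {"hectares": 2.1, "zone_code": 7},    # zone code outside the valid set
    5: {"hectares": 0.8, "zone_code": 1},    # no matching spatial feature
}
valid_zone_codes = {1, 2}

# Link check: every feature needs a record, and every record needs a feature.
missing_attributes = sorted(spatial_ids - set(attribute_rows))
orphan_attributes = sorted(set(attribute_rows) - spatial_ids)

# Accuracy check: flag records whose values break the assumed rules.
bad_values = sorted(
    key for key, row in attribute_rows.items()
    if row["hectares"] <= 0 or row["zone_code"] not in valid_zone_codes
)
```

Checks of this kind catch broken links and impossible values, but not values that are plausible and still wrong, so some manual verification may still be needed.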
Creation of new attribute data
Attribute data classification
  Example: Elevation
    High = {Higher than 600 meters}
    Medium = {Between 200 and 600 meters}
    Low = {Lower than 200 meters}
Attribute data computation
  Example: Soil erosion potential = rainfall parameter × soil parameter × topographic parameter × land cover parameter × management parameter
  Example: Agricultural harvest = area × potential yield
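Both ways of creating new attribute data can be sketched together. The records and units are hypothetical, and treating exactly 200 and 600 meters as Medium is an assumption, since the class limits above do not say which class the boundary values belong to.

```python
def classify_elevation(meters):
    """Assign the High / Medium / Low class from the elevation example."""
    if meters > 600:
        return "High"
    if meters >= 200:          # assumption: 200 and 600 both count as Medium
        return "Medium"
    return "Low"

def agricultural_harvest(area, potential_yield):
    """New attribute computed from two existing ones: area x potential yield."""
    return area * potential_yield

# Hypothetical records: elevation in meters, area in hectares, yield per hectare.
records = [
    {"elevation": 750, "area": 2.0, "potential_yield": 4.0},
    {"elevation": 200, "area": 1.5, "potential_yield": 3.0},
    {"elevation": 120, "area": 0.8, "potential_yield": 5.0},
]
for record in records:
    record["elevation_class"] = classify_elevation(record["elevation"])
    record["harvest"] = agricultural_harvest(record["area"], record["potential_yield"])
```

Classification reduces a measured value to a category, while computation derives a new numeric field from existing ones; both results are stored as new columns.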