GIS Data Modelling and Management

Transcript of GIS Data Modelling and Management

Page 1: GIS Data Modelling and Management

GIS DATA MODELLING AND MANAGEMENT

Subject Teacher: Prof. Ganesh D Bhutkar

Student Group:
Sohan Pachhade, BE IT 2008-09, J-29
Vivek Bamne, BE IT 2008-09, J-06
Sanyog Salve, BE IT 2008-09, J-34

Reference: Chapters 8 & 9: Spatial Data Modeling and GIS Data Management, in M. Anji Reddy, Remote Sensing and GIS, B S Publications, Second Edition, 2006.

Page 2: GIS Data Modelling and Management

SPATIAL DATA MODELLING

• It is a precise and clear process for turning data about spatial entities into graphical representations.

• The two main approaches by which a computer can handle and display spatial entities are:

1. Raster Approach
2. Vector Approach

• A map contains spatial elements like monuments, roads, rivers and parks.

• Spatial modelling is very useful in understanding geographical problems.

Page 3: GIS Data Modelling and Management

STAGES OF GIS DATA MODELLING

• Identifying the spatial features from the real world that are of interest in the context of the application.

• Representing the conceptual data model by an appropriate spatial data model. This involves choosing between one of the two approaches: raster or vector.

• Selecting an appropriate spatial data structure to store the model within the computer. The spatial data structure is the physical way in which entities are coded for the purpose of storage and manipulation.

Page 4: GIS Data Modelling and Management

GRAPHIC REPRESENTATION OF SPATIAL DATA

• An entity is an element in reality.
• Geographical entities can be represented by 3 main entity types, i.e. Points, Lines and Areas.
• There are two additional spatial entities:

1. Surface: It is used to represent continuous features or phenomena. For these features, there is a value at every location, e.g. Temperature, Population Density.

2. Network: It is a series of interconnecting lines along which there is a flow of data, objects or materials, e.g. a road network along which there is a flow of traffic to and from the areas.

Page 5: GIS Data Modelling and Management

CHALLENGES IN DEFINITION OF ENTITIES

1. How to select the proper entity type for providing appropriate representations?

2. How to represent changes over time?

e.g. Vegetation in a forest may be a continuous feature which can be represented by a surface for ecologists, whereas it may be represented as a series of discrete area entities by government officials.

Page 6: GIS Data Modelling and Management

RASTER DATA REPRESENTATION

• The terrain is divided into a number of parcels or units called grid cells. Each grid cell is of the same size and hence occupies the same amount of geographical space.

• It does not provide precise locational information, because a location is known only to the nearest grid cell. The simplest way of including attribute data for each entity is to assign a number representing the attribute, like a class of land cover, to each cell, e.g. 0 for Water and 1 for Land.

• The resolution is given by m * n, i.e. columns * rows.

Problems with raster representation:
1. Lack of absolute locational information.
2. Reduced spatial accuracy and reliability of distance.
3. Need for large storage capacity.
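
As a rough illustration of the raster idea above, the following Python sketch stores a small land-cover grid and looks up the class code for a map location. The grid contents, the CELL_SIZE value and the cell_value helper are all hypothetical and chosen only for this example; the codes 0 for Water and 1 for Land follow the slide.

```python
# Hypothetical 4 x 3 land-cover raster: 0 = Water, 1 = Land (codes from the slide).
CELL_SIZE = 10.0  # assumed width and height of one grid cell, in map units

grid = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

def cell_value(x, y):
    """Return the class code of the cell that contains map point (x, y).

    Assumption of this sketch: row 0 is the top of the grid and y is
    measured downward from the top edge.
    """
    col = int(x // CELL_SIZE)
    row = int(y // CELL_SIZE)
    return grid[row][col]

rows, cols = len(grid), len(grid[0])                      # 3 rows, 4 columns
land_cells = sum(value == 1 for row in grid for value in row)
```

Two different points such as (21.0, 1.0) and (29.0, 9.0) fall in the same cell and return the same code, which is one way to see why a raster cannot give a location more precisely than one grid cell.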

Page 7: GIS Data Modelling and Management

VECTOR DATA REPRESENTATION

• Vector representation allows us to give spatial locations precisely.

• All entities are represented using points (the basic building blocks) having x and y co-ordinates.

• Line and area entities are constructed by connecting a series of points into chains and polygons.

• Attributes are linked through software linkage.

Problems with vector representation:
1. Selection of an appropriate number of points to construct an entity.
2. Representation of networks and surfaces is complex.
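
The construction of lines and areas from points can be sketched as follows. The coordinates, the entity names and the attributes dictionary are hypothetical; the length and area helpers are standard geometry (segment distances and the shoelace formula), not something taken from the slides.

```python
import math

# Hypothetical coordinates in arbitrary map units.
well = (2.0, 3.0)                                          # point entity
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]                # line entity (chain of points)
park = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # area entity (polygon)

def chain_length(points):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def polygon_area(points):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Attributes are kept apart from the geometry and linked by an identifier.
attributes = {"park": {"name": "City Park", "land_use": "recreation"}}
```

With these sample entities, chain_length(road) gives 8.0 and polygon_area(park) gives 12.0. Adding more points to the road would follow a curved path more closely, which is the point-selection problem listed above.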

Page 8: GIS Data Modelling and Management

TYPES OF RASTER GIS MODELS

1. GRID Model
2. IMGRID Model
3. MAP (Map Analysis Package) Model

Page 9: GIS Data Modelling and Management

COMPACT RASTER DATA MODELS

• Compact data reduces the information content to the absolute minimum.

• Compact data is needed for efficient storage and faster retrieval.

• Based on the nature of GIS data and the available facilities, the compact methods are grouped as:

1. Run-Length Coding
2. Raster Chain Codes
3. Block Codes
4. Quadtrees

Page 10: GIS Data Modelling and Management

COMPACT RASTER DATA MODELS (Contd..)

RUN LENGTH CODES
• Each grid cell has a numerical value corresponding to a category of data.
• If there are 500 * 500 grid cells, then 250000 numbers have to be typed.
• There are long strings of the same number in each row. Such a long string is called a run. Every run has some length, which is used for compactness: each run is recorded as (R, N).
• Its disadvantage is that it works on a row-by-row basis, so it is tedious.

RASTER CHAIN CODES
• This method of data reduction works by defining the boundary of the entity.
• Here the directions are represented by numbers to avoid mistakes (0 is North, 1 is East, 2 is South, 3 is West).
• The method of storing data is based on (X, Y, N, D), where (X, Y) is the start point, N is the number of cells and D is the direction.
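
Both schemes on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: runs are stored here as (value, run_length) pairs, which may not match the slide's (R, N) ordering, and the chain code assumes that y increases toward North. The function names are hypothetical.

```python
def run_length_encode(row):
    """Compress one raster row into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((value, 1))               # start a new run
    return runs

def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, length in runs for _ in range(length)]

# Direction codes from the slide: 0 North, 1 East, 2 South, 3 West.
# Assumption: x grows toward East and y grows toward North.
STEPS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}

def trace_chain(start, segments):
    """Follow (N, D) segments from the start cell and list the cells visited."""
    x, y = start
    cells = [(x, y)]
    for n, d in segments:
        dx, dy = STEPS[d]
        for _ in range(n):
            x, y = x + dx, y + dy
            cells.append((x, y))
    return cells
```

For example, the row [0, 0, 0, 1, 1, 0] encodes to [(0, 3), (1, 2), (0, 1)], and the chain starting at (0, 0) with segments [(2, 1), (2, 0), (2, 3), (2, 2)] walks a square boundary and ends back at (0, 0).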

Page 11: GIS Data Modelling and Management

COMPACT RASTER DATA MODELS (Contd..)

BLOCK CODES
• A modified run-length code: it selects a square group of cells, assigns a starting point (the centre or a corner), picks a grid cell value and tells the computer how wide the square of grid cells is, based on the number of cells.
• An effective method of reducing the storage space for most thematically layered digital data in GIS.

QUADTREES
• It is a difficult approach which works on a square group of cells.
• The map is successively divided into uniform square groups of grid cells with the same attribute value.
• The map is divided into 4 quadrants: NW, NE, SW, SE.
• This method is only possible with the raster data model and is quite innovative because it uses recursion and divides the image into quads or quarters down to the smallest unit cell.
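
The recursive subdivision described for quadtrees can be sketched as below. This is a minimal illustration, assuming a square grid whose side is a power of two; the function names, the dictionary-based node layout and the sample land_cover grid are hypothetical.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a single value for a uniform block, else a dict of four quadrants."""
    if size is None:
        size = len(grid)  # assumes a square grid whose side is a power of two
    first = grid[row][col]
    uniform = all(grid[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform:
        return first      # leaf: the whole block shares one attribute value
    half = size // 2
    return {
        "NW": build_quadtree(grid, row, col, half),
        "NE": build_quadtree(grid, row, col + half, half),
        "SW": build_quadtree(grid, row + half, col, half),
        "SE": build_quadtree(grid, row + half, col + half, half),
    }

def count_leaves(node):
    """Count the stored blocks (leaves) in the tree."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

land_cover = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(land_cover)
```

For this sample grid only the SE quadrant has to be split again, so the tree holds 7 leaves instead of 16 individual cells.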

Page 12: GIS Data Modelling and Management

TYPES OF VECTOR GIS MODELS & COMPACT MODELS

1. Spaghetti model
2. Topological Models (GBF / DIME, TIGER & POLYVRT)
3. Shape file

Compact Models:

1. Galton’s Model
2. Freeman-Huffman Chain Codes

Page 13: GIS Data Modelling and Management

COMPARISON OF DATA MODELS

Parameter                        RASTER                              VECTOR
1. Data Structure                Simple                              Complex
2. Data Structure Compactness    Lesser                              More
3. Overlay Operations            Easily & efficiently implemented    More difficult to implement
4. High Spatial Variability      Efficiently represented             Inefficient
5. Topological Relationships     More difficult to represent         Efficient encoding of topology
6. Graphical Output              Less aesthetically pleasing         Better suited
7. Base                          Location-based                      Object-based

Page 14: GIS Data Modelling and Management

DATABASE MANAGEMENT SYSTEM (DBMS)

A DBMS is software to control the storage, retrieval and modification of data in a database.

It is designed for:
• File handling & management
• Record maintenance
• Extraction of information from data (Queries)
• Maintenance of data security and integrity
• Application building (Reports)

Page 15: GIS Data Modelling and Management

DBMS APPLICATIONS

• Travel agency system
• Banking system
• Library management system
• Railway reservation system
• Student admission system
• Financial accounting system, etc.

Page 16: GIS Data Modelling and Management

FUNCTIONS OF DBMS

• Security: It refers to the protection of data against accidental or intentional disclosure to unauthorized persons, and protection against unauthorized access, modification or destruction of the database.

• Integrity: It is the ability to protect data from system problems through a variety of assurance measures like range checking, backup and recovery.

• Synchronization: It refers to forms of protection against inconsistencies that can result from multiple simultaneous users.

Page 17: GIS Data Modelling and Management

FUNCTIONS OF DBMS (Contd..)

• Physical data independence: It means the underlying data storage & manipulation hardware should not matter to the user.

• Minimization of redundancy: Redundancy is generally not advisable in a database, and storing and manipulating the dependencies increases the difficulty of working with data. So a DBMS uses normalization.

• Efficiency: The efficiency of data retrieval operations mainly depends on the volume of data, the method of data encoding, the design of database structures and the complexity of the query.

Page 18: GIS Data Modelling and Management

COMPONENTS OF DBMS

1. Data definition
2. Storage definition
3. Database administration
4. Data manipulation

• In data retrieval, a mapping must be made between high-level objects in the query language statement and the physical location of data on the storage device.

• A query compiler or optimizer is used to optimize the code so that retrieval performance is improved.

Page 19: GIS Data Modelling and Management

GIS DATA FILE MANAGEMENT

Following are the basic file structures used in GIS:

Simple List:
Records are placed in the order in which they are entered. The main advantage is that adding a record only requires appending it. The disadvantage is the lack of structure, which makes searching very inefficient.

Ordered Sequential Files:
It uses alphabetic characters. Data is arranged in recognizable sequences against which individual items can be compared. The normal search strategy is a sort of divide-and-conquer approach. It reduces the search time needed to get data.
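
The difference between searching a simple list and an ordered sequential file can be sketched as follows. The place names and helper functions are hypothetical; the ordered search uses Python's standard bisect module as one possible divide-and-conquer implementation.

```python
import bisect

# Hypothetical place-name records kept in alphabetical order.
names = ["Agra", "Delhi", "Mumbai", "Nagpur", "Pune"]

def linear_search(items, target):
    """Simple list: compare records one by one.

    Returns (index, comparisons made); index is -1 if the target is absent.
    """
    for comparisons, item in enumerate(items, start=1):
        if item == target:
            return comparisons - 1, comparisons
    return -1, len(items)

def binary_search(sorted_items, target):
    """Ordered file: halve the remaining search range on every comparison."""
    i = bisect.bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1
```

Finding "Pune" with linear_search takes 5 comparisons on this list, while the ordered search needs only about log2(n) steps, which matters far more on files with many thousands of records.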

Page 20: GIS Data Modelling and Management

GIS DATA FILE MANAGEMENT (Contd..)

Indexed Files:
These are superior to the rest of the methods. They are based on an index or code, and use a pointer to locate a record. This type of search has 3 requirements: first, it requires the search criteria beforehand; second, it requires recalculation of the index from the original data; third, sequential search methods are needed to obtain information.

Relative File:
These are like indexed files, but the index used is the record number.
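
A minimal sketch of the two access styles, assuming an in-memory list stands in for the data file. The parcel records, the build_index helper and the fetch functions are hypothetical names used only for illustration.

```python
# Hypothetical parcel records stored in the order they were entered.
records = [
    {"parcel_id": "P-17", "land_use": "forest"},
    {"parcel_id": "P-03", "land_use": "water"},
    {"parcel_id": "P-42", "land_use": "urban"},
]

def build_index(rows, key):
    """Indexed file: map each key value to a record number (the pointer)."""
    return {row[key]: number for number, row in enumerate(rows)}

# The index must be rebuilt from the original data whenever records change.
index = build_index(records, "parcel_id")

def fetch_by_key(parcel_id):
    """Follow the pointer stored in the index to reach the record."""
    return records[index[parcel_id]]

def fetch_by_number(record_number):
    """Relative file: the record number itself acts as the index."""
    return records[record_number]
```

Here fetch_by_key("P-42") reaches the third record directly through the index, and fetch_by_number(1) reaches the second record by its position alone.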

Page 21: GIS Data Modelling and Management

BUILDING GIS MODELS

Four options to build a GIS real-world model are:

LCGU (Least Common Geographical Units) based GIS: It integrates all pertinent spatial data records into a single set of all classes.

Layer-based GIS: Each layer reflects a different set of attributes. It is a series of thematic layers. GIS data is broken down into logical terrain units related to layers.

Feature-based GIS: It is a new approach where GIS features are stored as spatial or non-spatial data.

Object-oriented GIS: Features are not divided into layers, but grouped into classes and hierarchies of objects. The advantage is its reusability, but implementation is complex.

Page 22: GIS Data Modelling and Management

DATABASE MODELS

• An implementation issue is the integration of GIS with existing internal databases. Most of these databases are relational.

• Other models by which a real-world database model is built are the hierarchical and network database models.

• Almost all existing and most widely used GIS software, like ARC / INFO, is based on RDBMS.

• RDBMS is Relational DBMS; it is very easy to learn and well suited for ad hoc queries. A relational query language like SQL is very easy to learn.

• The three most popular data modelling approaches are record-based, object-based and object-relational, based on the ER Diagram.

Page 23: GIS Data Modelling and Management

HIERARCHICAL DATABASE MODEL

• When the data has a parent-child or one-to-many relation, it is called the hierarchical model.

• This model has many advantages like
- easy to understand,
- easy to update or expand,
- good for quick data retrieval.

• This model has many disadvantages like
- large index files to be maintained,
- certain attribute values are repeated, so redundancy increases, it occupies more storage space and data access becomes slow.

Page 24: GIS Data Modelling and Management

NETWORK DATABASE MODEL

• When the data has a many-to-many relationship, it is called the network systems model.

• This model has many advantages like
- more flexibility,
- avoids redundancy.

• This model has many disadvantages like
- overhead of pointers,
- complex system,
- more pointers, so more storage space.

Page 25: GIS Data Modelling and Management

RELATIONAL DATABASE MODEL

• It is a collection of tabular relations, each with a set of attributes.

• Data is stored as a set of rows called tuples, each consisting of values for each attribute.

• There are two schemas upon which the entire database depends. They are the relation schema and the database schema.

• Relation Schema - It is usually declared when the database is set up and does not change much during the life span of the system.

• Database Schema - It is a set of relation schemas for a relational database, together with some constraints.

• Primitive operations of relational algebra - Union, Difference, Intersection, Join etc.

Page 26: GIS Data Modelling and Management

RELATIONAL DATABASE MODEL (Contd..)

• Relational algebra provides a specific set of rules for the design and function of these systems.

• Relational join is a linking mechanism to match / relate data in one table to another.

• A single column or multiple columns can be used to define the search strategy; this search criterion is called the primary key.

• When a primary key in one table is related to a column in a second table, the column in the second table to which the primary key is linked is called a foreign key.

• In the process of relational joins, redundancy is often created. A set of rules called Normal Forms has been established to reduce it.
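
The primary key / foreign key link behind a relational join can be sketched in plain Python as follows. The hotels and occupancy tables, their column names and the join helper are hypothetical and only illustrate the idea; a real RDBMS performs this matching internally.

```python
# Hypothetical tables represented as lists of rows.
hotels = [  # primary key: hotel_id
    {"hotel_id": 1, "name": "Hotel Taj", "stars": 5},
    {"hotel_id": 2, "name": "City Lodge", "stars": 3},
]
occupancy = [  # foreign key: hotel_id refers to hotels.hotel_id
    {"tourist": "Anna", "hotel_id": 1, "nights": 3},
    {"tourist": "Ravi", "hotel_id": 2, "nights": 1},
    {"tourist": "Marc", "hotel_id": 1, "nights": 2},
]

def join(left, right, key):
    """Match every right-hand row to the left-hand row with the same key value."""
    by_key = {row[key]: row for row in left}      # look up rows by primary key
    return [{**by_key[row[key]], **row}           # merge the two matching rows
            for row in right if row[key] in by_key]

joined = join(hotels, occupancy, "hotel_id")
five_star_guests = [row["tourist"] for row in joined if row["stars"] == 5]
```

In the joined result the name "Hotel Taj" appears once for every guest who stayed there. This repetition is the kind of redundancy that keeping the data in separate, normalized tables avoids.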

Page 27: GIS Data Modelling and Management

RELATIONAL DATABASE MODEL (Contd..)

There are THREE basic normal forms.

• 1st Normal Form - There should be only a single value in each row location.

• 2nd Normal Form - Every column that is not a primary key should be totally dependent on the primary key.

• 3rd Normal Form - Columns should depend on primary keys, but primary keys should not depend on any non-primary key.

There are more advanced normal forms available. They can be used to improve quality of the database,

Page 28: GIS Data Modelling and Management

STRUCTURED QUERY LANGUAGE (SQL)

• The tables stored in the database are queried using SQL, and the results represent virtual views.

• Queries may relate to one table, e.g. Which hotels in the city are five star? The answer to the query can be Hotel Taj.

• Queries may also relate to many tables, e.g. Which tourists originating from Europe stay more in five star hotels in the city? (The two tables involved may be Tourist and Occupancy.)

• Advantages of SQL - Completeness, Simplicity, Pseudo-English language style.

• SQL was not developed to handle geographical concepts like “near to”, “far from”, “connected to” etc.

• RDBMS software supporting SQL - ARC / INFO, ORACLE, Geovision.
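
The two example questions above can be sketched as SQL queries run through Python's built-in sqlite3 module. The table layouts, column names and sample rows are hypothetical, and a separate Hotel table is assumed here in addition to the Tourist and Occupancy tables named on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript("""
    CREATE TABLE Hotel (hotel_id INTEGER PRIMARY KEY, name TEXT, stars INTEGER);
    CREATE TABLE Tourist (tourist_id INTEGER PRIMARY KEY, name TEXT, origin TEXT);
    CREATE TABLE Occupancy (
        tourist_id INTEGER REFERENCES Tourist(tourist_id),
        hotel_id   INTEGER REFERENCES Hotel(hotel_id),
        nights     INTEGER
    );
    INSERT INTO Hotel VALUES (1, 'Hotel Taj', 5), (2, 'City Lodge', 3);
    INSERT INTO Tourist VALUES (1, 'Anna', 'Europe'), (2, 'Ravi', 'Asia'),
                               (3, 'Marc', 'Europe');
    INSERT INTO Occupancy VALUES (1, 1, 3), (2, 1, 2), (3, 2, 4);
""")

# Query on one table: which hotels are five star?
five_star = conn.execute("SELECT name FROM Hotel WHERE stars = 5").fetchall()

# Query across tables: nights spent by European tourists in five star hotels.
european_five_star = conn.execute("""
    SELECT t.name, SUM(o.nights)
    FROM Tourist AS t
    JOIN Occupancy AS o ON o.tourist_id = t.tourist_id
    JOIN Hotel AS h ON h.hotel_id = o.hotel_id
    WHERE t.origin = 'Europe' AND h.stars = 5
    GROUP BY t.name
    ORDER BY t.name
""").fetchall()
```

With this sample data the first query returns Hotel Taj and the second returns Anna with 3 nights. Note that nothing in either query expresses a spatial idea such as "near to"; plain SQL only compares stored attribute values.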

Page 29: GIS Data Modelling and Management

LOCATION BASED REPRESENTATION FOR SPATIO-TEMPORAL DATA

• It is a layered approach where each layer holds information about a single thematic domain at a single known time.

• Data is stored in terms of “snapshots”.

• Drawbacks:
1. Data volume is enormous.
2. Accessing data is a time-consuming process.
3. Individual changes w.r.t. cells can’t be determined.

TEMPORAL GRID APPROACH
• A variable-length list is associated with each pixel.
• Each entry records a change at that location with the new value and time (event history).
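
The temporal grid idea, one variable-length event list per pixel, can be sketched as below. The TemporalGrid class, its method names and the sample land-cover values are hypothetical; the sketch assumes that changes for a cell are recorded in increasing time order.

```python
import bisect

class TemporalGrid:
    """Each cell keeps a time-ordered list of (time, new_value) changes."""

    def __init__(self, rows, cols, initial, start_time=0):
        self.history = [[[(start_time, initial)] for _ in range(cols)]
                        for _ in range(rows)]

    def record_change(self, row, col, time, value):
        # Assumes changes arrive in increasing time order for each cell.
        self.history[row][col].append((time, value))

    def value_at(self, row, col, time):
        """Return the value the cell held at the given time."""
        events = self.history[row][col]
        times = [t for t, _ in events]
        i = bisect.bisect_right(times, time) - 1   # last change at or before time
        if i < 0:
            raise ValueError("time is earlier than the first recorded state")
        return events[i][1]

cover = TemporalGrid(2, 2, "forest")
cover.record_change(0, 1, 1995, "urban")
```

Only the cell that changed grows a longer list: cover.value_at(0, 1, 1990) is "forest", cover.value_at(0, 1, 2000) is "urban", and the other cells still hold a single entry. A full snapshot per time step would instead copy every cell each time.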

Page 30: GIS Data Modelling and Management

ENTITY BASED REPRESENTATION FOR SPATIO-TEMPORAL DATA

Also called: AMENDMENT VECTOR APPROACH

• It tracks the changes in the geometry of entities w.r.t. time.

• Changes are incrementally recorded (vectors).

• As time progresses, the number of amendment vectors grows, increasing complexity.
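
One simplified way to picture amendment vectors is a base geometry plus a list of dated vertex changes. This is only a sketch under that assumption: the boundary coordinates, the (time, vertex_index, new_point) layout and the boundary_at helper are hypothetical, and real systems record richer amendments than single moved vertices.

```python
# Base geometry of one entity plus incremental amendments.
# Each amendment is (time, vertex_index, new_point); this layout is assumed
# for illustration and the amendments are kept in time order.
base_boundary = [(0, 0), (4, 0), (4, 3), (0, 3)]
amendments = [
    (1998, 2, (5, 3)),
    (2004, 1, (5, 0)),
]

def boundary_at(time):
    """Rebuild the entity's boundary by replaying amendments up to the given time."""
    points = list(base_boundary)
    for when, vertex, new_point in amendments:
        if when <= time:
            points[vertex] = new_point
    return points
```

Every query has to replay the amendment list from the start, so the cost of reconstructing a boundary grows as amendments accumulate, which reflects the complexity issue noted above.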

Page 31: GIS Data Modelling and Management

TIME BASED REPRESENTATION FOR SPATIO-TEMPORAL DATA

• Time-based representations for spatio-temporal data use time as the organizational basis.

• With this type of time-based representation, the changes related to time are explicitly stored.

• This type of representation has the unique advantage of facilitating time-based queries.

• Adding new events as time progresses is also straightforward: they are simply added to the end of the timeline.
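
A time-ordered event list with a simple time-range query can be sketched as follows. The timeline structure, the add_event and events_between helpers and the sample events are hypothetical and serve only to illustrate appending to the end of the timeline and querying by time.

```python
import bisect

timeline = []  # (time, description of change), always kept in time order

def add_event(time, change):
    """Append a new event to the end of the timeline."""
    if timeline and time < timeline[-1][0]:
        raise ValueError("events must be added in time order")
    timeline.append((time, change))

def events_between(start, end):
    """Return all events with start <= time <= end."""
    times = [t for t, _ in timeline]
    lo = bisect.bisect_left(times, start)
    hi = bisect.bisect_right(times, end)
    return timeline[lo:hi]

add_event(1995, "parcel 7 changed from forest to urban")
add_event(2001, "new road segment added")
add_event(2008, "lake boundary extended")
```

Because the list is ordered by time, a query such as events_between(2000, 2010) only has to locate two positions in the timeline and return the two events between them.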

Page 32: GIS Data Modelling and Management

THANK YOU!