Post on 03-Jan-2016
Data Acquisition
Geog 469 GIS Workshop
Outline
• Data Acquisition
  – Acquiring spatial data
  – Metadata
  – Spatial data quality
  – Determining fitness-for-use of data
• Spatial Data Infrastructure (SDI)
  – Concepts of SDI
  – Characterizing SDI
Part I. Data Acquisition
Evaluating the applicability of data is one of the essential skills for GIS professionals
Acquiring spatial data
• Use a data download service
  – USGS National Map Seamless Data Distribution System http://seamless.usgs.gov
  – USGS EROS Data Center http://eros.usgs.gov/
  – Microsoft’s Terraserver http://terraserver.microsoft.com/
  – TIGER/Line by Census Bureau or ESRI
    • http://www.census.gov/geo/www/tiger/tiger2002/tgr2002.html
    • http://www.esri.com/data/download/census2000_tigerline/index.html
  – Subnational GIS clearinghouse such as WAGDA
    • http://wagda.lib.washington.edu/
    • Read http://courses.washington.edu/geog360a/dataatlibs2007.ppt if you’re not familiar with the UW library system
• Use a data catalog service (or spatial portal)
  – Geospatial One-Stop http://www.geodata.gov
  – ESRI Geography Network http://www.geographynetwork.com/
Tips for spatial & non-spatial data acquisition
• By geographic scale
  – Data resolution is often related to the geographic scale of the agency providing the data
  – Federal data sources have lower resolution with wider geographic coverage (e.g. LU/LC in EROS Data Center)
  – Parcel data can be found at the local level (e.g. City of Seattle)
• By related agency and organization
  – Best data about housing can be found in HUD…
  – Best data about transportation can be found in BTS…
  – Best data about education can be found in NCES…
  – Best data about justice can be found in BJS…
• By theme
  – Talk to resource persons in the area; they have probably already gone through the data search process
Metadata
• Describes characteristics of data, including content
• Helps determine fitness for use
  – Is the data suitable for the application?
• Is metadata always available?
  – No, but widely shared data is more likely to be published with metadata (e.g. USGS public domain data)
• What if metadata is not available?
  – At least look for a data dictionary, or contact the persons in charge
• Metadata standard for public data in the U.S.
  – FGDC metadata content standard (www.fgdc.gov)
Reading FGDC metadata
Want to know…?                      Section in FGDC metadata
Map scale or resolution             Data Quality – Lineage
How current?                        Identification – Time Period
Which area is covered?              Identification – Spatial Domain
How is data processed?              Data Quality – Lineage
How accurate?                       Data Quality – Accuracy
Datum, map projection               Spatial Reference
Data structure {vector, raster}     Spatial Data Organization
Attributes                          Entity and Attribute
Never miss reading abstract and purpose!
Example: http://wa-node.gis.washington.edu/~uwlib/10mdem.html
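The lookup table above can be turned into a small script when the metadata is available as FGDC CSDGM XML. The sketch below is only an illustration: the sample record and its values are invented, and the `QUESTIONS` mapping and `summarize` helper are hypothetical names. It uses the standard short tag names of the CSDGM XML encoding (for example `idinfo`, `dataqual`, `spdoinfo`); real records are much longer and may leave sections out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed FGDC CSDGM record; all values are invented.
SAMPLE = """<metadata>
  <idinfo>
    <descript>
      <abstract>10-meter digital elevation model tiles.</abstract>
      <purpose>Terrain analysis and visualization.</purpose>
    </descript>
    <timeperd><timeinfo><sngdate><caldate>2002</caldate></sngdate></timeinfo></timeperd>
    <spdom><bounding>
      <westbc>-124.8</westbc><eastbc>-116.9</eastbc>
      <northbc>49.0</northbc><southbc>45.5</southbc>
    </bounding></spdom>
  </idinfo>
  <dataqual><lineage><srcinfo><srcscale>24000</srcscale></srcinfo></lineage></dataqual>
  <spdoinfo><direct>Raster</direct></spdoinfo>
</metadata>"""

# "Want to know...?" question -> element path relative to <metadata>.
QUESTIONS = {
    "abstract": "idinfo/descript/abstract",
    "purpose": "idinfo/descript/purpose",
    "how_current": "idinfo/timeperd/timeinfo/sngdate/caldate",
    "west": "idinfo/spdom/bounding/westbc",
    "east": "idinfo/spdom/bounding/eastbc",
    "north": "idinfo/spdom/bounding/northbc",
    "south": "idinfo/spdom/bounding/southbc",
    "source_scale": "dataqual/lineage/srcinfo/srcscale",
    "structure": "spdoinfo/direct",
    "attribute_overview": "eainfo/overview/eaover",
}

def summarize(xml_text):
    """Return the text of each element of interest, or None if it is absent."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path) for name, path in QUESTIONS.items()}

summary = summarize(SAMPLE)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A `None` in the summary is itself informative: in this sample the Entity and Attribute section is missing, which is a cue to look for a data dictionary or contact the data provider.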
Creating metadata
• How do I create metadata?
  – Use a metadata creation/editing tool
    • ArcCatalog from ESRI
    • tkme from http://geology.usgs.gov/tools/metadata/tools/doc/tkme.html
• How do I check whether metadata conforms to the FGDC Content Standard?
  – Use a metadata validation tool
    • Install the program mp from http://geology.usgs.gov/tools/metadata/tools/doc/mp.html
    • Use the web service at http://geo-nsdi.er.usgs.gov/validate.php
Spatial data quality
Data quality matrix
  – Columns: components of geographic information (Space, Time, Attribute)
  – Rows: components of data quality (Accuracy, Consistency, Completeness)
• Accuracy: lack of discrepancy between a measurement and the value considered true (e.g. is this location near its true value?)
• Consistency: whether given components conform to logical rules (e.g. are there any digitizing errors?)
• Completeness: whether what’s required is encoded in the data (e.g. is anything missing?)
Matrix entries (FGDC metadata terms)
  – Accuracy: Positional accuracy, Attribute accuracy
  – Consistency: Logical consistency
  – Completeness: Completeness, Completeness
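Positional accuracy is commonly summarized as a root-mean-square error (RMSE) between coordinates in the dataset and the same points taken from a source considered true. The sketch below shows that calculation only; the function name `horizontal_rmse` and the check points are hypothetical, and the coordinates are assumed to be in a projected system with metre units.

```python
import math

def horizontal_rmse(measured, reference):
    """Root-mean-square horizontal error between paired (x, y) points."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need equally sized, non-empty point lists")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical check points: dataset coordinates vs. surveyed "true" coordinates.
measured = [(100.0, 200.0), (303.0, 404.0)]
reference = [(100.0, 200.0), (300.0, 400.0)]

rmse = horizontal_rmse(measured, reference)
print(f"horizontal RMSE: {rmse:.2f} m")
```

Whether a given RMSE is acceptable depends on the error tolerance of the application, not on the number itself.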
How is spatial data quality related to fitness for use of data?
Determining fitness for use*
• Does the map scale or resolution of the data provide the level of detail required by the application?
  – Using a low-resolution satellite image for a street-level survey is not acceptable
  – Were any generalization algorithms used?
• Is the data current enough to support the needs identified in Stage 1?
  – Using outdated data to replace an old map is not acceptable
• Are specific characteristics of the data useful for the application?
  – Topology for routing operations
  – Multispectral imagery for land use detection
  – Non-planar representation for 3D visualization
• Are any processing steps linked to the usefulness of the data for specific applications?
  – Some processing steps have irreversible effects on data (e.g. unknown algorithm parameters)
*Questions in this lecture are not necessarily exhaustive
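The scale and currency questions above can be screened mechanically once the relevant values have been read from the metadata. The following is a minimal sketch under stated assumptions: `DatasetInfo`, `Requirement`, and `screen` are hypothetical names, and the thresholds are placeholders that an analyst would set for each application. It does not replace reading the lineage and processing notes.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DatasetInfo:
    scale_denominator: int   # e.g. 24000 for a 1:24,000 source
    last_updated: date

@dataclass
class Requirement:
    max_scale_denominator: int   # coarsest acceptable source scale
    max_age_years: float         # oldest acceptable data

def screen(dataset, requirement, today):
    """Return a list of reasons the dataset may not be fit for use."""
    problems = []
    # A larger denominator means a smaller (coarser) map scale.
    if dataset.scale_denominator > requirement.max_scale_denominator:
        problems.append("scale too coarse")
    age_years = (today - dataset.last_updated).days / 365.25
    if age_years > requirement.max_age_years:
        problems.append("data too old")
    return problems

# Hypothetical example: a 1:100,000 layer from 2000 checked against a
# street-level application that needs 1:24,000 or better, at most 5 years old.
issues = screen(
    DatasetInfo(scale_denominator=100000, last_updated=date(2000, 1, 1)),
    Requirement(max_scale_denominator=24000, max_age_years=5),
    today=date(2010, 1, 1),
)
print(issues)
```

An empty list only means the dataset passed these two checks; the remaining questions on the slide still need a human reading of the metadata.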
Determining fitness for use
• Is the stated level of accuracy sufficient given the error tolerance?
  – Requirements for accuracy vary widely by application
  – Required types of accuracy vary by need-to-know or research question (e.g. measuring parcel size requires relative accuracy, while surveying requires absolute accuracy)
• Is the stated level of completeness of features or attributes adequate for the need-to-know question?
  – Some entities and attributes are required rather than optional
• Is the data logically consistent?
  – Does the data fail to conform to logical rules? (e.g. are identifiers generated properly? Does the data have too many slivers?)
  – Does the metadata indicate that the agency put any effort into quality control? (e.g. is information missing from the data quality section?)
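Two of the checks above, identifier consistency and attribute completeness, are easy to sketch against an attribute table. The example below assumes the table has already been loaded as a list of dictionaries; the function names, field names, and parcel records are all hypothetical. Detecting sliver polygons needs geometry and is not covered here.

```python
from collections import Counter

def duplicate_ids(records, id_field="id"):
    """Return identifier values that occur more than once (a consistency problem)."""
    counts = Counter(rec.get(id_field) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1 and key is not None)

def missing_required(records, required):
    """Map record index -> required fields that are absent or empty (a completeness problem)."""
    report = {}
    for index, rec in enumerate(records):
        missing = [field for field in required if rec.get(field) in (None, "")]
        if missing:
            report[index] = missing
    return report

# Hypothetical parcel attribute records.
parcels = [
    {"id": 1, "zoning": "R1"},
    {"id": 2, "zoning": ""},
    {"id": 2, "zoning": "C1"},
]

dupes = duplicate_ids(parcels)
gaps = missing_required(parcels, required=["id", "zoning"])
print("duplicate ids:", dupes)
print("missing required attributes:", gaps)
```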
Part II. Spatial Data Infrastructure
Searching for the day we experience less pain in data acquisition
Role of geographic information
• Information about data use shows that 80% of government-related activities require locational information
• Business demand exists to analyze customers’ needs on a locational basis
• Understanding the complexity of human and natural environment interaction is a major concern
• Sustainability has been widely acknowledged as a future agenda across varying organizational structures
Spatial data integration
Thematic integration
Spatial integration
Locational framework acts as integrating mechanism
Data Futures…
• Imagine the future when information is extracted from data upon request
• In the future, data is right there, and different data are integrated in a seamless manner so that value-added products can be generated in a timely fashion
• What are barriers to getting there? Are we getting there? What are steps towards making the best use of spatial data?
Spatial data sharing challenges
• What if there’s no metadata for a dataset?
• What if there are no people who know the characteristics and constraints attached to spatial data?
• What if there’s no website for data dissemination?
• What if there are no standards that promote interoperability (e.g. FGDC metadata content standard)?
• What if there’s no coordination between agencies?
• What if there’s no willingness to share data?
Spatial data as infrastructure
• What is a data infrastructure, e.g. gas, electric, highway?
• Spatial data together with the means to create, maintain, extract, and disseminate spatial data.
• SDI = spatial data + people + technology + standards + policy
• SDIs provide an enabling environment that facilitates communication
• Because SDIs are dynamic and incremental (user-driven, with successive developments built on top of the existing infrastructure), measuring the benefit of an SDI is not straightforward
Characterizing SDI
• SDIs are shared
  – They seek to make expensive, geo-referenced spatial data digitally available to a variety of users for diverse application needs (for example, biodiversity, utilities, and health) based on an integrated approach.
• SDIs are open
  – No pre-defined boundaries limit the user groups; typically various government departments, citizens, and the private sector are expected to draw upon them.
• SDIs are inherently enabling
  – They are not pre-configured to a particular application and can potentially be used by different entities to design their own applications.
Groot and McLaughlin 2000
Spatial data as commodity
• Spatial data infrastructure provides an enabling environment for a spatially enabled society
  – Geographic information is widely used to support decision-making
• Seen as assets promoting
  – good governance
  – economic development
  – improved environmental sustainability
• Seen as a push towards an information society
• Access to applicable spatial data is essential to this endeavor
US SDI
• Spatial Data Infrastructure, e.g. three levels
• 12 Federal Agencies – geoplatform.gov
• 50 States (National States Geographic Information Council)
• Regional (e.g. Washington State Geospatial Data Archive)
Data.gov
US Federal SDI – current architecture
[Figure: Data.gov Geospatial Capability – Without Geo Platform. Row labels: External Users, Infrastructure, Programs. Column labels: Discovery, Use, Suppliers; Data Management, Systems Management, Portfolio Management. Roles shown: Data Administrators, Developers, Users. Other labels: Data.gov; Aware of; Supply Data, Map Services, & Tools; Geo enabled Search and Visualization; Localized; Shareable; Limited Geo Download & Local Service Access; separate Server/Data and Server/Geodata boxes offering Data Download and Map Services.]
Source: Jerry Johnston, US EPA, presentation to NGAC “Status Update: Geospatial Platform” http://www.fgdc.gov/ngac/meetings/march-2011/intergovernmental-subcommittee-update.pptx
US Federal SDI – next architecture
[Figure: Data.gov Geospatial Capability – With Geo Platform. Row labels: External Users / IT Suppliers, Infrastructure, Programs. Column labels: Discovery, Use, Platform Supplier; Data Management, Systems Management, Portfolio Management. Qualities: Localized; Shareable / Scalable; Extensible / Accessible. Roles shown: Users, Developers, Platform Manager, Data Administrators, Portfolio & Investment Managers, IT Managers, CIO / GIO.]
• Data.gov Geospatial Platform: Platform Management, Support Services
  – SaaS: Geo Applications, Geo-Tools, Geo Services
  – PaaS: Software Components, Software Libraries, Middleware
  – IaaS: Computing Power, Data Stores
  – Catalog as a Service
• Service Management: Configuration Control, Security / Identity Mgmt, Release Management, Capacity Planning, Performance, Service Desk
• Contract Management Vehicles: Cloud Suppliers, Software Suppliers
• Service Management: Service Level Agreements, Cost Benefit Analysis, Product & Service Catalog
• Interactions labelled in the figure: Aware of; Supply Data; Supply Metadata; Coordinate IaaS; Develop in PaaS; Evaluate & Assess; Manage; Abide by Contracts & Service Agreements; Geo enabled Search and Visualization
• Access from SaaS:
  – Advanced Geo Analytical Functionality
  – Geo Map and Feature Services
  – Flexible Data Delivery
  – Image Processing
  – Geo-processing
  – Metadata Editing / Management
  – Coordinate Transforms, Mashup / Meshup
Source: Jerry Johnston, US EPA, presentation to NGAC “Status Update: Geospatial Platform” http://www.fgdc.gov/ngac/meetings/march-2011/intergovernmental-subcommittee-update.pptx
Regional SDI
Washington State Geospatial Data Archive (WAGDA) 1.0
• Data Access Services
• currently in development
Legend: ● supported, ○ unverified, (blank) not supported, (grey) not preferred
Access methods compared: Direct Connection, Geodata Service, Image Service, Web Feature Service/Web Coverage Service, Web Mapping Service, Geoportal
Function                        Support
Fast data view                  ● ○ ○
Remote data analysis            ● ● ● ●
Complete and ready metadata     ● ● ● ●
Geodatabase versions            ● ○ ○ ○
Exportable data                 ● ● ● ● ●
Interoperability                ● ● ● ●
Modifiable access permission    ● ● ● ● ● ○
Replication/Editing             ● ● ● ●
WAGDA 2.0 Architecture