Midterm
Multiple choice on scantron/bring #2 pencil
Major concepts more so than details
Reviewing LECTURES is key (PPT files)
background & extra in Chapters 1, 3-4, 9, 20 in
Longley et al.
Will not include
Web Sites of the Week (WSWs)
Labs
Learning Assessment/Practice Questions on
class web site
GIS Data Capture:
Getting the Map into the
Computer
Chapter 9, Longley et al.
Overview
Introduction
Primary data capture
Secondary data capture
Data transfer
Capturing attribute data
Managing a data capture project
Error and accuracy
Data Collection
Can be most expensive GIS activity
Many diverse sources
Two broad types of collection
Data capture (direct collection)
Data transfer
Two broad capture methods
Primary (direct measurement)
Secondary (indirect derivation)
Data Collection Techniques
                Field/Raster                     Object/Vector
Primary         Digital remote sensing images    GPS measurements (including VGI)
                Digital aerial photographs       Survey measurements
Secondary       Scanned maps                     Topographic surveys
                DEMs from maps                   Toponymy data sets from atlases
Stages in Data Collection Projects
Planning
Preparation
Collection / Transfer
Editing / Improvement
Evaluation
Primary Data Capture
Capture specifically for GIS use
Raster – remote sensing
e.g., SPOT and IKONOS satellites and aerial
photography, echosounding at sea
Passive and active sensors
Resolution is key consideration
Spatial
Spectral, Acoustic
Temporal
Vector Primary Data Capture
Surveying
Locations of objects determined by angle and
distance measurements from known locations
Uses expensive field equipment and crews
Most accurate method for large scale, small areas
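As a minimal sketch of how a survey fixes a location from a known point, the snippet below turns one angle and one distance into coordinates. It assumes a flat grid (e.g., UTM metres) and an azimuth measured clockwise from grid north; the function name and the station values are hypothetical.

```python
import math

def locate_from_station(easting, northing, azimuth_deg, distance_m):
    """Return (easting, northing) of a point observed from a known station.

    Assumption: azimuth_deg is measured clockwise from grid north and the
    distance is a horizontal distance in the same units as the coordinates.
    """
    azimuth = math.radians(azimuth_deg)
    return (easting + distance_m * math.sin(azimuth),
            northing + distance_m * math.cos(azimuth))

# Hypothetical station at (1000, 2000); target observed due east at 50 m
e, n = locate_from_station(1000.0, 2000.0, 90.0, 50.0)
print(round(e, 3), round(n, 3))
```

Real survey reductions also correct for slope distance, instrument height, and projection scale factor, which this sketch ignores.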
GPS
Collection of satellites used to fix actual
locations on Earth’s surface
Differential GPS used to improve accuracy
Total Station
GPS “Handhelds”
geographic coordinates
text
photos
audio
video
Bluetooth, WiFi
Cell towers: +/- 500 m (Google db of tower locations)
Wi-Fi: +/- 30 m (Skyhook servers and db)
GPS: +/- 10 m (iPhone uses reference network)
Graphic courtesy of Wired, Feb. 2009
“Power to the People:” VGI & PPGIS
“Volunteered Geographic Information”
Wikimapia.org
Openstreetmap.org
Aka “crowdsourcing”
“Public Participation GIS”
GEO 599, Fall 2007
Papers still online at
dusk.geo.orst.edu/virtual/
Example:
A Boon for International Development Agencies
Robert Soden, www.developmentseed.org
Kinshasa, Democratic Republic of Congo
International Development, Humanitarian Relief
Robert Soden, www.developmentseed.org
Mogadishu, Somalia
Haiti Disaster, MapAction.org
UCLA Center for Embedded Networked Sensing, http://peir.cens.ucla.edu
“Citizen
Sensors”
Societal Issues
(privacy, surveillance, ethics)
e.g., Google StreetView
Google Maps Mania Blog
Early and late May 2008
More surveillance
(electronic, video, biological,
chemical)
integrated into national system
From Chris Peterson, Foresight
Institute
As presented at OSCON 2008,
Portland
Graphic: Gina Miller
Sewer monitoring has
begun
“The test doesn’t screen people directly but
instead seeks out evidence of illicit drug
abuse in drug residues and metabolites
excreted in urine and flushed toward
municipal sewage treatment plants.”
From Chris Peterson, Foresight Institute
As presented at OSCON 2008, Portland
Secondary Geographic Data Capture
Data collected for other purposes,
then converted for use in GIS
Raster conversion
Scanning of maps, aerial photographs,
documents, etc.
Important scanning parameters are
spatial and spectral (bit depth)
resolution
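A rough sketch of how the two scanning parameters play out in practice: spatial resolution (dots per inch) sets the ground size of each pixel once the map scale is known, and bit depth sets the storage cost. The function names and the example sheet size are assumptions for illustration only.

```python
MM_PER_INCH = 25.4

def ground_pixel_size_m(dpi, scale_denominator):
    """Ground distance (metres) covered by one scanned pixel."""
    pixel_mm = MM_PER_INCH / dpi            # pixel size on the paper map
    return pixel_mm * scale_denominator / 1000.0

def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed raster size for a scanned sheet."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

# Hypothetical example: a 1:24,000 sheet scanned at 400 dpi
print(ground_pixel_size_m(400, 24_000))          # metres per pixel
print(uncompressed_size_bytes(10, 10, 400, 8))   # 8-bit greyscale
print(uncompressed_size_bytes(10, 10, 400, 1))   # 1-bit line art
```

Doubling the dpi halves the pixel size on the ground but quadruples the pixel count, so the scan resolution is usually matched to the finest line work on the source map.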
Scanner
Vector Secondary Data Capture
Collection of vector objects from maps,
photographs, plans, etc.
Photogrammetry – the science and
technology of making measurements from
photographs, etc.
Digitizing
Manual (table)
Heads-up and vectorization
Digitizer
GEOCODING
spatial information ---> digital form
capturing the map (digitizing,
scanning)
sometimes also capturing the
attributes
“mapematical” calculation, e.g.,
address matching
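One common way address matching is done is linear interpolation along a street segment with a known address range. The sketch below assumes that approach; the slide does not say which method is meant, and the function name, address range, and coordinates are hypothetical.

```python
def interpolate_address(house_number, from_number, to_number, start_xy, end_xy):
    """Estimate the position of a house number along a street segment.

    Assumption: house numbers increase evenly from start_xy to end_xy.
    """
    if from_number == to_number:
        fraction = 0.5  # single-address segment: place at the midpoint
    else:
        fraction = (house_number - from_number) / (to_number - from_number)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("house number is outside the segment's address range")
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return (x, y)

# Hypothetical segment covering numbers 100-200 from (0, 0) to (100, 0)
print(interpolate_address(150, 100, 200, (0.0, 0.0), (100.0, 0.0)))
```

Because real house numbers are rarely spaced evenly, positions produced this way are estimates and carry their own positional error.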
WSW
The Role of Error
Map and attribute data errors are the data
producer's responsibility, but the
GIS user must understand error.
Accuracy and precision of map and attribute
data in a GIS affect all other operations,
especially when maps are compared across
scales.
Accuracy
closeness to TRUE values
results, computations, or estimates
compromise on “infinite complexity”
generalization of the real world
difficult to identify a TRUE value
e.g., accuracy of a contour
Does not exist in real world
Compare to other sources
Accuracy (cont.)
accuracy of the database = accuracy of the
products computed from database
e.g., accuracy of a slope, aspect, or
watershed computed from a DEM
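To illustrate how database accuracy carries into derived products, the sketch below computes a slope between two neighbouring DEM cells and the worst-case range of that slope when each elevation may be off by a stated vertical error. The elevations, cell spacing, and error value are hypothetical, and the worst-case bound is a simplification rather than a full error-propagation model.

```python
import math

def slope_deg(z1, z2, spacing_m):
    """Slope in degrees between two elevations a known distance apart."""
    return math.degrees(math.atan2(z2 - z1, spacing_m))

def slope_range_deg(z1, z2, spacing_m, vertical_error_m):
    """Worst-case (low, high) slope if each elevation is off by up to the error."""
    low = slope_deg(z1 + vertical_error_m, z2 - vertical_error_m, spacing_m)
    high = slope_deg(z1 - vertical_error_m, z2 + vertical_error_m, spacing_m)
    return low, high

# Hypothetical cells 30 m apart with elevations 100 m and 103 m, +/- 1 m error
nominal = slope_deg(100.0, 103.0, 30.0)
low, high = slope_range_deg(100.0, 103.0, 30.0, 1.0)
print(f"nominal {nominal:.2f} deg, range {low:.2f} to {high:.2f} deg")
```

Even a modest vertical error produces a wide slope range over a short cell spacing, which is why products derived from a DEM inherit its accuracy.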
Positional Accuracy
typical UTM coordinate pair might be:
Easting 579124.349 m
Northing 5194732.247 m
If the database was digitized from a
1:24,000 map sheet, the last four digits in
each coordinate (units, tenths, hundredths,
thousandths) would be questionable
Positional Accuracy
Map scale       Ground distance corresponding to 0.5 mm map distance
1:1250          62.5 cm
1:2500          1.25 m
1:5000          2.5 m
1:10,000        5 m
1:24,000        12 m
1:50,000        25 m
1:100,000       50 m
1:250,000       125 m
1:1,000,000     500 m
1:10,000,000    5 km
A useful rule of thumb is that positions measured from maps are
accurate to about 0.5 mm on the map. Multiplying this by the scale of
the map gives the corresponding distance on the ground.
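The 0.5 mm rule of thumb can be sketched as a one-line calculation. The function name is an assumption; the scales listed are the ones quoted in the lecture.

```python
MAP_ERROR_MM = 0.5  # rule-of-thumb positional error measured on the map sheet

def ground_error_m(scale_denominator, map_error_mm=MAP_ERROR_MM):
    """Ground distance (metres) corresponding to map_error_mm at a given scale."""
    return map_error_mm * scale_denominator / 1000.0

for denom in (1250, 2500, 5000, 10_000, 24_000, 50_000,
              100_000, 250_000, 1_000_000, 10_000_000):
    print(f"1:{denom:,}  {ground_error_m(denom):g} m")
```

For a database digitized from a 1:24,000 sheet this gives about 12 m, so coordinate digits finer than that should not be read as real positional information.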
Testing Positional Accuracy
Use an independent source of higher
accuracy:
find a larger scale map (cartographically speaking)
use GPS
Use internal evidence:
digitized polygons that are unclosed, lines
that overshoot or undershoot nodes, etc.
are indications of inaccuracy
sizes of gaps, overshoots, etc. may be a
measure of positional accuracy
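Both kinds of test can be sketched in a few lines. The first function compares digitized points against matching check points from a higher-accuracy source (for example GPS) using root-mean-square error, which is one common summary and an assumption here; the second measures the gap left by an unclosed polygon. All names and coordinates are hypothetical.

```python
import math

def rmse(digitized, reference):
    """Root-mean-square positional error between matched (x, y) point pairs."""
    if len(digitized) != len(reference) or not digitized:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(dx - rx) ** 2 + (dy - ry) ** 2
               for (dx, dy), (rx, ry) in zip(digitized, reference)]
    return math.sqrt(sum(squared) / len(squared))

def closure_gap(ring):
    """Distance between the first and last vertex of a digitized polygon ring."""
    (x0, y0), (x1, y1) = ring[0], ring[-1]
    return math.hypot(x1 - x0, y1 - y0)

# Hypothetical digitized points vs. GPS check points (metres)
print(rmse([(0.0, 0.0), (10.0, 10.0)], [(3.0, 4.0), (10.0, 10.0)]))

# Hypothetical ring whose last vertex misses the first one
print(closure_gap([(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.3, 0.4)]))
```

A closure gap of zero only shows the ring was snapped shut; it does not by itself prove the polygon is in the right place.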
Precision
not the same as accuracy!
repeatability vs. “truth”
not closeness of results, but number of
decimal places or significant digits in a
measurement
A GIS works at high precision, usually much
higher than the accuracy of the data
themselves
Accuracy vs. Precision
High Accuracy, Low Precision:
Darts are near the bullseye (the "true value"), but they are
scattered rather than tightly clustered (not reproducible).
Low Accuracy, High Precision:
Many darts in reproducible clusters,
but not in the bullseye.
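The dartboard picture can be put into numbers with repeated measurements of a quantity whose true value is known. In this sketch, accuracy is summarized by how far the mean sits from the true value and precision by the spread of the repeats; treating spread as a stand-in for repeatability is an assumption, and the measurement values are made up.

```python
import statistics

def accuracy_and_precision(measurements, true_value):
    """Return (bias, spread): small bias = accurate, small spread = precise."""
    bias = abs(statistics.fmean(measurements) - true_value)
    spread = statistics.pstdev(measurements)
    return bias, spread

TRUE_VALUE = 100.0  # hypothetical "bullseye"

# Tight cluster away from the bullseye: low accuracy, high precision
print(accuracy_and_precision([105.1, 105.0, 104.9, 105.0], TRUE_VALUE))

# Scattered around the bullseye: high accuracy, low precision
print(accuracy_and_precision([95.0, 105.0, 98.0, 102.0], TRUE_VALUE))
```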
Components of Data Quality
positional accuracy
attribute accuracy
logical consistency
completeness
lineage