These materials were developed by Kenneth E. Foote and Donald
Huebner, Department of Geography, University of Texas at Austin,
These materials may be used for study, research, and education
applications. If you link to or cite these materials,
the authors, Kenneth E. Foote and Donald J. Huebner, The
Project, Department of Geography, The University of Colorado at
These materials may not be copied to or issued from another Web
without the authors' express permission. Copyright ©
commercial rights are reserved. If you have comments or
please contact the author or Ken Foote at firstname.lastname@example.org.
1. GIS as Representations of Reality
Perhaps we should use the acronym gIs, rather
than GIS for geographic information
systems. These are really geographic INFORMATION systems. It is the
information they contain that makes them so valuable.
The database is also important because its creation will often account
for up to three-quarters of the time and effort involved in developing
a geographic information system. Once an organization compiles its data,
the database may be maintained for ten to fifty years. For this reason,
shortcuts are not recommended.
It is important, however, to view these GIS databases as more
than simple stores of information. The database is used to abstract
specific sorts of information about reality and organize it in a way that
will prove useful. The database should be viewed as a representation or
model of the world developed for a very specific application.
One of the reasons that there are so many software and hardware
systems employed for GIS is that each system allows users to represent
and model certain types of phenomena.
2. Basic Types of Representation: Raster and Vector
One of the sharpest distinctions among GIS is the way that location is
represented in a database, as either a raster or vector position.
2.1. The Raster View of the World
A raster based system displays, locates, and stores graphical data using
a matrix or grid of cells. A unique reference coordinate represents each
pixel either at a corner or the centroid. In turn each cell or pixel has
discrete attribute data assigned to it. Raster data resolution is dependent
on the pixel or grid size and may vary from sub-meter to many kilometers.
Because these data are two-dimensional, GISs store various information
such as forest cover, soil type, land use, wetland habitat, or other data
in different layers. Layers are functionally related map features.
Raster data requires less processing than vector data, but it requires
more computer storage space. Scanning remote sensors on satellites store
data in raster format. Digital terrain models (DTM) and digital elevation
models (DEM) are examples of raster data (Koeln et al. 1994).
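As a rough illustration of the raster model described above, the following Python sketch stores two layers as matrices of the same shape and converts between cell indices and map coordinates. The cell size, grid origin, layer names, and attribute values are all hypothetical and chosen only for the example.

```python
# Hypothetical 3 x 4 study area. Cell size and origin are assumed values.
CELL_SIZE = 30.0                             # map units per cell side
ORIGIN_X, ORIGIN_Y = 500000.0, 4200000.0     # upper-left corner of the grid

# Each layer is a matrix of the same shape with one attribute value per cell.
land_use = [
    ["forest", "forest", "urban", "water"],
    ["forest", "urban", "urban", "water"],
    ["wetland", "wetland", "urban", "water"],
]
soil_type = [
    ["loam", "loam", "clay", "silt"],
    ["loam", "clay", "clay", "silt"],
    ["peat", "peat", "clay", "silt"],
]


def cell_centroid(row, col):
    """Return the map coordinate of the centre of a cell."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE   # rows count downward from the top
    return x, y


def cell_at(x, y):
    """Return the (row, col) of the cell that contains a map coordinate."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    return row, col


def query(row, col):
    """Collect the attribute stored for one cell in every layer."""
    return {"land_use": land_use[row][col], "soil_type": soil_type[row][col]}
```

For instance, `query(*cell_at(500050.0, 4199950.0))` looks up the cell in row 1, column 1 of both layers. Halving `CELL_SIZE` would double the resolution but quadruple the number of stored cells, which is the storage trade-off the text mentions.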
2.2. The Vector View of the World
A vector based system displays graphical data as points, lines or arcs,
or areas with attributes. Cartesian coordinates (i.e., x and y)
and computational algorithms of the coordinates define points in a vector
system. Lines or arcs are a series of ordered points. Areas or polygons
are also stored as ordered lists of points, but by making the beginning
and end points the same node the shape is closed and defined. Topological
models define the connectivity of vector based systems. Vector systems
are capable of very high resolution (less than or equal to .001) and
graphical output is similar to hand-drawn maps. This system works well
with azimuths, distances, and points, but it requires complex data structures
and is less compatible with remote sensing data. Vector data requires less
computer storage space and maintaining topological relationships is easier
in this system. Digital line graphs (DLG) and TIGER files are examples
of vector data (Koeln et al. 1994; and Huxhold 1991).
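A minimal Python sketch of the vector model follows: a point is one coordinate pair, a line is an ordered list of points, and a polygon is an ordered list whose first and last points coincide. The feature names and coordinates are invented for the example, and the area calculation uses the standard shoelace formula, which the original text does not name.

```python
import math

# Hypothetical features in an arbitrary Cartesian coordinate system.
well = (2.0, 3.0)                                      # point
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]            # line: ordered points
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),
          (0.0, 3.0), (0.0, 0.0)]                      # polygon: closed ring


def is_closed(ring):
    """A polygon ring is closed when it starts and ends at the same point."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def line_length(line):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def polygon_area(ring):
    """Area of a closed ring by the shoelace formula."""
    if not is_closed(ring):
        raise ValueError("ring must start and end at the same point")
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0
```

Here `line_length(road)` and `polygon_area(parcel)` work directly on the stored coordinates, which is why distances and areas are natural operations in a vector system.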
2.3 Graphical Comparison of Raster and Vector Systems
It is important to stress that any given real world situation can be
represented in both raster and vector modes;
the choice is up to the user.
Each of these systems of representation has its advantages and
disadvantages.
Raster
Advantages:
- Simple data structure
- Compatible with remotely sensed or scanned data
- Simple spatial analysis procedures
Disadvantages:
- Requires greater storage space on computer
- Depending on pixel size, graphical output may be less pleasing
- Projection transformations are more difficult
- More difficult to represent topological relationships
Vector
Advantages:
- Requires less disk storage space
- Topological relationships are readily maintained
- Graphical output more closely resembles hand-drawn maps
Disadvantages:
- More complex data structure
- Not as compatible with remotely sensed data
- Software and hardware are often more expensive
- Some spatial analysis procedures may be more difficult
- Overlaying multiple vector maps is often time consuming
It should also be stressed that data modeled in one system can be converted
into the other. That is, raster data can be vectorized and vice versa.
Many systems even allow data modeled in raster form to be overlaid with vector
data and vice versa. In this graphical example,
an aerial photo (raster) is overlaid with supplemental vector data.
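One simple way to convert vector data to raster form can be sketched as follows: a cell is marked as covered when its centroid falls inside the polygon. This is only one of several possible rasterization rules, and the ray-casting test, the grid dimensions, and the polygon are assumptions made for the example rather than the method of any particular GIS.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count how many ring edges lie to the right of (x, y)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):                      # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, n_rows, n_cols, cell_size):
    """Return a grid of 1s and 0s; 1 where the cell centroid is inside the ring.

    The grid's lower-left corner is assumed to sit at the origin (0, 0),
    and row 0 is the top row.
    """
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size
        grid.append([
            1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
            for col in range(n_cols)
        ])
    return grid


# Hypothetical closed polygon (first point repeated at the end).
building = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0), (1.0, 1.0)]
grid = rasterize(building, n_rows=4, n_cols=4, cell_size=1.0)
```

The resulting `grid` has the same shape as any other raster layer, so it can be overlaid cell by cell. Centroids that fall exactly on a polygon boundary are ambiguous under this rule, which is one reason conversion between the two models can lose detail.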
3. Organizing Attribute Data
GIS use raster and vector representations to model location, but they
must also record information about the real-world phenomena positioned
at each location and the attributes of these phenomena. That is, the GIS
must provide a linkage between spatial and non-spatial data. These linkages
make the GIS "intelligent" insofar as the user can store and examine information
about where things are and what they are like.
The relationship can be diagrammed as a linkage between:
- A Locational Symbol <<< >>> Meaning
In a raster system, this symbol is a grid cell location in a matrix. In
a vector system, the locational symbol may be a zero-dimensional point;
a one-dimensional line, curve, boundary, or vector; or a two-dimensional
area, region, or polygon.
At the most abstract level, this is a relationship between:
Location <<< >>> What
Spatial Data <<< >>> Non-spatial Data
Geographic Features <<< >>> Attributes
The linkage between symbol and meaning is established by giving
every geographic feature at least one unique means of identification, a
name or number usually just called its ID. Non-spatial attributes of the
feature are then stored, usually in one or more separate files, under this
ID number. In other words, locational information is
linked to specific information in a database.
It is important to realize that this non-spatial data can be filed
away in several different forms depending on how it needs to be used and
accessed. Perhaps the simplest method is the flat file or spreadsheet,
where each geographic feature is matched to one row of data.
3.1 Flat Files and Spreadsheets
A flat file or spreadsheet is a simple
method for storing data. All records in this database have the same number
of "fields". Individual records have different data in each field with
one field serving as a key to locate a particular record. For example,
your social security number may be the key field in a record of your name,
address, phone number, sex, ethnicity, place of birth, date of birth, and
so on. For a person, or a tract of land, there could be hundreds of fields
associated with the record. When the number of fields becomes lengthy a
flat file is cumbersome to search. Also the key field is usually determined
by the programmer and searching by other determinants may be difficult
for the user. Although this type of database is simple in its structure,
expanding the number of fields usually entails reprogramming. Additionally,
adding new records is time consuming, particularly when there are numerous
fields. Other methods offer more flexibility and responsiveness in GIS.
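The flat-file idea can be sketched in a few lines of Python using the standard `csv` module. The field names and parcel records below are invented for illustration; the point is that a lookup by the key field is direct, while a search on any other field has to scan every record.

```python
import csv
import io

# Hypothetical flat file: one row per geographic feature, same fields in each.
FLAT_FILE = """parcel_id,owner,land_use,area_ha
P001,Garcia,residential,0.4
P002,Chen,commercial,1.2
P003,Okafor,residential,0.6
"""

records = list(csv.DictReader(io.StringIO(FLAT_FILE)))

# Index built on the key field chosen in advance (here, parcel_id).
by_key = {record["parcel_id"]: record for record in records}


def find_by_key(parcel_id):
    """Fast retrieval: go straight to the record through the key field."""
    return by_key.get(parcel_id)


def find_by_field(field, value):
    """Slow retrieval: without the key, every record must be examined."""
    return [record for record in records if record[field] == value]
```

`find_by_key("P002")` returns a single record immediately, whereas `find_by_field("land_use", "residential")` walks the whole file. Adding a new field would mean rewriting every row, which mirrors the reprogramming cost noted above.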
3.2 Hierarchical Files
Hierarchical files store data in
more than one type of record. This method is usually described as a "parent-child,
one-to-many" relationship. One field is key to all records, but data in
one record does not have to be repeated in another. This system allows
records with similar attributes to be associated together. The records
are linked to each other by a key field in a hierarchy of files. Each record,
except for the master record, has a higher level record file linked by
a key field "pointer". In other words, one record may lead to another and
so on in a relatively descending pattern. An advantage is that when the
relationship is clearly defined, and queries follow a standard routine,
a very efficient data structure results. The database is arranged according
to its use and needs. Access to different records is readily available,
or easy to deny to a user by not furnishing that particular file of the
database. One of the disadvantages is that one must access the master record,
with the key field determinant, in order to link "downward" to other records.
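A hedged sketch of the hierarchical arrangement is shown below, using nested Python dictionaries to stand in for linked record files. The county, tract, and parcel records are hypothetical. Note that the only way to reach a parcel is to start at the master record and follow the pointers downward.

```python
# Hypothetical hierarchy: one master record (county) with child tracts,
# each of which owns its own child parcels (one-to-many at every level).
county = {
    "id": "C1",
    "name": "Example County",
    "tracts": [
        {"id": "T1", "parcels": [
            {"id": "P001", "land_use": "residential"},
            {"id": "P002", "land_use": "residential"},
        ]},
        {"id": "T2", "parcels": [
            {"id": "P003", "land_use": "commercial"},
        ]},
    ],
}


def find_parcel(master, tract_id, parcel_id):
    """Follow the pointer path master -> tract -> parcel."""
    for tract in master["tracts"]:
        if tract["id"] == tract_id:
            for parcel in tract["parcels"]:
                if parcel["id"] == parcel_id:
                    return parcel
    return None
```

A call such as `find_parcel(county, "T2", "P003")` is efficient because the path is known in advance, but a parcel cannot be found without naming its parent tract, which illustrates how the pointer path restricts access.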
3.3 Relational Files
Relational files connect different
files or tables (relations) without using internal pointers or keys. Instead
a common link of data is used to join or associate records. The link is
not hierarchical. A "matrix of tables" is used to store the information.
As long as the tables have a common link they may be combined by the user
to form new inquiries and data output. This is the most flexible system
and is particularly suited to SQL (structured query language). Queries
are not limited by a hierarchy of files, but instead are based on relationships
from one type of record to another that the user establishes. Because of
its flexibility this system is the most popular database model for GIS.
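To make the relational idea concrete, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The two tables, their columns, and the sample rows are assumptions for illustration; the only thing connecting them is the shared `zone_code` value, which the query uses as the common link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# Two independent tables (relations) that share a zone_code column.
conn.executescript("""
CREATE TABLE parcels (parcel_id TEXT, zone_code TEXT, area_ha REAL);
CREATE TABLE zones   (zone_code TEXT, description TEXT);

INSERT INTO parcels VALUES ('P001', 'SF', 0.4);
INSERT INTO parcels VALUES ('P002', 'GO', 1.2);
INSERT INTO parcels VALUES ('P003', 'SF', 0.6);

INSERT INTO zones VALUES ('SF', 'Single Family');
INSERT INTO zones VALUES ('GO', 'General Office');
""")

# The join is decided at query time by the user, not built into the files.
rows = conn.execute(
    """
    SELECT p.parcel_id, z.description
    FROM parcels AS p
    JOIN zones AS z ON p.zone_code = z.zone_code
    WHERE z.description = ?
    ORDER BY p.parcel_id
    """,
    ("Single Family",),
).fetchall()
```

Because the link is expressed in the query rather than in the storage layout, a different question (for example, total area per zone) needs only a different SQL statement, not a restructured database.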
3.4 Flat, Hierarchical, and Relational
Flat File
Advantages:
- Fast data retrieval
- Simple structure and easy to program
Disadvantages:
- Difficult to process multiple values of a data item
- Adding new data categories requires reprogramming
- Slow data retrieval without the key
Hierarchical Files
Advantages:
- Adding and deleting records is easy
- Fast data retrieval through higher level records
- Multiple associations with like records in different files
Disadvantages:
- Pointer path restricts access
- Each association requires repetitive data in other records
- Pointers require large amount of computer storage
Relational Files
Advantages:
- Easy access and minimal technical training for users
- Flexibility for unforeseen inquiries
- Easy modification and addition of new relationships,
data, and records
- Physical storage of data can change without affecting
relationships between records
Disadvantages:
- New relations can require considerable processing
- Sequential access is slow
- Method of storage on disks impacts processing time
- Easy to make logical mistakes due to flexibility of
relationships between records
Now, let us consider a couple of examples of matching applications
- Exploratory research--flat files are easy to organize, space is
not a particular problem
- Government agencies--hierarchical systems are particularly
- Planning and development--relational might be justified for
4. Representing Relationships
GIS have the power to record more than location and simple attribute information.
In some situations, we will want to examine spatial relationships based
upon location, as well as functional and logical relationships among geographic features.
Functional Relationships among Geographic Features and Their Attributes
This includes information about how features are connected and interact
in real-life terms. A road network might be classified functionally from
the largest superhighway down to the most isolated rural road or
cul-de-sac based upon their role in the overall transportation system.
Minor roads and suburban streets "feed" major highways, but are not directly
connected to them. As another example in assessing wildlife habitat,
environmental conditions function together to define the optimal
environments for certain species. Within cities, ownership is a functional
classification of great importance as is landuse and zoning classification.
Spatial Relationships among Geographic Features and Their Attributes
- Absolute and relative location
- Distance between features
- Proximity of features
- Features in the "neighborhood" of other features
- Direction and movement from place to place
- Boolean relationships of "and," "or," "inside," "outside," etc.
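Several of the spatial relationships listed above (distance, proximity, neighborhood, and direction) reduce to simple coordinate arithmetic when features are stored as points. The Python sketch below assumes planar coordinates in arbitrary map units; the feature names, locations, and the 10-unit neighborhood radius are hypothetical.

```python
import math


def distance(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)


def is_near(a, b, threshold):
    """Proximity: is b within the threshold distance of a?"""
    return distance(a, b) <= threshold


def bearing(a, b):
    """Direction from a to b in degrees clockwise from north (the +y axis)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


# Hypothetical point features.
school = (0.0, 0.0)
features = {"park": (3.0, 4.0), "library": (0.0, 10.0), "mall": (30.0, 40.0)}

# Features in the "neighborhood" of the school (within 10 map units).
nearby = sorted(name for name, xy in features.items()
                if is_near(school, xy, 10.0))
```

Here `nearby` holds the names of features inside the chosen radius, and `bearing(school, features["library"])` gives the direction of travel toward the library. Real data in geographic (latitude and longitude) coordinates would need a different distance formula.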
Logical Relationships among Geographic Features and Their Attributes
Logical relationships involve "if-then" and "and-or" conditions that
exist among features stored in the dataset. For example, no land is
permitted to be zoned for residential use if it lies within a
flood plain. Development may be disallowed in the habitat of some endangered species.
Databases can be designed to represent, model, and store information
about these relationships as needed for particular applications.
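The flood plain example can be written directly as an "if-then" test combined with an "or" condition. In this Python sketch the parcel records and the attribute names `in_flood_plain` and `in_protected_habitat` are hypothetical; a real database would derive such flags from overlay operations.

```python
def residential_zoning_allowed(parcel):
    """IF the parcel lies in a flood plain OR in protected habitat,
    THEN residential zoning is not permitted."""
    if parcel["in_flood_plain"] or parcel["in_protected_habitat"]:
        return False
    return True


# Hypothetical parcel records with precomputed logical attributes.
parcels = [
    {"id": "P001", "in_flood_plain": False, "in_protected_habitat": False},
    {"id": "P002", "in_flood_plain": True, "in_protected_habitat": False},
    {"id": "P003", "in_flood_plain": False, "in_protected_habitat": True},
]

allowed = [p["id"] for p in parcels if residential_zoning_allowed(p)]
```

Only parcels that pass both conditions end up in `allowed`, so the logical relationship is enforced each time the data are queried rather than checked by hand.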
5. The Example of Topological Relationships
Topology is one of the most useful relationships maintained in many spatial
databases. It is defined as the mathematics of connectivity or adjacency
of points or lines that determines spatial relationships in a GIS. The
topological data structure logically determines exactly how and where points
and lines connect on a map by means of nodes (topological junctions). The
order of connectivity defines the shape of an arc or polygon. The computer
stores this information in various tables of the database structure. By
storing information in a logical and ordered relationship, missing information,
e.g., a line segment of a polygon, is readily apparent. A GIS manipulates,
analyzes, and uses topological data in determining data relationships.
Network analysis uses topological modeling for determining shortest
paths and alternate routes. For example, a GIS for emergency service dispatch
may use topological models to quickly ascertain optional routes for emergency
vehicles. Automobile commuters perform a similar mental task by altering
their route to avoid accidents and traffic congestion. Likewise an electrical
utility GIS could rapidly determine different circuit paths to route electricity
when service is interrupted by equipment damage. Similarly, political
planners could use certain algorithms to determine logical divisions
between population groups and areas for district boundaries.
To see how topology is represented or modeled, it
is useful to consider an example to see how connections are coded in
a database. This involves recording more than just the absolute location
of points, lines, and regions.
The first step is to record the location of all "nodes," that
is, endpoints and intersections of lines and boundaries.
Based upon these nodes, "arcs" are defined. These arcs have endpoints,
but they are also assigned a direction indicated by the arrowheads. The
starting point of the vector is referred to as the "from node" and the
destination the "to node." The orientation of a given vector can be assigned
in either direction, as long as this direction is recorded and stored in the database.
By keeping track of the orientation of arcs, it is possible to use this
information to establish routes from node to node or place to place. Thus,
if one wants to move from node 3 to node 1, we can locate the necessary
connections in the database.
Now, "polygons" are defined by arcs. To define a given
polygon, trace around its area in a clockwise direction recording the component
arcs and their orientations. If an arc has to be followed in its reverse
orientation to make the tracing, it is assigned a negative sign in the database.
Finally, for each arc, one records which polygon lies to the left and
right side of its direction of orientation. If an arc is on the edge of
the study area, it is bounded by the "universe."
Now that this information has been recorded in the database, it is possible
to pose questions about connectivity and location. For example:
- What polygons adjoin polygon A? To find the solution, we first check to
see what arcs define polygon A, then we check to see what polygons
are defined by these arcs in their negative orientation.
- What is the shortest route from node 3 to node 2? Trace all paths that
lead from node 3 to node 2, sum their lengths by calculating distances
from the node list. Choose the path with the shortest total length.
- What polygon is directly across from polygon B along arc D? Find
the polygon that is defined by the inverse (negative) of arc D.
Arc-node topology, as this is called, was developed several decades ago
as a convenient way to store information of this sort. It is used to encode
information used in the US Bureau of Census TIGER boundary files and is
the basis of the spatial modeling system used by Arc/Info.
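The arc-node tables and the three kinds of question can be sketched in Python as follows. Because the original figure is not reproduced here, the network is a made-up one: two unit squares, A and B, that share one edge, with two nodes and three arcs. Polygons are traced clockwise, so each polygon lies to the right of its positively signed arcs. The route search uses Dijkstra's algorithm in place of enumerating every path as the text describes; both give the shortest total length.

```python
import heapq
import math

# Arc table: from/to nodes, left/right polygons, and shape vertices (assumed).
arcs = {
    "a": {"from": 1, "to": 2, "left": "universe", "right": "A",
          "vertices": [(1, 0), (0, 0), (0, 1), (1, 1)]},
    "b": {"from": 2, "to": 1, "left": "B", "right": "A",
          "vertices": [(1, 1), (1, 0)]},
    "c": {"from": 2, "to": 1, "left": "universe", "right": "B",
          "vertices": [(1, 1), (2, 1), (2, 0), (1, 0)]},
}

# Polygon table: arcs traced clockwise; "-" marks an arc followed in reverse.
polygons = {"A": ["+a", "+b"], "B": ["+c", "-b"]}


def negate(signed_arc):
    """Flip the orientation sign of an arc reference such as '+b'."""
    return ("-" if signed_arc[0] == "+" else "+") + signed_arc[1:]


def adjoining(polygon):
    """Polygons defined by this polygon's arcs in the opposite orientation."""
    wanted = {negate(s) for s in polygons[polygon]}
    return sorted(name for name, arc_list in polygons.items()
                  if name != polygon and wanted & set(arc_list))


def across(polygon, arc_id):
    """Polygon on the other side of an arc, from the left/right table."""
    arc = arcs[arc_id]
    return arc["left"] if arc["right"] == polygon else arc["right"]


def arc_length(arc_id):
    vertices = arcs[arc_id]["vertices"]
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


def shortest_route(start, goal):
    """Dijkstra's algorithm; arcs may be travelled in either direction."""
    graph = {}
    for arc_id, arc in arcs.items():
        length = arc_length(arc_id)
        graph.setdefault(arc["from"], []).append((arc["to"], length, "+" + arc_id))
        graph.setdefault(arc["to"], []).append((arc["from"], length, "-" + arc_id))
    queue, visited = [(0.0, start, [])], set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length, signed_arc in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + length, neighbor, path + [signed_arc]))
    return None
```

With these tables, `adjoining("A")` finds B through the shared arc b, `across("B", "b")` reads the answer from the left/right columns, and `shortest_route(1, 2)` returns the total length together with the signed arcs to follow.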
6. Object-oriented Databases
The methods of file organization discussed above depend upon the
description of real-world phenomena in terms of their attributes, such
as height, weight, or age. It is these attributes that are stored in the
database and together they provide a sort of abstracted depiction of the
real-world feature. Much recent attention has focused on how to organize
this information in ways that more readily represent the way users perceive
and use information about the world around them. That is, humans recognize
"objects" immediately in terms of their totality or "wholeness." Houses
and skyscrapers are recognized immediately by form and function. The objects
can be described in terms of the underlying attributes, but people learn
these from experience.
The idea of "object-oriented" databases is to organize information
(that is, group attributes) into the sorts of "wholes" that people recognize.
Instead of "decomposing" each feature into a distinctive list of attributes,
emphasis is placed on "grouping" the attributes of a given object into
a unit or template that can be stored or retrieved by its natural name.
Consider the following situation involving two ways of organizing
information about buildings zoned for different uses.
This information can be broken down into attributes, including:
- Minimum Lot Size
- Maximum Number of Dwelling Units
To organize this information differently, let us first define some templates
that reflect the different "objects" we wish to include in the database.
- SF Single Family
  - Token 1=Large Lot
  - Token 1=Low Density
  - Token 5=High Density
- LO Limited Office
  - Must Specify Predominate Use
  - Maximum Height=40 ft
  - Minimum Lot
- GO General Office
  - Must Specify Predominate Use
  - Maximum Height=60 ft
  - Minimum Lot
Once these are created, information can be added to our database according
to the template. The template maintains in one place all attributes held
in common by a certain class of object. It may be the case that some
differences exist between objects of a given category. These differences
can be stored as "tokens" or additional attributes.
- Height=35 ft
Although templates and tokens may be stored in two different files, it
is easy to see how this method of organization changes the database. It
is not merely a process of simplification. By using templates, users can
enter and retrieve data in terms of "real" items. A query might request
all "Single Family Houses."
Object-oriented databases thus have the advantage of organizing
information in ways that users often find easier to use. The approach
has an intuitive feel because it employs the categories that users employ
naturally in day-to-day life. For this reason, object-oriented databases
are gaining increased attention in GIS.
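The template-and-token idea maps loosely onto classes and instances. In the Python sketch below, a `Template` holds the attributes shared by a whole class of object and each `Building` adds its own tokens. The class names, attribute keys, addresses, and the assignment of the 35 ft height to a particular building are assumptions for illustration; only the 40 ft and 60 ft height limits come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Attributes held in common by every object of one class."""
    code: str
    name: str
    attributes: dict


@dataclass
class Building:
    """One object: a reference to its template plus its own tokens."""
    address: str
    template: Template
    tokens: dict = field(default_factory=dict)

    def get(self, attribute):
        """Look in the object's tokens first, then fall back to the template."""
        if attribute in self.tokens:
            return self.tokens[attribute]
        return self.template.attributes[attribute]


LIMITED_OFFICE = Template("LO", "Limited Office", {"max_height_ft": 40})
GENERAL_OFFICE = Template("GO", "General Office", {"max_height_ft": 60})

buildings = [
    Building("12 Elm St", LIMITED_OFFICE),
    Building("40 Oak Ave", LIMITED_OFFICE, {"height_ft": 35}),
    Building("7 Main St", GENERAL_OFFICE),
]


def by_template(name):
    """Retrieve objects by the natural name of their class."""
    return [b.address for b in buildings if b.template.name == name]
```

A query such as `by_template("Limited Office")` asks for whole objects by name, and `buildings[1].get("max_height_ft")` shows a shared attribute being read from the template while `height_ft` comes from that building's own token.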
7. The Idea of Expert Systems
If a database has been designed to store information about spatial, functional,
and logical relationships, the user can pose more complex questions of
the data. That is, the user can program the system to consider a range
of spatial, functional, and logical conditions during query or analysis.
Such efforts result in what are termed expert systems or,
if carried further, artificially intelligent systems. At
their simplest, expert systems allow the user to set "rules" that must
be followed as data is analyzed. These rules are written to mirror the
way an experienced user would compare or judge data. As more and more rules
are written, the system becomes more adept or "expert" at finding solutions
with less directed guidance by users.
The point of expert systems is to build sets of rules that mirror
the sorts of comparisons and judgments that experienced users make.
By programming these rules into the system, more and more of the routine
decision making can be passed on to the computer system--including complex
comparisons that may be difficult or time consuming for even experienced
users to undertake.
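At its simplest, a rule set of this kind can be represented as a list of named conditions that the system applies to each record in turn. The Python sketch below is a toy illustration, not a real expert system shell: the rules, thresholds (15 percent slope, 500 m to a road), and site attributes are all hypothetical.

```python
# Each rule: (name, condition on a site record, verdict if the rule fires).
RULES = [
    ("steep slope", lambda site: site["slope_pct"] > 15, "unsuitable"),
    ("far from road", lambda site: site["road_distance_m"] > 500, "marginal"),
    ("poor drainage", lambda site: site["soil"] == "clay", "marginal"),
]


def evaluate(site):
    """Apply every rule and report the overall verdict plus the rules that fired."""
    fired = [(name, verdict) for name, test, verdict in RULES if test(site)]
    if any(verdict == "unsuitable" for _, verdict in fired):
        overall = "unsuitable"
    elif fired:
        overall = "marginal"
    else:
        overall = "suitable"
    return overall, [name for name, _ in fired]


# Hypothetical candidate sites.
site_a = {"slope_pct": 4, "road_distance_m": 120, "soil": "loam"}
site_b = {"slope_pct": 22, "road_distance_m": 800, "soil": "loam"}
site_c = {"slope_pct": 8, "road_distance_m": 90, "soil": "clay"}
```

Calling `evaluate(site_b)` returns both a verdict and the list of rules that led to it, so the user can see why a site was rejected. Adding expertise means appending further entries to `RULES` rather than rewriting the evaluation logic.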
Such systems are of interest to GIS practitioners in many fields including
urban planning and resource analysis. Complex issues involving
land use can often be written in terms of rules that need to be followed.
At the same time, following rules is only a step toward intelligence.
The difference between expert systems and artificial intelligence is still
in debate. But to be truly "intelligent" a system must be able to "learn,"
"think," or "reason," perhaps really to write its own rules from experience.
The definition of artificial intelligence is, in fact, still a contested
issue. So far, it has been very difficult to program computer systems to
provide a semblance of human thought processes. Yet, the potential of such
systems makes the effort irresistible. The idea that computer systems may
one day be able to reason about real-world environmental and social
problems and issues is a reason why GIS theorists maintain an interest
in developments in the area of artificial intelligence.
8. References and Supplemental Reading
Chapter 2 in Bolstad, Paul. 2005. GIS Fundamentals: A First Text on
Geographic Information Systems, 2nd. ed. White Bear
MN: Eider Press.
Burrough, P.A. 1986. Principles of Geographical Information
Systems for Land Resource Assessment. New York: Oxford
Chapters 3-5 in Chang, Kang-tsung. 2006. Introduction to Geographic Information
Systems, 3rd. ed. Boston: McGraw Hill.
Chapter 3 in Clarke, Keith C. 2003. Getting Started with Geographic
Information Systems, 4th ed. Upper Saddle River, NJ:
DeMers, Michael N. 2005. Fundamentals
of Geographic Information Systems, 3rd ed. Wiley.
Huxhold, W.E. 1991. An Introduction of Urban Geographic
Systems. New York and Oxford: Oxford University Press.
Koeln, G.T., Cowardin, L.M., Strong, L.L. 1994. Geographic
systems. in T.A. Bookhout, ed. Research and Management
for Wildlife and Habitats. Fifth Edition. Bethesda: The
Pages. pp. 540-566.
Chapters 3, 5 and 6 in Lo, C.P. and Albert K.W. Yeung.
2002. Concepts and
Information Systems. Upper Saddle River, NJ: Prentice
Chapter 8 in Longley, Paul A., Michael F. Goodchild, David J.
and David W. Rhind. 2005. Geographic
Information Systems and Science, 2nd ed. Hoboken, NJ:
Wiley.
Antennucci, J.C., Brown, K., Croswell, P.L., Kevany, M.J.
Information Systems. New York and London: Chapman & Hall.
Walker, J.D., Black, R.A., Linn, J.K., Thomas, A.J., Wiseman,
R., and D'Attilio, M.G. 1996. Development of Geographic
Database for Integrated Geological and Geophysical Applications. GSA
A Publication of the Geological Society of America 6(3):2-7.
Last revised on 2014.9.11. KEF