4th International Conference on Integrating GIS and Environmental Modeling (GIS/EM4):
Problems, Prospects and Research Needs. Banff, Alberta, Canada, September 2 - 8, 2000.


Using a knowledge base approach to develop a predictive mapping program for endangered species reconnaissance

GIS/EM4 No. 27

Noah C. Goldstein

Abstract

The National Park Service's Santa Monica Mountains National Recreation Area (SMMNRA) is a unique ecological reserve surrounded by extensive and expanding urbanization. It is home to many rare and endangered species, including a number of narrowly endemic taxa. In collaboration with SMMNRA scientists, we developed an ecological knowledge base that can be tested, changed and rendered in a Geographic Information System (GIS). The knowledge base, which represents a predictive model of endangered species habitat, will be used as an aid to species reconnaissance for ecological research and related management decisions. The SMMNRA was divided into 27,590 Habitat Assessment Units (HAUs), landscape facets that serve as the unit of analysis. The test species for this study was the Dudleya cymosa subspecies complex. The Boolean predictive model identifies 14 of the 19 HAUs known to contain the Dudleya cymosa subspecies complex and flags 2,129 further HAUs as possible sites. The results of the fuzzy decision tree indicate a much better model fit to known Dudleya cymosa subspecies complex sites within the SMMNRA.

Keywords

Predictive mapping, endangered species, knowledge base, spatial decision support systems (SDSS), fuzzy logic.


Introduction

Predictive mapping of the distributions of rare and endangered plant species is an important scientific endeavor given the current threats posed by increased urbanization and the large-scale effects of human activities. The Santa Monica Mountains are an east-west coastal mountain range, partially separating the Los Angeles Plain from the San Fernando Valley to the northwest. They are surrounded by urban development on nearly every side, most notably by the city of Los Angeles, with a population close to 10 million people. The Santa Monica Mountains National Recreation Area (SMMNRA), operated by the National Park Service, is highly impacted by the urban population, which uses the SMMNRA for recreation and development. The scientific staff of the SMMNRA have the dual challenge of understanding the behavior of federally listed plant species and preserving their fragile habitats. One of those taxa is the State- and Federally-listed Dudleya cymosa subspecies complex, including Dudleya cymosa ssp. marcescens (Marcescent Dudleya) and Dudleya cymosa ssp. ovatifolia (Santa Monica Mountains Dudleya). This species complex was selected in consultation with SMMNRA staff because of its endangered status, data availability, and because its known distributions suggest relatively strong abiotic controls (substrate, topography). In addition, the complex was chosen because of the challenges it presents to predictive mapping, namely microsite specificity and low population size.

Methods

The first challenge was representing the SMMNRA in a manner that would be responsive both to the species' behavior and to the possible microsites in the geospatial data. To resolve this, we divided the study area (76,372 ha) into 27,590 Habitat Assessment Units (HAUs). The polygons, created by combining landscape position and annual solar insolation, represent terrain facets that are both large enough to locate in the field and small enough to retain ecological significance.
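The idea of deriving analysis units from two combined layers can be illustrated with a small sketch. It is not the procedure used in this study: the class names, the tiny grids, and the choice of 4-connected raster regions are all assumptions made for illustration. Each contiguous group of cells that shares the same landscape position class and the same insolation class receives one unit label.

```python
from collections import deque


def label_units(position, insolation):
    """Label 4-connected regions of cells that share the same
    (landscape position, insolation class) pair.

    Both inputs are equally sized 2-D lists of class names (hypothetical).
    Returns the label grid and the number of units found.
    """
    rows, cols = len(position), len(position[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labeled"
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            next_label += 1
            key = (position[r][c], insolation[r][c])
            labels[r][c] = next_label
            queue = deque([(r, c)])
            # Breadth-first flood fill over neighbors with the same class pair.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and (position[ny][nx], insolation[ny][nx]) == key):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


# Hypothetical 3 x 3 example grids.
position = [
    ["ridge", "ridge", "slope"],
    ["slope", "slope", "slope"],
    ["valley", "valley", "slope"],
]
insolation = [
    ["high", "high", "high"],
    ["high", "low", "low"],
    ["low", "low", "low"],
]
labels, n_units = label_units(position, insolation)
for row in labels:
    print(row)
print("units:", n_units)
```

In a real workflow the same overlay would normally be done with GIS polygon or raster tools; the sketch only shows why combining two categorical layers yields facets smaller than either layer alone.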

The HAUs were classified into presence and absence classes using the software package S-PLUS. The cross-validated presence/absence classification trees were then rendered in NetWeaver (Miller 2000), a knowledge base visualization software package. The advantage of NetWeaver is that it can be manipulated to incorporate expert knowledge, and it explicitly documents all decisions and data used in the knowledge base. In addition, binary (Boolean) decisions of species presence and absence could be translated into parameterized fuzzy decisions by manipulating the rules of each data link according to the distribution of the data for the species of focus and the values from the classification trees. The knowledge base was then implemented in a GIS using the software package EMDS (Reynolds 2000), an ArcView GIS extension that calls NetWeaver. EMDS evaluates an assertion of truth according to decision rules (from NetWeaver) and the available data (from the GIS). EMDS produces fuzzy values that range from -1 (100% non-membership) to 1 (100% membership). The data layers included in the classification tree process and in the GIS were species presence/absence (from a Global Positioning System (GPS)), annual solar insolation, geology, soil type, and DEM-derived products (altitude and slope). In addition, fine-grain (20 m) remotely sensed data from the AVIRIS sensor were employed for estimates of greenness and rockiness.
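The difference between a Boolean split and a parameterized fuzzy decision can be sketched as follows. This is a minimal illustration, not NetWeaver's implementation: the linear ramp shape, the function names, and the slope values are assumptions. Only the output range of -1 to 1 is taken from the text.

```python
def boolean_rule(value, threshold):
    """Crisp classification-tree style split: True at or above the threshold."""
    return value >= threshold


def fuzzy_membership(value, false_at, true_at):
    """Linear ramp from -1 (fully false) at `false_at` to +1 (fully true)
    at `true_at`, clamped outside that interval.

    The ramp may ascend (false_at < true_at) or descend (false_at > true_at).
    """
    if false_at == true_at:
        raise ValueError("false_at and true_at must differ")
    fraction = (value - false_at) / (true_at - false_at)
    fraction = max(0.0, min(1.0, fraction))  # clamp to [0, 1]
    return 2.0 * fraction - 1.0              # rescale to [-1, 1]


# Hypothetical example: a tree split at a slope of 20 degrees, softened
# into a ramp between 10 and 30 degrees.
slope_threshold = 20.0
ramp_false, ramp_true = 10.0, 30.0
for slope in (5.0, 20.0, 25.0, 40.0):
    print(slope,
          boolean_rule(slope, slope_threshold),
          fuzzy_membership(slope, ramp_false, ramp_true))
```

With a ramp of this kind, a site just below a tree threshold receives a membership near 0 rather than being rejected outright, which is one way a fuzzy rule can recover sites that a crisp split omits.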

Findings

Of the 19 known sites of D. cymosa presence, 14 (73.7%) were predicted by the Boolean decision tree. The model predicted 2,129 new presences, comprising an area of 14,886 ha, or 10.4% of the study area. Table 1 gives a contingency table of the model's omission and commission.



 

                                  Predicted presences   Predicted absences

Known presences (out of 19)                14                    5

Known absences* (out of 27571)           2129                25442


Table 1. Omission and commission of the Boolean predictive model
for D. cymosa. An * indicates a possibly "false" absence: some of these
HAUs may truly contain D. cymosa but have not yet been found.
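The counts in Table 1 can be turned into rates with simple arithmetic. The sketch below uses only the four counts reported in the table; the variable names are illustrative, and the commission rate is an upper bound on true error because some "known absences" may be undiscovered presences.

```python
# Counts from Table 1 (Boolean model).
presence_predicted_present = 14    # known presences correctly predicted
presence_predicted_absent = 5      # known presences missed (omission)
absence_predicted_present = 2129   # "known absences" predicted present (commission)
absence_predicted_absent = 25442   # "known absences" predicted absent

known_presences = presence_predicted_present + presence_predicted_absent
known_absences = absence_predicted_present + absence_predicted_absent

detection_rate = presence_predicted_present / known_presences
omission_rate = presence_predicted_absent / known_presences
commission_rate = absence_predicted_present / known_absences

print(f"known presences: {known_presences}, known absences: {known_absences}")
print(f"detection rate:  {detection_rate:.1%}")
print(f"omission rate:   {omission_rate:.1%}")
print(f"commission rate: {commission_rate:.1%}")
```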



The fuzzy-modified rules of the decision tree produce a lower omission rate for D. cymosa presence. After evaluating the SMMNRA with the assertion that the HAU "contains good D. cymosa habitat", the model assigned one known presence site a value of -1 (100% false) and one known HAU a fuzzy membership of -0.5 in the set "contains good D. cymosa habitat." One known HAU was identified as 1 (100% true), and 19 HAUs known to be D. cymosa sites were evaluated with memberships between 0.87 and 0.99.

Among the sites without any known D. cymosa presence, 8 HAUs were given a membership of 1 (100%) and 2,121 HAUs were given a membership of 0.999 in the set "contains good D. cymosa habitat." In total, 4,369 HAUs were assigned memberships greater than 0 and therefore belong to the set "contains good D. cymosa habitat." A further 23,195 HAUs were assigned memberships less than 0 and belong to the set "does not contain good D. cymosa habitat."
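The partition by membership sign, and the use of membership values to order a field search, can be sketched as below. The HAU identifiers and values are hypothetical, and treating a membership of exactly 0 as a separate "undetermined" group is an assumption rather than something stated for EMDS.

```python
def partition_by_membership(memberships):
    """Split HAUs by the sign of their fuzzy membership in the set
    "contains good D. cymosa habitat"."""
    good = {hau: m for hau, m in memberships.items() if m > 0}
    poor = {hau: m for hau, m in memberships.items() if m < 0}
    undetermined = {hau: m for hau, m in memberships.items() if m == 0}
    return good, poor, undetermined


def search_order(memberships):
    """Return HAUs with positive membership, strongest candidates first.
    Ties are broken by identifier so the order is reproducible."""
    good, _, _ = partition_by_membership(memberships)
    return sorted(good, key=lambda hau: (-good[hau], hau))


# Hypothetical memberships for six HAUs.
example = {
    "HAU-A": 1.0,
    "HAU-B": 0.999,
    "HAU-C": 0.87,
    "HAU-D": -0.5,
    "HAU-E": -1.0,
    "HAU-F": 0.0,
}
good, poor, undetermined = partition_by_membership(example)
order = search_order(example)
print("search order:", order)
print("poor habitat:", sorted(poor))
print("undetermined:", sorted(undetermined))
```

Unlike the Boolean output, the ranked list lets reconnaissance effort be cut off at any membership level that field resources allow.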

Discussion and Conclusions

The results of the Boolean analysis indicate that the model did fairly well in predicting the known D. cymosa sites. The 2,129 HAUs identified by the Boolean decision tree are logically the first place to search for the species. The fuzzy-modified knowledge base was a better fit, with more known D. cymosa sites being identified as members of the "good D. cymosa habitat" set. Using the fuzzy decision tree, the search for D. cymosa habitat should begin at the 2,121 HAUs identified as 0.999 members of the "good D. cymosa habitat" set. Both the Boolean and fuzzy decision trees produced similar numbers of HAUs at which to begin searching for the D. cymosa subspecies complex. The SMMNRA staff will use these results to identify areas on which to focus conservation and mitigation efforts.

Acknowledgements

This work was funded by a cooperative agreement between the National Park Service and the University of California at Santa Barbara.

References used

Reynolds K. 2000 June 6. A knowledge based decision support for ecological assessment. <http://www.fsl.orst.edu/emds> Accessed 2000 June 26.

Miller B. 2000 June 26. NetWeaver for Windows version 1.1. <http://www.kgarden.com/netweave.htm> Accessed 2000 June 26.


Authors

Noah C. Goldstein, Department of Geography
University of California at Santa Barbara, Santa Barbara, California, USA 93106-4060.
Email: noah@geog.ucsb.edu, Tel: +1-805-893-4519, Fax: +1-805-893-3146.