4th International Conference on Integrating GIS and
Environmental Modeling (GIS/EM4):
Problems, Prospects and Research Needs.
Banff, Alberta, Canada, September 2 - 8, 2000.
Uncertainties in modelling with time series data:
estimating the risks posed by crop pests
GIS/EM4 No. 167
Claire H. Jarvis
Neil Stuart
Abstract
The rate of insect development (phenology) is strongly associated with temperature. Within the biological literature, phenologies are estimated largely on the basis of sparsely located point meteorological data. The significance of incorporating a geographical dimension was explored in relation to pest risk assessment (PRA). Colorado beetle (Leptinotarsa decemlineata) was used as an organism representative of those currently a potential threat to British agriculture, but not yet established within the country. To ensure relevance to both pest risk assessment and integrated pest management applications, phenology models were run using daily meteorological data throughout England and Wales. Partial thin plate spline interpolation was chosen as an efficient means to create spatial temperature 'surfaces' from distributed daily maximum and minimum temperature data observed at a subset of 174 meteorological stations. Providing the interaction between phenology models and sequences of geographically relevant temperature data at this daily step and national coverage necessitated the construction of tailor made research software for the project. The coupled temperature interpolation/phenology modelling system was used to provide a range of outputs to explore the accuracy of predicted phenologies over space and time. Confidence within individual risk surfaces for particular years was computed by piping jack-knife cross-validated estimates of temperature through the respective models to evaluate the impact of uncertainties within the input data in biological terms. In order to represent variation in the likelihood of a pest establishing from year to year, a cumulative index derived from multiple binary images was developed.
Keywords
entomology, climate, space/time series, uncertainty,
interpolation
Introduction
This paper explores the manner in which the threat posed by
non-indigenous pests is assessed (pest risk assessment) from an explicitly
geographical perspective. As well as the immediate and longer term costs of
damage when an agricultural or horticultural crop is attacked, there are large
consequential costs involved in containing native pests and eliminating exotic
pests. When assessing the degree of effort required to keep particular exotic
pests out of the country, knowledge is needed of where and for how long both
exotic and our native pests could survive and potentially spread within mainland
Britain on any particular day in the year. Meteorological data are
preferentially located at lowland, and often coastal, sites at the expense of
upland areas. The implication is that aggregate risk assessments based
upon point data may be biased in relation to pest development potential over the
full landscape. The difference that adopting a geographical (fully spatial) approach
makes to quantitative assessments of risk for non-native pests has been little
explored to date, and forms the context for this paper. Additionally, we
demonstrate methods to assess the reliability of the resultant spatial
phenologies.
Background
Royer and Yang (1991) define pest risk analysis as 'the
estimation of the likelihood of entry of a pest into an area in which it is not
wanted, and the potential impact if the pest became established in that area'.
With increased trade in plant material between European countries and beyond,
this is a continuing need. This study focuses on the assessment phase of pest
risk analysis, and in particular the likelihood of a pest becoming established
throughout the country after arrival on the basis of the temperature available
during its developmental period.
A common starting point in pest risk
assessments is to investigate whether local climate provides conditions under
which a particular non-indigenous pest has the potential to thrive. Results are
regularly presented as point data, confined to the location of known
meteorological recording sites (e.g. Skarratt and others 1995). In a growing
minority of cases, gridded rather than point based climate normals have been
used within pest risk assessments to present a fuller picture of establishment
potential (e.g. Baker 1996, Baufeld and others 1996). However, differences
arising as a result of using a geographical or fully spatial approach, rather
than the point based method, have not been investigated. In addition to this
limited spatial portrayal, insect pest risk assessments to date have most
commonly been based upon monthly aggregated climate indicators rather than the
daily records more commonly used when modelling indigenous pests. Given the
short length of lifecycle of many insects, this raises considerable problems of
temporal scale. Furthermore, conclusions are drawn on the basis of monthly,
seasonal (winter/summer) or annual climate normals. This may result in obscuring
the annual variation in risk posed, despite the fact that within risk analyses
as a whole it is often extreme events that hold greatest
significance.
Additionally, with regard to one particular approach, a
measure of error associated with particular model outputs provides a warning
message to the user to use the simulations to augment, rather than replace,
expert biological knowledge. Understanding the nature of input error propagating
through such a multi-temporal spatially referenced system in relation to the
sensitivities of the models has been the subject of little debate either within
the spatial phenology or geographical science literature. While work by
Heuvelink (1998) and Arbia and others (1998), among others, has advanced our
ability to model error propagation within environmental models in space, the
spatio-temporal aspects of error propagation have still to be addressed.
Methods
General approach
The core approach within the work is the linking of spatially
referenced input data (interpolated daily maximum and minimum temperatures) with
phenological models commonly run at sparse point locations only. For estimating
the continuous surfaces of input variables to the models, interpolation was
chosen in preference to process-based atmospheric modelling for the efficiency
of technique, given the data available to the study and the multiple climate
surfaces required. The intention to model daily rather than monthly temperatures
raises issues regarding how best to incorporate knowledge of synoptic rather
than climate processes over space in order to provide appropriate input data to
the biological models. While the treatment of climatology within the study is
empirical, the phenological perspective in contrast draws upon established
biological models rather than being data driven. In this case, pest risk is
modelled by bringing together a nationwide description of daily weather and
other physical conditions and integrating these with process-based phenological
models (Baker and Cohen 1985). Partial thin plate splines (Hutchinson 1991) were
used to construct the national daily maximum and minimum temperature surfaces
used throughout this study, following a series of initial investigations. These
issues have previously been discussed in detail (Jarvis and Stuart 2001a, b).
Owing to the temporal nature of the phenological problems under study, a
loosely coupled approach linking custom built software was used for modelling,
while the use of GIS was confined to data preparation and the visualisation
tasks. The geographical bounds of the study cover mainland England and Wales,
since a national perspective is of crucial importance when assessing the threat
posed by non-indigenous pests in particular. Modelling was carried out at a
daily time step, to a resolution of 1km, in order that the findings could
equally be applied to the management of indigenous pests and to incorporate the
rapid development of many insects. Phenology models were run multiple times
using continuous input data, as opposed to the alternative approach taken in
previous studies of interpolating point phenological model results (e.g. Schaub
1995, Régnière 1996). The effect of this approach on the magnitude, distribution
and timing of the areas estimated to be at risk is a further strand of research;
this, and detailed discussions of the interpolation process, are reported elsewhere
(Jarvis and others 1999).
Risk assessment
Case study pest and phenology model
In order to constrain the scope of the paper, Colorado beetle (Leptinotarsa decemlineata) is used throughout the work in discussions of pest risk assessment as an example of a pest that is not established within Britain, but that poses a considerable threat to British agriculture. The results reported were computed using the model of Baker and Cohen (1985). In order to better demonstrate the nature of insect lifecycles, and their spatio-temporal variability within a single season, the hypothetical development of Colorado beetle was computed for two grid cells of differing elevation in potato producing areas within the Vale of York.
From phenology to 'risk'
The use of phenology models for PRA purposes requires a certain
number of assumptions to be adopted. For example, employing measures of
phenology for assessing the likelihood of a pest establishing in this country
implies a worst case scenario. In particular, this assumes that no
other limiting factors, such as unavailability of food or predatory parasites,
limit the pest's survival. Furthermore, all model runs will begin with the
assumption that diapausing (winter resting) adults are present throughout the
landscape at the beginning of each year.
For PRA purposes, the
phenological results were reported as the date in the year at which 50% of the
population reach the young adult stage throughout England and Wales. Reaching this target event
implies the possibility that these individuals could then potentially enter
diapause, assuming other limiting conditions such as available food are met. The
overall cumulative percentage of the nominal population leaving a particular
stage can also be considered to reflect the probability of an insect reaching
the following developmental state. Since only adult beetles are known to survive
the range of winter conditions found in Britain, mortality is implied where the
adult stage has not been reached by the end of the model run.
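
As a purely illustrative sketch of how a daily temperature series can be turned into a target-event date, the following Python uses a simple degree-day accumulation. This is not the Baker and Cohen (1985) model, whose internal details are not given in this paper; the function name, the base temperature and the thermal requirement are hypothetical placeholders chosen only for illustration.

```python
from typing import Optional, Sequence


def emergence_day(
    tmax: Sequence[float],
    tmin: Sequence[float],
    base_temp: float,
    required_degree_days: float,
) -> Optional[int]:
    """Return the day of the year (1-based) on which accumulated degree-days
    first meet the requirement, or None if the target is never reached.

    ASSUMPTION: a generic degree-day stand-in, not the model used in the paper.
    """
    if len(tmax) != len(tmin):
        raise ValueError("tmax and tmin must cover the same days")
    total = 0.0
    for day, (hi, lo) in enumerate(zip(tmax, tmin), start=1):
        daily_mean = (hi + lo) / 2.0
        # Only warmth above the base temperature contributes to development.
        total += max(0.0, daily_mean - base_temp)
        if total >= required_degree_days:
            return day
    # Target stage not reached by the end of the run: mortality is implied.
    return None


# Hypothetical ten-day series for a warm and a cool grid cell.
warm = emergence_day([20.0] * 10, [10.0] * 10, base_temp=10.0, required_degree_days=20.0)
cool = emergence_day([12.0] * 10, [6.0] * 10, base_temp=10.0, required_degree_days=20.0)
```

In this sketch a return value of None plays the role of the implied mortality described above, where the adult stage is not reached by the end of the model run.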
Critical
to long term pest risk assessment are the questions whether a pest has the
potential to survive from generation to generation, and from year to year. Model
runs were therefore made for each year of the period 1961-90 throughout England
and Wales. This thirty year time span, which encompasses a broad range of
temperature variation, is standard within risk assessments based upon climate
data. Mapped comparisons of Colorado beetle development were then made for the
least favourable, the average and the most favourable years, as selected by
ranking the initial gridded results. In order to highlight the uncertainties
posed by inter-year climate variability within the pest risk assessment process,
the model outputs for the 30 years were individually converted to threshold
plots according to whether or not the target adult stage was reached. These
plots were then combined to form a cumulative risk index (0-30), where an index
of 30 implies a strong possibility of establishment occurring.
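
The combination of yearly threshold plots into a cumulative index can be sketched as follows. This is a minimal illustration under stated assumptions: each year's model output is taken to be a small grid holding the emergence day of the year, with None where the adult stage was not reached. The data structure and function names are hypothetical and are not taken from the project software.

```python
from typing import Dict, List, Optional

# Emergence day of year per grid cell; None means the adult stage was not reached.
EmergenceGrid = List[List[Optional[int]]]


def threshold_grid(emergence: EmergenceGrid) -> List[List[int]]:
    """Binary image: 1 where the target adult stage was reached, else 0."""
    return [[0 if day is None else 1 for day in row] for row in emergence]


def cumulative_risk_index(yearly: Dict[int, EmergenceGrid]) -> List[List[int]]:
    """Sum the binary images cell by cell across all years supplied."""
    if not yearly:
        raise ValueError("at least one year of model output is required")
    binaries = [threshold_grid(grid) for grid in yearly.values()]
    rows, cols = len(binaries[0]), len(binaries[0][0])
    index = [[0] * cols for _ in range(rows)]
    for binary in binaries:
        if len(binary) != rows or any(len(row) != cols for row in binary):
            raise ValueError("all yearly grids must share the same shape")
        for r in range(rows):
            for c in range(cols):
                index[r][c] += binary[r][c]
    return index


# Hypothetical three-year example on a 2 x 2 grid (the study used 30 years).
example = {
    1961: [[200, None], [210, 250]],
    1962: [[195, None], [None, 240]],
    1963: [[190, 260], [205, 235]],
}
risk = cumulative_risk_index(example)  # [[3, 1], [2, 3]]
```

With thirty yearly grids the same summation would yield the 0-30 index described above.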
Assessing error
Except on rare occasions of an actual outbreak of a pest, there
is little actual data against which these predictions of non-indigenous pests
can be validated. One means to gain a more comprehensive assessment of the
accuracy of these predictions is to analyse mathematically how the uncertainty
in our temperature estimates may lead to uncertainty in the estimated timing of
pest development. Semi-independent cross validation techniques such as the
jack-knife and bootstrap have been a popular means of assessing statistical
estimation and prediction since the mid 1970s (Cressie 1991, p101). For this
study, jack-knife cross validation was adopted to provide a measure of modelling
accuracy that is able to track errors as they are propagated through the
modelling system (both interpolation and biological model) throughout a model
run. Jack-knifing refers to the practice of removing data points singly and
using the remaining data to predict the values at the deleted points. Iterated,
this allows some measure of the overall prediction error to be estimated.
Adoption of this technique allowed the tracking of error at any temporal time
step within the modelling process, both in terms of the input data error and the
resultant phenological model inaccuracy, albeit at selected data dependent
points within the landscape. In this way, the effect of interpolation error
propagated through the phenology models at each stage was investigated. It is
important to note that the variation in results arises from the error in
the estimated geographical inputs alone.
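
The jack-knife procedure itself can be sketched in a few lines. The example below is an assumption-laden illustration: it substitutes a simple inverse-distance-weighted interpolator for the partial thin plate splines used in the study, purely to keep the sketch self-contained, and all names and station values are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Station = Tuple[float, float, float]  # (x, y, observed temperature)


def idw(stations: Sequence[Station], x: float, y: float, power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate at (x, y).

    ASSUMPTION: a stand-in interpolator, not the partial thin plate spline.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def jackknife_estimates(
    stations: Sequence[Station],
    interpolate: Callable[[Sequence[Station], float, float], float] = idw,
) -> List[float]:
    """Remove each station in turn and predict its value from the remainder."""
    stations = list(stations)
    estimates = []
    for i, (x, y, _observed) in enumerate(stations):
        remaining = stations[:i] + stations[i + 1:]
        estimates.append(interpolate(remaining, x, y))
    return estimates


# Hypothetical stations along a line, for a single day.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 14.0), (1.0, 0.0, 12.0)]
cross_validated = jackknife_estimates(stations)
```

Repeating this for every day of a model run gives, for each station, a cross-validated temperature series that can be passed through the phenology model alongside the observed series, which is the sense in which error is tracked through the system here.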
The process of cross-validation
results in an estimated model value for each point sample location, which may
then be compared with the known value at that point. Throughout this paper, the
term 'residual' is taken to imply the remainder when the estimate is subtracted
from the known value (Residual = actual - estimated value). The nature of these
residuals will vary according to the model under consideration, but always
relate to a particular 'snapshot' of spatial phenology requested. For Colorado
beetle examples, phenological outputs are expressed in terms of expected
emergence dates. Residuals in these cases are therefore measured in numbers of
days. The significance of the model performance may be explored just as if
comparing truly independent data with modelling estimates, with a similar range
of statistical options.
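
The residual convention defined above, together with two common summary statistics, can be written out as a short sketch. The emergence dates used are hypothetical values for illustration and are not results from the study.

```python
import math
from typing import List, Sequence


def residuals(actual: Sequence[float], estimated: Sequence[float]) -> List[float]:
    """Residual = actual - estimated value, following the paper's convention."""
    if len(actual) != len(estimated):
        raise ValueError("actual and estimated must have the same length")
    return [a - e for a, e in zip(actual, estimated)]


def bias(res: Sequence[float]) -> float:
    """Mean residual: a non-zero value indicates consistent over- or under-estimation."""
    return sum(res) / len(res)


def rmse(res: Sequence[float]) -> float:
    """Root mean square (r.m.s.) error of the residuals."""
    return math.sqrt(sum(r * r for r in res) / len(res))


# Hypothetical emergence dates (day of year) at three stations.
actual_days = [180, 195, 210]
estimated_days = [183, 190, 210]
res_days = residuals(actual_days, estimated_days)  # [-3, 5, 0]
```

Because the residual is actual minus estimated, a negative value here means the estimated emergence date is later than the known one.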
Results
Space-time variability of pest risk
Long-term, inter-year variability over a nationwide extent
As an illustration of a wider range of model outputs that could assist with
the process of pest risk analysis, Figure 1(a) shows the average potential
emergence date of Colorado Beetle adults in mainland England & Wales. The
temperature conditions could potentially allow beetles to reach young adulthood
in many parts of central and south east England in consecutive years, posing a
severe risk to many potato growing regions. The effect of inter-year climate
variation is illustrated within Figure 1(b). This plots the longer term risk
posed by the organism by shading the country according to the number of years in
the last thirty where adult emergence, computed on the basis of temperature,
would have been feasible should the beetle have entered the country and spread
widely.
Figure 1. (a) Hypothetical 30-year average emergence dates for Colorado beetle 1st generation adults, (b) Cumulated long-term risk index for Colorado beetle surviving at least one generation
Variability within season for a 100*100km area of undulating topography
A biological model predicting the likely development of a
Colorado Beetle was run at each point through mainland England & Wales for
which temperature values had been estimated. A small 100 km square area centred
on the Vale of York is shown, with model results for two points (1) in the
valley bottom and (2) on the moor top (Figure 2). The graphs plot the % of
beetles in a hypothetical population that are estimated to reach the different
stages of development (Y axis), by a given day in the year (1-365) on the X
axis. Plots from the upland and lowland sites are juxtaposed to show the effect
of temperature differences between the two locations on the time of beetle
development (Figure 3). Pest development at the upland site is delayed by
approximately one month in comparison to dates estimated at the lowland site.
Figure 2. (a) Distribution of potato crop (1994) and (b) elevation, Vale of York (100*100km).
Figure 3. Estimated life-cycle development throughout the year 1976 for Colorado beetle at upland (2) and lowland (1) sites marked within Figure 2, Vale of York.
Error in predicted Colorado beetle emergence date arising as a result of the use of input data from remote locations
Figure 4 illustrates the distribution of residuals (number of days error in estimated emergence date) for estimates of the date at which the larvae of the example pest reach 50% emergence of both their egg and larval stages over the 174 locations where meteorological data were available. No consistent over or under estimation in the number of days (bias) arises over the country taken as a whole, although at a minority of individual locations estimates are either early or late by up to one month.
Figure 4. National estimate of cross validated error, Colorado beetle 1976 (a) 50% emergence adult stage and (b) 50% emergence egg stage.
Discussion
By integrating temperature surfaces and biological models, the
approach allows us to explore for the first time what differences arise through
time and over space when we seek to assess risk geographically within the
agricultural system. As the hypothetical results for Colorado beetle within the
Vale of York in 1976 demonstrate, variations in geographical conditions over
relatively small distances can lead to large differences between the timings
when these areas will be at risk from pests. This underlines the need to work at
a daily time step and the importance of predicting daily inputs as accurately as
possible, with known error, so that the resulting map of risk is correctly
interpreted. Moreover, the time at which insects reach certain stages in
phenological development also varies considerably. This suggests that earlier
work in pest risk assessment based on climate data aggregated prior to modelling
(climate normal data) rather than, as in this case, aggregated post-hoc has the
potential to 'dampen down' the assessment of extreme risks, which are often the
most damaging when they occur. Additionally, previous work (e.g. Baufeld and
others 1996) has assessed areas at risk in terms of the overall landscape,
rather than in relation to the target crop data. Since the rationale of PRA is to
protect national economic interests, this issue is of considerable relevance to
decision-makers.
True validation of the spatial phenologies for the
purpose of pest management in particular requires an understanding of both
interpolation error and biological modelling error in combination at those
stages of the life cycle pertinent to potential users of the system. In the
absence of trap data in the case of indigenous pests, cross-validation allows an
initial estimate of the biological error arising as a component of the
temperature interpolation process. While cross-validated error 'glyphs' as a
by-product of interpolation have been visualised side by side for two different
time periods (Mitásová and others 1995), the method used within this paper is
novel in comparison since it allows a full exploration of the manner in which
input errors accumulate over time in terms of the resultant phenological
accuracies. The technique may also prove useful in partitioning of interpolation
error from biological modelling error.
More generally, the caveats
relating to the complexity of the overall biological system make it imperative
that the spatial results are interpreted carefully by a biological expert and,
ideally, subjected to a number of further sensitivity analyses. The multiple
options made possible by this integrated geographical modelling framework could
be used as a supportive suite of information from multiple perspectives rather
than the present reliance on a single style of mapped outputs. In turn, this may
lead to a broader understanding of the scientific conclusions and their
sensitivities at a management level. Questions regarding the modelling of
phenologies rather than populations for pest risk, or debate regarding the
wisdom of climatological determinism for such a purpose should also be
considered, and potentially incorporated within future modelling environments
supporting PRA. As King and Kraemer (1993) note, 'Modelers should avoid believing
or giving the impression that their models hold the 'answers' for policy makers.
They hold, instead, the refined results of particular points of view.' In this
way, we argue that there is scope for geographical phenologies to form a useful
role within the wider scientific toolbox that is required when undertaking a
pest risk assessment.
Conclusions
With the development of many crop pests constrained within
defined ranges of air temperature, advances in Geographic Information Systems
(GIS) and their use in conjunction with existing physical models of pest
development create a new opportunity to assess the risk to crops from native
and exotic pests. This research project brought together a nationwide
description of daily weather and other physical conditions and integrated these
with models of pest development to estimate the geographical and temporal
extents over which crop pests may survive. Mapping these results provides a
nationwide assessment of the potential threat from pests on any particular day
of the year. Risk to agriculture and horticulture from pest attack has
previously been assessed mostly using long-term monthly mean climate data,
despite the known influence of much shorter time scale changes in climatic
conditions upon the development of pests.
In an effort to attach to
these geographical maps of risk some statement of reliability, methods were
developed to assess the uncertainties in the driving meteorological input data.
We also investigated how errors may propagate through the combination of spatial
interpolation and modelling and how these errors may be manifest in the daily
geographical distributions of areas estimated to be at risk. Errors in estimates
of what we term 'geographical phenologies' (i.e. geographical distributions of
pest development - the basis for estimating risk) have previously been assessed
by limited field checking. In contrast, the statistical technique of jack-knife
cross-validation provided both a measure of the error in an estimated surface
for every time step and the measure of how this error propagates through the
biological model over time. Given that examples of false confidence and
over-security in computerised maps abound, the explicit provision of residual
and r.m.s. errors, even if partial when considering the multiple sources of
uncertainty, reduces the possibility of model outputs being misinterpreted and
the accompanying caveats being ignored or blurred. This is particularly the case
where the interpretation and application of model results can have important
economic and social consequences, as in this application area.
Recommendations for future research
In follow-on work we compare whether the accuracy of temperature
estimates can be improved further by using different mathematical interpolators
or by using more data on geographical conditions to guide the estimation.
Further work with MAFF, the Meteorological Office and other partners will
explore how accurately we can estimate air temperature and other
agro-meteorological data using the existing UKMO network and what improvements
could be obtained from an expanded network. This work will explore the
possibilities of driving insect phenology models using spatially distributed
model input data. Improved results will support the case for improved spatial
agro-meteorological data provision to complement and capitalise upon biological
research efforts, especially within practical agricultural decision support
systems. In support of this research direction, Mineter, Dowers and Gittings
(2000) identify the possibilities afforded by high end computing when delivering
biological model inputs or outputs to users in near real-time, for example over
the Internet.
The methods developed in this project provide a basis by
which environmental modellers may begin to identify and partition errors that
arise from sparseness of the input data and from the structure of the subsequent
process-based modelling. While we have exploited jack-knife cross-validation to
explore the propagation of input errors through biological models run over
multiple time steps, theoretical approaches to the space-time modelling of error
have still to be addressed.
Acknowledgements
This work was initiated under a collaborative PhD studentship for C.H. Jarvis at the Department of Geography, University of Edinburgh funded by Central Science Laboratory (CSL). Thanks are owed to Dr RHA Baker and Dr M Hims of CSL for the provision of phenology models and their ongoing interest in this work. Data was supplied by the Ordnance Survey and the UK Meteorological Office. The opinions expressed within this paper are not necessarily those of CSL.
References
Arbia, G., Griffith, D., Haining, R. (1998) Error propagation modelling in raster GIS: overlay operations, International Journal of Geographical Information Science, 12, 145-167.
Baker, C.R.B., Cohen, L.I. (1985) Further development of a computer model for simulating pest life cycles, Bulletin OEPP/EPPO Bulletin, 15, 317-324.
Baker, R.H.A. (1996) Developing a European pest risk mapping system, Bulletin OEPP/EPPO Bulletin, 26, 485-494.
Baufeld, P., Enzian, S., Motte, G. (1996) Establishment potential of Diabrotica virgifera in Germany, Bulletin OEPP/EPPO Bulletin, 26, 511-518.
Cressie, N. (1991) Statistics for Spatial Data, Wiley: New York, pp 900.
Henderson-Sellars, A. (1996) Can we integrate climate modelling and assessment? Environmental Modeling and Assessment, 1, 59-70.
Heuvelink, G.B.M. (1998) Error propagation in environmental modelling, Taylor and Francis: London, pp127.
Hutchinson, M.F. (1991) The application of thin plate smoothing splines to continent-wide data assimilation, In Data Assimilation Systems, J.D. Jasper (ed.), BMRC Research Report No. 27, Bureau of Meteorology, Melbourne, 104-113.
Jarvis, C.H. (1999) Insect phenology: a geographical perspective, PhD thesis, Department of Geography, University of Edinburgh, Scotland.
Jarvis, C.H., Baker, R.H.A. (Submitted) Risk assessment for non-indigenous pests: a geographical approach to assessing the likelihood of establishment, Journal of Biogeography
Jarvis, C.H., Morgan, D., Baker, R.H.A (Submitted) Assessing the impact of interpolated input data on the results of phenology models: a case study of codling moth over England and Wales, Agriculture, Ecosystems and Environment
Jarvis, C.H., Stuart, N. (2001a, In press) A comparison between strategies for interpolating maximum and minimum daily air temperatures, a. The selection of 'guiding' variables, Journal of Applied Meteorology.
Jarvis, C.H., Stuart, N. (2001b, In press) A comparison between strategies for interpolating maximum and minimum daily air temperatures, b. The interaction between guiding variable and interpolation method, Journal of Applied Meteorology.
Jarvis, C.H., Stuart, N. (Submitted) Uncertainties in modelling with time series data: estimating the development of crop pests throughout the year, Transactions in GIS.
Jarvis, C.H., Stuart, N., Morgan, D., Baker, R.H.A. (1999). To interpolate and thence to model, or vice versa? In Integrating Information Infrastructures with Geographical Information Technology, Gittings B. (ed.) Chapter 18, Taylor and Francis: London, p229-242.
King, J.L., Kraemer, K.L. (1993) Models, facts, and the policy process: the political ecology of estimated truth, In Goodchild, M.F., Parks, B.O., Steyaert, L.T. (1993) (Eds.) Environmental modelling with GIS, Oxford University Press: New York, pp 353-360.
Mineter, M.J., Dowers, S., Gittings, B.M. (2000) Towards a HPC framework for integrated processing of geographical data: encapsulating the complexity of parallel algorithms, Transactions in GIS, 4, 245-262.
Mitásová, H., Mitás, L., Brown, W.M., Gerdes, D.P., Kosinovsky, I., Baker, T. (1995) Modelling spatially and temporally distributed phenomena: new methods and tools for GRASS GIS, International Journal of Geographical Information Systems, 9, 433-446.
Régnière, J. (1996) A generalized approach to landscape-wide seasonality forecasting with temperature driven simulation models. Environmental Entomology, 25, 869-881.
Royer, M.H., Yang, X.B. (1991) Application of high-resolution weather data to pest risk assessment, Bulletin OEPP/EPPO Bulletin 21, 609-614.
Schaub, L.P., Ravlin, F.W., Gray, D.R., Logan, J.A. (1995) Landscape framework to predict phenological events for gypsy moth (Lepidoptera: Lymantriidae) management problems. Environmental Entomology, 24, 10-18.
Skarratt, D.B., Sutherst, R.W., Maywald, G.F. (1995) CLIMEX for Windows, Version 1.0: User's Guide, CSIRO: Brisbane, pp 92.
Authors
Claire Jarvis, Research Fellow, Department of
Geography
The University of Edinburgh, Drummond St, Edinburgh, Scotland,
United Kingdom, EH8 9XP
Email: chj@geo.ed.ac.uk, Tel: +44-131-650-2662, Fax:
44-131-650-2524
Neil Stuart, Lecturer, Department of
Geography
The University of Edinburgh, Drummond St, Edinburgh, Scotland,
United Kingdom, EH8 9XP
Email: ns@geo.ed.ac.uk, Tel: +44-131-650-2549, Fax:
44-131-650-2524