Posted on 2007-02-07

Logistic Regression for LUCC Modelling

This post is my third contribution to JustScience week.

In Land Use/Cover Change (LUCC) studies, empirical (statistical) models use the observed relationship between independent variables (for example mean annual temperature, human population density) and a dependent variable (for example land-cover type) to predict the future state of that dependent variable. The primary limitation of this approach is the inability to represent systems that are non-stationary.

Non-stationary systems are those in which the relationships between variables change through time. The assumption of stationarity rarely holds in landscape studies – both biophysical (e.g. climate change) and socio-economic (e.g. agricultural subsidies) driving forces are open to change. Two primary empirical models are available for studying land use and cover change: transition matrix (Markov) models and regression models. My research has focused on the latter, particularly the logistic regression model.
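To make the idea concrete, here is a minimal sketch of a binary logistic regression fitted by gradient descent. It is not the model used in my research: the two predictors, the pixel values and the "changed / not changed" labels are all hypothetical, chosen only to show how observed relationships become a probability of change for each pixel.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, xi):
    """Probability of change for one pixel; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Hypothetical pixels: [standardised slope, standardised distance to road]
X = [[-1.2, -0.8], [-0.7, -1.1], [-0.3, 0.2], [0.1, -0.4],
     [0.4, 0.9], [0.9, 0.5], [1.3, 1.0], [-0.9, 0.3]]
# 1 = pixel changed cover type between two dates, 0 = no change (made up)
y = [0, 0, 0, 0, 1, 1, 1, 0]

weights = fit_logistic(X, y)
print(predict_proba(weights, [1.3, 1.0]))    # high predictor values
print(predict_proba(weights, [-1.2, -0.8]))  # low predictor values
```

The fitted weights encode the relationship observed between the two dates. Applying them to a future date is exactly where the stationarity assumption enters.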

Figure 1.

Figure 1 above shows observed land cover for 3 years (1984 – 1999) for SPA 56, with a fourth map (2014) predicted from this data. Models run for observed periods of change for SPA 56 were found to have a pixel-by-pixel accuracy of up to 57%. That is, only just over half of the map was correctly predicted. Not so good really…

Pontius and colleagues have bemoaned such poor performance of models of this type, highlighting that models are often unable to perform even as well as the ‘null model of no change’. That is, assuming the landscape does not change from one point in time to another is often a better predictor of the landscape (at the second point in time) than a regression model! Clearly, maps of future land cover from these models should be understood as projections of future land cover that hold only if observed trends continue unchanged into the future (i.e. the stationarity condition is maintained).
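The comparison with the null model is easy to sketch. The example below is purely illustrative: the three tiny "maps" are hypothetical strings of land-cover classes, not data from SPA 56, and pixel-by-pixel agreement is only one of several ways to score a map.

```python
def pixel_accuracy(predicted, observed):
    """Fraction of pixels whose predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("maps must have the same number of pixels")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical flattened maps; each letter is a land-cover class
map_t1 = list("FFFFSSSSAAAA")    # observed at time 1
map_t2 = list("FFFSSSSSAAAU")    # observed at time 2
model_t2 = list("FFSSSSAAAAUU")  # a model's prediction for time 2

model_acc = pixel_accuracy(model_t2, map_t2)
null_acc = pixel_accuracy(map_t1, map_t2)  # null model: nothing changes

print(f"model: {model_acc:.2f}, null model of no change: {null_acc:.2f}")
```

In this made-up case only 2 of 12 pixels actually change, so simply carrying the first map forward scores better than a model that predicts change in the wrong places.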

Acknowledging the stationarity assumption is perhaps more important from a socio-economic perspective than from a biophysical one, because that is where the assumption is more likely to be invalid. Whilst biophysical processes might be assumed to be relatively constant over decadal timescales (climatic change aside), this will likely not be the case for many socio-economic processes. With regard to SPA 56 for example, the recent expansion of the European Union to 25 countries, and the consequent likely restructuring of the Common Agricultural Policy (CAP), will lead to shifts in the political and economic forces driving LUCC in the region. The implication is that where socio-economic factors are important contributors to landscape change, regression models are unlikely to be very useful for predicting future landscapes and making subsequent ecological interpretation or management decisions.

Because of the shortcomings of this type of model, alternative methods for better understanding processes of change, and likely future landscape states, will be useful. For example, hierarchical partitioning is a method for using statistical modelling in an explanatory capacity rather than for predictive purposes. Work I did on this with colleagues was recently accepted for publication by Ecosystems and I’ll discuss it in more detail tomorrow. The main thrust of my PhD, however, is the development of an integrated socio-ecological simulation model that considers agricultural decision-making, vegetation dynamics and wildfire regimes.

Technorati Tags: regression, modelling, LandUse, stationarity

Posted on 2007-02-06

Characterizing wildfire regimes in the United States

This post is my second contribution to JustScience week, and follows on from the first post yesterday.

During my Master’s thesis I worked with Dr. Bruce Malamud to examine wildfire frequency-area statistics and their ecological and anthropogenic drivers. Work resulting from this thesis led to the publication of Malamud et al. 2005.

We examined wildfire statistics for the conterminous United States (U.S.) in a spatially and temporally explicit manner. We used a high-resolution data set of 88,916 U.S. Department of Agriculture Forest Service wildfires over the time period 1970-2000 to consider wildfire occurrence as a function of biophysical landscape characteristics. We used Bailey’s ecoregions as shown by Figure 1A below.

Figure 1.

In Bailey’s classification, the conterminous U.S. is divided into ecoregion divisions according to common characteristics of climate, vegetation, and soils. Mountainous areas within specific divisions are also classified. In the paper, we used ecoregion divisions to subdivide the wildfire database geographically, so that the statistical analyses could be run separately for each division. Figure 1B above shows the location of USFS lands in the conterminous U.S.

We found that wildfires exhibit robust frequency-area power-law behaviour in the 18 different ecoregions and used power-law exponents (normalized by ecoregion area and the temporal extent of the wildfire database) to compare the scaling of wildfire-burned areas between ecoregions. Normalizing the relationships allowed us to map the frequency-area relationships, as shown in Figure 2A below.

Figure 2.

This mapping exercise shows a systematic east-to-west gradient in power-law exponent beta values. This gradient suggests that the ratio of the number of large to small wildfires decreases from east to west across the conterminous U.S. Controls on the wildfire regime (for example, climate and fuels) vary temporally, spatially, and at different scales, so it is difficult to attribute specific causes to this east-to-west gradient. We suggested that the reduced contribution of large wildfires to total burned area in eastern ecoregion divisions might be due to greater human population densities that have increased forest fragmentation compared with western ecoregions. Alternatively, the gradient may have natural drivers, with climate and vegetation producing conditions more conducive to large wildfires in some ecoregions compared with others.

Finally, this method allowed us to calculate recurrence intervals for wildfires of a given burned area or larger for each ecoregion (Figure 2B above). In turn this allowed for the classification of wildfire regimes for probabilistic hazard estimation in the same vein as is now used for earthquakes.
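As a rough illustration of how a recurrence interval can follow from a fitted frequency-area relation, the sketch below integrates a power-law frequency density to get the yearly rate of fires at or above a given size. It is not necessarily the procedure used in the paper: it assumes the power law extends to arbitrarily large fires and an exponent greater than 1, and the constant, exponent and region size are hypothetical.

```python
def recurrence_interval(area_km2, c, beta, region_km2):
    """Years between fires of at least `area_km2` in a region.

    Assumes a frequency density f(A) = c * A**(-beta), in fires per year
    per km^2 of region per km^2 of burned-area bin, valid for all A above
    `area_km2` (a simplifying assumption; real fires have an upper limit).
    """
    if beta <= 1.0:
        raise ValueError("beta must exceed 1 for the integral to converge")
    # Integral of c * A**(-beta) from area_km2 to infinity
    rate_per_km2 = c * area_km2 ** (1.0 - beta) / (beta - 1.0)
    fires_per_year = rate_per_km2 * region_km2
    return 1.0 / fires_per_year

# Hypothetical parameters for a 10,000 km^2 region
t_small = recurrence_interval(1.0, c=1e-4, beta=1.5, region_km2=1e4)
t_large = recurrence_interval(100.0, c=1e-4, beta=1.5, region_km2=1e4)
print(t_small, t_large)  # larger fires recur less often
```

Because the calculation scales with region size, the same normalized relation gives shorter intervals for larger regions, which is why normalizing by ecoregion area matters when comparing regimes.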

Read the full paper here.

Technorati Tags: wildfire, statistics, mapping, risk, hazard

Posted on 2007-02-05

Wildfire Frequency-Area Scaling Relationships

This post is the first of my contributions to JustScience week.

Wildfire is considered an integral component of ecosystem functioning, but often comes into conflict with human interests. Thus, understanding and managing the relationship between wildfire, ecology and human activity is of particular interest to both ecologists and wildfire managers. Quantifying the wildfire regime is useful in this regard. The wildfire regime is the name given to the combination of the timing, frequency and magnitude of all fires in a region. The relationship between the frequency and magnitude of fires, the frequency-area distribution, is one particular aspect of the wildfire regime that has become of interest recently.

Malamud et al. 1998 examined ‘Forest Fire Cellular Automata’, finding a power-law relationship between the frequency and size of events. The power-law relationship takes the form:

    f(A) ∝ A^(-β)

where f(A) is the frequency of fires with size A, and β is a constant. β is a measure of the ratio of small to medium to large size fires and how frequently they occur. The smaller the value of β, the greater the contribution of large fires (compared to smaller fires) to the total burned area of a region. The greater the value, the smaller the contribution. Such a power-law relation is represented on a log-log plot as a straight line, as the example from Malamud et al. 2005 shows:

Circles show the number of wildfires per “unit bin” of 1 km^2 (in this case normalized by database length in years and area in km^2) plotted as a function of wildfire area. Also shown is a solid line (best least-squares fit) with coefficient of determination r^2. Dashed lines represent lower/upper 95% confidence intervals, calculated from the standard error. Horizontal error bars on burned area are due to measurement and size binning of individual wildfires. Vertical error bars represent two standard deviations of the normalized frequency densities and are approximately the same as the lower and upper 95% confidence interval.
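For readers who want to try this, the straight-line fit on log-log axes can be sketched as an ordinary least-squares regression of log frequency density on log area. This is a minimal illustration, not the paper's code: the function name is my own, and the input values are synthetic, generated from an exact power law with an exponent of 1.4 so that the fit has a known answer.

```python
import math

def fit_power_law(areas, freq_densities):
    """Least-squares fit of log10(f) = log10(C) - beta * log10(A).

    Returns (beta, C, r2), where r2 is the coefficient of determination
    of the straight line in log-log space.
    """
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(f) for f in freq_densities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return -slope, 10 ** intercept, r2

# Synthetic binned data (km^2) from an exact power law; not real wildfires
areas = [0.01, 0.1, 1.0, 10.0, 100.0]
freqs = [2.0 * a ** -1.4 for a in areas]

beta, c, r2 = fit_power_law(areas, freqs)
print(beta, c, r2)
```

With real data the points scatter around the line, and the slope (with its sign reversed) is the exponent reported for each region.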

As a result of their work on the forest fire cellular automata, Malamud et al. 1998 wondered whether the same relation would hold for empirical wildfire data. They found the power-law relationship did indeed hold for observed wildfire data for parts of the US and Australia. As Millington et al. 2006 discuss, since this seminal publication several other studies have suggested a power-law relationship is the best descriptor of the frequency-size distribution of wildfires around the world.

Technorati Tags: wildfire, statistics, JustScience

Direction not Destination

Blog by James Millington, PhD

Tag: Statistical
