Posted on 2007-06-24

Alternative Model Assessment Criteria

Given the discussion in the previous posts regarding the nature of socio-ecological systems, equifinality and relativism in environmental modelling, how should we go about assessing the worth and performance of our simulation models of human-environment systems?

Simulation models are tangible manifestations of a modeller’s ‘mental model’ of the structure of the system being examined. Socio-Ecological Simulation Models (SESMs) may be thought of as logical and factual arguments made by a modeller, based on their mental model. If the model assumptions hold, these arguments should provide a cogent and persuasive indication of how system states may change under different scenarios of environmental, economic and social conditions. However, the resulting simulation model, based upon a logical and factually coherent mental model, is unlikely to be validated on these two criteria (logic and fact) alone.

First, the problems of equifinality suggest that there are multiple logical model structures that could be implemented for any particular system. Second, accurate mimetic reproduction of an empirical system state by a model may be the most persuasive form of factual proof of a model in many eyes, but the dangers of affirming the consequent make it impossible to prove that temporal predictions in models of open systems are truly accurate. Simulation models may be based on facts about empirical systems, but their results cannot be taken as facts about the modelled empirical system.

Thus, some other criteria alongside the logical and factual criteria will be useful to evaluate or validate a SESM. A third and a fourth criterion, for environmental simulation models that consider the interaction of social and ecological systems at least, are available by specifically considering the user(s) of a model and its output. These criteria are closely linked.

My third proposed criterion is the establishment of user trust in the model. Trust is used here in the sense of ‘confidence in the model’. If a person using a model or its results does not trust the model it will likely not be deemed fit for its intended purpose. If confidence is lacking in the model or its results, confidence will consequently be lacking in any knowledge derived, decision made, or policy recommended based upon the model. Thus, the use of trust as a criterion for validation is a form of ‘social validation’, ensuring that user(s) agree the model is a legitimate representation of the system.

The fourth criterion by which a model might achieve legitimacy and receive a favourable evaluation (i.e. be validated) is the provision of some form of utility to the user. This utility will be termed ‘practical adequacy’. If a model is not trusted then it will not be practically adequate for its purpose. However, regardless of trust, if the model is not able to address the problems or questions set by the user then the model is equally practically inadequate.

The addition of these two criteria, centred on the model user rather than the model itself, suggests a shift away from falsification and deduction as model validation techniques, toward more reflexive approaches. The shift in emphasis is away from establishing the truth and mimetic accuracy of a model and toward ensuring trust and practical adequacy. By considering trust and practical adequacy, validation becomes an exercise in model evaluation and reclaims its more appropriate meaning of ‘establishing a model’s legitimacy’.

From his observation of experimental physicists and work on the ‘experimenter’s regress’, Collins has arrived at the view that there is no distinction between epistemological criteria and social forces in resolving a scientific dispute. The position outlined previously seems to imply a similar situation for models of open, middle-numbered systems where modellers are required to resort to social criteria to justify their models due to the inability to do so convincingly epistemologically. This is not necessarily an idea that many natural scientists will sit comfortably with. However, the shift away from truth and mimetic accuracy should not necessarily be something modellers would object to.

First, all modellers know that their models are not true, exact replications of reality. A model is an approximation of reality – there is no need to create a model system if experimentation on the existing empirical system is possible. Furthermore, accepting that the results of a model are not ‘true’ (i.e. in the sense that they are perfect predictions of the future) in no way requires the model be built on incorrect logic or facts. As Hesse notes in criticism of Collins, whilst the resolution of scientific disputes might result from a social decision that is not forced by the facts, “it does not follow that social decision has nothing to do with objective fact”.

Second, regardless of truth and mimetic accuracy, modellers have several options to build trust and ensure practical adequacy scientifically. Ensuring models are logically coherent and not factually invalid (i.e. criteria one and two) will already have gone some way towards making a scientific case. Furthermore, the traditions of scientific methodological and theoretical simplicity and elegance can be observed, and the important unifying potential across theories and between disciplines that modelling offers can be emphasised. Thus, regardless of the failures of epistemological methods for justifying them, socio-ecological and other environmental simulation models must be built upon solid logical and factual foundations;

“The postmodern world may be a nightmare for … normal science (Kuhn 1962), but science still deserves to be privileged, because it is still the best game in town. … [Scientists] need to continue to be meticulous and quantitative. But more than this, we need scientific models that can inform policy and action at the larger scales that matter. Simple questions with one right answer cannot deliver on that front. The myth of science approaching singular truth is no longer tenable, if science is to be useful in the coming age.”
(Allen et al. p.484)

Post-normal science highlights the importance of finding alternative ways for science to engage with both the problems faced in the contemporary world and the people living in that world. As they have been defined here, SESMs will inherently address questions that will be of concern to more than just scientists, including problems of the ‘risk society’. From a modelling perspective, a post-normal science approach highlights the need to build trust in the eyes of non-scientists such that understanding is fostered.

Further, it emphasises the need for SESMs to be practically adequate such that good decisions can be made promptly. It also implies that the manner in which a ‘normal’ scientist will go about assessing the trustworthiness or practical adequacy of a model (such as the methods described above) will differ markedly from that of a non-scientist. For example, a scientific model user will often, but not always, also be the person who developed and constructed the model. In such a case the model will have been constructed to ensure it is practically adequate to address their particular scientific problems and questions.

When the model is to be used by other parties the issue of ensuring practical adequacy will not be so straightforward, particularly when the user is a non-scientist. In such situations, the modeller needs to ask the question ‘practically adequate for what?’ The inhabitants of the study areas investigated will have a vested interest in the processes being examined and will themselves have questions that could be addressed by the model. In all probability many of these questions will be ones that the modeller themselves has not considered or, if they have, may not have considered relevant. Further, the questions asked by local stakeholders may be non-scientific – or at least may be questions that environmental scientists are not used to attempting to answer.

The use and improvement of technical approaches (such as spatial error matrices from pixel-by-pixel model assessment) will remain useful and necessary in the future. Here, however, I have emphasised that alternative methods for model validation (assessment) might usefully draw on the additional information and knowledge available from those actors driving change in a socio-ecological system. In other words, there is information within the system of study that is not utilised for model assessment by simply comparing observed and predicted system states. This information is present in the form of local stakeholders’ knowledge and experience.
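
As a rough illustration of the pixel-by-pixel assessment mentioned above, the sketch below cross-tabulates an observed and a predicted land-cover grid into an error matrix and reports the proportion of cells that agree. The grids, the class names and the function names are all invented for illustration; this is not the assessment code used in the thesis.

```python
from collections import Counter


def error_matrix(observed, predicted):
    """Cross-tabulate observed vs predicted classes, cell by cell.

    Returns a Counter keyed by (observed_class, predicted_class).
    """
    same_shape = len(observed) == len(predicted) and all(
        len(o) == len(p) for o, p in zip(observed, predicted)
    )
    if not same_shape:
        raise ValueError("grids must have the same shape")
    counts = Counter()
    for obs_row, pred_row in zip(observed, predicted):
        for obs, pred in zip(obs_row, pred_row):
            counts[(obs, pred)] += 1
    return counts


def overall_agreement(counts):
    """Proportion of cells where the predicted class matches the observed class."""
    total = sum(counts.values())
    agree = sum(n for (obs, pred), n in counts.items() if obs == pred)
    return agree / total


# Hypothetical 3 x 3 land-cover maps (illustrative values only).
observed = [["forest", "forest", "crop"],
            ["forest", "crop", "crop"],
            ["shrub", "shrub", "crop"]]
predicted = [["forest", "shrub", "crop"],
             ["forest", "crop", "crop"],
             ["shrub", "forest", "crop"]]

matrix = error_matrix(observed, predicted)
agreement = overall_agreement(matrix)
print(dict(matrix))
print(round(agreement, 3))
```

A summary like this only compares observed and predicted states; it says nothing about whether the model's users trust it or find it adequate for their questions, which is the gap the additional criteria are meant to address.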

Posted on 2007-06-17

Relativism in Environmental Simulation Modelling

Partly as a result of the epistemological problems described in my previous few posts, Keith Beven has forwarded a modelling philosophy that accepts uncertainty and a more relativist perspective. This realist approach demands greater emphasis on pluralism, use of multiple hypotheses, and probabilistic approaches when formulating and parameterising models. When pressed to comment further on his meaning of relativism, Beven highlights the problems of rigidly objective measures of model performance and of ‘observer dependence’ throughout the modelling process;

“Claims of objectivity will often prove to be an illusion under detailed analysis and for general applications of environmental models to real problems and places. Environmental modelling is, therefore, necessarily relativist.”

Beven suggests the sources of relativistic operator dependencies include;

Operator dependence in setting up one or more conceptual model(s) of the system, including subjective choices about system structure and how it is closed for modelling purposes; the processes and boundary conditions it is necessary to include and the ways in which they are represented.
Operator dependence in the choice of feasible values or prior distributions (where possible) for ‘free’ parameters in the process representations, noting that these should be ‘effective’ values that allow for any implicit scale, nonlinearity and heterogeneity effects.
Operator dependence in the characterization of the input data used to drive the model predictions and the uncertainties of the input data in relation to the available measurements and associated scale and heterogeneity effects.
Operator dependence in deciding how a model should be evaluated, including how predicted variables relate to measurements, characterization of measurement error, the choice of one or more performance measures, and the choice of an evaluation period.
Operator dependence in the choice of scenarios for predictions into the future.

The operator dependencies have been highlighted in the past, but have re-emerged in the thoughts of geographers (Demeritt, Brown, O’Sullivan, Lane et al.), environmental scientists (Oxley and Lemon), social scientists (Agar) and philosophers of science (Collins, Winsberg).

Notably, although with reference to experimental physics rather than environmental simulation modelling, Collins identified the problem of the ‘experimenter’s regress’. This problem states that a successful experiment occurs when experimental apparatus is functioning properly – but in novel experiments the proper function of the apparatus can only be confirmed by the success or failure of the experiment. So in situations at the boundaries of established knowledge and theory, not only are hypotheses contested, but so too are the standards and methods by which those hypotheses are confirmed or refuted. As a result, Collins suggests experimentation becomes a ‘skilful practice’ and that experimenters accept results based not on epistemological or methodological grounds, but on a variety of social (e.g. group consensus) and expert (e.g. perceived utility) factors.

This stance is echoed in many respects by Winsberg’s ‘epistemology of simulation’, which suggests simulation is a ‘motley’ practice and has numerous ingredients of which theoretical knowledge is only one. The approximations, idealisations and transformations used by simulation models to confront analytically intractable problems (often in the face of sparse data), need to be justified internally (within the model construction process) on the basis of existing theory, available data, empirical generalisations, and the modeller’s experience of the system and other attempts made to model it.

Similarly, Brown suggests that in the natural sciences uncertainty is rarely viewed as being due to the interaction of social and physical worlds (though Beven’s environmental modelling philosophy outlined above does) and that modellers of physical environmental processes might learn from the social sciences where the process of gaining knowledge is understood to be important for assessing uncertainty.

However, whilst an extreme rationalist perspective prevents validation and useful analysis of the utility of a model, its output, and the resulting knowledge (because of problems such as affirming the consequent), so too does an extreme relativist stance which understands model and model builder to be inseparable. Rather, as Kleindorfer et al. suggest, modellers need to develop the means to increase the credibility of the model such that “meaningful dialogue on a model’s warrantability” can be conducted. How and why this might be achieved will be discussed in future posts.

Posted on 2007-05-28

Validating Models of Open Systems

A simulation model is an internally logically-consistent theory of how a system functions. Simulation models are currently recognised by environmental scientists as powerful tools, but the ways in which these tools should be used, the questions they should be used to examine, and the ways in which they can be ‘validated’ are still much debated. Whether a model aims to represent an ‘open’ or a ‘closed’ system has implications for the process of validation.

Issues of validation and model assessment are largely absent in discussions of abstract models that purport to represent the fundamental underlying processes of ‘real world’ phenomena such as wildfire, social preferences and human intelligence. These ‘metaphor models’ do not require empirical validation in the sense that environmental and earth systems modellers use it, as the very formulation of the system of study ensures it is ‘closed’. That is, the system the model examines is logically self-contained and neither influenced by, nor interactive with, outside statements or phenomena. The modellers do not claim to know much about the real world system which their model is purported to represent, and do not claim their model is the best representation of it. Rather, the modelled system is related to the empirical phenomena via ‘rich analogy’ and investigators aim to elucidate the essential system properties that emerge from the simplest model structure and starting conditions.

In contrast to these virtual, logically closed systems, empirically observed systems in the real world are ‘open’. That is, they are in a state of disequilibrium with flows of mass and energy both into and out of them. Examples in environmental systems are flows of water and sediment into and out of watersheds and flows of energy into (via photosynthesis) and out of (via respiration and movement) ecological systems. Real world systems containing humans and human activity are open not only in terms of conservation of energy and mass, but also in terms of information, meaning and value. Political, economic, social, cultural and scientific flows of information across the boundaries of the system cause changes in the meanings, values and states of the processes, patterns and entities of each of the above social structures and knowledge systems. Thus, system behaviour is open to modification by events and phenomena outside the system of study.

Alongside being ‘open’, these systems are also ‘middle-numbered’. Middle-numbered systems differ from small-numbered systems (controlled situations with few interacting components, e.g. two billiard balls colliding) that can be described and studied well using Cartesian methods, and large-numbered systems (many, many interacting components, e.g. air molecules in a room) that can be described and studied using techniques from statistical physics. Rather, middle-numbered systems have many components, the nature of interactions between which is not homogeneous and is often dictated or influenced by the condition of other variables, themselves changing (and potentially distant) in time and space. Such a situation might be termed complex (though many perspectives on complexity exist). Systems at the landscape scale in the real world are complex and middle-numbered. They exist in a unique time and place. In these systems history and location are important and their study is necessarily a ‘historical science’ that recognises the difficulty of analysing unique events scientifically through formal, laboratory-type testing and the hypothetico-deductive method. Most real-world systems possess these properties, and coupled human-environment systems are a prime example.

Traditionally, laboratory science has attempted to isolate real world systems such that they become closed and amenable to the hypothetico-deductive method. The hypothetico-deductive method is based upon logical prediction of phenomena independent of time and place and is therefore useful for generating knowledge about logically, energetically and materially ‘closed’ systems. However, the ‘open’ nature of many real-world, environmental systems (which cannot be taken into the laboratory and instead must be studied in situ) is such that the hypothetico-deductive method is often problematic to implement in order to generate knowledge about environmental systems from simulation models. Any conclusions drawn using the hypothetico-deductive method for open systems using a simulation model will implicitly be about the model rather than the open system it represents. Validation has also frequently been used, incorrectly, as synonymous with demonstrating that the model is a truly accurate representation of the real world. By contrast, validation in the discussion presented in this series of blog posts refers to the process by which a model constructed to represent a real-world system has been shown to represent that system well enough to serve that model’s intended purpose. That is, validation is taken to mean the establishment of model legitimacy – usually of arguments and methods.

In the next few posts I’ll examine the rise of (critical) realist philosophies in the environmental sciences and environmental modelling and will explore the philosophy underlying these problems of model validation in more detail.

Posted on 2007-05-25

Validating and Interpreting Socio-Ecological Simulation Models

Over the next 9 posts I’ll discuss the validation, evaluation and interpretation of environmental simulation modelling. Much of this discussion is taken from chapter seven of my PhD thesis, arising out of my efforts to model the impacts of agricultural land use change on wildfire regimes in Spain. Specifically, the discussion and argument are focused on simulation models that represent socio-ecological systems. Socio-Ecological Simulation Models (SESMs), as I will refer to them, are those that represent explicitly the feedbacks between the activities and decisions of individual actors and their social, economic and ecological environments.

To represent such real-world behaviour, models of this type are usually spatially explicit and agent-based (e.g. Evans et al., Moss et al., Evans and Kelley, An et al., Matthews and Selman) – the model I developed is an example of a SESM. One motivating question for the discussion that follows is, considering the nature of the systems and issues they are used to examine, how we should go about approaching model evaluation or ‘validation’. That is, how do we identify the level of confidence that can be placed in the knowledge produced by the use of a SESM? A second question is, given the nature of SESMs, what approaches and tools are available and should be used to ensure models of this type provide the most useful knowledge to address contemporary environmental problems?

The discussion that follows adopts a (pragmatic) realist perspective (in the tradition of Richards and Sayer) and recognises the importance of the open, historically and geographically contingent nature of socio-ecological systems. The difficulties of attempting to use general rules and theories (i.e. a model) to investigate and understand a unique place in time are addressed. As increasingly acknowledged in environmental simulation modelling (e.g. Sarewitz et al.), socio-ecological simulation modelling is a process in itself in which human decisions come to the fore – both because human decision-making is being modelled but also, importantly, because modellers’ decisions during model construction are a vital component of the process.

If these models are intended to inform policy-makers and stakeholders about potential impacts of human activity, the uncertainty inherent in them needs to be managed to ensure their effective use. Fostering trust and understanding via a model that is practically adequate for purpose may aid traditional scientific forms of model validation and evaluation. The list below gives the titles of the posts that will follow over the next couple of weeks (and will become links when the post is online).

The Nature of Open Systems
Realist Philosophy in the Environmental Sciences
Equifinality
Interactive vs. Indifferent Kinds
Affirming the Consequent
Relativism in Modelling
Alternative Model Assessment Criteria
Stakeholder Participation and Expertise
Summary

Posted on 2007-05-22

getting my head round things

Now that I’m into my second week at MSU, things have calmed down a little. I’ve ploughed through most of the necessary admin, met many of the people I’ll be working with here at CSIS and throughout MSU (although, being summer, campus is quiet right now – the undergrads are gone and the postgrads are away on their fieldwork), and finally got my apartment into a liveable state. The next few weeks will no doubt be spent really getting my head around what we’re aiming to achieve with this integrated ecological-economic modelling project. For example, during the next month or two I’ll take a trip up to our study area to get a feel for the landscape, see the experimental plots that have been put in place previously, and gain a better understanding of the subsequent effects of timber harvesting. Also I plan on meeting and interviewing several key management stakeholders from organisations such as Michigan’s Department of Natural Resources and The Nature Conservancy to get their perspective on the landscape and what they might gain from our work. I’ve also been examining some of the tools that we hope to utilise and build upon, such as the USFS’ Forest Vegetation Simulator.

So whilst I get my head around exactly what this new project is all about, I’ll continue to blog about some of the work coming out of my PhD thesis. I’ve been threatening to do this for a while, and now I really mean it. Specifically, I’ll walk through the later stages of my thesis where I explored the potential of more reflexive forms of model validation – seeing the modelling process as an end in itself, a learning process, rather than a means to an end (i.e. the model) which is then used to ‘predict’ the future. I’ll discuss the philosophy underlying this perspective before re-examining my efforts to engage the model I produced with local stakeholders after the model had been ‘completed’ with their minimal input.

And of course, I’ll throw in the odd comment to let you know how things are going here in this new world I’ve recently landed in. Like my trip to the grey and windswept Lake Michigan at the weekend – I’m going to have to look into this kite-surfing stuff…

Posted on 2007-05-12

PhD pass!

After a gruelling three-and-a-half hour examination yesterday, my examiners Prof. Keith Richards and Prof. Eric Lambin are satisfied that I should be awarded the degree of PhD, subject to three minor amendments!

Thanks to everyone that helped me celebrate in London last night. Also, thanks to all those that helped me along the way on my PhD journey: George, Raul, David, John, David, Bruce, Shatish, Margaret, Rob, Alison, Isobel, Erin, Kat, Andreas, Ben, Chris, Gareth, Isobel, Helen, Nick, Pete, Chris, Mark, Laura, Jamie, Helen, Neil, Nicky, Javier, Livs, Mum, Dad, Michael and Mark… and anyone else I’ve forgotten! Stay in touch everyone.

I’m off across the pond to start my postdoc at MSU tomorrow. Eight great years in London at King’s over, hopefully many more to come elsewhere…

Posted on 2007-03-13

PhD Thesis Completed

So, finally, it is done. As I write, three copies of my PhD Thesis are being bound ready for submission tomorrow! I’ve posted a short abstract below. If you want a more complete picture of what I’ve done you can look at the Table of Contents and read the online versions of the Introduction and Discussion and Conclusions. Email me if you want a copy of the whole thesis (all 81,000 words, 277 pages of it).

So just the small matter of defending the thesis at my viva voce in May. But before that I think it’s time for a celebratory beer on the South Bank of the Thames in the evening sunshine…

Modelling Land-Use/Cover Change and Wildfire Regimes in a Mediterranean Landscape

James D.A. Millington
March 2007

Department of Geography
King’s College, London

Abstract
This interdisciplinary thesis examines the potential impacts of human land-use/cover change upon wildfire regimes in a Mediterranean landscape using empirical and simulation models that consider both social and ecological processes and phenomena. Such an examination is pertinent given contemporary agricultural land-use decline in some areas of the northern Mediterranean Basin due to social and economic trends, and the ecological uncertainties in the consequent feedbacks between landscape-level patterns and processes of vegetation- and wildfire-dynamics.

The shortcomings of empirical modelling of these processes are highlighted, leading to the development of an integrated socio-ecological simulation model (SESM). A grid-based landscape fire succession model is integrated with an agent-based model of agricultural land-use decision-making. The agent-based component considers non-economic alongside economic influences on actors’ land-use decision-making. The explicit representation of human influence on wildfire frequency and ignition in the model is a novel approach and highlights biases in the areas of land-covers burned according to ignition cause. Model results suggest if agricultural change (i.e. abandonment) continues as it has recently, the risk of large wildfires will increase and greater total area will be burned.

The epistemological problems of representation encountered when attempting to simulate ‘open’, middle numbered systems – as is the case for many ‘real world’ geographical and ecological systems – are discussed. Consequently, and in light of recent calls for increased engagement between science and the public, a shift in emphasis is suggested for SESMs away from establishing the truth of a model’s structure via the mimetic accuracy of its results and toward ensuring trust in a model’s results via practical adequacy. A ‘stakeholder model evaluation’ exercise is undertaken to examine this contention and to evaluate, with the intent of improving, the SESM developed in this thesis. A narrative approach is then adopted to reflect on what has been learnt.

Posted on 2007-03-03

positive thought generator

I am less than two weeks away from submitting my PhD thesis. The BBC Radio 1 Positive Thought Generator has been helping me maintain my sanity over the last few weeks…

http://www.bbc.co.uk/slink/play/games/positive/positive_gen.swf
Click the button. It’s positively uplifting.

Posted on 2007-02-09

Landscape Simulation Modelling

This is my fifth contribution to JustScience week.

Over the last few days I’ve discussed some techniques and case studies of statistical models of landscape processes. Monday and Tuesday I looked at the power-law frequency-area characteristics of wildfire regimes in the US, Wednesday and Thursday I looked at regression modelling for predicting and explaining land use/cover change (LUCC). The main alternative to these empirical modelling methods is simulation modelling.

When a problem is not analytically tractable (i.e. equations cannot be written down to represent the processes) simulation models may be used to represent a system by making certain approximations and idealisations. When attempting to mimic a real world system (for example a forest ecosystem), simulation modelling has become the method of choice for many researchers. This may have become the case since simulation modelling can be used when data is sparse. Also, simulation modelling overcomes many of the problems associated with the large time and space scales involved in landscape studies. Frequently, study areas are so large (upwards of 10 square kilometres – see photo below of my PhD study area) that empirical experimentation in the field is virtually impossible because of logistical, political and financial constraints. Experimenting with simulation models allows experiments and scenarios to be run and tested that would not be possible in real environments and landscapes.

Spatially-explicit simulation models of LUCC have been used since the 1970s and have dramatically increased in use recently with the growth in computing power available. These advances mean that simulation modelling is now one of the most powerful tools for environmental scientists investigating the interaction(s) between the environment, ecosystems and human activity. A spatially explicit model is one in which the behaviour of a single model unit of spatial representation (often a pixel or grid cell) cannot be predicted without reference to its relative location in the landscape and to neighbouring units. Current spatially-explicit simulation modelling techniques allow the spatial and temporal examination of the interaction of numerous variables, sensitivity analyses of specific variables, and projection of multiple different potential future landscapes. In turn, this allows managers and researchers to evaluate proposed alternative monitoring and management schemes, identify key drivers of change, and potentially improve understanding of the interaction(s) between variables and processes (both spatially and temporally).
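
To make the definition of ‘spatially explicit’ concrete, here is a minimal sketch of a grid model in which a cell’s next state cannot be worked out from its own state alone. The colonisation rule, the threshold of three vegetated neighbours and the function names are assumptions made purely for illustration and are not taken from any published LUCC model.

```python
def neighbours(grid, row, col):
    """Return the states of the (up to eight) cells surrounding (row, col)."""
    n_rows, n_cols = len(grid), len(grid[0])
    states = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            r, c = row + d_row, col + d_col
            if (d_row, d_col) != (0, 0) and 0 <= r < n_rows and 0 <= c < n_cols:
                states.append(grid[r][c])
    return states


def step(grid, threshold=3):
    """One synchronous update of the whole grid.

    An empty cell (0) becomes vegetated (1) only if at least `threshold`
    of its neighbours are vegetated; vegetated cells stay vegetated.
    """
    new_grid = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == 0 and sum(neighbours(grid, r, c)) >= threshold:
                new_grid[r][c] = 1
    return new_grid


# Hypothetical landscape: 1 = vegetated, 0 = empty.
grid = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
next_grid = step(grid)
print(next_grid)  # [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
```

The centre cell and the bottom-right cell start in exactly the same state, yet only the centre cell changes, because only it has three vegetated neighbours. That dependence on location and neighbourhood is what the definition above is describing.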

Early spatially-explicit simulation models of LUCC typically considered only ecological factors. Because of the recognition that landscapes are the historical outcome of multiple complex interactions between social and natural processes, more recent spatially-explicit LUCC modelling exercises have begun to integrate both ecological and socio-economic processes to examine these interactions.

A prime example of a landscape simulation model is LANDIS. LANDIS is a spatially explicit model of forest landscape dynamics and processes, representing vegetation at the species-cohort level. The model requires life-history attributes for each vegetation species modelled (e.g. age of sexual maturity, shade tolerance and effective seed-dispersal distance), along with various other environmental data (e.g. climatic, topographical and lithological data) to classify ‘land types’ within the landscape. Previous uses of LANDIS have examined the interactions between vegetation dynamics and disturbance regimes, examined the effects of climate change on landscape disturbance regimes, and simulated the impacts of forest management practices such as timber harvesting.

Recently, LANDIS-II was released with a new website and a paper published in Ecological Modelling:

LANDIS-II advances forest landscape simulation modeling in many respects. Most significantly, LANDIS-II, 1) preserves the functionality of all previous LANDIS versions, 2) has flexible time steps for every process, 3) uses an advanced architecture that significantly increases collaborative potential, and 4) optionally allows for the incorporation of ecosystem processes and states (eg live biomass accumulation) at broad spatial scales.

During my PhD I’ve been developing a spatially-explicit, socio-ecological landscape simulation model. Taking a combined agent-based/cellular automata approach, it directly considers:

1. human land management decision-making in a low-intensity Mediterranean agricultural landscape [agent-based model]
2. landscape vegetation dynamics, including seed dispersal and disturbance (human or wildfire) [cellular automata model]
3. the interaction between 1 and 2
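
To give a flavour of what a combined agent-based/cellular automata approach means in practice, here is a toy sketch in Python. It is not the model from my PhD: the three land-cover states, the four-cell neighbourhood, the `tolerance` parameter and both rules are hypothetical and chosen only to show how a farmer's decision and a vegetation rule can feed back on one another through a shared grid.

```python
FARMED, SHRUB, FOREST = "farmed", "shrub", "forest"

def neighbours(grid, r, c):
    """Yield the states of the (up to four) cells adjacent to cell (r, c)."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
            yield grid[rr][cc]

def farmer_step(grid, farmers):
    """Agent rule (hypothetical): a farmer abandons a field once the number
    of neighbouring forest cells reaches that farmer's tolerance."""
    new = [row[:] for row in grid]
    for farmer in farmers:
        for r, c in farmer["fields"]:
            if grid[r][c] == FARMED:
                forest = sum(1 for s in neighbours(grid, r, c) if s == FOREST)
                if forest >= farmer["tolerance"]:
                    new[r][c] = SHRUB  # abandoned fields revert to shrub
    return new

def vegetation_step(grid):
    """CA rule (hypothetical): shrub becomes forest when a neighbouring
    forest cell can act as a seed source."""
    new = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, state in enumerate(row):
            if state == SHRUB and FOREST in neighbours(grid, r, c):
                new[r][c] = FOREST
    return new

def simulate(grid, farmers, steps):
    """Alternate human decisions and vegetation dynamics for a number of steps."""
    for _ in range(steps):
        grid = farmer_step(grid, farmers)
        grid = vegetation_step(grid)
    return grid

# A one-row landscape: forest edge, one shrub cell, three farmed fields.
landscape = [[FOREST, SHRUB, FARMED, FARMED, FARMED]]
farmers = [{"fields": [(0, 2), (0, 3), (0, 4)], "tolerance": 1}]

for step in range(5):
    print(step, simulate(landscape, farmers, step)[0])
```

Each cell's fate depends on its neighbours, which is what makes the sketch spatially explicit, and the farmer's decisions depend on the vegetation that those same decisions help to create.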

Read more about it here. I’m nearly finished now, so I’ll be posting results from the model in the near future. Finally, some other useful spatial simulation modelling links:

Wisconsin Ecosystem Lab – at the University of Wisconsin

Center for Systems Integration and Sustainability – at Michigan State University

Landscape Ecology and Modelling Laboratory – at Arizona State University

Great Basin Landscape Ecology Lab – at the University of Nevada, Reno

Baltimore Ecosystem Study – at the Institute of Ecosystem Studies

The Macaulay Institute – Scottish land research centre

Posted on 2007-02-08

Hierarchical Partitioning for Understanding LUCC

This post is my fourth contribution to JustScience week.

Multiple regression is an empirical, data-driven approach for modelling the response of a single (dependent) variable to a suite of predictor (independent) variables. Mac Nally (2002) suggests that multiple regression is generally used for two purposes by ecologists and biologists: 1) to assess the amount of variance exhibited by the dependent variable that can be attributed to each predictor variable, and 2) to find the ‘best’ predictive model (the model that explains most total variance). Yesterday I discussed the use of logistic regression (a form of multiple regression) models for predictive purposes in Land Use/Cover Change (LUCC) studies. Today I’ll present some work on an explanatory use of these methods.

Finding a multivariate model that uses the ‘best’ set of predictors does not imply that those predictors will remain the ‘best’ when used independently of one another. Multi-collinearity between predictor variables means that the use of the ‘best’ subset of variables (i.e. model) to infer causality between independent and dependent variables provides little valid ‘explanatory power’ (Mac Nally, 2002). The individual coefficients of a multiple regression model can only be interpreted for direct effects on the response variable when the other predictor variables are held constant (James & McCulloch, 1990). The use of a model to explain versus its use to predict must therefore be considered (Mac Nally, 2000).

Hierarchical partitioning (HP) is a statistical method that provides explanatory, rather than predictive, power. Across all possible candidate models, it calculates the contribution of each predictor to the total explained variance, both independently and in conjunction with the other predictors. The HP method was developed by Chevan and Sutherland (1991), and its use by ecologists and biologists in their multivariate analyses was first suggested by Mac Nally (1996). More recently, the method has been extended to help provide the ability to statistically choose which variables to retain once they have been ranked for their predictive use (Mac Nally, 2002). Details of how HP works can be found here.
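
For readers who prefer code to prose, the following is a minimal sketch of the partitioning idea in Python. It makes several assumptions that are mine rather than anything from the papers cited: it uses ordinary least squares and R² as the goodness-of-fit measure (other measures can be substituted), the function names are invented, and the three predictors and the response values are made-up numbers for illustration only. For each predictor, the independent contribution is the average gain in fit from adding that predictor to every subset of the other predictors, averaged first within each subset size and then across sizes; the joint contribution is whatever remains of that predictor's single-variable fit.

```python
from itertools import combinations

def solve(A, c):
    """Solve A x = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def r_squared(columns, y):
    """R² of a least-squares fit of y on the given columns plus an intercept."""
    if not columns:
        return 0.0  # the intercept-only model explains no variance
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    A = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    c = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(A, c)
    mean = sum(y) / len(y)
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    return 1.0 - ss_res / ss_tot

def hierarchical_partition(predictors, y):
    """Return each predictor's independent and joint contribution to R²."""
    names = list(predictors)
    cache = {}

    def fit(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = r_squared([predictors[n] for n in sorted(key)], y)
        return cache[key]

    result = {}
    for name in names:
        others = [n for n in names if n != name]
        level_means = []
        for size in range(len(names)):
            gains = [fit(set(s) | {name}) - fit(s)
                     for s in combinations(others, size)]
            level_means.append(sum(gains) / len(gains))
        independent = sum(level_means) / len(names)
        result[name] = {"independent": independent,
                        "joint": fit({name}) - independent}
    return result

# Made-up illustrative data: three predictors and one response.
predictors = {
    "slope": [1, 2, 3, 4, 5, 6, 7, 8],
    "distance": [2, 1, 4, 3, 6, 5, 8, 7],
    "soil": [1, 0, 0, 1, 1, 0, 1, 0],
}
change = [1.2, 1.9, 3.5, 3.9, 5.6, 5.8, 7.9, 7.4]

parts = hierarchical_partition(predictors, change)
full_r2 = r_squared(list(predictors.values()), change)
for name, p in parts.items():
    print(name, round(p["independent"], 3), round(p["joint"], 3))
```

A useful check on any implementation is that the independent contributions sum to the R² of the full model. A negative joint contribution indicates that a predictor's shared effects with the others suppress, rather than add to, the variance explained.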

With colleagues, I examined the use of hierarchical partitioning for understanding LUCC in my PhD study area, leading to a recent publication in Ecosystems. We examined the effect of using two different land-cover (LC) classifications for the same landscape, one classification with 10 LC classes, another with four. Using HP we found that coarser LC classifications (i.e. fewer LC classes) cause the joint effects of variables to suppress total variance explained in LUCC. That is, the combined effect of explanatory variables increases the total explained variance (in LUCC) in regression models using the 10-class LC classification, but reduces total explained variance in the dependent variable for four-class models.

We suggested that (in our case at least) this was because the aggregated nature of the four-class models means that broad observed changes (for example from agricultural land to forested land) mask specific changes within the classes (for example from pasture to pine forest or from arable land to oak forest). These specific transitions may have explanatory variables (causes) that oppose one another for the different specific transitions, decreasing the explanatory power of models that use both variables to explain a single broader shift. By considering more specific transitions, the utility of HP for elucidating important causal factors will increase.
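
The masking argument can be illustrated with a deliberately simplified Python sketch. It is not our analysis: it uses a single made-up driver (here called `elevation`) and invented transition rates rather than hierarchical partitioning of real data. The driver is given opposite effects on two specific transitions, so that it explains each one well but explains almost nothing once the two are lumped into a single broad transition.

```python
def r2(x, y):
    """Squared Pearson correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical driver and transition rates for six landscape units.
elevation = [1, 2, 3, 4, 5, 6]
pasture_to_pine = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]  # rises with the driver
arable_to_oak = [0.62, 0.48, 0.41, 0.29, 0.21, 0.09]    # falls with the driver

# Coarse classification: both transitions become 'agriculture to forest'.
agri_to_forest = [p + o for p, o in zip(pasture_to_pine, arable_to_oak)]

print("pasture to pine:", round(r2(elevation, pasture_to_pine), 2))
print("arable to oak:", round(r2(elevation, arable_to_oak), 2))
print("agriculture to forest:", round(r2(elevation, agri_to_forest), 2))
```

The driver matters a great deal for both specific transitions, yet the aggregated transition appears almost unrelated to it.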

We concluded that a systematic examination of specific LUCC transitions is important for elucidating drivers of change, and is one that has been under-used in the literature. Specifically, we suggested hierarchical partitioning should be useful for assessing the importance of causal mechanisms in LUCC studies in many regions around the world.


Direction not Destination

Blog by James Millington, PhD

Tag: MyPhD