Deriving guideline values using field-effects data

Guideline values for indicators in aquatic ecosystems should ideally be based on effects that can be observed in the relevant ecosystem.

When site-specific guideline values cannot be derived for toxicants and you need to apply default guideline values (DGVs) for freshwater or marine water, you might still be able to refine the DGV to account for local water quality conditions. For example, you can account for metal bioavailability under local water characteristics.

Site-specific guideline values that account for local conditions can be derived by using:

Guideline value derivations for ecosystem receptors and associated biological assessments are implicitly local — and ecosystem-specific — because they are based on the local reference data gathered for comparison purposes. (Refer to Assessing changes in aquatic ecosystem receptors for information on guideline value derivations for ecosystem receptors.)

Local effects-based guideline values are preferred over local referentially based ones because they:

  • use information on a stressor’s demonstrated effect on ecosystems
  • account for the local ecosystem’s capacity to assimilate some changes resulting from stressor additions.

Referential approaches are based on the assumption that the ecosystem has some capacity to assimilate a limited change to the stressor within naturally occurring ranges, without actually demonstrating that this is the case. (This may not be the case for ecosystems with naturally elevated stressor levels.)

Local effects-based guideline values may be derived from laboratory studies or field studies.

Here, we consider guideline value derivations for physical and chemical (PC) stressors and toxicant stressors based on a locally relevant level of response and change in an ecosystem receptor — or ‘effect’ measured in the field. Later, we outline the inherent advantages and limitations in laboratory and field data approaches.

Using a mix of field and laboratory data in a multiple lines-of-evidence approach will usually provide the best quality of inference in most circumstances. We strongly recommend a multiple lines-of-evidence approach to guideline value derivations in the Water Quality Guidelines.

Even if field-effects data are inappropriate for deriving guideline values, they are still a critical line of evidence in an overall water quality assessment, as we discuss in Weight of evidence.

Types of field-effects data

The basis of deriving guideline values using field-effects data is to establish, as far as possible, causal or strong inferential relationships between local biological and ecological response data and corresponding PC stressor or toxicant stressor concentrations where a gradient of stressor concentrations exists (e.g. USEPA 2016).

This approach is appropriate for stressors that are directly toxic to biota (e.g. toxicants, salinity, pH, dissolved oxygen, ammonia). It can be more useful for some naturally occurring stressors, such as suspended sediment or nutrients (e.g. nutrient addition studies) that may be difficult to study in a laboratory.

Sourcing field-effects data

You can obtain ecological-effects data through site or ecosystem-specific field monitoring studies and experiments. Locally relevant published scientific literature can be used where available.

The Water Quality Guidelines do not preclude any field-based source of biological or ecological data in contributing to guideline value derivations if there is full awareness and understanding of the limitations and potential artefacts introduced in such acquired data (discussed later).

Limitations aside, the key benefit of field-effects data — particularly when multiple lines of evidence are being considered in guideline value derivations — is greater environmental realism than might be obtained in effects information arising from traditional laboratory-based, single-species toxicity tests (Buchwalter et al. 2017).

Potential sources of data span a full spectrum of complexity and environmental realism (Table 1).

Table 1 Different types of field-effects data sources, ordered from less complex systems with some control of environmental variables through to more complex systems with greater environmental realism

Types of field-effects data, with the types of data sources listed under each:

Semi-field data
  • In situ toxicity testing, such as caged single species (e.g. a laboratory test protocol transferred to a field setting)
  • Microcosms (small experimental units with a small defined set of test species within which key processes are effectively mimicked)
  • Mesocosms (artificial systems containing complex and self-sustaining populations or communities set in natural environmental conditions)

Field data from natural ecosystems
  • Experimental manipulations in whole ecosystems (e.g. Faith et al. 1991)
  • Observations from ecosystems across an existing gradient of PC stressor or toxicant stressor concentrations

Trade-offs exist in acquiring data with greater environmental realism, including costs and confounding by variables other than the contaminant of interest. Refer to Perceval et al. (2009), Norton et al. (2014) and Buchwalter et al. (2017) for useful reviews on acquiring field-effects data.

In practice, field observations from natural ecosystems and mesocosm data are the most commonly acquired and accepted data sources for field-effects guideline value derivations.

The use of mesocosm data for guideline value derivations:

  • was accepted in the ANZECC & ARMCANZ (2000) guidelines
  • is an accepted practice for developing environmental quality standards under the EU Water Framework Directive (e.g. Lepper 2005, van Vlaardingen & Verbruggen 2007)
  • was advocated for this purpose in the United States and more broadly (Buchwalter et al. 2017).

Where data are scant, they can be augmented by relevant published literature.

Modelling is often used to predict site-specific toxicity from field data, as a support for empirical sediment quality guideline values (Batley et al. 2005, Field & Norton 2014).

Advantages of using field-effects data

DGVs for toxicants are typically derived from laboratory toxicity data. Because such DGVs are derived independently of the complexities of ecosystem interactions and other considerations unique to the field context, we acknowledge that they may be under- or overprotective.

Field-effects data present some major advantages (Cormier et al. 2008, Perceval et al. 2009, Norton et al. 2014, Buchwalter et al. 2017) over the traditional laboratory toxicity testing approach to deriving DGVs.

Field-effects data account for complex local interactions amongst the biota and key ecological processes, the stressors and the local environment. Such data can provide the most direct assessment of the ecosystem response to the stressor of interest. In contrast, laboratory approaches require some level of extrapolation from other situations, testing usually a limited subset of the local biodiversity or even surrogate taxa for the local species.

Poorly studied ecosystems or specialist life history requirements of some sensitive taxa may preclude husbandry and culturing for laboratory testing. To this end, field-effects studies may offer a more efficient and viable guideline value derivation approach:

  • Some ecosystem types have no relevant laboratory toxicity testing data. For example, Australia has many naturally saline inland landscapes, so guideline values for salinity need to account for the natural salinity to which these ecosystems have adapted. Laboratory toxicity data are largely restricted to freshwater or marine taxa and include few salt lake taxa.
  • Field studies of freshwater in Australia and New Zealand have typically focused on aquatic insects because of their inherent sensitivities and the long-established, standardised methodologies available (refer to Protocols for Biological Assessment). However, these organisms, apart from chironomids (non-biting midges), are difficult to rear in the laboratory. Because of this, toxicity testing using insects may be limited to acute studies based on field-collected specimens (Kefford 2013).

Some of the deficiencies of single-species, single-toxicant laboratory toxicity tests, which research is trying to address, include:

  • toxicant effects that occur via pathways other than direct waterborne exposure (e.g. dietary uptake)
  • interactions amongst multiple PC stressors or toxicant stressors
  • stressors with indirect effects (e.g. nutrients) that
    • are without a toxic chemical mode of action
    • have effects over longer periods of time or within specific habitats that cannot be adequately simulated in the laboratory (e.g. suspended and deposited sediment), or
    • bioaccumulate over extended exposure periods (e.g. Dunlop et al. 2008).

Limitations of using field-effects data

Deriving field effects-based guideline values from natural ecosystems and mesocosm data is often expensive and resource intensive.

Dedicated and well-established large-scale mesocosm facilities are available in Europe and North America, but not in Australia and New Zealand.

Stressor gradients may not always be available in natural ecosystems. Most often, more than one pressure or stressor will be influencing the status of the local ecosystem. This makes it difficult to determine which pressure or stressor was the primary cause of the observed ecosystem responses, especially when pressures or stressors are expected to interact or produce similar responses.

Confounding influences in natural ecosystems can readily result in inaccurate determination of the influence of each stressor or pressure, which means that a gradient of stressor concentrations or a range of concentrations is necessary for assessment.

You can use both laboratory and mesocosm data to overcome lack of stressor gradients or confounding commonly encountered in natural ecosystems by isolating the stressor variable of interest through appropriate experimental design.

Addressing field-effects confounding

Derivation of sediment quality guideline values clearly demonstrates the implications of field-effects confounding.

The most common approach adopted for sediment quality guideline value derivations internationally, including the DGVs in the Water Quality Guidelines, has been the use of an effects database for contaminated and uncontaminated sites, based on field data, laboratory toxicity testing and sediment–pore water equilibrium partitioning predictions of contaminants.

Most of the effects used to derive the sediment quality DGVs suffer from co-occurrence of contaminants, where observed biological effects were ascribed equally to each toxicant. This prevents the observed effects being confidently assigned to any one contaminant and is the source of significant uncertainty in sediment quality DGVs.

Such limitations mean the sediment quality DGVs are only useful as a screening tool because they lack the same quality status as equivalent water quality DGVs (Simpson et al. 2013, Section 3 Part 2). In the future, we intend to provide more guidance on how to deal with multiple stressors in guideline value derivations for both water and sediment quality.

Deciding if confounding is an issue

Field-effects confounding by a number of chemicals present in mixtures is not always regarded as a constraint for setting water quality objectives, particularly where attention to one of the chemical constituents, or a surrogate measure of the mixture, simultaneously addresses water quality management generally.

For example, elevated concentrations of a number of major ions present as mixtures are responsible for loss of stream macroinvertebrates in coal mining effluents of southern Appalachian streams in the United States. In this case, electrical conductivity summates the stressor–receptor response relationship adequately for water quality objective setting (e.g. Cormier & Suter 2013, USEPA 2016).

Similarly, site-specific remediation targets for each of the constituent metals present in mine effluents do not constrain associated water quality management (e.g. Hydrobiology 2016).

Designing studies for mesocosm data

The limitations of mesocosms are widely acknowledged and include problems of ecological scaling (realism) and reproducibility, enclosure biases and size, and effects of isolation (and thus lack of ongoing recruitment) resulting in limited study durations (Perceval et al. 2009). Depending on the purpose of your study, consider these factors and select a design to minimise potential loss of information.

Broad approaches for deriving guideline values from field data and addressing confounding

Where suitable contaminant gradients are available, a few broad approaches have been taken to derive guideline values for water quality from field-effects data:

  1. Abundances or relative proportions of known water quality-sensitive taxa (e.g. Ephemeroptera, Leptophlebiidae) plotted against corresponding contaminant concentrations. An abrupt step-change can indicate a threshold concentration representing adverse effect (e.g. van Dam et al. 2014).
  2. Similar to Approach 1 but contaminant sensitivity ‘profiles’ of all species or taxa in an assemblage are examined, typically collating large datasets from regional monitoring programs where the subsequent approach may be to
    a. determine taxonomic extirpation across stressor ranges or gradients (e.g. Cormier et al. 2013, van Dam et al. 2014, USEPA 2016). Field species (or taxon) sensitivity distributions (SSDs) are then derived, based on proportion of taxa affected versus contaminant concentration.
    b. use the occurrence of each taxon across a contaminant gradient to assign each taxon a sensitivity-tolerance score, which can reveal contaminant threshold points of change in assemblage response when all taxa scores are plotted according to contaminant bins (e.g. Horrigan et al. 2005).
  3. Summary biological responses for ‘sites’ of grading contamination are compared to, or expressed as a proportion of, the same response measured at uncontaminated reference sites, or amongst different classes of site contamination. Contaminant-response curves are derived. Responses may include taxa number within a taxonomic group (e.g. Humphrey & Chandler 2017) or from multiple taxonomic groups (e.g. Hydrobiology 2016), or measures of multivariate similarity (e.g. Kefford et al. 2010, Humphrey & Chandler 2017).

For Approach 2(a) and Approach 3, there is typically an interpolation in the derived response–concentration relationships of a conservative ‘effect’ concentration (e.g. 95% protection) that defines the guideline value.
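The interpolation step can be sketched in code. The following Python is a minimal illustration under stated assumptions, not the procedure prescribed by the Water Quality Guidelines or by the cited studies. It assumes that a taxon's extirpation concentration can be approximated by the 95th percentile of the concentrations at which the taxon was observed, and that the Approach 3 'effect' is a 5% reduction in the response relative to the reference value. The function names and the example data are hypothetical.

```python
def percentile(values, p):
    """Linearly interpolated percentile (p in 0-100) of a non-empty list."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def field_ssd_guideline(occurrences, protection=95.0):
    """Approach 2(a) sketch: guideline value from a field SSD.

    occurrences maps each taxon to the stressor concentrations at the
    sites where it was found. ASSUMPTION: the extirpation concentration
    of a taxon is the 95th percentile of those concentrations.
    """
    extirpation = {taxon: percentile(concs, 95.0)
                   for taxon, concs in occurrences.items()}
    # Protecting 95% of taxa means taking the 5th percentile of the
    # extirpation concentrations.
    guideline = percentile(list(extirpation.values()), 100.0 - protection)
    return guideline, extirpation


def effect_concentration(concs, responses, reference_response, effect=0.05):
    """Approach 3 sketch: interpolate the concentration at which the
    response drops to (1 - effect) of the reference response.

    Returns None if the response never crosses that level.
    """
    target = (1.0 - effect) * reference_response
    pairs = sorted(zip(concs, responses))
    for (c0, r0), (c1, r1) in zip(pairs, pairs[1:]):
        if r0 >= target >= r1 and r0 != r1:
            # Linear interpolation between the two bracketing sites.
            return c0 + (c1 - c0) * (r0 - target) / (r0 - r1)
    return None


# Hypothetical data, for illustration only.
example_occurrences = {
    "taxon A": [0, 100],
    "taxon B": [0, 200],
    "taxon C": [0, 300],
}
example_guideline, example_xc = field_ssd_guideline(example_occurrences)
example_ec = effect_concentration([0, 10, 20], [20, 18, 10], 20.0)
```

Linear interpolation is used here only because it is easy to inspect. A real derivation would normally fit a statistical model to the response–concentration data and report the uncertainty around the interpolated value.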

Drawing inference from collated lines of evidence

A field-effects study deriving PC stressor or toxicant stressor guideline values would typically collate necessary lines of evidence to draw inference about contaminant effects on the biological response of interest (Humphrey & Chandler 2017), including steps to:

  1. establish and describe the significance and nature of relationships between the contaminant(s) of interest and the biological responses measured
  2. compile and weigh the strength of evidence discounting other potential explanations (confounders) for the responses observed
  3. assess consistency in observed response to those demonstrated in relevant studies elsewhere, including corresponding laboratory-effects data.
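As a rough illustration of steps 1 and 2, the sketch below computes a rank (Spearman) correlation between a contaminant and a biological response, then a partial rank correlation that discounts one candidate confounder. This is a simple screening calculation chosen for illustration, not the method used by Humphrey & Chandler (2017). The function names and data are hypothetical, and the code assumes that none of the variables is constant across sites.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Step 1: strength and direction of a monotonic relationship."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Step 2: rank correlation of x and y after discounting z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5


# Hypothetical site data, for illustration only.
contaminant = [1, 2, 3, 4, 5]
taxa_number = [10, 8, 6, 4, 2]
confounder = [2, 1, 4, 3, 5]

rho = spearman(contaminant, taxa_number)
rho_partial = partial_spearman(contaminant, taxa_number, confounder)
```

If the partial correlation stays close to the simple correlation, the candidate confounder explains little of the observed relationship. This is one line of evidence only and does not by itself establish causality.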

Developing new approaches to use field data

Internationally, some traditional or improved ways to isolate the effects of stressors, and thereby use field data for deriving water quality guideline values, are available or are being developed. Improvements are focusing on methods to:

  • disentangle effects of confounding stressors (Downes 2010; Suter II & Cormier 2013)
  • use more sensitive and diagnostic biological indicators (e.g. genomic information, stressor-specific biomarkers, toxicodynamic–toxicokinetic models) (Chariton et al. 2016)
  • create new data analysis tools that more deeply interrogate complex field data to determine
    • which stressor was most likely to have caused a measured effect (causality), or
    • a threshold concentration above which an effect is expected (van Dam et al. 2014, Chariton et al. 2016).
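One simple way to look for a threshold concentration is a least-squares change-point search, sketched below. It splits sites ordered by concentration into two groups and picks the split that minimises the within-group sum of squares of the response. This is an illustrative technique only, not the analysis used by van Dam et al. (2014) or Chariton et al. (2016), and it includes no test of whether the step is statistically significant. The function name, the min_group parameter and the data are hypothetical.

```python
def step_change_threshold(concs, responses, min_group=2):
    """Return the concentration midway between the two sites that
    bracket the best two-group split, or None if no split is possible."""
    pairs = sorted(zip(concs, responses))
    n = len(pairs)
    if n < 2 * min_group:
        raise ValueError("not enough sites to form two groups")
    ys = [r for _, r in pairs]

    def sse(vals):
        # Sum of squared deviations from the group mean.
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    best = None
    for k in range(min_group, n - min_group + 1):
        # A threshold cannot fall between sites with equal concentration.
        if pairs[k - 1][0] == pairs[k][0]:
            continue
        cost = sse(ys[:k]) + sse(ys[k:])
        if best is None or cost < best[0]:
            best = (cost, k)
    if best is None:
        return None
    k = best[1]
    return (pairs[k - 1][0] + pairs[k][0]) / 2.0


# Hypothetical abundances of a sensitive taxon along a gradient.
example_threshold = step_change_threshold(
    [1, 2, 3, 4, 5, 6], [10, 11, 10, 2, 1, 2]
)
```

Because the search always returns the best available split, the result should be checked against a plot of the data and against other lines of evidence before it is treated as a threshold.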

Better experimental and analytical approaches that have been used to derive guideline values from field-effects data commonly take statistically robust approaches to fit measured ecosystem responses across a range of stressor levels. Because many of these approaches have not yet been accepted as standard methods for deriving guideline values, it is essential that their application and associated documentation is as rigorous and transparent as possible. Accordingly, independent review of such guideline value derivations is considered essential.


References

ANZECC & ARMCANZ 2000, Australian and New Zealand Guidelines for Fresh and Marine Water Quality, Australian and New Zealand Environment and Conservation Council and Agriculture and Resource Management Council of Australia and New Zealand, Canberra.

Batley GE, Stahl RG, Babut MP, Bott TL, Clark JR, Field LJ, Ho KT, Mount DR, Swartz RC & Tessier A 2005, Scientific underpinnings of sediment quality guidelines, in: Wenning R, Batley G, Ingersoll C & Moore D (eds), Use of Sediment Quality Guidelines and Related Tools for the Assessment of Contaminated Sediments, SETAC Press, Pensacola, pp. 39–119.

Buchwalter DB, Clements WH & Luoma SN 2017, Modernizing water quality criteria in the United States: a need to expand the definition of acceptable data, Environmental Toxicology and Chemistry 36(2): 285–291.

Chariton AA, Sun M, Gibson J, Webb JA, Leung KMY, Hickey CW & Hose GC 2016, Emergent technologies and analytical approaches for understanding the effects of multiple stressors in aquatic environments, Marine and Freshwater Research 67(4): 414–428.

Cormier SM, Paul JF, Spehar RL, Shaw-Allen P, Berry WJ & Suter II GW 2008, Using field data and weight of evidence to develop water quality criteria, Integrated Environmental Assessment and Management 4(4): 490–504.

Cormier SM & Suter II GW 2013, A method for deriving water-quality benchmarks using field data, Environmental Toxicology and Chemistry 32: 255–262.

Cormier SM, Suter II GW, Zheng L & Pond GJ 2013, Assessing causation of the extirpation of stream macroinvertebrates by a mixture of ions, Environmental Toxicology and Chemistry 32(2): 277–287.

Downes BJ 2010, Back to the future: little-used tools and principles of scientific inference can help disentangle effects of multiple stressors on freshwater ecosystems, Freshwater Biology 55(Suppl. 1): 60–79.

Dunlop JE, Kefford BJ, McNeil VH, McGregor GB, Choy S & Nugegoda D 2008, A Review of Guideline Development for Suspended Solids and Salinity in Tropical Rivers of Queensland, Australia, Australasian Journal of Ecotoxicology 14(2&3): 129–142.

Faith DP, Humphrey CL & Dostine PL 1991, Statistical power and BACI designs in biological monitoring: comparative evaluation of measures of community dissimilarity based on benthic macroinvertebrate communities in Rockhole Mine Creek, Northern Territory, Australia, Marine and Freshwater Research 42(5): 589–602.

Field LJ & Norton SB 2014, Regional models for sediment toxicity assessment, Environmental Toxicology and Chemistry 33: 708–717.

Horrigan N, Choy S, Marshall J & Recknagel F 2005, Response of stream macroinvertebrates to changes in salinity and the development of a salinity index, Marine and Freshwater Research 56(6): 825–833.

Humphrey C & Chandler L 2017, Use of Field-effects Information to Inform Surface Water Guideline Values for Magnesium Sulfate in Magela Creek, Supervising Scientist Report (in press), Darwin.

Hydrobiology 2016, Rum Jungle Impact Assessment, Hydrobiology Pty Ltd, Brisbane.

Kefford BJ 2013, Rapid Tests for Community-Level Risk Assessments in Ecotoxicology, in: Férard JF & Blaise C (eds), Encyclopedia of Aquatic Ecotoxicology, Springer Netherlands, pp. 957–966.

Kefford BJ, Schäfer RB, Liess M, Goonan P, Metzeling L & Nugegoda D 2010, A similarity-index based method to estimate chemical concentration limits protective for ecological communities, Environmental Toxicology and Chemistry 29(9): 2123–2131.

Lepper P 2005, Manual on the Methodological Framework to Derive Environmental Quality Standards for Priority Substances in accordance with Article 16 of the Water Framework Directive (2000/60/EC), Fraunhofer-Institute Molecular Biology and Applied Ecology IME, Schmallenberg.

Norton SB, Cormier SM & Suter II GW (eds) 2014, Ecological Causal Assessment, CRC Press.

Perceval O, Caquet T, Lagadic L, Bassères A, Azam D, Lacroix G & Poulsen V 2009, Mesocosms: Their value as tools for managing the quality of aquatic environments, in: Onema meetings Recap, prepared from Ecotoxicology symposium, 14 to 16 October 2009, Le Croisic.

Simpson SL, Batley GE & Chariton AA 2013, Revision of the ANZECC/ARMCANZ Sediment Quality Guidelines, CSIRO Land and Water Science Report 08/07, CSIRO, Canberra.

Suter II GW & Cormier SM 2013, A method for assessing the potential for confounding applied to ionic strength in central Appalachian streams, Environmental Toxicology and Chemistry 32(2): 288–295.

USEPA 2016, Draft Field-Based Methods for Developing Aquatic Life Criteria for Specific Conductivity, United States Environmental Protection Agency, Washington DC.

van Vlaardingen PLA & Verbruggen EMJ 2007, Guidance for the Derivation of Environmental Risk Limits within the Framework of ‘International and national environmental quality standards for substances in the Netherlands’ (INS). Revision 2007, RIVM Report 601782001, Dutch National Institute for Public Health and the Environment, Bilthoven.

van Dam RA, Humphrey CL, Harford AJ, Sinclair A, Jones DR, Davies S & Storey AW 2014, Site-specific water quality guidelines: 1. Derivation approaches based on physico-chemical, ecotoxicological and ecological data, Environmental Science and Pollution Research 21(1): 118–130.