Study design preparation and planning

When preparing and planning a study design you should consider:

extent of spatial boundaries
appropriate scales for collecting and reporting data
appropriate study length
whether or not to conduct a pilot study.

A good understanding of variation is also important. Knowledge of between-site and within-site variation, and short and long-term temporal variation, will inform decisions about sample numbers, precision and spatial and temporal sampling intensity.

Use our guidance to plan the specifics of a statistical design that will meet monitoring program objectives. You must have explicitly and clearly written objectives to do this. Program objectives hold the key to informing the design planning. If they are too vague, then some considerations will be open to subjectivity, which may render the resulting design ill-equipped to generate the necessary data.

Spatial boundaries

Defining spatial boundaries is important. Inappropriate boundaries might divert the study away from important driving or consequential factors. This decision should be justified by the monitoring program objectives, rather than being controlled by the desire for convenience or budget constraints.

For example, in an investigation of effects of catchment activities on rivers, lakes and estuaries, the spatial boundaries would normally be those of the catchment.

For a study of the water quality in a river and receiving coastal environment, the monitoring objectives should provide sufficient detail to determine if the study should look only at these water bodies (e.g. descriptive or assessing change studies) or if it should include tributary creeks or the broader catchment (e.g. gaining system understanding).

Scale

Scale refers to the spatial and temporal ranges over which a system is observed; that is, the level of resolution appropriate to answering the questions of concern. Different processes operate at different scales:

movement of sediment in a river system may take decades at the catchment scale
toxicant effects may occur over days and may be localised
nutrient enrichment may occur over many kilometres and the response may take weeks.

The scale of the study should be chosen in relation to the study’s objectives after considering the measurement opportunities at the various possible scales and the likelihood of collecting reliable and valid measurements. The cost of data collection at the various scales should be evaluated; it may constrain the optimal study design.

The distribution of measurement variables can become more heterogeneous and patchy as the spatial extent of data collection gets larger. You may need more replicate samples to achieve the same confidence in the results. Stratification of the extensive system may help to reduce large-scale variability and improve design efficiency.

Some objectives may demand monitoring over a variety of scales (in both space and time). For example, reporting may be required at multiple scales, from local to regional to statewide. The design should aim to accommodate these requirements. Most likely, this means that your sampling will need to be more complex than a standard design to generate sufficient relevant data.

Study duration

From the monitoring objectives it is not always clear how long a study should be conducted to address the issue of concern.

Sometimes it is not possible to determine the length of a study without first collecting and examining data. For example, given the variability of natural rainfall and hence streamflow, what length of time might be required to achieve an appropriate understanding of the system?

The appropriate length of the study is an important decision. Few hydrologists would make definitive statements on the quantity of water resources with data from only 2 or 3 years, yet frequently conclusions from water quality studies are expected from less than 3 years of data. What is a reasonable duration for such a study? How long will it take for a sufficient variety of rainfall events (from droughts to floods) to be experienced to allow a study of the system under extremes?

Pilot studies

Sampling programs can be costly. Try to optimise the sampling program so as to address the hypotheses posed by the program.

One way to optimise a study is to ensure that the behaviour of key variables is well understood. This includes short and long-range spatial variation, short and long-term temporal variation and measurement error.

As discussed in Dobbie & Negus (2013), the measurement error sometimes represents the pooled sources of variation that were neither quantified nor known from other sources. The number of samples should be as large as feasible to provide accurate estimates of variation. Small sample numbers may yield unreliable information, potentially leading to poor decisions in optimising the sampling program.

Power analyses can help to identify the number of samples or replication required to detect meaningful changes in water quality. Harcum & Dressing (2015) considered the minimum detectable change as the smallest amount of change in a pollutant concentration or load during a specific time period that is required for the change to be considered statistically significant. A power analysis is used to identify the sampling requirements to assess minimum detectable change with an agreed probability.
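As a rough illustration of the idea, the sketch below uses the normal approximation for comparing the means of two periods (for example, before and after) with equal sample numbers and a common standard deviation. The function names and all input values are hypothetical, and the calculation is a textbook simplification rather than the procedure of Harcum & Dressing (2015). It assumes independent, approximately normal data.

```python
from math import ceil, sqrt
from statistics import NormalDist


def minimum_detectable_change(sd, n_per_period, alpha=0.05, power=0.8):
    """Smallest difference in means detectable with the stated power.

    Assumes a two-sided test, independent samples and a common
    standard deviation `sd` in both periods (normal approximation).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_period)


def samples_per_period(sd, change, alpha=0.05, power=0.8):
    """Samples needed in each period to detect `change` (same assumptions)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / change) ** 2)


# Hypothetical inputs: standard deviation 0.4 mg/L, 12 samples per period
mdc = minimum_detectable_change(sd=0.4, n_per_period=12)
n_needed = samples_per_period(sd=0.4, change=0.3)
```

With these assumed inputs, the minimum detectable change is about 0.46 mg/L, and detecting a change of 0.3 mg/L would need 28 samples in each period. Serial correlation or skewed data would increase these requirements.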

Bayesian approaches to power and sample-size calculations often provide a natural framework to use historical data to estimate sample sizes, and to update those estimates through the posterior distribution as new data is collected (Sahu & Smith 2006). As evidence accumulates, it may be possible to cease the data collection sooner.
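To show the updating step in its simplest form, the sketch below applies the standard conjugate update for a normal mean with a known observation standard deviation. It is not the method of Sahu & Smith (2006). The function name, the prior (standing in for historical data) and the new observations are all hypothetical.

```python
from math import sqrt


def update_normal_mean(prior_mean, prior_sd, data, obs_sd):
    """Posterior mean and sd for a normal mean with known observation sd."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = len(data) / obs_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + sum(data) / obs_sd ** 2) / post_precision
    return post_mean, sqrt(1 / post_precision)


# Hypothetical prior from historical data: mean 10, sd 2 (same units as data)
post_mean, post_sd = update_normal_mean(10.0, 2.0, [12.0, 12.0, 12.0, 12.0], 2.0)
```

In this assumed case, 4 new observations move the estimate from 10 to 11.6 and reduce its standard deviation from 2 to about 0.89. Data collection could stop once the posterior standard deviation falls below a precision target agreed in advance.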

Pilot studies are usually conducted to:

establish the nature of the prevailing system
trial unknown design attributes
inform design parameter decisions before investing more heavily in the ultimate study.

Alternatively or additionally, existing historical data that is spatially and temporally relevant may help inform design parameters, or published studies might provide sufficient detail about variation to help with study design.

For example, oxygen concentrations are known to vary diurnally and to differ between the epilimnion (upper layer) and hypolimnion (lower layer) in stratified lakes. Phosphorus, bound to sediment, is known to be transported during rainfall events implying spatial variation is partially temporally controlled.

The intellectual challenge is to design a sampling approach that minimises errors. The errors in accurately representing a water body or population by a sample, and a sample by a subsample, can far exceed errors in analysis (Gy 1986). These and other statistical sampling issues are reviewed by Helsel & Hirsch (1992).

Other benefits of pilot studies are that they give field staff site-specific training and allow anticipation of potential hazards and logistical problems. Most practitioners recommend that a significant fraction of total project resources should be dedicated to a pilot study; Keith (1991) recommended 10 to 15%.

Spatial design considerations

It is important that the site selection mechanism (spatial design) provides appropriate and adequate spatial information to address the monitoring program objectives and account for between-site variation.

Ecological impacts

When ecological impacts are being assessed, sites will normally be located in the vicinity of the likely disturbance to generate data on the ecological response and causes of the impacts (identified by the underlying conceptual model).

If identifying spatial trends is the objective, then ensuring sufficient sample sites are located at varying spatial resolutions will increase confidence in the findings.

For example, place more sites closer to a point-source disturbance where there may be higher variability in the response and to delineate where the changes may be occurring, then place fewer sites further away from that point source where lower spatial variation is expected and the focus is more on confirming that there are no potential impacts.

If condition or status assessments are the main objective, then spreading the sites out over the population domain will help give rise to good spatial coverage and ensure a ‘representative’ sample is obtained for assessing condition. Relevant reference sites (typically unimpacted or under little impact) should be selected as part of the sample to determine relative condition more generally.

When control or reference sites are included in the design, take care to ensure that they are closely matched with the spatial domain being assessed.

Sometimes information on covariates collected at all the sites can be used to adjust the values of measurement parameters for inherent differences between the sites. The assumptions of the statistical analyses of such data need to be met (refer to Data analysis).

For example, in studies of metals in sediments, sediments from the reference site should have grain size and organic content similar to those from the test sites.

Further details about the reference condition assessment technique are discussed in Hughes (1995), Bailey et al. (2004) and Dobbie & Negus (2013).

Seasonal and local variation

When selecting appropriate sampling locations, you should consider the possibility of seasonal variations, and local variations in other parameters to be measured (e.g. sources of contaminants), by referring not only to the pilot study but also to historical data.

You could source hard data, or records of activities in the catchment, aerial photographs, plans and maps of land use, and oral or other records of the sites and catchments under investigation.

After reviewing this information, you may find, for instance, that the water quality should be monitored not only in major surface waters and groundwaters that might receive inputs of substances from diffuse sources but also in small creeks hydrologically connected with those waters.

Artefacts

It might be important that sites are selected to minimise any artefacts from human interventions that are not part of the monitoring program.

For example, flow may be modified around jetties or bridges, and that may affect some benthic measurement parameters, resulting in spurious data if the effect of the jetty or bridge is not the focus of the monitoring program.

Weirs and similar structures in rivers often alter both the flow and the physical and chemical conditions. Sampling sites need to be located far enough upstream or downstream of such structures if the water quality of the free-flowing water is the major focus of the monitoring program.

Distance and time between samples

If survey sites are located too close together, or samples are collected at too close a time interval, autocorrelation or serial correlation between or within sites can invalidate the assumptions of independence made in some classical statistical designs (refer to Sample survey design).

What constitutes too close (spatially or temporally) depends on the nature of the measurement parameter and, if applicable, the dispersion of the contaminant. The nature of the parameter can be explored through a pilot study if insufficient relevant historical data are available.

You should consider whether or not to select alternative sites, or if sufficient data can be collected to implement designs that can model these spatial patterns properly.

Networks of sites

Some water quality monitoring programs (usually based on chemical and physical measurement parameters) rely on networks of sites. High costs are associated with monitoring sites that are spread out and difficult or time-consuming to reach. Monitoring programs should be optimised with regard to networks and sampling.

A number of spatially optimal water sampling plans exist, and their merits have been reviewed (Dixon & Chiswell 1996). Dobbie et al. (2008) provided a comprehensive review of approaches to spatial design to enable informed decisions about developing practical and optimal spatial designs for future monitoring of stream networks.

Pragmatic considerations for site selection

If the monitoring program requires sampling under difficult or potentially dangerous conditions, then ensure safe work practices are applied under relevant legislation and codes. If the work cannot be carried out safely, then it may be necessary to revisit the objectives, as well as the design. Not being able to sample under particular conditions may lead to systematic biases. For instance, omitting wet-season sampling could overlook the period of highest contaminant risk in some circumstances. Most companies and government agencies have their own Work Health and Safety procedures in place, which may impact on the sample design, as well as its implementation.
Sites need to be accurately identified and pinpointed so that they can be sampled repeatedly. Location information through devices with global positioning system (GPS) is readily available and easy to use.
Groundwater quality monitoring programs often require a carefully staged approach given the potential temporal lags between a disturbance or event and observing chemical signals in groundwater monitoring data. A staged approach needs to account for the local geology and hydrogeology, the vulnerability of the aquifers to contaminants and the land use, and any expectation around the pattern of changes in water quality that may follow a disturbance or event.

Considerations for within-site variation

Spatial variation may exist within a site (a specific location indicated by GPS coordinates or a more general place, such as a particular lake or part of an estuary).

Within-site variation must be quantified in the monitoring program, otherwise the estimates of the chosen measurement parameter may be imprecise or even inaccurate.

In thermally stratified waters, for example, the depth of sampling is important because the concentrations of many measurement parameters (e.g. hydrogen ions, dissolved oxygen, nitrate, hydrogen sulfide, plankton) can vary greatly between the top and bottom layers.

In rivers, samples taken from the edge rather than from mid-stream are likely to contain quite different amounts of suspended material and therefore different amounts of various compounds bound to the particulate matter.

In benthic sampling for biological parameters (e.g. invertebrates, algae) or for sediments, the habitats or sediment types may vary at a site. In formal terms, these different habitats or water types within a site are called strata.

Options for dealing with strata

Stratification in the measurement parameter will affect the data obtained. You have 3 options for dealing with such strata.

Restrict the scope of the inference to a particular stratum

This may require refinement of the monitoring objectives. For example, if sandy sediments dominate the substrate at all the study sites, then it may be sensible to confine sampling to sandy substrates. The stakeholders must be made aware that the inferences drawn are applicable only to ‘sandy substrates within the sites’ and cannot be generalised to strata that were not sampled within the sites.

Divide the sampling effort among the strata

Your goal will be to estimate the value of the measurement parameter for each site as a whole, rather than for a stratum within the site. Stratified random sampling (SRS) is an example of this procedure that is fully explained in basic texts (e.g. Cochran 1977, Elliott 1977). The number of sample units allocated to each stratum can be determined by the relative sizes (e.g. area or volume) of each stratum, or by the within-stratum variation of the measurement parameters.
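The two allocation rules mentioned above can be sketched as follows. Proportional allocation uses stratum size only. The second function uses Neyman allocation, a standard textbook rule that weights each stratum by size multiplied by within-stratum standard deviation; it is offered here as one way to combine the two criteria, not as a prescribed method. The stratum names, areas and standard deviations are hypothetical.

```python
def proportional_allocation(total_samples, sizes):
    """Allocate samples in proportion to stratum size (area or volume)."""
    total_size = sum(sizes.values())
    return {name: total_samples * size / total_size
            for name, size in sizes.items()}


def neyman_allocation(total_samples, sizes, sds):
    """Allocate samples in proportion to stratum size times stratum sd."""
    weights = {name: sizes[name] * sds[name] for name in sizes}
    total_weight = sum(weights.values())
    return {name: total_samples * w / total_weight
            for name, w in weights.items()}


# Hypothetical site: percentage area and pilot-study sd for each habitat
areas = {"sand": 60, "mud": 30, "gravel": 10}
pilot_sds = {"sand": 1.0, "mud": 3.0, "gravel": 2.0}

by_area = proportional_allocation(20, areas)
by_area_and_sd = neyman_allocation(20, areas, pilot_sds)
```

The results are fractional and need rounding, with at least 2 units per stratum if within-stratum variance is to be estimated. In this assumed example, the variable mud stratum receives the most samples under Neyman allocation (about 10.6 of 20) even though sand covers the larger area.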

Make separate estimates for each stratum

If this is consistent with the study objectives, you may want to identify the nutrients in each stratum. For example, at each site in a reservoir, separate nutrient samples collected from the epilimnion and the hypolimnion (two strata) are kept separate throughout the analyses.

Plan for within-site variation

When measurement parameters are being sampled in the water column, it is sometimes assumed that the water is well mixed and that a mid-water or mid-stream sample will be sufficiently representative. This may not be the case. Even in fast-flowing mountain streams, water can be observed flowing upstream in eddies.

In larger rivers, tributary water may not mix fully with the mainstream for many hundreds of metres or even kilometres.

In estuarine waters, salinity may be significantly stratified, and all water bodies can have gradients of redox potential and temperature.

Even if the monitoring goal is to measure the average concentration of a chemical in the water at a site, the sampling process must be planned so that within-site variation is included in the estimate.

The same situation applies to the monitoring of aquifers, where groundwater quality is almost always stratified vertically, and where there can be significant lateral variation in quality (e.g. in areas with multiple point sources or variable diffuse sources of contamination).

Natural spatial variability is potentially much greater in groundwater, where there is much less dispersion of contaminants, than in surface waters.

Temporal design considerations

The objectives of the monitoring program dictate the basis for determining sampling frequency. A program to detect conformity with a guideline value might be based on daily, weekly or quarterly sampling.

You must decide whether seasons are important (for example, wet versus dry season in the tropics, or winter versus summer in temperate regions).

Patterns in time include natural systematic changes, ranging from tidal cycles to larger-scale events, such as the El Niño–Southern Oscillation. These may be periodic and predictable (e.g. tides, seasonal filling of a wetland) or nonperiodic (e.g. storms or floods in streams).

Within these events, there are unpredictable variations, such as changes in the recruitment of a species after minor natural disturbances or changes in the concentrations of chemical measurement parameters after rainstorms.

Nonperiodic events, such as storms and associated runoff, can have a dramatic impact on water quality that might be missed by sampling on a fixed-time interval. If you identify this possibility during preliminary assessments, you must design a sampling program that includes these events.

Rapid changes in flow can profoundly affect the concentrations of measurement parameters and therefore the representativeness of sampling. Even under relatively stable flow conditions, you must measure hydrological parameters when sampling if the program is measuring the loads of a measurement parameter rather than its concentration.

You must understand the system and the problem or issue being investigated, as illustrated by its conceptual process model, before you can select appropriate time intervals for sampling.

The program objective and the expected statistical analyses can both influence the time interval chosen between samples.

For example, you may want to be 95% certain of detecting a 5% increase in nutrient levels. Once this objective is set, it will be a relatively straightforward statistical matter to determine the frequency of sampling.

The values of a particular measurement parameter may not vary at every time scale.

If a measurement parameter has a predictable temporal pattern (e.g. recruitment with onset of the wet season, or deoxygenation during thermal stratification), then the monitoring program must sample the parameter at a frequency that suits its periodicity. Then you can estimate trends, especially in response to chronic or ‘press’ impacts.

If a disturbance is only likely to take place at a certain time of year — for example, mine wastewater discharge during the wet season in Kakadu (Humphrey et al. 1995) — then sampling can be targeted to such predictable ‘pulse’ disturbances.

At the other extreme, to measure the effects of highly variable and unpredictable disturbances (e.g. stormwater discharges), the monitoring program must sample at several time scales.

For example, in the Great Barrier Reef Catchment Loads Monitoring Program (Wallace et al. 2016, Joo et al. 2012), more sampling effort is often placed over the wet season than the dry season. Within wet season events, changes in streamflow or discharge automatically trigger the collection of water quality samples so as to target those events.
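A flow-triggered rule of this general kind can be sketched in a few lines. This is an illustrative rule only, not the trigger logic used in the program cited above: a sample is taken whenever the stream height has changed by at least a set amount since the last trigger. The function name, threshold and stage readings are hypothetical.

```python
def triggered_sample_indices(stage_m, trigger_change_m):
    """Indices of stage readings at which a sample would be triggered.

    A sample is triggered when the stage differs from the reference
    stage (the first reading, then the stage at the last trigger) by
    at least `trigger_change_m`, on either the rise or the fall.
    """
    if not stage_m:
        return []
    indices = []
    reference = stage_m[0]
    for i, stage in enumerate(stage_m[1:], start=1):
        if abs(stage - reference) >= trigger_change_m:
            indices.append(i)
            reference = stage
    return indices


# Hypothetical hourly stage readings (m) through a small event
stage = [1.0, 1.1, 1.4, 2.0, 2.6, 2.4, 1.8, 1.5, 1.4]
sample_at = triggered_sample_indices(stage, trigger_change_m=0.5)
```

With these assumed readings, samples are triggered at readings 3, 4 and 6 (counting from 0), covering both the rising and falling limbs while taking none during stable flow.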

Some measurement parameters give snapshots of immediate condition. Some are integrating measures that reflect conditions over the preceding months.

Basis for time-scale decisions

Characteristics of the parameter being measured.
Purpose of the data collection.
Statistical or other tools that will be used to interpret the data (e.g. for time-series analysis you may have to decide on and set a definite sampling interval).
Characteristics of the response of interest (e.g. weekly measurements might be appropriate for measuring the development of an algal bloom but not for investigating fish; the generation time of the organism might be the critical determinant of time scales).
Recognition that a process cannot be measured if it takes longer to happen than the period over which measurements are made.

Sampling at inadequate frequencies

Frequency of sampling becomes very important when the monitoring program aims to ensure that guideline values or standards are not exceeded.

Figure 1 shows some possible misinterpretations that arise from sampling at inadequate frequencies. The data values obtained at the selected sampling frequency are all below the water quality guideline value. Excursions above the guideline value between samplings were not evident at the selected sampling frequency.

**Figure 1 Frequency of sampling: interpretations of sampling data (modified from Maher et al. 1994)**
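The same effect can be reproduced with a small synthetic example. The sketch below builds a hypothetical daily series with 3 short excursions above a guideline value, then counts how many exceedances would be seen at different sampling intervals. All values are invented for illustration.

```python
def exceedances_seen(series, guideline, every):
    """Count sampled values above the guideline when sampling every nth value."""
    return sum(1 for value in series[::every] if value > guideline)


# Hypothetical daily concentrations over 120 days: a baseline of 3.0
# with three 3-day events at 8.0, against a guideline value of 5.0
daily = [3.0] * 120
for start in (10, 47, 95):
    for day in range(start, start + 3):
        daily[day] = 8.0

guideline = 5.0
seen_daily = exceedances_seen(daily, guideline, every=1)
seen_weekly = exceedances_seen(daily, guideline, every=7)
seen_monthly = exceedances_seen(daily, guideline, every=30)
```

In this assumed series, daily sampling records all 9 days above the guideline, weekly sampling records 1, and sampling every 30 days records none. Where more frequent data exist, the same subsampling approach can be applied to the real record to check what a proposed frequency would have missed.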

Identifying an appropriate sampling frequency can be challenging; it needs to be closely tied to the water quality objectives.

If more frequent data exist, then it is possible to look at the different sources of variation, or how well less frequent sampling will reproduce that underlying time series. This may even be possible by comparing the site of interest to similar sites.

You may use a number of approaches to help choose sampling frequency.

For example, Percival et al. (2011) used wavelets to identify the importance of different temporal changes in temperature in Wivenhoe Dam, South East Queensland. Khalil & Ouarda (2009) reviewed different approaches to select sampling frequency: use of trend analyses and harmonic analyses, characterising the covariance function or semivariogram, and the concept of entropy.

Your conceptual understanding of the site will be important too.

Different water quality variables or catchments behave differently. Intensive sampling may be critical in small flashy catchments, but less frequent sampling may be adequate in larger slow-moving catchments (Johnes 2007, Bowes et al. 2009).

Specific concerns for biological measurement variables

Biological sampling should account for the time dependence of an organism’s behaviour.

For example, Magmann (1991) re-examined a US study on the northern red belly (Phoxinus eos) and finescale dace (Phoxinus neogaeus), in which the population densities of both fish species were reported to be at their highest at or near the shore. Based on a trapping period of 16 to 18 hours starting at 4:00 to 7:00 pm, the original conclusion was that both species exploited the same microhabitat. However, the initial study failed to recognise that the northern red belly has a diurnal pattern of inshore–offshore migratory behaviour. These fish swim in shoals in the inshore zone (< 0.5 m depth) during the day and migrate to the offshore zone (> 2 m depth) at sunset, when shoals break up into single fish, then return to the inshore zone at sunrise. A shorter sampling interval (3 to 4 hours) was required to observe this movement. The density of fish offshore seemed to be lower because the fish shoals had broken up.

Biological variables may have the problem of serial correlation because of the long life span of the organisms involved. For example, the size of fish populations may depend on year-to-year variations in recruitment that may not be consistent across all the sites included in a study. Auxiliary data on the age-structure of the populations would be necessary to unravel these effects.

Serial correlation can affect the appropriateness of and inference from some statistical methods. Positive serial correlation typically reduces the effective sample size. If it is ignored, confidence intervals will be narrower than is justified, because they are calculated as if the samples were independent. Refer to Data analysis for discussion of these concerns and some approaches that incorporate the correlation in the statistical method.
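A minimal sketch of the size of this effect is given below. It estimates the lag-1 autocorrelation of an equally spaced series and applies the common approximation for the effective number of independent samples when estimating a mean, which assumes a first-order autoregressive, or AR(1), process. The function names and input values are hypothetical.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of an equally spaced series."""
    n = len(x)
    mean = sum(x) / n
    numerator = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    denominator = sum((value - mean) ** 2 for value in x)
    return numerator / denominator


def effective_sample_size(n, rho):
    """Approximate number of independent samples, assuming an AR(1) process."""
    return n * (1 - rho) / (1 + rho)


# Hypothetical case: 52 weekly samples with lag-1 autocorrelation of 0.5
n_eff = effective_sample_size(52, 0.5)
```

Under these assumptions, 52 weekly samples carry about as much information about the mean as 17 independent samples, so a confidence interval calculated with n = 52 would be too narrow.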

Specific concerns for chemical and physical measurement variables

In still or slow-moving surface waters, special care should be taken when measuring some chemical and physical variables, such as dissolved oxygen and pH levels. The values can change dramatically in these waters during the day, through photosynthesis and respiration.

For example, dissolved oxygen must be measured before sunrise to obtain the diurnal minimum. Diurnal pH fluctuations occur when carbon dioxide concentrations vary, and the pH decreases at night when dissolved carbon dioxide (as carbonic acid) accumulates in the absence of photosynthesis.

Sampling at a standardised time of day is often the best way to monitor long-term trends because it ensures a consistent comparison.

If concentrations are being used to calculate loads, then it will be important to decide how to relate flow and concentrations, and on what time basis.

When selecting the frequency of sampling, consider 4 common types of system where the major determinants of water quality are:

base flow and point-source discharges
runoff (volume of storm event) and nonpoint sources
remobilisation, or
diurnal cycles (tidal cycles or biological activity).

Changes in flow

Australian freshwaters are characterised by highly variable flow, which is a major issue that affects both water quality and biology (refer to our guidance for temporary inland waters).

Changes in flow can alter water quality parameters rapidly — and sometimes unpredictably — for several reasons:

Hydrological changes alter the relative proportions of discharge originating from runoff, baseflow and groundwater. Runoff water may be of better quality than groundwater and baseflow (Hart et al. 1987) but not always. In a hydrograph, this is often observed as a period of decreasing electrical conductivity during the rising limb (concentration curve), followed by rising electrical conductivity during the falling (recession) limb. Alternatively, runoff may contain increased concentrations of nutrients (from fertilised fields, urban areas or sewage treatment plants) or heavy metals and organic compounds (from contaminated sites).
Rainfall events in a catchment may give different patterns of water quality depending on their locations.
Deliberate releases of contaminants may coincide with extreme hydrological conditions to take advantage of the large dilution factors available then.
Extreme rainfall conditions may breach bunds and other containment devices used for the retention of contaminants.
Very large changes in water quality may occur in temporary water courses during extreme recessional flow and during the ‘first flush’ of new flow. During the first flush, chemical species that have accumulated in catchment soils and near-surface groundwater (sometimes acidic from organic degradation or sulfide oxidation) may dramatically alter the concentrations of some measurement parameters.

Total heavy metal and phosphorus concentrations are correlated with flow discharge in many water bodies, particularly during the early part of a stormwater runoff or flood event. This is when we see more suspended solids and higher concentrations of associated heavy metals and phosphorus. If a river is only sampled at base flow for chemical and physical measurement parameters, then the resulting data will not truly represent the natural range of heavy metal and phosphorus concentrations in the water body.

Similarly, in the initial stages of an algal bloom, the numbers of algal cells may double every 2 to 3 days. If the monitoring program is measuring some aspect of nutrient fluxes, then sampling needs to reflect flow events that transport materials into and through the aquatic system.

Catchment exports to rivers and estuaries can be assessed by intensive sampling during events.

Sampling under different flow regimes is not traditionally considered when designing a monitoring program. However, in the majority of Australian rivers, most (70 to 90%) of the annual flow and constituents are discharged under high-flow or event conditions, although such conditions may account for only 1 to 10% of the time. Dominant water-quality processes under high-flow or event conditions are the transport and deposition of discharged material during the flow event.

Where the issues underlying the monitoring program relate to flow, consider the entire hydrograph, or at least the importance of:

flow-based monitoring and capturing first-flush and peak events
measuring and recording flow data in conjunction with analyte concentration data obtained at the same time
sampling and obtaining information at all flow regimes, including low flow, so that water quality can be described for all conditions of the water body.

Using robust and reliable automatic sampling devices can help to solve the difficulty of sampling at all flow regimes. These devices can be automatically triggered by rising or falling flow to take samples at predetermined times or stream heights, which will show changing constituent levels throughout an event. You can use these data with the flow data to calculate the contaminant export (or load) for a storm or flood event.
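One simple way to combine paired flow and concentration data into an event load is to integrate the instantaneous flux (flow multiplied by concentration) over time. The sketch below uses the trapezoidal rule, which assumes flux changes linearly between samples; other load estimation methods exist and may suit sparse data better. The function name, units and values are hypothetical.

```python
def event_load_kg(times_h, flow_m3s, conc_mg_per_l):
    """Event load in kg from paired flow and concentration samples.

    Flux in g/s is flow (m3/s) times concentration (mg/L, equal to g/m3).
    The trapezoidal rule integrates flux over the sample times (hours).
    """
    flux_g_per_s = [q * c for q, c in zip(flow_m3s, conc_mg_per_l)]
    load_g = 0.0
    for i in range(1, len(times_h)):
        interval_s = (times_h[i] - times_h[i - 1]) * 3600
        load_g += 0.5 * (flux_g_per_s[i] + flux_g_per_s[i - 1]) * interval_s
    return load_g / 1000


# Hypothetical storm event sampled hourly
times = [0, 1, 2, 3]             # hours since first sample
flow = [2.0, 10.0, 6.0, 2.0]     # m3/s
conc = [0.1, 0.5, 0.3, 0.1]      # mg/L

load = event_load_kg(times, flow, conc)
```

For these assumed values, the event load is about 25.2 kg. Because both flow and concentration peak together, most of the load comes from the 2 hours around the peak, which is why samples on the rising limb matter.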

Monitoring effort can be reduced over the long periods between events because there is little water quality variation during these conditions. An exception to this condition is the case of base flow with point-source discharges.

The sampling plan and study design for sediments must consider the effects of natural layering, mixing and variations in particle size and porosity on the indicator being sampled. The likelihood of disturbance and cross-contamination during sampling must not be forgotten. Suspended sediments need to be collected in a representative manner (Batley 1989), as do sediment pore waters. Refer to our guidance on field sampling for sediments.

For all samples of physical and chemical variables, take care to prevent the values of indicators changing during storage and transport.

Sample numbers and precision

Choosing an appropriate number of samples to be collected will help you to achieve the monitoring program objective. This will largely depend on the nature of the investigation but may also be informed by a pilot study.

In descriptive studies or cause-and-effect studies (to improve system understanding), the number of samples will determine the power of the data to assess differences.

In studies that measure change, there must be enough samples to detect the effect size: the minimum effect, or smallest difference or change, that will cause management action (Keough & Mapstone 1995, page 102).

You must decide on the required precision and accuracy. Answering these questions may help:

How many samples are needed to precisely measure each variable at each site on each sampling occasion?
How many samples can the monitoring program afford to collect?

Base your decision on the results of a pilot study or on other reliable estimates of the variance and the costs of sampling (Keough & Mapstone 1995). Choosing an appropriate level of replication is not a simple decision (Segar et al. 1987; Mapstone 1995) because it must:

be scientifically attainable
be cost-effectively procured through a sampling and analysis program
detect minimum differences or changes that may affect any community values placed on the water body, including ecological values or sociocultural values
ensure adequate precision to minimise the risks of making a Type I error (falsely detecting a meaningful disturbance or environmental impact when one has not occurred; giving a false alarm) or a Type II error (missing a meaningful environmental impact if it has occurred; giving a false sense of security).

The smallest differences or changes that must be detected determine the required number of spatial and temporal replicates (Norris & Georges 1986) and the required precision.

For example, if a copper guideline concentration is 5.00 µg/L, is it environmentally important to be able to detect 5.01, 5.10 or 5.50 µg/L? This is not the same as statistical significance.

Once the size of the differences or magnitude of the trends that must be detected has been established, the number of replicates required can be calculated. This is effectively an application of statistical power analysis, explained in many introductory texts (e.g. Cohen 1988, Sokal & Rohlf 1995). Various formulas can be used to calculate the required numbers of replicates (e.g. Norris et al. 1992, Keough & Mapstone 1995), many of which are available in statistical software tools. You should be aware of the distributional assumptions behind such formulas (refer to Data analysis).
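
The sketch below shows one such calculation: the textbook normal-approximation formula for comparing two means, n = 2 × ((z₁₋α/₂ + z₁₋β) × s / δ)² replicates per group. It is not taken from the references cited above. It assumes independent replicates, equal variances, approximately normal data and a two-sided test. The standard deviation of 0.5 µg/L is a hypothetical value chosen to match the copper example.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate replicates per group needed to detect a difference in
    means of `effect_size` with a two-sided, two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls Type I error
    z_beta = NormalDist().inv_cdf(power)           # controls Type II error
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect_size) ** 2)


# Differences from the 5.00 ug/L copper guideline, with an assumed SD.
assumed_sd = 0.5  # hypothetical; use a pilot study estimate in practice
for difference in (0.50, 0.10, 0.01):
    n = replicates_per_group(difference, assumed_sd)
    print(f"detect {difference:.2f} ug/L: about {n} replicates per group")
```

Under these assumptions, halving the effect size roughly quadruples the replicates required, which is why the effect size should be settled on environmental grounds before sampling begins. Calculations based on the t-distribution give slightly larger numbers for small samples.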

Decisions about optimum sample sizes for more complex designs may be difficult (Green 1989, 1994, Norris et al. 1992, Keough & Mapstone 1995, 1997). Professional statistical assistance may be required.

One important consideration during the process of determining the number of replicates required for the monitoring program is to ensure that the replicates are true replicates.

Pseudoreplication occurs when there is correlation between the replicates (Hurlbert 1984). This may occur because replicates are spatially correlated or correlated by virtue of the collection process. The effect of pseudoreplication is that it overestimates the effective sample size, which can result in smaller apparent standard errors than occur in reality. Pseudoreplication cannot be avoided in some cases, but it is important to interrogate the design and be very clear on where the replication is occurring and how it may affect any inferences.
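
One common way to gauge the consequence of correlated replicates is the design-effect approximation n_eff = n / (1 + (m − 1) × ρ), where m is the number of replicates per cluster (for example, subsamples per site) and ρ is the correlation among replicates within a cluster. The sketch below is an illustration under those assumptions (equal cluster sizes and a single common correlation) and is not a method drawn from Hurlbert (1984). The names and values are hypothetical.

```python
def effective_sample_size(n_clusters, per_cluster, rho):
    """Approximate number of independent observations when replicates
    within each cluster share a common correlation rho."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    total = n_clusters * per_cluster
    design_effect = 1 + (per_cluster - 1) * rho
    return total / design_effect


# Hypothetical design: 5 sites with 6 subsamples each.
for rho in (0.0, 0.5, 1.0):
    n_eff = effective_sample_size(5, 6, rho)
    print(f"rho = {rho:.1f}: 30 samples behave like about {n_eff:.1f}")
```

When ρ is 1 the subsamples add no independent information, and the effective sample size collapses to the number of sites.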

There are trade-offs with costs, but unless the sampling is done in a way that enables the required data to be collected, the study cannot meet its objectives. If you find that resources are limiting, you may need to reconsider the sampling objectives.

Decide on the sampling effort that is required to test critical hypotheses, if these are being used. If precision will be below that at which the critical hypotheses can be tested, then the proposed study design is a waste of time and money.

If the information generated by the monitoring program is to be used to make decisions, priorities will often be based on the risks associated with making poor decisions. Risk is often viewed not in environmental terms but as political or social costs.

Data requirements

After making decisions about study type, spatial boundaries and measurement parameters, you must summarise the data requirements.

Data requirements include measurement parameters, scale, geographic locations and length of study, frequency, accuracy and precision. These serve as the ‘concrete’ instructions for making decisions about techniques required for data analysis and the design of specifically tailored sampling and analysis programs.

Cost-effectiveness of sampling programs

It is preferable for the cost of sampling programs to be as low as possible while still meeting the stated objectives of the monitoring study.

Considering cost-effectiveness involves a trade-off between loss of statistical power for discriminating between various hypotheses and the cost of data acquisition.

You need to determine all the resources and associated costs required, to ensure the study can be completed.

Costs of data acquisition are determined by:

number of sampling stations
number of sampling occasions
level of replication
cost of collecting samples (staff, transport, consumables)
cost of analysis
cost of data handling and interpretation (cost of reporting).
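
The sketch below shows how these cost components might be combined into a simple budget check. The cost structure (a fixed cost, a cost per site visit and a cost per sample collected and analysed) is an assumption made for illustration, as are all names and dollar values. Real programs will have their own cost breakdown.

```python
def program_cost(stations, occasions, replicates,
                 visit_cost, per_sample_cost, fixed_cost=0.0):
    """Total cost under a simple additive model.

    visit_cost       staff and transport per station per occasion
    per_sample_cost  consumables plus analysis per replicate sample
    fixed_cost       data handling, interpretation and reporting
    """
    visits = stations * occasions
    return fixed_cost + visits * (visit_cost + replicates * per_sample_cost)


def max_replicates(budget, stations, occasions,
                   visit_cost, per_sample_cost, fixed_cost=0.0):
    """Largest replication level per visit that stays within the budget."""
    visits = stations * occasions
    remaining = budget - fixed_cost - visits * visit_cost
    if remaining < 0:
        return 0  # budget does not even cover fixed and visit costs
    return int(remaining // (visits * per_sample_cost))


# Hypothetical program: 4 stations sampled monthly for a year.
cost = program_cost(4, 12, 3, visit_cost=150, per_sample_cost=80,
                    fixed_cost=5000)
reps = max_replicates(30000, 4, 12, visit_cost=150, per_sample_cost=80,
                      fixed_cost=5000)
print(f"cost with 3 replicates: {cost}, replicates affordable: {reps}")
```

Comparing the affordable replication with the replication needed to detect the effect size shows quickly whether the objectives and the budget are compatible.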

Refer to information about the optimisation of sampling programs with regard to precision and cost (Montgomery & Hart 1974, Eberhardt 1976, Ellis & Lacy 1980, Short 1980, Bailey et al. 1984, Lettenmaier et al. 1984, Hayes et al. 1985, Radford & West 1986, Kratochvil 1987, Downes et al. 2002).

Consideration of reporting schedules

During the study design process, it is important that the primary users and the suppliers of the information agree on the reporting schedules.

If the expected schedules are unreasonable, compromise arrangements need to be made. Promising more than can be delivered within a certain time places unnecessary pressure on the monitoring team. Failure to report findings on time will damage relationships between the information user and the supplier.

All stages of the monitoring program will have their own time frames that must be considered when agreeing to a reporting schedule.

For example, the monitoring of a range of river flows will take months or years. The time frames for reporting on laboratory analyses will vary significantly depending on the analyte.

The design process should consider the reporting needs and expectations of all other stakeholders and information users.

References

Bailey DA, Johnson D & Woolloff DA 1984, The Development of a Comprehensive Sampling Program for Yorkshire Water Authority, Journal of the Institution of Water Engineers and Scientists 38: 435–456.

Bailey RC, Norris RH & Reynoldson TB 2004, Bioassessment of Freshwater Ecosystems: Using the Reference Condition Approach, Kluwer Academic Publishers, Boston.

Batley GE 1989, Collection, preparation and storage of samples for speciation analysis, in: Batley GE (ed.), Trace Element Speciation: Analytical Methods and Problems, CRC Press Inc, Boca Raton, pages 1–24.

Bowes MJ, Smith JT & Neal C 2009, The value of high-resolution nutrient monitoring: a case study of the River Frome, Dorset, UK, Journal of Hydrology 378(1–2): 82–96.

Cochran WG 1977, Sampling Techniques, 3rd Edition, Wiley, New York.

Cohen J 1988, Statistical Power Analysis for the Behavioral Sciences, 2nd Edition, Lawrence Erlbaum Associates, Hillsdale, New Jersey.

Dixon W & Chiswell B 1996, Review of aquatic monitoring program design, Water Research 30(9): 1935–1948.

Dobbie MJ, Henderson BL & Stevens DL Jr 2008, Sparse sampling: spatial design for aquatic monitoring, Statistics Surveys 2: 113–153.

Dobbie MJ & Negus P 2013, Addressing statistical and operational challenges in designing large-scale stream condition surveys, Environmental Monitoring and Assessment 185: 7231–7243.

Downes BJ, Barmuta LA, Fairweather PG, Faith DP, Keough MJ, Lake PS, Mapstone BD & Quinn GP 2002, Monitoring Ecological Impacts: Concepts and Practice in Flowing Waters, xii + 434 pp, Cambridge University Press, Cambridge.

Eberhardt LL 1976, Quantitative ecology and impact assessment, Journal of Environmental Management 4: 213–217.

Elliott JM 1977, Some Methods for the Statistical Analysis of Samples of Benthic Invertebrates, 2nd Edition, Freshwater Biological Association, Ambleside.

Ellis JC & Lacy RF 1980, Sampling: defining the task and planning the scheme, Water Pollution Control (UK) 79: 452–467.

Gy P 1986, The analytical and economic importance of correctness in sampling, Analytica Chimica Acta 190: 13–23.

Harcum JB & Dressing SA 2015, Technical Memorandum #3: Minimum Detectable Change and Power Analysis, October 2015 (PDF, 1.3MB), Developed for US Environmental Protection Agency by Tetra Tech, Inc., Fairfax.

Hart BT, Ottaway EM & Noller BN 1987, Magela Creek system, northern Australia, I. 1982–83 wet-season water quality, Australian Journal of Marine and Freshwater Research 38: 261–288.

Hayes CR, Warn AE & Green LA 1985, Development of comprehensive water supply quality control in Anglian Water, Journal of the Institution of Water Engineers and Scientists 39: 539–547.

Helsel D & Hirsch RM 1992, Statistical Methods in Water Resources, Elsevier, Amsterdam.

Hughes RM 1995, Defining acceptable biological status by comparing with reference conditions, in: Davis WS & Simon TP (eds), Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making, pages 31–47, Lewis Publishers, Boca Raton.

Humphrey CL, Faith DP & Dostine PL 1995, Baseline requirements for assessment of mining impact using biological monitoring, Australian Journal of Ecology 20(1): 150–166.

Hurlbert SH 1984, Pseudoreplication and the design of ecological field experiments, Ecological Monographs 54: 187–211.

Johnes PJ 2007, Uncertainties in annual riverine phosphorus load estimation: Impact of load estimation methodology, sampling frequency, baseflow index and catchment population density, Journal of Hydrology 332(1–2): 241–258.

Joo M, Raymond MAA, McNeil VH, Huggins R, Turner RDR & Choy S 2012, Estimates of sediment and nutrient loads in 10 major catchments draining to the Great Barrier Reef during 2006–2009, Marine Pollution Bulletin 65(4–9): 150–166.

Keith LH 1991, Environmental Sampling and Analysis: A Practical Guide, Lewis Publishers, Chelsea.

Keough MJ & Mapstone BD 1995, Protocols for Designing Marine Ecological Monitoring Programs Associated with BEK Mills, National Pulp Mills Research Program No. 11, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra.

Keough MJ & Mapstone BD 1997, Designing environmental monitoring for pulp mills in Australia, Water Science and Technology 35: 397–404.

Khalil B & Ouarda TBMJ 2009, Statistical approaches used to assess and redesign surface water-quality-monitoring networks, Journal of Environmental Monitoring 11: 1915–1929.

Kratochvil B 1987, General principles of sampling, in: JK Taylor (ed.), Sampling and Calibration for Atmospheric Measurement ASTM STP 957, pp. 5–13, American Society for Testing and Materials, Philadelphia.

Lettenmaier DP, Anderson DE & Brenner RN 1984, Consolidation of a stream quality monitoring network, Water Resources Bulletin 20(4): 473–481.

Magmann P 1991, Unrecognized behaviour and sampling limitation bias field data, Environmental Biology of Fishes 31: 403–406.

Mapstone BD 1995, Scalable decision rules for environmental impact studies: Effect size, Type I, and Type II Errors, Ecological Applications 5: 401–410.

Montgomery HAC & Hart IC 1974, The design of sampling programs for rivers and effluents, Water Pollution Control 73: 77–101.

Norris RH & Georges A 1986, Design and analysis for assessment of water quality, in: De Deckker P & Williams WD (eds), Limnology in Australia, pages 555–572, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Melbourne & W Junk, Dordrecht.

Norris RH, McElravy EP & Resh VH 1992, The sampling problem, in: Calow P & Petts GE (eds), The Rivers Handbook Volume 1: Hydrological and Ecological Principles, pp. 1–15, Blackwell Scientific Publishers, Oxford.

Percival DB, Lennox SM, Wang YG & Darnell RE 2011, Wavelet-based multiresolution analysis of Wivenhoe Dam water temperatures, Water Resources Research 47(5): [W05552] doi:10.1029/2010WR009657.

Radford PJ & West J 1986, Models to minimize monitoring, Water Research 20: 1059–1066.

Sahu SK & Smith TMF 2006, A Bayesian method of sample size determination with practical applications, Journal of the Royal Statistical Society: Series A 169(2): 235–253.

Segar DA, Phillips DJH & Stamman E 1987, Strategies for long term pollution monitoring of the Coastal Oceans, in: Boyle TP (ed.), New Approaches to Monitoring Aquatic Ecosystems, ASTM STP 940, pages 12–27, American Society for Testing and Materials, Philadelphia.

Short CS 1980, Sampling programme design for water quality in distribution, Water Services 84: 529.

Sokal RR & Rohlf FJ 1995, Biometry: The Principles and Practice of Statistics in Biological Research, 3rd Edition, Freeman, New York.

Wallace R, Huggins R, King O, Gardiner R, Thomson B, Orr D, Ferguson B, Taylor C, Severino Z, Smith RA, Warne M St J, Turner RDR & Mann R 2016, Total Suspended Solids, Nutrient and Pesticide Loads (2014–2015) for Rivers that Discharge to the Great Barrier Reef — Great Barrier Reef Catchment Loads Monitoring Program (PDF, 2.8MB), Department of Science, Information Technology and Innovation, Brisbane.