Sample survey design

Observed data should be free from selection bias and sufficient in numbers to represent the population of interest. This will ensure that you can make valid and defensible inferences with confidence.

Sample survey design is a very mature and deeply rooted discipline in the statistical literature.

A study design is a set of decisions (design parameters) about what and how much data to collect, and when (how often, for how long) and where to collect it. These decisions are directly informed by the monitoring objectives and are fine-tuned by budget constraints, available resources, implementation considerations, domain or contextual knowledge (e.g. relevant historical data, pilot studies) and known characteristics of the variables of interest (e.g. variability).

Relationship between sample and population

Environmental variability (temporal, spatial and possibly even spatiotemporal) is probably the most important statistical notion to consider in the design of sampling programs (Eberhardt 1978, Morin et al. 1987, Kerekes & Freedman 1989).

Understanding the variability of key measurement variables can be useful to determine a desired or minimum number of sample sites, the desired number of replicate samples to take or record at a site, and the desired frequency of sample collection.

However, desired design parameters and eventual design parameters may be different due to other influences, such as budget or resource constraints or implementation considerations.

The main idea behind study design is to make valid and reliable inference about a population (a particular part of a river, a stream network, a lake, or a body of marine waters) by collecting only a sample. The key to this validity and reliability is to ensure that the observed data are free from selection bias and that there are sufficient sites to best represent the systems being studied.

Ad hoc approaches, such as convenience sampling and representative sampling (both of which appeal to expert knowledge to choose sample locations), should be avoided because they are inherently subjective and can introduce substantial bias. They also provide no clear link between the sample data and the population characteristics of interest, so the basis for extrapolation and inference is not justifiable and the uncertainty about estimates is difficult to quantify.

The relationship between the sample and the population should be characterised either:

  • by a process or statistical model (model-based design), or
  • through the sampling process (probability-based design).

In the context of designing large-scale monitoring programs for assessing stream health, Dobbie et al. (2008) discussed the advantages and disadvantages of these two approaches, which we summarise here.

Probability-based study design

Probability-based study designs have 3 distinguishing features:

  • The population being sampled is explicitly described.
  • Every population element is assigned a probability of being included in the sample (giving rise to an inclusion probability density function).
  • Site selection is carried out via a probability regime with an explicit random component (thereby eliminating selection bias, conscious or not).

These features provide mathematical foundations for ensuring the sample is representative and the statistical inferences are valid and unbiased. Cochran (1977), Sarndal et al. (1992) and Thompson (1992) cover the essential attributes of study design.

A particularly attractive aspect of probability-based designs is that the resulting inference can be design-based (using the built-in inference tools associated with a particular probability study design) or model-based.

Through design-based inference, an objective assessment of uncertainty can be made. The merits of probability-based sampling in the context of aquatic monitoring are further discussed in Dobbie et al. (2008) and Dobbie & Negus (2013).

Probability-based study designs include simple random sampling (SRS), systematic sampling (SyS), stratified sampling (random or systematic), two-stage or cluster sampling, double sampling, adaptive sampling and spatially balanced sampling. For large-scale aquatic monitoring designs, Dobbie et al. (2008) reviewed each of these, some of which we summarise here.

Simple random sampling

Simple random sampling (SRS) is the simplest form of probability sampling, where a series of random locations or samples are selected from a population of possible locations without constraint.

SRS offers simple prescriptive calculations for the mean response at those locations (or for those samples) and thus straightforward statistical inferences. There is also flexibility to easily increase or decrease the sample size if required.

On the downside, SRS designs tend to yield clustered sampling units, implying that some parts of the population are represented more heavily (due to chance) than others. In other words, there is no assurance of spatial balance or regularity.
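As a minimal sketch of the calculations involved, the following Python example draws a simple random sample without replacement and estimates the population mean and its standard error. The population of 200 candidate sites, the simulated turbidity values and the function name `srs_estimate` are assumptions made for illustration only.

```python
import math
import random
import statistics

def srs_estimate(population, n, rng):
    """Draw a simple random sample without replacement and estimate the mean."""
    sample = rng.sample(population, n)
    mean = statistics.mean(sample)
    s2 = statistics.variance(sample)      # sample variance (n - 1 divisor)
    fpc = 1 - n / len(population)         # finite population correction
    se = math.sqrt(fpc * s2 / n)          # standard error of the sample mean
    return mean, se

rng = random.Random(42)
# Hypothetical population: one simulated turbidity value per candidate site.
population = [rng.gauss(10.0, 3.0) for _ in range(200)]
mean, se = srs_estimate(population, n=30, rng=rng)
print(f"estimated mean = {mean:.2f}, standard error = {se:.2f}")
```

Changing `n` is all that is needed to increase or decrease the sample size, which reflects the flexibility of SRS noted above.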

For monitoring stream condition, SRS is an impractical and inefficient way to sample.

Systematic sampling

In systematic sampling (SyS), samples are collected at regular intervals in space or time. The random element can be introduced through selection of a random starting place (in space or in time).

Applying this approach to a spatial context ensures good spatial balance. The broad coverage of systematic samples can be an advantage when sampling rare and clustered populations. When properly planned and executed, SyS can be as unbiased as SRS, and can be significantly cheaper. Refer to Cochran (1977) for a full discussion.

However, you must take care to ensure that bias is not inadvertently incorporated into the sampling scheme. For instance, regular sampling schedules may coincide with periodicities in the disturbance being monitored (e.g. discharges from a factory may be consistently lower in the morning and greatest just before shut-down in the late afternoon). Similar situations can arise spatially.
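The periodicity problem can be illustrated with a small simulation. The sketch below assumes a hypothetical hourly discharge series with a 24-hour cycle and compares two systematic schedules, each with a random start. The series, the intervals and the function name `systematic_sample` are illustrative assumptions.

```python
import math
import random

def systematic_sample(n_units, interval, rng):
    """Return indices start, start + interval, ... with a random start in [0, interval)."""
    start = rng.randrange(interval)
    return list(range(start, n_units, interval))

# Hypothetical hourly discharge over 28 days with a 24-hour cycle.
hours = 24 * 28
discharge = [10 + 5 * math.sin(2 * math.pi * h / 24) for h in range(hours)]
true_mean = sum(discharge) / hours

rng = random.Random(1)
for interval in (24, 25):
    idx = systematic_sample(hours, interval, rng)
    est = sum(discharge[i] for i in idx) / len(idx)
    print(f"interval {interval} h: n = {len(idx)}, "
          f"estimate = {est:.2f}, true mean = {true_mean:.2f}")
```

With a 24-hour interval every sample falls at the same point in the daily cycle, so the estimate depends entirely on the random start. With a 25-hour interval the samples drift through the cycle and the estimate stays close to the true mean.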

Another disadvantage of SyS is that it may be difficult to apply for some finite populations (e.g. if lakes in a region are the sampling units).

SyS typically has less flexibility to change the sample point density or add new points without updating the whole design.

If spatial sampling is of interest, random-tessellation stratified (RTS) study designs help to resolve some of the issues concerning periodic or patterned responses not handled by SyS designs.

RTS study designs work by randomly locating a regular grid over the population domain with spacing chosen to provide the required spatial resolution. Random points from within each random tessellation cell are selected to form the required sample. Refer to Overton & Stehman (1993) and Olsen et al. (1998) for an explanation of RTS designs and application of global grids to large-scale environmental sampling of natural resources.
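The two steps described above (randomly locate a grid, then select a random point within each cell) can be sketched as follows. This is a simplified illustration that assumes a rectangular domain and square grid cells, whereas the cited references work with other grid geometries. The function name `rts_sample` and its parameters are hypothetical.

```python
import random

def rts_sample(width, height, spacing, rng):
    """One random point per cell of a randomly offset square grid, clipped to the domain."""
    # Random shift of the grid origin, so cell boundaries are not fixed in advance.
    ox = -rng.uniform(0, spacing)
    oy = -rng.uniform(0, spacing)
    points = []
    x0 = ox
    while x0 < width:
        y0 = oy
        while y0 < height:
            # Uniform random location inside the current cell.
            x = x0 + rng.uniform(0, spacing)
            y = y0 + rng.uniform(0, spacing)
            # Cells on the edge overlap the boundary; keep only points inside the domain.
            if 0 <= x <= width and 0 <= y <= height:
                points.append((x, y))
            y0 += spacing
        x0 += spacing
    return points

rng = random.Random(5)
sites = rts_sample(width=1.0, height=1.0, spacing=0.25, rng=rng)
print(f"{len(sites)} sites selected")
```

The `spacing` argument controls the spatial resolution: a smaller value gives more cells and therefore more sample points.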

You need a good descriptive base of background information so that SyS can be both cost-effective and unbiased, and you must document the assumptions and choices made when executing such a sampling regime.

Two-stage sampling

Two-stage (or multistage) sampling is appropriate when the target population can be divided into a collection of ‘primary units’ that are sampled in two stages. A random sample of primary units is taken in the first stage, and sampling units are selected either randomly or systematically from the selected primary units in the second stage.

For example, rivers and streams within a subregion may be randomly chosen, and then sites within those selected rivers randomly chosen.

This differs from cluster sampling, where all the sampling units in selected primary units are sampled.
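A minimal sketch of the river example, assuming a hypothetical sampling frame of 6 rivers with 8 candidate sites each. The frame, the labels and the function name `two_stage_sample` are made up for illustration.

```python
import random

def two_stage_sample(primary_units, n_primary, n_secondary, rng):
    """Stage 1: random primary units. Stage 2: random sites within each selected unit."""
    chosen = rng.sample(sorted(primary_units), n_primary)
    return {
        unit: rng.sample(primary_units[unit], min(n_secondary, len(primary_units[unit])))
        for unit in chosen
    }

# Hypothetical frame: 6 rivers, each with a list of candidate site labels.
rivers = {f"river_{r}": [f"river_{r}_site_{s}" for s in range(1, 9)] for r in "ABCDEF"}
rng = random.Random(7)
design = two_stage_sample(rivers, n_primary=3, n_secondary=2, rng=rng)
for river, sites in design.items():
    print(river, sites)
```

Setting `n_secondary` to at least the number of sites in every river selects all sites in each chosen river, which corresponds to cluster sampling.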

Stratified random sampling

Stratified random sampling divides the population into a collection of strata that exhaustively cover the region of interest. The strata can be of variable size and may be spatial (e.g. different subregions or subpopulations of interest), temporal (e.g. seasons within a year) or reflect operational, management or administrative aspects of the region (e.g. local government boundaries).

Choose strata so that sampling units within a stratum are more homogeneous than sampling units from different strata. The number of strata and the number of samples to select from each stratum depend on variability and importance.

Stratified random sampling is more complex than SRS or SyS but it can be substantially more efficient, particularly if the variability between strata is large compared to the variation within the strata. Also, the inference is well formed and widely accessible for this probability sampling scheme.
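The sketch below illustrates one common way to carry out these calculations. It assumes Neyman allocation (sample sizes proportional to stratum size multiplied by stratum standard deviation), which is one option among several and is not prescribed by this guidance. The two seasonal strata, their simulated values and the function names are hypothetical.

```python
import math
import random
import statistics

def neyman_allocation(sizes, sds, n_total):
    """Split n_total across strata in proportion to N_h * S_h (at least 2 per stratum)."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total = sum(weights.values())
    # Rounding means the allocation may not sum exactly to n_total.
    return {h: min(sizes[h], max(2, round(n_total * w / total))) for h, w in weights.items()}

def stratified_mean(strata, allocation, rng):
    """Estimate the population mean and its standard error from a stratified random sample."""
    n_pop = sum(len(units) for units in strata.values())
    mean, var = 0.0, 0.0
    for h, units in strata.items():
        n_h = allocation[h]
        sample = rng.sample(units, n_h)
        w_h = len(units) / n_pop                      # stratum weight N_h / N
        mean += w_h * statistics.mean(sample)
        var += w_h**2 * (1 - n_h / len(units)) * statistics.variance(sample) / n_h
    return mean, math.sqrt(var)

rng = random.Random(3)
# Hypothetical strata: two seasons with different variability in nutrient concentration.
strata = {
    "wet": [rng.gauss(8.0, 4.0) for _ in range(120)],
    "dry": [rng.gauss(3.0, 1.0) for _ in range(240)],
}
sizes = {h: len(units) for h, units in strata.items()}
sds = {"wet": 4.0, "dry": 1.0}    # assumed known, for example from a pilot study
allocation = neyman_allocation(sizes, sds, n_total=40)
mean, se = stratified_mean(strata, allocation, rng)
print(allocation, f"mean = {mean:.2f}, standard error = {se:.2f}")
```

In this example the more variable wet season receives about two-thirds of the sampling effort even though it contains only one-third of the units.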

The choice of strata should account for the length and breadth of the study to ensure that appropriate strata can be chosen at the outset so they persist over time.

Examples of sampling through stratification

  • In water sampling to measure nutrients, chlorophyll and algae, a lake may be divided spatially into epilimnion (upper layer) and hypolimnion (lower layer) zones.
  • In estuarine sampling, stratification might be based on the salinity gradient.
  • Temporally, if nutrients are more variable in one season than another, more sampling effort can be allocated to the most variable season, particularly if estimates of the annual concentration or load of the nutrients are the focus of the program.
  • If fish in a lake are being collected to study the accumulation of chemical contaminants, it is important to consider fish mobility and fish age (size). Older fish often accumulate more of a contaminant. Fish ages (sizes) then become the sampling strata, instead of geographical locations or particular periods of time.

Variable probability sampling is a generalisation of stratified random sampling that allows selection probabilities to vary continuously instead of being constant within a discrete stratum. In a practical sense, variable probability sampling uses auxiliary variables to discriminate between subpopulations of interest and inform probability weights, thus improving the precision of the results.
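As a rough illustration of how an auxiliary variable can inform probability weights, the sketch below sets inclusion probabilities proportional to an auxiliary variable and estimates a population total by weighting each sampled value by the inverse of its inclusion probability (the Horvitz-Thompson estimator). For simplicity it assumes Poisson sampling, where each unit is included independently and the sample size is random. That choice, the simulated catchment data and all function names are assumptions for illustration.

```python
import random

def inclusion_probabilities(aux, n_expected):
    """Inclusion probabilities proportional to an auxiliary variable, capped at 1."""
    total = sum(aux)
    return [min(1.0, n_expected * a / total) for a in aux]

def poisson_sample(probs, rng):
    """Include each unit independently with its own probability (random sample size)."""
    return [i for i, p in enumerate(probs) if rng.random() < p]

def horvitz_thompson_total(values, probs, sample):
    """Weight each sampled value by 1 / inclusion probability to estimate the total."""
    return sum(values[i] / probs[i] for i in sample)

rng = random.Random(11)
# Hypothetical frame: catchment area as the auxiliary variable, load as the response.
area = [rng.uniform(1, 20) for _ in range(300)]
load = [2.0 * a + rng.gauss(0, 1) for a in area]
probs = inclusion_probabilities(area, n_expected=40)
sample = poisson_sample(probs, rng)
estimate = horvitz_thompson_total(load, probs, sample)
print(f"sampled {len(sample)} units; "
      f"estimated total = {estimate:.0f}, true total = {sum(load):.0f}")
```

Setting every probability to the same value within a stratum recovers stratified sampling as a special case.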

Spatially balanced study designs

Systematic sampling and stratified sampling (including RTS designs) ensure sampling units can be distributed spatially but other approaches go even further to ensure ‘spatial balance’ prevails.

The general philosophy behind spatially balanced sampling is that locations are randomly selected but guaranteed to be distributed across space, with a view to maximising the spatial independence between the sample sites. This is because designs with some degree of spatial regularity or balance tend to be more efficient (e.g. yield responses that are less variable) for sampling natural resources than those with no spatial structure (Dobbie et al. 2008).

In the context of monitoring large-scale stream networks, spatial balance is a critical requirement of a probability study design, but it is also desirable (if not essential) that the design can accommodate variable site inclusion probabilities and dynamic adjustment of the sample. Some of the common designs, such as SyS and stratified sampling, can produce spatially well-balanced samples but they face limitations for at least one of the other desirable attributes.

The generalised random-tessellation stratified (GRTS) design (Stevens & Olsen 2004) uses a potentially infinite series of nested coherent grids and some function that converts the population from 2-dimensional space into 1-dimensional space, whilst retaining proximity relationships between points in the domain. Dobbie et al. (2008) and Stevens & Olsen (2004) discussed why it is such an attractive design for sampling natural resources including 0-dimensional (e.g. lakes), 1-dimensional (e.g. streams) and 2-dimensional (e.g. bays) water bodies. Lawrence et al. (2015) described a spatially balanced approach to choosing sites for mapping habitats and developing baselines in offshore marine reserves, which is also relevant to other aquatic contexts.
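The core idea of mapping 2-dimensional space to 1-dimensional space while retaining proximity can be sketched in a few lines. The example below is a heavily simplified illustration and not the GRTS algorithm itself: it uses a fixed (non-randomised) quadrant ordering, equal inclusion probabilities and point locations in the unit square, and then takes a systematic sample along the resulting 1-dimensional ordering. All function names are hypothetical.

```python
import random

def quadrant_address(x, y, levels):
    """Base-4 address of a point in [0, 1) x [0, 1) from recursive quadrant splitting."""
    address = 0
    for _ in range(levels):
        x, y = 2 * x, 2 * y
        qx, qy = int(x), int(y)          # which half in each direction (0 or 1)
        x, y = x - qx, y - qy            # rescale to the chosen quadrant
        address = 4 * address + 2 * qy + qx
    return address

def spatially_ordered_systematic_sample(points, n, rng, levels=8):
    """Order points by quadrant address, then take every k-th point from a random start."""
    ordered = sorted(points, key=lambda p: quadrant_address(p[0], p[1], levels))
    step = len(ordered) / n
    start = rng.random() * step
    last = len(ordered) - 1
    return [ordered[min(last, int(start + i * step))] for i in range(n)]

rng = random.Random(2024)
# Hypothetical candidate sites scattered over the unit square.
points = [(rng.random(), rng.random()) for _ in range(500)]
sample = spatially_ordered_systematic_sample(points, n=20, rng=rng)
print(sample[:3])
```

Because points that share the leading digits of their address lie in the same quadrant, a systematic sample along the ordering tends to spread across the domain. Established implementations add randomisation of the quadrant order and support for unequal inclusion probabilities, which this sketch omits.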

Other probability study designs

Some other probability-based study designs are worth briefly introducing.

Double sampling (also known as ‘two-phase sampling’) combines an initial sample of units used for obtaining fairly quick and cheap measurements on the quantities of interest, with a second sample of units used to obtain more accurate and expensive measurements. If the two samples are strongly related, then both sources of data can be reliably collated to provide an improved population estimate. Stevens & Jensen (2007) presented an application of this sampling approach for making wetland assessments. In the first phase, they used geologic structural and stratigraphic information to generate a map of areas with high probability of wetland occurrence. Sample points from the map were used to identify wetlands, and then a second sample was selected to assess the condition within those wetlands.
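One simple way to combine the two sources of data is a ratio estimator, sketched below. It assumes the accurate measurement is roughly proportional to the cheap one. The simulated field-kit and laboratory values, the sample sizes and the function name are hypothetical, and this is not the method used by Stevens & Jensen (2007).

```python
import random
import statistics

def double_sampling_ratio_estimate(cheap_phase1, cheap_phase2, accurate_phase2):
    """Scale the large-sample cheap mean by the accurate/cheap ratio from phase 2."""
    ratio = statistics.mean(accurate_phase2) / statistics.mean(cheap_phase2)
    return ratio * statistics.mean(cheap_phase1)

rng = random.Random(5)
# Hypothetical data: a cheap field-kit reading and an accurate laboratory value per site.
lab = [rng.uniform(5, 15) for _ in range(1000)]
kit = [0.8 * v + rng.gauss(0, 0.5) for v in lab]
phase1 = rng.sample(range(1000), 200)    # cheap measurement at 200 sites
phase2 = rng.sample(phase1, 25)          # accurate measurement at a subsample of 25
estimate = double_sampling_ratio_estimate(
    [kit[i] for i in phase1],
    [kit[i] for i in phase2],
    [lab[i] for i in phase2],
)
print(f"estimate = {estimate:.2f}, true mean = {statistics.mean(lab):.2f}")
```

The stronger the relationship between the cheap and accurate measurements, the more the large first-phase sample improves on using the 25 accurate measurements alone.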

Adaptive sampling is a probability sampling approach for improving sampling efficiency when the population is clustered. Adaptive sampling (Thompson 1990, Thompson & Seber 1996) allows you to adapt a sampling scheme based on information as it is collected. Measurements are taken for a set of sampling units (selected by SRS or otherwise) and if the response for a particular sampling unit meets some prespecified criteria, then further measurements are taken for adjacent sampling units (e.g. at adjacent sites). This sequence of observation and augmentation is continued until no newly recorded measurements meet the criteria of triggering augmentation.
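The observe-and-augment sequence can be sketched on a rectangular grid of sampling units. The example assumes an initial simple random sample of cells, a criterion of the form "count at or above a threshold" and a neighbourhood of the 4 adjacent cells. The grid of counts and the function name are hypothetical.

```python
import random

def adaptive_cluster_sample(grid, n_initial, threshold, rng):
    """Start from a simple random sample of cells; add neighbours of cells meeting the criterion."""
    rows, cols = len(grid), len(grid[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    to_visit = rng.sample(cells, n_initial)
    visited = set()
    while to_visit:                       # stops when no new cell triggers augmentation
        r, c = to_visit.pop()
        if (r, c) in visited:
            continue
        visited.add((r, c))
        if grid[r][c] >= threshold:       # criterion met: queue the adjacent cells
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                    to_visit.append((nr, nc))
    return visited

# Hypothetical counts of a clustered organism on a 6 x 6 grid of sampling units.
counts = [
    [0, 0, 0, 0, 0, 0],
    [0, 7, 9, 0, 0, 0],
    [0, 8, 6, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 5, 0],
    [0, 0, 0, 0, 0, 0],
]
rng = random.Random(8)
sampled = adaptive_cluster_sample(counts, n_initial=6, threshold=5, rng=rng)
print(f"{len(sampled)} cells measured:", sorted(sampled))
```

Because units are added in response to observed values, the ordinary sample mean is generally not appropriate for data collected this way. Thompson (1990) gives estimators designed for adaptive cluster samples.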

Model-based study design

In model-based study designs, the inference about the population is based on an explicit specification of the relationship between the selected sites and the population in terms of a statistical model with the stochastic element embedded in the model process.

Statistical models can be used to derive an optimal sampling scheme by determining the number of sites, their spatial arrangement, and the frequency of visiting them that optimises the information content of the observations.

As indicated in Dobbie et al. (2008), these models are ultimately developed to work out how to choose an optimal design for:

  • prediction at an unknown location, or
  • estimation of covariance (or variogram) parameters.

These objectives are, however, inherently linked in a spatial context because optimal spatial prediction relies on estimating the spatial covariance function.
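To illustrate how a statistical model can be used to compare candidate designs for prediction, the sketch below computes the average simple-kriging prediction variance along a transect for two arrangements of 4 sites. It assumes an exponential covariance function with known sill and range, which is a strong simplification because in practice these parameters must be estimated. The site locations, parameter values and function names are hypothetical.

```python
import math

def exp_cov(d, sill=1.0, corr_range=0.3):
    """Exponential covariance as a function of distance (parameters assumed known)."""
    return sill * math.exp(-d / corr_range)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def mean_prediction_variance(sites, targets):
    """Average simple-kriging variance over target locations on a transect."""
    c = [[exp_cov(abs(s - t)) for t in sites] for s in sites]
    total = 0.0
    for x0 in targets:
        c0 = [exp_cov(abs(s - x0)) for s in sites]
        w = solve(c, c0)                  # kriging weights
        total += exp_cov(0.0) - sum(wi * ci for wi, ci in zip(w, c0))
    return total / len(targets)

targets = [i / 50 for i in range(51)]           # prediction locations on [0, 1]
clustered = [0.40, 0.45, 0.50, 0.55]            # hypothetical design A
spread = [0.125, 0.375, 0.625, 0.875]           # hypothetical design B
print(f"clustered: {mean_prediction_variance(clustered, targets):.3f}")
print(f"spread:    {mean_prediction_variance(spread, targets):.3f}")
```

Under these assumptions the evenly spread design gives the lower average prediction variance. A design aimed at estimating covariance parameters would typically also need some closely spaced pairs of sites, which is one reason the two objectives can pull in different directions.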

Precise inference may be possible from a limited number of samples if the statistical model is accurate. As an example, Diggle & Lophaven (2006) used a model to consider the optimal deletion of salinity monitoring sites from a set of 70 sites in the Kattegat basin, which is a transitional sea between the North Sea and the Baltic Sea.

Disadvantages of model-based study designs

One disadvantage of model-based study design approaches is that reliability of the inference depends on the adequacy of the model to accurately describe the underlying environmental process.

Another disadvantage is that some ecological systems and monitoring objectives may demand development of a complex model to fully capture the processes happening, increasing the potential computational burden in parameter estimation.

For example, Pagendam et al. (2014) developed a data assimilation approach in the Great Barrier Reef to combine catchment process models with monitoring data to improve the estimation of pollutant loads and the quantification of their uncertainty. In such cases, forming a statistical model requires conceptual models, knowledge of the underlying process of interest and sufficient data on that process, which may be unavailable or unknown.

These approaches tend to require a heavy investment of time and expertise but their focus on representing the underlying process as faithfully as possible may improve the quality of the inferences that can be made.

Further details about model-based study design approaches are summarised and referenced in Dobbie et al. (2008).


Cochran WG 1977, Sampling Techniques, 3rd Edition, Wiley, New York.

Diggle P & Lophaven SA 2006, Bayesian Geostatistical Design, Scandinavian Journal of Statistics 33(1): 53–64.

Dobbie MJ, Henderson BL & Stevens DL Jr 2008, Sparse sampling: spatial design for aquatic monitoring, Statistics Surveys 2: 113–153.

Dobbie MJ & Negus P 2013, Addressing statistical and operational challenges in designing large-scale stream condition surveys, Environmental Monitoring and Assessment 185: 7231–7243.

Eberhardt LL 1978, Appraising variability in population studies, Journal of Wildlife Management 42: 207–238.

Kerekes J & Freedman B 1989, Seasonal variation of water chemistry in oligotrophic streams and rivers in Kejimkujik National Park, Nova Scotia, Water, Air, & Soil Pollution 46: 131–144.

Lawrence E, Hayes KR, Lucieer VL, Nichol SL, Dambacher JM, Hill NA, et al. 2015, Mapping habitats and developing baselines in offshore marine reserves with little prior knowledge: a critical evaluation of a new approach, PLoS ONE 10(10): e0141051 doi:10.1371/journal.pone.0141051

Maher WA, Cullen PW & Norris RH 1994, A framework for designing sampling programs, Environmental Monitoring and Assessment 30: 139–162.

Mapstone BD 1995, Scalable decision rules for environmental impact studies: Effect size, Type I, and Type II Errors, Ecological Applications 5: 401–410.

Morin A, Mousseau TA & Roff DA 1987, Accuracy and precision of secondary production estimates, Limnology and Oceanography 32: 1342–1352.

Olsen AR, Stevens DL Jr & White D 1998, Application of global grids in environmental sampling, in: Computing Science and Statistics Proceedings of the 30th Symposium on the Interface 30: 279–284.

Overton WS & Stehman SV 1993, Properties of designs for sampling continuous spatial resources from a triangular grid, Communications in Statistics, Part A – Theory and Methods 22: 2641–2660.

Sarndal CE, Swensson B & Wretman JH 1992, Model Assisted Survey Sampling, Springer, New York.

Stevens DL Jr & Jensen SF 2007, Sample design, execution and analysis for wetland assessment, Wetlands 27(3): 515–523.

Stevens DL Jr & Olsen AR 2004, Spatially balanced sampling of natural resources, Journal of the American Statistical Association 99: 262–278.

Thompson SK 1990, Adaptive cluster sampling, Journal of the American Statistical Association 85: 1050–1059.

Thompson SK 1992, Sampling, Wiley, New York.

Thompson SK & Seber GAF 1996, Adaptive Sampling, Wiley, New York.