Uncertainty and sensitivity analysis — tdm

Quantifies the uncertainty induced in \(SFD\) and \(K\) time series by the variability of the input parameters applied during TDM data processing. In addition, it applies a global sensitivity analysis to quantify the impact of each individual parameter on three relevant outputs derived from \(SFD\) and \(K\), namely: i) the mean daily sum of water use, ii) the variability of maximum daily \(SFD\) or \(K\) values, and iii) the duration of daily sap flow. The function returns the uncertainty and sensitivity indices, together with time series of \(SFD\) and \(K\) giving the mean, standard deviation (\(sd\)) and confidence interval (CI) due to parameter uncertainty. Users should ensure that no gaps are present within the input data and environmental time series.

tdm_uncertain(
  input,
  vpd.input,
  sr.input,
  method = "pd",
  n = 2000,
  zero.end = 8 * 60,
  range.end = 16,
  zero.start = 1 * 60,
  range.start = 16,
  probe.length = 20,
  sw.cor = 32.28,
  sw.sd = 16,
  log.a_mu = 4.085,
  log.a_sd = 0.628,
  b_mu = 1.275,
  b_sd = 0.262,
  max.days_min = 1,
  max.days_max = 7,
  ed.window_min = 8,
  ed.window_max = 16,
  criteria.vpd_min = 0.05,
  criteria.vpd_max = 0.5,
  criteria.sr_mean = 30,
  criteria.sr_range = 30,
  criteria.cv_min = 0.5,
  criteria.cv_max = 1,
  min.sfd = 0.5,
  min.k = 0,
  make.plot = TRUE,
  df = FALSE
)

Arguments

input: An is.trex-compliant object (zoo object, data.frame) of TDM \(\Delta T\) (or \(\Delta V\)) measurements (see Details).
vpd.input: An is.trex-compliant object (zoo object, data.frame) containing a timestamp and a vapour pressure deficit (\(VPD\); in \(kPa\)) column with the same temporal extent and time steps as the input object. This input is required when using the environmental dependent ("ed") method.
sr.input: An is.trex-compliant object (zoo object, data.frame) containing a timestamp and a solar radiation (sr; e.g., global radiation or PAR) column with the same temporal extent and time steps as the input object. This input is required when using the environmental dependent ("ed") method.
method: Character, specifies the \(\Delta T_{max}\) method on which the sensitivity and uncertainty analysis is to be performed (see tdm_dt.max). Only one method can be selected: the pre-dawn ("pd"), moving window ("mw"), double regression ("dr") or environmental dependent ("ed") method (default = "pd").
n: Numeric, specifies the number of times the bootstrap resampling procedure is repeated (default = 2000). Keep in mind that high values increase processing time.
zero.end: Numeric, defines the end of the predawn period. Values should be in minutes (e.g., predawn conditions until 8:00 = 8*60; default = 8*60).
range.end: Numeric, defines the number of time steps for zero.end (the minimum time step of the input) for which an integer sampling range will be defined (default = 16, assuming a 15-min resolution or a 2 hour range around zero.end).
zero.start: Numeric, defines the start of the predawn period. Values should be in minutes (e.g., predawn conditions from 1:00 = 1*60; default = 1*60).
range.start: Numeric, defines the number of time steps for zero.start (the minimum time step of the input) for which an integer sampling range will be defined (default = 16, assuming a 15-min resolution or a 2 hour range around zero.start).
probe.length: Numeric, the length of the TDM probes in mm (see tdm_hw.cor; default = 20 mm).
sw.cor: Numeric, the mean sapwood thickness in mm around which values are sampled (see sw.sd; default = 32.28 mm).
sw.sd: Numeric, the standard deviation for sampling sapwood thickness from a normal distribution (default = 16 mm; defined with a European database on sapwood thickness measurements).
log.a_mu: Numeric, value providing the natural logarithm of the calibration parameter \(a\) (see tdm_cal.sfd; \(SFD = aK^b\)). This value can be obtained from tdm_cal.sfd (see out.param). Default conditions are determined by using all calibration data as described in cal.data (default = 4.085).
log.a_sd: Numeric, the standard deviation of the natural logarithm of the \(a\) parameter (see log.a_mu) used within the calibration curve for calculating \(SFD\) (default = 0.628).
b_mu: Numeric, the value of the calibration parameter \(b\) (see tdm_cal.sfd; \(SFD = aK^b\)). This value can be obtained from tdm_cal.sfd (see out.param). Default conditions are determined by using all calibration data as described in cal.data (default = 1.275).
b_sd: Numeric, the standard deviation of the \(b\) parameter (see b_mu) used within the calibration curve for calculating \(SFD\) (default = 0.262).
max.days_min: Numeric, the minimum value for an integer sampling range of max.days (see tdm_dt.max for the "mw" and "dr" \(\Delta T_{max}\) method). As the "mw" and "dr" methods apply a rolling maximum or mean, the provided value should be an odd number (see tdm_dt.max; default = 1; required for the "mw" and "dr" \(\Delta T_{max}\) method).
max.days_max: Numeric, the maximum value for an integer sampling range of max.days (see tdm_dt.max for the "mw" and "dr" \(\Delta T_{max}\) method). As the "mw" and "dr" methods apply a rolling maximum or mean, the provided value should be an odd number (see tdm_dt.max; default = 7; required for the "mw" and "dr" \(\Delta T_{max}\) method).
ed.window_min: Numeric, the minimum number of time steps for the ed.window parameter (see tdm_dt.max; the minimum time step of the input) for which an integer sampling range will be defined (default = 8, assuming a 15-min resolution or a 2 hour range; required for the "ed" \(\Delta T_{max}\) method).
ed.window_max: Numeric, the maximum number of time steps for the ed.window sampling range (default = 16, assuming a 15-min resolution or a 4 hour range; required for the "ed" \(\Delta T_{max}\) method).
criteria.vpd_min: Numeric, value in \(kPa\) defining the minimum for the fixed sampling range to define the vapour pressure deficit (VPD) threshold to establish zero-flow conditions (default = 0.05 \(kPa\); see tdm_dt.max; required for the "ed" \(\Delta T_{max}\) method).
criteria.vpd_max: Numeric, value in \(kPa\) defining the maximum for the fixed sampling range to define the VPD threshold to establish zero-flow conditions (default = 0.5 \(kPa\); required for the "ed" \(\Delta T_{max}\) method).
criteria.sr_mean: Numeric, value defining the mean sr.input value around which the fixed sampling range for the solar irradiance threshold is established for defining zero-flow conditions (see tdm_dt.max; default = 30 \(W m^{-2}\); required for the "ed" \(\Delta T_{max}\) method).
criteria.sr_range: Numeric, the range (in %) around criteria.sr_mean for establishing the solar irradiance threshold (see tdm_dt.max; default = 30%; required for the "ed" \(\Delta T_{max}\) method).
criteria.cv_min: Numeric, value (in %) defining the minimum value for the fixed sampling range to determine the coefficient of variation (CV) threshold for establishing zero-flow conditions (default = 0.5%; see tdm_dt.max; required for the "ed" \(\Delta T_{max}\) method).
criteria.cv_max: Numeric, value (in %) defining the maximum value for the fixed sampling range to determine the coefficient of variation (CV) threshold for establishing zero-flow conditions (default = 1%; see tdm_dt.max; required for the "ed" \(\Delta T_{max}\) method).
min.sfd: Numeric, defines at which \(SFD\) (\(cm^3 cm^{-2} h^{-1}\)) zero-flow conditions are expected. This parameter is used to define the duration of daily sap flow based on \(SFD\) (default = \(0.5 cm^3 cm^{-2} h^{-1}\)).
min.k: Numeric, defines at which \(K\) (dimensionless, -) zero-flow conditions are expected. This parameter is used to define the duration of daily sap flow based on \(K\) (default = 0).
make.plot: Logical; If TRUE, a plot is generated presenting the sensitivity and uncertainty analyses output (default = TRUE).
df: Logical; If TRUE, output is provided in a data.frame format with a timestamp and a value column. If FALSE, output is provided as a zoo vector object (default = FALSE).
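
The _min/_max and _mu/_sd argument pairs above define the ranges and distributions from which parameter values are drawn. As a rough illustration of what such a draw might look like, the sketch below implements a basic Latin Hypercube design in base R and maps it onto a continuous range, an integer range and normal distributions, using the defaults from the usage block. The helper `lhs_unif()` and all variable names are hypothetical, and the code is not the sampling routine used inside tdm_uncertain().

```r
# Minimal Latin Hypercube Sampling sketch in base R (illustrative only;
# not the sampling code used inside tdm_uncertain()).
set.seed(1)
n <- 2000  # number of replicates, matching the default of `n`

# Hypothetical helper: one stratified uniform draw per replicate. Each of
# the n equal-width strata of (0, 1) is hit exactly once, in random order.
lhs_unif <- function(n) (sample.int(n) - runif(n)) / n

# Continuous range, e.g. criteria.vpd_min to criteria.vpd_max (kPa)
vpd_thr <- qunif(lhs_unif(n), min = 0.05, max = 0.5)

# Integer range, e.g. max.days_min to max.days_max
max_days <- 1 + floor(lhs_unif(n) * (7 - 1 + 1))

# Normal distributions, e.g. sapwood thickness and calibration parameters
# (draws are not truncated here, so implausible values can occur)
sw    <- qnorm(lhs_unif(n), mean = 32.28, sd = 16)
log_a <- qnorm(lhs_unif(n), mean = 4.085, sd = 0.628)
b     <- qnorm(lhs_unif(n), mean = 1.275, sd = 0.262)

# Propagate the calibration draws through SFD = a * K^b for a single K value
K   <- 0.2
sfd <- exp(log_a) * K^b
quantile(sfd, c(0.025, 0.5, 0.975))
```

Because every stratum is sampled once, the draws cover each parameter range evenly even for modest values of n, which is the property the Details section refers to as avoiding the need for more replicates.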

Value

A named list of zoo or data.frame objects in the appropriate format for other functionalities. Items include:

output.data: data.frame containing uncertainty and sensitivity indices for \(SFD\) and \(K\) and the included parameters. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"ci.max"] and lower [,"ci.min"] 95% confidence interval.
output.sfd: zoo object or data.frame with the \(SFD\) time series obtained from the bootstrap resampling. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"CIup"] and lower [,"CIlo"] 95% confidence interval.
output.k: zoo object or data.frame with the \(K\) time series obtained from the bootstrap resampling. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"ci.max"] and lower [,"ci.min"] 95% confidence interval.
param: a data.frame with an overview of the selected parameters used within the tdm_uncertain() function.
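
To make the structure of the time-series outputs more concrete, the sketch below summarises a matrix of simulated series into a mean, standard deviation and 95% interval per time step. The simulated values are random numbers, the matrix layout is hypothetical, and the percentile-based interval is an assumption; the function may compute its confidence interval differently.

```r
# Illustrative only: summarise a matrix of simulated series
# (rows = time steps, columns = replicates) per time step.
set.seed(42)
n_steps <- 96    # one day at 15-min resolution
n_rep   <- 200   # hypothetical number of replicates
sims <- matrix(rnorm(n_steps * n_rep, mean = 10, sd = 2),
               nrow = n_steps, ncol = n_rep)

summary_ts <- data.frame(
  mean = rowMeans(sims),
  sd   = apply(sims, 1, sd),
  CIlo = apply(sims, 1, quantile, probs = 0.025),  # assumed percentile interval
  CIup = apply(sims, 1, quantile, probs = 0.975)
)
head(summary_ts)
```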

Details

Uncertainty and sensitivity analysis can be performed on TDM \(\Delta T\) (or \(\Delta V\)) measurements. The function applies a Monte Carlo simulation approach (number of repetitions defined by n) to determine the variability in relevant output variables (defined as uncertainty) and quantifies the contribution of each parameter to this uncertainty (defined as sensitivity). To generate variability in the selected input parameters, Latin Hypercube Sampling is performed with a default or user-defined range of parameter values per \(\Delta T_{max}\) method (see tdm_dt.max()). The sampling algorithm draws from several types of sampling distribution: an integer sampling range (for zero.start, zero.end, max.days, and ed.window), a continuous sampling range (criteria for sr, vpd and cv), and a normal distribution (for sw.cor and calibration parameters \(a\) and \(b\)). This approach ensures near-random sampling across different types of sampling distributions, while avoiding the need to increase the number of replicates (which increases computation time). Within this algorithm no within-day interpolations are made between the \(\Delta T_{max}\) points (see tdm_dt.max, interpolate = FALSE).

To apply this approach one needs to: i) select the output of interest, ii) identify the relevant input parameters, and iii) determine the parameter range and distribution. For a given time series, three output variables are considered relevant, each calculated as the mean over the entire time series: i) the mean daily sum of water use (or Sum, expressed in \(cm^3 cm^{-2} d^{-1}\) for \(SFD\) and unitless for \(K\)), ii) the variability of maximum \(SFD\) or \(K\) values (or CV, expressed as the coefficient of variation in %, as this alters climate response correlations), and iii) the duration of daily sap flow based on \(SFD\) or \(K\) (or Duration, expressed in hours per day, dependent on a threshold; see min.sfd and min.k). A minimum threshold to define zero-flow \(SFD\) or \(K\) is required for the duration calculation because small variations in night-time \(SFD\) or \(K\) are present. All data-processing steps (functions starting with "tdm_") are incorporated within the function, excluding tdm_damp() due to the need for detailed visual inspection and significantly longer computation time.
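
The three output variables (Sum, CV and Duration) can be illustrated on a small synthetic \(SFD\) series. The sketch below is an assumption about how such summaries could be computed in base R, with a made-up sine-shaped daily course and hypothetical variable names; it is not taken from the function's source.

```r
# Illustrative only: the three summary outputs for a synthetic SFD series.
step_min <- 15   # time step in minutes
days     <- 3
tod <- rep(seq(0, 24 - step_min / 60, by = step_min / 60), days)  # hour of day
day <- rep(seq_len(days), each = 24 * 60 / step_min)

# Synthetic SFD (cm^3 cm^-2 h^-1): zero at night, sine-shaped between
# 06:00 and 18:00, with a different midday peak on each day
peak <- c(8, 10, 12)[day]
sfd  <- ifelse(tod >= 6 & tod < 18, peak * sin(pi * (tod - 6) / 12), 0)

min_sfd <- 0.5   # zero-flow threshold, matching the default of min.sfd

# i) Sum: mean daily sum of water use (cm^3 cm^-2 d^-1)
daily_sum <- tapply(sfd * step_min / 60, day, sum)
Sum <- mean(daily_sum)

# ii) CV: coefficient of variation (%) of the daily maxima
daily_max <- tapply(sfd, day, max)
CV <- sd(daily_max) / mean(daily_max) * 100

# iii) Duration: mean hours per day with SFD above the threshold
daily_dur <- tapply(sfd > min_sfd, day, sum) * step_min / 60
Duration <- mean(daily_dur)

c(Sum = Sum, CV = CV, Duration = Duration)
```

Raising `min_sfd` shortens Duration while leaving Sum and CV unchanged, which shows why the threshold needs to be reported alongside the results.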

For the sensitivity analysis the total sensitivity indices are determined according to the strategy originally proposed by Sobol' (1993), considering the improvements applied within the sensitivity R package. The method proposed by Sobol' (1993) is a variance-based sensitivity analysis, where sensitivity indices (dimensionless, from 0 to 1) indicate the partial variance contributed by a given parameter relative to the total output variance (e.g., Pappas et al. 2013). This global sensitivity analysis facilitates the identification of key parameters for data-processing improvement and highlights methodological limitations. Users should keep in mind that parameter ranges are a critical component of any sensitivity analysis and should be carefully assessed and clearly reported for each case and analytical purpose. Moreover, it is advised to run this function on one growing season of input data to reduce processing time.
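
As a rough illustration of a variance-based total sensitivity index, the sketch below estimates total-order Sobol' indices for a toy two-parameter model using the Jansen estimator, written in base R. The toy model, sample sizes and variable names are assumptions chosen so that the result can be checked analytically; the sketch does not reproduce the estimators of the sensitivity package or the code inside tdm_uncertain().

```r
# Illustrative only: total-order Sobol' indices with the Jansen estimator
# for a toy model, written in base R.
set.seed(123)
N <- 5000   # base sample size
k <- 2      # number of parameters

# Toy model: additive in two parameters, so the indices are known analytically
model <- function(X) X[, 1] + 2 * X[, 2]

# Two independent sample matrices with uniform(0, 1) inputs
A <- matrix(runif(N * k), ncol = k)
B <- matrix(runif(N * k), ncol = k)

yA <- model(A)
total_index <- numeric(k)
for (i in seq_len(k)) {
  ABi <- A
  ABi[, i] <- B[, i]   # resample only parameter i, keep the others fixed
  yABi <- model(ABi)
  # Jansen estimator: E[(f(A) - f(AB_i))^2] / (2 * Var(Y))
  total_index[i] <- mean((yA - yABi)^2) / (2 * var(yA))
}
round(total_index, 2)
```

For this additive model the analytical total indices are 0.2 and 0.8 (variances of 1/12 and 4/12 out of a total of 5/12), so the estimates should land close to those values and move closer as N grows.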

References

Sobol' I. 1993. Sensitivity analysis for nonlinear mathematical models. Math. Model Comput. Exp. 1:407-414

Pappas C, Fatichi S, Leuzinger S, Wolf A, Burlando P. 2013. Sensitivity analysis of a process-based ecosystem model: Pinpointing parameterization and structural issues. Journal of Geophysical Research 118:505-528 doi:10.1002/jgrg.20035

Examples

if (FALSE) {
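# perform an uncertainty and sensitivity analysis on "pd" data processing
raw   <- example.data(type = "doy")
# check the raw data and convert it into an is.trex-compliant object
input <- is.trex(raw, tz = "GMT", time.format = "%H:%M",
                 solar.time = TRUE, long.deg = 7.7459, ref.add = FALSE, df = FALSE)
# standardise to 15-min time steps for one growing season
input <- dt.steps(input, time.int = 15, start = "2013-04-01 00:00",
                  end = "2013-11-01 00:00", max.gap = 180, decimals = 15)
# run the analysis with user-defined sapwood and calibration parameters
output <- tdm_uncertain(input, probe.length = 20, method = "pd",
                        n = 2000, sw.cor = 32.28, sw.sd = 16, log.a_mu = 3.792436,
                        log.a_sd = 0.4448937, b_mu = 1.177099, b_sd = 0.3083603,
                        make.plot = TRUE)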
}