variogram.Rd
This function calculates the empirical variogram of multi-dimensional tracking data for visualizing stationary (time-averaged) autocorrelation structure. One of two algorithms is used. The slow \(O(n^2)\) algorithm is based upon Fleming & Calabrese et al (2014), but with interval-weights instead of lag-weights and an iterative algorithm to adjust for calibrated errors. Additional modifications have also been included to accommodate drift in the sampling rate. The fast \(O(n \log n)\) algorithm is based upon the FFT method of Marcotte (1996), with some tweaks to better handle irregularly sampled data. Both methods reduce to the unbiased “method of moments” estimator in the case of evenly scheduled data, even with missing observations, but they produce slightly different outputs for irregularly sampled data.
variogram(data,dt=NULL,fast=TRUE,res=1,CI="Markov",error=FALSE,axes=c("x","y"),
precision=1/8,trace=TRUE)
telemetry
data object of the 2D timeseries data.
Lag bin width. An ordered array will yield a progressive coarsening of the lags. Defaults to the median sampling interval.
Use the interval-weighted algorithm if FALSE
or the FFT algorithm if TRUE
. The slow algorithm outputs a progress bar.
Increase the discretization resolution for irregularly sampled data with res>1
. Decreases bias at the cost of smoothness.
Argument for confidence-interval estimation. Can be "IID"
to consider all unique lags as independent, "Markov"
to consider only non-overlapping lags as independent, or "Gauss"
for an exact calculation (see Details below).
Adjust for the effect of calibrated errors.
Array of axes to calculate an average (isotropic) variogram for.
Fraction of machine precision to target when adjusting for telemetry error (fast=FALSE
with calibrated errors). precision=1/8
returns about 2 decimal digits of precision.
Display a progress bar if fast=FALSE
.
If no dt
is specified, the median sampling interval is used. This is typically a good assumption for most data, even when there are gaps. A dt
coarser than the sampling interval may bias the variogram (particuarly if fast=TRUE
) and so this should be reserved for poor data quality.
For irregularly sampled data, it may be useful to provide an array of time-lag bin widths to progressively coarsen the variogram. I.e., if you made the very bad choice of changing your sampling interval on the fly from dt1
to dt2
, where dt1
\(<\) dt2
, the an appropriate choice would be dt=c(dt1,dt2)
. On the other hand, if your sampling is itself a noisy process, then you might want to introduce larger and larger dt
components as the visual appearance of the variogram breaks down with increasing lags.
Alternatively, you might try the fast=FALSE
option or aggregating multiple individuals with mean.variogram
.
With irregularly sampled data, different size lags must be aggregated together, and with current fast methods there is a tradeoff between bias and smoothness. The default settings produce a relatively smooth estimate, while increasing res
(or setting fast=FALSE
) will produce a less biased estimate, which is very useful for correlogram
.
In conventional variogram regression treatments, all lags are considered as independent (CI="IID"
) for the purposes of confidence-interval estimation, even if they overlap in time. However, in high resolution datasets this will produce vastly underestimated confidence intervals. Therefore, the default CI="Markov"
behavior is to consider only the maximum number of non-overlapping lags in calculating confidence intervals, though this is a crude approximation and is overly conservative at large lags. CI="Gauss"
implements exact confidence intervals under the assumption of a stationary Gaussian process, but this algorithm is \(O(n^2 \log n)\) even when fast=TRUE
.
If fast=FALSE
and the tracking data are calibrated (see uere
), then with error=TRUE
the variogram of the movement process (sans the telemetry-error process) is estimated using an iterative maximum-likelihood esitmator that downweights more erroneous location estimates (Fleming et al, 2020). The variogram is targeted to have precision
fraction of machine precision. If the data are very irregular and location errors are very homoskedastic, then this algorithm can be slow to converge at time lags where there are few data pairs.
If fast=TRUE
and error=TRUE
, then the estimated contribution to the variogram from location error is subtracted on a per lag basis, which is less ideal for heteroskedastic errors.
Returns a variogram object (class variogram) which is a dataframe containing the time-lag, lag
, the semi-variance estimate at that lag, SVF
, and the approximate number of degrees of freedom associated with that semi-variance, DOF
, with which its confidence intervals can be estimated.
D. Marcotte, ``Fast variogram computation with FFT'', Computers and Geosciences 22:10, 1175-1186 (1996) doi:10.1016/S0098-3004(96)00026-X .
C. H. Fleming, J. M. Calabrese, T. Mueller, K.A. Olson, P. Leimgruber, W. F. Fagan, ``From fine-scale foraging to home ranges: A semi-variance approach to identifying movement modes across spatiotemporal scales'', The American Naturalist, 183:5, E154-E167 (2014) doi:10.1086/675504 .
C. H. Fleming et al, ``A comprehensive framework for handling location error in animal tracking data'', bioRxiv (2020) doi:10.1101/2020.06.12.130195 .
Prior to ctmm
v0.3.6, fast=FALSE
used the lag-weighted esitmator of Fleming et al (2014). Lag weights have been abandoned in favor of interval weights, which are less sensitive to sampling irregularity. The same weighting formulas are used, but with dt
instead of the current lag.