Calculate O-statistics (community-level pairwise niche overlap statistics)

This is the primary function in the Ostats package. It calculates O-statistics by finding the trait density overlap among all pairs of species in each community and taking the mean or median. It then optionally evaluates the O-statistics against a local null model. Each trait is analyzed separately.

Ostats(
  traits,
  plots,
  sp,
  discrete = FALSE,
  circular = FALSE,
  output = "median",
  weight_type = "hmean",
  run_null_model = TRUE,
  nperm = 99,
  nullqs = c(0.025, 0.975),
  shuffle_weights = FALSE,
  swap_means = FALSE,
  random_seed = NULL,
  unique_values = NULL,
  circular_args = list(),
  density_args = list(),
  verbose = FALSE
)

Arguments

traits: a numeric vector or matrix of trait measurements. The number of elements in the vector or number of rows in the matrix is the number of individuals, and the number of columns of the matrix is the number of traits.
plots: a factor with length equal to nrow(traits) that indicates the community each individual belongs to.
sp: a factor with length equal to nrow(traits) that indicates the taxon of each individual.
discrete: whether trait data take discrete values (TRUE) or continuous values (FALSE). Defaults to FALSE (all traits continuous). A single logical value or a logical vector with length equal to the number of columns in traits. See details below.
circular: whether trait data are circular (e.g., hours or angles). Defaults to FALSE (all traits non-circular). A single logical value or a logical vector with length equal to the number of columns in traits. See details below.
output: specifies whether median or mean is calculated. Default "median".
weight_type: specifies weights to be used to calculate the median or mean. Default "hmean" (harmonic mean), meaning each pair of species is weighted by the harmonic mean of abundances.
run_null_model: whether to run a null model (if TRUE) and evaluate the O-statistics against it, or simply return the raw O-statistics (if FALSE). Defaults to TRUE.
nperm: the number of null model permutations to generate. Defaults to 99.
nullqs: numeric vector of probabilities with values in [0,1] to set effect size quantiles. Defaults to c(0.025, 0.975).
shuffle_weights: If TRUE, shuffle weights given to pairwise overlaps within a community when generating null models.
swap_means: If TRUE, swap means of body sizes within a community when generating null models.
random_seed: optional random seed supplied by the user to make null model output reproducible. If no seed is supplied, a warning is issued and a random seed is generated based on the local time.
unique_values: Vector of all possible discrete values that traits can take. Only used if discrete = TRUE and circular = TRUE.
circular_args: optional list of additional arguments to pass to circular. Only used if circular = TRUE and discrete = FALSE. Note that continuous circular data must be provided in radian units.
density_args: additional arguments to pass to density, such as bw, n, or adjust. If none are provided, default values are used.
verbose: If TRUE, progress messages are displayed. Defaults to FALSE.

Value

The function returns a list containing four objects:

overlaps_norm: a matrix showing the O-statistic for each trait and each community, with the area under all density functions normalized to 1.
overlaps_unnorm: a matrix showing O-stats calculated with the area under all density functions proportional to the number of observations in that group.
overlaps_norm_ses: List of matrices of effect size statistics against a null model with the area under all density functions normalized to 1. ses contains the effect sizes (z-scores), ses_lower contains the effect size lower critical values for significance at the level determined by nullqs, and ses_upper contains the upper critical values. raw_lower and raw_upper are the critical values in raw units ranging from 0 to 1.
overlaps_unnorm_ses: List of matrices of effect size statistics against a null model with the area under all density functions proportional to the number of observations in that group. Elements are as in overlaps_norm_ses.

Details

This function calculates overlap statistics and optionally evaluates them against a local null model. By default, it calculates the median of pairwise overlaps in each community, with each species pair weighted by the harmonic mean of the abundances of its two species. Two results are produced: one normalizes the area under all density functions to 1, and the other makes the area under each density function proportional to the number of observations in that group.
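The harmonic mean weighting can be illustrated with a minimal base R sketch. The abundances and overlap values are hypothetical, and the sketch uses the weighted mean (output = "mean") because base R has no weighted median function; it is not the package's internal code.

```r
# Hypothetical abundances for three species in one community
abund <- c(sp_a = 10, sp_b = 40, sp_c = 20)

# All unordered species pairs, one pair per column: (a,b), (a,c), (b,c)
pairs <- combn(names(abund), 2)

# Harmonic mean of the two abundances in each pair: 2 / (1/n1 + 1/n2)
hmean_w <- apply(pairs, 2, function(p) 2 / sum(1 / abund[p]))

# Hypothetical pairwise overlap values, in the same order as the pairs
overlaps <- c(0.60, 0.30, 0.45)

# Community-level statistic: weighted mean of the pairwise overlaps
o_stat_mean <- weighted.mean(overlaps, hmean_w)
```

The harmonic mean is dominated by the smaller of the two abundances, so pairs that include a rare species contribute less to the community-level value.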

If discrete = FALSE, continuous kernel density functions are estimated for each species in each community. If discrete = TRUE, discrete functions (histograms) are estimated instead.
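For the continuous case, the overlap between two species can be sketched in base R as the area under the pointwise minimum of two kernel density estimates. The simulated data, the padding value, and the Riemann-sum integration are all assumptions made for illustration; the package's own grid and integration choices may differ.

```r
set.seed(1)
# Hypothetical log10 body weights for two species in one community
x1 <- rnorm(50, mean = 1.2, sd = 0.1)
x2 <- rnorm(50, mean = 1.4, sd = 0.1)

# Evaluate both densities on one shared grid so they can be compared pointwise
rng <- range(c(x1, x2))
pad <- 0.5  # arbitrary padding so the density tails are not cut off
d1 <- density(x1, from = rng[1] - pad, to = rng[2] + pad, n = 512)
d2 <- density(x2, from = rng[1] - pad, to = rng[2] + pad, n = 512)

# Area under the pointwise minimum, approximated by a Riemann sum
dx <- d1$x[2] - d1$x[1]
overlap <- sum(pmin(d1$y, d2$y)) * dx
```

Because each density integrates to approximately 1 here, this corresponds to the normalized variant of the overlap, which lies between 0 (no overlap) and 1 (identical distributions).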

If circular = TRUE and discrete = FALSE, the function circular is used to convert each column of traits to an object of class circular. Unless additional arguments about input data type are specified, it is assumed that the circular input data are in radian units (0 to 2*pi).
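Since continuous circular data are assumed to be in radians, data recorded in degrees or hours can be converted before being passed in as traits. The helper names below are hypothetical and not part of the package.

```r
# Hypothetical helpers for converting common circular units to radians
deg_to_rad <- function(deg) deg * pi / 180
hours_to_rad <- function(h) h * 2 * pi / 24

# Compass bearings in degrees and clock times in hours
aspect_rad <- deg_to_rad(c(0, 90, 180, 270))
activity_rad <- hours_to_rad(c(0, 6, 12, 18))
```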

If circular = TRUE and discrete = TRUE, data will be interpreted as discrete values on a circular scale. For example, data might be integer values representing hours and ranging from 0 to 23.
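For discrete hourly data, a simple way to picture the overlap is to tabulate each species over the same set of possible values and sum the bin-wise minimum of the two proportion vectors. The activity times below are hypothetical, and this bin-wise comparison is an illustrative assumption rather than the package's exact procedure.

```r
# All possible discrete values on the circular scale (hours of the day)
unique_values <- 0:23

# Hypothetical activity times (hour of capture) for two species
hours_sp1 <- c(22, 23, 23, 0, 0, 1, 1, 2)
hours_sp2 <- c(0, 1, 1, 2, 2, 3, 3, 4)

# Proportion of observations in each hour, with empty hours kept as zeros
p1 <- as.numeric(table(factor(hours_sp1, levels = unique_values))) / length(hours_sp1)
p2 <- as.numeric(table(factor(hours_sp2, levels = unique_values))) / length(hours_sp2)

# Overlap is the shared proportion across hours
overlap <- sum(pmin(p1, p2))
```

Supplying the full set of possible values keeps the bins aligned between species even when some hours have no observations.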

If run_null_model is TRUE, the O-statistics are evaluated relative to a null model. When both shuffle_weights and swap_means are FALSE, null communities are generated by randomly assigning a taxon that is present in the community to each individual. If shuffle_weights is TRUE, species abundances are also randomly assigned to each species to weight the O-statistic for each null community. If swap_means is TRUE, instead of sampling individuals randomly, species means are sampled randomly among species, keeping the deviation of each individual from its species mean the same. After the null communities are generated, O-stats are calculated for each null community to compare with the observed O-stat.
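The two randomization schemes can be sketched in base R as follows. The data are hypothetical, and the first part assumes that taxa are assigned by permuting the existing labels within each community (which keeps species abundances fixed); the package may draw the labels differently.

```r
set.seed(42)
# Hypothetical individuals: species identity and community membership
sp <- factor(c("a", "a", "a", "b", "b", "c", "c", "c", "c"))
plots <- factor(c(rep("site1", 5), rep("site2", 4)))

# Default scheme (assumed here to be a permutation): shuffle species
# labels among individuals within each community
sp_null <- sp
for (p in levels(plots)) {
  idx <- which(plots == p)
  sp_null[idx] <- sp[idx][sample.int(length(idx))]
}

# swap_means scheme, shown for site1 only: permute species means among
# species while keeping each individual's deviation from its species mean
trait1 <- c(1.0, 1.2, 1.1, 2.0, 2.2)
sp1 <- droplevels(sp[plots == "site1"])
sp_means <- vapply(split(trait1, sp1), mean, numeric(1))
dev <- trait1 - sp_means[as.character(sp1)]
swapped <- setNames(sample(sp_means), names(sp_means))
trait_null <- as.numeric(swapped[as.character(sp1)] + dev)
```

In both cases the O-statistic would then be recomputed on the randomized data, once per permutation, to build the null distribution.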

Effect size statistics are calculated by z-transforming the O-statistics using the mean and standard deviation of the null distribution.
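As a minimal sketch of this step, the z-transformation can be written in base R as below. The observed value and the null values are hypothetical, and the assumption that the critical values are the nullqs quantiles of the null distribution, z-transformed in the same way, is made for illustration only.

```r
set.seed(7)
# Hypothetical observed O-statistic and 99 null values for one community
obs <- 0.35
null_vals <- runif(99, min = 0.4, max = 0.8)

null_mean <- mean(null_vals)
null_sd <- sd(null_vals)

# Standardized effect size (z-score) of the observed value
ses <- (obs - null_mean) / null_sd

# Critical values in raw units and on the effect size scale
nullqs <- c(0.025, 0.975)
raw_crit <- quantile(null_vals, probs = nullqs, names = FALSE)
ses_crit <- (raw_crit - null_mean) / null_sd
```

An effect size below the lower critical value indicates less overlap than expected under the null model, and one above the upper critical value indicates more.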

References

Read, Q. D. et al. Among-species overlap in rodent body size distributions predicts species richness along a temperature gradient. Ecography 41, 1718-1727 (2018).

Author

Quentin D. Read, John M. Grady, Arya Y. Yue, Isadora Fluck E., Ben Baiser, Angela Strecker, Phoebe L. Zarnetske, and Sydne Record

Examples

# Overlap statistics for body weights of small mammals at NEON sites
library(Ostats)

# Keep only the two sites of interest and drop individuals with missing weight
dat <- small_mammal_data[small_mammal_data$siteID %in% c('HARV','JORN'), ]
dat <- dat[!is.na(dat$weight), ]
dat$log_weight <- log10(dat$weight)

# Run O-stats on the data with only a few null model iterations
Ostats_example <- Ostats(traits = as.matrix(dat[,'log_weight']),
                   sp = factor(dat$taxonID),
                   plots = factor(dat$siteID),
                   nperm = 10)
#> Note: argument random_seed was not supplied; setting seed to 1063
#> Note: species abundances differ. Consider sampling equivalent numbers of individuals per species.