Title: | Tests for Rotational Symmetry on the Hypersphere |
---|---|
Description: | Implementation of the tests for rotational symmetry on the hypersphere proposed in García-Portugués, Paindaveine and Verdebout (2020) <doi:10.1080/01621459.2019.1665527>. The package also implements the proposed distributions on the hypersphere, based on the tangent-normal decomposition, and allows for the replication of the data application considered in the paper. |
Authors: | Eduardo García-Portugués [aut, cre]
|
Maintainer: | Eduardo García-Portugués <[email protected]> |
License: | GPL-3 |
Version: | 1.1.5 |
Built: | 2025-03-12 20:27:04 UTC |
Source: | https://github.com/egarpor/rotasym |
Implementation of the tests for rotational symmetry on the hypersphere proposed in García-Portugués, Paindaveine and Verdebout (2020) <doi:10.1080/01621459.2019.1665527>. The package implements the proposed distributions on the hypersphere based on the tangent-normal decomposition. It also allows for the replication of the data application considered in the paper.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
Density and simulation of the Angular Central Gaussian (ACG)
distribution on
,
. The density at
,
, is given by
where is the shape matrix, a
symmetric and positive definite matrix, and
is the surface area of
.
d_ACG(x, Lambda, log = FALSE) c_ACG(p, Lambda, log = FALSE) r_ACG(n, Lambda)
d_ACG(x, Lambda, log = FALSE) c_ACG(p, Lambda, log = FALSE) r_ACG(n, Lambda)
x |
locations in |
Lambda |
the shape matrix |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
p |
dimension of the ambient space |
n |
sample size, a positive integer. |
Due to the projection of the ACG, the shape matrix
is only identified up to a constant,
that is,
and
give the same ACG distribution.
Usually,
is normalized to have trace
equal to
.
c_ACG
is vectorized on p
. If , then the ACG is the
uniform distribution in the set
.
Depending on the function:
d_ACG
: a vector of length nx
or 1
with the
evaluated density at x
.
r_ACG
: a matrix of size c(n, p)
with the random sample.
c_ACG
: the normalizing constant.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
Tyler, D. E. (1987). Statistical analysis for the angular central Gaussian distribution on the sphere. Biometrika, 74(3):579–589. doi:10.1093/biomet/74.3.579
# Simulation and density evaluation for p = 2 Lambda <- diag(c(5, 1)) n <- 1e3 x <- r_ACG(n = n, Lambda = Lambda) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x, pch = 16, col = col[rank(d_ACG(x = x, Lambda = Lambda))]) # Simulation and density evaluation for p = 3 Lambda <- rbind(c(5, 1, 0.5), c(1, 2, 1), c(0.5, 1, 1)) x <- r_ACG(n = n, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(d_ACG(x = x, Lambda = Lambda))], size = 5) }
# Simulation and density evaluation for p = 2 Lambda <- diag(c(5, 1)) n <- 1e3 x <- r_ACG(n = n, Lambda = Lambda) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x, pch = 16, col = col[rank(d_ACG(x = x, Lambda = Lambda))]) # Simulation and density evaluation for p = 3 Lambda <- rbind(c(5, 1, 0.5), c(1, 2, 1), c(0.5, 1, 1)) x <- r_ACG(n = n, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(d_ACG(x = x, Lambda = Lambda))], size = 5) }
Computation of the cosines and multivariate signs of the
hyperspherical sample about a location
,
for
with
.
The cosines are defined as
whereas the multivariate signs are the vectors
defined as
The projection matrix
is a
semi-orthogonal matrix that satisfies
where is the identity matrix of dimension
.
signs(X, theta, Gamma = NULL, check_X = FALSE) cosines(X, theta, check_X = FALSE) Gamma_theta(theta, eig = FALSE)
signs(X, theta, Gamma = NULL, check_X = FALSE) cosines(X, theta, check_X = FALSE) Gamma_theta(theta, eig = FALSE)
X |
hyperspherical data, a matrix of size |
theta |
a unit-norm vector of length |
Gamma |
output from |
check_X |
whether to check the unit norms on the rows of |
eig |
whether |
Note that the projection matrix
is not
unique. In particular, any completion of
to an orthonormal basis
gives a set of
orthonormal
-vectors
that conform the columns of
. If
eig = FALSE
, this approach is employed by rotating the canonical
completion of ,
, by the rotation matrix that rotates
to
:
If eig = TRUE
, then a much more expensive
eigendecomposition of is performed for
determining
.
If signs
and cosines
are called with X
without unit
norms in the rows, then the results will be spurious. Setting
check_X = TRUE
prevents this from happening.
Depending on the function:
cosines
: a vector of length n
with the cosines of
X
.
signs
: a matrix of size c(n, p - 1)
with the
multivariate signs of X
.
Gamma_theta
: a projection matrix
of size
c(p, p - 1)
.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
# Gamma_theta theta <- c(0, 1) Gamma_theta(theta = theta) # Signs and cosines for p = 2 L <- rbind(c(1, 0.5), c(0.5, 1)) X <- r_ACG(n = 1e3, Lambda = L) par(mfrow = c(1, 2)) plot(signs(X = X, theta = theta), main = "Signs", xlab = expression(x[1]), ylab = expression(x[2])) hist(cosines(X = X, theta = theta), prob = TRUE, main = "Cosines", xlab = expression(x * "'" * theta)) # Signs and cosines for p = 3 L <- rbind(c(2, 0.25, 0.25), c(0.25, 0.5, 0.25), c(0.25, 0.25, 0.5)) X <- r_ACG(n = 1e3, Lambda = L) par(mfrow = c(1, 2)) theta <- c(0, 1, 0) plot(signs(X = X, theta = theta), main = "Signs", xlab = expression(x[1]), ylab = expression(x[2])) hist(cosines(X = X, theta = theta), prob = TRUE, main = "Cosines", xlab = expression(x * "'" * theta))
# Gamma_theta theta <- c(0, 1) Gamma_theta(theta = theta) # Signs and cosines for p = 2 L <- rbind(c(1, 0.5), c(0.5, 1)) X <- r_ACG(n = 1e3, Lambda = L) par(mfrow = c(1, 2)) plot(signs(X = X, theta = theta), main = "Signs", xlab = expression(x[1]), ylab = expression(x[2])) hist(cosines(X = X, theta = theta), prob = TRUE, main = "Cosines", xlab = expression(x * "'" * theta)) # Signs and cosines for p = 3 L <- rbind(c(2, 0.25, 0.25), c(0.25, 0.5, 0.25), c(0.25, 0.25, 0.5)) X <- r_ACG(n = 1e3, Lambda = L) par(mfrow = c(1, 2)) theta <- c(0, 1, 0) plot(signs(X = X, theta = theta), main = "Signs", xlab = expression(x[1]), ylab = expression(x[2])) hist(cosines(X = X, theta = theta), prob = TRUE, main = "Cosines", xlab = expression(x * "'" * theta))
Estimation of the axis of rotational symmetry
of a rotational symmetric unit-norm
random vector
in
,
, from a
hyperspherical sample
.
spherical_mean(data) spherical_loc_PCA(data)
spherical_mean(data) spherical_loc_PCA(data)
data |
hyperspherical data, a matrix of size |
The spherical_mean
estimator computes the sample mean of
and normalizes it
by its norm (if the norm is different from zero). It estimates consistently
for rotational symmetric models based on
angular functions
that are
monotone increasing.
The estimator in spherical_loc_PCA
is based on the fact that, under
rotational symmetry, the expectation of
is
for certain constants
. Therefore,
is the eigenvector with unique
multiplicity of the expectation of
. Its
use is recommended if the rotationally symmetric data is not unimodal.
A vector of length p
with an estimate for
.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
# Sample from a vMF n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data <- r_vMF(n = n, mu = theta, kappa = 3) theta_mean <- spherical_mean(data) theta_PCA <- spherical_loc_PCA(data) sqrt(sum((theta - theta_mean)^2)) # More efficient sqrt(sum((theta - theta_PCA)^2)) # Sample from a mixture of antipodal vMF's n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data <- rbind(r_vMF(n = n, mu = theta, kappa = 3), r_vMF(n = n, mu = -theta, kappa = 3)) theta_mean <- spherical_mean(data) theta_PCA <- spherical_loc_PCA(data) sqrt(sum((theta - theta_mean)^2)) sqrt(sum((theta - theta_PCA)^2)) # Better suited in this case
# Sample from a vMF n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data <- r_vMF(n = n, mu = theta, kappa = 3) theta_mean <- spherical_mean(data) theta_PCA <- spherical_loc_PCA(data) sqrt(sum((theta - theta_mean)^2)) # More efficient sqrt(sum((theta - theta_PCA)^2)) # Sample from a mixture of antipodal vMF's n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data <- rbind(r_vMF(n = n, mu = theta, kappa = 3), r_vMF(n = n, mu = -theta, kappa = 3)) theta_mean <- spherical_mean(data) theta_PCA <- spherical_loc_PCA(data) sqrt(sum((theta - theta_mean)^2)) sqrt(sum((theta - theta_PCA)^2)) # Better suited in this case
Processed version of the Debrecen Photoheliographic Data (DPD) sunspot catalogue and the revised version of the Greenwich Photoheliographic Results (GPR) sunspot catalogue. The two sources contain the records of sunspots appeared during 1872–2018 (GPR for 1872–1976; DPD for 1974–2018).
Sunspots appear in groups and have a variable lifetime. This dataset has been processed to account only for the births or emergences (first observations) of groups of sunspots.
sunspots_births
sunspots_births
A data frame with 51303 rows and 6 variables:
UTC date, as POSIXct
, of
the first observation of a group of sunspots.
solar cycle in which the group of sunspots was observed.
total whole spot area of the group, measured in millionths of the solar hemisphere.
distance from the center of Sun's disc, measured in units of the solar radius.
mean longitude angle of the
group position.
mean latitude angle of the
group position.
The mean position of the group of sunspots is obtained by a weighted average of the positions of the single sunspots by the whole spot area of the single spots. The areas are corrected to account for foreshortening.
The angles are such their associated Cartesian
coordinates are:
with denoting the north pole.
The DPD data has different states of completeness and quality control. The longest span of "final complete data" (no missing observation days and the data has undergone a systematic quality control) is from 2005 to 2015.
The data has been preprocessed using the following pipeline:
Retrieve data from the GPR and DPD sunspot catalogues.
Omit observations with NA
s in the sunspot positions.
Filter for sunspot groups.
Relabel the NOAA identifier for the sunspot group for records before 1974, prefixing the "GPR" string. Otherwise, very different groups of sunspots from the two catalogues may share the same identifier.
Keep only the first row of each NOAA instance, the first-ever observation of each sunspot group.
The script performing the preprocessing is available at
sunspots-births.R
Data processed by Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout from the original sources.
http://fenyi.solarobs.csfk.mta.hu
Baranyi, T., Győri, L., Ludmány, A. (2016) On-line tools for solar data compiled at the Debrecen observatory and their extensions with the Greenwich sunspot data. Solar Physics, 291(9–10):3081–3102. doi:10.1007/s11207-016-0930-1
Győri, L., Ludmány, A., Baranyi, T. (2019) Comparative analysis of Debrecen sunspot catalogues. Monthly Notices of the Royal Astronomical Society, 465(2):1259–1273. doi:10.1093/mnras/stw2667
# Load data data("sunspots_births") # Transform to Cartesian coordinates sunspots_births$X <- cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta), cos(sunspots_births$phi) * sin(sunspots_births$theta), sin(sunspots_births$phi)) # Plot data associated to the 23rd cycle sunspots_23 <- subset(sunspots_births, cycle == 23) n <- nrow(sunspots_23$X) if (requireNamespace("rgl")) { rgl::plot3d(0, 0, 0, xlim = c(-1, 1), ylim = c(-1, 1), zlim = c(-1, 1), radius = 1, type = "s", col = "lightblue", alpha = 0.25, lit = FALSE) } n_cols <- 100 cuts <- cut(x = sunspots_23$date, include.lowest = TRUE, breaks = quantile(sunspots_23$date, probs = seq(0, 1, l = n_cols + 1))) if (requireNamespace("rgl")) { rgl::points3d(sunspots_23$X, col = viridisLite::viridis(n_cols)[cuts]) } # Spörer's law: sunspots at the beginning of the solar cycle (dark blue # color) tend to appear at higher latitudes, gradually decreasing to the # equator as the solar cycle advances (yellow color) # Estimation of the density of the cosines V <- cosines(X = sunspots_23$X, theta = c(0, 0, 1)) h <- bw.SJ(x = V, method = "dpi") plot(kde <- density(x = V, bw = h, n = 2^13, from = -1, to = 1), col = 1, xlim = c(-1, 1), ylim = c(0, 3), axes = FALSE, main = "", xlab = "Cosines (latitude angles)", lwd = 2) at <- seq(-1, 1, by = 0.25) axis(2); axis(1, at = at) axis(1, at = at, line = 1, tick = FALSE, labels = paste0("(", 90 - round(acos(at) / pi * 180, 1), "º)")) rug(V) legend("topright", legend = c("Full cycle", "Initial 25% cycle", "Final 25% cycle"), lwd = 2, col = c(1, viridisLite::viridis(12)[c(3, 8)])) # Density for the observations within the initial 25% of the cycle part1 <- sunspots_23$date < quantile(sunspots_23$date, 0.25) V1 <- cosines(X = sunspots_23$X[part1, ], theta = c(0, 0, 1)) h1 <- bw.SJ(x = V1, method = "dpi") lines(kde1 <- density(x = V1, bw = h1, n = 2^13, from = -1, to = 1), col = viridisLite::viridis(12)[3], lwd = 2) # Density for the observations within the final 25% of the cycle part2 <- sunspots_23$date > quantile(sunspots_23$date, 0.75) V2 <- cosines(X = sunspots_23$X[part2, ], theta = c(0, 0, 1)) h2 <- bw.SJ(x = V2, method = "dpi") lines(kde2 <- density(x = V2, bw = h2, n = 2^13, from = -1, to = 1), col = viridisLite::viridis(12)[8], lwd = 2) # Computation the level set of a kernel density estimator that contains # at least 1 - alpha of the probability (kde stands for an object # containing the output of density(x = data)) kde_level_set <- function(kde, data, alpha) { # Estimate c from alpha c <- quantile(approx(x = kde$x, y = kde$y, xout = data)$y, probs = alpha) # Begin and end index for the potentially many intervals in the level sets kde_larger_c <- kde$y >= c run_length_kde <- rle(kde_larger_c) begin <- which(diff(kde_larger_c) > 0) + 1 end <- begin + run_length_kde$lengths[run_length_kde$values] - 1 # Return the [a_i, b_i], i = 1, ..., K in the K rows return(cbind(kde$x[begin], kde$x[end])) } # Level set containing the 90% of the probability, in latitude angles 90 - acos(kde_level_set(kde = kde, data = V, alpha = 0.10)) / pi * 180 # Modes (in cosines and latitude angles) modes <- c(kde$x[kde$x < 0][which.max(kde$y[kde$x < 0])], kde$x[kde$x > 0][which.max(kde$y[kde$x > 0])]) 90 - acos(modes) / pi * 180
# Load data data("sunspots_births") # Transform to Cartesian coordinates sunspots_births$X <- cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta), cos(sunspots_births$phi) * sin(sunspots_births$theta), sin(sunspots_births$phi)) # Plot data associated to the 23rd cycle sunspots_23 <- subset(sunspots_births, cycle == 23) n <- nrow(sunspots_23$X) if (requireNamespace("rgl")) { rgl::plot3d(0, 0, 0, xlim = c(-1, 1), ylim = c(-1, 1), zlim = c(-1, 1), radius = 1, type = "s", col = "lightblue", alpha = 0.25, lit = FALSE) } n_cols <- 100 cuts <- cut(x = sunspots_23$date, include.lowest = TRUE, breaks = quantile(sunspots_23$date, probs = seq(0, 1, l = n_cols + 1))) if (requireNamespace("rgl")) { rgl::points3d(sunspots_23$X, col = viridisLite::viridis(n_cols)[cuts]) } # Spörer's law: sunspots at the beginning of the solar cycle (dark blue # color) tend to appear at higher latitudes, gradually decreasing to the # equator as the solar cycle advances (yellow color) # Estimation of the density of the cosines V <- cosines(X = sunspots_23$X, theta = c(0, 0, 1)) h <- bw.SJ(x = V, method = "dpi") plot(kde <- density(x = V, bw = h, n = 2^13, from = -1, to = 1), col = 1, xlim = c(-1, 1), ylim = c(0, 3), axes = FALSE, main = "", xlab = "Cosines (latitude angles)", lwd = 2) at <- seq(-1, 1, by = 0.25) axis(2); axis(1, at = at) axis(1, at = at, line = 1, tick = FALSE, labels = paste0("(", 90 - round(acos(at) / pi * 180, 1), "º)")) rug(V) legend("topright", legend = c("Full cycle", "Initial 25% cycle", "Final 25% cycle"), lwd = 2, col = c(1, viridisLite::viridis(12)[c(3, 8)])) # Density for the observations within the initial 25% of the cycle part1 <- sunspots_23$date < quantile(sunspots_23$date, 0.25) V1 <- cosines(X = sunspots_23$X[part1, ], theta = c(0, 0, 1)) h1 <- bw.SJ(x = V1, method = "dpi") lines(kde1 <- density(x = V1, bw = h1, n = 2^13, from = -1, to = 1), col = viridisLite::viridis(12)[3], lwd = 2) # Density for the observations within the final 25% of the cycle part2 <- sunspots_23$date > quantile(sunspots_23$date, 0.75) V2 <- cosines(X = sunspots_23$X[part2, ], theta = c(0, 0, 1)) h2 <- bw.SJ(x = V2, method = "dpi") lines(kde2 <- density(x = V2, bw = h2, n = 2^13, from = -1, to = 1), col = viridisLite::viridis(12)[8], lwd = 2) # Computation the level set of a kernel density estimator that contains # at least 1 - alpha of the probability (kde stands for an object # containing the output of density(x = data)) kde_level_set <- function(kde, data, alpha) { # Estimate c from alpha c <- quantile(approx(x = kde$x, y = kde$y, xout = data)$y, probs = alpha) # Begin and end index for the potentially many intervals in the level sets kde_larger_c <- kde$y >= c run_length_kde <- rle(kde_larger_c) begin <- which(diff(kde_larger_c) > 0) + 1 end <- begin + run_length_kde$lengths[run_length_kde$values] - 1 # Return the [a_i, b_i], i = 1, ..., K in the K rows return(cbind(kde$x[begin], kde$x[end])) } # Level set containing the 90% of the probability, in latitude angles 90 - acos(kde_level_set(kde = kde, data = V, alpha = 0.10)) / pi * 180 # Modes (in cosines and latitude angles) modes <- c(kde$x[kde$x < 0][which.max(kde$y[kde$x < 0])], kde$x[kde$x > 0][which.max(kde$y[kde$x > 0])]) 90 - acos(modes) / pi * 180
Density and simulation of a distribution on
,
, obtained by the
tangent-normal decomposition. The tangent-normal decomposition of
the random vector
is
where is a
random variable in
(the cosines of
) and
is a random vector in
(the multivariate signs of
)
and
is the
matrix computed by
Gamma_theta
.
The tangent-normal decomposition can be employed for constructing
distributions for that arise for certain choices of
and
. If
and
are independent, then simulation from
is straightforward using the tangent-normal
decomposition. Also, the density of
at
,
, is readily computed as
where ,
,
is the density of
,
and
is the density
of
for an angular function
with normalizing constant
.
is the surface area of
.
d_tang_norm(x, theta, g_scaled, d_V, d_U, log = FALSE) r_tang_norm(n, theta, r_U, r_V)
d_tang_norm(x, theta, g_scaled, d_V, d_U, log = FALSE) r_tang_norm(n, theta, r_U, r_V)
x |
locations in |
theta |
a unit norm vector of size |
g_scaled |
the scaled angular density |
d_V |
the density |
d_U |
the density |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
n |
sample size, a positive integer. |
r_U |
a function for simulating |
r_V |
a function for simulating |
Either g_scaled
or d_V
can be supplied to d_tang_norm
(the rest of the arguments are compulsory). One possible choice for
g_scaled
is g_vMF
with scaled = TRUE
. Another
possible choice is the angular function , normalized by
its normalizing constant
(see examples).
This angular function makes
to be distributed as a
.
The normalizing constants and densities are computed through log-scales for numerical accuracy.
Depending on the function:
d_tang_norm
: a vector of length nx
or 1
with the evaluated density at x
.
r_tang_norm
: a matrix of size c(n, p)
with the
random sample.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
Gamma_theta
, signs
,
tangent-elliptical
, tangent-vMF
,
vMF
.
## Simulation and density evaluation for p = 2 # Parameters n <- 1e3 p <- 2 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa_V <- 2 kappa_U <- 0.1 # The vMF scaled angular function g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Cosine density for the vMF distribution d_V <- function(v, log) { log_dens <- g_scaled(v, log = log) + (p - 3)/2 * log(1 - v^2) switch(log + 1, exp(log_dens), log_dens) } # Multivariate signs density based on a vMF d_U <- function(x, log) d_vMF(x = x, mu = mu, kappa = kappa_U, log = log) # Simulation functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) r_U <- function(n) r_vMF(n = n, mu = mu, kappa = kappa_U) # Sample and color according to density x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, d_U = d_U) # dens <- d_tang_norm(x = x, theta = theta, d_V = d_V, d_U = d_U) # The same plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa_V <- 2 kappa_U <- 2 # Sample and color according to density x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, d_U = d_U) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Cosine density d_V <- function(v, log) { log_dens <- w_p(p = p - 1, log = TRUE) + g_scaled(t = v, log = TRUE) + (0.5 * (p - 3)) * log(1 - v^2) switch(log + 1, exp(log_dens), log_dens) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density r_U <- function(n) r_unif_sphere(n = n, p = p - 1) x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, d_V = d_V, d_U = d_unif_sphere) # dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, # d_U = d_unif_sphere) # The same if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
## Simulation and density evaluation for p = 2 # Parameters n <- 1e3 p <- 2 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa_V <- 2 kappa_U <- 0.1 # The vMF scaled angular function g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Cosine density for the vMF distribution d_V <- function(v, log) { log_dens <- g_scaled(v, log = log) + (p - 3)/2 * log(1 - v^2) switch(log + 1, exp(log_dens), log_dens) } # Multivariate signs density based on a vMF d_U <- function(x, log) d_vMF(x = x, mu = mu, kappa = kappa_U, log = log) # Simulation functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) r_U <- function(n) r_vMF(n = n, mu = mu, kappa = kappa_U) # Sample and color according to density x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, d_U = d_U) # dens <- d_tang_norm(x = x, theta = theta, d_V = d_V, d_U = d_U) # The same plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa_V <- 2 kappa_U <- 2 # Sample and color according to density x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, d_U = d_U) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Cosine density d_V <- function(v, log) { log_dens <- w_p(p = p - 1, log = TRUE) + g_scaled(t = v, log = TRUE) + (0.5 * (p - 3)) * log(1 - v^2) switch(log + 1, exp(log_dens), log_dens) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density r_U <- function(n) r_unif_sphere(n = n, p = p - 1) x <- r_tang_norm(n = n, theta = theta, r_V = r_V, r_U = r_U) col <- viridisLite::viridis(n) dens <- d_tang_norm(x = x, theta = theta, d_V = d_V, d_U = d_unif_sphere) # dens <- d_tang_norm(x = x, theta = theta, g_scaled = g_scaled, # d_U = d_unif_sphere) # The same if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
Density and simulation of the Tangent Elliptical (TE)
distribution on
,
. The distribution arises
by considering the
tangent-normal decomposition with
multivariate signs distributed as an
Angular Central Gaussian distribution.
d_TE(x, theta, g_scaled, d_V, Lambda, log = FALSE) r_TE(n, theta, r_V, Lambda)
d_TE(x, theta, g_scaled, d_V, Lambda, log = FALSE) r_TE(n, theta, r_V, Lambda)
x |
locations in |
theta |
a unit norm vector of size |
g_scaled |
the scaled angular density |
d_V |
the density |
Lambda |
the shape matrix |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
n |
sample size, a positive integer. |
r_V |
a function for simulating |
The functions are wrappers for d_tang_norm
and
r_tang_norm
with d_U = d_ACG
and
r_U = r_ACG
.
Depending on the function:
d_TE
: a vector of length nx
or 1
with the
evaluated density at x
.
r_TE
: a matrix of size c(n, p)
with the random sample.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
tang-norm-decomp
,
tangent-vMF
, ACG
.
## Simulation and density evaluation for p = 2 # Parameters p <- 2 n <- 1e3 theta <- c(rep(0, p - 1), 1) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 kappa_V <- 2 # Required functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Sample and color according to density x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 kappa_V <- 2 # Sample and color according to density x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density kappa_V <- 1 Lambda <- matrix(0.75, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
## Simulation and density evaluation for p = 2 # Parameters p <- 2 n <- 1e3 theta <- c(rep(0, p - 1), 1) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 kappa_V <- 2 # Required functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Sample and color according to density x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 kappa_V <- 2 # Sample and color according to density x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density kappa_V <- 1 Lambda <- matrix(0.75, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 x <- r_TE(n = n, theta = theta, r_V = r_V, Lambda = Lambda) col <- viridisLite::viridis(n) dens <- d_TE(x = x, theta = theta, g_scaled = g_scaled, Lambda = Lambda) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
Density and simulation of the Tangent von Mises–Fisher (TM)
distribution on
,
. The distribution arises
by considering the
tangent-normal decomposition with
multivariate signs distributed as a
von Mises–Fisher distribution.
d_TM(x, theta, g_scaled, d_V, mu, kappa, log = FALSE) r_TM(n, theta, r_V, mu, kappa)
d_TM(x, theta, g_scaled, d_V, mu, kappa, log = FALSE) r_TM(n, theta, r_V, mu, kappa)
x |
locations in |
theta |
a unit norm vector of size |
g_scaled |
the scaled angular density |
d_V |
the density |
mu |
the directional mean |
kappa |
concentration parameter |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
n |
sample size, a positive integer. |
r_V |
a function for simulating |
The functions are wrappers for d_tang_norm
and
r_tang_norm
with d_U = d_vMF
and
r_U = r_vMF
.
Depending on the function:
d_TM
: a vector of length nx
or 1
with the
evaluated density at x
.
r_TM
: a matrix of size c(n, p)
with the random sample.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
tang-norm-decomp
,
tangent-elliptical
, vMF
.
## Simulation and density evaluation for p = 2 # Parameters p <- 2 n <- 1e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa <- 1 kappa_V <- 2 # Required functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Sample and color according to density x <- r_TM(n = n, theta = theta, r_V = r_V, mu = 1, kappa = kappa) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa <- 1 kappa_V <- 2 # Sample and color according to density x <- r_TM(n = n, theta = theta, r_V = r_V, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density kappa <- 0.5 x <- r_TM(n = n, theta = theta, r_V = r_V, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
## Simulation and density evaluation for p = 2 # Parameters p <- 2 n <- 1e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa <- 1 kappa_V <- 2 # Required functions r_V <- function(n) r_g_vMF(n = n, p = p, kappa = kappa_V) g_scaled <- function(t, log) { g_vMF(t, p = p - 1, kappa = kappa_V, scaled = TRUE, log = log) } # Sample and color according to density x <- r_TM(n = n, theta = theta, r_V = r_V, mu = 1, kappa = kappa) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) plot(r * x, pch = 16, col = col[rank(dens)]) ## Simulation and density evaluation for p = 3 # Parameters p <- 3 n <- 5e3 theta <- c(rep(0, p - 1), 1) mu <- c(rep(0, p - 2), 1) kappa <- 1 kappa_V <- 2 # Sample and color according to density x <- r_TM(n = n, theta = theta, r_V = r_V, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) } ## A non-vMF angular function: g(t) = 1 - t^2. It is sssociated to the ## Beta(1/2, (p + 1)/2) distribution. # Scaled angular function g_scaled <- function(t, log) { log_c_g <- lgamma(0.5 * p) + log(0.5 * p / (p - 1)) - 0.5 * p * log(pi) log_g <- log_c_g + log(1 - t^2) switch(log + 1, exp(log_g), log_g) } # Simulation r_V <- function(n) { sample(x = c(-1, 1), size = n, replace = TRUE) * sqrt(rbeta(n = n, shape1 = 0.5, shape2 = 0.5 * (p + 1))) } # Sample and color according to density kappa <- 0.5 x <- r_TM(n = n, theta = theta, r_V = r_V, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) dens <- d_TM(x = x, theta = theta, g_scaled = g_scaled, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(dens)], size = 5) }
Tests for assessing the rotational symmetry of a unit-norm
random vector in
,
, about a location
, from a
hyperspherical sample
.
The vector is said to be rotational symmetric about
if the distributions of
and
coincide, where
is any
rotation matrix
that fixes
, i.e.,
.
test_rotasym(data, theta = spherical_mean, type = c("sc", "loc", "loc_vMF", "hyb", "hyb_vMF")[5], Fisher = FALSE, U = NULL, V = NULL)
test_rotasym(data, theta = spherical_mean, type = c("sc", "loc", "loc_vMF", "hyb", "hyb_vMF")[5], Fisher = FALSE, U = NULL, V = NULL)
data |
hyperspherical data, a matrix of size |
theta |
either a unit norm vector of size |
type |
a character string (case insensitive) indicating the type of test to conduct:
See the details below for further explanations of the tests. |
Fisher |
if |
U |
multivariate signs of |
V |
cosines of |
Descriptions of the tests:
The "scatter" test is locally and asymptotically optimal against
tangent elliptical alternatives to rotational
symmetry. However, it is not consistent against
tangent von Mises–Fisher (vMF) alternatives.
The asymptotic null distribution of
is unaffected if
is estimated, that is,
the asymptotic null distributions of
and
are
the same.
The "location" test is locally and asymptotically most powerful
against vMF alternatives to rotational symmetry. However, it is not
consistent against tangent elliptical alternatives. The asymptotic
null distribution of
for known
(the one implemented in
test_rotasym
) does change if
is estimated by
. Therefore, if
the test is performed with an estimated
(if
theta
is a function)
will not be properly calibrated.
test_rotasym
will give a warning in
such case.
The "vMF location" test is a modification of the "location" test
designed to make its null asymptotic distribution invariant from the
estimation of (as the "scatter" test is).
The test is optimal against tangent vMF alternatives with a specific,
vMF-based, angular function
g_vMF
. Despite not
being optimal against all tangent vMF alternatives, it is
consistent for all of them. As the location test,
it is not consistent against tangent elliptical alternatives.
The "hybrid" test combines (see below how) the "scatter" and
"location" tests. The test is neither optimal against tangent elliptical nor
tangent vMF alternatives, but it is consistent against both. Since it is
based on the "location" test, if computed with an estimator
, the test statistic will not
be properly calibrated.
test_rotasym
will give a warning in such case.
The "vMF hybrid" test is the analogous of the "hybrid" test but replaces the "location" test by the "vMF location" test.
The combination of the scatter and location tests in the hybrid tests is done in two different ways:
If Fisher = FALSE
, then the scatter and location tests
statistics give the hybrid test statistic
If Fisher = TRUE
, then Fisher's method for aggregating
independent tests (the two test statistics are independent under rotational
symmetry) is considered, resulting the hybrid test statistic:
where and
are
the
-values of the scatter and location tests, respectively.
The hybrid test statistic follows analogously to
by replacing
with
.
Finally, recall that the tests are designed to test implications of rotational symmetry. Therefore, the tests are not consistent against all types of alternatives to rotational symmetry.
An object of the htest
class with the following elements:
statistic
: test statistic.
parameter
: degrees of freedom of the chi-square distribution
appearing in all the null asymptotic distributions.
p.value
: -value of the test.
method
: information on the type of test performed.
data.name
: name of the value of data
.
U
: multivariate signs of data
.
V
: cosines of data
.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527
tangent-elliptical
, tangent-vMF
,
spherical_mean
.
## Rotational symmetry holds # Sample data from a vMF (rotational symmetric distribution about mu) n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data_0 <- r_vMF(n = n, mu = theta, kappa = 1) # theta known test_rotasym(data = data_0, theta = theta, type = "sc") test_rotasym(data = data_0, theta = theta, type = "loc") test_rotasym(data = data_0, theta = theta, type = "loc_vMF") test_rotasym(data = data_0, theta = theta, type = "hyb") test_rotasym(data = data_0, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_0, theta = theta, type = "hyb_vMF") test_rotasym(data = data_0, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_0, type = "sc") test_rotasym(data = data_0, type = "loc") # Warning test_rotasym(data = data_0, type = "loc_vMF") test_rotasym(data = data_0, type = "hyb") # Warning test_rotasym(data = data_0, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_0, type = "hyb_vMF") test_rotasym(data = data_0, type = "hyb_vMF", Fisher = TRUE) ## Rotational symmetry does not hold # Sample non-rotational symmetric data from a tangent-vMF distribution # The scatter test is blind to these deviations, while the location tests # are optimal n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) mu <- c(rep(0, p - 2), 1) kappa <- 2 set.seed(123456789) r_V <- function(n) { r_g_vMF(n = n, p = p, kappa = 1) } data_1 <- r_TM(n = n, r_V = r_V, theta = theta, mu = mu, kappa = kappa) # theta known test_rotasym(data = data_1, theta = theta, type = "sc") test_rotasym(data = data_1, theta = theta, type = "loc") test_rotasym(data = data_1, theta = theta, type = "loc_vMF") test_rotasym(data = data_1, theta = theta, type = "hyb") test_rotasym(data = data_1, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_1, theta = theta, type = "hyb_vMF") test_rotasym(data = data_1, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_1, type = "sc") test_rotasym(data = data_1, type = "loc") # Warning test_rotasym(data = data_1, type = "loc_vMF") test_rotasym(data = data_1, type = "hyb") # Warning test_rotasym(data = data_1, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_1, type = "hyb_vMF") test_rotasym(data = data_1, type = "hyb_vMF", Fisher = TRUE) # Sample non-rotational symmetric data from a tangent-elliptical distribution # The location tests are blind to these deviations, while the # scatter test is optimal n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 set.seed(123456789) r_V <- function(n) { r_g_vMF(n = n, p = p, kappa = 1) } data_2 <- r_TE(n = n, r_V = r_V, theta = theta, Lambda = Lambda) # theta known test_rotasym(data = data_2, theta = theta, type = "sc") test_rotasym(data = data_2, theta = theta, type = "loc") test_rotasym(data = data_2, theta = theta, type = "loc_vMF") test_rotasym(data = data_2, theta = theta, type = "hyb") test_rotasym(data = data_2, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_2, theta = theta, type = "hyb_vMF") test_rotasym(data = data_2, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_2, type = "sc") test_rotasym(data = data_2, type = "loc") # Warning test_rotasym(data = data_2, type = "loc_vMF") test_rotasym(data = data_2, type = "hyb") # Warning test_rotasym(data = data_2, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_2, type = "hyb_vMF") test_rotasym(data = data_2, type = "hyb_vMF", Fisher = TRUE) ## Sunspots births data # Load data data("sunspots_births") sunspots_births$X <- cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta), cos(sunspots_births$phi) * sin(sunspots_births$theta), sin(sunspots_births$phi)) # Test rotational symmetry for the 23rd cycle, specified theta sunspots_23 <- subset(sunspots_births, cycle == 23) test_rotasym(data = sunspots_23$X, type = "sc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_23$X, type = "loc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_23$X, type = "hyb", theta = c(0, 0, 1)) # Test rotational symmetry for the 23rd cycle, unspecified theta spherical_loc_PCA(sunspots_23$X) test_rotasym(data = sunspots_23$X, type = "sc", theta = spherical_loc_PCA) test_rotasym(data = sunspots_23$X, type = "loc_vMF", theta = spherical_loc_PCA) test_rotasym(data = sunspots_23$X, type = "hyb_vMF", theta = spherical_loc_PCA) # Test rotational symmetry for the 22nd cycle, specified theta sunspots_22 <- subset(sunspots_births, cycle == 22) test_rotasym(data = sunspots_22$X, type = "sc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_22$X, type = "loc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_22$X, type = "hyb", theta = c(0, 0, 1)) # Test rotational symmetry for the 22nd cycle, unspecified theta spherical_loc_PCA(sunspots_22$X) test_rotasym(data = sunspots_22$X, type = "sc", theta = spherical_loc_PCA) test_rotasym(data = sunspots_22$X, type = "loc_vMF", theta = spherical_loc_PCA) test_rotasym(data = sunspots_22$X, type = "hyb_vMF", theta = spherical_loc_PCA)
## Rotational symmetry holds # Sample data from a vMF (rotational symmetric distribution about mu) n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) set.seed(123456789) data_0 <- r_vMF(n = n, mu = theta, kappa = 1) # theta known test_rotasym(data = data_0, theta = theta, type = "sc") test_rotasym(data = data_0, theta = theta, type = "loc") test_rotasym(data = data_0, theta = theta, type = "loc_vMF") test_rotasym(data = data_0, theta = theta, type = "hyb") test_rotasym(data = data_0, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_0, theta = theta, type = "hyb_vMF") test_rotasym(data = data_0, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_0, type = "sc") test_rotasym(data = data_0, type = "loc") # Warning test_rotasym(data = data_0, type = "loc_vMF") test_rotasym(data = data_0, type = "hyb") # Warning test_rotasym(data = data_0, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_0, type = "hyb_vMF") test_rotasym(data = data_0, type = "hyb_vMF", Fisher = TRUE) ## Rotational symmetry does not hold # Sample non-rotational symmetric data from a tangent-vMF distribution # The scatter test is blind to these deviations, while the location tests # are optimal n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) mu <- c(rep(0, p - 2), 1) kappa <- 2 set.seed(123456789) r_V <- function(n) { r_g_vMF(n = n, p = p, kappa = 1) } data_1 <- r_TM(n = n, r_V = r_V, theta = theta, mu = mu, kappa = kappa) # theta known test_rotasym(data = data_1, theta = theta, type = "sc") test_rotasym(data = data_1, theta = theta, type = "loc") test_rotasym(data = data_1, theta = theta, type = "loc_vMF") test_rotasym(data = data_1, theta = theta, type = "hyb") test_rotasym(data = data_1, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_1, theta = theta, type = "hyb_vMF") test_rotasym(data = data_1, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_1, type = "sc") test_rotasym(data = data_1, type = "loc") # Warning test_rotasym(data = data_1, type = "loc_vMF") test_rotasym(data = data_1, type = "hyb") # Warning test_rotasym(data = data_1, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_1, type = "hyb_vMF") test_rotasym(data = data_1, type = "hyb_vMF", Fisher = TRUE) # Sample non-rotational symmetric data from a tangent-elliptical distribution # The location tests are blind to these deviations, while the # scatter test is optimal n <- 200 p <- 10 theta <- c(1, rep(0, p - 1)) Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1) diag(Lambda) <- 1 set.seed(123456789) r_V <- function(n) { r_g_vMF(n = n, p = p, kappa = 1) } data_2 <- r_TE(n = n, r_V = r_V, theta = theta, Lambda = Lambda) # theta known test_rotasym(data = data_2, theta = theta, type = "sc") test_rotasym(data = data_2, theta = theta, type = "loc") test_rotasym(data = data_2, theta = theta, type = "loc_vMF") test_rotasym(data = data_2, theta = theta, type = "hyb") test_rotasym(data = data_2, theta = theta, type = "hyb", Fisher = TRUE) test_rotasym(data = data_2, theta = theta, type = "hyb_vMF") test_rotasym(data = data_2, theta = theta, type = "hyb_vMF", Fisher = TRUE) # theta unknown (employs the spherical mean as estimator) test_rotasym(data = data_2, type = "sc") test_rotasym(data = data_2, type = "loc") # Warning test_rotasym(data = data_2, type = "loc_vMF") test_rotasym(data = data_2, type = "hyb") # Warning test_rotasym(data = data_2, type = "hyb", Fisher = TRUE) # Warning test_rotasym(data = data_2, type = "hyb_vMF") test_rotasym(data = data_2, type = "hyb_vMF", Fisher = TRUE) ## Sunspots births data # Load data data("sunspots_births") sunspots_births$X <- cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta), cos(sunspots_births$phi) * sin(sunspots_births$theta), sin(sunspots_births$phi)) # Test rotational symmetry for the 23rd cycle, specified theta sunspots_23 <- subset(sunspots_births, cycle == 23) test_rotasym(data = sunspots_23$X, type = "sc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_23$X, type = "loc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_23$X, type = "hyb", theta = c(0, 0, 1)) # Test rotational symmetry for the 23rd cycle, unspecified theta spherical_loc_PCA(sunspots_23$X) test_rotasym(data = sunspots_23$X, type = "sc", theta = spherical_loc_PCA) test_rotasym(data = sunspots_23$X, type = "loc_vMF", theta = spherical_loc_PCA) test_rotasym(data = sunspots_23$X, type = "hyb_vMF", theta = spherical_loc_PCA) # Test rotational symmetry for the 22nd cycle, specified theta sunspots_22 <- subset(sunspots_births, cycle == 22) test_rotasym(data = sunspots_22$X, type = "sc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_22$X, type = "loc", theta = c(0, 0, 1)) test_rotasym(data = sunspots_22$X, type = "hyb", theta = c(0, 0, 1)) # Test rotational symmetry for the 22nd cycle, unspecified theta spherical_loc_PCA(sunspots_22$X) test_rotasym(data = sunspots_22$X, type = "sc", theta = spherical_loc_PCA) test_rotasym(data = sunspots_22$X, type = "loc_vMF", theta = spherical_loc_PCA) test_rotasym(data = sunspots_22$X, type = "hyb_vMF", theta = spherical_loc_PCA)
Density and simulation of the uniform distribution on
,
. The density is just the
inverse of the surface area of
, given by
d_unif_sphere(x, log = FALSE) r_unif_sphere(n, p) w_p(p, log = FALSE)
d_unif_sphere(x, log = FALSE) r_unif_sphere(n, p) w_p(p, log = FALSE)
x |
locations in |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
n |
sample size, a positive integer. |
p |
dimension of the ambient space |
If , then
and the "surface area" is
. The function
w_p
is vectorized on p
.
Depending on the function:
d_unif_sphere
: a vector of length nx
or 1
with
the evaluated density at x
.
r_unif_sphere
: a matrix of size c(n, p)
with the
random sample.
w_p
: the surface area of .
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
## Area of S^{p - 1} # Areas of S^0, S^1, and S^2 w_p(p = 1:3) # Area as a function of p p <- 1:20 plot(p, w_p(p = p), type = "o", pch = 16, xlab = "p", ylab = "Area", main = expression("Surface area of " * S^{p - 1}), axes = FALSE) box() axis(1, at = p) axis(2, at = seq(0, 34, by = 2)) ## Simulation and density evaluation for p = 1, 2, 3 # p = 1 n <- 500 x <- r_unif_sphere(n = n, p = 1) barplot(table(x) / n) head(d_unif_sphere(x)) # p = 2 x <- r_unif_sphere(n = n, p = 3) plot(x) head(d_unif_sphere(x)) # p = 3 x <- r_unif_sphere(n = n, p = 3) if (requireNamespace("rgl")) { rgl::plot3d(x) } head(d_unif_sphere(x))
## Area of S^{p - 1} # Areas of S^0, S^1, and S^2 w_p(p = 1:3) # Area as a function of p p <- 1:20 plot(p, w_p(p = p), type = "o", pch = 16, xlab = "p", ylab = "Area", main = expression("Surface area of " * S^{p - 1}), axes = FALSE) box() axis(1, at = p) axis(2, at = seq(0, 34, by = 2)) ## Simulation and density evaluation for p = 1, 2, 3 # p = 1 n <- 500 x <- r_unif_sphere(n = n, p = 1) barplot(table(x) / n) head(d_unif_sphere(x)) # p = 2 x <- r_unif_sphere(n = n, p = 3) plot(x) head(d_unif_sphere(x)) # p = 3 x <- r_unif_sphere(n = n, p = 3) if (requireNamespace("rgl")) { rgl::plot3d(x) } head(d_unif_sphere(x))
Density and simulation of the von Mises–Fisher (vMF)
distribution on
,
. The density at
is given by
where is the directional
mean,
is the concentration parameter about
, and
is the order-
modified Bessel function of the first kind.
The angular function of the vMF is . The
associated cosines density is
,
where
is the surface area of
.
d_vMF(x, mu, kappa, log = FALSE) c_vMF(p, kappa, log = FALSE) r_vMF(n, mu, kappa) g_vMF(t, p, kappa, scaled = TRUE, log = FALSE) r_g_vMF(n, p, kappa)
d_vMF(x, mu, kappa, log = FALSE) c_vMF(p, kappa, log = FALSE) r_vMF(n, mu, kappa) g_vMF(t, p, kappa, scaled = TRUE, log = FALSE) r_g_vMF(n, p, kappa)
x |
locations in |
mu |
the directional mean |
kappa |
concentration parameter |
log |
flag to indicate if the logarithm of the density (or the normalizing constant) is to be computed. |
p |
dimension of the ambient space |
n |
sample size, a positive integer. |
t |
a vector with values in |
scaled |
whether to scale the angular function by the von Mises–Fisher
normalizing constant. Defaults to |
r_g_vMF
implements algorithm VM in Wood (1994). c_vMF
is
vectorized on p
and kappa
.
Depending on the function:
d_vMF
: a vector of length nx
or 1
with the
evaluated density at x
.
r_vMF
: a matrix of size c(n, p)
with the random sample.
c_vMF
: the normalizing constant.
g_vMF
: a vector of size length(t)
with the evaluated
angular function.
r_g_vMF
: a vector of length n
containing simulated
values from the cosines density associated to the angular function.
Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.
Wood, A. T. A. (1994) Simulation of the von Mises Fisher distribution. Commun. Stat. Simulat., 23(1):157–164. doi:10.1080/03610919408813161
# Simulation and density evaluation for p = 2 mu <- c(0, 1) kappa <- 2 n <- 1e3 x <- r_vMF(n = n, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x, pch = 16, col = col[rank(d_vMF(x = x, mu = mu, kappa = kappa))]) # Simulation and density evaluation for p = 3 mu <- c(0, 0, 1) kappa <- 2 x <- r_vMF(n = n, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(d_vMF(x = x, mu = mu, kappa = kappa))], size = 5) } # Cosines density g_tilde <- function(t, p, kappa) { exp(w_p(p = p - 1, log = TRUE) + g_vMF(t = t, p = p, kappa = kappa, scaled = TRUE, log = TRUE) + ((p - 3) / 2) * log(1 - t^2)) } # Simulated data from the cosines density n <- 1e3 p <- 3 kappa <- 2 hist(r_g_vMF(n = n, p = p, kappa = kappa), breaks = seq(-1, 1, l = 20), probability = TRUE, main = "Simulated data from g_vMF", xlab = "t") t <- seq(-1, 1, by = 0.01) lines(t, g_tilde(t = t, p = p, kappa = kappa)) # Cosine density as a function of the dimension M <- 100 col <- viridisLite::viridis(M) plot(t, g_tilde(t = t, p = 2, kappa = kappa), col = col[2], type = "l", ylab = "Density") for (p in 3:M) { lines(t, g_tilde(t = t, p = p, kappa = kappa), col = col[p]) }
# Simulation and density evaluation for p = 2 mu <- c(0, 1) kappa <- 2 n <- 1e3 x <- r_vMF(n = n, mu = mu, kappa = kappa) col <- viridisLite::viridis(n) r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x, pch = 16, col = col[rank(d_vMF(x = x, mu = mu, kappa = kappa))]) # Simulation and density evaluation for p = 3 mu <- c(0, 0, 1) kappa <- 2 x <- r_vMF(n = n, mu = mu, kappa = kappa) if (requireNamespace("rgl")) { rgl::plot3d(x, col = col[rank(d_vMF(x = x, mu = mu, kappa = kappa))], size = 5) } # Cosines density g_tilde <- function(t, p, kappa) { exp(w_p(p = p - 1, log = TRUE) + g_vMF(t = t, p = p, kappa = kappa, scaled = TRUE, log = TRUE) + ((p - 3) / 2) * log(1 - t^2)) } # Simulated data from the cosines density n <- 1e3 p <- 3 kappa <- 2 hist(r_g_vMF(n = n, p = p, kappa = kappa), breaks = seq(-1, 1, l = 20), probability = TRUE, main = "Simulated data from g_vMF", xlab = "t") t <- seq(-1, 1, by = 0.01) lines(t, g_tilde(t = t, p = p, kappa = kappa)) # Cosine density as a function of the dimension M <- 100 col <- viridisLite::viridis(M) plot(t, g_tilde(t = t, p = 2, kappa = kappa), col = col[2], type = "l", ylab = "Density") for (p in 3:M) { lines(t, g_tilde(t = t, p = p, kappa = kappa), col = col[p]) }