Package 'lancor'

Title: Statistical Inference via Lancaster Correlation
Description: Implementation of the methods described in Holzmann, Klar (2024) <doi: 10.1111/sjos.12733>. Lancaster correlation is a correlation coefficient which equals the absolute value of the Pearson correlation for the bivariate normal distribution, and is equal to or slightly less than the maximum correlation coefficient for a variety of bivariate distributions. Rank and moment-based estimators and corresponding confidence intervals are implemented, as well as independence tests based on these statistics.
Authors: Bernhard Klar [aut, cre] (ORCID: <https://orcid.org/0000-0002-1419-5473>), Hajo Holzmann [aut], Lucas Iglesias [ctb]
Maintainer: Bernhard Klar <[email protected]>
License: GPL-2
Version: 0.1.3
Built: 2026-05-22 18:58:08 UTC
Source: https://github.com/bernhardklar/lancor

Help Index


Lancaster correlation

Description

Computes the Lancaster correlation coefficient.

Usage

lcor(x, y = NULL, type = c("rank", "linear"))

Arguments

x

a numeric vector, or a matrix or data frame with two columns.

y

NULL (default) or a vector with same length as x.

type

a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated.

Details

Let $F_X$ and $F_Y$ be the distribution functions of $X$ and $Y$, and define

$$X^* = \Phi^{-1}(F_X(X)), \quad Y^* = \Phi^{-1}(F_Y(Y)),$$

where $\Phi^{-1}$ is the standard normal quantile function. Furthermore for $X$ and $Y$ with finite fourth moment, let

$$\tilde{X} = (X - \mathbb{E}(X)) / \operatorname{sd}(X), \quad \tilde{Y} = (Y - \mathbb{E}(Y)) / \operatorname{sd}(Y).$$

Then

$$\rho_L(X,Y) = \max\{|\operatorname{Cor}_{\text{Pearson}}(X^*,Y^*)|,\; |\operatorname{Cor}_{\text{Pearson}}((X^*)^2,(Y^*)^2)|\}$$

and

$$\rho_{L,1}(X,Y) = \max\{|\operatorname{Cor}_{\text{Pearson}}(X,Y)|,\; |\operatorname{Cor}_{\text{Pearson}}((\tilde{X})^2,(\tilde{Y})^2)|\}$$

are called the Lancaster correlation coefficient and the linear Lancaster correlation coefficient, respectively. Two estimation methods are supported:

  • Linear estimator for $\rho_{L,1}$ (type = "linear"): Consider $\rho_{L1} = \operatorname{Cor}_{\text{Pearson}}(X,Y)$ and $\rho_{L2} = \operatorname{Cor}_{\text{Pearson}}((\tilde{X})^2,(\tilde{Y})^2)$. Let $\hat\rho_{L1}$ be the sample Pearson correlation and $\hat\rho_{L2}$ the empirical correlation of the squares of the empirically standardized observations, and set $\hat\rho_{L,1} = \max\{\,|\hat\rho_{L1}|,\;|\hat\rho_{L2}|\,\}$.

  • Rank-based estimator for $\rho_{L}$ (type = "rank"): Consider $\rho_{R1} = \operatorname{Cor}_{\text{Pearson}}(X^*,Y^*)$ and $\rho_{R2} = \operatorname{Cor}_{\text{Pearson}}((X^*)^2,(Y^*)^2)$. Let $Q_i$ and $R_i$ be the ranks of $X_i$ and $Y_i$, within $X_1,\ldots,X_n$ or $Y_1,\ldots,Y_n$ respectively. Define

    $$\hat\rho_{R1} = \frac{1}{n\,s_a^2}\sum_{j=1}^n a(Q_j)\,a(R_j),$$

    $$\hat\rho_{R2} = \frac{1}{n\,s_b^2}\sum_{j=1}^n \bigl(b(Q_j)-\bar b\bigr)\,\bigl(b(R_j)-\bar b\bigr),$$

    where the scores are, for $j=1,\ldots,n$,

    $$a(j) = \Phi^{-1}\!\Bigl(\frac{j}{n+1}\Bigr), \quad b(j)=a(j)^2,$$

    $$\bar b=\frac{1}{n}\sum_{j=1}^n b(j), \quad s_a^2 = \frac{1}{n}\sum_{j=1}^n\bigl(a(j)-\bar a\bigr)^2, \quad s_b^2 = \frac{1}{n}\sum_{j=1}^n\bigl(b(j)-\bar b\bigr)^2.$$

    Finally, the rank-based Lancaster correlation is

    $$\hat\rho_{L} = \max\bigl\{\,|\hat\rho_{R1}|, |\hat\rho_{R2}|\bigr\}.$$
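
The two estimators described in the list above can be written out directly in base R. The following is a minimal sketch that assumes there are no ties in the data; lcor_linear_sketch and lcor_rank_sketch are hypothetical names introduced here for illustration, they are not part of the package, and the package's own implementation may differ in detail.

```r
# Hypothetical helpers written from the formulas above; not part of lancor.

# Linear estimator: maximum of |Pearson correlation| and |correlation of the
# squared, empirically standardized observations|.
lcor_linear_sketch <- function(x, y) {
  xs <- (x - mean(x)) / sd(x)
  ys <- (y - mean(y)) / sd(y)
  rho1 <- cor(x, y)
  rho2 <- cor(xs^2, ys^2)
  c(rho1 = rho1, rho2 = rho2, lcor = max(abs(rho1), abs(rho2)))
}

# Rank-based estimator with normal scores a(j) = qnorm(j / (n + 1)).
# Assumes no ties, so that the ranks are a permutation of 1, ..., n.
lcor_rank_sketch <- function(x, y) {
  n <- length(x)
  a <- qnorm(seq_len(n) / (n + 1))   # scores a(1), ..., a(n)
  b <- a^2                           # scores b(1), ..., b(n)
  s_a2 <- mean((a - mean(a))^2)
  s_b2 <- mean((b - mean(b))^2)
  aQ <- qnorm(rank(x) / (n + 1))     # a(Q_j)
  aR <- qnorm(rank(y) / (n + 1))     # a(R_j)
  rho1 <- sum(aQ * aR) / (n * s_a2)
  rho2 <- sum((aQ^2 - mean(b)) * (aR^2 - mean(b))) / (n * s_b2)
  c(rho1 = rho1, rho2 = rho2, lcor = max(abs(rho1), abs(rho2)))
}

set.seed(1)
x <- rnorm(500)
y <- x^2 + rnorm(500, sd = 0.5)   # dependence that is not linear
lcor_linear_sketch(x, y)
lcor_rank_sketch(x, y)
```

Both helpers return the two components alongside their maximum, which makes it easy to see which component drives the coefficient for a given sample.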

Value

the sample Lancaster correlation.

Author(s)

Hajo Holzmann, Bernhard Klar

References

Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733

See Also

lcor.comp, lcor.ci, lcor.test

Examples

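Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)   # upper-triangular factor, t(R) %*% R equals Sigma
n <- 1000
x <- matrix(rnorm(n*2), n) %*% R   # bivariate normal sample with covariance Sigma
lcor(x, type = "rank")
lcor(x, type = "linear")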

x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu)
cor(y[,1], y[,2], method = "spearman")
lcor(y, type = "rank")

Confidence intervals for the Lancaster correlation coefficient

Description

Computes confidence intervals for the Lancaster correlation coefficient. Lancaster correlation is a bivariate measure of dependence.

Usage

lcor.ci(
  x,
  y = NULL,
  conf.level = 0.95,
  type = c("rank", "linear"),
  con = TRUE,
  R = 1000,
  method = c("plugin", "boot", "pretest")
)

Arguments

x

a numeric vector, or a matrix or data frame with two columns.

y

NULL (default) or a vector with same length as x.

conf.level

confidence level of the interval.

type

a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated.

con

logical; if TRUE (default), conservative asymptotic confidence intervals are computed.

R

number of bootstrap replications.

method

a character string indicating how the asymptotic covariance matrix is computed if type = "linear". One of "plugin" (default), "boot" or "pretest"; can be abbreviated.

Details

Computes asymptotic and bootstrap-based confidence intervals for the (linear) Lancaster correlation coefficient $\rho_L$ ($\rho_{L,1}$). For more details see lcor.

Asymptotic confidence intervals are derived under two cases (analogously for $\rho_{L}$; see Holzmann and Klar (2024)):

Case 1: If $|\rho_{L1}|\neq|\rho_{L2}|$, the $1-\alpha$ asymptotic interval is

$$\left[ \max\{\hat\rho_{L,1} - z_{1-\alpha/2}\,s/\sqrt{n}, 0\},\ \min\{\hat\rho_{L,1} + z_{1-\alpha/2}\,s/\sqrt{n}, 1\} \right],$$

where $z_{1-\alpha/2}$ is the standard normal quantile and $s$ is an estimator of the corresponding standard deviation.
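
The Case 1 interval is a normal-approximation interval truncated to [0, 1]. As a minimal sketch, assuming the estimate and the standard deviation estimate s are already available, it could be computed as follows; case1_interval_sketch is a hypothetical helper, not a function of the package, and the input values are made up for illustration.

```r
# Hypothetical helper, not part of lancor: Case 1 interval from the formula above.
# rho_hat: estimated coefficient, s: estimated standard deviation, n: sample size.
case1_interval_sketch <- function(rho_hat, s, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  z <- qnorm(1 - alpha / 2)            # standard normal quantile z_{1 - alpha/2}
  half_width <- z * s / sqrt(n)
  c(lower = max(rho_hat - half_width, 0),
    upper = min(rho_hat + half_width, 1))
}

# Illustrative inputs only; s = 1 is not an estimate from real data.
case1_interval_sketch(rho_hat = 0.5, s = 1, n = 100)
case1_interval_sketch(rho_hat = 0.05, s = 1, n = 100)   # lower limit truncated at 0
```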

Case 2: If $|\rho_{L1}|=|\rho_{L2}|=a>0$, let $\tau$ denote the correlation between the two components and let $q_{1-\alpha/2}$ be the $1-\alpha/2$ quantile of the asymptotic distribution of $\sqrt{n}(\hat\rho_{L,1} - a)$. A conservative asymptotic interval is

$$\left[ \max\{\hat\rho_{L,1} - q_{1-\alpha/2}/\sqrt{n}, 0\},\ \min\{\hat\rho_{L,1} + z_{1-\alpha/2}\,s/\sqrt{n}, 1\} \right].$$

Additionally, bootstrap-based intervals can be obtained by resampling and estimating the covariance matrix of the rank or linear correlation components.
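
The resampling idea can be illustrated in base R for the two linear components. This is a sketch only: boot_cov_sketch is a hypothetical name, and both the nonparametric bootstrap of observation pairs and the scaling by n are assumptions made here; the package may proceed differently.

```r
# Hypothetical helper, not part of lancor: bootstrap estimate of the
# covariance matrix of the two linear components (rho1, rho2).
boot_cov_sketch <- function(xx, R = 200) {
  xx <- as.matrix(xx)
  n <- nrow(xx)
  comps <- function(m) {
    xs <- (m[, 1] - mean(m[, 1])) / sd(m[, 1])
    ys <- (m[, 2] - mean(m[, 2])) / sd(m[, 2])
    c(rho1 = cor(m[, 1], m[, 2]), rho2 = cor(xs^2, ys^2))
  }
  # Resample rows (pairs) with replacement and recompute both components.
  reps <- t(replicate(R, comps(xx[sample.int(n, replace = TRUE), , drop = FALSE])))
  # Assumed scaling: multiply by n to target the covariance of
  # sqrt(n) * (estimated components - true components).
  n * cov(reps)
}

set.seed(1)
n <- 300
x <- matrix(rnorm(n * 2), n)
boot_cov_sketch(x, R = 200)
```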

Value

a vector containing the lower and upper limits of the confidence interval.

Author(s)

Hajo Holzmann, Bernhard Klar

References

Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733

See Also

lcor, lcor.comp, lcor.test

Examples

n <- 1000
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu) # multivariate t
lcor(y, type = "rank")
lcor.ci(y, type = "rank")

Lancaster correlation and its components

Description

Computes the Lancaster correlation coefficient and its components.

Usage

lcor.comp(x, y = NULL, type = c("rank", "linear"), plot = FALSE)

Arguments

x

a numeric vector, or a matrix or data frame with two columns.

y

NULL (default) or a vector with same length as x.

type

a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated.

plot

logical; if TRUE, scatterplots of the transformed x and y values and of their squares are drawn.

Details

For more details see lcor.
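
For type = "rank", the transformed values shown with plot = TRUE are presumably the normal scores described under lcor. The sketch below computes such scores in base R under that assumption and draws the two scatterplots by hand; normal_scores is a hypothetical helper, not a package function, and the layout of the package's own plots may differ.

```r
# Hypothetical helper, not part of lancor: normal scores qnorm(rank / (n + 1)).
normal_scores <- function(v) qnorm(rank(v) / (length(v) + 1))

set.seed(1)
n <- 200
x <- matrix(rnorm(n * 2), n)
y <- x / sqrt(rchisq(n, 2) / 2)   # heavy-tailed sample, as in the examples
u <- normal_scores(y[, 1])
v <- normal_scores(y[, 2])

# Draw the plots only in an interactive session.
if (interactive()) {
  op <- par(mfrow = c(1, 2))
  plot(u, v, main = "normal scores")
  plot(u^2, v^2, main = "squared normal scores")
  par(op)
}
```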

Value

a vector containing the two components rho1 and rho2 and the sample Lancaster correlation.

Author(s)

Hajo Holzmann, Bernhard Klar

References

Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733

See Also

lcor, lcor.ci, lcor.test

Examples

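Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)   # upper-triangular factor, t(R) %*% R equals Sigma
n <- 1000
x <- matrix(rnorm(n*2), n) %*% R   # bivariate normal sample with covariance Sigma
nu <- 8
y <- x / sqrt(rchisq(n, nu)/nu) # multivariate t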
cor(y[,1], y[,2])
lcor.comp(y, type = "linear")

x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu) #multivariate t
cor(y[,1], y[,2], method = "spearman")
lcor.comp(y, type = "rank", plot = TRUE)

Lancaster correlation test

Description

Lancaster correlation test of bivariate independence. Lancaster correlation is a bivariate measure of dependence.

Usage

lcor.test(
  x,
  y = NULL,
  type = c("rank", "linear"),
  nperm = 999,
  method = c("permutation", "asymptotic", "symmetric")
)

Arguments

x

a numeric vector, or a matrix or data frame with two columns.

y

NULL (default) or a vector with same length as x.

type

a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated.

nperm

number of permutations.

method

a character string indicating how the p-value is computed if type = "linear". One of "permutation" (default), "asymptotic" or "symmetric"; can be abbreviated.

Details

For more details on the testing procedure see Remark 2 in Holzmann, Klar (2024).
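
To give a feel for how a permutation p-value for such a statistic can be obtained, here is a generic base-R sketch. It assumes the common (1 + count) / (nperm + 1) convention and uses the linear estimator as the test statistic; perm_test_sketch is a hypothetical helper, and the procedure in the package and in the paper may differ.

```r
# Hypothetical helper, not part of lancor: generic permutation p-value.
perm_test_sketch <- function(x, y, nperm = 999) {
  # Test statistic: linear Lancaster correlation estimate.
  stat <- function(u, v) {
    us <- (u - mean(u)) / sd(u)
    vs <- (v - mean(v)) / sd(v)
    max(abs(cor(u, v)), abs(cor(us^2, vs^2)))
  }
  obs <- stat(x, y)
  # Permuting y breaks any dependence between x and y.
  perm <- replicate(nperm, stat(x, sample(y)))
  list(lcor = obs, pval = (1 + sum(perm >= obs)) / (nperm + 1))
}

set.seed(1)
n <- 200
x <- rnorm(n)
y <- x^2 + rnorm(n)   # dependent on x, but with little linear correlation
perm_test_sketch(x, y, nperm = 199)
```

The returned list mirrors the lcor and pval components documented under Value, so the sketch can be compared side by side with lcor.test.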

Value

A list containing the following components:

lcor

the value of the test statistic

pval

the p-value of the test

Author(s)

Hajo Holzmann, Bernhard Klar

References

Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733

See Also

lcor, lcor.comp, lcor.ci. For an ACE permutation test of independence, see acepack (https://cran.r-project.org/package=acepack).

Examples

n <- 200
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu)
cor.test(y[,1], y[,2], method = "spearman")
lcor.test(y, type = "rank")

Covariance matrix of components of Lancaster correlation coefficient

Description

Estimate of the covariance matrix of the two components of Lancaster correlation. Lancaster correlation is a bivariate measure of dependence.

Usage

Sigma.est(xx)

Arguments

xx

a matrix or data frame with two columns.

Details

For more details see the Appendix in Holzmann, Klar (2024).

Value

the estimated covariance matrix.

Author(s)

Hajo Holzmann, Bernhard Klar

References

Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733

See Also

lcor.ci

Examples

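Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)   # upper-triangular factor, t(R) %*% R equals Sigma
n <- 1000
x <- matrix(rnorm(n*2), n) %*% R   # bivariate normal sample with covariance Sigma
nu <- 8
y <- x / sqrt(rchisq(n, nu)/nu) # multivariate t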
Sigma.est(y)