library(tidyverse)
In practice most statisticians will rely on already existing functions to perform tests and/or confidence intervals. However, there are cases when a function doesn’t exist, or the existing function does something slightly different from what you want.
In this lab, you’ll see some of the functions mentioned in class for conducting the procedures we’ve seen over the last week or two, as well as a few cases where no function exists in the the base
or stats
packages (those installed by default). In most cases you’ll also see the explicit calculation of the test statistic. Why? It’s useful to calculate the test statistic explicitly to:
- Check your understanding and practice for exam situations.
- Serve as a check before using a
base
/stats
R function, or one from a contributed package.
Wilcoxon Rank Sum
Recall a Wilcoxon Rank Sum test tests null hypotheses about the mean/median under an assumption that the population is symmetric:
\[ H_0: \mu = \mu_0 \]
Let’s work through the code for the example from lecture:
y <- c(0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.3, 8.2, 9.3, 10.1, 10.9, 12.1)
n <- length(y)
Recall we were interested in the hypothesis: \(H_0: \mu = 4\).
mu_0 <- 4
The test statistic is calculated through three steps:
Find the distance of each observed value from the hypothesized center, \(c_0\).
abs(y - mu_0)
Assign a rank to each observation based on its distance from \(c_0\): from 1 for closest, to \(n\) for furthest from \(c_0\).
rank(abs(y - mu_0))
Test statistic: \(S =\) Sum of the ranks for the values that were larger than \(c_0\).
# Is observed value larger than hypothesized mean? y > mu_0 # Use above to subset ranks rank(abs(y - mu_0))[y > mu_0] # And sum (S <- sum(rank(abs(y - mu_0))[y > mu_0]))
We saw you could either use the CLT to build an approximate test or compute an exact reference distribution. To illustrate the approximate test, we first need to standardize by the mean and variance of the test statistic, which depend only on \(n\):
expected_S <- (n * (n + 1) / 4)
var_S <- n * (n + 1) * (2 * n + 1) / 24
(z <- (S - expected_S)/sqrt(var_S))
Leaving us with a Z-statistic that we can compare to a Standard Normal, e.g. to get a p-value:
2 * (1 - pnorm(abs(z)))
The approximate p-value can also be obtained with the wilcox.test()
function. We simply pass in the observed values and hypothesized mean:
wilcox.test(y, mu = 4, exact = FALSE, correct = FALSE)
A p-value based on the exact reference distribution (based on permuting the ranks between above and below the hypothesized median) with the exact
argument set to TRUE
:
wilcox.test(y, mu = 4, exact = TRUE)
Like most tests in base R you can extract elements from the result with $
, for example the p-value:
wilcox.test(y, mu = 4, exact = FALSE, correct = FALSE)$p.value
Chi-square test of variance
The Chi-square test of variance is used to test hypotheses about the population variance, e.g.: \[ H_0: \sigma = \sigma_0 \]
There isn’t a function in base
R for this test, but recall the test statistic is: \[
X(\sigma_0^2) = (n-1)\frac{s^2}{\sigma_0^2}
\]
Which is easily calculated. For example, let’s consider the same sample as above, but now the hypothesis: \(H_0: \sigma^2 = 12\).
sigma2_0 <- 12
Our test statistic is the scaled ratio of observed to hypothesized variance:
(X <- (n - 1) * sd(y)^2 / sigma2_0)
And we compare it to a \(\chi^2_{(n-1)}\) distribution. For example, the critical values for a two-sided alternative at \(\alpha = 0.05\) are:
alpha <- 0.05
(c_vals <- c(qchisq(alpha/2, df = n - 1),
qchisq(1 - alpha/2, df = n - 1)))
That is we reject for \(X(\sigma_0^2) < 3.82\) or \(X(\sigma_0^2) > 21.92\). In this case we fail to reject the null hypothesis.
What would the p-value be?
How would you use R to find the confidence interval for \(\sigma^2\)?
What assumption was crucial for the Chi-square test for variance to perform well?
t-test of variance
An alternative to the Chi-square test of variance for the hypothesis \(H_0: \sigma^2 = \sigma_0^2\) was the t-test of variance.
The t-test of variance is simply a t-test on the transformed variables: \[ Z_i = \left(Y_i - \overline{Y} \right)^2 \]
z <- (y - mean(y))^2
with the hypothesis \(H_0: \mu_Z = \frac{n-1}{n}\sigma_0^2\):
t.test(z, mu = ((n-1)/n) *sigma2_0)
Could you construct a confidence interval for \(\sigma^2\) from this output?
Kolmogorov-Smirnov test
Recall the K-S test examines hypotheses of the form \(H_0: F(y) = F_0(y)\), where \(F(y)\) is the population c.d.f and \(F_0(y)\) is some completely specified hypothesized c.d.f.
Let’s continue using the same sample data, but now consider the null hypothesis that the population is Uniform on 0, 14 (Why not Uniform(0, 10), like in class?).
Recall the K-S test statistic is the largest distance between the hypothesized and empirical c.d.fs. The empirical c.d.f (ecdf) function can be obtained in R with ecdf()
:
ecdf(y)
This function returns a function. That means to find the ecdf at a particular value we need to call the function, i.e. the ecdf at 5 is
ecdf(y)(5)
# or to make it clearer
ecdf_y <- ecdf(y)
ecdf_y(5)
We can use this to generate points along the ecdf, at a sequence of values for \(y\). For example
x <- seq(from = 0, to = max(y) + 1, by = 0.01)
fhat <- ecdf_y(x)
ggplot(mapping = aes(x = x, y = fhat)) +
geom_line()
It’s then easy to also evaluate the hypothesized cdf at these same points:
f_0 <- punif(x, min = 0, max = 14)
# and add them to our picture
ggplot(mapping = aes(x = x, y = fhat)) +
geom_line() +
geom_line(aes(y = f_0), linetype = "dashed")
We can look for the largest difference:
max(abs(fhat - f_0))
Which is 0.14 and occurs at \(y = 10.9\).
All of this is automated in ks.test()
. For example,
ks.test(x = y, y = punif, min = 0, max = 14)
- Why is no confidence interval reported?
- Try testing the hypothesis that the population is Normal(5, 25)
Chi-square goodness of fit
The Chi-square goodness of fit test is like the K-S for discrete distributions. It tests hypotheses of the form \(H_0: P(Y = y) = p_0(y)\), where \(p_0\) is a hypothesized probability mass function.
Recall our dice example from class. For illustration let’s generate some data based on a fair die:
n <- 60
rolls <- sample(1:6, size = 60, replace = TRUE,
prob = rep(1/6, 6))
A simple table of the outcomes can be computed with table()
table(rolls)
And the tabulated values can be fed into the chisq.test()
function, along with the hypothesized probabilities of each outcome:
chisq.test(table(rolls), p = rep(1/6, 6))
To find the test statistic by hand, we need expected counts for each category:
E <- 1/6*n
And then we can compare them to the observed ones (pulled from table()
):
O <- table(rolls)
(chi_sq <- sum((O - E)^2 / E))
What is the reference distribution? Can you replicate the p-value from chisq.test()
with chi_sq
and the relevant distribution function?