Wilcoxon Signed Rank Test
Usual setting
Population: $Y \sim$ some population distribution
Sample: $n$ i.i.d. from population: $Y_1, \ldots, Y_n$
Parameter: ?
Null Hypothesis: $H_0$: the population ‘center’ is $c_0$.
Let’s talk about the procedure first, then come back to why it’s hard to be specific here.
Wilcoxon Signed Rank Test Procedure
- Find the distance of each observed value from the hypothesized center, $c_0$: $|Y_i - c_0|$.
- Assign a rank to each observation based on its distance from $c_0$: from 1 for closest, to $n$ for furthest from $c_0$.
- Test statistic: $V$ = sum of the ranks for the values that were larger than $c_0$.
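The three steps above can be sketched in R as follows. The function name signed_rank_stat and the toy data are illustrative assumptions, not part of the lecture, and the sketch assumes no observation equals the hypothesized center and no two distances are tied.

```r
# Hypothetical helper: signed rank statistic for data y and
# hypothesized center c0 (assumes no zeros and no tied distances).
signed_rank_stat <- function(y, c0) {
  d <- abs(y - c0)   # step 1: distance of each value from c0
  r <- rank(d)       # step 2: rank 1 = closest, n = furthest
  sum(r[y > c0])     # step 3: sum the ranks of the values above c0
}

# Toy data, made up for illustration
toy <- c(1.5, 3.2, 0.4, 2.9)
signed_rank_stat(toy, c0 = 2)
```

For data without ties or zeros, this should agree with the statistic that wilcox.test(toy, mu = 2) reports as V.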
Example: test statistic calculation
- Find distance to $c_0$.
- Assign ranks.
- Test statistic: sum of ranks for observations with $Y_i > c_0$.
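The worked numbers for this example are not shown here. As a sketch, assuming the example uses the data y and the hypothesized center 4 that appear in the In R section later on, the intermediate quantities can be laid out side by side (the column names are illustrative):

```r
y <- c(0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.3, 8.2,
       9.3, 10.1, 10.9, 12.1)
c0 <- 4   # assumed hypothesized center

steps <- data.frame(
  y        = y,
  distance = abs(y - c0),         # distance to c0
  rank     = rank(abs(y - c0)),   # 1 = closest to c0
  above    = y > c0               # does this rank count toward the statistic?
)
steps

# Test statistic: sum of ranks for the observations above c0
v <- sum(steps$rank[steps$above])
v
```

Under these assumptions the sum matches the V = 66 reported by wilcox.test() in the In R section.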
Reference distribution
Either:
- Use an exact p-value, by assuming each rank has the same chance of being assigned above or below $c_0$, or
- Use the Normal approximation to the null distribution of the test statistic, $V$
Reference distribution: Exact p-values
If the population distribution were symmetric about $c_0$, each rank independently has probability 0.5 of being assigned to an observation above $c_0$.
We can consider all possible ways of assigning the ranks above and below $c_0$ to work out the exact reference distribution (this is what the R function wilcox.test() does if you use the argument exact = TRUE).
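To make the enumeration concrete, here is a brute-force sketch for n = 12 and an observed statistic of 66 (the values in the In R section). It illustrates the idea only and is not a description of how wilcox.test() is implemented; the variable names are assumptions.

```r
n <- 12
v_obs <- 66   # observed statistic, as reported in the R output

# Every way of assigning each of the n ranks above (1) or below (0)
# the hypothesized center: 2^n equally likely assignments.
signs <- as.matrix(expand.grid(rep(list(0:1), n)))

# Null distribution: sum of the ranks assigned "above" in each assignment
v_null <- as.vector(signs %*% (1:n))

# Two-sided p-value: double the smaller tail probability, capped at 1
p_upper <- mean(v_null >= v_obs)
p_lower <- mean(v_null <= v_obs)
p_exact <- min(1, 2 * min(p_upper, p_lower))
p_exact
```

The same tail probability is available without enumeration from psignrank(), for example 2 * psignrank(v_obs - 1, n, lower.tail = FALSE) for the upper tail.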
Reference distribution: Normal approximation p-values
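If the population distribution were symmetric about $c_0$, the test statistic $V$ has
$E(V) = \frac{n(n+1)}{4}, \quad \text{Var}(V) = \frac{n(n+1)(2n+1)}{24}$
(Can prove by considering $V$ as a sum of products between Bernoulli(0.5) r.v’s and the integers $1, \ldots, n$)
So, we can construct a Z-statistic
$Z = \frac{V - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$
and compare it to a N(0, 1)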
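The Bernoulli argument mentioned above can be written out as a short derivation. The notation is introduced for this sketch: $V$ is the test statistic (the name R's output uses) and $B_i$ indicates that rank $i$ is assigned to an observation above the hypothesized center.

```latex
\begin{align*}
V &= \sum_{i=1}^{n} i\,B_i, \qquad B_i \overset{\text{i.i.d.}}{\sim} \text{Bernoulli}(0.5) \\
E(V) &= \sum_{i=1}^{n} i\,E(B_i) = \frac{1}{2}\cdot\frac{n(n+1)}{2} = \frac{n(n+1)}{4} \\
\text{Var}(V) &= \sum_{i=1}^{n} i^2\,\text{Var}(B_i) = \frac{1}{4}\cdot\frac{n(n+1)(2n+1)}{6} = \frac{n(n+1)(2n+1)}{24}
\end{align*}
```

The variance step uses independence of the $B_i$, which is where the symmetry assumption enters.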
Example: continued
2 * (1 - pnorm(abs(z)))
## [1] 0.03417047
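The value of z is not defined in the snippet above. A sketch of how it could be computed, assuming the example uses the data y and hypothesized center 4 from the In R section, and the null mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of the statistic:

```r
y <- c(0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.3, 8.2,
       9.3, 10.1, 10.9, 12.1)
c0 <- 4   # assumed hypothesized center
n <- length(y)

# Observed statistic: sum of the ranks of |y - c0| for values above c0
v <- sum(rank(abs(y - c0))[y > c0])

# Null mean and standard deviation of the statistic under symmetry
null_mean <- n * (n + 1) / 4
null_sd   <- sqrt(n * (n + 1) * (2 * n + 1) / 24)

z <- (v - null_mean) / null_sd
2 * (1 - pnorm(abs(z)))   # two-sided p-value, no continuity correction
```

No continuity correction is applied here, which is why the result lines up with wilcox.test(y, mu = 4, exact = FALSE, correct = FALSE).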
Why is it hard to say what it tests?
Your turn: Sketch worksheet
Performance of the Wilcoxon Signed Rank Test
With no additional assumptions
As a test of the population mean:
- The Wilcoxon Signed Rank test is not asymptotically exact
- The Wilcoxon Signed Rank test is not consistent
As a test of the population median:
- The Wilcoxon Signed Rank test is not asymptotically exact
- The Wilcoxon Signed Rank test is not consistent
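A small simulation can illustrate these claims. This is a sketch under assumptions chosen for the illustration: an Exponential(1) population (asymmetric, mean 1, median log(2)), sample size 200, and 1000 simulated samples per setting. The function name reject_rate is hypothetical.

```r
set.seed(551)   # arbitrary seed, for reproducibility

# Proportion of simulated Exponential(1) samples in which the test
# rejects H0: center = c0 at level alpha.
reject_rate <- function(c0, n = 200, n_sim = 1000, alpha = 0.05) {
  mean(replicate(n_sim, wilcox.test(rexp(n), mu = c0)$p.value < alpha))
}

# Null true for the mean: the mean of Exponential(1) is 1
rate_mean <- reject_rate(c0 = 1)

# Null true for the median: the median of Exponential(1) is log(2)
rate_median <- reject_rate(c0 = log(2))

# Null false for both mean and median: c0 solves P(Y1 + Y2 > 2 * c0) = 0.5,
# using the fact that Y1 + Y2 is Gamma(shape = 2) for Exponential(1) data
rate_other <- reject_rate(c0 = qgamma(0.5, shape = 2) / 2)

c(mean = rate_mean, median = rate_median, other = rate_other)
```

If the claims above hold, the first two rejection rates should sit well above 0.05 even though the corresponding null is true, and the third should stay far from 1 even though c0 is neither the mean nor the median.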
Performance of the Wilcoxon Signed Rank Test
If you add an assumption: the population distribution is symmetric.
- The Wilcoxon Signed Rank test is asymptotically exact
- The Wilcoxon Signed Rank test is consistent
Null hypothesis: $H_0$: population mean = population median = $c_0$
We learn about the mean/median. Of course, we could learn about these parameters directly with a t-test (mean) or sign test (median), without the additional symmetry assumption.
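A simulation sketch of the behavior under symmetry, assuming a t distribution with 5 degrees of freedom as the symmetric population, a hypothesized center of 0, and 1000 simulated samples per setting. The function name reject_rate_sym and the shift of 0.3 are illustrative choices.

```r
set.seed(551)   # arbitrary seed, for reproducibility

# Rejection rate of H0: center = 0 when the data come from a t(5)
# distribution shifted by `shift` (so it is symmetric about `shift`).
reject_rate_sym <- function(shift, n, n_sim = 1000, alpha = 0.05) {
  mean(replicate(n_sim,
    wilcox.test(rt(n, df = 5) + shift, mu = 0)$p.value < alpha))
}

# Null true: the rejection rate should be close to alpha
null_rate <- reject_rate_sym(shift = 0, n = 200)

# Null false: the rejection rate should climb toward 1 as n grows
power <- sapply(c(20, 80, 320),
                function(n) reject_rate_sym(shift = 0.3, n = n))

null_rate
power
```

The first number speaks to asymptotic exactness and the increasing sequence to consistency, at least for this one symmetric population.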
Performance of the Wilcoxon Signed Rank Test
Often presented as:
“The nonparametric Wilcoxon signed rank test compares the median of a single column of numbers against a hypothetical median.” Incorrect without the symmetry assumption; with it, this is equally a test of the mean.
“This is another test that is a non-parametric equivalent of a 1-Sample t-test”. Incorrect without the symmetry assumption.
“The Wilcoxon signed-rank test applies to the case of symmetric continuous distributions. Under this assumption, the mean equals the median. The null hypothesis is $H_0: \mu = \mu_0$.” Correct.
In R
y <- c(0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.3, 8.2,
9.3, 10.1, 10.9, 12.1)
Exact p-values with exact = TRUE (also what you get by default when there are fewer than 50 observations and no ties)
wilcox.test(y, mu = 4, exact = TRUE)
##
## Wilcoxon signed rank test
##
## data: y
## V = 66, p-value = 0.03418
## alternative hypothesis: true location is not equal to 4
In R
y <- c(0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.3, 8.2,
9.3, 10.1, 10.9, 12.1)
Approximate p-values with exact = FALSE and no continuity correction (correct = FALSE)
wilcox.test(y, mu = 4, exact = FALSE, correct = FALSE)
##
## Wilcoxon signed rank test
##
## data: y
## V = 66, p-value = 0.03417
## alternative hypothesis: true location is not equal to 4