# Sign test

## Data Setting

**Population:** \(Y \sim\) some distribution with c.d.f. \(F_Y(y) = P(Y \le y)\)

**Parameter**: \(M = F^{-1}_Y(0.5)\), the population median

**Sample:** \(n\) i.i.d. observations from the population: \(Y_1, \ldots, Y_n\)

**Null hypothesis:** \(H_0: M = M_0\)

## Your Turn:

Consider the hypothesis \(H_0: M = M_0\)

Imagine transforming the \(Y_i, \, i = 1, \ldots, n\) to \[ X_i = \begin{cases} 1, & Y_i \le M_0 \\ 0, & Y_i > M_0 \end{cases} \]

**If the null hypothesis is true (\(M = M_0\)), what is \(P(X_i = 1)\)?**

(You can assume \(Y\) is a continuous distribution)

## Sign test

To test \(H_0: M = M_0\), perform a Binomial test on \(X_i = \pmb{1}\{Y_i \le M_0 \}\) with \(H_0: p = 0.5\).
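One way to sketch this in code is with an exact Binomial test on the indicators. The function name `sign_test_exact` and the data below are hypothetical, and the two-sided p-value here doubles the smaller tail, which is only one of several conventions.

```python
from math import comb

def sign_test_exact(y, m0):
    """Two-sided exact Binomial sign test of H0: median = m0 (illustrative sketch)."""
    n = len(y)
    # X = sum of the indicators 1{Y_i <= m0}
    x = sum(1 for yi in y if yi <= m0)
    # Under H0, X ~ Binomial(n, 0.5), which is symmetric,
    # so double the probability of the smaller tail (capped at 1).
    tail = min(x, n - x)
    p_value = min(1.0, 2 * sum(comb(n, k) for k in range(tail + 1)) / 2**n)
    return x, p_value

# Hypothetical data, not from the slides
x, p_value = sign_test_exact([1.2, 3.4, 5.6, 7.8], m0=2.0)
print(x, p_value)  # 1 0.625
```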

## Example

Consider a sample, \(n = 12\), with the sample values:

Consider testing \(H_0: M = 4\) versus a two-sided alternative \(H_A: M \ne 4\) (at the \(\alpha = 0.05\) level).

\(X_i = \pmb{1}\{Y_i \le 4 \}\)

\(\hat{p}_{M_0} = \frac{1}{n}\sum_{i = 1}^{n}X_i = 0.25\)

\(Z(p_0 = 0.5) = \frac{\hat{p}_{M_0} - p_0}{\sqrt{p_0(1-p_0)/n}} = -1.73\)

Compare to \(z_{1-\alpha/2} = 1.96\)

We **fail to reject** the null hypothesis.
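The arithmetic above can be checked with a few lines of Python, using only the numbers stated in the example (\(n = 12\), \(\hat{p}_{M_0} = 0.25\), \(p_0 = 0.5\), \(\alpha = 0.05\)). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0, alpha = 12, 0.25, 0.5, 0.05

# Z-statistic for the Binomial proportion under H0: p = p0
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-sided critical value z_{1 - alpha/2}
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z_crit
print(round(z, 2), round(z_crit, 2), reject)  # -1.73 1.96 False
```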

## Your turn: 95% Confidence Interval

We can *invert the test* by considering all \(M_0\) for which we would fail to reject the null hypothesis \(H_0: M = M_0\).

**Would you reject for the value on your slip of paper?**

**Why do we only need to consider the actual sample values?**

## Your turn: 95% Confidence Interval

```
## m_0
## 1 0.8
## 2 2.1
## 3 2.8
## 4 4.3
## 5 5.3
## 6 6.1
## 7 7.2
## 8 8.2
## 9 9.3
## 10 10.1
## 11 10.9
## 12 12.1
```

95% confidence interval for \(M\) is \((\qquad, \qquad)\)

## Confidence interval in general

Solve for the \(M_0\) that are not in the rejection region, i.e. that satisfy

\[ \left| \frac{X/n - 0.5}{0.5 / \sqrt{n}} \right| < z_{1-\alpha/2}, \quad \text{where } X = \text{number of observations } \le M_0 \]

\[ -z_{1-\alpha/2} < \frac{X/n - 0.5}{0.5 / \sqrt{n}} < z_{1-\alpha/2} \]

\[ \begin{aligned} n(0.5 -z_{1-\alpha/2}\frac{0.5}{\sqrt{n}}) &< X < n(0.5 + z_{1-\alpha/2}\frac{0.5}{\sqrt{n}}) \\ \frac{1}{2}(n -z_{1-\alpha/2}\sqrt{n}) &< X < \frac{1}{2}(n + z_{1-\alpha/2}\sqrt{n}) \end{aligned} \]

## Confidence interval in general

So, \(M_0\) is in the interval if the number of observations less than or equal to \(M_0\) is between: \[ \frac{1}{2}(n -z_{1-\alpha/2}\sqrt{n}) \quad \text{ and } \quad \frac{1}{2}(n + z_{1-\alpha/2}\sqrt{n}) \]

The smallest value that satisfies this is the \[ \left(\frac{1}{2}(n -z_{1-\alpha/2}\sqrt{n})\right)^{\text{th}} \text{ smallest observation} \]

The largest value that satisfies this is the \[ \left(\frac{1}{2}(n + z_{1-\alpha/2}\sqrt{n}) + 1\right)^{\text{th}} \text{ smallest observation} \]

## Confidence interval for median

Approximate (based on approximate Binomial test) confidence interval for the median:

\[ \begin{aligned} \Biggl( \left(\frac{n - z_{1-\alpha/2}\sqrt{n}}{2} \right)^{\text{th}} \text{ smallest observation}, \\ \left(\frac{n + z_{1-\alpha/2}\sqrt{n}}{2} + 1 \right)^{\text{th}} \text{ smallest observation}\Biggr) \end{aligned} \]

The ranks \((\cdot)^{\text{th}}\) may need to be rounded to the nearest integer.
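A minimal sketch of this interval in Python follows. The function names `median_ci_ranks` and `median_ci` are hypothetical, the data are made up, and the ranks are rounded to the nearest integer as suggested above (other rounding rules are possible and can give a slightly different interval).

```python
from math import sqrt
from statistics import NormalDist

def median_ci_ranks(n, alpha=0.05):
    """Unrounded ranks of the lower and upper order statistics."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = (n - z * sqrt(n)) / 2
    upper = (n + z * sqrt(n)) / 2 + 1
    return lower, upper

def median_ci(y, alpha=0.05):
    """Approximate CI for the median from the normal-approximation sign test."""
    ys = sorted(y)
    n = len(ys)
    lower, upper = median_ci_ranks(n, alpha)
    # Round to the nearest integer and clamp to valid ranks 1..n
    lo = max(1, round(lower))
    hi = min(n, round(upper))
    return ys[lo - 1], ys[hi - 1]  # convert 1-based ranks to 0-based indices

# Hypothetical data: the integers 1..25
print(median_ci(range(1, 26)))  # (8, 18)
```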

## Example, continued

\(n = 12, \alpha = 0.05 \implies\)

\[ \begin{aligned} &\Biggl(\left(\frac{12 - 1.96\sqrt{12}}{2} \right)^{\text{th}} \text{ smallest observation}, \\ &\qquad \left(\frac{12 +1.96\sqrt{12}}{2} + 1 \right)^{\text{th}} \text{ smallest observation} \Biggr) \\ &\left(\left(2.61\right)^{\text{th}} \text{ smallest observation}, \left(10.40\right)^{\text{th}} \text{ smallest observation} \right)\\ &\left(3^{\text{rd}} \text{ smallest observation}, 10^{\text{th}} \text{ smallest observation} \right) \\ &\left(2.8, 10.1 \right) \end{aligned} \]
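The same interval can be recovered by inverting the test by brute force. This sketch assumes the twelve values in the \(m_0\) table above are the ordered sample, which is consistent with the 3rd and 10th smallest observations being 2.8 and 10.1. The helper name `fails_to_reject` is illustrative.

```python
from math import sqrt

# Assumed sample: the twelve values from the m_0 table, already sorted
y = [0.8, 2.1, 2.8, 4.3, 5.3, 6.1, 7.2, 8.2, 9.3, 10.1, 10.9, 12.1]
n = len(y)
z_crit = 1.96

def fails_to_reject(m0):
    """True if H0: M = m0 is not rejected by the approximate sign test."""
    x = sum(1 for yi in y if yi <= m0)
    z = (x / n - 0.5) / (0.5 / sqrt(n))
    return abs(z) < z_crit

# X only changes at sample values, so testing those is enough
kept = [m0 for m0 in y if fails_to_reject(m0)]
print(kept)  # [2.8, 4.3, 5.3, 6.1, 7.2, 8.2, 9.3]

# Any M0 between 9.3 and the next observation still has X = 9,
# so the upper endpoint is the next order statistic (hence the "+ 1").
lower = kept[0]
upper = y[y.index(kept[-1]) + 1]
print((lower, upper))  # (2.8, 10.1)
```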

## Sign test for discrete distributions/data

- Remove all values exactly equal to \(M_0\)
- Proceed with test as usual (with a reduced sample size \(n\))
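The two steps above can be sketched as follows. The data and the helper name `drop_ties` are hypothetical, and the Z-statistic uses the same normal approximation as in the earlier example.

```python
from math import sqrt

def drop_ties(y, m0):
    """Remove observations exactly equal to the hypothesized median."""
    return [yi for yi in y if yi != m0]

# Hypothetical discrete data with three values tied at m0
y = [3, 4, 4, 5, 6, 7, 4, 8]
m0 = 4

y_reduced = drop_ties(y, m0)
n = len(y_reduced)                          # reduced sample size
x = sum(1 for yi in y_reduced if yi <= m0)  # observations below m0

z = (x / n - 0.5) / (0.5 / sqrt(n))
print(n, x, round(z, 2))  # 5 1 -1.34
```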

## Sign test: exactness

**Finite sample exact?** No

- Discrete nature of data means we can’t achieve a lot of significance levels
- Normal approximation is only an approximation…

**Asymptotically exact?** Yes

## Sign test: consistency

The sign test is consistent. This follows from the Binomial test being consistent (which in turn follows from the Z-test being consistent).

## Next time…

Signed Rank test