Inference: Hypothesis Tests
Key Components
- Hypotheses
- Null Hypothesis
- Alternative Hypothesis
Test Statistic
- Reference Distribution (Null Distribution)
- Rejection Region
Possible outcomes
\(H_0\) is true | \(H_A\) is true | |
---|---|---|
Reject Null | Type I error | Correct decision |
Fail to reject Null | Correct decision | Type II error |
Balancing Type I and Type II error
Hypothesis tests are designed to control Type I error
The significance level, \(\alpha\) is the probability of a Type I error: \[ \alpha = P\left( \text{Reject } H_0 \text{ when } H_0 \text{ is true} \right) = P_{H_0}(\text{Reject } H_0) \]
Common choices: \(\alpha = 0.05\), \(\alpha = 0.01\)
Power
The power, of a test is probability of correctly rejecting the null hypothesis (\(H_0\)), but is a function of some actual parameter value (\(\theta_1\)).
\[ \begin{aligned} \text{Power}(\theta_1) &= P\left( \text{Reject } H_0 \text{ when } \theta_1 \text{ is true} \right) \\ &= P_{\theta_1}(\text{Reject } H_0) \\ &= 1 - \beta(\theta_1) \end{aligned} \] for \(\theta_1 \in H_A\) where \(\beta(\theta_1)\) = Probability of Type II error
Statistic
Test statistic: A statistic \(T(Y_1, \ldots, Y_n)\) (i.e. a function of the sample values) that is used to make the decision on whether or not to reject the null hypothesis.
We want the test statistic to capture the evidence the data provides about the hypotheses.
Null Distribution/Rejection Region
Reference Distribution (Null distribution) The distribution the test statistic will be compared to.
Usually, the distribution of the test statistic when the null hypothesis is true.
Rejection region: values of the test statistic that will result in rejecting the null hypothesis.
How do we find it? Consider values of the test statistic which are most unusual and would be more typical if the alternative were true.
Example: rejection region calculation
Ex: Would you reject \(H_0: \mu = 30\) in favor of \(H_A: \mu < 30\) if your sample of 100 OSU freshman had a sample mean hours spent preparing for class:
\(\overline{Y}\)=31?
\(\overline{Y}\)=25?
\(\overline{Y}\)=29.2?
To decide we need to use the sampling distribution of the sample mean.
(We will assume, for now, that the population variance \(\sigma^2 = 25\) is known)
Critical value
The rejection region is of one of these forms:
- Reject \(H_0\) if \(T > c_U\)
- Reject \(H_0\) if \(T < c_L\)
- Reject \(H_0\) if \(T > c_U\) or \(T < c_L\)
\(c_U\) \(c_L\) are called critical values for the test and are chosen to obtain the desired significance level (i.e. to control the Type I error rate).
Example: continued
Consider values of the test statistic which are most unusual and would be more typical if the alternative were true.
If null hypothesis were true: \(\overline{Y} \dot \sim N(\quad , \qquad )\)
More typical values under the alternative would be on the low side of the distribution.
Example: continued
Rejection region will be of the form: Reject \(H_0\) if \(T < c_L\)
\(c_U\) \(c_L\) are are chosen to obtain the desired significance level.
What value of \(c_L\), gives \(P_{H_0}(\text{Reject } H_0) = \alpha = 0.05\)?
qnorm(0.05, mean = 30, sd = sqrt(25/100))
= 29.18
Example: continued
Reject \(H_0\) if \(\overline{Y} < 29.18\)
Ex: Would you reject \(H_0: \mu = 30\) in favor of \(H_A: \mu < 30\) if your sample of 100 OSU freshman had a sample mean hours spent preparing for class:
\(\overline{Y}\)=31? No
\(\overline{Y}\)=25? Yes
\(\overline{Y}\)=29.2? No
Z-test
If we wanted to use a Standard Normal as the reference distribution, we would standardize the sample mean by its mean and standard deviation under the null.
I.e. subtract the hypothesized mean \(\mu_0\) and divide by \(\sqrt{\sigma^2/n}\)
\[ Z(\mu_0) = \frac{\overline{Y} - \mu_0}{\sqrt{\sigma^2/n}} \]
Z-test
Leads to the Z-test.
To test a hypothesis about population mean when population variance is known.
To test \(H_0: \mu = \mu_0\)
Find test statistic: \[ Z(\mu_0) = \frac{\overline{Y} - \mu_0}{\sqrt{\sigma^2/n}} \]
Compare to Standard Normal
Z-test: Critical values
- \(H_A: \mu > \mu_0\), reject \(H_0\) when \(Z(\mu_0) > z_{1-\alpha}\)
- \(H_A: \mu < \mu_0\), reject \(H_0\) when \(Z(\mu_0) < z_{\alpha}\)
- \(H_A: \mu \ne \mu_0\) reject \(H_0\) when \(Z(\mu_0) < z_{\alpha/2}\) or \(Z(\mu_0) > z_{1-\alpha/2}\), equivalently \(|Z(\mu_0)|> z_{1-\alpha/2}\)
\(z_\alpha\), is the value \(z\) such that \(P(Z < z) = \alpha\) where \(Z \sim N(0, 1)\), can find with in R qnorm(alpha)
.
Z-test: Recap
Data Setting One sample, no explanatory variable \(Y_1, \ldots, Y_n\) i.i.d from population with known variance \(\sigma^2\)
Null hypothesis \(H_0: \mu = \mu_0\)
Test statistic \[ Z(\mu_0) = \frac{\overline{Y} - \mu_0}{\sqrt{\sigma^2/n}} \]
Reference distribution \(Z(\mu_0) \dot \sim N(0,1)\)
Rejection Region
One sided \(H_A: \mu < \mu_0\) | Two sided \(H_A: \mu \ne \mu_0\) | One sided \(H_A: \mu > \mu_0\) |
---|---|---|
\(Z(\mu_0) < z_{\alpha}\) | \(|Z(\mu_0)| > z_{1-\alpha/2}\) | \(Z(\mu_0) > z_{1 - \alpha}\) |
How do we know if a hypothesis test is good?
Exactness Is the actual rejection probability equal to the significance level \(\alpha\)?
- Finite sample exactness: For finite samples of size \(n\) is \(P(\text{Reject } H_0) = \alpha\) when null is true?
- Asymptotic exactness: As \(n\) goes to infinity does \(P(\text{Reject } H_0) \rightarrow \alpha\) when null is true?
A test is finite-sample exact if reference distribution is exactly the sampling distribution for test statistic when null is true.
A test is asymptotically exact if reference distribution is the asymptotic sampling distribution for test statistic when null is true.
Your turn
Is the Z-test:
finite sample exact?
When is the sampling distribution of \(\overline{Y}\) exactly \(N(\mu_0, \sigma^2/n)\)?
Only when population distribution is Normal, i.e \(Y \sim N(\mu_0, \sigma^2)\)
asymptotic exact?
When does the sampling distribution of \(\overline{Y}\) approach \(N(\mu_0, \sigma^2/n)\) as \(n \rightarrow \infty\)?
ALWAYS! Thanks to the CLT.
How do we know if a hypothesis test is good?
Consistency
For any fixed setting where alternative is true, does the rejection probability tend to one as sample approaches infinity?
\[ \text{Power}(\theta_1) \rightarrow_p 1, \quad \text{for any } \theta_1 \in H_A \]
(If we can take an infinite sample are we guaranteed to reject null when alternative is in fact true.)
Is the Z-test consistent?
What is the power of the test? \(\mu = \mu_A \ne \mu_0\)
Depends on rejection region (let’s do one sided upper):
\[ \begin{aligned} P\left(Z(\mu_0) > z_{1-\alpha}\right) &= P\left(\frac{\overline{Y} - \mu_0}{\sqrt{\sigma^2/n}} > z_{1-\alpha}\right) \\ & = P\left(\overline{Y} > z_{1-\alpha} \sqrt{\sigma^2/n} + \mu_0 \right) \\ & = P\left(\frac{\overline{Y} - \mu_A}{ \sqrt{\sigma^2/n}} > z_{1-\alpha} + \frac{\mu_0 - \mu_A}{\sqrt{\sigma^2/n} } \right) \\ & = P\left(\frac{\overline{Y} - \mu_A}{ \sqrt{\sigma^2/n}} > z_{1-\alpha} - \frac{\sqrt{n}(\mu_A - \mu_0)}{\sqrt{\sigma^2} } \right) \end{aligned} \]
Is the Z-test consistent?
Upper alternative \(\mu_A - \mu_0 > 0\), as \(n \rightarrow \infty\) subtracting a bigger and bigger term from critical value. So, term on right gets smaller as \(n\) gets larger.
\[ 1 - \Phi\left( z_{1-\alpha} - \frac{\sqrt{n}(\mu_A - \mu_0)}{\sqrt{\sigma^2} }\right) \]
\[ \rightarrow 1-\Phi(-\infty) = 1 - 0 = 1 \]
Yes, Z-test is consistent