Midterm
Next Friday Oct 27th in class.
No outside materials except one double-sided page of your own notes and a calculator.
I’ll be putting up a study guide and practice midterm by Friday.
Two options to vote on:
- No homework due next Thursday
- No homework due the week after the midterm - won by popular vote
Finish the CI for last week’s worksheet
Exact Binomial Test: Takeaways
“Exact” because it uses the exact sampling distribution of the sum of the $Y_i$, i.e. $\sum_{i=1}^{n} Y_i$.
The actual Type I error rate will never be more than α, but may be substantially less (i.e. conservative).
You can invert the test to get a confidence interval, but there isn’t an easy closed form for the interval.
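The conservativeness can be checked directly. Below is a minimal sketch, assuming the n = 12, p0 = 0.4 setting of the `binom.test()` example in these slides and a significance level of α = 0.05 (the variable names are illustrative). It adds up the null probability of every outcome the exact test would reject.

```r
# Actual Type I error rate of the exact binomial test (illustrative setting)
n <- 12        # sample size, as in the binom.test() example
p0 <- 0.4      # hypothesized proportion, as in the same example
alpha <- 0.05  # assumed significance level

x <- 0:n
# Two-sided exact p-value for every possible count of successes
p_values <- sapply(x, function(k) binom.test(k, n, p = p0)$p.value)

# Outcomes in the rejection region and their total probability under H0
reject <- p_values <= alpha
type1_rate <- sum(dbinom(x[reject], size = n, prob = p0))
type1_rate
```

Because the binomial distribution is discrete, only a handful of rejection probabilities are attainable, so the actual rate usually lands below α rather than exactly on it.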
Exact Binomial Test: In R
binom.test(x = 7, n = 12, p = 0.4)
##
## Exact binomial test
##
## data: 7 and 12
## number of successes = 7, number of trials = 12, p-value = 0.2417
## alternative hypothesis: true probability of success is not equal to 0.4
## 95 percent confidence interval:
## 0.2766697 0.8483478
## sample estimates:
## probability of success
## 0.5833333
Exact Binomial Test: In R
binom.test(x = 7, n = 12, p = 0.4)
- x: count of 1’s, i.e. $\sum_{i=1}^{n} Y_i$
- n: sample size
- p: $p_0$, the hypothesized population proportion
The reported CI is a Clopper-Pearson confidence interval, based on the exact distribution but with equal tails (i.e. try to get α/2 in each tail).
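The Clopper-Pearson endpoints can be written as Beta quantiles, which is a standard way to compute them. The sketch below assumes that representation and reuses x = 7, n = 12 from the example; the variable names are illustrative.

```r
# Clopper-Pearson interval from Beta quantiles (sketch)
x <- 7
n <- 12
alpha <- 0.05

# Each endpoint puts alpha/2 in one tail of the exact binomial distribution
# (valid for 0 < x < n; the boundary cases are set to 0 or 1 separately)
lower <- qbeta(alpha / 2, x, n - x + 1)
upper <- qbeta(1 - alpha / 2, x + 1, n - x)
cp_interval <- c(lower, upper)

# Compare with the interval reported by binom.test()
reported <- as.numeric(binom.test(x = x, n = n, p = 0.4)$conf.int)
rbind(cp_interval, reported)
```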
Approximate Binomial Test
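Use the fact that: $\bar{Y} \mathrel{\dot\sim} N\left(E(Y), \frac{Var(Y)}{n}\right) = N\left(p, \frac{p(1-p)}{n}\right)$
Leads to the Z-test where:
$Z(p_0) = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$, with $\hat{p} = \bar{Y}$ = sample proportion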
[Figure: exact distribution of the sample proportion]
[Figure: approximate distribution of the sample proportion]
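To compare the two distributions numerically, here is a rough sketch that tabulates the exact and the Normal-approximation CDF of the sample proportion. The values n = 12 and p = 0.4 are assumed for illustration (they match the earlier `binom.test()` example), and no continuity correction is applied.

```r
# Exact vs. approximate distribution of the sample proportion (sketch)
n <- 12   # assumed sample size
p <- 0.4  # assumed true proportion

x <- 0:n
p_hat <- x / n

# Exact: P(p_hat <= x/n) from the Binomial(n, p) distribution
exact_cdf <- pbinom(x, size = n, prob = p)

# Approximate: the same probability from N(p, p(1 - p)/n)
approx_cdf <- pnorm(p_hat, mean = p, sd = sqrt(p * (1 - p) / n))

comparison <- data.frame(p_hat, exact_cdf, approx_cdf)
max_gap <- max(abs(exact_cdf - approx_cdf))
print(comparison)
max_gap
```

Rerunning with a larger `n` should show the gap shrinking, which is the sense in which the approximation improves with sample size.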
Your turn
library(openintro)
library(dplyr) # provides %>%, group_by() and summarise()
census %>%
  group_by(sex) %>%
  summarise(n = n())
## # A tibble: 2 x 2
##   sex        n
##   <fctr> <int>
## 1 Female   232
## 2 Male     268
Find:
- $\hat{p}$
- The Z-statistic for the test of $H_0: p = 0.5$
Your turn
A confidence interval?
Need to invert the test, i.e. find all $p_0$ that the test does not reject: $|Z(p_0)| = \left|\frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}\right| < z_{1-\alpha/2}$
It’s hard…
Instead use:
$\hat{p} \pm z_{1-\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Based on inverting a (Wald) test with statistic:
$Z_w(p_0) = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1-\hat{p})/n}}$
Asymptotically equivalent to $Z(p_0)$ (which happens to be the Score test)
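Both intervals can be sketched in a few lines. The example below assumes x = 7, n = 12 (from the earlier `binom.test()` example) and α = 0.05. It computes the Wald interval from the formula and inverts the score test numerically by finding the two values of p0 where |Z(p0)| equals the critical value. The helper names `z_score` and `boundary` are illustrative.

```r
# Wald interval vs. numerically inverted score test (sketch)
x <- 7
n <- 12
alpha <- 0.05
p_hat <- x / n
z_crit <- qnorm(1 - alpha / 2)

# Wald interval: standard error uses p_hat
wald_interval <- p_hat + c(-1, 1) * z_crit * sqrt(p_hat * (1 - p_hat) / n)

# Score statistic: standard error uses the hypothesized p0
z_score <- function(p0) (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Interval endpoints are the roots of |Z(p0)| - z_crit on each side of p_hat
boundary <- function(p0) abs(z_score(p0)) - z_crit
lower <- uniroot(boundary, c(1e-8, p_hat), tol = 1e-10)$root
upper <- uniroot(boundary, c(p_hat, 1 - 1e-8), tol = 1e-10)$root
score_interval <- c(lower, upper)

rbind(wald_interval, score_interval)
```

The root finding is what makes the inversion "hard" by hand; the Wald interval avoids it at the cost of using a different test statistic.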
Your turn
library(openintro)
library(dplyr) # provides %>%, group_by() and summarise()
census %>%
  group_by(sex) %>%
  summarise(n = n())
## # A tibble: 2 x 2
##   sex        n
##   <fctr> <int>
## 1 Female   232
## 2 Male     268
Find:
- 95% CI for p.
Can lead to contradictions
A score test, $Z(p_0)$, might not agree with a Wald interval.
Learn to live with it…or don’t calculate things by hand.
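A small illustration of such a disagreement, assuming x = 7 and n = 12 from the earlier example and a hypothetical null value p0 = 0.31 chosen only to expose the gap: the score test rejects at the 5% level, yet the 95% Wald interval still contains p0.

```r
# A case where the score test and the Wald interval disagree (sketch)
x <- 7
n <- 12
p0 <- 0.31     # hypothetical null value, picked for illustration
alpha <- 0.05
p_hat <- x / n
z_crit <- qnorm(1 - alpha / 2)

# Score test: standard error evaluated at p0
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
score_rejects <- abs(z) > z_crit

# Wald interval: standard error evaluated at p_hat
wald_interval <- p_hat + c(-1, 1) * z_crit * sqrt(p_hat * (1 - p_hat) / n)
wald_contains_p0 <- p0 > wald_interval[1] && p0 < wald_interval[2]

c(score_rejects = score_rejects, wald_contains_p0 = wald_contains_p0)
```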
In R
prop.test(x = 232, n = 232 + 268, p = 0.5, correct = FALSE)
##
## 1-sample proportions test without continuity correction
##
## data: 232 out of 232 + 268, null probability 0.5
## X-squared = 2.592, df = 1, p-value = 0.1074
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
## 0.4207282 0.5078208
## sample estimates:
## p
## 0.464
In R
prop.test(x = 232, n = 232 + 268, p = 0.5, correct = FALSE)
Equivalent to $Z(p_0)$ and inverts it to get the confidence interval (i.e. the p-value and CI will agree).
Reports X-squared, the $\chi^2$ statistic; take the square root to get $|Z|$ (the sign is that of $\hat{p} - p_0$).
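The relationship between X-squared and Z can be checked by hand. This sketch uses the counts from the `prop.test()` call above; the variable names are illustrative.

```r
# Check that prop.test()'s X-squared is the square of the score Z (sketch)
x <- 232
n <- 232 + 268
p0 <- 0.5
p_hat <- x / n

# Score Z-statistic computed from the formula
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value <- 2 * pnorm(-abs(z))

# Same test via prop.test(), without continuity correction
fit <- prop.test(x = x, n = n, p = p0, correct = FALSE)
chi_sq <- unname(fit$statistic)

c(z_squared = z^2, x_squared = chi_sq)
c(by_hand = p_value, prop_test = fit$p.value)
```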
When to use the Approximate Binomial test?
Compare to:
binom.test(x = 232, n = 232 + 268, p = 0.5)
##
## Exact binomial test
##
## data: 232 and 232 + 268
## number of successes = 232, number of trials = 500, p-value =
## 0.1174
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.4196128 0.5088153
## sample estimates:
## probability of success
## 0.464
When to use the Approximate Binomial test?
The approximation isn’t great for small expected counts.
OK to use the approximation if: $np_0 > 5$ and $n(1 - p_0) > 5$
(Or something similar)
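The rule of thumb is easy to wrap in a helper. The function name `ok_to_approximate` and its `threshold` argument are hypothetical, introduced only for this sketch.

```r
# Rule-of-thumb check for the Normal approximation (hypothetical helper)
ok_to_approximate <- function(n, p0, threshold = 5) {
  # Both expected counts under H0 must exceed the threshold
  n * p0 > threshold && n * (1 - p0) > threshold
}

ok_to_approximate(n = 12, p0 = 0.4)   # 12 * 0.4 = 4.8, so FALSE
ok_to_approximate(n = 500, p0 = 0.5)  # both expected counts are 250, so TRUE
```

By this rule the small n = 12 example calls for the exact test, while the census sample is comfortably large enough for the approximation.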
Next time…
Use Binomial test as a way to look at population median.