p-values and Confidence Intervals ST551 Lecture 9

Finish last time's slides

p-values

A p-value associated with a hypothesis test of some null hypothesis H0 vs. some alternative HA is the probability, under the null hypothesis, of observing a result at least as extreme as the statistic you observed.

Extreme here means in the direction of the rejection region.

What does extreme mean?

  • One sided lesser: pnorm(z)
  • One sided greater: 1 - pnorm(z)
  • Two sided: 2 * (1 - pnorm(abs(z)))

Example: Let’s say $\bar{Y} = 25.7$, $H_0: \mu = 30$, $H_A: \mu < 30$, $n = 100$, $\sigma^2 = 25$ (known):

z <- (25.7 - 30)/(sqrt(25/100))
pnorm(z)
## [1] 3.985805e-18
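
The three cases listed above can be collected into one small helper. This is only a sketch: the function name `z_p_value` and its `alternative` argument are hypothetical and not part of the lecture code.

```r
# Hypothetical helper: p-value for an observed Z-statistic under each alternative.
z_p_value <- function(z, alternative = c("less", "greater", "two.sided")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pnorm(z),                              # P(Z <= z)
    greater   = pnorm(z, lower.tail = FALSE),          # P(Z >= z)
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE)  # P(|Z| >= |z|)
  )
}

# Same numbers as the example: sample mean 25.7, mu0 = 30, sigma^2 = 25, n = 100
z <- (25.7 - 30) / sqrt(25 / 100)
p_less <- z_p_value(z, "less")
p_two  <- z_p_value(z, "two.sided")
```

Here `pnorm(z, lower.tail = FALSE)` is the same quantity as `1 - pnorm(z)`, but it is computed directly as an upper tail, so it does not round to exactly 0 when z is large.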

What the p-value is not

A p-value is the probability, under the null hypothesis, of observing a result at least as extreme as the result that was actually observed.

A p-value is NOT the probability of the null hypothesis being true!

Not OK: “There is a 3.57% chance that the mean is truly 12.”

OK: “There is a 3.57% chance of observing a z-statistic at least this far from zero when the mean truly is 12.”
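
One way to see the distinction is to simulate data for which the null hypothesis is true by construction. The sketch below reuses the setup of the earlier example (mean 30, variance 25, n = 100); the seed and the number of replications are arbitrary choices, not values from the lecture.

```r
set.seed(551)        # arbitrary seed, for reproducibility only
n_sim <- 10000       # arbitrary number of simulated samples
mu0 <- 30; sigma <- 5; n <- 100

# Every sample is drawn with the null hypothesis true (mu = mu0).
p_values <- replicate(n_sim, {
  y <- rnorm(n, mean = mu0, sd = sigma)
  z <- (mean(y) - mu0) / (sigma / sqrt(n))
  pnorm(z)           # one-sided (lesser) p-value
})

# Proportion of samples with p <= 0.05; this should be close to 0.05
prop_small <- mean(p_values <= 0.05)
prop_small
```

The null is true in every replication, yet small p-values still turn up at about the rate the p-value describes. The p-value is a statement about the data given the null, not about the null given the data.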

American Statistical Association Statement on p-values

p-values as measures of evidence against the null hypothesis

From Statistical Sleuth

E.g. if $p < 0.001$ for $H_0: \mu = 30$ vs. $H_A: \mu < 30$:

  • “There is convincing evidence against the hypothesis that the mean time spent preparing for class by freshmen at OSU is equal to 30”.

  • Also OK: “There is convincing evidence the mean time spent preparing for class by freshmen at OSU is less than 30”.

  • Not OK: “There is convincing evidence the mean time spent preparing for class by freshmen at OSU is equal to 30” - wrong direction

  • Not OK: “There is no evidence the mean time spent preparing for class by freshmen at OSU is equal to 30” - p-values don’t give evidence for the null

Your turn:

Now imagine the p-value is $p = 0.5$. Which of the following are correct conclusions?

  • “There is no evidence the mean time spent preparing for class by freshmen at OSU is less than 30”.

  • “There is no evidence the mean time spent preparing for class by freshmen at OSU is equal to 30”.

  • “There is convincing evidence the mean time spent preparing for class by freshmen at OSU is equal to 30”.

  • “There is no evidence against the hypothesis that the mean time spent preparing for class by freshmen at OSU is equal to 30”.

p-values and rejection regions

$p \le \alpha$: Reject $H_0: \mu = \mu_0$ at significance level $\alpha$

$p > \alpha$: Fail to reject $H_0: \mu = \mu_0$ at significance level $\alpha$
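
The two decision rules give the same answer. The sketch below checks this for the one-sided (lesser) alternative, where the rejection region is Z at or below the alpha quantile of the standard normal. The choice of alpha = 0.05 and the z values other than -8.6 are illustrative assumptions.

```r
alpha <- 0.05
# First value is from the earlier example; the rest are illustrative.
z <- c(-8.6, -2, -1.5, 0, 1)

p <- pnorm(z)                           # one-sided (lesser) p-values
reject_by_p      <- p <= alpha          # decision from the p-value
reject_by_region <- z <= qnorm(alpha)   # decision from the rejection region

data.frame(z, p, reject_by_p, reject_by_region)
```

Because `pnorm` is increasing, `pnorm(z) <= alpha` holds exactly when `z <= qnorm(alpha)`, so the last two columns always agree.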

XKCD picture

Inference: Confidence Intervals

A confidence interval gives a range of plausible values for the parameter.

A hypothesis test asks if a value is plausible.

Leads to:

  • A $(1-\alpha)100\%$ confidence interval is the set of all null hypotheses that would not be rejected at level $\alpha$.

  • That is, $\mu_0$ is in a two-sided $(1-\alpha)100\%$ confidence interval for $\mu$ if $H_0: \mu = \mu_0$ would not be rejected at level $\alpha$ vs. a two-sided alternative.
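
This definition can be applied literally: test many candidate null values and keep the ones that are not rejected. The sketch below does so with the numbers from the earlier example (sample mean 25.7, variance 25, n = 100) at alpha = 0.05; the grid range and step size are arbitrary choices.

```r
ybar <- 25.7; sigma <- 5; n <- 100; alpha <- 0.05

# Candidate null values mu0 on a fine grid (range and step are arbitrary)
mu0 <- seq(23, 28, by = 0.001)

z <- (ybar - mu0) / (sigma / sqrt(n))
p <- 2 * pnorm(abs(z), lower.tail = FALSE)   # two-sided p-value for each mu0

# The confidence interval is the set of mu0 that are not rejected
not_rejected  <- mu0[p > alpha]
ci_from_tests <- range(not_rejected)
ci_from_tests   # roughly 24.72 to 26.68, up to the grid resolution
```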

CI for Z-test

Rejection region for two-sided alternative: $|Z(\mu_0)| > z_{1-\alpha/2}$

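We want all $\mu_0$ that satisfy

$$\left|\frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}\right| < z_{1-\alpha/2}$$

Or equivalently,

$$z_{\alpha/2} < \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} < z_{1-\alpha/2}$$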
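
To isolate the null value in the double inequality above, use the symmetry of the standard normal quantiles (the alpha/2 quantile is the negative of the 1 - alpha/2 quantile), multiply through by the standard error, subtract the sample mean, and multiply by -1, which reverses the inequalities. Written out as a sketch:

```latex
\begin{aligned}
z_{\alpha/2} < \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} < z_{1-\alpha/2}
&\iff -z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \bar{Y} - \mu_0 < z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
&\iff \bar{Y} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu_0 < \bar{Y} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\end{aligned}
```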

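Leads to $(1-\alpha)100\%$ confidence intervals of the form

$$\left(\bar{Y} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{Y} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$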

Sometimes called a Z-confidence interval.

For $\alpha = 0.05$: $z_{1-\alpha/2} = 1.96 \approx 2$

Interpretation of CIs

  • $(1-\alpha)100\%$ of the time that you perform this experiment, the interval you construct will contain the true value of $\mu$.
  • E.g. in $(1-\alpha)100\%$ of possible random samples from the population, the interval constructed from the sample contains the true $\mu$.
  • It is incorrect to say that the probability the true mean is inside a specific interval is, e.g., 95%.
  • The correct statement is “95% of the time, intervals constructed in this manner will include $\mu$”.
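
The "95% of the time" statement is about the procedure, and it can be checked by repeating the experiment many times in simulation. In the sketch below the true mean, variance, sample size, seed, and number of repetitions are all illustrative assumptions.

```r
set.seed(9)          # arbitrary seed, for reproducibility only
n_sim <- 10000       # arbitrary number of repeated experiments
mu <- 30; sigma <- 5; n <- 100; alpha <- 0.05   # illustrative "true" values

half_width <- qnorm(1 - alpha / 2) * sigma / sqrt(n)

# For each repeated experiment, record whether the interval contains mu
covered <- replicate(n_sim, {
  ybar <- mean(rnorm(n, mean = mu, sd = sigma))
  (ybar - half_width < mu) && (mu < ybar + half_width)
})

coverage <- mean(covered)
coverage   # should be close to 0.95
```

Each individual interval either contains the true mean or it does not; the 95% describes how often the construction succeeds across repetitions.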

A statistical summary

When summarizing an analysis, state:

  • Point estimate
  • Confidence interval estimate with confidence level
  • p-value and conclusion against the null, worded in context without notation.

(Any other information necessary to understand what analysis was undertaken)

A statistical summary: example

Let’s say the mean time spent preparing for class in our sample of 100 OSU freshmen is $\bar{Y} = 25.7$. (Still assuming a known variance of $\sigma^2 = 25$.)

  • There is convincing evidence OSU freshmen (Fall 2017) spend less than 30 hours per week preparing for classes (one-sided p-value, p < 0.001, from Z-test).
  • We estimate that the mean time OSU freshmen (Fall 2017) spent preparing for class was 25.7 hours per week.
  • With 95% confidence, the mean time OSU freshmen spend preparing for class is between 24.72 and 26.68 hours per week.

Report the p-value rounded to 2 significant figures if it is greater than 0.001.
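
The numbers in this summary can be reproduced in a few lines. This is a sketch using the values stated above; the variable names are arbitrary.

```r
ybar <- 25.7; mu0 <- 30; sigma2 <- 25; n <- 100
se <- sqrt(sigma2 / n)                       # standard error of the sample mean

z <- (ybar - mu0) / se
p_value <- pnorm(z)                          # one-sided (lesser) p-value
ci <- ybar + c(-1, 1) * qnorm(0.975) * se    # 95% Z-confidence interval

round(ci, 2)      # interval endpoints reported in the summary
p_value < 0.001   # TRUE, so the summary reports "p < 0.001"
```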

Your turn:

A random sample of $n = 25$ Corvallis residents had an average IQ score of 104. Assume a population variance of $\sigma^2 = 225$. What’s the mean IQ for Corvallis residents? Is it plausible the mean for Corvallis residents is greater than 100?

Write a statistical summary.

Next time

What if we don’t know the population variance?