Submit your answers on Canvas.
1. p-values
Download the ASA’s Statement on p-values (see the Wasserstein and Lazar 2016 reference below).
Skip the “Context, Process and Purpose” section and read the section “ASA Statement on Statistical Significance and P-values” starting on page 3.
Answer the following questions:
We usually think about a small p-value providing evidence against the null hypothesis. What else does the article imply a small p-value may cast doubt on?
What is the primary argument for not basing scientific conclusions or policy decisions solely on whether the p-value is below some threshold?
What is p-hacking?
Can a p-value measure the size of an effect? What can measure the size of an effect?
Skim through the references in “A brief p-Values and Statistical Significance Reference List”, and shortlist three article titles that interest you. (You may be required to read one of these in a future homework.)
2. Data analysis
The Behavioral Risk Factor Surveillance System (BRFSS) is a nationwide health-related survey of U.S. residents. For this question you can get a sample of responses from the 2003 survey by downloading an R data file from the class website:
library(tidyverse)
download.file("http://st551.cwick.co.nz/data/brfss.rds",
"brfss.rds", mode = "wb")
Then load it into the variable brfss with:
brfss <- read_rds("brfss.rds")
brfss
The variables weight_kg and wtdesire_kg correspond to the responses to the questions:
- About how much do you weigh without shoes?
- How much would you like to weigh?
respectively, converted to kilograms.
You can create a variable to represent the amount of weight a respondent would like to lose with:
brfss <- mutate(brfss, desired_loss = weight_kg - wtdesire_kg)
- Find summary statistics (mean, standard deviation and number of observations) for desired_loss for both males and females in the sample.
- Produce histograms of desired_loss for both males and females.
- Do US resident females, on average, want to lose weight (i.e. is the mean desired loss greater than zero)? Conduct the appropriate analyses and write a statistical summary of your findings.
- Do US resident males, on average, want to lose weight (i.e. is the mean desired loss greater than zero)? Conduct the appropriate analyses and write a statistical summary of your findings.
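As a rough illustration of the workflow these questions ask for, the sketch below runs the same steps on simulated data rather than on brfss. The data frame `sim`, the grouping column name `sex`, and all simulated values are assumptions made for the example. Check the actual column names in brfss before adapting it.

```r
library(tidyverse)

set.seed(551)

# Simulated stand-in for the survey data. The column name `sex` and all
# values below are assumptions for illustration, not taken from brfss.
sim <- tibble(
  sex = rep(c("Female", "Male"), each = 200),
  desired_loss = c(rnorm(200, mean = 8, sd = 10),
                   rnorm(200, mean = 5, sd = 10))
)

# Mean, standard deviation and number of observations by group
sim_summary <- sim %>%
  group_by(sex) %>%
  summarise(
    mean_loss = mean(desired_loss),
    sd_loss = sd(desired_loss),
    n_obs = n()
  )
sim_summary

# One histogram per group, stacked so the scales are easy to compare
ggplot(sim, aes(x = desired_loss)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~ sex, ncol = 1)

# One-sample t-test of H0: mu = 0 against the one-sided HA: mu > 0
females <- filter(sim, sex == "Female")
female_test <- t.test(females$desired_loss, mu = 0, alternative = "greater")
female_test
```

The one-sided alternative matches the question as worded ("greater than zero"). A two-sided test with a confidence interval is also a reasonable choice if you want to report an interval estimate of the mean desired loss.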
3. Performance of t-test
Explore the Type I error rate of the t-test for a two-sided level \(\alpha = 0.05\) test, for samples of size \(n = 5, 10, 25, 50\), for one of the following population distributions:
- Uniform(0, 1)
- Chi-squared(1)
- Beta(.5, .5)
- Exponential(1)
Use at least 10,000 simulations for each scenario.
Provide a table of the estimated Type I error rate by sample size.
Write a short (3-5 sentence) summary of how the t-test performs: is it close enough to exact that you would be comfortable using it even when the underlying distribution is as far from normal as these distributions?
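One possible structure for the simulation is sketched below. It uses a Normal(0, 1) population, which is not one of the assigned distributions, as a baseline where the t-test is exact. The function name `type1_rate` and its arguments are hypothetical, chosen for this example only.

```r
set.seed(551)

# Estimate the Type I error rate of a two-sided one-sample t-test.
# `rpop` is a function that draws a sample of size n from the population.
# `mu0` must equal the true population mean so that the null hypothesis holds.
type1_rate <- function(n, rpop, mu0, n_sim = 10000, alpha = 0.05) {
  p_values <- replicate(n_sim, t.test(rpop(n), mu = mu0)$p.value)
  # Proportion of simulated samples in which the true null is rejected
  mean(p_values < alpha)
}

sample_sizes <- c(5, 10, 25, 50)

# Baseline: Normal(0, 1) population, true mean 0
rates <- sapply(sample_sizes, type1_rate, rpop = rnorm, mu0 = 0)

data.frame(n = sample_sizes, type1_error = rates)
```

To study one of the listed distributions, replace `rpop` with a function that samples from it and set `mu0` to that distribution's true mean. If `mu0` is not the true mean, the rejection rate estimates power rather than the Type I error rate.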
References
Wasserstein, Ronald L, and Nicole A Lazar. 2016. “The ASA’s Statement on P-Values: Context, Process, and Purpose.” The American Statistician 70 (2): 129–33. doi:10.1080/00031305.2016.1154108.