Homework 2

Due 2017/10/12

Submit your answers on Canvas.

1. Central Limit Theorem Exploration

Using the code from lab as a guide, for each of the following distributions, explore how well the CLT Normal approximation matches the sampling distribution of the sample mean. You should provide:

  • a figure that illustrates the population distribution,
  • a series of figures of the simulated sampling distributions for different sample sizes, along with the approximation based on the CLT, and
  • a brief summary of your observations.

The distributions are:

  1. A continuous Uniform distribution on the interval \([0, 1]\).
  2. A Gamma distribution with shape parameter 2 and scale parameter 2.
  3. A Beta(0.5, 0.5) distribution.
  4. Some other distribution of your choice.
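
As a rough starting point, the simulation loop could be organized as in the sketch below. It is not the lab code: the helper name simulate_means, the number of replicates, the sample sizes, and the Exponential(rate = 1) example population are all assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical helper: draw n_sim samples of size n from the sampler rdist
# and return the n_sim sample means.
simulate_means <- function(rdist, n, n_sim = 10000) {
  replicate(n_sim, mean(rdist(n)))
}

# Example population (an assumption): Exponential(rate = 1),
# which has mean 1 and standard deviation 1.
pop_mean <- 1
pop_sd <- 1

for (n in c(2, 10, 50)) {
  means <- simulate_means(function(k) rexp(k, rate = 1), n)
  # Histogram of the simulated sampling distribution, on the density scale.
  hist(means, breaks = 50, freq = FALSE,
       main = paste("Sample means, n =", n), xlab = "Sample mean")
  # CLT approximation: Normal(pop_mean, pop_sd^2 / n).
  curve(dnorm(x, mean = pop_mean, sd = pop_sd / sqrt(n)),
        add = TRUE, lwd = 2)
}
```

Swapping in a different sampler (and its mean and standard deviation) is the only change needed to reuse the loop for another population.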

2. Applying the CLT

  1. Consider a continuous Uniform(-1, 1) population, and an i.i.d. sample of size n = 25.
    1. Simulate to estimate \(P(0.25 < \overline{Y} < 0.75)\) where \(\overline{Y}\) is the sample mean.

    2. Use the Central Limit Theorem to approximate the same probability.
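
The two approaches can be compared side by side with something like the sketch below. The helper names sim_prob and clt_prob are hypothetical, and the population, sample size, and interval are illustrative values rather than those of the exercise.

```r
set.seed(2)

# Hypothetical helper: estimate P(lower < Ybar < upper) by simulation.
sim_prob <- function(rdist, n, lower, upper, n_sim = 20000) {
  means <- replicate(n_sim, mean(rdist(n)))
  mean(means > lower & means < upper)  # proportion of sample means in the interval
}

# Hypothetical helper: CLT approximation, treating Ybar as Normal(mu, sigma^2 / n).
clt_prob <- function(mu, sigma, n, lower, upper) {
  se <- sigma / sqrt(n)
  pnorm(upper, mean = mu, sd = se) - pnorm(lower, mean = mu, sd = se)
}

# Illustrative values only: Uniform(0, 1) has mean 1/2 and variance 1/12.
p_sim <- sim_prob(function(k) runif(k), n = 12, lower = 0.4, upper = 0.6)
p_clt <- clt_prob(mu = 0.5, sigma = sqrt(1 / 12), n = 12, lower = 0.4, upper = 0.6)
c(simulated = p_sim, clt = p_clt)
```

The simulated value carries Monte Carlo error, so small differences between the two numbers do not by themselves indicate a poor approximation.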

  2. Consider a Bernoulli(0.4) population, and an i.i.d. sample of size n = 5.
    1. Simulate to estimate \(P(\overline{Y} < 0.3)\) where \(\overline{Y}\) is the sample mean.

    2. Find the same probability exactly using the fact that the sum of \(n\) Bernoulli(p) random variables has a Binomial(n, p) distribution and the pbinom() function in R.

    3. Use the Central Limit Theorem to approximate the same probability.
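
One detail that is easy to get wrong is the strict inequality, because pbinom(q, size, prob) returns \(P(S \le q)\). The sketch below shows one way to handle it, using illustrative values (n = 8, p = 0.3, cutoff 0.5) rather than those of the exercise; the variable names are assumptions.

```r
# Illustrative values only (not those of the exercise).
n <- 8
p <- 0.3
cutoff <- 0.5

# P(Ybar < cutoff) = P(S < n * cutoff), where S is the sum of the sample
# and S ~ Binomial(n, p). Because pbinom() gives P(S <= q), a strict
# inequality needs the largest integer strictly below n * cutoff.
q <- ceiling(n * cutoff) - 1
exact <- pbinom(q, size = n, prob = p)

# CLT approximation: Ybar is approximately Normal(p, p * (1 - p) / n).
approx <- pnorm(cutoff, mean = p, sd = sqrt(p * (1 - p) / n))

# Simulation check: the sum of n Bernoulli(p) draws is one Binomial(n, p) draw.
set.seed(3)
ybar <- rbinom(20000, size = n, prob = p) / n
simulated <- mean(ybar < cutoff)

c(exact = exact, clt = approx, simulated = simulated)
```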

3. Z-test

Suppose a random sample of n = 25 households in Corvallis resulted in a sample mean household size (i.e., the number of people living in the household) of 2.8. Assume the population variance of household size is known to be \(\sigma^2 = 1.96\).

  1. Compute the Z-statistic for testing the null hypothesis \(H_0: \mu = 2.6\).

  2. Perform a level \(\alpha = 0.1\) test of \(H_0: \mu = 2.6\) versus the one-sided upper alternative \(H_A: \mu > 2.6\).
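
For reference, the general form of the upper-tailed Z-test with known variance can be written as follows. The symbols \(\mu_0\) (the null value of the mean) and \(z_{1-\alpha}\) (the \(1-\alpha\) quantile of the standard Normal, available in R as qnorm(1 - alpha)) are notation assumed here rather than taken from the assignment.

\[
Z = \frac{\overline{Y} - \mu_0}{\sigma / \sqrt{n}}, \qquad \text{reject } H_0 \text{ in favor of } H_A: \mu > \mu_0 \text{ when } Z > z_{1-\alpha}.
\]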

  3. Now suppose that we perform a Z-test, but incorrectly assume that the population variance is 0.49 instead of the true value 1.96. If the null hypothesis is true, what proportion of the time will we reject the null if we are using a level \(\alpha = 0.1\) critical value?
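
A question like this can also be checked by simulation. The sketch below uses illustrative values (not those of the exercise) and assumes a Normal population purely for convenience; all variable names are hypothetical.

```r
set.seed(4)

# Illustrative values only (not those of the exercise).
mu0 <- 10         # null mean, which is also the true mean here
sigma_true <- 3   # standard deviation actually generating the data
sigma_used <- 2   # standard deviation wrongly plugged into the Z-statistic
n <- 16
alpha <- 0.1

# Simulate many samples under the null and apply the misspecified
# upper-tailed test to each one.
ybar <- replicate(20000, mean(rnorm(n, mean = mu0, sd = sigma_true)))
z <- (ybar - mu0) / (sigma_used / sqrt(n))
reject_rate <- mean(z > qnorm(1 - alpha))

# Analytic counterpart: under the null the computed statistic is
# Normal(0, (sigma_true / sigma_used)^2) rather than Normal(0, 1).
analytic <- 1 - pnorm(qnorm(1 - alpha) * sigma_used / sigma_true)

c(simulated = reject_rate, analytic = analytic)
```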

  4. If the true population mean household size in Corvallis is 2.7, what is the power of the level \(\alpha = 0.1\) test from part 2?

  5. If your sample contained n = 100 households instead of n = 25, what would the power of the test be?

  6. If, instead of performing a level \(\alpha = 0.1\) Z-test with your original \(n = 25\) households, you performed a level \(\alpha = 0.05\) Z-test with that same sample size, what would the power of the test be?

  7. Can you generalize these results? How does power change as you increase sample size? How does power change as you increase the significance level?
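
To explore these questions numerically, the power of an upper-tailed Z-test can be wrapped in a small function, as in the sketch below. The function name z_power and its argument names are hypothetical, and the values passed to it are illustrative rather than those of the exercise. The formula used is \(1 - \Phi\left(z_{1-\alpha} - (\mu_A - \mu_0)/(\sigma/\sqrt{n})\right)\), where \(\mu_A\) denotes the true mean.

```r
# Hypothetical helper: power of the level-alpha upper-tailed Z-test of
# H0: mu = mu0 when the true mean is mu_alt and sigma is known.
z_power <- function(mu_alt, mu0, sigma, n, alpha) {
  # Mean of the Z-statistic when the true mean is mu_alt.
  shift <- (mu_alt - mu0) / (sigma / sqrt(n))
  # P(Z > critical value), with Z ~ Normal(shift, 1).
  1 - pnorm(qnorm(1 - alpha) - shift)
}

# Illustrative values only: evaluate the power over a grid of sample sizes
# and significance levels (rows: n, columns: alpha).
sizes <- c(10, 40, 160)
levels <- c(0.01, 0.05, 0.2)
power_grid <- outer(sizes, levels,
                    function(n, a) z_power(mu_alt = 5.5, mu0 = 5, sigma = 2,
                                           n = n, alpha = a))
dimnames(power_grid) <- list(n = sizes, alpha = levels)
power_grid
```

Reading across the rows and down the columns of the resulting table is one way to look for the patterns the last question asks about.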

4. Good estimates

In class we saw three criteria which we could use to judge an estimate: bias, mean squared error and consistency. For the sample mean (as an estimate of the population mean) we could evaluate these criteria analytically, but that isn’t always possible.
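
Because the sample mean has known analytic answers (bias 0 and MSE \(\sigma^2/n\)), it makes a convenient check that a simulation is set up sensibly. The sketch below does this for a Normal population; the parameter values, sample sizes, and variable names are assumptions chosen for illustration.

```r
set.seed(5)

# Illustrative values only.
mu <- 3
sigma <- 2
n_sim <- 10000

for (n in c(5, 20, 80)) {
  # n_sim simulated values of the estimate (here, the sample mean).
  est <- replicate(n_sim, mean(rnorm(n, mean = mu, sd = sigma)))
  bias_hat <- mean(est) - mu     # simulated bias; the analytic value is 0
  mse_hat <- mean((est - mu)^2)  # simulated MSE; the analytic value is sigma^2 / n
  cat(sprintf("n = %3d  bias = %7.4f  MSE = %.4f  sigma^2/n = %.4f\n",
              n, bias_hat, mse_hat, sigma^2 / n))
}
```

The simulated values match the analytic ones only up to Monte Carlo error, and only for the particular \(\mu\), \(\sigma\), and \(n\) that were tried.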

Consider this situation:

  • A Normal\((\mu, \sigma^2)\) population, and
  • the sample median as an estimate of the population median.

Describe how you would use simulation to evaluate the three criteria for this estimate.

You do not need to do the simulation, but be specific about what you would simulate and what you would calculate or look for in the simulated values. You should also comment on any limitations you see in attempting to evaluate these properties by simulation. (It may help to know that if \(MSE(\widehat{\theta}) \rightarrow 0\) as \(n \rightarrow \infty\), then \(\widehat{\theta}\) is consistent.)