Announcements
Lectures this week:
- Today's lecture: Delta method and Bootstrap
- Weds lecture: Randomization & Permutation
- Friday lecture: Cancelled - Office hours instead
Formula Sheet: The final is closed book, with no note sheet. I am willing to provide some of the harder (less common) formulae.
Lab: No set material; I’ll encourage Chuan to lead a formula strategy session.
Delta Method
If the sampling distribution of a statistic converges to a Normal distribution, the Delta method provides a way to approximate the sampling distribution of a function of that statistic.
Univariate Delta Method
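If
$$\sqrt{n}\,\bigl(\hat{\theta} - \theta\bigr) \to_D N(0, \sigma^2)$$
then
$$\sqrt{n}\,\bigl(g(\hat{\theta}) - g(\theta)\bigr) \to_D N\bigl(0, \sigma^2 [g'(\theta)]^2\bigr)$$
(As long as $g'(\theta)$ exists and is non-zero.)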
Another way of saying it
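If we know, approximately for large $n$,
$$\hat{\theta} \;\dot\sim\; N\bigl(\theta, \sigma^2 / n\bigr)$$
then,
$$g(\hat{\theta}) \;\dot\sim\; N\bigl(g(\theta), \sigma^2 [g'(\theta)]^2 / n\bigr)$$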
The approximation can be pretty rough, i.e. just because the sample is large enough that the original statistic is reasonably Normal doesn’t mean the transformed statistic will be.
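A minimal simulation sketch of this point. The choices here are assumptions made for illustration, not part of the lecture: the population is Exponential with mean 1, the statistic is the sample mean, and the transformation is $g = \log$. The delta method then gives an approximate standard deviation of $1/\sqrt{n}$ for the log of the sample mean, which the code compares against a simulated standard deviation at a small and a large sample size.

```python
import numpy as np

rng = np.random.default_rng(0)

def sd_ratio(n, reps=10_000):
    """Simulated sd of log(sample mean) divided by the delta-method sd."""
    # Each row is one sample of size n from an Exponential population with mean 1
    # (an assumed population, chosen only for illustration).
    means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    simulated_sd = np.log(means).std(ddof=1)
    # Delta method with g = log: g'(mu) = 1/mu = 1 and sigma = 1,
    # so the approximate sd of log(sample mean) is 1 / sqrt(n).
    delta_sd = 1.0 / np.sqrt(n)
    return simulated_sd / delta_sd

ratio_small = sd_ratio(5)
ratio_large = sd_ratio(500)
print(f"n = 5:   simulated sd / delta-method sd = {ratio_small:.3f}")
print(f"n = 500: simulated sd / delta-method sd = {ratio_large:.3f}")
```

A ratio near 1 means the delta-method standard deviation matches the simulation. Comparing standard deviations checks only one aspect of the approximation; a histogram or Normal Q-Q plot of the transformed statistic would show whether its shape is close to Normal.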
Example: Log Odds
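Let $Y_1, \ldots, Y_n$ be i.i.d. Bernoulli($p$), and $\hat{p} = \overline{Y}$, the sample proportion.
We know $\sqrt{n}\,(\hat{p} - p) \to_D N\bigl(0, p(1-p)\bigr)$.
We might estimate the log odds with: $\log\left(\frac{\hat{p}}{1 - \hat{p}}\right)$
What is the asymptotic distribution of the estimated log odds?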
Example: Log Odds cont.
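A sketch of how the calculation can go, assuming $\hat{p}$ is the sample proportion, $0 < p < 1$, and writing $g(p) = \log\bigl(p/(1-p)\bigr)$ for the log odds (the name $g$ is introduced here only for illustration):

```latex
\begin{align*}
g(p) &= \log\left(\frac{p}{1-p}\right) = \log p - \log(1-p), \\
g'(p) &= \frac{1}{p} + \frac{1}{1-p} = \frac{1}{p(1-p)}, \\
\sqrt{n}\,\bigl(\hat{p} - p\bigr) &\to_D N\bigl(0,\; p(1-p)\bigr) \\
\Rightarrow\quad
\sqrt{n}\,\bigl(g(\hat{p}) - g(p)\bigr) &\to_D N\Bigl(0,\; p(1-p)\,[g'(p)]^2\Bigr)
  = N\left(0,\; \frac{1}{p(1-p)}\right).
\end{align*}
```

So for large $n$ the estimated log odds is approximately Normal, centered at the true log odds, with variance $1 / \bigl(n\,p(1-p)\bigr)$.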
Other comments on delta method
Derived using a Taylor expansion of $g(\hat{\theta})$ around $\theta$
There is also a multivariate version (useful if you need some function of two statistics, e.g. ratio of sample means)
Bootstrap
A method to approximate the sampling distribution of a statistic
Idea:
- Recall, one way to approximate the sampling distribution of a statistic was by simulation, but you have to assume a population distribution.
- The bootstrap uses the empirical distribution function as an estimate for the population distribution, i.e. relies on $\hat{F}(y) \approx F(y)$
Example - Sampling distribution of Median by simulation
Assume a population distribution, i.e. a c.d.f. $F(y)$
Repeat for $k = 1, \ldots, B$
- Sample $n$ observations from $F(y)$
- Find sample median, $m^{(k)}$
Then the simulated sample medians, $m^{(1)}, \ldots, m^{(B)}$, approximate the sampling distribution of the sample median.
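The simulation recipe above might be sketched as follows. The population (Normal with mean 5 and standard deviation 2), the sample size, and the number of repetitions are all assumptions picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 10      # sample size (assumed for illustration)
B = 5_000   # number of simulated samples (assumed)

# Assumed population distribution: Normal(mean 5, sd 2).
# Each row is one simulated sample of n observations.
samples = rng.normal(loc=5.0, scale=2.0, size=(B, n))

# One sample median per simulated sample.
sim_medians = np.median(samples, axis=1)

print("mean of simulated medians:", sim_medians.mean())
print("sd of simulated medians:  ", sim_medians.std(ddof=1))
```

The array `sim_medians` plays the role of the simulated sample medians; a histogram of it approximates the sampling distribution of the median under the assumed population.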
Example - Sampling distribution of Median by bootstrap
Estimate the population distribution from the sample, i.e. use the empirical c.d.f. $\hat{F}(y)$
Repeat for $k = 1, \ldots, B$
- Sample $n$ observations from a population with c.d.f. $\hat{F}(y)$
- Find sample median, $m^{(k)}$
Then the bootstrapped sample medians, $m^{(1)}, \ldots, m^{(B)}$, approximate the sampling distribution of the sample median.
Sampling from a c.d.f
You can sample from any c.d.f by sampling from a Uniform(0, 1), then transforming with the inverse c.d.f.
I.e. sample $u_1, \ldots, u_n$ i.i.d. from Uniform(0,1), then
$y_i = F^{-1}(u_i)$, $i = 1, \ldots, n$, are distributed with c.d.f. $F(y)$.
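As a small illustration of inverse-c.d.f. sampling, the sketch below targets an Exponential distribution, whose c.d.f. $F(y) = 1 - e^{-\lambda y}$ has the closed-form inverse $F^{-1}(u) = -\log(1-u)/\lambda$. The choice of distribution, the rate value, and the function name `inverse_cdf` are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

rate = 2.0  # assumed rate of the Exponential target distribution

def inverse_cdf(u, rate):
    """Inverse of F(y) = 1 - exp(-rate * y), defined for 0 <= u < 1."""
    return -np.log1p(-u) / rate

# Draw Uniform(0, 1) values, then transform them with the inverse c.d.f.
u = rng.uniform(0.0, 1.0, size=100_000)
y = inverse_cdf(u, rate)

print("sample mean:     ", y.mean())
print("theoretical mean:", 1.0 / rate)
```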
In the empirical case
Sampling from the ECDF is equivalent to sampling with replacement from the original sample.
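A quick numerical check of this equivalence, using a hypothetical five-value sample. The inverse of the empirical c.d.f. sends a Uniform(0, 1) draw $u$ to the $\lceil n u \rceil$-th smallest observation; the code uses the zero-based index $\lfloor n u \rfloor$, which selects the same observation except on boundary values that occur with probability zero.

```python
import numpy as np

rng = np.random.default_rng(3)

y = np.array([1.8, 2.2, 2.7, 5.7, 6.9])  # hypothetical sample
n = len(y)
y_sorted = np.sort(y)
draws = 50_000

# Route 1: Uniform(0, 1) draws pushed through the inverse empirical c.d.f.
u = rng.uniform(0.0, 1.0, size=draws)
idx = np.floor(n * u).astype(int)  # zero-based position among the sorted values
via_inverse_ecdf = y_sorted[idx]

# Route 2: sample with replacement directly from the original sample.
direct = rng.choice(y, size=draws, replace=True)

# Under both routes each observed value should appear with frequency close to 1/n.
for value in y_sorted:
    print(value, np.mean(via_inverse_ecdf == value), np.mean(direct == value))
```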
Example - Sampling distribution of Median by bootstrap
Repeat for $k = 1, \ldots, B$
- Sample $n$ observations with replacement from $y_1, \ldots, y_n$
- Find sample median, $m^{(k)}$
Then the bootstrapped sample medians, $m^{(1)}, \ldots, m^{(B)}$, approximate the sampling distribution of the sample median.
A little more subtly:
Example
Sample values: 1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9 and 9.5
Sample median: 7.1562828
A bootstrap resample: 1.8, 2.7, 2.7, 5.7, 6.9, 7.4, 8.1, 8.1, 8.7 and 9.5
Resample median: 7.1562828
Many resamples
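A sketch of drawing many resamples from the example sample. It uses the ten values as printed above, which appear to be rounded to one decimal place, so the median computed here (about 7.15) differs slightly from the reported 7.1562828. The number of resamples is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# Sample values as printed in the example (apparently rounded).
y = np.array([1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9.0, 9.5])
n = len(y)
B = 2_000  # number of bootstrap resamples (assumed)

# Each row is one resample of size n drawn with replacement from y.
resamples = rng.choice(y, size=(B, n), replace=True)

# One median per resample.
boot_medians = np.median(resamples, axis=1)

print("median of the printed values:", np.median(y))
print("sd of bootstrapped medians:  ", boot_medians.std(ddof=1))
```

A histogram of `boot_medians` is the bootstrap approximation to the sampling distribution of the sample median.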
Bootstrap confidence intervals
Many methods exist.
A common one:
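- Quantile: for a $100(1-\alpha)\%$ interval, sort the $B$ resampled statistic values in increasing order and take the values at positions $(\alpha/2)B$ and $(1-\alpha/2)B$, i.e. the $\alpha/2$ and $1-\alpha/2$ quantiles of the resampled statistic values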
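A minimal sketch of a quantile (percentile) bootstrap interval for the median, assuming a 95% level and reusing the ten printed sample values from the earlier example. `np.quantile` interpolates between order statistics by default, so its endpoints can differ slightly from picking exact sorted positions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Sample values as printed in the earlier example.
y = np.array([1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9.0, 9.5])
B = 5_000      # number of bootstrap resamples (assumed)
alpha = 0.05   # for a 95% interval (assumed)

# Bootstrap the median: resample with replacement, one median per resample.
resamples = rng.choice(y, size=(B, len(y)), replace=True)
boot_medians = np.median(resamples, axis=1)

# Quantile interval: the alpha/2 and 1 - alpha/2 quantiles of the resampled medians.
lower, upper = np.quantile(boot_medians, [alpha / 2, 1 - alpha / 2])
print(f"{100 * (1 - alpha):.0f}% quantile interval for the median: ({lower:.2f}, {upper:.2f})")
```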
Comments on the bootstrap
Relies on $\hat{F}(y)$ being a good estimate of $F(y)$, so it doesn’t necessarily solve small sample problems.
Resampling should generally mimic the original study design. E.g. if pairs of observations are sampled from a population, pairs should be resampled.
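One way to resample pairs is to resample row indices, so that each pair stays together. The sketch below assumes hypothetical paired data and uses the mean within-pair difference as the statistic; both are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical paired data: row i holds the pair (x_i, y_i).
pairs = np.array([
    [1.0, 1.4], [2.0, 2.1], [3.0, 3.9], [4.0, 4.2],
    [5.0, 5.8], [6.0, 6.1], [7.0, 7.7], [8.0, 8.3],
])
n = len(pairs)
B = 2_000  # number of bootstrap resamples (assumed)

# Resample row indices with replacement so each (x_i, y_i) pair is kept intact.
idx = rng.integers(0, n, size=(B, n))
resampled = pairs[idx]  # shape (B, n, 2)

# Statistic for each resample: mean of the within-pair differences y - x.
boot_mean_diff = (resampled[:, :, 1] - resampled[:, :, 0]).mean(axis=1)

observed = (pairs[:, 1] - pairs[:, 0]).mean()
print("observed mean difference:", observed)
print("bootstrap sd of the mean difference:", boot_mean_diff.std(ddof=1))
```

Resampling the two columns independently would break the pairing and would no longer mimic how the data were collected.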