Announcements

Lectures this week:

Today's lecture: Delta method and Bootstrap
Weds lecture: Randomization & Permutation
Friday lecture: Cancelled - Office hours instead

Formula Sheet: The final is closed book, no note sheet. I am willing to provide some of the harder (less common) formulae.

Lab: No set material; I’ll encourage Chuan to lead a formula strategy session.

Delta Method

If the sampling distribution of a statistic converges to a Normal distribution, the Delta method provides a way to approximate the sampling distribution of a function of that statistic.

Univariate Delta Method

If $\sqrt{n} (\hat{θ} - θ) \to_{D} N (0, σ^{2})$

then $\sqrt{n} (g (\hat{θ}) - g (θ)) \to_{D} N (0, σ^{2} [g^{'} (θ)]^{2})$

(As long as $g^{'} (θ)$ exists and is non-zero valued.)
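
A heuristic sketch of why this holds (a first-order Taylor argument, not a full proof): for $\hat{θ}$ close to $θ$,

$g (\hat{θ}) \approx g (θ) + g^{'} (θ) (\hat{θ} - θ)$

so

$\sqrt{n} (g (\hat{θ}) - g (θ)) \approx g^{'} (θ) \sqrt{n} (\hat{θ} - θ)$

Multiplying a $N (0, σ^{2})$ limit by the constant $g^{'} (θ)$ gives a $N (0, σ^{2} [g^{'} (θ)]^{2})$ limit. If $g^{'} (θ) = 0$, the linear term vanishes and this argument no longer gives a useful Normal limit.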

Another way of saying it

If we know, $\hat{θ} \dot{\sim} N (θ, σ^{2})$

then,

$g (\hat{θ}) \dot{\sim} N (g (θ), σ^{2} [g^{'} (θ)]^{2})$

The approximation can be pretty rough, i.e. just because the sample is large enough that the original statistic is reasonably Normal doesn’t mean the transformed statistic will be.

Example: Log Odds

Let $Y_{1}, \dots, Y_{n} \sim$ Bernoulli $(p)$ , and $X = \sum_{i = 1}^{n} Y_{i}$ .

We know $\hat{p} = \frac{X}{n} \dot{\sim} N (p, \frac{p (1 - p)}{n})$ (approximately, for large $n$).

We might estimate the log odds with: $\log (\frac{\hat{p}}{1 - \hat{p}})$

What is the asymptotic distribution of the estimated log odds?

Example: Log Odds cont.

$g (p) = \log (\frac{p}{1 - p}) = \log (p) - \log (1 - p)$
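
Completing the calculation with the univariate delta method (a sketch, taking $θ = p$ and $σ^{2} = p (1 - p)$ from the CLT for $\hat{p}$):

$g^{'} (p) = \frac{1}{p} + \frac{1}{1 - p} = \frac{1}{p (1 - p)}$

$σ^{2} [g^{'} (p)]^{2} = p (1 - p) \cdot \frac{1}{p^{2} (1 - p)^{2}} = \frac{1}{p (1 - p)}$

so

$\sqrt{n} (\log (\frac{\hat{p}}{1 - \hat{p}}) - \log (\frac{p}{1 - p})) \to_{D} N (0, \frac{1}{p (1 - p)})$

or, in the approximate form,

$\log (\frac{\hat{p}}{1 - \hat{p}}) \dot{\sim} N (\log (\frac{p}{1 - p}), \frac{1}{n p (1 - p)})$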

Other comments on delta method

Derived using a first-order Taylor expansion of $g (\hat{θ})$ around $θ$

There is also a multivariate version (useful if you need some function of two statistics, e.g. ratio of sample means)
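
As a rough numerical check on the log odds example, the sketch below simulates the estimated log odds and compares its variance with the delta-method value. Applying the univariate result with $g^{'} (p) = \frac{1}{p (1 - p)}$ gives an approximate variance of $\frac{1}{n p (1 - p)}$. The function name, the values of `n`, `p` and `reps`, and the choice to discard samples where $\hat{p}$ is 0 or 1 are all assumptions made for illustration.

```python
import math
import random
import statistics

def simulate_log_odds(n, p, reps, seed=0):
    """Simulate `reps` estimated log odds from Bernoulli(p) samples of size n."""
    rng = random.Random(seed)
    values = []
    while len(values) < reps:
        x = sum(rng.random() < p for _ in range(n))  # number of successes
        if 0 < x < n:  # log odds are undefined when p_hat is 0 or 1, so skip those
            p_hat = x / n
            values.append(math.log(p_hat / (1 - p_hat)))
    return values

n, p = 200, 0.3
sims = simulate_log_odds(n, p, reps=5000)

delta_var = 1 / (n * p * (1 - p))    # delta-method approximation
sim_var = statistics.variance(sims)  # variance of the simulated log odds

print(f"delta-method variance: {delta_var:.5f}")
print(f"simulated variance:    {sim_var:.5f}")
```

Rerunning with a small `n` or with `p` near 0 or 1 is one way to see how rough the approximation can get.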

Bootstrap

A method to approximate the sampling distribution of a statistic

Idea:

Recall, one way to approximate the sampling distribution of a statistic was by simulation, but you have to assume a population distribution.
The bootstrap uses the empirical distribution function as an estimate for the population distribution, i.e. it relies on $\hat{F} (y) \approx F (y)$

Example - Sampling distribution of Median by simulation

Assume a population distribution, i.e. $Y \sim N (μ, σ^{2})$

Repeat for $k = 1, \dots, B$

Sample $n$ observations from $N (μ, σ^{2})$
Find sample median, $m^{(k)}$

Then the simulated sample medians, $m^{(k)}, k = 1, \dots, B$ approximate the sampling distribution of the sample median.
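
The simulation steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name and the particular values of `n`, `mu`, `sigma` and `B` are assumptions chosen for the example.

```python
import random
import statistics

def simulate_medians(n, mu, sigma, B, seed=0):
    """Simulate B sample medians, each from n draws of an assumed N(mu, sigma^2)."""
    rng = random.Random(seed)
    medians = []
    for _ in range(B):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]  # sample n observations
        medians.append(statistics.median(sample))          # find the sample median
    return medians

medians = simulate_medians(n=25, mu=10.0, sigma=2.0, B=2000)

# The spread of the simulated medians approximates the standard error of the median
print("mean of simulated medians:", round(statistics.mean(medians), 3))
print("sd of simulated medians:  ", round(statistics.stdev(medians), 3))
```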

Example - Sampling distribution of Median by bootstrap

Estimate the population distribution from the sample, i.e. $\hat{F} (y)$

Repeat for $k = 1, \dots, B$

Sample $n$ observations from a population with c.d.f $\hat{F} (y)$
Find sample median, $m^{(k)}$

Then the bootstrapped sample medians, $m^{(k)}, k = 1, \dots, B$ approximate the sampling distribution of the sample median.

Sampling from a c.d.f

You can sample from any c.d.f by sampling from a Uniform(0, 1), then transforming with the inverse c.d.f.

I.e. sample $u_{1}, \dots, u_{n}$ i.i.d from Uniform(0,1), then

$y_{i} = F^{- 1} (u_{i}), \; i = 1, \dots, n$ are distributed with c.d.f $F (y)$ .
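
As an illustration of the inverse c.d.f. method, take an Exponential distribution with rate $λ$ (an example chosen here, not one from the lecture): $F (y) = 1 - e^{- λ y}$, so $F^{- 1} (u) = - \log (1 - u) / λ$. A minimal sketch, with a hypothetical function name:

```python
import math
import random

def sample_exponential(n, rate, seed=0):
    """Draw n values from Exponential(rate) by transforming Uniform(0, 1) draws."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the log is defined
    return [-math.log(1 - rng.random()) / rate for _ in range(n)]

draws = sample_exponential(n=10000, rate=2.0)

# The sample mean should be close to the Exponential mean, 1 / rate = 0.5
print("sample mean:", round(sum(draws) / len(draws), 3))
```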

In the empirical case

Sampling from the ECDF is equivalent to sampling with replacement from the original sample.
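
To see why, note that the inverse of the ECDF maps a Uniform(0, 1) draw $u$ to the smallest order statistic $y$ with $\hat{F} (y) \geq u$, and each of the $n$ observations is hit with probability $1 / n$. The sketch below compares the two approaches empirically. The function name and the data values are assumptions for illustration.

```python
import random
from collections import Counter

def sample_from_ecdf(data, size, rng):
    """Draw `size` values from the ECDF of `data` via the inverse c.d.f. method."""
    ordered = sorted(data)
    n = len(ordered)
    draws = []
    for _ in range(size):
        u = rng.random()  # u is Uniform on [0, 1)
        # 0-based index of the smallest order statistic y with F_hat(y) >= u
        # (exact except when u is a multiple of 1/n, which has probability zero)
        index = min(int(n * u), n - 1)
        draws.append(ordered[index])
    return draws

data = [1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9.0, 9.5]
size = 20000
rng = random.Random(1)

ecdf_counts = Counter(sample_from_ecdf(data, size, rng))
replace_counts = Counter(rng.choices(data, k=size))  # sampling with replacement

# Both columns of relative frequencies should be close to 1/n = 0.1
for y in data:
    print(y, round(ecdf_counts[y] / size, 3), round(replace_counts[y] / size, 3))
```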

Example - Sampling distribution of Median by bootstrap

Repeat for $k = 1, \dots, B$

Sample $n$ observations with replacement from $Y_{1}, \dots, Y_{n}$
Find sample median, $m^{(k)}$

Then the bootstrapped sample medians, $m^{(k)}, k = 1, \dots, B$ approximate the sampling distribution of the sample median.

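A little more subtly: $\hat{m} - m \dot{\sim} \tilde{m} - \hat{m}$, where $m$ is the population median, $\hat{m}$ is the sample median, and $\tilde{m}$ is a bootstrapped sample median. That is, the distribution of $\tilde{m}$ around $\hat{m}$ approximates the distribution of $\hat{m}$ around $m$.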

Example

Sample values: 1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9 and 9.5

Sample median: 7.1562828

A bootstrap resample: 1.8, 2.7, 2.7, 5.7, 6.9, 7.4, 8.1, 8.1, 8.7 and 9.5

Sample median: 7.1562828

Many resamples
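
A minimal sketch of generating many resamples from the example sample and collecting their medians. It uses the rounded sample values as printed, so the median it computes from them is 7.15 rather than the more precise value quoted above. The function name and the choice of `B` are assumptions.

```python
import random
import statistics

def bootstrap_medians(sample, B, seed=0):
    """Return the medians of B resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    medians = []
    for _ in range(B):
        resample = rng.choices(sample, k=n)  # n draws with replacement
        medians.append(statistics.median(resample))
    return medians

sample = [1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9.0, 9.5]
boot = bootstrap_medians(sample, B=2000)

# The spread of the bootstrapped medians estimates the standard error of the median
print("sample median:        ", statistics.median(sample))
print("sd of resample medians:", round(statistics.stdev(boot), 3))
```

A histogram of `boot` would show the approximate sampling distribution of the sample median.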

Bootstrap confidence intervals

Many methods exist.

A common one:

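Quantile (percentile) method: take the $α / 2$ and $1 - α / 2$ quantiles of the resampled statistic values, i.e. the $100 (α / 2)$th and $100 (1 - α / 2)$th percentiles, as the endpoints of a $100 (1 - α)\%$ interval.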
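
A sketch of the quantile interval in Python. The function name and its parameters are hypothetical, and picking order statistics by rounding is just one simple convention; other quantile definitions give slightly different endpoints.

```python
import random
import statistics

def percentile_interval(sample, statistic, B=2000, alpha=0.05, seed=0):
    """Quantile bootstrap interval for `statistic` at level 100 * (1 - alpha)%."""
    rng = random.Random(seed)
    n = len(sample)
    # Compute the statistic on B resamples and sort the results
    stats = sorted(statistic(rng.choices(sample, k=n)) for _ in range(B))
    # Pick the order statistics near the alpha/2 and 1 - alpha/2 quantiles
    lower_index = max(round((alpha / 2) * B), 0)
    upper_index = min(round((1 - alpha / 2) * B) - 1, B - 1)
    return stats[lower_index], stats[upper_index]

sample = [1.8, 2.2, 2.7, 5.7, 6.9, 7.4, 8.1, 8.7, 9.0, 9.5]
lower, upper = percentile_interval(sample, statistics.median)
print(f"95% quantile bootstrap interval for the median: ({lower}, {upper})")
```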

Comments on the bootstrap

Relies on $\hat{F} (y)$ being a good estimate of $F (y)$, so it doesn’t necessarily solve small sample problems.

Resampling should generally mimic the original study design, e.g. if pairs of observations are sampled from a population, pairs should be resampled.
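
A sketch of what resampling pairs might look like for a paired design. The data are made up for illustration and the function name is hypothetical. The key point is that each draw selects a whole pair, so the within-pair link is preserved.

```python
import random
import statistics

def bootstrap_paired_mean_diff(pairs, B, seed=0):
    """Bootstrap the mean within-pair difference by resampling whole pairs."""
    rng = random.Random(seed)
    n = len(pairs)
    results = []
    for _ in range(B):
        resample = rng.choices(pairs, k=n)  # each draw keeps (x, y) together
        results.append(statistics.mean([y - x for x, y in resample]))
    return results

# Hypothetical paired measurements (x, y), e.g. before and after on the same unit
pairs = [(5.1, 5.9), (4.8, 5.0), (6.2, 6.9), (5.5, 5.4), (4.9, 5.6), (6.0, 6.4)]
boot_diffs = bootstrap_paired_mean_diff(pairs, B=1000)

print("observed mean difference:", round(statistics.mean([y - x for x, y in pairs]), 3))
print("bootstrap sd of the mean difference:", round(statistics.stdev(boot_diffs), 3))
```

Resampling the x values and y values separately would break the pairing and would not mimic how the data were collected.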

Delta Method and the Bootstrap ST551 Lecture 27
