Sampling Distributions ST551 Lecture 4

Finish up Lecture 3 slides

Sampling distributions

Options for finding the sampling distribution:

  • Derive it mathematically
  • Can’t derive the distribution?
    • Derive properties of the distribution
    • Simulate
    • Approximate

Deriving the sampling distribution

Normal population: set up

Population distribution: $Y \sim N(\mu, \sigma^2)$

Sample: $Y_1, \ldots, Y_n$ i.i.d. from the population

Sample statistic: sample mean, $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$

What is the sampling distribution of the sample mean?

Normal population: derivation

$Y_1 + Y_2$

$Y_1 + Y_2 + Y_3$

$Y_1 + Y_2 + \cdots + Y_n$

$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_n}{n}$
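The steps above can be filled in with the standard fact that a sum of independent normal random variables is normal, with means and variances adding: $Y_1 + Y_2 \sim N(2\mu, 2\sigma^2)$, $Y_1 + \cdots + Y_n \sim N(n\mu, n\sigma^2)$, and therefore $\bar{Y} \sim N(\mu, \sigma^2/n)$. The sketch below is a quick simulation check of that mean and variance in Python. The values of `mu`, `sigma`, `n`, the number of repetitions and the seed are illustrative assumptions, not values from the lecture.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2); return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

mu, sigma, n = 5.0, 2.0, 10   # illustrative values (assumptions)
means = simulate_sample_means(mu, sigma, n, reps=20000)

# Compare with the theoretical values mu and sigma^2 / n
print("mean of sample means:    ", statistics.fmean(means), "theory:", mu)
print("variance of sample means:", statistics.variance(means), "theory:", sigma**2 / n)
```

The simulated values should be close to, but not exactly equal to, the theoretical ones, since they are themselves based on a finite number of repetitions.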

Bernoulli population

Population distribution: $Y \sim \text{Bernoulli}(p)$

E.g. US voters, where
$Y = \begin{cases} 1, & \text{Supports single payer health care} \\ 0, & \text{Does not support single payer health care} \end{cases}$

Sample: $Y_1, \ldots, Y_n$ i.i.d. from the population

Sample statistic: sample mean, $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$, which gives the sample proportion

What is the sampling distribution of the sample proportion? $\sum_{i=1}^{n} Y_i \sim \text{Binomial}(n, p)$

Bernoulli population

E.g. p=0.56, n=10
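For this example the sampling distribution of the sample proportion can be tabulated exactly, since $\bar{Y} = k/n$ exactly when $\sum_i Y_i = k$. A minimal Python sketch follows; the function name `sample_proportion_pmf` is a hypothetical helper introduced for illustration.

```python
from math import comb

def sample_proportion_pmf(n, p):
    """Return {k/n: P(Ybar = k/n)} using sum(Y_i) ~ Binomial(n, p)."""
    return {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

pmf = sample_proportion_pmf(n=10, p=0.56)
for value, prob in pmf.items():
    print(f"P(Ybar = {value:.1f}) = {prob:.4f}")
```

The possible values of the sample proportion are only $0, 0.1, \ldots, 1$, so with $n = 10$ the sample proportion can never equal $p = 0.56$ exactly.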

Can’t derive in these situations

  • Population: $Y \sim \text{Uniform}(a, b)$
    • Sample: size $n$, i.i.d.
    • Statistic: sample mean or sample variance
    • No simple closed-form solution
  • Population: $Y$ unknown
    • Sample: size $n$, i.i.d.
    • Statistic: anything
    • Can’t derive because we don’t know the population distribution

What to do?

  1. Derive parameters of sampling distribution
  2. Simulate the sampling distribution
  3. Approximate the sampling distribution
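Option 2 can be sketched as follows for the Uniform example: draw many samples of size $n$, compute the statistic on each, and treat the collection of results as an approximation to the sampling distribution. This is a minimal Python sketch, not a prescribed method. The helper `simulate_statistic`, the values of `a`, `b`, `n`, the number of repetitions and the seed are all assumptions made for illustration.

```python
import random
import statistics

def simulate_statistic(statistic, draw, n, reps, seed=1):
    """Approximate the sampling distribution of `statistic` by repeated sampling.

    draw(rng) returns one observation from the population.
    """
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(reps)]

a, b, n = 0.0, 1.0, 5   # illustrative values (assumptions)

def draw_uniform(rng):
    """One observation from Uniform(a, b)."""
    return rng.uniform(a, b)

means = simulate_statistic(statistics.fmean, draw_uniform, n, reps=10000)
variances = simulate_statistic(statistics.variance, draw_uniform, n, reps=10000)

# Summaries of the two simulated sampling distributions
print("sample mean:     center", statistics.fmean(means), "spread", statistics.stdev(means))
print("sample variance: center", statistics.fmean(variances), "spread", statistics.stdev(variances))
```

A histogram of `means` or `variances` would show the shape of each simulated sampling distribution; only numerical summaries are printed here to keep the sketch dependency-free.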

Some more probability review

Cumulative Distribution Function

The cumulative distribution function of a random variable $X$ is $F(x) = P(X \le x)$

Probability Density/Mass Function

For continuous distributions we can define the probability density function:

$f(x) = \frac{d}{dx}F(x) \approx \frac{P(X \in (x - \Delta, x + \Delta))}{2\Delta}$

For discrete distributions we have the probability mass function:

$p(x) = P(X = x)$
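The approximation of the density by the probability of a small interval divided by its width can be checked numerically. The sketch below uses a standard normal distribution purely as an illustrative choice, via `statistics.NormalDist` from the Python standard library. The helper `density_from_cdf` and the value of `delta` are assumptions.

```python
from statistics import NormalDist

# Standard normal used purely as an illustrative continuous distribution
X = NormalDist(mu=0.0, sigma=1.0)

def density_from_cdf(dist, x, delta=1e-4):
    """Approximate f(x) by P(X in (x - delta, x + delta)) / (2 * delta)."""
    return (dist.cdf(x + delta) - dist.cdf(x - delta)) / (2 * delta)

# Each row: x, interval-based approximation, exact density
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, density_from_cdf(X, x), X.pdf(x))
```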

Expectation (Mean)

The expectation (or mean) of a random variable, X, is

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$ for continuous distributions

$E(X) = \sum_{x:\, p(x) > 0} x\,p(x)$ for discrete distributions
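Both definitions can be evaluated directly. The Python sketch below applies the discrete sum to a Binomial(10, 0.56) pmf and a midpoint-rule approximation of the integral to a Uniform($a$, $b$) density. The helper names, the number of integration steps and the values of `a` and `b` are illustrative assumptions. The results can be compared with the known means $np$ and $(a + b)/2$.

```python
from math import comb

def expectation_discrete(pmf):
    """E(X) = sum of x * p(x) over the support, for pmf given as {x: p(x)}."""
    return sum(x * p for x, p in pmf.items() if p > 0)

def expectation_continuous(f, lo, hi, steps=100000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += x * f(x)
    return total * h

n, p = 10, 0.56
binomial_pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
mean_binomial = expectation_discrete(binomial_pmf)

a, b = 2.0, 6.0   # illustrative Uniform(a, b) population (assumption)
mean_uniform = expectation_continuous(lambda x: 1 / (b - a), a, b)

print("Binomial mean:", mean_binomial, "vs n*p =", n * p)
print("Uniform mean: ", mean_uniform, "vs (a+b)/2 =", (a + b) / 2)
```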

Expectation Properties

For any random variables X and Y (don’t need independence)

$E(X + Y) = E(X) + E(Y)$

$E(a_1 X_1 + \cdots + a_n X_n) = a_1 E(X_1) + \cdots + a_n E(X_n)$

Known as the linearity property.
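To illustrate that independence is not needed, the sketch below checks linearity exactly for two clearly dependent random variables: $X$ is the face of a fair die and $Y = X^2$. The die example and the constants 3 and -2 are illustrative assumptions. Exact fractions avoid any rounding.

```python
from fractions import Fraction

# Fair six-sided die; X is the face value and Y = X**2, so X and Y are dependent.
outcomes = range(1, 7)
prob = Fraction(1, 6)

def expect(g):
    """E[g(D)] for a fair die D, computed exactly."""
    return sum(prob * g(d) for d in outcomes)

E_X = expect(lambda d: d)
E_Y = expect(lambda d: d**2)
E_sum = expect(lambda d: d + d**2)           # E(X + Y)
E_lin = expect(lambda d: 3 * d - 2 * d**2)   # E(3X - 2Y), arbitrary constants

print("E(X + Y)   =", E_sum, " E(X) + E(Y)   =", E_X + E_Y)
print("E(3X - 2Y) =", E_lin, " 3E(X) - 2E(Y) =", 3 * E_X - 2 * E_Y)
```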

Variance and Covariance

The variance of r.v. $X$ is $\mathrm{Var}(X) = E[(X - E(X))^2] = E[X^2] - (E[X])^2$

The covariance between r.v.’s $X$ and $Y$ is $\mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))]$

If $X$ and $Y$ are independent, $\mathrm{Cov}(X, Y) = 0$ (the converse isn’t true)

$\mathrm{Cov}(X, X) = \mathrm{Var}(X)$
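A small counterexample shows why the converse fails. Take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$: the covariance is zero even though $Y$ is completely determined by $X$. This example is an illustrative choice, not one taken from the lecture. The sketch computes everything exactly from the joint pmf.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2: Y is a function of X, so they are dependent.
joint = {(x, x**2): Fraction(1, 3) for x in (-1, 0, 1)}   # {(x, y): P(X=x, Y=y)}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

E_X = expect(lambda x, y: x)
E_Y = expect(lambda x, y: y)
cov = expect(lambda x, y: (x - E_X) * (y - E_Y))

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0)
p_joint = joint[(0, 0)]
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)

print("Cov(X, Y) =", cov)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) * P(Y=0) =", p_x0 * p_y0)
```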

Variance Properties

For any random variables X and Y (don’t need independence)

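$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$

$\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$

For random variables $X_1, \ldots, X_n$

$$
\begin{aligned}
\mathrm{Var}(a_1 X_1 + \cdots + a_n X_n) ={}& a_1^2\,\mathrm{Var}(X_1) + \cdots + a_n^2\,\mathrm{Var}(X_n) \\
&+ a_1 a_2\,\mathrm{Cov}(X_1, X_2) + a_1 a_3\,\mathrm{Cov}(X_1, X_3) + \cdots + a_1 a_n\,\mathrm{Cov}(X_1, X_n) \\
&+ \cdots \\
&+ a_n a_1\,\mathrm{Cov}(X_n, X_1) + a_n a_2\,\mathrm{Cov}(X_n, X_2) + \cdots + a_n a_{n-1}\,\mathrm{Cov}(X_n, X_{n-1})
\end{aligned}
$$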
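The general formula can be checked exactly on a small example with dependent variables. The sketch below uses two fair dice with $X_1$ and $X_2$ the individual faces and $X_3 = X_1 + X_2$. The dice setup and the coefficients in `a` are illustrative assumptions. The right-hand side sums the squared-coefficient variance terms and then every ordered pair $(i, j)$ with $i \ne j$, matching the expansion above.

```python
from fractions import Fraction
from itertools import product

# Two fair dice d1, d2; take X1 = d1, X2 = d2, X3 = d1 + d2 (X3 depends on X1 and X2).
outcomes = [(d1, d2, d1 + d2) for d1, d2 in product(range(1, 7), repeat=2)]
prob = Fraction(1, len(outcomes))

def expect(g):
    """E[g(X1, X2, X3)], computed exactly over the 36 equally likely outcomes."""
    return sum(prob * g(xs) for xs in outcomes)

def cov(i, j):
    """Cov(Xi, Xj) using 0-based indices; cov(i, i) is Var(Xi)."""
    mi, mj = expect(lambda xs: xs[i]), expect(lambda xs: xs[j])
    return expect(lambda xs: (xs[i] - mi) * (xs[j] - mj))

a = [2, -1, 3]   # arbitrary illustrative coefficients
n = len(a)

def combo(xs):
    """The linear combination a1*X1 + ... + an*Xn for one outcome."""
    return sum(a[i] * xs[i] for i in range(n))

# Left-hand side: variance of the linear combination, computed directly
mean_combo = expect(combo)
lhs = expect(lambda xs: (combo(xs) - mean_combo) ** 2)

# Right-hand side: variance terms plus all ordered-pair covariance terms
rhs = sum(a[i] ** 2 * cov(i, i) for i in range(n))
rhs += sum(a[i] * a[j] * cov(i, j) for i in range(n) for j in range(n) if i != j)

print("direct:", lhs, " formula:", rhs)
```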

Next time…

Use these properties to derive mean and variance for sampling distributions.