**Submit your homework as a compiled Rmarkdown document**. Submit both the .pdf (either generated directly from the Rmarkdown, or saved from the Word version generated from the Rmarkdown) **and** the Rmarkdown file itself (the .Rmd). If you do your simulations in a separate R file, please also submit that, but the summary of the simulations must be in the .pdf, or you will receive no credit for the problem.

Submit your answers on Canvas.

## 1. Simulation

Investigate a property, of your choice, of one of the methods we’ve seen this quarter (choose one that you haven’t already investigated in a homework).

*You should clearly state what you are investigating, and how you structured your simulations. You should also include a brief summary of your conclusions, including relevant tables or figures to support your summary.*

Some ideas:

- The performance of the Signed Rank test with asymmetric distributions
- Compare the performance of the Chi-square test for variance with the t-test for variance when the population isn’t Normal
- The exactness (or power) of the K-S test for small samples
- The performance of the F-test for variances (maybe compare with Levene’s test) with non-Normal populations
- The performance of the Wilcoxon Rank Sum test for different hypotheses under the location shift (or not) assumption
- The performance of the two-step procedure that uses Levene’s test to decide between the equal variance and Welch’s two-sample t-tests for a test of equal means
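
As one possible way to structure such a simulation, the sketch below estimates how often a test rejects at the 5% level over many simulated samples. It is only an illustration: the helper `rejection_rate()`, the one-sample t-test used as a placeholder, the Exponential(1) population, the sample sizes, and the number of simulations are all assumptions, not part of the assignment.

```r
# A minimal simulation skeleton (base R only); all names and settings here
# are illustrative assumptions, not a required structure.
set.seed(551)

n_sim <- 2000   # number of simulated samples per setting
alpha <- 0.05   # significance level

# Hypothetical helper: proportion of simulated samples in which the test rejects.
# rgen:    function(n) that generates one sample of size n
# test_fn: function(x) that returns a p-value for the sample x
rejection_rate <- function(rgen, test_fn, n, n_sim, alpha) {
  p_values <- replicate(n_sim, test_fn(rgen(n)))
  mean(p_values < alpha)
}

# Placeholder test: one-sample t-test of H0: mu = 1
t_pvalue <- function(x) t.test(x, mu = 1)$p.value

# Exponential(1) data have mean 1, so H0 is true but the population is skewed
sizes <- c(10, 30, 100)
results <- data.frame(
  n = sizes,
  rejection_rate = sapply(sizes, function(n) {
    rejection_rate(rexp, t_pvalue, n, n_sim, alpha)
  })
)

# Monte Carlo standard error of each estimated rejection rate
results$mc_se <- sqrt(results$rejection_rate * (1 - results$rejection_rate) / n_sim)
results
```

Swapping in a different `rgen` or `test_fn` changes the population or the method under study; reporting the Monte Carlo standard error alongside each estimate helps a reader judge whether differences from the nominal level are larger than simulation noise.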

## 2. Data Analysis

For each of the following, perform an appropriate procedure to answer the question of interest. In each case, your answer should:

- Provide a plot that summarizes the data
- Justify the choice of procedure
- Include a statistical summary

- Using the `acs_couples` data (see below): Is there an association between health insurance and gender? Include estimates and confidence intervals for the probability of coverage for each gender in your summary.
- Using the `acs_respondents` data (see below): Is the median income of Oregon residents the same as the median income of Washington residents? (You don’t need to include a confidence interval for the difference in medians, but you should include estimates and confidence intervals for the individual medians.)
- Using the `acs_respondents` data (see below): Do incomes of Oregon residents tend to be about the same as the incomes of Washington residents? (Hint: *“tend to be about the same”* might be a way to express \(P(Y > X) = 0.5\).) No need for a point estimate or confidence interval.
- Using the `acs_respondents` data (see below): Are the modes of transport to work used in the same proportions in Oregon and Washington? No need for a point estimate or confidence interval.
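
To make the quantity \(P(Y > X)\) from the hint concrete, the sketch below computes its sample analogue: the proportion of all (Y, X) pairs in which the Y value exceeds the X value. The two samples are simulated from the same made-up distribution, not taken from the ACS data, and the sample sizes are arbitrary assumptions.

```r
# Illustration only: simulated data, not the ACS incomes
set.seed(1)
x <- rexp(200, rate = 1)   # made-up sample playing the role of X
y <- rexp(300, rate = 1)   # made-up sample playing the role of Y (same distribution)

# outer(y, x, ">") is a 300 x 200 logical matrix with one entry per (y, x) pair;
# its mean is the proportion of pairs with y > x
p_hat <- mean(outer(y, x, ">"))
p_hat
```

Because both samples come from the same continuous distribution here, the true value of \(P(Y > X)\) is 0.5, and the sample proportion should land near it.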

### ACS data

The American Community Survey (ACS) is a large survey undertaken by the US Census Bureau in the years between decennial censuses.

For this homework, you are given two different subsets of the Public Use Microdata Sample for Oregon and Washington from 2016:

- `acs_couples` corresponds to both the husband and wife in Oregon households that contain opposite gender married couples
- `acs_respondents` corresponds to individuals that answered the survey (all from different households). (You may assume these are like a random sample of adult residents in Oregon and Washington.)

```
library(tidyverse)

# Download the couples data into the working directory, then read it in
download.file("http://st551.cwick.co.nz/data/acs_couples.csv",
  "acs_couples.csv")
acs_couples <- read_csv("acs_couples.csv")
acs_couples

# Do the same for the individual respondents data
download.file("http://st551.cwick.co.nz/data/acs_respondents.csv",
  "acs_respondents.csv")
acs_respondents <- read_csv("acs_respondents.csv")
acs_respondents
```

Variables in `acs_couples`

column name | Variable |
---|---|
household_id | A unique ID number for each household |
state | State the household is in |
husband_age | Age in years of the husband |
husband_income | Total annual income of the husband, can include wages, retirement, interest, social security, self employment income. |
husband_health_insurance | Either with or without health insurance coverage |
wife_age | Age in years of the wife |
wife_income | Total annual income of the wife |
wife_health_insurance | Either with or without health insurance coverage |

Variables in `acs_respondents`

column name | Variable |
---|---|
household_id | A unique ID number for each household |
state | State the household is in |
age | Age of respondent |
sex | Sex of respondent |
total_income | Total income of respondent |
health_insurance | Is respondent with or without health insurance coverage? |
transport | Means of transportation to work (missing if not a worker) |
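
Since `transport` is missing for respondents who are not workers, it may help to count and set aside those rows before tabulating. The sketch below shows one way to do that on a tiny invented data frame that borrows the `state` and `transport` column names; the rows, the state labels, and the transport categories are made up and may not match how the real file codes them.

```r
# Invented rows for illustration; the real values in acs_respondents may differ
toy <- data.frame(
  state     = c("Oregon", "Oregon", "Washington", "Washington", "Washington"),
  transport = c("Car", NA, "Bus", "Car", NA)
)

# How many respondents have no recorded means of transport (non-workers)?
n_missing <- sum(is.na(toy$transport))

# Keep only workers, then cross-tabulate state by means of transport
workers <- toy[!is.na(toy$transport), ]
counts <- table(workers$state, workers$transport)

n_missing
counts
```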