---
title: "Lab 5 Rmarkdown"
author: "Charlotte Wickham"
date: "Oct 23 2017"
output:
  pdf_document: default
---
```{r setup, include=FALSE}
# This sets chunk options globally; in this case, always show code in chunks
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown, see the R Markdown documentation.

When you click the **Knit** button, a document is generated that includes both the content and the output of any embedded R code chunks.
To illustrate the features I'll roughly answer parts of HW#3 Question 3. The headings in this document correspond to those in the [lab](http://st551.cwick.co.nz/lab-5/).
## Code Chunks
Below you'll see a code chunk. Everything between `` ```{r} `` and `` ``` `` is interpreted as R code, in this case doing the important job of loading a package we'll need:
```{r, message = FALSE}
library(tidyverse)
```
You can add **chunk options** after the opening `r` in `` ```{r} ``. Above I used `message = FALSE` so that the messages printed when the `tidyverse` loads aren't included in the output. As another example, we might not need our readers to see the code we used to import the data. The chunk below still runs and imports the data, but because `echo = FALSE` its code is left out of the output.
```{r, echo = FALSE}
brfss <- read_rds("brfss.rds")
brfss <- mutate(brfss, desired_loss = weight_kg - wtdesire_kg)
```
If you generate figures in a code chunk, they are by default placed in the document. For example, here are histograms of the desired weight loss for participants in the BRFSS study:
```{r, fig.height = 3, fig.width = 8}
# ^ Notice the chunk options above to control the figure size in inches
ggplot(brfss, aes(x = desired_loss)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ sex)
```
## Markdown
Notice that in the Rmarkdown file this section began with `## Markdown`. Hashes denote headings: `#` for sections, `##` for subsections and `###` for sub-subsections. You can also use Markdown syntax to:

* make lists
* **bold** words
* *italicise* words, and
* make `words look like code`
## Math
Rmarkdown also has the ability to typeset equations using LaTeX syntax. Put an inline equation between single dollar signs, `$`, e.g. $\overline{Y} = \frac{1}{n}\sum_{i = 1}^{n} Y_i$, and displayed equations (i.e. centered on a new line) between double dollar signs, `$$`, e.g. the Central Limit Theorem says that for large $n$, if the $Y_i$ are independent with mean $\mu$ and variance $\sigma^2$:
$$
\overline{Y} = \frac{1}{n}\sum_{i = 1}^{n} Y_i \quad \dot{\sim} \quad N\left(\mu, \frac{\sigma^2}{n}\right)
$$
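
A standard normal limit, $N(0, 1)$, applies to the standardized sample mean rather than to $\overline{Y}$ itself. As a sketch, assuming the $Y_i$ are independent draws with population mean $\mu$ and finite standard deviation $\sigma$ (symbols chosen for this example), the displayed equation would be typed as:

$$
Z = \frac{\overline{Y} - \mu}{\sigma / \sqrt{n}} \quad \dot{\sim} \quad N(0, 1)
$$

Here `\frac{}{}` builds the fraction, `\sqrt{}` the square root, and `\dot{\sim}` places a dot over the "distributed as" symbol to signal an approximation.
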
[This tutorial](http://www.stat.cmu.edu/~cshalizi/rmarkdown/#math-in-r-markdown) has a reasonable list of the LaTeX commands you might need in this class.
### Inline code
Let's say we do our t-test for the female participants:
```{r}
females <- filter(brfss, sex == "female")
(t_test <- t.test(females$desired_loss, mu = 0))
```
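The object returned by `t.test()` is a list, so its pieces can be extracted by name. A minimal sketch on simulated data (`sim_loss` and `sim_test` are hypothetical names, and the numbers are random draws, not BRFSS values):

```{r}
set.seed(1)                                  # arbitrary seed, for reproducibility
sim_loss <- rnorm(50, mean = 5, sd = 10)     # simulated data, not the BRFSS sample
sim_test <- t.test(sim_loss, mu = 0)

names(sim_test)                # components available for extraction
round(sim_test$estimate, 1)    # point estimate (sample mean)
round(sim_test$conf.int, 1)    # lower and upper confidence limits
sim_test$p.value               # p-value as a plain number
```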
To write up a summary, rather than copying the numbers out by hand, we could pull them directly from the test object. For example, for the point estimate and confidence interval:
```{r, echo = FALSE}
dp <- 1
```
> We estimate the mean desired weight loss for US resident females is `r round(t_test$estimate, dp)` kg. With 95% confidence we estimate the mean desired weight loss for US resident females is between `r round(t_test$conf.int[1], dp)` and `r round(t_test$conf.int[2], dp)` kg.
You could try to do something pretty complicated with the p-value:
```{r}
nice_p <- function(x, cutoff = 0.001){
  ifelse(x < cutoff, paste0("p < ", format(cutoff, scientific = FALSE)),
    paste0("p = ", format(x, digits = 2, scientific = FALSE)))
}
nice_p(t_test$p.value)
```
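Base R's `format.pval()` covers similar ground, although it returns strings like `<0.001` without the leading `p`. A small sketch with made-up p-values (`example_p` is a hypothetical vector, not output from the BRFSS test):

```{r}
# Illustrative p-values only
example_p <- c(0.00004, 0.0312)

# Values below eps are reported as "<eps"; the rest are formatted to 2 significant digits
formatted_p <- format.pval(example_p, digits = 2, eps = 0.001)
formatted_p
```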
It's a little harder to also convert the p-value to an amount of evidence and to handle one-sided and two-sided tests automatically, so I tend to write most of the sentence by hand.

> There is convincing evidence that the mean desired weight loss for US resident females is not equal to zero (one-sample t-test, `r nice_p(t_test$p.value)`).
A one-sided test is appropriate here, in which case you could halve the two-sided p-value (valid only when the estimate lies in the direction of the alternative) or rerun the test with `alternative = "greater"`.
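
The two routes to a one-sided p-value can be checked against each other. A minimal sketch using a small made-up vector with a positive mean (`toy_loss` is hypothetical, not the BRFSS data):

```{r}
# Made-up values whose mean is above zero, so the estimate
# lies in the direction of the "greater" alternative
toy_loss <- c(2, 5, -1, 7, 3, 4, 0, 6)

two_sided <- t.test(toy_loss, mu = 0)
one_sided <- t.test(toy_loss, mu = 0, alternative = "greater")

# Both entries should agree
c(halved = two_sided$p.value / 2, rerun = one_sided$p.value)
```

Had the sample mean been negative, halving would give the wrong answer, so rerunning the test is the safer habit.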
## Tables
We might have calculated our sample statistics:
```{r}
sum_stat <- brfss %>%
  group_by(sex) %>%
  summarise(mean = mean(desired_loss, na.rm = TRUE),
    sd = sd(desired_loss, na.rm = TRUE),
    n_obs = sum(!is.na(desired_loss)))
```
We would like to present them in a table. The `pander` package is one solution:
```{r, results='markup'}
library(pander)
sum_stat %>%
  pander(
    col.names = c("Sex", "Mean", "Std. Dev", "Sample size, n"),
    caption = "Summary statistics for the BRFSS sample",
    digits = 3)
```
It doesn't do too bad a job of putting a test into a table either:
```{r}
pander(t_test)
```
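
As an alternative to `pander` for a summary-statistics table, `knitr::kable()` is available in any document that knits. A minimal sketch, using the built-in `mtcars` data as a stand-in for the BRFSS summary (`toy_stat` and `toy_table` are hypothetical names):

```{r}
# mtcars ships with R; it stands in for the BRFSS data here
toy_stat <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

toy_table <- knitr::kable(
  toy_stat,
  col.names = c("Cylinders", "Mean mpg"),
  caption = "Mean fuel economy by number of cylinders (mtcars)",
  digits = 1)
toy_table
```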