STAT 220
What Does Reproducibility Mean in Data Science?
Short-term goals
Long-term goals
Simple rules for formatting: headers (`#`, `##`, etc.), bold (`**`), italics (`*`), and links (`[linked text](url)`).
For further help, see the R Markdown Cheatsheet.
Add chunks with the button or a keyboard shortcut:
⌘ + Option (or Alt) ⌥ + i (Mac)
Ctrl + Alt + i (Windows/Linux)
Run chunks by:
How many babies were born with the name ‘Aimee’?
`r filtered_names %>% summarise(total = sum(n))`
There are a total of 53476 babies.
In what year was the proportion of babies born with the name Aimee highest?
`r filtered_names %>% filter(prop == max(prop)) %>% pull(year)`
The name Aimee was most popular in 1973.
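The two inline expressions above can be tried out in a regular chunk first. The following is a minimal sketch, assuming a small hypothetical data frame `toy_names` with the same columns as `filtered_names`; its values are invented for illustration and are not the real counts.

```r
library(dplyr)

# Hypothetical toy data with the same columns as filtered_names
# (values are made up for illustration only)
toy_names <- tibble(
  year = c(2001, 2002, 2003),
  sex  = "F",
  name = "Aimee",
  n    = c(10L, 30L, 20L),
  prop = c(0.001, 0.003, 0.002)
)

# Total number of babies across all years, pulled out as a single number
total_babies <- toy_names %>%
  summarise(total = sum(n)) %>%
  pull(total)

# Year in which the proportion was highest
peak_year <- toy_names %>%
  filter(prop == max(prop)) %>%
  pull(year)
```

Adding `pull()` at the end returns a plain value rather than a one-cell data frame, which tends to be easier to drop into a sentence with inline code.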
```{r peek, echo = FALSE, results = "hide"}
glimpse(filtered_names)
```
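The chunk header `{r label}` gives the chunk a name. Options such as `echo = FALSE` follow the label, separated by commas.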
```{r peek, echo = TRUE, results = "show"}
glimpse(filtered_names)
```
Rows: 150
Columns: 5
$ year <dbl> 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890,…
$ sex <chr> "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", …
$ name <chr> "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "A…
$ n <int> 13, 11, 13, 11, 15, 17, 17, 18, 12, 16, 18, 14, 15, 17, 13, 13, 2…
$ prop <dbl> 0.00013319, 0.00011127, 0.00011236, 0.00009162, 0.00010902, 0.000…
```{r echo = TRUE, eval = TRUE}
glimpse(filtered_names)
```
Rows: 150
Columns: 5
$ year <dbl> 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890,…
$ sex <chr> "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", …
$ name <chr> "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "Aimee", "A…
$ n <int> 13, 11, 13, 11, 15, 17, 17, 18, 12, 16, 18, 14, 15, 17, 13, 13, 2…
$ prop <dbl> 0.00013319, 0.00011127, 0.00011236, 0.00009162, 0.00010902, 0.000…
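```{r plot1, fig.path="img/"}
library(babynames)
library(dplyr)    # provides filter() and %>%
library(ggplot2)  # provides ggplot() and the geoms

your_name <- "Dee"
your_name_data <- babynames %>% filter(name == your_name)

# Plot the yearly proportion of babies given this name, coloured by sex
ggplot(data = your_name_data, aes(x = year, y = prop)) +
  geom_point(size = 3, alpha = 0.6) +
  # linewidth needs ggplot2 >= 3.4.0; use size = 1 on older versions
  geom_line(aes(colour = sex), linewidth = 1) +
  scale_color_brewer(palette = "Set1") +
  labs(x = 'Year',
       y = stringr::str_c('Prop. of Babies Named ', your_name),
       title = stringr::str_c('Trends in Names: ', your_name))
```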
Chunk Option | Outcome |
---|---|
`echo = FALSE` | The code is not included in the final document. |
`include = FALSE` | Neither the code nor its results appear in the document. However, the code executes, and results can be used later. |
`message = FALSE` | Any messages produced by the code are not shown in the document. |
`warning = FALSE` | Any warnings generated by the code are omitted from the document. |
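The options in the table are set per chunk, but they can also be given document-wide defaults. The following is a minimal sketch, assuming the `knitr` package is installed and that the call is placed in a chunk near the top of the document; the particular defaults chosen here are only an example.

```r
library(knitr)

# Set default options for every chunk that follows.
# Individual chunks can still override these in their own header.
opts_chunk$set(
  echo    = FALSE,  # hide code in the final document
  message = FALSE,  # hide messages produced by the code
  warning = FALSE   # hide warnings produced by the code
)

# Check what the current default is for one option
echo_default <- opts_chunk$get("echo")
```

A chunk that should show its code can then be written with `echo = TRUE` in its own header.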
More options are described in the R Markdown Cheatsheet.
`ca2-yourusername` repository from GitHub
Variables are used to store data, figures, model output, etc.
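Avoid special characters such as `$` or `%` in variable names. Common symbols that are used in variable names include `.` or `_`.
R is case sensitive.
The assignment operator is `<-`. It is recommended to use `<-` to assign values to objects and `=` for arguments within functions.
The `#` symbol is used for commenting and demarcation. Any code following `#` on the same line will not be executed.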
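The rules above can be seen together in a few lines of base R. This is a small sketch; the object names (`baby_count`, `rounded`, `total.count`) are hypothetical and chosen only for illustration.

```r
# Assign a value to an object with <-
baby_count <- 53476

# Use = for arguments inside a function call
rounded <- round(baby_count / 1000, digits = 1)

# R is case sensitive: Baby_Count is a different name from baby_count,
# so this check looks for an object that has not been created
has_capitalised <- exists("Baby_Count")

# Names may contain . or _
total.count <- baby_count

# Anything after # on a line is ignored: print("this never runs")
```

Sticking to one naming style, such as `snake_case`, tends to make case-sensitivity mistakes easier to spot.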