Writing with Quarto

Learning outcomes

After finishing this chapter you will be able to:

  • Create a quarto project with RStudio

  • Locally render and view a quarto website

  • Add markdown content to a quarto document

  • Add and execute code chunks to a quarto document

  • Add and modify code-chunk options

  • Execute code chunks that create figures and create cross references to them

Initiate a website project

We’ll start with creating a website project. In Rstudio:

  • Choose File > New Project..

  • Choose New Directory

  • Choose Quarto website

  • Choose a location and name for the website. Leave Create a git repository checked

  • Finish by pushing the button Create Project

Now we have created a template for a quarto website 👏 . It doesn’t contain much, but we can already render it and use it as a functional website. In order to view it locally in your browser you can either:

  • Open index.qmd in Rstudio and press the button Render

  • Or you can use the Terminal and type:

quarto preview

In both cases your computer will open a browser and locally host the page at a random port.

Adding content

Adding text

In this part, we will use quarto to display our project goals. In index.qmd, create an introduction where you will state:

In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:

  • Understanding the different columns of the data set
  • Find the top 10 most frequently used baby names in the data for:
    • girls
    • boys
  • Plot the yearly trend of the top 10 baby names
Exercise

Take the above introduction of your babynames project, convert it to markdown text, and add it to index.qmd.

This is what your index.qmd could look like:

---
title: "quarto-tutorial"
---

In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:

- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
   - girls
   - boys
- Plot the yearly trend of the top 10 baby names 

Adding an image

You can add images with the following syntax:

![image_alt_text](path/or/url/to/image.png)
Exercise

Check out the quarto figure guide. Find an image of a baby in the public domain on creative commons. Get the full URL of the image and add it to your index.qmd. Make it 400 pixels in width and align it to the right side of the page.

Your index.qmd could look like this:

---
title: "quarto-tutorial"
---

In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:

- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
  - girls
  - boys
- Plot the yearly trend of the top 10 baby names 

![](https://cdn.pixabay.com/photo/2016/10/02/06/27/baby-1709013_1280.jpg){fig-align="right" width=400}

Adding a page

We will now add a page for the analysis. We do this by clicking in Rstudio File > New File.. > Quarto Document... Choose a title, e.g. analysis and press Create. Save the newly created file as analysis.qmd.

In order to make our new page part of the website, we have to edit _quarto.yml. By adding it to the navbar item in the website markup:

website:
  title: "quarto-tutorial"
  navbar:
    left:
      - href: index.qmd
        text: Home
      - about.qmd
      - analysis.qmd

Adding code chunks

We can add some code to analysis.qmd. First, we add a chunk that loads the libraries:

```{r}
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```

Plotting and cross-referencing

To display the first 10 lines of the babynames table you can add:

```{r}
head(babynames) |> kable()
```
Exercise

Add the two above code chunks as an R code chunk to analysis.qmd, and do the following:

  • add some text to let the reader know what it does
  • suppress the package startup messages with #| output: false at the top of the chunk that loads the packages. More info about the hash-pipe here
  • re-render the site.
---
title: "Analysis"
---

First we load the packages:

```{r}
#| output: false
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```

The first ten lines of the babynames dataset looks like:


```{r}
head(babynames) |> kable()
```

Here, I’ve created two functions that do the following:

  • get_most_frequent: Gets the most frequent babynames over a time-period.
  • plot_top: from the output of get_most_frequent. Plot the top n most popular names.
Note

This is not a coding tutorial, so you can ignore the code itself. Just know what is does.

get_most_frequent <- function(babynames, select_sex, from = 1950) {
  most_freq <- babynames |>
    filter(sex == select_sex, year > from) |>
    group_by(name) |>
    summarise(average = mean(prop)) |>
    arrange(desc(average))
    
  return(list(
    babynames = babynames,
    most_frequent = most_freq,
    sex = select_sex,
    from = from))
}

plot_top <- function(x, top = 10) {
  topx <- x$most_frequent$name[1:top]
  
  p <- x$babynames |>
    filter(name %in% topx, sex == x$sex, year > x$from) |>
    ggplot(aes(x = year, y = prop, color = name)) +
    geom_line() +
    scale_color_brewer(palette = "Paired") +
    theme_classic()
  
  return(p)
}

Plotting them for girls like this:

get_most_frequent(babynames, select_sex = "F") |>
  plot_top()

Plotting them for boys like this:

get_most_frequent(babynames, select_sex = "M") |>
  plot_top()
Exercise

Add the above three code chunks to analysis.qmd, and do the following:

  • Since the functions are a bit bulky we want to make the chunk foldable. Do this by #| code-fold: true
  • Describe in a text what you see in the figures. Refer to the individual figures in text by cross-referencing.
To create a visualization of the most popular baby names, we have created two functions. Click the 'Code' link to view:

```{r}
#| code-fold: true
get_most_frequent <- function(babynames, select_sex, from = 1950) {
  most_freq <- babynames |>
    filter(sex == select_sex, year > from) |>
    group_by(name) |>
    summarise(average = mean(prop)) |>
    arrange(desc(average))
    
  return(list(
    babynames = babynames,
    most_frequent = most_freq,
    sex = select_sex,
    from = from))
}

plot_top <- function(x, top = 10) {
  topx <- x$most_frequent$name[1:top]
  
  p <- x$babynames |>
    filter(name %in% topx, sex == x$sex, year > x$from) |>
    ggplot(aes(x = year, y = prop, color = name)) +
    geom_line() +
    scale_color_brewer(palette = "Paired") +
    theme_classic()
  
  return(p)
}
```

Here we call the code to visualize the top 10 most frequent girl names from 1950 onwards:


```{r}
#| label: fig-line-girls
#| fig-cap: "Line plot displaying proportion of top 10 girl names by year"
get_most_frequent(babynames, select_sex = "F") |>
  plot_top()
```
    
In @fig-line-girls you can see that there has been a peak in popularity for 'Lisa', 'Jennifer' and 'Jessica'. Interesting! Let's have a look at the boys names:

```{r}
#| label: fig-line-boys
#| fig-cap: "Line plot displaying proportion of top 10 boy names by year"
get_most_frequent(babynames, select_sex = "M") |>
  plot_top()
```

@fig-line-boys shows that names that were popular before 1990 are relatively infrequent after 2000.