<- function(babynames, select_sex, from = 1950) {
get_most_frequent <- babynames |>
most_freq filter(sex == select_sex, year > from) |>
group_by(name) |>
summarise(average = mean(prop)) |>
arrange(desc(average))
return(list(
babynames = babynames,
most_frequent = most_freq,
sex = select_sex,
from = from))
}
<- function(x, top = 10) {
plot_top <- x$most_frequent$name[1:top]
topx
<- x$babynames |>
p filter(name %in% topx, sex == x$sex, year > x$from) |>
ggplot(aes(x = year, y = prop, color = name)) +
geom_line() +
scale_color_brewer(palette = "Paired") +
theme_classic()
return(p)
}
Writing with Quarto
After finishing this chapter you will be able to:
Create a quarto project with RStudio
Locally render and view a quarto website
Add markdown content to a quarto document
Add and execute code chunks to a quarto document
Add and modify code-chunk options
Execute code chunks that create figures and create cross references to them
Initiate a website project
We’ll start with creating a website project. In Rstudio:
Choose File > New Project..
Choose New Directory
Choose Quarto website
Choose a location and name for the website. Leave Create a git repository checked
Finish by pushing the button Create Project
Now we have created a template for a quarto website 👏 . It doesn’t contain much, but we can already render it and use it as a functional website. In order to view it locally in your browser you can either:
Open
index.qmd
in Rstudio and press the button RenderOr you can use the Terminal and type:
quarto preview
In both cases your computer will open a browser and locally host the page at a random port.
Adding content
Adding text
In this part, we will use quarto to display our project goals. In index.qmd
, create an introduction where you will state:
In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:
- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
- girls
- boys
- Plot the yearly trend of the top 10 baby names
Throughout this course, do not hesitate to refer to this quarto markdown basic sheet.
Take the above introduction of your babynames
project, convert it to markdown text, and add it to index.qmd
.
This is what your index.qmd
could look like:
---
title: "quarto-tutorial"
---
In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:
- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
- girls
- boys
- Plot the yearly trend of the top 10 baby names
Adding an image
You can add images with the following syntax:

Check out the quarto figure guide. Find an image of a baby in the public domain on creative commons. Get the full URL of the image and add it to your index.qmd
. Make it 400 pixels in width and align it to the right side of the page.
Your index.qmd
could look like this:
---
title: "quarto-tutorial"
---
In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:
- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
- girls
- boys
- Plot the yearly trend of the top 10 baby names
{fig-align="right" width=400}
Adding a page
We will now add a page for the analysis. We do this by clicking in Rstudio File > New File.. > Quarto Document... Choose a title, e.g. analysis
and press Create. Save the newly created file as analysis.qmd
.
In order to make our new page part of the website, we have to edit _quarto.yml
. By adding it to the navbar
item in the website
markup:
website:
title: "quarto-tutorial"
navbar:
left:
- href: index.qmd
text: Home
- about.qmd
- analysis.qmd
Adding code chunks
We can add some code to analysis.qmd
. First, we add a chunk that loads the libraries:
```{r}
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```
Plotting and cross-referencing
To display the first 10 lines of the babynames table you can add:
```{r}
head(babynames) |> kable()
```
Add the two above code chunks as an R code chunk to analysis.qmd
, and do the following:
- add some text to let the reader know what it does
- suppress the package startup messages with
#| output: false
at the top of the chunk that loads the packages. More info about the hash-pipe here - re-render the site.
---
title: "Analysis"
---
First we load the packages:
```{r}
#| output: false
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```
The first ten lines of the babynames dataset looks like:
```{r}
head(babynames) |> kable()
```
Here, I’ve created two functions that do the following:
get_most_frequent
: Gets the most frequent babynames over a time-period.plot_top
: from the output ofget_most_frequent
. Plot the top n most popular names.
This is not a coding tutorial, so you can ignore the code itself. Just know what is does.
Plotting them for girls like this:
get_most_frequent(babynames, select_sex = "F") |>
plot_top()
Plotting them for boys like this:
get_most_frequent(babynames, select_sex = "M") |>
plot_top()
Add the above three code chunks to analysis.qmd
, and do the following:
- Since the functions are a bit bulky we want to make the chunk foldable. Do this by
#| code-fold: true
- Describe in a text what you see in the figures. Refer to the individual figures in text by cross-referencing.
To create a visualization of the most popular baby names, we have created two functions. Click the 'Code' link to view:
```{r}
#| code-fold: true
<- function(babynames, select_sex, from = 1950) {
get_most_frequent <- babynames |>
most_freq filter(sex == select_sex, year > from) |>
group_by(name) |>
summarise(average = mean(prop)) |>
arrange(desc(average))
return(list(
babynames = babynames,
most_frequent = most_freq,
sex = select_sex,
from = from))
}
<- function(x, top = 10) {
plot_top <- x$most_frequent$name[1:top]
topx
<- x$babynames |>
p filter(name %in% topx, sex == x$sex, year > x$from) |>
ggplot(aes(x = year, y = prop, color = name)) +
geom_line() +
scale_color_brewer(palette = "Paired") +
theme_classic()
return(p)
}```
Here we call the code to visualize the top 10 most frequent girl names from 1950 onwards:
```{r}
#| label: fig-line-girls
#| fig-cap: "Line plot displaying proportion of top 10 girl names by year"
get_most_frequent(babynames, select_sex = "F") |>
plot_top()
```
In @fig-line-girls you can see that there has been a peak in popularity for 'Lisa', 'Jennifer' and 'Jessica'. Interesting! Let's have a look at the boys names:
```{r}
#| label: fig-line-boys
#| fig-cap: "Line plot displaying proportion of top 10 boy names by year"
get_most_frequent(babynames, select_sex = "M") |>
plot_top()
```
@fig-line-boys shows that names that were popular before 1990 are relatively infrequent after 2000.
Extra: quarto extension and webassembly
Quarto offers the possiblity to personnalise and extend its base capability with extensions built by the quarto community.
These include extra fonts and emojis, additionnal document template (like scientific journal template), extra graphical components (like the accordion ).
Extensions can be installed with the terminal command:
quarto add EXTENSIONNAME
The extension will be installed in _extensions/
folder.
Note that extension are specific to a quarto project, meaning that they need to be installed separately for each project. This may sound tedious, but in practice it makes sharing and automatically building websites or documents with these extensions easier.
Let’s try it with an extension that extends quarto with webassembly components for python and R: quarto-live.
quarto-live : web assembly for “free” interactive computing
Using web-assembly, we can have python or R code executed on the fly in the browser. In quarto, this can be achieved with the quarto-live extension.
This is very nice as it lets you build fully interactive app without the need to host them with a server architecture (because all the computations happen on the user’s computer), which lets us host it as a static website (potentially for free, as for our github page!).
Of course there is a downside: upon loading the page, the user’s browser will need to install all the R webassembly components and associated libraries. This generally takes around 10 to 30 seconds if you don’t have many complex libraries. Also, the execution of the R or python code is often a bit slower than what you would experience outside the web browser.
We start by installing the extension.
In the terminal, navigate to your quarto project folder and type:
quarto add r-wasm/quarto-live
The command will ask you to type “Y” and press Enter in order to validate the installation.
Then, you can start adding webassembly components in your qmd files.
For example, try adding a new file interactive_compute.qmd
this example is heavily inspired by this page.
We start the page by the yaml preamble:
---
title: Interactive work without a server with web assembly
format: live-html
engine: knitr
---
{{< include ./_extensions/r-wasm/live/_knitr.qmd >}}
Note here the format: live-html
, and engine: knitr
. The first marks the document as a live
document (that uses quarto-live), and the second set the rendering engine (the alternative is engine: jupyter
, which requires a python installation).
The last line is required if you use the knitr rendering engine. According to the extension documentation in the future it will not be needed anymore, but for now we have to include it.
Then, you can (finally) start adding some code:
```{webr}
#| output: false
#| echo: FALSE
#| edit: false
#| autorun: true
library(babynames)
library(ggplot2)
library(magrittr)
library(dplyr)
```
[Observable input](https://github.com/observablehq/inputs?tab=readme-ov-file) component:
Histogram interacting with a javascript
```{ojs}
//| echo: false
= Inputs.checkbox(
viewof sex_box "M", "F"],
[
{value: ["M", "F"],
label: "Shown sexes:",
}
)```
```{webr}
#| echo: FALSE
#| edit: false
#| autorun: true
#| input:
#| - sex_box
nbins = 100
data = babynames %>% filter( sex %in% sex_box )
ggplot( data , aes(x=n , fill = sex) ) + geom_histogram(bins = nbins) + scale_x_log10()
```
You can note that the code cells types are different:
{webr}
: contains R that will be in web-assembly: rather than being executed once during the html rendering, they will be executed live in the browser.{ojs}
: Observable javascript - we use these cells to include interactive javascript input/outputs
Also, in addition to the base quarto chunk options, we see:
//|
in{ojs}
cells, because javascript comments start with//
rather than#
#| edit:
: make a cell editable or not#| autorun: true
: when true the cell is automatically executed at startup and whenever its content is changed.
And importantly:
#| input:
#| - sex_box
This tells R to go grab a javascript variable we defined with viewof
in another {ojs}
block.
From there, even more components can be added to create the interactive plot of dashboard of your choice:
```{webr}
#| echo: false
#| edit: false
#| autorun: true
get_most_frequent <- function(babynames, select_sex, from = 1950) {
most_freq <- babynames |>
filter(sex == select_sex, year > from) |>
group_by(name) |>
summarise(average = mean(prop)) |>
arrange(desc(average))
return(list(
babynames = babynames,
most_frequent = most_freq,
sex = select_sex,
from = from))
}
plot_top <- function(x, top = 10) {
topx <- x$most_frequent$name[1:top]
p <- x$babynames |>
filter(name %in% topx, sex == x$sex, year > x$from) |>
ggplot(aes(x = year, y = prop, color = name)) +
geom_line() +
scale_color_brewer(palette = "Paired") +
theme_classic()
return(p)
}
```
A more complex examples with more interactive inputs (NB: it seems that when moving the slider the image get re-computed for each intermediate value, leading to some slowness sometimes) :```{webr}
#| echo: FALSE
#| edit: false
#| autorun: true
#| input:
#| - sex_show
#| - year
#| - n_names
get_most_frequent(babynames, select_sex = sex_show , from = year ) |>
plot_top(top=n_names)
```
```{ojs}
//| echo: false
= Inputs.radio(["M", "F"], {label: "Sex:",value:"F"})
viewof sex_show
= Inputs.range([1880, 2010], {step: 10, label: "starting year:"})
viewof year
= Inputs.range([1, 12], {step: 1, label: "number of top names:"})
viewof n_names ```
You can also just have interactive R/python cells where you can write and execute code :
```{webr}
for (x in 1:5) {
print(x ** 2)
}
```