Git and GitHub pages

Committing changes

In order to create commits, we need to make sure our directory is a git repository. We can check this in any directory by using the terminal. Make your project directory your current directory with cd (if you click the Terminal tab in Rstudio you will directly in the project directory) and type:

git status

If your directory is not set up for git it will return something like:

fatal: not a git repository (or any of the parent directories): .git

Otherwise, it will give you an overview of files that are or can be staged and committed.

If you followed the instructions, a git repository has been initiated upon creating the project, so you should be able to stage and commit your files.

Note

If your project directory isn’t a git repository yet, initialize it with

git init 
Exercise

Check out the untracked files with git status. Do all these files need to be version-controlled? If not, add the files/directories you don’t want to version control to the .gitignore

The directory _site/ contains the rendered html files. These aren’t source files, so don’t have to be version controlled. We can add it to the .gitignore, which would look like:

.Rproj.user
.Rhistory
.RData
.Ruserdata

/.quarto/
_site/

Now that we’re all set for our first commit we can start by staging the files. We could do that one-by-one by typing e.g. git add index.qmd, git add _quarto.yml and so on. However, to stage all changed and untracked files we can use:

git add --all
Staging convenience

If you want to check beforehand what you will stage, you can do a ‘dry run’ with:

git add --all -n

If you made a mistake and want to unstage a file that is not tracked yet, you can run:

git rm --cached file.txt

If the file is tracked (i.e. you have committed changes before), you can use:

git restore --staged file.txt

The command git status will now return all files in green, which means that they are staged. The next step will be the commit itself:

git commit -m "first commit"
Note

The more descriptive you make your commits, the easier it is to find back a commit. It is generally advised to write in a tense that describes what the commit does, so e.g.

  • Fixes bug
  • Adds image to main page
  • Adds gene expression heatmap

Using a remote repository

Warning

This sub chapter assumes that you can connect with GitHub through https. If you haven’t setup a PAT yet, follow the instructions here as stated in the course preparations.

We will now link our local repository to a remote repository on GitHub. First, we need to create an empty repository on GitHub. Do this by:

  • Navigating to the GitHub website and login

  • Click the + sign at the top right and click New repository

  • Fill out the form:

    • Choose a repository name. Typically this is the same name as your project directory

    • We will host a website, so make sure it’s public

    • Leave the rest unchecked, so no README file and no .gitignore (we’ll leave the README.md for now, and .gitignore is already in our local repository)

  • Click the green button with Create repository

After you have created the empty repository, GitHub is already giving you suggestions what to do. We have already created the commit, and our current branch is the main branch (check it with git branch).

Caution

If running git branch does not return main (if so, it’s likely to return master), change the branch name with:

git branch -m main

To make a new connection called origin between the local and remote repositories we run:

git remote add origin https://github.com/USER/REPOSITORY.git

After that, we’ll synchronize with the remote repository with:

git push -u origin main

Which pushes the main branch from the local repository to the remote (named origin). The option -u makes sure that the main branch is used as default branch on the remote repository. Now you can check whether your local repository is synced with the remote by refreshing the repository page in your browser. The files you have committed should appear.

Now we can publish our site. To do this, you only need to run one command:

quarto publish gh-pages

This command will:

  • Create a branch named gh-pages (if it doesn’t exist)

  • Push the website files to gh-pages

  • Change the settings in your GitHub repository to tell that the website files will be in gh-pages

🤝 congrats! You’ve just published your first website with an analysis report! Find it at USER.github.io/REPO.

Creating a branch and pull request to add a figure panel

Let’s add a feature to our website: we will add a sub-chapter where we create a panel of multiple figures and cross reference them in the text. It is important that we don’t interfere with the main branch while we are developing it.

From issue to branch

When adding a new feature, a good practice would be to first write an issue. Here you describe clearly what you want to do. With an issue, it is straightforward to discuss about it, maintain a backlog, and give it a priority. At the point you decide you will work on the issue, you can make a branch out of the issue. In this way the branch has a clear aim - described in the issue.

Exercise

Go to your repository on GitHub. Write an issue (Click the Issues tab) with the following content:

Title: Create a panel of figures

Description:

The figure panel should contain:

  • Line plot of top 5 girl names from 2010 onwards
  • Line plot of top 10 girl names from 2010 onwards
  • Heatmap of top 30 girl names from 2010 onwards

Describe and cross-reference the individual figures in the text

After that, save the issue, and at the issue page, on the right panel click on Create a branch under Development. This will automatically suggest a good branch name. Use the default settings and click Create branch. GitHub will automatically suggest what code to run to work on the branch locally. This will be:

git fetch origin
git checkout 1-create-a-panel-of-figures

Run these commands in your local terminal at the root directory of your quarto project.

You can validate whether you are in the right branch by typing:

git branch

The branch 1-create-a-panel-of-figures should appear in green.

Creating a figure panel

Nice! Now we have a new branch so we can develop in our repository without interfering with our stable main branch. Let’s work on the issue. To create the three figures that you just described in the issue, the required code will be provided by us. Your job would be to add hash-pipe (i.e. #|) options to the code chunk to render the plots as we want.

Exercise

Check out the quarto documentation on computed figures and cross-referencing and custom layout.

The code below will create the three figures that are displayed in two rows:

  • The two line graphs: should be in the first row, and next to each other
  • The heatmap: should be in the second row and span the entire column.

Use this code and add the required chunk options:

```{r}
# get most frequent girl names from 2010 onwards
from_year <- 2010
most_freq_girls <- get_most_frequent(babynames, select_sex = "F",
                                     from = from_year)

# plot top 5 girl names
most_freq_girls |>
  plot_top(top = 5)

# plot top 10 girl names
most_freq_girls |>
  plot_top(top = 10)

# get top 30 girl names in a matrix
# with names in rows and years in columns
prop_df <- babynames |> 
  filter(name %in% most_freq_girls$most_frequent$name[1:30] & sex == "F") |>
  filter(year >= from_year) |> 
  select(year, name, prop) |>
  pivot_wider(names_from = year,
              values_from = prop)

prop_mat <- as.matrix(prop_df[, 2:ncol(prop_df)])
rownames(prop_mat) <- prop_df$name

# create heatmap
pheatmap(prop_mat, cluster_cols = FALSE, scale = "row")
```

Add the following to the top of the code chunk:

#| label: fig-line-2010
#| layout: [[50,50], [100]]
#| fig-cap: "Most popular girl names from 2010 onwards"
#| fig-subcap: 
#|   - "Top 5"
#|   - "Top 10"
#|   - "Top 30"

You can refer to the figures in the markdown text like this:

In @fig-line-2010-1 and @fig-line-2010-1 the line plots are shown. To view trends of many names at once, @fig-line-2010-3 displays a heatmap.

Creating a pull request

If the panel is appearing as it should, it might be time to merge it to the main branch. For this we’d first need to commit our changes:

git add analysis.qmd
git commit -m "adds the figure panel"
Note

Now, we could merge by (do not run this now!):

# go to the main branch
git checkout main
# merge the development branch to the main branch
git merge 1-create-a-panel-of-figures

As shown above, we could directly merge our development branch locally. However, if we are working with multiple people on the same project, we want to be explicit that we want to perform this merge. For example, you might want your colleague to test/review your change before merging. The way to do this would be by a pull request.

To create a pull request we first need to push our changes to the remote repository:

# make sure you are on the development branch
git checkout 1-create-a-panel-of-figures
# push the changes to the remote repository
git push

Now browse to the repository on GitHub. You will already see a yellow banner with the question whether you want to create a pull request. However, we will do it in a more explicit way. Go to the tab Pull requests. Click the button New pull request. As base specify the main branch. At compare specify 1-create-a-panel-of-figures. Now review the displayed changes and click the button Create pull request. Add comments if necessary. In the same page, you can specify reviewers and assignees if you want others to become involved. We skip this for now, and click Merge pull request to merge 1-create-a-panel-of-figures to main.

Exercise
  • Check out the issue. Has anything happened to it?
  • Switch to the main branch locally and pull the changes your local repository
  • Validate whether the development branch is merged to main
  • Check out the commits with git log --oneline. Anything changed in the commit history?
  • Re-publish the website

The issue should be marked as ‘Closed’. This has happened automatically after merging the pull request, because GitHub created a reference between the issue and the branch.

To switch to main and pull the changes:

git checkout main
git pull

Now the code with the figure panel should be added to analysis.qmd. If you run git log --oneline you will see that the merge is stored as a commit, looking something like this:

4fdf90e Merge pull request #1 from yourusername/1-create-a-panel-of-figures

To publish

quarto publish gh-pages

[Extra] render with GitHub actions

Wouldn’t it be nice if the website would be updated automatically when we push changes to the main branch? This can be done with GitHub actions. GitHub actions are a set of jobs that run after certain changes (triggers) to your repository. For example, you can run a job after you push to a specific branch. In this case, we want to render the website after a push to the main branch.

Set up renv

In order to render the website as a github action, we need to define the R package dependencies that are required to run the code. For us, that would be among others ggplot, pheatmap, babynames and others. To conveniently handle this, we use the package renv, which basically looks for packages you are using in your project and creates an environment definition file (called renv.lock) based on which packages it finds.

To set up renv for you project, run the following:

install.packages("renv")
renv::init()
renv::snapshot()

Here, init will initiate renv in your project by creating the required files, among which renv.lock. The function snapshot will create a snapshot of your current project dependencies and add it to renv.lock.

Exercise

After running renv::init() and renv::snapshot() see what is in renv.lock and commit the changes of your repository to the main branch.

The file renv.lock is a json file containing all package metadata. To add it to your repository:

git checkout main
git add --all
git commit -m "set up renv"

Define the GitHub action

Now that we have defined the required dependencies, we can define the action. We will provide you with a template for this:

name: Quarto Publish

# the on statement tells github actions whnen to run the action
on:
  # pushing to main
  push:
    branches: main
    
# set permissions so the actions in the workflow can write files to your repository
permissions: write-all

# define the job
jobs:
  # we name it quarto-deploy
  quarto-deploy:
    # define the virtual environment
    runs-on: ubuntu-latest
    # the steps are the individual actions that are needed to complete the job

    steps:
        
        # first step is check out the repository so we have access to the files
        # inside the repository in our virtual environment
      - name: Check out repository
        uses: actions/checkout@v4
        
        # now we install quarto in the environment
      - name: Set up Quarto
        uses: quarto-dev/quarto-actions/setup@v2
        
        # we also install R
      - name: Install R
        uses: r-lib/actions/setup-r@v2
        with:
          # ! important: change this version to your local R version 
          # i.e. the version that is specified in renv.lock
          r-version: '4.3.1'

        # and the required R dependencies
      - name: Install R Dependencies
        uses: r-lib/actions/setup-renv@v2
        with:
          cache-version: 1

        # now we use a pre-created action to basically run 
        # quarto publish gh-pages
      - name: Render and Publish
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Compare the version of R that is installed on your local machine with the one specified in the yaml file. Update it to the version that you have locally installed (and is specified in renv.lock). Store the file in your repository as .github/workflows/quarto-publish.yml. Also add and commit this file to you repository with:

git add .github/workflows/quarto-publish.yml
git commit -m "adds action yaml"

Now, we’re ready to push and with that trigger the run of our first action. Go ahead and push the commits to remote repository with:

git push

After pushing, go to the Actions tab in your repository on GitHub. You should see a new action running named ‘Quarto Publish’. After it finishes, you can check out the rendered website, which will be hosted on USER.github.io/REPOSITORY.

Note

This will take a while, because all packages need to be compiled.

From now on, every time you push to main, the action will run and the website will be updated.