Git and GitHub pages
Committing changes
In order to create commits, we need to make sure our directory is a git repository. We can check this in any directory by using the terminal. Make your project directory your current directory with cd
(if you click the Terminal tab in Rstudio you will directly in the project directory) and type:
git status
If your directory is not set up for git it will return something like:
fatal: not a git repository (or any of the parent directories): .git
Otherwise, it will give you an overview of files that are or can be staged and committed.
If you followed the instructions, a git repository has been initiated upon creating the project, so you should be able to stage and commit your files.
If your project directory isn’t a git repository yet, initialize it with
git init
Check out the untracked files with git status
. Do all these files need to be version-controlled? If not, add the files/directories you don’t want to version control to the .gitignore
The directory _site/
contains the rendered html files. These aren’t source files, so don’t have to be version controlled. We can add it to the .gitignore
, which would look like:
.Rproj.user
.Rhistory
.RData
.Ruserdata
/.quarto/
_site/
Now that we’re all set for our first commit we can start by staging the files. We could do that one-by-one by typing e.g. git add index.qmd
, git add _quarto.yml
and so on. However, to stage all changed and untracked files we can use:
git add --all
If you want to check beforehand what you will stage, you can do a ‘dry run’ with:
git add --all -n
If you made a mistake and want to unstage a file that is not tracked yet, you can run:
git rm --cached file.txt
If the file is tracked (i.e. you have committed changes before), you can use:
git restore --staged file.txt
The command git status
will now return all files in green, which means that they are staged. The next step will be the commit itself:
git commit -m "first commit"
The more descriptive you make your commits, the easier it is to find back a commit. It is generally advised to write in a tense that describes what the commit does, so e.g.
- Fixes bug
- Adds image to main page
- Adds gene expression heatmap
Using a remote repository
This sub chapter assumes that you can connect with GitHub through https. If you haven’t setup a PAT yet, follow the instructions here as stated in the course preparations.
We will now link our local repository to a remote repository on GitHub. First, we need to create an empty repository on GitHub. Do this by:
Navigating to the GitHub website and login
Click the
+
sign at the top right and click New repositoryFill out the form:
Choose a repository name. Typically this is the same name as your project directory
We will host a website, so make sure it’s public
Leave the rest unchecked, so no README file and no
.gitignore
(we’ll leave theREADME.md
for now, and.gitignore
is already in our local repository)
Click the green button with
Create repository
After you have created the empty repository, GitHub is already giving you suggestions what to do. We have already created the commit, and our current branch is the main branch (check it with git branch
).
If running git branch
does not return main
(if so, it’s likely to return master
), change the branch name with:
git branch -m main
To make a new connection called origin
between the local and remote repositories we run:
git remote add origin https://github.com/USER/REPOSITORY.git
After that, we’ll synchronize with the remote repository with:
git push -u origin main
Which pushes the main
branch from the local repository to the remote (named origin
). The option -u
makes sure that the main
branch is used as default branch on the remote repository. Now you can check whether your local repository is synced with the remote by refreshing the repository page in your browser. The files you have committed should appear.
Now we can publish our site. To do this, you only need to run one command:
quarto publish gh-pages
This command will:
Create a branch named
gh-pages
(if it doesn’t exist)Push the website files to
gh-pages
Change the settings in your GitHub repository to tell that the website files will be in
gh-pages
🤝 congrats! You’ve just published your first website with an analysis report! Find it at USER.github.io/REPO.
Creating a branch and pull request to add a figure panel
Let’s add a feature to our website: we will add a sub-chapter where we create a panel of multiple figures and cross reference them in the text. It is important that we don’t interfere with the main branch while we are developing it.
From issue to branch
When adding a new feature, a good practice would be to first write an issue. Here you describe clearly what you want to do. With an issue, it is straightforward to discuss about it, maintain a backlog, and give it a priority. At the point you decide you will work on the issue, you can make a branch out of the issue. In this way the branch has a clear aim - described in the issue.
Go to your repository on GitHub. Write an issue (Click the Issues tab) with the following content:
Title: Create a panel of figures
Description:
The figure panel should contain:
- Line plot of top 5 girl names from 2010 onwards
- Line plot of top 10 girl names from 2010 onwards
- Heatmap of top 30 girl names from 2010 onwards
Describe and cross-reference the individual figures in the text
After that, save the issue, and at the issue page, on the right panel click on Create a branch under Development. This will automatically suggest a good branch name. Use the default settings and click Create branch. GitHub will automatically suggest what code to run to work on the branch locally. This will be:
git fetch origin
git checkout 1-create-a-panel-of-figures
Run these commands in your local terminal at the root directory of your quarto project.
You can validate whether you are in the right branch by typing:
git branch
The branch 1-create-a-panel-of-figures
should appear in green.
Creating a figure panel
Nice! Now we have a new branch so we can develop in our repository without interfering with our stable main branch. Let’s work on the issue. To create the three figures that you just described in the issue, the required code will be provided by us. Your job would be to add hash-pipe (i.e. #|
) options to the code chunk to render the plots as we want.
Check out the quarto documentation on computed figures and cross-referencing and custom layout.
The code below will create the three figures that are displayed in two rows:
- The two line graphs: should be in the first row, and next to each other
- The heatmap: should be in the second row and span the entire column.
Use this code and add the required chunk options:
```{r}
# get most frequent girl names from 2010 onwards
from_year <- 2010
most_freq_girls <- get_most_frequent(babynames, select_sex = "F",
from = from_year)
# plot top 5 girl names
most_freq_girls |>
plot_top(top = 5)
# plot top 10 girl names
most_freq_girls |>
plot_top(top = 10)
# get top 30 girl names in a matrix
# with names in rows and years in columns
prop_df <- babynames |>
filter(name %in% most_freq_girls$most_frequent$name[1:30] & sex == "F") |>
filter(year >= from_year) |>
select(year, name, prop) |>
pivot_wider(names_from = year,
values_from = prop)
prop_mat <- as.matrix(prop_df[, 2:ncol(prop_df)])
rownames(prop_mat) <- prop_df$name
# create heatmap
pheatmap(prop_mat, cluster_cols = FALSE, scale = "row")
```
Add the following to the top of the code chunk:
#| label: fig-line-2010
#| layout: [[50,50], [100]]
#| fig-cap: "Most popular girl names from 2010 onwards"
#| fig-subcap:
#| - "Top 5"
#| - "Top 10"
#| - "Top 30"
You can refer to the figures in the markdown text like this:
In @fig-line-2010-1 and @fig-line-2010-1 the line plots are shown. To view trends of many names at once, @fig-line-2010-3 displays a heatmap.
Creating a pull request
If the panel is appearing as it should, it might be time to merge it to the main branch. For this we’d first need to commit our changes:
git add analysis.qmd
git commit -m "adds the figure panel"
Now, we could merge by (do not run this now!):
# go to the main branch
git checkout main
# merge the development branch to the main branch
git merge 1-create-a-panel-of-figures
As shown above, we could directly merge our development branch locally. However, if we are working with multiple people on the same project, we want to be explicit that we want to perform this merge. For example, you might want your colleague to test/review your change before merging. The way to do this would be by a pull request.
To create a pull request we first need to push our changes to the remote repository:
# make sure you are on the development branch
git checkout 1-create-a-panel-of-figures
# push the changes to the remote repository
git push
Now browse to the repository on GitHub. You will already see a yellow banner with the question whether you want to create a pull request. However, we will do it in a more explicit way. Go to the tab Pull requests. Click the button New pull request. As base specify the main
branch. At compare specify 1-create-a-panel-of-figures
. Now review the displayed changes and click the button Create pull request. Add comments if necessary. In the same page, you can specify reviewers and assignees if you want others to become involved. We skip this for now, and click Merge pull request to merge 1-create-a-panel-of-figures
to main
.
- Check out the issue. Has anything happened to it?
- Switch to the main branch locally and pull the changes your local repository
- Validate whether the development branch is merged to main
- Check out the commits with
git log --oneline
. Anything changed in the commit history? - Re-publish the website
The issue should be marked as ‘Closed’. This has happened automatically after merging the pull request, because GitHub created a reference between the issue and the branch.
To switch to main
and pull the changes:
git checkout main
git pull
Now the code with the figure panel should be added to analysis.qmd
. If you run git log --oneline
you will see that the merge is stored as a commit, looking something like this:
4fdf90e Merge pull request #1 from yourusername/1-create-a-panel-of-figures
To publish
quarto publish gh-pages
[Extra] render with GitHub actions
Wouldn’t it be nice if the website would be updated automatically when we push changes to the main branch? This can be done with GitHub actions. GitHub actions are a set of jobs that run after certain changes (triggers) to your repository. For example, you can run a job after you push to a specific branch. In this case, we want to render the website after a push to the main branch.
Set up renv
In order to render the website as a github action, we need to define the R package dependencies that are required to run the code. For us, that would be among others ggplot
, pheatmap
, babynames
and others. To conveniently handle this, we use the package renv
, which basically looks for packages you are using in your project and creates an environment definition file (called renv.lock
) based on which packages it finds.
To set up renv
for you project, run the following:
install.packages("renv")
::init()
renv::snapshot() renv
Here, init
will initiate renv in your project by creating the required files, among which renv.lock
. The function snapshot
will create a snapshot of your current project dependencies and add it to renv.lock
.
After running renv::init()
and renv::snapshot()
see what is in renv.lock
and commit the changes of your repository to the main branch.
The file renv.lock
is a json file containing all package metadata. To add it to your repository:
git checkout main
git add --all
git commit -m "set up renv"
Define the GitHub action
Now that we have defined the required dependencies, we can define the action. We will provide you with a template for this:
name: Quarto Publish
# the on statement tells github actions whnen to run the action
on:
# pushing to main
push:
branches: main
# set permissions so the actions in the workflow can write files to your repository
permissions: write-all
# define the job
jobs:
# we name it quarto-deploy
quarto-deploy:
# define the virtual environment
runs-on: ubuntu-latest
# the steps are the individual actions that are needed to complete the job
steps:
# first step is check out the repository so we have access to the files
# inside the repository in our virtual environment
- name: Check out repository
uses: actions/checkout@v4
# now we install quarto in the environment
- name: Set up Quarto
uses: quarto-dev/quarto-actions/setup@v2
# we also install R
- name: Install R
uses: r-lib/actions/setup-r@v2
with:
# ! important: change this version to your local R version
# i.e. the version that is specified in renv.lock
r-version: '4.3.1'
# and the required R dependencies
- name: Install R Dependencies
uses: r-lib/actions/setup-renv@v2
with:
cache-version: 1
# now we use a pre-created action to basically run
# quarto publish gh-pages
- name: Render and Publish
uses: quarto-dev/quarto-actions/publish@v2
with:
target: gh-pages
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Compare the version of R that is installed on your local machine with the one specified in the yaml file. Update it to the version that you have locally installed (and is specified in renv.lock
). Store the file in your repository as .github/workflows/quarto-publish.yml
. Also add and commit this file to you repository with:
git add .github/workflows/quarto-publish.yml
git commit -m "adds action yaml"
Now, we’re ready to push and with that trigger the run of our first action. Go ahead and push the commits to remote repository with:
git push
After pushing, go to the Actions tab in your repository on GitHub. You should see a new action running named ‘Quarto Publish’. After it finishes, you can check out the rendered website, which will be hosted on USER.github.io/REPOSITORY.
This will take a while, because all packages need to be compiled.
From now on, every time you push to main, the action will run and the website will be updated.