Server login

Learning outcomes

Note

You might already master some or all of these learning outcomes. If so, you can go through the corresponding exercises quickly. The general aim of this chapter is to work comfortably on a remote server using the command line.

After having completed this chapter you will be able to:

- Use the command line to:
  - Make a directory
  - Change file permissions to ‘executable’
  - Run a bash script
  - Pipe data from and to a file or other executable
- Program a loop in bash

Choose your platform

In this part we will show you how to access the cloud server, or set up your computer to do the exercises with conda or with Docker.

If you are doing the course with a teacher, you will have to log in to the remote server. Therefore choose:

Cloud notebook

If you are doing this course independently (i.e. without a teacher) choose either:

conda
Docker

Cloud notebook

If you are participating in this course with a teacher, you have received a link and a password. Copy-paste the link (including the port, e.g.: http://12.345.678.91:10002) in your browser. This should result in the following page:

Type your password, and proceed to the notebook home page. This page contains all the files in your working directory (if there are any). Most of the exercises will be executed through the command line. Here’s a video that explains how to use JupyterLab to use a terminal and work with scripts:

If you prefer to read, here is a written explanation of how to work with JupyterLab. First, let’s open the terminal. Find it at New > Terminal:

Among other reasons, for efficiency and reproducibility, it makes sense to execute your commands from a script. You can generate and edit scripts with New > Text File:

Once you have opened a script you can change the code highlighting, which is convenient when writing code. The text editor automatically sets the highlighting based on the file extension (e.g. a .py extension results in Python syntax highlighting). You can change or set the syntax highlighting by clicking the button at the bottom of the page. We will mainly be using shell scripting in this course, so here’s an example of adjusting it to shell syntax highlighting:

Docker

Material

Instructions to install Docker
Instructions to set up the container

Exercises

Docker can be used to run an entire isolated environment in a container. This means that we can run the software required for this course, with all its dependencies, locally on your computer, independent of your operating system.

In the video below there’s a tutorial on how to set up a docker container for this course. Note that you will need administrator rights, and that if you are using Windows, you need the latest version of Windows 10.

The command to run the environment required for this course looks like this (in a terminal):

Modify the script

Modify the path after -v to the working directory on your computer before running it.

docker run \
--rm \
-e JUPYTER_ENABLE_LAB=yes \
-v /path/to/workingdir/:/home/jovyan \
-p 8888:8888 \
geertvangeest/ngs-variants-jupyter:latest \
start-notebook.sh

If this command has run successfully, you will find a link and token in the console, e.g.:

http://127.0.0.1:8888/?token=4be8d916e89afad166923de5ce5th1s1san3xamp13

Copy this URL into your browser, and you will be able to use the jupyter notebook.

The option -v mounts a local directory on your computer to the directory /home/jovyan in the Docker container (‘jovyan’ is the default user for Jupyter containers). That way, files are available both in the container and on your computer. Use this directory on your computer to, for example, visualise data with IGV. Change the first path to a path on your computer that you want to use as a working directory.

Don’t mount directly in the home dir

Don’t directly mount your local directory to the home directory (/root). This will lead to unexpected behaviour.

The part geertvangeest/ngs-variants-jupyter:latest is the image we are going to load into the container. The image contains all the information about software and dependencies needed for this course. When you run this command for the first time it will download the image. Once it’s on your computer, it will start immediately.

conda

If you have a conda installation on your local computer, you can install the required software using conda.

You can build the environment from ngs-variants.yml

Generate the conda environment like this:

conda env create --name ngs-variants -f ngs-variants.yml

The yaml file probably only works for Linux systems

If you want to use the conda environment on a different OS, use:

conda create -n ngs-variants python=3.8

conda activate ngs-variants

conda install -y -c bioconda \
samtools \
bwa \
snpeff \
gatk4

This will create the conda environment ngs-variants

Activate it like so:

conda activate ngs-variants

After successful installation and activating the environment all the software required to do the exercises should be available.

A UNIX command line interface (CLI) refresher

Most bioinformatics software is UNIX-based and is executed through the CLI. When working with NGS data, it is therefore useful to improve your knowledge of UNIX. For this course, we need a basic understanding of the UNIX CLI, so here are some exercises to refresh your memory.

Make a new directory

Log in to the server and use the command line to make a directory called workdir.

If working with Docker

If you are working with Docker you are a root user. This means that your “home” directory is the root user’s home directory, i.e. /root, and not /home/username. If you have mounted your local directory to /root/workdir, this directory should already exist.

Answer

cd
mkdir workdir

Make a directory scripts within ~/workdir and make it your current directory.

Answer

cd workdir
mkdir scripts
cd scripts
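The two answers above can also be combined into a single step. As a minimal sketch: mkdir -p creates a directory together with any missing parent directories, and does not complain if they already exist. The variable DEMO_HOME is a made-up stand-in for your home directory, so the example does not touch your real files:

```shell
# Use a throwaway directory instead of the real home directory (assumption for this demo)
DEMO_HOME=$(mktemp -d)

# -p creates workdir and scripts in one go, and is safe to run again
mkdir -p "$DEMO_HOME/workdir/scripts"
cd "$DEMO_HOME/workdir/scripts"

# Print the current directory to confirm where we are
pwd
```

On the server you would replace "$DEMO_HOME" with ~ to get the same result as the answers above.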

File permissions

Generate an empty script in your newly made directory ~/workdir/scripts like this:

touch new_script.sh

Add a command to this script that writes “SIB courses are great!” (or something you can better relate to) to stdout, and try to run it.

Answer

The script should look like this:

#!/usr/bin/env bash

echo "SIB courses are great!"

Usually, you can run it like this:

./new_script.sh

But there’s an error:

bash: ./new_script.sh: Permission denied

Why is there an error?

Hint

Use ls -lh new_script.sh to check the permissions.

Answer

ls -lh new_script.sh

gives:

-rw-r--r--  1 user  group    51B Nov 11 16:21 new_script.sh

There’s no x in the permissions string. You should change at least the permissions of the user.

Make the script executable for yourself, and run it.

Answer

Change permissions:

chmod u+x new_script.sh

ls -lh new_script.sh now gives:

-rwxr--r--  1 user  group    51B Nov 11 16:21 new_script.sh

So it should be executable:

./new_script.sh

More on chmod and file permissions here.
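Permissions can be set with letters (symbolic mode, as in the answer above) or with numbers (numeric mode). The sketch below shows both on a hypothetical file demo_script.sh in a temporary directory; it is meant as an illustration, not as part of the exercise:

```shell
# Work in a throwaway directory (assumption for this demo)
cd "$(mktemp -d)"

# Create a small script; new files are normally not executable
printf '#!/usr/bin/env bash\necho "hello"\n' > demo_script.sh

# Symbolic mode: add execute permission for the user (owner) only
chmod u+x demo_script.sh

# Numeric mode: 7 = rwx for the user, 4 = r-- for the group, 4 = r-- for others
chmod 744 demo_script.sh

# [ -x file ] tests whether the file is executable for the current user
if [ -x demo_script.sh ]; then
  ./demo_script.sh
fi
```

Both chmod calls lead to the permission string -rwxr--r-- here, provided the file started out as -rw-r--r--.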

Redirection: `>` and `|`

In the root directory (go there like this: cd /) there is a range of system directories and files. Write the names of all directories and files to a file called system_dirs.txt in your home directory (use ls and >).

Answer

ls / > ~/system_dirs.txt

The command wc -l counts the number of lines, and can read from stdin. Make a one-liner with a pipe | symbol to find out how many system directories and files there are.

Answer

ls / | wc -l
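Besides > and |, two related operators are often useful: >> appends to a file instead of overwriting it, and < sends the contents of a file to stdin. A minimal sketch, using a hypothetical file notes.txt in a temporary directory:

```shell
# Work in a throwaway directory (assumption for this demo)
cd "$(mktemp -d)"

# > creates the file, or overwrites it if it already exists
echo "first line" > notes.txt

# >> appends to the end of the file instead of overwriting it
echo "second line" >> notes.txt

# < feeds the file to stdin, so wc prints only the count (no file name)
wc -l < notes.txt

# Pipes can be chained: list /, sort in reverse order, keep the first three names
ls / | sort -r | head -n 3
```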

Variables

Store the path to system_dirs.txt in a variable (like this: VAR=value), and use wc -l on that variable to count the number of lines in the file.

Answer

FILE=~/system_dirs.txt
wc -l "$FILE"
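It is good practice to put double quotes around a variable when you use it, because an unquoted value that contains spaces is split into several arguments. The sketch below illustrates this with a made-up file name, and also shows command substitution, $(...), which stores the output of a command in a variable:

```shell
# Work in a throwaway directory (assumption for this demo)
cd "$(mktemp -d)"

# A file name with a space in it (hypothetical example)
FILE="system dirs.txt"
ls / > "$FILE"

# Quoted, the name stays one argument; unquoted, wc would look for two files
wc -l "$FILE"

# Command substitution stores the output of a command in a variable
N_LINES=$(wc -l < "$FILE")
echo "There are $N_LINES entries in /"
```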

Shell scripts

Make a shell script that automatically counts the number of system directories and files.

Answer

Make a script called e.g. current_system_dirs.sh:

#!/usr/bin/env bash
cd /
ls | wc -l
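A script becomes more flexible if it accepts an argument. As a sketch under stated assumptions (the script name count_entries.sh and its optional argument are made up for this illustration), the variant below counts the entries of any directory and falls back to / if no directory is given:

```shell
# Work in a throwaway directory (assumption for this demo)
cd "$(mktemp -d)"

# Write the script with a here-document; quoting 'EOF' keeps $1 from
# being expanded while the file is written
cat > count_entries.sh << 'EOF'
#!/usr/bin/env bash
# Count the entries in the directory given as first argument (default: /)
DIR="${1:-/}"
ls "$DIR" | wc -l
EOF

# Run it through bash explicitly, so no execute permission is needed
bash count_entries.sh
bash count_entries.sh .
```

The first call counts the entries in /, the second counts the entries in the current directory, which here contains only the script itself.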

Loops

If you want to run the same command on a range of arguments, it’s not very convenient to type the command for each individual argument. For example, you could write dog, fox, bird to stdout in a script like this:

#!/usr/bin/env bash

echo dog
echo fox
echo bird

However, if you want to change the command (add an option, for example), you would have to change it in all three command calls. For that reason, among others, you want to write the command only once. You can do this with a for-loop, like this:

#!/usr/bin/env bash

ANIMALS="dog fox bird"

for animal in $ANIMALS
do
  echo $animal
done

Which results in:

dog
fox
bird
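A loop does not have to run over a list of words; it can also run over files that match a pattern. The sketch below assumes three hypothetical files called sample_A.txt, sample_B.txt and sample_C.txt, which it creates in a temporary directory first:

```shell
# Work in a throwaway directory and create three empty example files
cd "$(mktemp -d)"
touch sample_A.txt sample_B.txt sample_C.txt

# The pattern sample_*.txt expands to all matching file names
for file in sample_*.txt
do
  # ${file%.txt} is the file name with the .txt suffix removed
  echo "Processing ${file%.txt}"
done
```

This prints one line per file (Processing sample_A, and so on). The same pattern works for any command that has to be run once per input file.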

Write a shell script that removes all the letters “e” from a list of words.

Hint

Removing the letter “e” from a string can be done with tr like this:

word="test"
echo $word | tr -d "e"

Which would result in:

tst

Answer

Your script should e.g. look like this (I’ve added some awesome functionality):

#!/usr/bin/env bash

WORDLIST="here is a list of words resulting in a sentence"

for word in $WORDLIST
do
  echo "'$word' with e's removed looks like:"
  echo $word | tr -d "e"
done

resulting in:

'here' with e's removed looks like:
hr
'is' with e's removed looks like:
is
'a' with e's removed looks like:
a
'list' with e's removed looks like:
list
'of' with e's removed looks like:
of
'words' with e's removed looks like:
words
'resulting' with e's removed looks like:
rsulting
'in' with e's removed looks like:
in
'a' with e's removed looks like:
a
'sentence' with e's removed looks like:
sntnc

As you might be used to in R or Python, you can also loop over the lines of a file. This can be convenient if, for example, each line of a file contains a set of parameters.

Create a tab-delimited file animals.txt with the following contents:

dog	retrieves	4
fox	jumps	4
bird	flies	2

Hint

If you’re having trouble typing the actual ‘tabs’ you can also download the file here

With the UNIX shell you can loop over the lines of that file and store each column in a variable. Below, the three columns of the tab-delimited file are stored in the variables $animal, $behaviour and $leg_number:

cat animals.txt | while read animal behaviour leg_number
do
    #something here
done

Exercise: Modify the script in such a way that it writes the strings that are stored in the variables at each line to stdout.

Answer

cat animals.txt | while read animal behaviour leg_number
do
    echo "The $animal $behaviour, and has $leg_number legs" 
done
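An alternative way to write this loop is to redirect the file into it with < instead of piping it from cat. This is a sketch, not the course's reference solution; the variable TOTAL_LEGS is added purely for illustration. In bash, a loop at the end of a pipe normally runs in a subshell, so variables set inside it are lost afterwards; with < they are kept. Setting IFS to a tab makes read split on tabs only, and -r keeps backslashes as they are:

```shell
# Work in a throwaway directory and recreate animals.txt with real tabs
cd "$(mktemp -d)"
printf 'dog\tretrieves\t4\nfox\tjumps\t4\nbird\tflies\t2\n' > animals.txt

TOTAL_LEGS=0

# IFS is set to a tab for the read command only
while IFS="$(printf '\t')" read -r animal behaviour leg_number
do
  echo "The $animal $behaviour, and has $leg_number legs"
  # Add the third column to the running total
  TOTAL_LEGS=$((TOTAL_LEGS + leg_number))
done < animals.txt

# The total is still available here because the loop did not run in a subshell
echo "Total number of legs: $TOTAL_LEGS"
```

The last line reports a total of 10 legs.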
