Useful links

Tools for RNAseq DE analysis

limma for differential gene expression analysis of microarray or RNA seq data; also includes functions for enrichment analysis

https://www.bioconductor.org/packages/devel/bioc/vignettes/limma/inst/doc/usersguide.pdf

edgeR for differential gene expression analysis of RNA seq data

https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

DESeq2 for differential gene expression analysis of RNA seq data

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

From reads to genes to pathways: pipeline for DGE analysis and pathway analysis

https://f1000research.com/articles/5-1438

Bioconductor: introduction and structure

https://ivanek.github.io/analysisOfGenomicsDataWithR/02_IntroToBioc_html.html

Additional tools for enrichment analysis

clusterProfiler e-Book

https://yulab-smu.github.io/clusterProfiler-book/

fgsea package with the fgsea() function called by clusterProfiler

https://bioconductor.org/packages/release/bioc/html/fgsea.html

GOsummaries Bioconductor package, to visualise Gene Ontology (GO) enrichment analysis results on gene lists arising from different analyses such as clustering or PCA. The significant GO categories are visualised as word clouds that can be combined with different plots summarising the underlying data.

https://www.bioconductor.org/packages/release/bioc/html/GOsummaries.html

online tool for overrepresentation analysis

http://www.pantherdb.org/

online tool for gene set enrichment analysis

http://www.webgestalt.org/

Revigo and rrvgo

http://revigo.irb.hr/

http://bioconductor.org/packages/release/bioc/html/rrvgo.html

A recent publication of the Gene Ontology Consortium that aims to improve taxon constraints, i.e. to better define which terms are relevant for some species but not for others (such as leukocyte-related terms, which should be constrained to vertebrates).

ChEBI: A freely available dictionary of molecular entities focused on ‘small’ chemical compounds. Can be used as a database for GSEA in metabolomics studies.

Considerations and challenges of enrichment analysis

A recent publication (2020) on gene set analysis challenges, opportunities and future research.

Importance of ranking metrics in GSEA: the paper “Ranking metrics in gene set enrichment analysis: do they matter?”

Mubeen et al, 2022: On the influence of several factors on pathway enrichment analysis

Wijesooriya et al, 2022: Urgent need for consistent standards in functional enrichment analysis

Tools for species other than human or mouse

OrgDb packages: Bioconductor provides “org.*.db” annotation packages with genome/gene annotation for many organisms, for example Drosophila (org.Dm.eg.db) or Arabidopsis (org.At.tair.db).

For fungi, FungiDB provides identification of metabolic pathways based on a user-selected list of genes

PlantGSAD, a gene set database for plants: https://academic.oup.com/nar/article/50/D1/D1456/6371972?login=true

Possibly also the AllEnricher tool by Zhang et al: a comprehensive gene set function enrichment tool for both model and non-model species

A pipeline for non-model species using Scots pine as an example: The authors describe how to assemble the transcriptome from RNA sequencing data, how to annotate genes, and suggest GO enrichment analysis.

eggNOG: A hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. The paper describes a public database of orthology relationships, gene evolutionary histories and functional annotations.

Proteomics

Bessarabova et al, 2012: Knowledge-based analysis of proteomics data. Not the most recent paper but a nice overview.

Combes et al, 2021: GO Enrichment Analysis for Differential Proteomics Using ProteoRE.

Proteomics data analysis in R: https://pnnl-comp-mass-spec.github.io/proteomics-data-analysis-tutorial/. Chapter 8 contains code for ORA and GSEA with clusterProfiler, but also with fgsea.