Useful links
Tools for RNAseq DE analysis
limma for differential gene expression analysis of microarray or RNA-seq data; it also includes functions for enrichment analysis
https://www.bioconductor.org/packages/devel/bioc/vignettes/limma/inst/doc/usersguide.pdf
edgeR for differential gene expression analysis of RNA-seq data
https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf
DESeq2 for differential gene expression analysis of RNA-seq data
http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html
From reads to genes to pathways: a pipeline for DGE analysis and pathway analysis
https://f1000research.com/articles/5-1438
Bioconductor: introduction and structure
https://ivanek.github.io/analysisOfGenomicsDataWithR/02_IntroToBioc_html.html
Additional tools for enrichment analysis
clusterProfiler e-Book
https://yulab-smu.github.io/clusterProfiler-book/
fgsea package, with the fgsea() function called by clusterProfiler
https://bioconductor.org/packages/release/bioc/html/fgsea.html
GOsummaries Bioconductor package, to visualise Gene Ontology (GO) enrichment analysis results on gene lists arising from different analyses such as clustering or PCA. The significant GO categories are visualised as word clouds that can be combined with different plots summarising the underlying data.
https://www.bioconductor.org/packages/release/bioc/html/GOsummaries.html
online tool for overrepresentation analysis
online tool for gene set enrichment analysis
Revigo and rrvgo
http://bioconductor.org/packages/release/bioc/html/rrvgo.html
A recent publication of the Gene Ontology Consortium. The authors try to improve taxon constraints, i.e. to better define which terms are relevant for some species but not for others (such as leukocyte-related terms, which should be constrained to vertebrates).
ChEBI: A freely available dictionary of molecular entities focused on ‘small’ chemical compounds. Can be used as a database for GSEA in metabolomics studies.
Considerations and challenges of enrichment analysis
A recent publication (2020) on gene set analysis: challenges, opportunities and future research.
Importance of the ranking metrics in GSEA: the paper “Ranking metrics in gene set enrichment analysis: do they matter?”
Mubeen et al, 2022: On the influence of several factors on pathway enrichment analysis
Wijesooriya et al, 2022: Urgent need for consistent standards in functional enrichment analysis
Tools for species other than human or mouse
OrgDB packages: Bioconductor contains “org.db” packages with genome/gene annotation for many organisms, for example for Drosophila or Arabidopsis.
For fungi, FungiDB provides identification of metabolic pathways based on a user-selected list of genes
PlantGSAD, a gene set database for plants: https://academic.oup.com/nar/article/50/D1/D1456/6371972?login=true
Possibly also the AllEnricher tool by Zhang et al: AllEnricher: a comprehensive gene set function enrichment tool for both model and non-model species
A pipeline for non-model species using Scots pine as an example: The authors describe how to assemble the transcriptome from RNA sequencing data, how to annotate genes, and suggest GO enrichment analysis.
eggNOG: A hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. A public database of orthology relationships, gene evolutionary histories and functional annotations.
Proteomics
Bessarabova et al, 2012: Knowledge-based analysis of proteomics data. Not the most recent paper but a nice overview.
Combes et al, 2021: GO Enrichment Analysis for Differential Proteomics Using ProteoRE.
Proteomics data analysis in R: https://pnnl-comp-mass-spec.github.io/proteomics-data-analysis-tutorial/. Chapter 8 contains code for ORA and GSEA with clusterProfiler, but also with fgsea.