Guidelines and protocols for Pathway Analysis of omics data
The Statistics and Bioinformatics Unit ([UEB](http://ueb.vhir.org)) at Vall d’Hebron Research Institute (VHIR) is a support platform whose mission is providing expert advice, services and training in bioinformatics and statistics for clinical and biomedical research.
At the bioinformatics platform of UEB we work with distinct omics data and apply our knowledge to provide researchers with accurate, reliable, robust and, if possible, quick results.
Many of our analyses deal with selecting some type of “feature”, such as genes, proteins or microRNAs, which may act as biomarkers for distinguishing, predicting or classifying distinct biological states.
These analyses can be roughly separated into two parts:
This document provides guidelines and examples of the analyses that can be performed on one or more gene lists to help gain biological insight into the results of some type of omics data analysis (e.g. selection of differentially expressed genes).
Overall, these analyses are known as Pathway Analysis. Although the exact meaning of the term is open to discussion, we keep it here and use it from now on to describe any analysis based on a list of genes or other molecular features obtained from an omics data analysis.
There is a huge variety of methods and tools for doing pathway analysis. Khatri et al. (2012) still provides a good comparative overview. We refer the reader to this and similar papers and go straight to the list of recommended tools.
Because we aim at being useful rather than comprehensive, we begin by considering three possible approaches that a researcher can take when she is faced with the mission of “doing a pathway analysis of my gene list”.
As the page progresses we expect to add more examples, more tools, and more explanations.
The three approaches described below differ in sophistication and difficulty.
The simplest way to do pathway analysis is to use one of the tools embedded in popular pathway databases such as the Gene Ontology and Reactome. Each of these websites offers its own tool for pathway analysis.
There are many other web pages or standalone programs that can be used to do pathway analysis. See some of them in the “Pathway Analysis Tools” section.
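Many enrichment tools of this kind are built around an over-representation test: given a gene list, a pathway and a background "universe" of genes, they ask whether the list contains more pathway genes than expected by chance. The sketch below is a minimal illustration of that idea using a one-sided hypergeometric test. It is not the implementation of any particular tool, and the function names and toy gene identifiers are made up for the example. Real tools also correct for multiple testing across pathways, which is not shown here.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_pathway, n_selected, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric variable X.

    X counts pathway genes among n_selected genes drawn without
    replacement from a universe of n_universe genes, of which
    n_pathway belong to the pathway.
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def over_representation(gene_list, pathway, universe):
    """Test one pathway against one gene list (hypothetical helper)."""
    universe = set(universe)
    # Genes outside the universe cannot be drawn, so they are ignored.
    gene_list = set(gene_list) & universe
    pathway = set(pathway) & universe
    overlap = gene_list & pathway
    p_value = hypergeom_enrichment_pvalue(
        len(universe), len(pathway), len(gene_list), len(overlap)
    )
    return sorted(overlap), p_value


# Toy data: 20 background genes, a 5-gene pathway, 4 selected genes.
universe = [f"g{i}" for i in range(1, 21)]
pathway = ["g1", "g2", "g3", "g4", "g5"]
selected = ["g1", "g2", "g3", "g10"]

shared, p = over_representation(selected, pathway, universe)
print(shared, round(p, 4))
```

The choice of universe matters: using all genes measured in the experiment, rather than all genes in the genome, usually gives a more honest background.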
If one wishes to do a more complete or sophisticated analysis, one can turn to a recently published Nature Protocols article: Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap. This protocol shows how to use some publicly available tools to go from basic enrichment analysis to visualization of results and, more importantly, to a guided interpretation. The protocol is behind a paywall, but a preprint is available at bioRxiv: Pathway enrichment analysis of -omics data (preprint).
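GSEA, one of the tools used in that protocol, works on a full ranked gene list rather than on a thresholded list of selected genes. As a rough illustration only, the following sketch computes an unweighted running-sum enrichment score (a Kolmogorov-Smirnov-like statistic). The function name and toy genes are hypothetical. The GSEA software by default weights hits by their ranking metric and estimates significance by permutation, and neither of these is shown here.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: genes ordered from most up- to most down-regulated.
    gene_set: genes belonging to the pathway of interest.
    Returns the running-sum value with the largest absolute deviation
    from zero (positive: set enriched at the top of the ranking).
    """
    gene_set = set(gene_set)
    n_total = len(ranked_genes)
    n_hits = sum(1 for gene in ranked_genes if gene in gene_set)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_step = 1.0 / n_hits                # added when a gene is in the set
    miss_step = 1.0 / (n_total - n_hits)   # subtracted otherwise

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in gene_set else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking of six genes, from most up- to most down-regulated.
ranking = ["a", "b", "c", "d", "e", "f"]
es_top = enrichment_score(ranking, {"a", "b"})     # set sits at the top
es_bottom = enrichment_score(ranking, {"e", "f"})  # set sits at the bottom
print(es_top, es_bottom)
```

A score near +1 or -1 means the gene set is concentrated at one end of the ranking. A score near 0 means its members are spread throughout the list.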
Last but not least, one can always use state-of-the-art tools available from the Bioconductor project. Bioconductor contains many options for doing similar or distinct analyses.
To avoid getting lost in the myriad of options, a good starting point is the clusterProfiler package, which not only allows a variety of analyses to be done through a homogeneous interface but also has excellent and extensive documentation.
People tend to ask questions like “Where can I find a tool to do pathway analysis?” more often than “How can I learn to do pathway analysis the right way?” We are convinced that the right way to proceed is to answer the second question first and the first one next.
With this aim in mind, we have compiled some materials that we have found on the web (including ours) that can provide a gentle introduction to this topic, which is easier to do than to define.
There are many references on pathway analysis of omics data. Indeed, one might even think that there are too many papers on this topic, because many of the available methods and tools seem to be minor variations of pre-existing ones. Instead of trying to compile an exhaustive list, we simply present a few papers that seem relevant to us. For a longer or more complete list, see the references therein.
There are literally dozens, probably hundreds, of pathway analysis tools. The table below contains an opinionated list of a few popular tools that cover the spectrum of tool types, such as commercial/non-commercial and web-based/standalone/R-based. Only one link escapes this classification: https://tools4mirs.org/software/target_functional_analysis/ points to a page which in turn points to functional analysis tools for microRNA target genes.
Tool Name | Web interface | Standalone | R version | Commercial |
---|---|---|---|---|
DAVID | yes | no | yes | no |
PANTHER | yes | no | no | no |
WebGestalt | yes | no | no | no |
GSEA | no | yes | yes | no |
g:Profiler | yes | no | yes | no |
Enrichr | yes | no | yes | no |
Tools4miRs | yes | NA | NA | NA |
clusterProfiler | no | no | yes | no |
PathVisio | yes | no | no | no |
GeneGo/MetaCore | no | yes | no | yes |
IPA | no | yes | no | yes |