Pathway based data integration and visualization

Help Information

Custom Analysis

Input & Output

Upload your gene and/or compound data, specify species, pathways, ID type etc.

Graphics

Specify the layout, style, and node/edge or legend attributes of the output graphs.

Coloration

Customize the color coding of your gene and compound data.

  • Input & Output Options
    • Gene Data
    • Compound Data
    • Gene ID Type
    • Compound ID Type
    • Species
    • Pathway Selection
    • Pathway ID
    • Output Suffix
  • Graphics Options
    • Kegg Native
    • Same Layer
    • Discrete
    • Split Group
    • Expand Node
    • Multi State
    • Match Data
    • Compound Label Offset
    • Key Alignment
    • Signature Position
    • Key Position
  • Coloration Options
    • Node Sum
    • NA Color
    • Limit
    • Bins
    • Low
    • Mid
    • High






Example Analysis

Multiple Sample KEGG View

This example shows the multiple sample/state integration with Pathview KEGG view.

Multiple Sample Graphviz View

This example shows the multiple sample/state integration with Pathview Graphviz view.

ID Mapping

This example shows the ID mapping capability of Pathview.

Integrated Pathway Analysis

This example covers an integration pathway analysis workflow based on Pathview.

User Options: Input and Output

Gene Data

Gene Data accepts data matrices in tab- or comma-delimited format (txt or csv). The data matrix has genes as rows and samples as columns. The first column should contain gene IDs and the first row sample IDs. The data may also be a single column of gene IDs. Here gene ID is a generic concept, including multiple types of transcript or protein IDs that are uniquely mappable to KEGG gene IDs, for example ENTREZ Gene, Symbol, RefSeq, GenBank Accession Number, UNIPROT, Enzyme Accession Number, etc. Users can specify this information through the Gene ID Type option below. KEGG ortholog IDs are also treated as gene IDs so that metagenomic data can be handled.
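As an illustration of the expected layout, the following Python sketch parses a small tab-delimited matrix into gene IDs, sample IDs, and values. It is not part of Pathview; the function name read_data_matrix is hypothetical, and the sketch assumes that the header row includes a cell above the ID column and that missing values are written as empty cells or "NA".

```python
import csv
import io


def read_data_matrix(text, delimiter="\t"):
    """Parse a delimited matrix: first row = sample IDs, first column = gene IDs."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    header, body = rows[0], rows[1:]
    samples = header[1:]  # assumption: the first header cell labels the ID column
    ids = [r[0] for r in body]
    # Assumption: empty cells or "NA" mark missing values; they become None.
    values = [
        [None if c.strip() in ("", "NA") else float(c) for c in r[1:]]
        for r in body
    ]
    return ids, samples, values


example = "gene\tctrl1\tcase1\n1017\t7.2\t8.1\n1019\t6.5\tNA\n"
ids, samples, values = read_data_matrix(example)
print(ids)      # ['1017', '1019']
print(samples)  # ['ctrl1', 'case1']
print(values)   # [[7.2, 8.1], [6.5, None]]
```

For a comma-delimited (csv) file, pass delimiter="," instead.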

Both the absolute or original expression levels and the relative expression levels (log2 fold changes, t-statistics) can be visualized on pathways. However, the latter are more frequently used. If you supply data as original expression levels but want to visualize the relative expression levels (or differences) between two states, you need to specify a few extra options (NOT needed if you just want to visualize the input data as it is):

  • Control/reference:
    the column numbers for controls;
  • Case/sample:
    the column numbers for cases;
  • Compare:
    whether the experiment samples are paired or not.

For examples of gene data, check: Example Gene Data 1 and Example Gene Data 2.
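
To make the Control/reference, Case/sample, and Compare options concrete, here is a minimal Python sketch of one plausible way to turn original expression levels into relative levels. It is an illustration only and not the website's actual computation. It assumes the values are already on a log2 scale, that column numbers are 1-based and count data columns only (not the ID column), that paired designs subtract each control from its matching case, and that unpaired designs subtract the mean control. The function name relative_levels is hypothetical.

```python
from statistics import mean


def relative_levels(row, ref_cols, case_cols, paired=False):
    """Return case-minus-control differences for one gene (log-scale values)."""
    ref = [row[i - 1] for i in ref_cols]    # 1-based column numbers
    case = [row[i - 1] for i in case_cols]
    if paired:
        if len(ref) != len(case):
            raise ValueError("a paired design needs equal numbers of controls and cases")
        return [c - r for c, r in zip(case, ref)]
    baseline = mean(ref)  # unpaired: compare each case to the average control
    return [c - baseline for c in case]


row = [7.0, 9.0, 8.0, 11.0]  # log2 expression in data columns 1..4
print(relative_levels(row, ref_cols=[1, 2], case_cols=[3, 4], paired=True))   # [1.0, 2.0]
print(relative_levels(row, ref_cols=[1, 2], case_cols=[3, 4], paired=False))  # [0.0, 3.0]
```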

If you intend to do a full pathway analysis plus data visualization (or integration), you need to set Pathway Selection below to Auto. Gene Data and/or Compound Data will then also be taken as the input data for pathway analysis. Frequently, you also need to set the extra options Control/reference, Case/sample, and Compare in the dialogue box. However, these options are NOT needed if your data are already relative expression levels or differential scores (log ratios or fold changes). Example 4 covers the full pathway analysis.

Compound Data
Compound Data accepts data matrices in tab- or comma-delimited format (txt or csv). The format is the same as for Gene Data (and you may also need to specify sample columns and experiment design), except that rows are compounds, including metabolites, drugs, small molecules etc. For examples, check: Example Compound Data 1 and Example Compound Data 2.
Like the gene concept, the compound concept here is also generic, including multiple types of metabolites, drugs, and small molecules. Users can specify this information through the Compound ID Type option below.
Gene ID Type
ID type used for the Gene Data. This can be selected from the autosuggest drop down list.
Compound ID Type
ID type used for the Compound Data. This can be selected from the autosuggest drop down list.
Species
Either the KEGG code, scientific name or the common name of the target species. Species may also be "ko" for KEGG Orthology pathways. This can be selected from the autosuggest drop down list.
Pathway Selection
Whether the target pathways for visualization are selected automatically or specified by the user. Auto-selection is recommended if the user is not sure which pathway(s) to view. Pathways are selected using GAGE for continuous data or an over-representation test for discrete data (i.e. a list of gene or compound IDs). If no pathways are called significant, the few top pathways will be selected. When both gene data and compound data are present, pathway analysis is done on the two datasets separately first, then the results are combined into more robust global statistics/p-values through meta-analysis. You may check or try Example 4 to see the effect of setting this option.
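The text above does not say which over-representation test is applied to discrete data. As a sketch only, the following Python example uses the hypergeometric upper tail, a common choice for this kind of test; treating it as the test used by the website is an assumption. The function name and parameter names are hypothetical.

```python
from math import comb


def over_representation_p(hits, selected, pathway_size, universe):
    """Upper-tail hypergeometric p-value: P(X >= hits).

    universe     -- number of IDs that could have been selected
    pathway_size -- how many of those IDs belong to the pathway
    selected     -- number of IDs in the user's list
    hits         -- number of listed IDs that fall in the pathway
    """
    upper = min(selected, pathway_size)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# 3 IDs selected from a universe of 10; the pathway holds 4 IDs; 2 of the 3 hit it.
p = over_representation_p(hits=2, selected=3, pathway_size=4, universe=10)
print(round(p, 4))  # 0.3333
```

A smaller p-value means the overlap is less likely to arise by chance, so pathways would be ranked by ascending p-value.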
Pathway ID
KEGG pathway ID(s), usually 5 digits. IDs can be entered in two ways: from the select box or from the autosuggest text box. This option is not needed when the Pathway Selection option is set to Auto.
Output Suffix
The suffix to be added after the pathway name as part of the output graph file name. Sample names or column names of the Gene Data or Compound Data are also added when there are multiple samples.

User Options: Graphics

Kegg Native
Whether to render the pathway as a native KEGG graph (.png) or using the Graphviz layout engine (.pdf). Note that the Graphviz view may drop nodes due to missing data in the KEGG xml data files.
Same Layer
Controls plotting layers: 1) whether node colors are plotted in the same layer as the pathway graph when Kegg Native is checked, 2) whether the edge/node type legend is plotted on the same page when Kegg Native is unchecked.
Discrete (Gene and Compound)
Whether Gene Data or Compound Data should be treated as discrete. Default values are both FALSE, i.e. both datasets are treated as continuous.
Keys Alignment
How the color keys are aligned when both Gene Data and Compound Data are not NULL. Potential values are "x", aligned by x coordinates, and "y", aligned by y coordinates.
Split Group
Whether node groups are split into individual nodes. Each split member node inherits all edges from the node group. This option only affects the Graphviz graph view, i.e. when Kegg Native is FALSE. This option also affects most metabolic pathways, even those without group nodes defined originally. For these pathways, genes involved in the same reaction are grouped automatically when converting reactions to edges unless Split Group is TRUE. Default value is FALSE.
Expand Node
Whether multiple-gene nodes are expanded into single-gene nodes. Each expanded single-gene node inherits all edges from the original multiple-gene node. This option only affects the Graphviz graph view, i.e. when Kegg Native is FALSE. This option is not effective for most metabolic pathways, where it conflicts with converting reactions to edges. Default value is FALSE.
Multi State
Whether multiple states (samples or columns) of Gene Data or Compound Data should be integrated and plotted in the same graph. Default is TRUE. In other words, gene or compound nodes will be sliced into multiple pieces corresponding to the number of states in the data.
Match Data
Whether the samples of Gene Data and Compound Data are paired. Default is TRUE. Let the sample sizes of Gene Data and Compound Data be m and n. When m>n, extra columns of NA's (mapped to no color) will be added to Compound Data to make the sample sizes the same. This results in the same number of slices in gene nodes and compound nodes when Multi State is TRUE.
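The padding described for Match Data can be sketched in Python as follows. This is an illustration, not Pathview's code: the function name pad_to_match is hypothetical, None stands in for NA, and padding Gene Data when n>m is an assumption (the text above only describes the m>n case).

```python
def pad_to_match(gene_rows, cpd_rows):
    """Pad the narrower matrix with None (NA) columns so both have equal width."""
    m = len(gene_rows[0]) if gene_rows else 0
    n = len(cpd_rows[0]) if cpd_rows else 0
    width = max(m, n)

    def pad(rows):
        # Append as many None cells as each row is short of the common width.
        return [row + [None] * (width - len(row)) for row in rows]

    return pad(gene_rows), pad(cpd_rows)


gene = [[1.2, -0.5, 0.8]]  # m = 3 samples
cpd = [[0.3, 1.1]]         # n = 2 samples
padded_gene, padded_cpd = pad_to_match(gene, cpd)
print(padded_gene)  # [[1.2, -0.5, 0.8]]
print(padded_cpd)   # [[0.3, 1.1, None]]
```

After padding, each gene node and each compound node would be drawn with three slices, and the extra compound slice carries no color.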
Signature Position
Controls the position of pathview signature. Default value is "bottom right". No pathview signature will be put when "None" is selected. Potential values can be found in the drop down list.
Key Position
Controls the position of color key(s). Default value is "top left". No color key will be plotted when "None" is selected. Potential values can be found in the drop down list.
Compound Label Offset
How far compound labels should be placed above the default position or node center. This is useful when compounds are labeled by full name, which affects the look of compound nodes and color. Only effective when Kegg Native is FALSE.

User Options: Coloration

Node Sum
The method used to calculate the node summary when multiple genes or compounds are mapped to the same node. Potential values can be found in the drop down list. Default value is "Sum".
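As a sketch of what a node summary does, the Python example below collapses the values of several genes mapped to one node into a single number. The method names other than "sum" are illustrative assumptions (the actual choices are those in the drop down list), and skipping missing values before summarizing is also an assumption. The names SUMMARIES and node_summary are hypothetical.

```python
from statistics import mean, median

# Illustrative method table; only "sum" is confirmed by the text above.
SUMMARIES = {
    "sum": sum,
    "mean": mean,
    "median": median,
    "max": max,
    "max.abs": lambda xs: max(xs, key=abs),  # value with the largest magnitude
}


def node_summary(values, method="sum"):
    """Summarize the values of all genes or compounds mapped to one node."""
    present = [v for v in values if v is not None]  # ignore missing values
    if not present:
        return None  # nothing mapped: the node gets the NA color
    return SUMMARIES[method](present)


mapped = [1.0, -3.0, 0.5, None]
print(node_summary(mapped))             # -1.5
print(node_summary(mapped, "max"))      # 1.0
print(node_summary(mapped, "max.abs"))  # -3.0
```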
NA Color
Color used for NA's or missing values in Gene Data and Compound Data. Potential values are "transparent" or "grey".
Limit (Gene and Compound)
The limit values for Gene Data and Compound Data when converting them to pseudo colors. This is a numeric field in which you can enter two values separated by a comma, for example "1,2" (without quotes). The first value is the lower limit and the second value the upper limit. If a single value n is given, the limit is taken as (-n, n). The input fields are enabled after checking the respective checkboxes for Gene and Compound Data.
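The two accepted input forms for Limit can be illustrated with a short Python sketch. The function name parse_limit is hypothetical, and rejecting a lower limit that is not below the upper limit is an assumption about sensible validation, not documented behavior.

```python
def parse_limit(text):
    """Turn "low,high" or a single "n" into a (lower, upper) pair."""
    parts = [float(p) for p in text.split(",") if p.strip()]
    if len(parts) == 1:
        n = abs(parts[0])  # assumption: the sign of a single value is ignored
        return (-n, n)
    if len(parts) == 2:
        low, high = parts
        if low >= high:
            raise ValueError("the lower limit must be below the upper limit")
        return (low, high)
    raise ValueError("enter one value or two comma-separated values")


print(parse_limit("1,2"))  # (1.0, 2.0)
print(parse_limit("1"))    # (-1.0, 1.0)
```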
Bins (Gene and Compound)
This argument specifies the number of levels or bins for Gene Data and Compound Data when converting them to pseudo colors. Default value is 10.
Low, Mid, High (Gene and Compound)
These arguments specify the color spectra to code Gene Data and Compound Data. Default spectra (low-mid-high) "green-gray-red" and "blue-gray-yellow" are used for Gene Data and Compound Data respectively. Users may specify colors using common names (green, red etc), hex color codes (00FF00, D3D3D3 etc), or the color picker.
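To show how Limit, Bins, NA Color, and the Low/Mid/High colors fit together, here is a minimal Python sketch of one way to map values to pseudo colors. It is an assumption-laden illustration and not Pathview's implementation: colors are interpolated linearly in RGB, values outside the limit are clipped to it, and the function names are hypothetical. The hex codes are the ones given as examples above.

```python
def hex_to_rgb(code):
    """Convert a hex color code such as "00FF00" to an (r, g, b) tuple."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def make_palette(low, mid, high, bins=10):
    """Build `bins` colors running from low through mid to high."""
    stops = [hex_to_rgb(c) for c in (low, mid, high)]
    palette = []
    for b in range(bins):
        t = b / (bins - 1) * 2 if bins > 1 else 1.0  # position in [0, 2]
        seg = min(int(t), 1)  # 0: low-to-mid segment, 1: mid-to-high segment
        frac = t - seg
        start, end = stops[seg], stops[seg + 1]
        rgb = tuple(round(s + (e - s) * frac) for s, e in zip(start, end))
        palette.append("#%02X%02X%02X" % rgb)
    return palette


def value_to_color(value, limit, palette, na_color="transparent"):
    """Pick the palette color for a value; missing values get the NA color."""
    if value is None:
        return na_color
    low, high = limit
    clipped = min(max(value, low), high)  # values beyond the limit saturate
    index = int((clipped - low) / (high - low) * len(palette))
    return palette[min(index, len(palette) - 1)]


palette = make_palette("00FF00", "D3D3D3", "FF0000", bins=3)
print(palette)                                   # ['#00FF00', '#D3D3D3', '#FF0000']
print(value_to_color(0.0, (-1, 1), palette))     # #D3D3D3
print(value_to_color(5.0, (-1, 1), palette))     # #FF0000
print(value_to_color(None, (-1, 1), palette))    # transparent
```

With more bins the steps between neighboring colors become finer, while a narrower limit makes moderate changes reach the end colors sooner.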

Reference

Please cite our paper if you use this website. This will help the Pathview project in return.

  • Luo W, Pant G, Bhavnasi YK, Blanchard SG, Brouwer C. Pathview Web: user friendly pathway visualization and data integration. Nucleic Acids Res, 2017, Web Server issue, doi: 10.1093/nar/gkx372
  • Luo W, Brouwer C. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 2013, 29(14):1830-1831, doi: 10.1093/bioinformatics/btt285

Please also cite the GAGE paper if you are doing pathway analysis in addition to visualization, i.e. with Pathway Selection set to Auto on the New Analysis page.

  • Luo W, Friedman M, et al. GAGE: generally applicable gene set enrichment for pathway analysis. BMC Bioinformatics, 2009, 10, pp. 161, doi: 10.1186/1471-2105-10-161

Contact

Email us: pathomics@gmail.com

© 2013 - Pathview Project

Bioinformatics Services Division - Department of Bioinformatics and Genomics - UNC Charlotte