
The "Striped Gene" Feature, or Coloring MAPPs with Multiple Color Sets

GenMAPP 2.1 allows simultaneous display of multiple coloring criteria on MAPPs. This "striped gene" feature was designed to help in the interpretation of complex data and to allow more innovative and complex visual displays. With this feature, any number and combination of criteria can be displayed at one time, for example multiple time points in a time course, or different types of data together, such as gene expression and proteomic data.

The striped gene view can be applied to any dataset that contains multiple Color Sets. The user defines which of the Color Sets to display, and each selected Color Set creates a horizontal stripe in the gene box. The figure below illustrates the relationship between stripes and Color Sets.

Note: Each stripe is equivalent to one Color Set and therefore represents a set of criteria, not a single criterion.
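The relationship between stripes, Color Sets and criteria can be sketched in a few lines of Python. This is only an illustration of the concept, not GenMAPP's internal logic: the function name `stripe_colors`, the representation of a criterion as a (test, color) pair, the rule that the first matching criterion decides the color, and the `default` color are all assumptions made for the example.

```python
def stripe_colors(gene_values, color_sets, default="gray"):
    """Return one color per selected Color Set, i.e. one color per stripe.

    gene_values: dict of column name -> value for a single gene.
    color_sets:  list of Color Sets; each Color Set is an ordered list of
                 (test, color) pairs, where test is a function of gene_values.
    """
    stripes = []
    for criteria in color_sets:
        # Assumption: the first criterion that matches decides the stripe
        # color; if none matches, a default color is used.
        color = next((c for test, c in criteria if test(gene_values)), default)
        stripes.append(color)
    return stripes
```

With two Color Sets selected, the function returns two colors, one for each stripe of the gene box, regardless of how many criteria each Color Set contains.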

 

Preparing the data

GenMAPP analysis requires that the raw data be pre-processed into a form that GenMAPP can use. Pre-processing typically includes background adjustment, normalization and probe-level summarization, but since each experiment is unique, it is not possible to recommend exactly what type of pre-processing should be done. The instructions below therefore list a set of typical pre-processing steps; they do not represent a solution for all datasets.

Example Data

For the purpose of these instructions, an example dataset is used. The data is from the Andrade lab at Ottawa Genome Center and was collected from GEO. Eleven time points of embryonic stem cell differentiation, along with pluripotent ESCs, were assayed on Affymetrix Mouse Genome 430 v2.0 GeneChip® arrays, with three replicates per group. For the example dataset, the following pre-processing steps were performed:

Background adjustment, normalization and probe-level summarization

Background adjustment and normalization algorithms are typically included in array image processing applications, so these algorithms may be applied to your data by the core lab or facility that processes the arrays. Similarly, summarizing the data at the level of probes (for Affymetrix arrays) is commonly done at this stage, before the data reaches the end user. Since many algorithms are available and each affects the data differently, consulting a statistician about your specific dataset is advised.

Example data: Background adjustment, normalization and probe-level summarization were done using the gcrma algorithm in Bioconductor. Where more than one probe set existed for the same gene, low-expressing probe sets were eliminated with a script that removed any probe set whose maximal expression was <80% of the maximal expression of the most highly expressed probe set for that gene.
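The original filtering script is not shown here, but the rule it applied can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `filter_probe_sets`, the two input dictionaries and the `threshold` parameter are hypothetical, and the expression values are assumed to be positive and on the scale on which the 80% comparison is meant to be made.

```python
def filter_probe_sets(expression, gene_of, threshold=0.8):
    """Keep probe sets whose maximal expression is at least `threshold`
    times that of the most highly expressed probe set for the same gene.

    expression: dict of probe set ID -> list of values, one per array.
    gene_of:    dict of probe set ID -> gene identifier.
    """
    # Maximal expression of each probe set across all arrays.
    probe_max = {probe: max(values) for probe, values in expression.items()}

    # Highest probe-set maximum observed for each gene.
    gene_max = {}
    for probe, peak in probe_max.items():
        gene = gene_of[probe]
        gene_max[gene] = max(peak, gene_max.get(gene, peak))

    # A gene with a single probe set always keeps that probe set.
    return {
        probe: values
        for probe, values in expression.items()
        if probe_max[probe] >= threshold * gene_max[gene_of[probe]]
    }
```

For a gene with probe sets peaking at 200, 150 and 20, only the first is kept, because 150 and 20 both fall below 80% of 200.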

Combining all data in one spreadsheet

To use GenMAPP, the complete dataset (all arrays) must be contained in one file. If the data is not immediately available in this summary format, all relevant files must be combined. Most often this means combining separate data files containing data for individual arrays into one file, but it can also mean combining different types of data.

Example data: The output from Bioconductor is a text file with one row per probe set and one column for each array, which can be opened and manipulated in Excel.
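If the data arrives as one set of values per array instead, the combination step can also be done programmatically. The sketch below is one possible approach, not a GenMAPP tool: the function name `combine_arrays`, the `ProbeSetID` column header and the choice to keep only probe sets present on every array are assumptions made for the example.

```python
import csv
import io


def combine_arrays(per_array, out):
    """Write one tab-delimited table with one row per probe set and one
    column per array.

    per_array: dict of array name -> dict of probe set ID -> value.
    out:       writable text file object.
    """
    array_names = list(per_array)
    # Assumption: only probe sets measured on every array are kept,
    # so that no cell in the combined table is empty.
    shared = set.intersection(*(set(values) for values in per_array.values()))

    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["ProbeSetID"] + array_names)
    for probe in sorted(shared):
        writer.writerow([probe] + [per_array[name][probe] for name in array_names])


# Usage with in-memory data; a real run would pass an open file instead.
buffer = io.StringIO()
combine_arrays({"ESC_1": {"a_at": 1.0}, "Day1_1": {"a_at": 3.0}}, buffer)
```

The resulting tab-delimited text can be opened in Excel in the same way as the Bioconductor output.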

Calculate metrics

Any type of metric or parameter can be used to color genes in GenMAPP, including text-based parameters. The metrics can be calculated in several ways, for example programmatically, in a database program or, most commonly, in Excel.

Example data: Using Excel, fold changes and p-values (Student's t-test) were generated for each time point relative to ESCs. For each comparison to ESCs, a separate Color Set was created, with criteria for up-regulated or down-regulated genes (1.5-fold and 4-fold) with a t-test p-value < 0.05.
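For readers who prefer to calculate the metrics programmatically, the fold change and the criteria described above can be sketched as follows. Several things here are assumptions and not taken from the example analysis: the function names, the criterion labels, the signed fold-change convention (down-regulation reported as a negative fold), the use of inclusive thresholds, and the expectation that the input values are on a linear scale (log-scale values would need to be converted first). The t-test p-value is taken as an input and is not computed here.

```python
from statistics import mean


def signed_fold_change(treated, control):
    """Signed fold change of mean treated vs. mean control (linear scale)."""
    ratio = mean(treated) / mean(control)
    # Assumed convention: a ratio below 1 is reported as a negative fold,
    # so 0.5 becomes -2.
    return ratio if ratio >= 1 else -1 / ratio


def classify(fold, p_value, alpha=0.05):
    """Return the label of the first criterion met, strongest change first."""
    if p_value >= alpha:
        return "No criteria met"
    if fold >= 4:
        return "Up 4-fold"
    if fold >= 1.5:
        return "Up 1.5-fold"
    if fold <= -4:
        return "Down 4-fold"
    if fold <= -1.5:
        return "Down 1.5-fold"
    return "No criteria met"
```

Checking the 4-fold criteria before the 1.5-fold criteria matters, because every gene that changes 4-fold also satisfies the 1.5-fold condition.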

Format data

Before import into GenMAPP, the data needs to be formatted according to GenMAPP specifications. Briefly, this includes adding a System Code column containing a system code for each entry and arranging the columns so that a GenMAPP-supported ID is in the first column and the System Code is in the second column. For details on how to do this, see the Expression Dataset Manager.

Example data: A System Code column was inserted as the second column, and filled with the System Code for Affymetrix (X).
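The same step can be scripted for a tab-delimited file. The sketch below is an illustration only: the function name `add_system_code` is hypothetical, and the header text used for the new column is an assumption that should be checked against the Expression Dataset Manager documentation before import.

```python
import csv
import io


def add_system_code(src, dst, code="X", header_name="System Code"):
    """Copy a tab-delimited table, inserting a system code as column two.

    src, dst:    readable and writable text file objects.
    code:        system code written on every data row ("X" for Affymetrix).
    header_name: header of the new column (assumed; verify before import).
    """
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")

    header = next(reader)
    writer.writerow([header[0], header_name] + header[1:])
    for row in reader:
        if row:  # skip blank lines
            writer.writerow([row[0], code] + row[1:])


# Usage with in-memory text; a real run would pass open files instead.
source = io.StringIO("ID\tESC\tDay1\n1415670_at\t7.1\t8.2\n")
result = io.StringIO()
add_system_code(source, result)
```

The ID stays in the first column, the system code becomes the second, and all data columns shift one position to the right.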

Creating a GenMAPP dataset

Importing the data

Once the data is properly formatted, it can be imported to GenMAPP via the Expression Dataset Manager:

  1. Download and load the appropriate database in GenMAPP.
  2. In the Expression Dataset Manager, select File>New to begin the data import process. For details on data import, please refer to the Expression Dataset Manager.

Creating Color Sets

To create Color Sets for your dataset, use the Criteria Builder in the Expression Dataset Manager. While the process of creating Color Sets is the same for all GenMAPP datasets, creating Color Sets specifically for simultaneous display of multiple Color Sets requires some additional considerations:

 

Example data: Each Color Set deals with a different time point as compared to pluripotent ESCs.

Viewing multiple Color Sets

Selecting Color Sets for display

When the dataset contains the Color Sets you want to use for simultaneous display, selecting them for display is straightforward.

  1. First, load the appropriate database in GenMAPP, and open a pathway of interest.
  2. In the Color Sets drop-down list, select the Multiple Color Sets option.
  3. In the Multiple Color Sets window, Ctrl+click the Color Sets you want to display, or click the All button to select all Color Sets. For example, if you have a time course experiment with one Color Set for each time point, select all time points for display. For detailed instructions on how to select multiple Color Sets, see the Drafting Board Toolbar.

Example data: All Color Sets were selected for display, resulting in one stripe for each time point studied.

Note: Depending on how many Color Sets you choose for display and how complex these are, coloring the pathway may take a few seconds.

Legend

For the striped gene view, the Legend shows which Color Set is displayed in each stripe of the gene box and what each color represents:

The coloring legend can be displayed for the first Color Set only or for each Color Set. This behavior is controlled in the Options menu. When many Color Sets are selected for display and they all share a similar color scheme, displaying the Legend for only the first Color Set may be more readable. For details, please refer to the Legend.