The Gene Database Manager

Using the Gene Database Manager, you can update Gene Databases, add remarks to an existing GenMAPP Gene Table, add new Gene Tables and Relationship Tables, or edit Gene or Relationship Tables you have added. The Gene Database Manager also allows you to create your own Gene Database for a species not supported by GenMAPP. To reach the Gene Database Manager from the main Drafting Board window, choose Data > Gene Database Manager.

Adding Gene Databases

You may want to add a new Gene Database to your GenMAPP installation if you did not download a database when you installed GenMAPP, or if there is a new species database you are interested in from the GenMAPP.org website. You can do this at any time through the Download Data from GenMAPP.org feature from the GenMAPP program Data menu. To download a new database from within the GenMAPP program, use the following steps:

  1. From the Drafting Board Window, left-click on Data and choose Download Data from GenMAPP.org.
  2. If your internet connection is available, you will be presented with a list of databases (and other files) available at the GenMAPP website. Select the database you are interested in. If your computer does not have a functioning internet connection, the download process will not work. In that case, please contact genmapp@gladstone.ucsf.edu to receive a CD version of the Gene Databases.
  3. The database will be downloaded to your computer. The default location is C:\GenMAPP 2 Data\Gene Databases. The database will be automatically unzipped and ready for use.
  4. You can now use the new Gene Database by returning to the Drafting Board Window, left-clicking on the Data menu, and selecting Choose Gene Database.

www.GenMAPP.org appends a date to the Gene Database file name, for example Mm-Std_20050630.exe, indicating that a Mus musculus Gene Database was created from the June 30th, 2005, edition of GenMAPP.org's master database. In other words, data is current as of that date. If you are replacing an existing Gene Database on your computer, follow the procedure in Gene Database Updates.

Gene Database Updates

GenMAPP.org will periodically release updated databases. If you would like to replace your current database with a more recent version, follow these steps:

  1. Make sure the database you are interested in updating is open in GenMAPP.
  2. From the Drafting Board Window, left-click on Data and choose Download Data from GenMAPP.org.
  3. If your internet connection is available, you will be presented with a list of databases (and other files) available at the GenMAPP website. Select the database you are interested in. If your computer does not have a functioning internet connection, the download process will not work. In that case, please contact genmapp@gladstone.ucsf.edu to receive a CD version of the Gene Databases.
  4. The database will be downloaded to your computer. The default location is C:\GenMAPP 2 Data\Gene Databases, although you may Change this to another location. The database will be automatically unzipped and ready for use.
  5. When the download is complete, click OK to exit the Data Acquisition Tool.
  6. Return to the Data menu in the Drafting Board Window and left click on Gene Database Manager.
  7. In the Gene Database Manager, choose Data > Update Gene Database. Select your newly downloaded database.
  8. GenMAPP will now update the Gene Database.

GenMAPP retains the old Gene Database, replacing the date at the end of the file name with the suffix "old". For example, if the previous database was Hs-Std_20030823.gdb, it will now be called Hs-Std_old.gdb. The newly installed database will have a date in its name, such as Hs-Std_20050713.gdb.
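The file naming convention described above can be illustrated with a short Python sketch. It is not part of GenMAPP; the function names are hypothetical, and the pattern (a prefix, an underscore, an eight-digit YYYYMMDD date, and an extension) is an assumption based only on the example file names in this topic.

```python
import re
from datetime import date, datetime

# Assumed pattern, inferred from names such as Hs-Std_20030823.gdb:
# <prefix>_<YYYYMMDD>.<extension>
_DATED_NAME = re.compile(r"^(?P<prefix>.+)_(?P<stamp>\d{8})\.(?P<ext>\w+)$")


def parse_database_name(file_name: str) -> tuple[str, date, str]:
    """Split a dated Gene Database file name into prefix, date, and extension."""
    match = _DATED_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a dated Gene Database file name: {file_name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y%m%d").date()
    return match["prefix"], stamp, match["ext"]


def retained_name(file_name: str) -> str:
    """Return the name a replaced database would carry: the date becomes 'old'."""
    prefix, _, ext = parse_database_name(file_name)
    return f"{prefix}_old.{ext}"
```

For example, retained_name("Hs-Std_20030823.gdb") gives "Hs-Std_old.gdb", and parse_database_name can be used to compare the dates of two downloads before updating.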

Adding a Gene Table

GenMAPP has a number of Gene Tables in various Gene Databases available through the Downloader, but you may wish to add your own Gene Table to a Gene Database on your computer. You may add any gene ID system you wish, with as many annotation columns as you wish. You will assign your Gene Table a name (such as MyGenes) and a Gene Table System Code (such as &m) with these restrictions: the name may contain only alpha, numeric, or underscore characters, and the System Code must begin with & followed by a single alphanumeric character.

Raw Gene File

Your raw gene file must be rows of consistent columns in either tab-delimited (.txt or .tab) or comma-separated-value (.csv) form. Either of these forms can be exported from popular spreadsheet programs such as Excel. The first row must consist of the column titles.

The first column of your raw gene file must be the gene ID system's unique identifier; its title is irrelevant (Note to Excel users). Subsequent columns may be anything you choose and as many as you need.

This is an example of how the raw gene file might look in a spreadsheet program:

Oligo ID    Spot Index  Grid  Row  Column  Gene Description
M200000577  5330        15    1    10      Scn1b_Sodium channel, voltage-gated, type I
M200000578  8370        23    1    10      Btn1a1_Butyrophilin, subfamily 1, member A1
M200000579  11410       31    1    10      Adamts1_A disintegrin-like and metalloprotease
M200000580  14450       39    1    10      Lyst_Lysosomal trafficking regulator
M200000581  17490       47    1    10      Ext1_Exostoses (multiple) 1
M200000582  1529        5     1    9       Adcy8_Adenylate cyclase 8
M200000583  4569        13    1    9       Rgs14_Regulator of G-protein signaling 14
M200000584  7609        21    1    9       Anxa11_Annexin A11
M200000585  10649       29    1    9       Gata4_GATA binding protein 4
M200000586  13689       37    1    9       Eya3_Eyes absent 3 homolog (Drosophila)
M200000587  16729       45    1    9       Shbg_Sex hormone binding globulin
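If your data does not come from a spreadsheet, a raw gene file in this layout can also be produced with a script. The following Python sketch is one possible way to do it, not a GenMAPP tool: the function name is hypothetical, and the six column titles reflect an assumed reading of the example above.

```python
import csv
import io


def write_raw_gene_file(handle, column_titles, rows):
    """Write a tab-delimited raw gene file: one title row, then consistent rows."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(column_titles)  # the first row must hold the column titles
    for row in rows:
        if len(row) != len(column_titles):
            raise ValueError(f"row has {len(row)} columns, expected {len(column_titles)}")
        writer.writerow(row)


# Column titles as read from the example above (an assumption about the layout).
titles = ["Oligo ID", "Spot Index", "Grid", "Row", "Column", "Gene Description"]
rows = [
    ["M200000577", 5330, 15, 1, 10, "Scn1b_Sodium channel, voltage-gated, type I"],
    ["M200000578", 8370, 23, 1, 10, "Btn1a1_Butyrophilin, subfamily 1, member A1"],
]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w", newline="")
write_raw_gene_file(buffer, titles, rows)
raw_text = buffer.getvalue()
```

Tab-delimited output is used here because the gene descriptions contain commas; the unique identifier stays in the first column, as required.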

Naming Rules

The raw gene file name must follow the same rules as the Gene Table name: only alpha, numeric, or underscore characters. If the Gene Database Manager detects an invalid name, it will prompt you to correct it and suggest a valid name.

Column titles may not include embedded quotes (leading and trailing quotes will be stripped off), "`" (a grave accent), "!", "[", "]", ".", "," (comma), "$", or "|" (pipe symbol). The apostrophe is allowed but, for storage in the gene database, will be converted to another character that looks similar. A column title may not exceed 50 characters.

There are certain reserved column titles. You may not use the titles ID, System Code, or Date. However, your first two columns may be entitled anything (Note to Excel users), since the Gene Database Manager will interpret the first two columns as Gene ID and System Code.

You will have the opportunity to assign a Species to the Gene Table upon import, so it is not necessary to add a Species column to your raw data.

Unique gene identifiers may not contain more than 50 characters, nor may they contain quotes, "$", "," (comma), or "'" (apostrophe).
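Checking a raw gene file against these rules before import can save an import cycle. The Python sketch below restates the rules as two hypothetical helper functions; it is not GenMAPP's own validation code. It assumes that "quotes" means the double-quote character, that reserved titles are compared case-sensitively, and that column positions are counted from zero.

```python
# Characters named in the naming rules above.
FORBIDDEN_TITLE_CHARS = set('"`![].,$|')
FORBIDDEN_ID_CHARS = set("\"$,'")
RESERVED_TITLES = {"ID", "System Code", "Date"}
MAX_LENGTH = 50


def check_column_title(title: str, position: int) -> list[str]:
    """Return a list of rule violations for one column title (empty if valid)."""
    problems = []
    stripped = title.strip('"')  # leading and trailing quotes are stripped off
    bad = sorted(FORBIDDEN_TITLE_CHARS & set(stripped))
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    if len(stripped) > MAX_LENGTH:
        problems.append("longer than 50 characters")
    # The first two columns (positions 0 and 1) may be entitled anything.
    if position >= 2 and stripped in RESERVED_TITLES:
        problems.append("reserved title")
    return problems


def check_gene_id(gene_id: str) -> list[str]:
    """Return a list of rule violations for one unique gene identifier."""
    problems = []
    bad = sorted(FORBIDDEN_ID_CHARS & set(gene_id))
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    if len(gene_id) > MAX_LENGTH:
        problems.append("longer than 50 characters")
    return problems
```

For instance, check_column_title("Date", 3) reports a reserved title, while the same title in the first column passes.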

The Remarks Column

Your Gene Table may have a Remarks column, which may contain any number of characters. The information in this column is displayed in the Gene Finder as well as on Backpages. Since both of these are browsers, you may use HTML coding in your Remarks column.

For example, to embed a hyperlink in your Remarks, you can add something like the following:

See <a href="http://www.GenMAPP.org">GenMAPP site</a> for info

Importing a Raw Gene File as a Gene Table

  1. In the Gene Database Manager, choose Data > Add New Gene Table. Choose the raw gene file you wish to convert.

Three windows will appear:

The Gene Database Manager suggests a Gene Table name based on the name of your raw gene file. You may modify it and click OK.

Enter the System Code for your Gene Table. Remember, the code must begin with & followed by a single alphanumeric character.

GenMAPP also asks you if any columns, in addition to the Gene ID column, should be made searchable. Checking the box for a column makes that column searchable through the Gene Finder. You can have a maximum of 100 searchable columns. However, it is recommended that you choose no more than two columns (in addition to the Gene ID) to be made searchable, since each additional searchable column will significantly slow down GenMAPP.
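The name and System Code rules can be written as two simple patterns. The following Python sketch is illustrative only: the function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and suggest_table_name shows just one plausible way to derive a valid name from a file name, not necessarily the suggestion the Gene Database Manager makes.

```python
import re
from pathlib import Path

TABLE_NAME = re.compile(r"[A-Za-z0-9_]+")   # alpha, numeric, or underscore only
SYSTEM_CODE = re.compile(r"&[A-Za-z0-9]")   # & followed by one alphanumeric character


def is_valid_table_name(name: str) -> bool:
    return TABLE_NAME.fullmatch(name) is not None


def is_valid_system_code(code: str) -> bool:
    return SYSTEM_CODE.fullmatch(code) is not None


def suggest_table_name(raw_file: str) -> str:
    """Hypothetical suggestion: drop the extension, replace invalid characters."""
    stem = Path(raw_file).stem
    return re.sub(r"[^A-Za-z0-9_]", "_", stem)
```

With these definitions, "MyGenes" and "&m" are accepted, while "My Genes" and "&mg" are rejected.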

GenMAPP will now start importing your raw data. A progress bar informs you of the status of the import and of any errors encountered in the process. Once the import is finished, your new Gene Table will be displayed on the screen. There will be a field at the bottom left-hand corner for supplying a web link.

If any errors were encountered, the Gene Table is not created and an ~Errors~ column is added as the last column of your raw gene file. Correct the errors in each indicated row and import the data again. Because of the way some spreadsheets, notably Excel, treat a final column that may or may not have data in all rows, GenMAPP does not report a single missing column as an error. More than one missing column is reported, however.

A practical way of dealing with import errors is to load your raw gene file into a spreadsheet program, such as Excel, and filter the rows to only those in which the ~Errors~ column is not blank. You can then easily make changes to erroneous entries. It is not necessary to remove the ~Errors~ column before importing the file again. Be sure to save the result as a tab-delimited (.txt or .tab) or comma-separated-value (.csv) file.
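The same filtering can be done with a script instead of a spreadsheet. This Python sketch, with a hypothetical function name, lists the rows of a tab-delimited raw gene file whose ~Errors~ column is not blank. The sample data and its error text are made up for illustration and do not reproduce actual GenMAPP messages.

```python
import csv
import io


def rows_with_errors(handle, delimiter="\t"):
    """Return (line number, row, message) for rows with a non-blank ~Errors~ cell."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    flagged = []
    # Data rows start on line 2 because line 1 holds the column titles.
    for line_number, row in enumerate(reader, start=2):
        message = (row.get("~Errors~") or "").strip()  # short rows yield None
        if message:
            flagged.append((line_number, row, message))
    return flagged


# Made-up sample standing in for a raw gene file after a failed import.
sample = (
    "Oligo ID\tGene Description\t~Errors~\n"
    "M200000577\tScn1b\t\n"
    "M200000578\tBtn1a1\texample error text\n"
    "M200000579\tAdamts1\n"
)
flagged = rows_with_errors(io.StringIO(sample))
```

Each entry in flagged carries the line number in the file, so the corresponding row can be corrected directly in a text editor.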

Adding a Relationship Table

A Relationship Table is the link between two Gene Tables. For example, if your Gene Database has both Ensembl and Entrez Gene tables, an "All related genes" search for Ensembl ENSMUSG00000003031 will also find Entrez Gene 12576. The relations always work both ways. If you search for Entrez Gene 12576, you will also find Ensembl ENSMUSG00000003031.

A Relationship Table always consists of two columns: Primary, containing the genes for one gene ID system (say Ensembl) and Related, containing the genes for the other gene ID system (say, Entrez Gene). Which gene ID system is in what column is irrelevant.
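The two-way behavior can be pictured as an index built from the Primary and Related columns in both directions. The Python sketch below is a conceptual illustration, not a description of how GenMAPP stores or searches relationships; the function name is hypothetical, and gene IDs are keyed by gene ID system name so that identical IDs in different systems do not collide.

```python
from collections import defaultdict


def build_relation_index(primary_system, related_system, pairs):
    """Index (primary ID, related ID) pairs so lookups work in both directions."""
    index = defaultdict(set)
    for primary_id, related_id in pairs:
        primary_key = (primary_system, primary_id)
        related_key = (related_system, related_id)
        index[primary_key].add(related_key)
        index[related_key].add(primary_key)  # the relation works both ways
    return index


index = build_relation_index(
    "Ensembl", "Entrez Gene", [("ENSMUSG00000003031", "12576")]
)
```

Looking up ("Ensembl", "ENSMUSG00000003031") returns the Entrez Gene entry, and looking up ("Entrez Gene", "12576") returns the Ensembl entry, regardless of which system was placed in the Primary column.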

Raw Relationship File

To add a Relationship Table, you must first make sure that both systems in the Relationship Table exist in the Gene Database.

Your raw relationship file must have two (or more) columns. The first is the gene IDs for one Gene Table (and the Primary or first column of your Relationship Table), the second is the Gene IDs for the other Gene Table (and the Related or second column of your Relationship Table). Any columns beyond are ignored. For example, if you are relating the MyGenes Gene Table to Entrez Gene, your first column should contain the MyGenes identifiers and the second column the Entrez Gene identifiers. The resulting Relationship Table, by convention, will be named MyGenes-Entrez Gene. The raw relationship file must be in tab-delimited (.txt or .tab) or comma-separated-value (.csv) form such as that exported from many spreadsheets and database systems. Gene IDs are limited to 50 characters. The columns should have no headings.

This is an example of how the raw relationship file might look in a spreadsheet program:

M200000577 100000
M200000578 100001
M200000579 100002
M200000580 100003
M200000581 100004
M200000582 100005
M200000583 100006
M200000584 100007
M200000585 100008
M200000586 100009
M200000587 100010
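A raw relationship file can be checked against the format rules above before import. The following Python sketch uses a hypothetical function name and is not GenMAPP's import logic. It reads the first two columns of each row, ignores any further columns, and reports rows with fewer than two columns, empty IDs, or IDs longer than 50 characters.

```python
import csv
import io

MAX_ID_LENGTH = 50  # gene IDs are limited to 50 characters


def read_relationships(handle, delimiter="\t"):
    """Return (pairs, problems) for a raw relationship file without headings."""
    pairs, problems = [], []
    for line_number, row in enumerate(csv.reader(handle, delimiter=delimiter), start=1):
        if len(row) < 2:
            problems.append((line_number, "fewer than two columns"))
            continue
        primary, related = row[0].strip(), row[1].strip()  # extra columns are ignored
        if not primary or not related:
            problems.append((line_number, "empty gene ID"))
        elif len(primary) > MAX_ID_LENGTH or len(related) > MAX_ID_LENGTH:
            problems.append((line_number, "gene ID longer than 50 characters"))
        else:
            pairs.append((primary, related))
    return pairs, problems


sample = "M200000577\t100000\tignored\nM200000578\t100001\nM200000579\n"
pairs, problems = read_relationships(io.StringIO(sample))
```

In this sample, the first two rows produce valid pairs and the third row is reported because it has only one column.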

To add a Relationship Table, left-click on Data > Add New Relationship Table in the Gene Database Manager. After you choose the raw data file to convert, the Gene Database Manager shows you a screen for identifying the two gene ID systems.

Here you enter the name of the primary and related gene ID systems using the drop-down menu. The system codes will be automatically filled in for you.

Your new Relationship Table's name will be your primary and related names combined with a dash in between. For example, if you imported MyGenes, code &M, as primary and chose Entrez Gene as related, the Gene Database Manager names your Relationship Table MyGenes-Entrez Gene.

Click the Process button to start import.

If any errors are encountered, the Relationship Table is not created and an ~Errors~ column is added to your raw relationship file. Correct the errors in each indicated row and import the data again.

A practical way of dealing with import errors is to load your raw relationship file into a spreadsheet program, such as Excel, and filter the rows to only those in which the ~Errors~ column is not blank. You can then easily make changes to erroneous entries. It is not necessary to remove the ~Errors~ column before importing the file again. Be sure to save the result as a tab-delimited (.txt or .tab) or comma-separated-value (.csv) file.

Copying Tables from Other Databases

GenMAPP allows you to copy any table from another Gene Database present in your computer. To copy a table or tables from another Gene Database, choose Data > Copy Table(s) From.... You will be prompted to choose a Gene Database from your computer. A window will appear displaying the tables in the current database:

Click on the table you wish to copy to select it. To select multiple tables, hold down the Ctrl key before clicking. Click the Copy button to begin copying. The progress will be displayed at the bottom of the window.

Editing Gene Tables

In the Gene Database Manager, choose Data > Edit Gene Table. From the Gene Table dropdown list, select the Gene Table you wish to edit. A data grid for that Gene Table will appear.

Manipulating the Data Grid

The data grid can show only a small portion of the data in a gene ID system. You will have to scroll vertically and/or horizontally, and probably have to change the size of rows or individual columns. Note that the data in many cells take up far more room than the initial cell provides. It may wrap for many lines, only the first of which is visible in the initial cell.

To size either rows or columns, move your mouse pointer to the joints in the left or top (heading) bars on the grid. Your mouse pointer will change to the sizing icon. Hold down the left mouse button and drag the joint to the size you want. Columns may be sized individually but changing the size of any row changes all of them.

Large Gene Tables must be manipulated in segments of 250,000 records each. You may move between segments by clicking the Previous Segment or Next Segment buttons.
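To estimate how many segments a table will occupy, the record count can be divided into blocks of 250,000. The Python sketch below is only an arithmetic illustration with a hypothetical function name; how the Gene Database Manager numbers or orders records within segments is not specified here.

```python
SEGMENT_SIZE = 250_000  # records per segment, as stated above


def segment_bounds(record_count, segment_size=SEGMENT_SIZE):
    """Return (start, end) record offsets for each segment; end is exclusive."""
    return [
        (start, min(start + segment_size, record_count))
        for start in range(0, record_count, segment_size)
    ]
```

A table of 600,000 records, for example, would span three segments, the last holding 100,000 records.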

Editing Restrictions

For a Gene Table that exists in a Gene Database downloaded through the Downloader, you may only add notations in the Remarks column. To maintain consistency among all GenMAPP users, you may not edit any other columns or add records to such a table.

If the Gene Table is one that you added to the Gene Database (see Adding a Gene Table), you may edit any of the columns as well as the Web Link for that Gene Table.

Making Changes

For most columns, you may type changes directly in the data cell.

The Species column must contain data in a specific format and only the species that are allowed for a particular Gene Table in a particular Gene Database. The allowed species are only the species specified for the Gene Database and species assigned to the gene table during import of the raw gene data file.

If you click on a Species cell and the Gene Table allows more than one species, the Species dropdown list appears and you may choose your species from the list. Each time you click on a species, the Gene Database Manager adds that species to the ones in the cell. To delete species, click on Clear species, which deletes all of them, and add back any that you wish.

To add records, move to the blank row at the end of the data grid and add the record there.

Deleting a Table

The Gene Database Manager also allows you to delete any table that you have added to the database. To delete a table, go to Data > Delete Table. A window will appear listing all tables that you can delete.

Specifying Web Links

Each Gene ID system in GenMAPP can be associated with a web link that allows gene IDs to be automatically linked to the corresponding public resource. The web links appear on Backpages and in the Gene Finder. For GenMAPP-supported Gene ID systems, like UniProt, the web links are already defined in the database. For custom gene ID systems, you can add the web link either immediately following import of a new Gene Table or at a later time. The field for entering web links appears on the bottom left of the Edit Gene Table window, which appears after importing a Gene Table.

The link must follow a generic search pattern with the unique gene identifier in a single place in the link. You indicate that place with a "~" (tilde). This is the most common search pattern; others are not supported by GenMAPP. For example, the web link for UniProt is defined as:

http://www.expasy.ch/cgi-bin/niceprot.pl?~

where the "~" represents the place where the unique identifier is substituted. When you click on the UniProt identifier 1433E_HUMAN on a Backpage, GenMAPP links to the URL

http://www.expasy.ch/cgi-bin/niceprot.pl?1433E_HUMAN

The "~" need not necessarily be at the end of the link. The WormBase link, for example, is

http://www.wormbase.org/db/seq/protein?name=WP%3A~;class=Protein

If the web link you supply doesn't contain a "~", the Gene Database Manager will not recognize the link and will not associate it with the Gene Table.
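The substitution itself is simple to reproduce, which is useful for testing a link pattern in a browser before entering it. The Python sketch below uses a hypothetical function name and is not GenMAPP code. It replaces the first "~" with the gene ID and rejects patterns without a tilde; how GenMAPP treats a pattern with more than one tilde is not stated here, so that case is left as an assumption.

```python
def build_gene_url(link_pattern: str, gene_id: str) -> str:
    """Substitute a gene ID for the "~" placeholder in a web link pattern."""
    if "~" not in link_pattern:
        raise ValueError('web link pattern must contain a "~" placeholder')
    # Assumption: only the first tilde is treated as the placeholder.
    return link_pattern.replace("~", gene_id, 1)


uniprot_url = build_gene_url(
    "http://www.expasy.ch/cgi-bin/niceprot.pl?~", "1433E_HUMAN"
)
```

The same function works when the placeholder sits in the middle of the link, as in the WormBase pattern above.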

Creating a New Gene Database

GenMAPP allows you to create a new Gene Database for any species in which you are interested. In order to successfully create a new Gene Database, you must have sufficient gene information for your species of interest, for example from a public database. This data must be in the correct format for GenMAPP to be able to use it for creating a new Gene Database.

Gene Tables

When creating a new Gene Database, you must have at least one Gene Table. This table will contain gene identifiers and annotation from a gene cataloging system for your species. For example, this information could come from a public database. You could also have your own cataloging system for your species, with associated annotation. Any information you add to the Gene Table will appear on the Backpage for each gene. If you supply a web link for your Gene Table, the gene identifiers on the Backpage will link to that webpage. The format requirements of the raw data for a Gene Table in a new Gene Database are the same as for Adding a Gene Table to an existing Gene Database.

Relationship Tables

You may have information relating your gene identifiers to other ID types. For example, if your Gene Table was downloaded from a public database, it is likely that there is information linking each gene to, for example, UniGene or UniProt. This kind of information can also be added to your new Gene Database. It provides information for the Backpage, and also makes it possible to use these identifiers as gene identifiers on MAPPs, provided that you choose All Related Gene IDs as your Coloring option in the Options menu. The format requirements of the raw data for a Relationship Table in a new Gene Database are the same as for Adding a Relationship Table to an existing Gene Database.

Gene Ontology Tables

In order for your new Gene Database to work with MAPPFinder, a series of Gene Ontology tables is required. Since the structure of some of the Gene Ontology tables is non-trivial, GenMAPP.org currently produces these files for each custom database on request. The only GO table that you have to supply is a Relationship Table between your designated MOD gene ID system and the GO term IDs. This table should be formatted like any other Relationship Table, with your ID system as the Primary ID and the GO term IDs as the Related. Naturally, you will also have to designate the corresponding Gene Table as the MOD. Once you have this table available, submit it to genmapp@gladstone.ucsf.edu. You will receive a GO skeleton database from GenMAPP.org that will be used as a template for creating your custom database.

 

For detailed instructions on how to create a new Gene Database, please refer to our manual, Creating a GenMAPP database for a non-supported species.  

Creating GOCount Table

Under the File menu, there is an option to "Create GOCount Table". This feature refers to an older process of creating a new database, and is not used in the current protocol described in our detailed manual. If all necessary GO tables are present in your database, custom or distributed, the option will be unavailable.

The "Create GOCount Table" feature may be useful for some specialized purposes. Please refer to the old protocol for details on how to use this feature.

Gene Database Information

To display information regarding the current database, you have two options. You can either left-click on Data > Gene Database Information in the main Drafting Board Window, or left-click on Data > Gene Database Information in the Gene Database Manager. A window will display all tables in the current database, with system codes and any related systems. It will also display which system is the MOD, and for which species the database was created, in addition to information on when the database was created and updated.