|DaGO-Fun - Database for GO-based Functional Annotation Analysis|
|IT-GOM Help and Description|
Welcome to the user guide for IT-GOM, a tool for computing Information Content (IC)-based semantic similarity scores from GO annotations. The tool provides a stepwise query selection menu: as the user builds a query, the menu adapts, leaving open only the options that remain relevant to the selections already made. This makes the IT-GOM interface easy to use and well suited to exploring GO term and protein semantic similarity scores.
|1. Search step|
This step allows the user to select the type of score and the ontology to which the GO terms under consideration belong. Note that IT-GOM treats the three GO ontologies, Molecular Function (MF), Biological Process (BP) and Cellular Component (CC), as independent. This step contains two drop-down lists: measuring (for the type of score) and ontology. Currently, the IT-GOM tool provides three types of scores: Term Information Content (IC) values, Term Semantic Similarity scores, and Protein Semantic Similarity (also called Protein Functional Similarity) scores. The ontology drop-down list is not available when measuring term IC.
|2. Tool Category step|
At this step, the user specifies the family of GO semantic similarity scores and how GO term IC values should be combined to produce GO term or protein semantic similarity scores. IT-GOM initially displays two drop-down lists: Select Family and Select Approach. It implements two main families: annotation-based and topology-based. If the user selects the annotation-based family, a checkbox appears that allows GO terms with the IEA (Inferred from Electronic Annotation) evidence code to be included or excluded. See the GO Evidence Codes documentation for more information. For the topology-based family, this checkbox appears only if the user has chosen to measure protein semantic similarity scores, since a protein may be annotated with GO terms carrying the IEA evidence code. If the checkbox is not checked, GO terms with the IEA evidence code are ignored when computing GO term IC or protein semantic similarity scores.
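The effect of the IEA checkbox can be illustrated with a minimal sketch. This is not the IT-GOM implementation: the `Annotation` record, the `filter_annotations` function, the protein labels and the evidence codes attached to each GO ID are all hypothetical and serve only to show how unchecking the box removes IEA-coded annotations before any score is computed.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """One protein-to-GO-term annotation (hypothetical record layout)."""
    protein: str   # protein label (placeholder, not a real accession)
    go_id: str     # GO term identifier
    evidence: str  # GO evidence code, e.g. "IEA" or "IDA"


def filter_annotations(annotations, include_iea):
    """Return the annotations, dropping IEA-coded ones unless include_iea is True."""
    if include_iea:
        return list(annotations)
    return [a for a in annotations if a.evidence != "IEA"]


# Made-up example data: the evidence codes here are illustrative only.
annotations = [
    Annotation("PROT_A", "GO:0003677", "IDA"),
    Annotation("PROT_A", "GO:0005634", "IEA"),
    Annotation("PROT_B", "GO:0006355", "IEA"),
]

# Checkbox not checked: IEA annotations are ignored.
kept = filter_annotations(annotations, include_iea=False)
```

With the checkbox unchecked, only the non-IEA annotation of `PROT_A` remains, so `PROT_B` would have no annotated terms left to contribute to a score.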
|3. Options step|
This step is available only when measuring protein semantic (functional) similarity scores with the annotation-based family. In this case, more information is requested on how to combine GO term IC values to produce these scores. This step contains two drop-down lists: Model and Combination. In the 'Model' drop-down list, the user indicates whether to use GO term IC values directly (Direct Term-based) or through GO term semantic similarity values (Term Similarity-based). The 'Combination' drop-down list is updated according to this choice.
|4. User Input step and Result outputs|
The user input is a list of GO IDs, GO ID pairs or UniProt protein accession pairs, depending on the scores to be computed. In response to the user query, the IT-GOM tool produces a comprehensive summary table on the next page of the user interface.
Note that in the IT-GOM tool, the level of a term is the length of the longest path (that is, the maximum number of links) from the root of the ontology down to that term. The root itself is at level 0, which is taken as the reference level. The status indicates whether a term is active (A), obsolete (O) or does not exist (N) in the ontology under consideration for the current settings. Clicking on an active GO ID displays the associated GO sub-graph using the AmiGO tool.
Each concept is linked to its source database so that its features can be viewed: proteins are linked to the UniProt database and GO terms to the GO database via the AmiGO tool. More details for a given pair can be displayed by selecting the concept pair of interest and clicking the view button at the bottom of the table. The new page gives access to information about a specific concept by clicking on it: there, proteins are linked to the QuickGO tool and GO terms to the AmiGO tool for viewing the sub-GO DAG of the term.
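The level definition above (longest path from the root, with the root at level 0) can be sketched as follows. This is an illustrative example on a toy DAG, not the tool's own code: the `PARENTS` mapping, the term names and the `level` function are assumptions made for the example.

```python
from functools import lru_cache

# Toy ontology DAG as child -> list of parents. Term names are made up
# and are not real GO IDs.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a"],
    "d": ["b", "c"],  # reachable via root->b->d (2 links) and root->a->c->d (3 links)
}


@lru_cache(maxsize=None)
def level(term):
    """Length of the longest path from the root down to `term` (root is level 0)."""
    parents = PARENTS[term]
    if not parents:
        return 0
    # A term sits one link below its deepest parent.
    return 1 + max(level(p) for p in parents)
```

In this toy graph `level("d")` is 3 rather than 2, because the longest of the two paths from the root is the one that counts.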
|5. Important note on input limits:|
We aim to let the IT-GOM tool compute results for as many user inputs as possible. However, because computational resources are limited, we have to cap the number of GO terms, GO term pairs and protein pairs per query. When computing GO term IC scores, the maximum number of GO terms is 5000; the tool displays the results 10 at a time, and the features of all 5000 GO terms can be viewed by downloading them as a text file. For GO term semantic similarity scores, the user can enter at most 3000 pairs per query, and for protein semantic similarity scores, likewise at most 3000 pairs per query. Entries beyond these limits are ignored. If your data exceed these limits, divide the input data, run the IT-GOM tool separately on each part, and merge the results at the end. Alternatively, you can contact the administrators, who are willing to collaborate and run large data sets for analysis.
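The divide, run and merge workflow for oversized inputs could be prepared along these lines. This is only a sketch under stated assumptions: `split_into_batches` and `merge_results` are hypothetical helper names, the identifiers in the example are placeholders, and submitting each batch to IT-GOM is left to the user, since the tool is operated through its web interface.

```python
MAX_PAIRS_PER_QUERY = 3000  # per-query limit for GO term pairs and protein pairs


def split_into_batches(pairs, batch_size=MAX_PAIRS_PER_QUERY):
    """Split a list of ID pairs into consecutive batches of at most batch_size."""
    pairs = list(pairs)
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]


def merge_results(batch_results):
    """Concatenate per-batch result rows, preserving the original order."""
    merged = []
    for rows in batch_results:
        merged.extend(rows)
    return merged


# Placeholder input: 7000 made-up ID pairs, more than one query allows.
pairs = [(f"ID{i}", f"ID{i + 1}") for i in range(7000)]
batches = split_into_batches(pairs)  # three batches: 3000, 3000 and 1000 pairs
```

Each batch is then submitted as a separate query, and the downloaded result rows are passed to `merge_results` in the same order as the batches.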
For more information, please refer to the associated publication: "Gaston K. Mazandu and Nicola J. Mulder. DaGO-Fun: Tool for Gene Ontology-based functional analysis using term information content measures, 2013", DaGO-Fun preliminary paper currently under review.