|DaGO-Fun - Database for GO-based Functional Annotation Analysis|
|The DaGO-Fun Input Requirements|
The input data required by DaGO-Fun depends on the tool used. DaGO-Fun currently includes four tools (see the tools shortcut tab or Browsing tools on the left menu), which are described below.
1. Integrated Tool for IC-based GO Semantic Similarity Measures
This tool integrates three sub-tools: term Information Content (IC) score retrieval, term semantic similarity computation, and protein semantic (functional) similarity score computation.
2. GO Term Similarity based Protein-Fuzzy Identification Tool
From the list of GO IDs provided, proteins performing functions similar to each term are retrieved according to the agreement level chosen. The customized (weak) agreement level selects any protein whose similarity score (SS) to the term is different from 0. The realistic (moderate) level requires SS greater than 0.3, the strong (high) level requires SS greater than 0.7, the very strong level requires SS greater than 0.85, and the perfect level requires SS equal to 1.
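The agreement levels above can be read as thresholds on the similarity score. The following is a minimal sketch of that reading, not the tool's actual implementation: the function names, the level labels used as dictionary keys, and the input layout (a mapping from protein ID to its SS for one term) are all assumptions made for illustration.

```python
import math

# Cut-offs taken from the text; the key names are illustrative labels only.
AGREEMENT_CUTOFFS = {
    "moderate": 0.3,      # realistic
    "strong": 0.7,        # high
    "very strong": 0.85,
}


def meets_agreement(ss, level):
    """Return True if similarity score `ss` satisfies the given agreement level."""
    if level == "weak":       # customized: any non-zero similarity
        return ss != 0
    if level == "perfect":    # SS equal to 1 (compared with a float tolerance)
        return math.isclose(ss, 1.0)
    if level in AGREEMENT_CUTOFFS:
        return ss > AGREEMENT_CUTOFFS[level]
    raise ValueError(f"unknown agreement level: {level!r}")


def select_proteins(scores, level):
    """Select proteins whose SS to a term satisfies the agreement level.

    `scores` is a hypothetical mapping {protein_id: SS}.
    """
    return sorted(p for p, ss in scores.items() if meets_agreement(ss, level))
```

For example, with hypothetical scores `{"P1": 0.0, "P2": 0.31, "P3": 0.9, "P4": 1.0}`, the moderate level would keep P2, P3 and P4, while the perfect level would keep only P4.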
3. GO-based Semantic Similarity-Fuzzy Enrichment Analysis Tool
From the list of target genes or proteins, statistically and biologically relevant GO annotations explaining the biological phenomenon being analyzed are identified and filtered according to the agreement level, as described for the previous tool. A moderate filter is applied to analyses using the weak (customized) or moderate (realistic) agreement level. A strong filter is applied to analyses using the strong or very strong agreement level. No filter is applied when using the perfect agreement level, which corresponds to the traditional term enrichment approach. Note that if a background (reference) list of proteins is not provided, all annotated proteins of the organism are used as the reference by default.
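The filter selection and the default reference set described above can be summarized in a few lines. This is only a sketch under stated assumptions: the function names, the level labels, and the idea of passing the organism's annotated proteins as an explicit argument are hypothetical and do not reflect the tool's internal interface.

```python
def filter_for_agreement(level):
    """Map an agreement level to the filter strength described in the text.

    Returns "moderate", "strong", or None (no filter, i.e. the traditional
    term enrichment approach).
    """
    if level in ("weak", "moderate"):
        return "moderate"
    if level in ("strong", "very strong"):
        return "strong"
    if level == "perfect":
        return None
    raise ValueError(f"unknown agreement level: {level!r}")


def resolve_background(background, organism_annotated):
    """Use the user's background list if given, else all annotated proteins."""
    if background:
        return set(background)
    return set(organism_annotated)
```

An empty or missing background list falls back to the full set of annotated proteins, which mirrors the default behavior stated above.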
4. GO-based Similarity Protein Fuzzy-Classification Tool
GO annotations of the proteins provided are used to construct a weighted network based on the functional (semantic) similarity between proteins, depending on the semantic similarity approach and agreement level used. This network is then used to cluster or classify these proteins. The number of proteins is limited so that the clustering figure can be displayed properly.
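The network construction step can be pictured as follows. This is a rough sketch, assuming that an edge is kept between two proteins when their similarity score satisfies the chosen agreement level and that the score itself serves as the edge weight. The function name `build_network` and the `similarity` and `passes` callables are hypothetical. The clustering step is not shown, because the text does not name the algorithm used.

```python
from itertools import combinations


def build_network(proteins, similarity, passes):
    """Build a weighted similarity network as a dict of edges.

    proteins   -- iterable of protein IDs
    similarity -- callable (a, b) -> similarity score between two proteins
    passes     -- callable (score) -> bool, the agreement-level test
    Returns {(a, b): score} for each retained pair, with a < b.
    """
    edges = {}
    for a, b in combinations(sorted(proteins), 2):
        score = similarity(a, b)
        if passes(score):
            edges[(a, b)] = score
    return edges
```

Passing the agreement test as a callable keeps the sketch independent of any particular similarity measure or threshold, so the same function covers every combination of approach and agreement level.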