|DaGO-Fun - Database for GO-based Functional Annotation Analysis|
|About the GOSP-FCT Tool|
Genome sequencing projects and high-throughput technologies have yielded complete genome sequences and functional genomics data for human and several other organisms, including important microbial pathogens of humans, animals and plants. Many genes and proteins are now annotated with Gene Ontology (GO) terms, and these annotations can be used to detect groups of functionally related proteins. Identifying gene or protein clusters from their functional annotations offers an effective way to analyze complex biological phenomena: it reveals meaningful patterns in gene or protein datasets and improves their biological interpretation. The GOSP-FCT tool models a given list of proteins or genes as a graph, or functional map, whose edges are weighted by the functional similarity scores between these proteins. This makes it easy to apply automated clustering methods to detect protein complexes or other biologically significant functional groupings.
1. Background or Reference Data and Clustering Approaches
The DaGO-Fun clustering tool uses all annotated proteins contained in the GO Annotation UniProt Knowledgebase (GOA-UniProtKB) file as background data when computing the protein functional similarity scores that weight the functional map, and it then clusters proteins based on their GO annotations. Note that the choice of semantic similarity approach significantly affects the output.
2. Functional Similarity Measures and Functional Maps
GOSP-FCT sets relationships (edges) between genes or proteins in the functional map (or graph) using their functional similarity scores. Specifically, we use fuzzy logic to model the occurrence of a given annotation in the corpus under consideration through semantic similarity between terms. The GOSP-FCT tool supports all the current information content (IC)-based GO semantic similarity measures that we are aware of, namely topology-based approaches:
Note that the output of a query depends on the agreement level selected by the user. In GOSP-FCT, an edge between two proteins is set only when the functional similarity score between them is greater than or equal to this agreement level.
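The thresholding rule can be illustrated with a minimal Python sketch. It is not the GOSP-FCT implementation: the function names, the protein identifiers and the similarity scores are all hypothetical, and connected components stand in for whichever clustering method is actually applied to the functional map.

```python
def build_functional_map(scores, agreement_level):
    """Build an undirected weighted graph from pairwise similarity scores.

    An edge is kept only when its score is >= agreement_level.
    `scores` maps (protein_a, protein_b) pairs to a similarity in [0, 1].
    """
    graph = {}
    for (a, b), score in scores.items():
        # Every protein is a vertex, even if it ends up with no edges.
        graph.setdefault(a, {})
        graph.setdefault(b, {})
        if a != b and score >= agreement_level:
            graph[a][b] = score
            graph[b][a] = score
    return graph


def connected_components(graph):
    """Group vertices into clusters (a simple stand-in clustering method)."""
    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Hypothetical functional similarity scores, for illustration only.
scores = {
    ("P1", "P2"): 0.82,
    ("P1", "P3"): 0.35,
    ("P2", "P3"): 0.30,
    ("P3", "P4"): 0.71,
}

functional_map = build_functional_map(scores, agreement_level=0.7)
clusters = connected_components(functional_map)
print(clusters)  # [['P1', 'P2'], ['P3', 'P4']]
```

Lowering the agreement level keeps more edges and tends to merge clusters, while raising it leaves more proteins isolated. This is why the same query can return different groupings at different levels.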
For more information, please refer to the associated publication: "Gaston K. Mazandu and Nicola J. Mulder. DaGO-Fun: Tool for Gene Ontology-based functional analysis using term information content measures, 2013", DaGO-Fun preliminary paper currently under review.