DaGO-Fun - Database for GO-based Functional Annotation Analysis

The DaGO-Fun Data Sources

The DaGO-Fun data are mainly GO term and protein data retrieved from the UniProt, GO and GOA-UniProtKB project databases.

  • The Universal Protein Resource (UniProt) database is the source for protein accessions.
  • The Gene Ontology (GO) database is the source for GO term data.
  • The GO Annotation UniProt Knowledgebase (GOA-UniProtKB) project is the source for protein GO annotations.

The current version of DaGO-Fun uses the UniProt and GOA-UniProtKB releases of 15 April 2014 and the GO file with format version 1.2 and data-version releases/2014-04-16, as downloaded on 19 April 2014. DaGO-Fun includes a total of 40848 GO terms: 39044 active terms (25849 Biological Process terms, 9693 Molecular Function terms and 3502 Cellular Component terms) and 1804 obsolete terms (749 Biological Process terms, 904 Molecular Function terms and 151 Cellular Component terms). It contains 33027437 proteins annotated with Biological Process terms, 31758182 annotated with Molecular Function terms and 16108883 annotated with Cellular Component terms, all extracted from the GOA-UniProtKB file. Note that these protein counts exclude proteins that are annotated only with root terms.
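
The counts above could be reproduced along the following lines. This is only a minimal sketch, not the DaGO-Fun pipeline itself: the function names count_terms and annotated_proteins are hypothetical, and the code assumes the GO data are available as OBO 1.2 text and the annotations as GAF lines (tab-separated, protein ID in column 2, GO ID in column 5, aspect P/F/C in column 9).

```python
from collections import Counter

# Root terms of the three GO ontologies (BP, MF, CC).
GO_ROOTS = {"GO:0008150", "GO:0003674", "GO:0005575"}


def count_terms(obo_text):
    """Count active and obsolete [Term] stanzas per namespace (hypothetical helper)."""
    counts = {"active": Counter(), "obsolete": Counter()}
    for stanza in obo_text.split("\n\n"):
        lines = [ln.strip() for ln in stanza.strip().splitlines()]
        if not lines or lines[0] != "[Term]":
            continue  # skip the file header and [Typedef] stanzas
        namespace = None
        obsolete = False
        for ln in lines[1:]:
            if ln.startswith("namespace: "):
                namespace = ln[len("namespace: "):]
            elif ln == "is_obsolete: true":
                obsolete = True
        if namespace is not None:
            counts["obsolete" if obsolete else "active"][namespace] += 1
    return counts


def annotated_proteins(gaf_lines):
    """Collect, per aspect, the proteins with at least one non-root annotation."""
    proteins = {"P": set(), "F": set(), "C": set()}
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # comment or blank line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # malformed line
        protein_id, go_id, aspect = cols[1], cols[4], cols[8]
        # Annotations to a root term alone carry no information and are skipped,
        # so a protein annotated only with root terms is never counted.
        if go_id not in GO_ROOTS and aspect in proteins:
            proteins[aspect].add(protein_id)
    return proteins


# Consistency check of the totals quoted in the text.
assert 25849 + 9693 + 3502 == 39044
assert 749 + 904 + 151 == 1804
assert 39044 + 1804 == 40848
```

With real files, count_terms would be given the contents of the GO OBO file and annotated_proteins an open GAF file handle; len(proteins["P"]) would then give the number of proteins with informative Biological Process annotations.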

For the organism-specific tools, namely the GO Term Similarity based Protein-Fuzzy Identification (GOTSP-FIT) and GO-based Semantic Similarity-Fuzzy Enrichment Analysis (GOSS-FEAT) tools, protein annotations were downloaded from the Integr8 project, which has since been merged into Ensembl Genomes. Currently, DaGO-Fun includes the following three organisms:

  • Mycobacterium tuberculosis (MTB)
  • Mycobacterium leprae
  • Human (Homo sapiens)