Sequence Demarcation Tool

Sequence Demarcation Tool Version 1.3 (SDTv1.3) is a free, easy-to-use program for classifying virus sequences based on pairwise sequence identity. It takes as input a FASTA file of aligned or unaligned DNA or protein sequences, aligns every unique pair of sequences, calculates pairwise similarity scores, and displays a colour coded matrix of these scores. It also produces a plot of the pairwise identity scores and text files containing the analysis results. The identity scores are calculated as 1-(M/N), where M is the number of mismatching nucleotides and N is the total number of positions along the alignment at which neither sequence has a gap character.
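
The 1-(M/N) score can be illustrated with a short Python sketch. This is not SDT's own code: the function name pairwise_identity, the choice of "-" as the gap character, and the case-insensitive comparison are assumptions made for illustration only. It expects two sequences that have already been aligned to the same length.

```python
def pairwise_identity(seq_a, seq_b, gap_chars="-"):
    """Return 1 - (M / N) for two aligned sequences of equal length.

    M = mismatching positions, N = positions where neither sequence
    has a gap character. Hypothetical helper, not part of SDT.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    mismatches = 0  # M
    compared = 0    # N
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip any column where either sequence has a gap.
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 1.0 - mismatches / compared


if __name__ == "__main__":
    # Five ungapped columns, one mismatch (G vs C): 1 - 1/5
    score = pairwise_identity("ACGT-A", "ACCTTA")
    print(f"{score:.2f}")
```

Columns containing a gap in either sequence are left out of both M and N, so an insertion or deletion does not lower the score by itself.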

[Figure: Colour coded matrix of pairwise similarity scores]
[Figure: Distribution plot of pairwise similarity scores]

Up-to-date Mastrevirus reference datasets

1. Mastrevirus reference sequences alignment mastrevirus_references.fas (last updated 20/11/2012)
2. Mastrevirus reference sdt file mastrevirus_references.sdt (last updated 20/11/2012)

SDT_Linux & SDT_Mac (Sequence Demarcation Tool for 32 and 64 bit Linux and Mac OSX operating systems)

This is a command-line version of SDTv1.0 written in Python, which runs on 32 and 64 bit Linux and Mac OSX operating systems.

SDTMPI_Linux (Sequence Demarcation Tool MPI for 32 and 64 bit Linux operating systems)

SDTMPI_Linux (32 and 64 bit) is a free Linux-based Python program that uses a parallel programming library for Python (mpi4py) to quickly calculate pairwise DNA sequence similarities for datasets too large to analyse practically with SDTv1.0 (for example, more than 1000 sequences). Given a FASTA file containing DNA sequences, the program aligns all possible pairs of sequences using Muscle, ClustalW2, or Mafft, calculates the sequence similarity score for each pair, and uses a rooted neighbour joining phylogenetic tree to cluster closely related sequences based on the similarity scores. It outputs a text file containing the scores and a ".sdt" file that can be opened with the Windows version of SDTv1.0 to visualise the pairwise identity distribution plot and colour coded similarity matrix. The identity scores are calculated as 1-(M/N), where M is the number of mismatching nucleotides and N is the total number of positions along the alignment where neither sequence has a gap character.
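
To show why parallel processing helps, the sketch below enumerates every unique pair of sequences and deals the pairs out to a number of worker processes. How SDTMPI_Linux actually divides its work is not described here, so the round-robin split and the function names unique_pairs and pairs_for_rank are assumptions for illustration. In an mpi4py program, rank and size would typically come from MPI.COMM_WORLD.Get_rank() and MPI.COMM_WORLD.Get_size(); plain integers stand in for them so the example runs without MPI.

```python
from itertools import combinations


def unique_pairs(names):
    """Return every unordered pair of sequence names: n*(n-1)/2 pairs."""
    return list(combinations(names, 2))


def pairs_for_rank(pairs, rank, size):
    """Round-robin share of the pair list for one worker.

    rank is the worker index (0 .. size-1) and size is the number of
    workers. Hypothetical scheme, not taken from SDTMPI_Linux.
    """
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return pairs[rank::size]


if __name__ == "__main__":
    names = [f"seq{i}" for i in range(1000)]
    pairs = unique_pairs(names)
    print(len(pairs))  # 1000 * 999 / 2 = 499500 pairwise alignments

    size = 8  # assumed number of worker processes
    for rank in range(size):
        share = pairs_for_rank(pairs, rank, size)
        print(rank, len(share))
```

Because the number of pairs grows quadratically with the number of sequences, splitting the pair list across workers reduces the number of alignments each process must run roughly in proportion to the number of workers.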

Please cite: Muhire BM, Varsani A, Martin DP (2014) SDT: A Virus Classification Tool Based on Pairwise Sequence Alignment and Identity Calculation. PLoS ONE 9(9): e108277. doi:10.1371/journal.pone.0108277


Written by Brejnev Muhire
Computational Biology Group - University of Cape Town
Institute of Infectious Diseases and Molecular Medicine (IIDMM)
Anzio Road Observatory 7925
Cape Town South Africa
Tel: +27 21 406 6058/6176 Fax: +27 21 406 6068
Email: mhrbre001@myuct.ac.za or brejnev.muhire@gmail.com