SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides



Fig. 1. Screenshot of SeqVis, with data set from Rokas et al. (2005) uploaded

Summary

SeqVis is a stand-alone, platform-independent Java application developed to facilitate analysis and 3D visualization of compositional heterogeneity in species-rich alignments of nucleotide sequences. Each sequence is represented by a dot in a tetrahedron plot (i.e., an extension of the de Finetti diagram), where the position of the dot is determined uniquely by the nucleotide content of that sequence. Sequences that differ in composition appear as dots in different regions of the tetrahedron (as in Fig. 1). SeqVis also allows users to analyse their data set using, for example, the matched-pairs test of symmetry (Ababneh et al. 2006).
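The mapping from nucleotide content to a point in the tetrahedron can be illustrated with a minimal Python sketch. It is not SeqVis code: the vertex coordinates, the assignment of nucleotides to vertices, and the function names are assumptions chosen for illustration, and gaps and ambiguity codes are simply ignored.

```python
from collections import Counter

# Hypothetical vertex layout: four corners of a cube that form a regular
# tetrahedron. Which nucleotide sits at which vertex is an arbitrary choice.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "T": (-1.0, -1.0, 1.0),
}


def nucleotide_frequencies(sequence):
    """Return the relative frequencies of A, C, G and T in a sequence.

    Characters other than A, C, G and T (gaps, ambiguity codes) are skipped.
    """
    counts = Counter(base for base in sequence.upper() if base in VERTICES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in VERTICES}


def tetrahedron_point(sequence):
    """Map a sequence to a 3D point inside the tetrahedron.

    The point is the frequency-weighted average of the four vertices
    (barycentric coordinates), so each composition has exactly one position.
    """
    freqs = nucleotide_frequencies(sequence)
    return tuple(
        sum(freqs[base] * VERTICES[base][axis] for base in VERTICES)
        for axis in range(3)
    )
```

Under this layout, a sequence made of a single nucleotide lands on the corresponding vertex, and a sequence with equal proportions of all four nucleotides lands at the centre of the tetrahedron.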

SeqVis can be downloaded from here; some of its features are listed here. A manual, with an in-depth description of why the program is needed and how its features may be used, was published by Jermiin et al. (2009). The benefit of using SeqVis is illustrated here using two phylogenetic data sets.

Background

Compositional heterogeneity

Most phylogenetic methods assume that the sequences have evolved under the same time-reversible conditions (i.e., the evolutionary process is assumed to have been globally stationary, reversible, and homogeneous). Compositional heterogeneity in sequence data occurs when the nucleotide composition of homologous sequences varies across the taxa, and it implies that the sequences cannot have evolved under the same stationary, reversible, and homogeneous conditions. Therefore, if a phylogenetic analysis is to be conducted with a time-reversible Markov model (such as the HKY or GTR models), then it would be wise to survey the phylogenetic data first for evidence of violation of these assumptions.
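One way to survey a pair of aligned sequences is the matched-pairs test of symmetry named in the Summary. The sketch below follows the test as it is commonly formulated (Bowker's test), not the SeqVis implementation; the function names are hypothetical, and sites with gaps or ambiguity codes are assumed to be skipped.

```python
from itertools import combinations

BASES = "ACGT"


def divergence_matrix(seq_a, seq_b):
    """Count, for two aligned sequences, how often base i in seq_a is
    paired with base j in seq_b. Sites with non-ACGT characters are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    index = {base: i for i, base in enumerate(BASES)}
    matrix = [[0] * len(BASES) for _ in BASES]
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in index and y in index:
            matrix[index[x]][index[y]] += 1
    return matrix


def bowker_symmetry(matrix):
    """Return (statistic, degrees_of_freedom) for the test of symmetry.

    The statistic sums (n_ij - n_ji)^2 / (n_ij + n_ji) over all i < j;
    pairs with n_ij + n_ji == 0 are dropped and do not count towards
    the degrees of freedom.
    """
    statistic = 0.0
    df = 0
    for i, j in combinations(range(len(matrix)), 2):
        total = matrix[i][j] + matrix[j][i]
        if total > 0:
            statistic += (matrix[i][j] - matrix[j][i]) ** 2 / total
            df += 1
    return statistic, df
```

Under the null hypothesis of symmetry, the statistic is compared against a chi-square distribution with the returned degrees of freedom; a small p-value suggests that the two sequences have not evolved under the same stationary, reversible, and homogeneous conditions.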

Existing methods for assessing compositional heterogeneity

There are many ways to survey alignments of nucleotides for evidence of compositional heterogeneity, but most of the available methods are either likely to give unreliable results (because of their poor statistical design) or impractical, because they were not designed to analyse the increasingly large, species-rich alignments that are now available (for details, see Jermiin et al. 2004, 2009).

Citing SeqVis

Please cite this article if you include results generated by SeqVis in your research:

Ho JWK, Adams CE, Lew JB, Matthews TJ, Ng CC, Shahabi-Sirjani A, Tan LH, Zhao Y, Easteal S, Wilson SR, Jermiin LS (2006). SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides. Bioinformatics 22, 2162-2163.


References

Ababneh F, Jermiin LS, Ma C, Robinson J (2006). Matched-pairs tests of homogeneity with applications to homologous nucleotide sequences. Bioinformatics 22, 1225-1231.

Jermiin LS et al. (2004). The biasing effect of compositional heterogeneity on phylogenetic estimates may be underestimated. Syst. Biol. 53, 638-644.

Jermiin LS, Ho JWK, Lau KW, Jayaswal V (2009). SeqVis: A tool for detecting compositional heterogeneity among aligned nucleotide sequences. In Bioinformatics for DNA sequence analysis (Ed. Posada D), Humana Press, Totowa (NJ). Pp. 65-91.

Rokas A, Krüger D, Carroll SB (2005). Animal evolution and the molecular signature of radiations compressed in time. Science 310, 1933-1938.

© The University of Sydney, 2004-2011. All rights reserved.