Background: The sheer volume of information contained in large data sets makes it difficult to develop an intuitive understanding of them. It would be useful to have software that allows visualization of possible correlations between properties that can be associated with a core data set. In the case of bacterial genomes, existing visualization tools focus either on global properties, such as variations in composition, or on detailed local displays of the features that comprise the annotation. Neither kind of tool makes it easy to visualize additional information in the context of this core data.
Results: The Genome Display Tool (GDT) is Java-based software that lets the user simultaneously view the distribution of multiple attributes pertaining to genes and intragenic regions in a single bacterial genome, using different colours and shapes on a single screen. The display represents each gene as a small box whose placement corresponds to the gene's physical position in the genome. Box size is allocated dynamically according to the number of genes, and a zoom feature allows close-up inspection of regions of interest. The display is interfaced with an MS-Access relational database and can show any feature in the database that can be represented by discrete values. Data are readily added to the database from an MS-Excel spreadsheet. The functionality of GDT is demonstrated by comparing the results of two predictions of recent horizontal transfer events in the genome of Synechocystis PCC-6803. The resulting display lets the user see immediately how much the two methods agree and how genes in the various categories (e.g. predicted by both methods, by one method only, etc.) are distributed in the genome.
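The layout and comparison ideas described in the Results can be illustrated with a minimal sketch. It is not GDT's code (GDT is written in Java and reads from an MS-Access database); it is written in Python for brevity and only assumes one plausible way to size boxes from the gene count, place them in genome order, and label each gene by the agreement between two prediction methods. The function names, the pixel dimensions, the gene identifiers and the prediction values are all hypothetical.

```python
import math
from collections import Counter


def box_size(n_genes, width, height, min_size=2):
    """Edge length (pixels) of square boxes so n_genes boxes fit in the panel.

    Assumed rule: start from the ideal area per gene, then shrink until the
    whole-box grid has enough cells, never going below min_size.
    """
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    size = math.isqrt((width * height) // n_genes)
    while size > min_size and (width // size) * (height // size) < n_genes:
        size -= 1
    return max(size, min_size)


def box_position(index, size, width):
    """Top-left (x, y) of the box for the gene at this rank in genome order."""
    per_row = width // size
    row, col = divmod(index, per_row)
    return col * size, row * size


def agreement_category(in_a, in_b):
    """Discrete label describing how two prediction methods agree for a gene."""
    if in_a and in_b:
        return "both"
    if in_a:
        return "method A only"
    if in_b:
        return "method B only"
    return "neither"


# Hypothetical input: genes in genome order with two boolean predictions each.
genes = ["gene1", "gene2", "gene3", "gene4"]
method_a = {"gene1": True, "gene2": True, "gene3": False, "gene4": False}
method_b = {"gene1": True, "gene2": False, "gene3": True, "gene4": False}

size = box_size(len(genes), width=100, height=100)
layout = {
    g: (box_position(i, size, 100), agreement_category(method_a[g], method_b[g]))
    for i, g in enumerate(genes)
}
summary = Counter(category for _, category in layout.values())
print(size, layout, dict(summary))
```

In a display of this kind, each category would be mapped to a colour or shape, so both the overall agreement (the counts in `summary`) and the genomic distribution of each category (the positions in `layout`) can be read from one screen.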
Conclusion: The GDT software provides the user with a powerful tool that allows development of an intuitive understanding of the relative distribution of features in a large data set. As additional features are added to the data set, the number of possible correlations that can be visualized grows rapidly. Although described here for use in bacterial genomics, the principle is general and similar software might be useful in other contexts such as patient studies.