Greedy Clique Partition pAckage Tool (GCPAT)


Greedy Clique Partition pAckage Tool (GCPAT) is a program for clustering binary fingerprints with unknown values (i.e., strings of 0's, 1's, and N's). The OFRG method involves clustering fingerprints obtained from hybridization experiments. Unlike in other clustering problems, the data vectors in this application contain some unknown values (N's), and thus standard clustering algorithms do not apply.

The program uses a greedy strategy for minimum clique partition to cluster binary strings of 0's, 1's, and N's. It attempts to partition the input fingerprints into the smallest number of clusters, each consisting only of compatible fingerprints. The unknown values (i.e., the N's) in the fingerprints of a cluster are then imputed in a straightforward way. The program can also produce a UPGMA-style hierarchical clustering tree, so the user can investigate clone clusters at various levels of granularity.
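The clustering and imputation steps described above can be sketched in a few lines of Python. This is only an illustration, not the GCPAT implementation: it assumes that two fingerprints are compatible when no position pairs a 0 with a 1, it uses a simple first-fit greedy order (the actual greedy heuristic used by gcpclustering.1.0.exe is not described here), and it assumes that an N is imputed from the known value other members of the same cluster have at that position. All function names are hypothetical.

```python
def compatible(a, b):
    """Assumed rule: compatible if no position pairs a 0 with a 1."""
    return len(a) == len(b) and all(
        x == y or x == "N" or y == "N" for x, y in zip(a, b)
    )


def merge(a, b):
    """Combine two compatible fingerprints, preferring known values over N."""
    return "".join(y if x == "N" else x for x, y in zip(a, b))


def greedy_clique_partition(fingerprints):
    """First-fit greedy (an assumption): put each fingerprint into the first
    cluster whose consensus it is compatible with, else open a new cluster.
    Being compatible with the consensus is the same as being compatible with
    every member, so each cluster is a clique of compatible fingerprints."""
    clusters = []  # list of (consensus, members)
    for fp in fingerprints:
        for i, (consensus, members) in enumerate(clusters):
            if compatible(consensus, fp):
                clusters[i] = (merge(consensus, fp), members + [fp])
                break
        else:
            clusters.append((fp, [fp]))
    return clusters


def impute(clusters):
    """Fill each member's N's from its cluster consensus. A position that is
    N in every member of the cluster stays N."""
    return [[merge(fp, consensus) for fp in members]
            for consensus, members in clusters]


clusters = greedy_clique_partition(["10N1", "1N01", "0011", "N011", "1001"])
print(clusters)
print(impute(clusters))
```

First-fit is order dependent and does not guarantee the minimum number of clusters; it only shows how a greedy pass can build cliques of compatible fingerprints.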

The GCPAT package contains three executable files. "GCPAT.exe" is a Java application. "gcpclustering.1.0.exe" clusters fingerprints and resolves the missing values in them. "pairAlignmentScore.exe" computes the identity score among DNA sequences within a cluster by aligning them.


To run GCPAT, make sure that the three executables above are in the same folder, and then run GCPAT.exe. The supported platforms are Windows 95/98/2000/XP.


In the GCPAT menu bar, the "Clustering" menu is the main part of the package. When the user chooses GCP under Clustering, the program takes a fingerprint file and outputs a *.clustering file and a tree file. The *.clustering file is a tab-delimited file, and the *.tree file can be opened with Tools -> Display tree.

The UPGMA option under the Clustering menu computes the tree; it requires a *.clustering file as input.
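For readers unfamiliar with UPGMA, the following Python sketch shows the general technique: repeatedly merge the two closest nodes and update distances as size-weighted averages. It is not the code used by GCPAT. The distance function (fraction of differing positions among those known in both fingerprints) and the idea of applying it to one fingerprint per cluster are assumptions made for illustration, and all names are hypothetical.

```python
def distance(a, b):
    """Hypothetical distance: fraction of positions known in both
    fingerprints (neither is N) at which the two values differ."""
    known = [(x, y) for x, y in zip(a, b) if x != "N" and y != "N"]
    return sum(x != y for x, y in known) / len(known) if known else 0.0


def upgma(labels, dist):
    """Return a UPGMA tree over a non-empty list of labels as nested tuples."""
    nodes = {i: (label, 1) for i, label in enumerate(labels)}  # id -> (tree, size)
    d = {(i, j): dist(labels[i], labels[j])
         for i in nodes for j in nodes if i < j}
    next_id = len(labels)
    while len(nodes) > 1:
        i, j = min(d, key=d.get)  # closest pair of active nodes
        (ti, ni), (tj, nj) = nodes.pop(i), nodes.pop(j)
        for k in nodes:
            # Size-weighted average distance from the merged node to node k.
            dik = d.pop((min(i, k), max(i, k)))
            djk = d.pop((min(j, k), max(j, k)))
            d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        del d[(i, j)]
        nodes[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    return next(iter(nodes.values()))[0]


print(upgma(["1001", "1011", "0110"], distance))
```

Cutting such a tree at different heights gives coarser or finer groupings, which is what allows clusters to be inspected at various levels of granularity.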

The Taxonomic Tabulation option under the Tools menu produces a tab-delimited file (*.tabulation). It requires a probe set file (*.txt), a training DNA sequence file (*.cgi), and, optionally, a lineage file (*.txt). In this case, no output is generated until you choose "Save tabulation into file ..." from the File menu.

The user can also save other information by choosing "Save ..." from the File menu. In the tree window (i.e., after displaying a tree), the user can also highlight clones in the tree, as well as treatments. This option is found under the Tools menu of the tree window. The (color) tree can also be saved or exported as a JPEG file.


GCPAT.exe       gcpclustering.1.0.exe       pairAlignmentScore.exe       User's Guide
Copyright © by Oligonucleotide Fingerprinting of Ribosomal RNA Genes (OFRG) Group (zliu@cs.ucr.edu, qfu@cs.ucr.edu), 2002

Last Modified on 04/25/2013