Home   |   Sequence Retrieval   |   ProbeTools  |   OFRG Central   |   Macros   |   GCPAT  |   CloneTools  |   PRISE2 
OFRG
ProbeTools Instructions

ProbeTools is a software package for designing sets of oligonucleotide probes for OFRG hybridization experiments. The general goal is to design a small collection of probes that collectively discriminates all, or almost all, clones within a specified population.
The input to the program is a collection of rRNA gene sequences, called the training set, selected by the user. These sequences are assumed to be representative of the targeted population of clones. The program first extracts candidate probes that meet criteria specified by the user (length, GC content, etc.). A filtering module then removes candidate probes that are unlikely to bind in a predictable manner. From the remaining candidates, the program selects the final small collection of probes (typically between 30 and 50) that optimizes the resolution.
More specifically, probe set selection is modeled as a combinatorial optimization problem: for a given k, compute a set P of k probes that distinguishes the maximum number of clones in the training set. (Two clones are distinguished by P if some probe in P hybridizes with one clone but not the other.) The program implements a heuristic based on the simulated annealing algorithm (J. Borneman, M. Chrobak, G. Della Vedova, A. Figueroa, and T. Jiang. 2001. Probe selection algorithms with applications in the analysis of microbial communities. Bioinformatics 17(Suppl. 1):S39-S48).
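The selection objective can be made concrete with a small sketch. It assumes, purely for illustration, that a probe hybridizes with a clone when the probe occurs as an exact substring of the clone sequence; the hybridization model ProbeTools actually uses is not described on this page. The function names are hypothetical, and counting distinguished pairs of clones is only one possible way to score a probe set.

```python
from collections import Counter
from math import comb


def fingerprint(clone, probes):
    """Hybridization pattern of one clone against a probe set.

    Assumption: a probe 'hybridizes' when it is an exact substring.
    """
    return tuple(probe in clone for probe in probes)


def distinguished_pairs(clones, probes):
    """Count pairs of clones that at least one probe tells apart.

    Two clones are distinguished exactly when their fingerprints differ,
    so we subtract the pairs that share a fingerprint from all pairs.
    """
    groups = Counter(fingerprint(clone, probes) for clone in clones)
    all_pairs = comb(len(clones), 2)
    identical_pairs = sum(comb(size, 2) for size in groups.values())
    return all_pairs - identical_pairs


clones = ["ACGTACGT", "ACGTTTTT", "GGGGACGT", "GGGGTTTT"]
print(distinguished_pairs(clones, ["ACGT"]))
print(distinguished_pairs(clones, ["ACGT", "GGGG", "TTTT"]))
```

In this toy data set a single probe separates only some pairs, while the three-probe set gives every clone a different fingerprint.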

Input
Training file

A training file contains a subset of rDNA clones from the given population in FASTA format. Three training files are provided. You may also upload your own training file in FASTA format. An uploaded training file cannot contain more than 12000 sequences, and no sequence can exceed 3000 nucleotide bases; a file that exceeds either limit will be trimmed automatically.
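To check a file against these limits before uploading, a sketch like the following may help. How ProbeTools trims an oversized file is not stated here; this example assumes that extra sequences are dropped from the end of the file and that long sequences are cut to their first 3000 bases. The function name and its parameters are hypothetical.

```python
def read_training_fasta(text, max_sequences=12000, max_length=3000):
    """Parse FASTA text into (header, sequence) pairs and apply size limits.

    Assumption: trimming keeps the first max_sequences records and the
    first max_length bases of each sequence.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return [(h, seq[:max_length]) for h, seq in records[:max_sequences]]


example = ">clone1\nACGT\nacgt\n>clone2\nGG\n>clone3\nTT\n"
print(read_training_fasta(example, max_sequences=2, max_length=5))
```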

Required Parameters:
Probe length

The number of nucleotides in a probe. It cannot exceed 15.

Number of probes

The number of probes in one probe set. It cannot exceed 60. If the number of probes is greater than 27, the computation will take a long time; in that case, choosing email as the output type is recommended.

GC content

The fraction of G and C nucleotides in a probe. If you have no particular requirement for the G+C fraction, you may set it to "between 0% and 100%".
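The required parameters can be read as a filter over all substrings of the training sequences. The sketch below enumerates every substring of the requested probe length and keeps those whose GC fraction falls within the chosen range. The function names are hypothetical, and skipping substrings that contain characters other than A, C, G, and T is an assumption of this example, not documented ProbeTools behavior.

```python
def gc_fraction(probe):
    """Fraction of G and C bases in a probe (0.0 to 1.0)."""
    return sum(base in "GC" for base in probe) / len(probe)


def candidate_probes(sequences, length, gc_min=0.0, gc_max=1.0):
    """Collect distinct substrings of the given length within the GC range."""
    candidates = set()
    for seq in sequences:
        seq = seq.upper()
        for start in range(len(seq) - length + 1):
            probe = seq[start:start + length]
            # Assumption: ambiguous bases (e.g. N) disqualify a candidate.
            if not set(probe) <= set("ACGT"):
                continue
            if gc_min <= gc_fraction(probe) <= gc_max:
                candidates.add(probe)
    return candidates


print(sorted(candidate_probes(["AAAAGC"], 4, gc_min=0.5, gc_max=1.0)))
```

Setting the range to 0.0 and 1.0 corresponds to "between 0% and 100%", which keeps every substring of the requested length.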
Advanced Parameters:
Candidate probe filter

The candidate probe filter eliminates candidate probes whose frequency is higher than the user-defined threshold or lower than (1 - user-defined threshold). Removing such probes can improve the distinguishing ability of the remaining candidates. The default threshold is 80%, so by default probes with a frequency above 80% or below 20% are eliminated.
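A minimal sketch of this filter follows. It assumes that the frequency of a probe is the fraction of training sequences that contain it as an exact substring; that definition, like the function names, is an assumption made for this example.

```python
def probe_frequency(probe, sequences):
    """Fraction of sequences containing the probe (exact-match assumption)."""
    return sum(probe in seq for seq in sequences) / len(sequences)


def filter_by_frequency(candidates, sequences, threshold=0.8):
    """Keep probes whose frequency lies between 1 - threshold and threshold."""
    low = 1.0 - threshold
    return [
        probe
        for probe in candidates
        if low <= probe_frequency(probe, sequences) <= threshold
    ]


sequences = ["AACG", "AATT", "AACC", "AAGG"]
# "AA" occurs in every sequence and "GGG" in none, so both are dropped.
print(filter_by_frequency(["AA", "CG", "TT", "GGG"], sequences))
```

A probe that matches nearly every clone, or nearly none, splits the training set very unevenly, which is why both extremes are removed.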

Number of probe sets

The program computes the given number of probe sets and outputs only the set with the best accuracy. The default value is 5. Generally, computing more sets gives higher accuracy but a longer running time.

Iterations before decreasing temperature

The probe set is computed with the simulated annealing algorithm. The temperature determines how likely the program is to switch to a worse probe choice. This parameter sets how many iterations are computed at each temperature before it is decreased. The default value is 2. Generally, more iterations lead to higher accuracy and a longer running time.
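The role of this parameter is easier to see in a generic simulated annealing loop. The sketch below is not the ProbeTools implementation: the exact-substring hybridization model, the pair-counting score, the swap move, the start and stop temperatures, and the geometric cooling step are all assumptions chosen for illustration, and every name is hypothetical. It requires at least k candidates.

```python
import math
import random
from collections import Counter


def distinguished_pairs(clones, probes):
    """Pairs of clones with different hybridization patterns (exact match)."""
    groups = Counter(tuple(p in c for p in probes) for c in clones)
    n = len(clones)
    return n * (n - 1) // 2 - sum(g * (g - 1) // 2 for g in groups.values())


def anneal(clones, candidates, k, iterations_per_temp=2,
           t_start=10.0, t_min=0.01, cooling=0.9, seed=0):
    """Search for k probes that distinguish many clone pairs."""
    rng = random.Random(seed)
    candidates = sorted(candidates)
    current = rng.sample(candidates, k)
    current_score = distinguished_pairs(clones, current)
    best, best_score = list(current), current_score
    t = t_start
    while t > t_min:
        # Several moves are tried at each temperature before cooling.
        for _ in range(iterations_per_temp):
            outside = [p for p in candidates if p not in current]
            if not outside:
                break
            # Move: replace one selected probe with an unselected one.
            neighbor = list(current)
            neighbor[rng.randrange(k)] = rng.choice(outside)
            neighbor_score = distinguished_pairs(clones, neighbor)
            delta = neighbor_score - current_score
            # Worse moves are accepted with probability exp(delta / t).
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = neighbor, neighbor_score
                if current_score > best_score:
                    best, best_score = list(current), current_score
        t *= cooling  # illustrative geometric cooling, an assumption
    return best, best_score


clones = ["ACGTACGT", "ACGTTTTT", "GGGGACGT", "GGGGTTTT"]
probes, score = anneal(clones, ["ACGT", "GGGG", "TTTT", "CGTA"], k=2)
print(probes, score)
```

Running such a loop several times with different seeds and keeping the best result mirrors what the "Number of probe sets" parameter describes.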

Steepness of temperature decreasing

The program stops computing when the temperature falls below a predefined threshold. The temperature update formula is: t = t / (steepness + t) * t. Generally, the more smoothly the temperature decreases, the higher the accuracy. However, the temperature also determines how likely the program is to escape a local optimum. If the temperature decreases too smoothly, the program may stay stuck in a local optimum for too long and is less likely to reach the global optimum. The default value is 1000, which is an empirically good choice.

For more details about the probe design algorithm

Please read the paper:
Probe Selection Algorithms with Applications in the Analysis of Microbial Communities.
James Borneman, Marek Chrobak, Gianluca Della Vedova, Andres Figueroa and Tao Jiang
BIOINFORMATICS Vol. 17 Suppl. 1 2001, Pages S39-S48

Extended requirements:
Required probes

The program can take user-supplied probes as required probes and design additional probes so that the total number of probes in the set equals the "Number of probes" parameter. Note that the required probes must have the same length as the "Probe length" parameter.

Forbidden probes

The program can exclude user-supplied probes from the candidate probes. Note that the forbidden probes must have the same length as the "Probe length" parameter.
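The two extended requirements can be pictured as a preprocessing step before probe selection. The sketch below checks the length rule for both lists, removes forbidden and required probes from the candidate pool, and reports how many probes remain to be designed. The function name, its return values, and the error handling are assumptions of this example.

```python
def apply_probe_constraints(candidates, probe_length, num_probes,
                            required=(), forbidden=()):
    """Return (required probes, remaining pool, probes still to design)."""
    for probe in list(required) + list(forbidden):
        if len(probe) != probe_length:
            raise ValueError(
                f"probe {probe!r} does not match probe length {probe_length}"
            )
    # Drop duplicate required probes while keeping their order.
    required = list(dict.fromkeys(required))
    if len(required) > num_probes:
        raise ValueError("more required probes than the number of probes")
    excluded = set(forbidden) | set(required)
    pool = sorted(p for p in candidates if p not in excluded)
    return required, pool, num_probes - len(required)


fixed, pool, remaining = apply_probe_constraints(
    {"ACGT", "GGGG", "TTTT"}, probe_length=4, num_probes=3,
    required=["ACGT"], forbidden=["TTTT"],
)
print(fixed, pool, remaining)
```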

Output
See the resultant probe set on screen

This option lets the program finish quickly at the cost of some accuracy.

Obtain the resultant probe set via email

This option lets the program achieve better accuracy at the cost of a longer running time. You can close the browser during the computation without affecting it; the result is sent by email when the computation is done.

Copyright © by Oligonucleotide Fingerprinting of Ribosomal RNA Genes (OFRG) Group (zliu@cs.ucr.edu, qfu@cs.ucr.edu), 2002

Last Modified on 04/25/2013