More information is available at http://faculty.cse.tamu.edu/shsze/gcfinder.

INSTALLATION

1. Type ./install to install GCFinder.

2. Either move the executable files gcfinder, osynm and synm to a directory
   on the search path or add the current directory to the search path.

INPUT

The following files are needed:

1. A file listing the names of the files that define the homologous
   groups (see 3).  Each organism starts with a line containing ">" and
   its name, followed by one or more lines, each giving the file name for
   one chromosome.

   Example:

      >bsu
      bsu_cog

      >spy
      spy_cog

      >spn
      spn_cog

      >cac
      cac_cog

2. A file listing the names of the files that give the gene names (see 4).
   Each organism starts with a line containing ">" and its name, followed
   by one or more lines, each giving the file name for one chromosome.
   The order of the file names should be the same as in 1.

   Example:

      >bsu
      bsu_gname

      >spy
      spy_gname

      >spn
      spn_gname

      >cac
      cac_gname

3. Files that define the homologous groups.  In each row, the first number
   shows the position of the gene on the chromosome and the other numbers
   give the corresponding homologous group IDs.

   Example 1 (each gene belongs to one homologous group):

       1   593
       2   592
      19   2812
      20   718
      21   353
      25   4915
      26   3853
      27   1982

   Example 2 (each gene may belong to many homologous groups):

      38   73 143
      39   84
      40   3583 3584
      41   1658
      42   30
      46   1947
      47   503
      48   251
      49   2088
      50   1207
      51   462

4. Files that give the gene names. The first column shows the position
   of the gene on the chromosome and the second column gives the name of
   the gene.

   Example:

       1   BSU00010
       2   BSU00020
       3   BSU00030
       4   BSU00040
       6   BSU00060
       7   BSU00070
       9   BSU00090
      10   BSU00100
      11   BSU00110
      12   BSU00120
      13   BSU00130
      14   BSU00140
      15   BSU00150
      16   BSU00160
      17   BSU00170
      18   BSU00180
      19   BSU00190
      20   BSU00200
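
The input formats above can be checked before running GCFinder.  The
following Python sketch is not part of GCFinder; the function names are
hypothetical, and it assumes that fields are separated by whitespace, that
blank lines can be ignored, and that positions and group IDs are integers,
as in the examples above.

```python
def parse_file_list(text):
    """Map each organism name to its chromosome file names, in file order."""
    organisms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines only separate organisms
        if line.startswith(">"):
            current = line[1:].strip()
            organisms[current] = []
        elif current is None:
            raise ValueError("file name before any '>' line: " + line)
        else:
            organisms[current].append(line)
    return organisms


def parse_group_file(text):
    """Return (position, [group IDs]) pairs from a homologous group file."""
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        if fields:
            rows.append((int(fields[0]), [int(f) for f in fields[1:]]))
    return rows


def parse_gene_names(text):
    """Map each gene position to its gene name."""
    names = {}
    for raw in text.splitlines():
        fields = raw.split()
        if fields:
            names[int(fields[0])] = fields[1]
    return names


def lists_agree(group_list, name_list):
    """Check that both lists name the same organisms, in the same order,
    with the same number of chromosome files per organism."""
    if list(group_list) != list(name_list):
        return False
    return all(len(group_list[o]) == len(name_list[o]) for o in group_list)
```

Here lists_agree only compares organism names and file counts; it cannot
tell whether the chromosome files themselves are listed in matching order.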

USAGE

   gcfinder -h=filelist -g=gfilelist -t=1 -u=1 -a=2 -d=50 -e=1e-5 -o=result

Command line parameters:

   -h= "file name specifying files that define the homologous groups"
   -g= "file name specifying files that give the gene names"
   -t= "type of gene clusters"
       0 -- ordered clusters
       1 -- unordered clusters
   -u= "type of genomes"
       0 -- linear
       1 -- circular
   -a= "minimum number of genomes in which a gene cluster must appear"
   -d= "maximum size of gene clusters"
   -e= "e-value cutoff; only gene clusters with a lower e-value are returned"
   -o= "output file name"
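
When GCFinder is called from a script, the parameters above can be
assembled programmatically.  The following Python sketch is not part of
GCFinder; the function name and its keyword arguments are hypothetical,
and only the flags documented above are used.  The e-value is passed as a
string so that it reaches gcfinder exactly as written.

```python
def build_gcfinder_command(hfile, gfile, ordered=False, circular=True,
                           min_genomes=2, max_size=50, evalue="1e-5",
                           output="result"):
    """Return the gcfinder command line as a list of arguments."""
    if min_genomes < 1:
        raise ValueError("min_genomes must be at least 1")
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [
        "gcfinder",
        "-h=" + hfile,                        # list of homologous group files
        "-g=" + gfile,                        # list of gene name files
        "-t=" + ("0" if ordered else "1"),    # 0 ordered, 1 unordered
        "-u=" + ("1" if circular else "0"),   # 0 linear, 1 circular
        "-a=" + str(min_genomes),
        "-d=" + str(max_size),
        "-e=" + evalue,
        "-o=" + output,
    ]
```

With the defaults, build_gcfinder_command("filelist", "gfilelist") gives
the arguments of the example command above.  If gcfinder is on the search
path, the list can be passed to subprocess.run(cmd, check=True).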

OUTPUT

Each gene cluster begins with a line containing "Cluster:" followed by
the list of homologous group IDs in the cluster.

"Expect" gives the e-value.  "Size" gives the average size of the cluster
followed by the average number of genes in the chromosomes.  "Appear"
gives the number of chromosomes in which the cluster appears followed by
the total number of chromosomes.

The remaining lines show the specific genes on each chromosome.  The
leading string is the name of the organism.  "Chr:" shows the chromosome
number, "S" is the starting gene position, and "E" is the ending gene
position.  For each gene, the homologous group ID is given, followed by
its position and name in parentheses.

Example:

Cluster:  642 745 1136
        Expect = 8.6456e-06, Size = 3/2894, Appear = 4/4
        bsu_cog (Chr:1 S3301 E3326):  642 (3301, BSU33020) 642 (3320, BSU33210) 745 (3321, BSU33220) 1136 (3326, BSU33270)
        spy_cog (Chr:1 S1554 E1557):  642 (1554, SPy_2026) 745 (1555, SPy_2027) 1136 (1557, SPy_2031)
        spn_cog (Chr:1 S1525 E1546):  642 (1525, SP_1632) 745 (1526, SP_1633) 1136 (1546, SP_1653)
        cac_cog (Chr:1 S364 E366):  745 (364, CAC0371) 642 (365, CAC0372) 1136 (366, CAC0373)
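
For further processing, the output can be read back into a script.  The
following Python sketch is not part of GCFinder; the function name and the
dictionary keys are hypothetical, and the regular expressions assume that
every cluster is laid out exactly as in the example above.

```python
import re

HEADER = re.compile(
    r"Expect = ([^,\s]+), Size = ([^/\s]+)/([^,\s]+), Appear = (\d+)/(\d+)")
OCCURRENCE = re.compile(r"(\S+) \(Chr:(\d+) S(\d+) E(\d+)\):(.*)")
GENE = re.compile(r"(\d+) \((\d+), ([^)]+)\)")


def parse_clusters(text):
    """Parse GCFinder output text into a list of cluster dictionaries."""
    clusters = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Cluster:"):
            groups = [int(g) for g in line.split()[1:]]
            clusters.append({"groups": groups, "occurrences": []})
            continue
        if not clusters:
            continue  # ignore anything before the first cluster
        current = clusters[-1]
        header = HEADER.match(line)
        if header:
            current["expect"] = float(header.group(1))
            # averages are kept as floats in case they are not whole numbers
            current["size"] = (float(header.group(2)), float(header.group(3)))
            current["appear"] = (int(header.group(4)), int(header.group(5)))
            continue
        occ = OCCURRENCE.match(line)
        if occ:
            # each gene is (homologous group ID, position, gene name)
            genes = [(int(g), int(p), n)
                     for g, p, n in GENE.findall(occ.group(5))]
            current["occurrences"].append({
                "name": occ.group(1),
                "chr": int(occ.group(2)),
                "start": int(occ.group(3)),
                "end": int(occ.group(4)),
                "genes": genes,
            })
    return clusters
```

Lines that match none of the patterns are skipped silently, so a change in
the output layout shows up as missing fields rather than as an error.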

A list of maximal unordered gene clusters found in four bacterial genomes
(B. subtilis, S. pyogenes, S. pneumoniae and C. acetobutylicum) is in
bacteria/result, which was obtained by running the command in bacteria/run.

A list of maximal unordered gene clusters that appear in four yeast genomes
(S. cerevisiae, S. paradoxus, S. mikatae and S. bayanus) is in yeast/result,
which was obtained by running the command in yeast/run.

The maximum size of a gene cluster is constrained to be 50.