CODFISH calculates a set of codon usage statistics for a sequence using a specified codon usage table.


CodFish calculates a set of codon usage figures, based on a DNA sequence and (for some parameters) also on a codon usage table.

The program name is derived from its original intended use, to study codon usage in "fission yeast" (Schizosaccharomyces pombe).


This program was written by Peter Rice (Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (


Here is a session with CodFish:

  % codfish
   CODFISH uses nucleotide sequence data
   CODFISH of what sequence ?  GenEMBL:Spada5gen
                Start (* 1 *) ?  875
                End (* 2565 *) ?  2533
              Reverse (* No *) ?
   What should I call the output file (* spada5gen.cus *) ?
   What codon usage file (* pombecai.cod *) ?
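The session above restricts the analysis to bases 875 to 2533, a span of 1659 bases, or 553 complete codons. As a rough illustration of what such a range selection and codon tally involve, here is a minimal Python sketch. It is not CodFish's own code: the function names `select_range` and `count_codons` are hypothetical, the toy sequence is invented, and treating "Reverse" as the reverse complement strand is an assumption.

```python
from collections import Counter

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def select_range(seq, begin, end, reverse=False):
    """Return bases begin..end (1-based, inclusive) of seq.

    Assumption: reverse=True means the reverse complement strand.
    """
    if not 1 <= begin <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    sub = seq[begin - 1:end]
    if reverse:
        sub = sub.translate(_COMPLEMENT)[::-1]
    return sub


def count_codons(seq):
    """Count non-overlapping codons in frame 1; a trailing partial codon is ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


# Toy sequence for illustration only (not Spada5gen).
toy = "GGATGGCTGCTTAAGG"
coding = select_range(toy, 3, 14)
print(coding)
print(count_codons(coding))
```

A range whose length is not a multiple of three leaves a partial codon at the end; this sketch simply drops it, which may differ from what CodFish does.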


CodFish writes a text output file (file extension ".cus") containing a list of codon usage statistics.

Part of the output from the example is shown below:

  CODFISH of emfun:spada5gen from: 875 to: 2533 check: 8743
  Nc calculation
  Standard: 46.765
      EGCG: 46.972
  Codon Bias Index (CBI): 0.152
  Codon Adaptation Index (CAI): 0.339
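
If the statistics are needed in another program, the "label: value" lines shown above can be read back with a few lines of Python. This is only a sketch based on the excerpt shown here: the function name `parse_cus` is hypothetical, and a complete ".cus" file may contain lines this pattern does not handle.

```python
import re

# A label without colons, a colon, then a single number at the end of the line.
_STAT_LINE = re.compile(r"^\s*([^:]+):\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_cus(text):
    """Collect 'label: value' lines into a dict mapping label to float.

    Lines with more than one colon (such as the header line) and lines
    without a numeric value (such as 'Nc calculation') are skipped.
    """
    stats = {}
    for line in text.splitlines():
        match = _STAT_LINE.match(line)
        if match:
            stats[match.group(1).strip()] = float(match.group(2))
    return stats


sample = """\
  CODFISH of emfun:spada5gen from: 875 to: 2533 check: 8743
  Nc calculation
  Standard: 46.765
      EGCG: 46.972
  Codon Bias Index (CBI): 0.152
  Codon Adaptation Index (CAI): 0.339
"""

for label, value in parse_cus(sample).items():
    print(label, value)
```

The two Nc values are stored under the labels "Standard" and "EGCG" exactly as printed; the sketch does not try to attach them to the "Nc calculation" heading.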


The input files for CodFish are a nucleotide sequence file and a codon usage table in GCG format.
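
To show how a sequence and a codon usage table combine into a single figure, the sketch below computes a Codon Adaptation Index following the general definition of Sharp and Li (1987): each codon gets a weight equal to its count in the reference table divided by the count of the most-used synonymous codon, and the index is the geometric mean of those weights over the sequence. This is not necessarily how CodFish implements it. The reference table is a plain Python dict rather than a GCG-format file, the counts are invented, and the names `relative_adaptiveness` and `cai` are hypothetical.

```python
import math


def relative_adaptiveness(usage, synonymous):
    """Weight each codon by its count relative to the most-used synonym.

    usage:      dict mapping codon to its count in the reference table
    synonymous: dict mapping amino acid to its list of synonymous codons
    """
    weights = {}
    for codons in synonymous.values():
        top = max(usage.get(codon, 0) for codon in codons)
        if top > 0:
            for codon in codons:
                weights[codon] = usage.get(codon, 0) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the codons in seq.

    Simplification: codons missing from weights, or with weight zero,
    are skipped instead of being given a small pseudo-weight.
    """
    seq = seq.upper()
    logs = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        weight = weights.get(seq[i:i + 3])
        if weight:
            logs.append(math.log(weight))
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Invented reference counts for two amino acids, for illustration only.
synonymous = {"Ala": ["GCT", "GCC", "GCA", "GCG"], "Lys": ["AAA", "AAG"]}
usage = {"GCT": 40, "GCC": 20, "GCA": 10, "GCG": 10, "AAA": 30, "AAG": 60}

weights = relative_adaptiveness(usage, synonymous)
print(round(cai("GCTAAGGCCAAA", weights), 3))
```

Amino acids encoded by a single codon, and stop codons, carry no information about codon preference, so they are left out of the `synonymous` mapping and are skipped during scoring.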


The Wisconsin Package must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages the Wisconsin Package supports. See Chapter 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.


If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.


All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  Minimum syntax: % codfish [-INfile=]spgene.seq -Default
  Prompted Parameters:
  -BEGin=1 -END=2565        Sequence range
  -REVerse                  Reverses the first set of DNA sequences
  -WORDSize=15              Comparison word size (minimum match)
  -OUTfile=seqname.cus      Output file name
  -OUTfile2=seqname.cuf     Codon usage table output name
  Local Data Files: None
  Optional Parameters:


The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.


The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.


-OUTfile=seqname.cus

sets the name of the main output file.


-OUTfile2=seqname.cuf

sets the name of the ".cuf" output file.

Printed: April 22, 1996 15:52 (1162)