


CpGReport looks for potential CpG islands in a nucleotide sequence.


CpGReport scans a nucleotide sequence for regions with higher than expected frequencies of the dinucleotide CG.


This program was originally written by Gos Micklem (E-mail: gos@sanger.ac.uk; Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).

This version was modified for inclusion in EGCG by Rodrigo Lopez S. (E-mail: rodrigol@biotek.uio.no; Post: Biotechnology Centre of Oslo, PO Box 1125 Blindern, N-0317 Oslo 3, Norway).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).


Here is a sample session with CpGReport:

  % cpgreport
   CPGREPORT uses nucleotide sequences
   CPGREPORT of what sequence(s)  ? GenEMBL:hsmed
   What score for CpG (* 17 *) ?  28
   What should I call the output file (* hsmed.cpg *) ?
                Start (* 1 *) ?
                End (* 5292 *) ?


The output from CpGReport is a simple report of hits in the sequence.

  CPGREPORT of EM_PR:HSMED check: 3986 from 1 to 5292
  Sequence              Begin    End Score        CpG   %CG  CG/GC ..
  EM_PR:HSMED              35     79    43          3  62.2   1.50
  EM_PR:HSMED             131   5070  2485        256  58.4   0.66
  EM_PR:HSMED            5100   5101    28          1 100.0    -
  EM_PR:HSMED            5224   5229    53          2 100.0   2.00
  EM_PR:HSMED            5277   5278    28          1 100.0    -
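
The columns in the sample report can be approximated from a region's bases alone. The Python sketch below assumes that CpG is the count of CG dinucleotides, that %CG is the percentage of bases that are C or G (28 of 45 bases gives the 62.2 in the first row), and that CG/GC is the ratio of CG to GC dinucleotide counts, printed as "-" when the region contains no GC. These readings are inferred from the sample output, not taken from the program source, and `region_stats` is a hypothetical helper name.

```python
def region_stats(region: str) -> dict:
    """Summarise one region in the style of the sample report (assumed column meanings)."""
    seq = region.upper()
    # Look at every adjacent pair of bases so overlapping dinucleotides are counted.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    cpg = pairs.count("CG")
    gpc = pairs.count("GC")
    # %CG: share of bases that are C or G, as a percentage.
    pct_cg = 100.0 * sum(base in "CG" for base in seq) / len(seq) if seq else 0.0
    # CG/GC: undefined (None) when the region has no GC dinucleotide.
    ratio = cpg / gpc if gpc else None
    return {"cpg": cpg, "pct_cg": round(pct_cg, 1), "cg_gc": ratio}
```

For example, `region_stats("CG")` reports one CpG, a %CG of 100.0 and no defined ratio, which has the same shape as the two-base rows in the report.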



The input file for CpGReport is a GCG nucleotide sequence file.


All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  Minimum Syntax: % cpgreport [-INfile=]GenEmbl:hsldlr02 -Default
  Prompted Parameters:
  -BEGin=1 -END=100           Range of interest
  -CPGSCORE=17                Score for a CG sequence.
  [-OUTfile=]hsldlr02.cpg     Output file
  Local Data Files: None
  Optional Parameters: None
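
When CpGReport is driven from a script, the qualifiers in the summary above can be assembled programmatically. The Python sketch below only builds the argument list and does not run the program; `build_cpgreport_args` is a hypothetical helper, and the qualifier spellings are copied from the summary.

```python
def build_cpgreport_args(infile, cpgscore=17, outfile=None, begin=None, end=None):
    """Build a cpgreport argument list from the qualifiers in the summary."""
    args = ["cpgreport", f"-INfile={infile}", f"-CPGSCORE={cpgscore}"]
    # Range of interest: only added when the caller restricts the sequence.
    if begin is not None:
        args.append(f"-BEGin={begin}")
    if end is not None:
        args.append(f"-END={end}")
    # Output file: omitted to accept the program's default name.
    if outfile is not None:
        args.append(f"-OUTfile={outfile}")
    # -Default accepts the default for anything not given explicitly.
    args.append("-Default")
    return args
```

A call such as `build_cpgreport_args("GenEmbl:hsldlr02", cpgscore=28)` yields a list that can be passed to whatever process launcher the local installation uses.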


The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.


-CPGSCORE=17 sets the score for each CG sequence found. The default of 17 is more sensitive than higher values, but 28 has also been used with some success.
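
The scores in the sample report are consistent with a simple running sum in which each CG adds the CpG score and every other position subtracts 1. For the region from 35 to 79, three CGs at a score of 28 and 41 other positions give 3 x 28 - 41 = 43. The Python sketch below scores a single region under that assumption. It is inferred from the sample output rather than from the program source, it does not attempt to find region boundaries, and `running_sum_score` is a hypothetical name.

```python
def running_sum_score(region: str, cpg_score: int = 17) -> int:
    """Score one region with an assumed running sum: +cpg_score per CG, -1 otherwise."""
    seq = region.upper()
    total = 0
    # Visit every position that can start a dinucleotide.
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            total += cpg_score  # a CG starts here
        else:
            total -= 1          # any other position costs one point
    return total
```

Under this scheme a lower CpG score lets the sum fall to zero sooner between CGs, while a higher score lets more widely spaced CGs be bridged into a single region.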

Printed: April 22, 1996 15:52 (1162)