


PepCount reports the number of occurrences of residues at a given position in protein sequences.


PepCount is a specialized program to analyze the occurrences of amino acids in the first few residues of selected protein sequences. It could be extended into a more general program if there is sufficient interest.


This program was written by Peter Rice (Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail.


Here is a sample session with PepCount:

  % pepcount
   PEPCOUNT uses protein sequences
   PEPCOUNT of what sequence(s) ?  Sw:*_Human
   What should I call the output file (* 143b_human.anti *) ?


The output from PepCount is a simple report of how many times each residue occurs at the selected position in the input sequences.

  PEPCOUNT of sw:*_human at position 1
  Initial Met ignored
  All entries considered
  A        595
  B          1
  C         37
  D        202
  E        258
  F         44
  G        278
  H         37
  I         59
  K        137
  L        153
  M         43
  N         87
  P        156
  Q        116
  R        196
  S        292
  T        130
  V        131
  W         45
  X         18
  Y         51
  Z          1
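
The counting behind a report like the one above can be sketched in a few lines of Python. This is only an illustration, not PepCount's actual source: the function names, the in-memory list of sequences, and the choice to print only residues that occur at least once are all assumptions. The defaults mirror the sample report ("Initial Met ignored", "All entries considered").

```python
from collections import Counter


def count_residues(sequences, position=1, skip_met=True, all_entries=True):
    """Count the residue found at a 1-based position across sequences.

    Hypothetical helper: the parameter names are illustrative and are not
    taken from the PepCount source code.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        starts_with_met = seq.startswith("M")
        if not all_entries and not starts_with_met:
            continue  # only entries starting with Met are wanted
        if skip_met and starts_with_met:
            seq = seq[1:]  # drop the initial Met before indexing
        if len(seq) >= position:
            counts[seq[position - 1]] += 1
    return counts


def format_report(counts):
    """Lay the counts out like the sample report (assumed column widths)."""
    return "\n".join(f"  {res}{n:>11}" for res, n in sorted(counts.items()))


if __name__ == "__main__":
    demo = ["MAKL", "MQRS", "AKLM", "MA"]  # made-up sequences
    print(format_report(count_residues(demo)))
```

Sequences that are too short to reach the requested position are simply skipped in this sketch; how PepCount itself treats them is not stated in this document.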


The input file for PepCount is a set of GCG protein sequences.


All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  Minimum Syntax: % pepcount [-INfile=]Sw:*_human -Default
  Prompted Parameters:
  [-OUTfile=]sw.count         Output file
  Local Data Files: None
  Optional Parameters:
  -POSition=1                 Count residues at this position
  -NOSKIPmet                  Include initial Met residues
  -NOALLentries               Ignore entries that don't start with Met
  -STARTres=Q                 Save sequences starting with Q (Gln)
  -FOSNfile=sw.fil            Save sequences in FOSN file
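
The abbreviation rule described above (the capitalized letters are the minimum you must type) can be illustrated with a small Python sketch. It is not GCG's own parser: the helper names are hypothetical, and it assumes that matching is case-insensitive and that any longer prefix of the full qualifier name is also accepted.

```python
def required_prefix(qualifier):
    """Return the leading capitalized letters, e.g. 'POS' for 'POSition'."""
    prefix = []
    for ch in qualifier:
        if not ch.isupper():
            break
        prefix.append(ch)
    return "".join(prefix)


def matches(typed, qualifier):
    """True if `typed` is an acceptable abbreviation of `qualifier`.

    Assumption: the typed text must contain at least the capitalized
    letters and must not run past the full qualifier name.
    """
    typed_upper = typed.upper()
    full_upper = qualifier.upper()
    return (typed_upper.startswith(required_prefix(qualifier))
            and full_upper.startswith(typed_upper))


if __name__ == "__main__":
    for typed in ("po", "pos", "posit", "position"):
        print(typed, matches(typed, "POSition"))
```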


The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.


-POSition=1

counts the residues at position 1.


-NOSKIPmet

includes an initial Methionine in the calculation. Normally an initial Methionine (if present) is skipped.


-NOALLentries

counts only entries that start with Methionine. Normally all entries are counted, as the sample report shows ("All entries considered").


-STARTres=Q

saves the names of all entries that have "Q" (glutamine) at the first position (after removal of an initial Methionine). The -FOSNfile qualifier is also required to define the output file name.


-FOSNfile=sw.fil

names the output file for entries selected by -STARTres.
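
As a rough illustration of the selection that -STARTres and -FOSNfile describe, the following Python sketch picks out entry names by their first residue. It is a sketch under stated assumptions, not the program's implementation: the function name, the (name, sequence) pairs, and the example entry names are hypothetical, and the real layout of the saved file is not shown in this document.

```python
def select_by_start(entries, start_res="Q", skip_met=True):
    """Return names of entries whose first counted residue is start_res.

    `entries` is an iterable of (name, sequence) pairs. With skip_met
    set, an initial Met is removed before the first residue is tested.
    """
    wanted = start_res.upper()
    selected = []
    for name, seq in entries:
        seq = seq.upper()
        if skip_met and seq.startswith("M"):
            seq = seq[1:]
        if seq.startswith(wanted):
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Made-up entry names and sequences, for illustration only.
    demo = [("example1", "MQRS"), ("example2", "QAAA"), ("example3", "MAKL")]
    for name in select_by_start(demo):
        print(name)
```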

Printed: April 22, 1996 15:54 (1162)