Seals


SEALS (A System for Easy Analysis of Lots of Sequences) is a software package expressly designed for large-scale research projects in bioinformatics. Using a command-line user interface, SEALS provides dozens of commands to help the user quickly implement standard sequence analysis protocols, design new investigations, and generally Get Things Done with dispatch.

Seals Home Page

Seals Online Documentation


Installing Seals

To set up your account to run Seals on PMGM, you need to change the .login and .tcshrc files in your CMGM/PMGM account.

Running Seals

Step 1. Log in to PMGM and type:

 activate_seals

Step 2. Make sure your sequences are in fasta format. You can use the GCG program tofasta to convert them. If you use Seals to extract a sequence from the databases, it will already be in the correct format.
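Before passing your own files to a Seals command, it may help to confirm that they really are in fasta format. The following is a minimal sketch and is not part of Seals or GCG: the helper name is_fasta is hypothetical, and the only thing it checks is that the first non-blank line of the file begins with ">".

```shell
#!/bin/sh
# is_fasta FILE: succeed if the first non-blank line of FILE starts with ">".
# Hypothetical helper for illustration; it does not validate the sequence data.
is_fasta() {
    first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
    case "$first" in
        ">"*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Report which of the files named on the command line still need converting.
for f in "$@"; do
    if is_fasta "$f"; then
        echo "$f: looks like fasta"
    else
        echo "$f: not fasta, convert it first (for example with tofasta)"
    fi
done
```

Saved under an assumed name such as check_fasta.sh, it could be run as "sh check_fasta.sh myseqs.txt other.seq".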

Step 3. Select the Seals analysis you want to run. You can use the UNIX pipe character " | " to pass the results from one program to the next. For example, if you wanted to

  1. extract sequence with gi number 10000 from GenBank,
  2. run Gapped Blast on the GenPept protein database,
  3. only show results with a p value <= .001, and
  4. save the results as a library of fasta sequences

you would use this command:

 gi2fasta 10000 | splishpgp genpept | blast2blast -pcut=.001 | blast2bounded
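To repeat the same analysis for several gi numbers, one option is to generate the command lines with a small script and review them before running anything. This is only a sketch under stated assumptions: the helper name seals_cmd and the output file names gi_<number>.fasta are made up for illustration, it assumes that blast2bounded writes its result to standard output so that it can be redirected to a file, and it assumes splishpgp accepts other protein database names in the same position as genpept. The script only prints text; it does not call any Seals program itself.

```shell
#!/bin/sh
# seals_cmd GI [DATABASE] [PCUT]: print the example pipeline for one gi number.
# Hypothetical helper; the program names are taken from the example above.
seals_cmd() {
    gi=$1
    db=${2:-genpept}      # database to search (default: genpept)
    pcut=${3:-.001}       # p value cutoff passed to blast2blast
    # Assumption: blast2bounded writes to standard output.
    echo "gi2fasta $gi | splishpgp $db | blast2blast -pcut=$pcut | blast2bounded > gi_$gi.fasta"
}

# Print one command line per gi number given on the command line.
for gi in "$@"; do
    seals_cmd "$gi"
done
```

Saved under an assumed name such as seals_batch.sh, "sh seals_batch.sh 10000 10001" prints two command lines, which can then be pasted into a session where activate_seals has already been run.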


Seals Databases Available

Several databases have been downloaded from NCBI and reformatted for use with the Blast and fasta software that Seals uses.

Database name   What it contains

genembl         All non-redundant GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS, or HTGS sequences)
genpept         All non-redundant GenBank CDS translations
swissprot       The last major release of the SWISS-PROT protein sequence database (no updates)
gb_new          All new GenBank sequences since the last release
swplus          SwissProt + EMBL translations
pir             PIR protein database

E. coli genomic CDS translations

E. coli genomic nucleotide sequences

Database of mitochondrial sequences (Rel. 1.0, July 1995)

Yeast (Saccharomyces cerevisiae) protein sequences

Yeast (Saccharomyces cerevisiae) genomic nucleotide sequences

Patent Database

Protein sequences derived from the Patent division of GenBank

Nucleotide sequences derived from the Patent division of GenBank

Genome Survey Sequence, includes single-pass genomic data, exon-trapped sequences, and Alu PCR sequences.

High Throughput Genomic Sequences

Non-redundant Database of GenBank+EMBL+DDBJ EST Divisions

Non-redundant Database of Human GenBank+EMBL+DDBJ EST sequences

Non-redundant Database of Mouse GenBank+EMBL+DDBJ EST sequences

Non-redundant Database of all other organisms GenBank+EMBL+DDBJ EST sequences