The Transcription Factor Database (TFD) is a database of the DNA recognition sequences for eukaryotic and prokaryotic sequence-specific transcription factors.
Here are some example entries from the TFD database:
Name - Sequence - Comments..
UAS(G)-pMH100 CGGAGTACTGTCCTCCG ! J Mol Biol 209: 423-32 (1989)
TFIIIC-Xls-50 TGGATGGGAG ! EMBO J 6: 3057-63 (1987)
HSE_CS_inver0 CTNGAANNTTCNAG ! Cell 30: 517-28 (1982)
ZDNA_CS 0 GCGTGTGCA ! Nature 303: 674-9 (1983)
GCN4-his3-180 ATGACTCAT ! Science 234: 451-7 (1986)
GCG uses this format for all of its pattern-matching programs. For example, the restriction enzyme and Prosite files are in this format. This means that you can use programs such as Map, Motifs and FindPatterns to search this database.
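To make the entry layout concrete, here is a minimal Python sketch that splits one entry line into its name, sequence pattern and literature reference. It assumes the layout seen in the example entries (a name, optional numeric fields, a pattern written in IUPAC nucleotide codes, then "!" followed by the reference). The function name `parse_entry` is made up for illustration, and patterns that use other GCG pattern syntax are not handled.

```python
import re

# A pattern made only of IUPAC nucleotide codes (assumption: no other syntax).
IUPAC_ONLY = re.compile(r"^[ACGTRYSWKMBDHVN]+$", re.IGNORECASE)

def parse_entry(line):
    """Return (name, pattern, reference) for one entry line, or None."""
    fields, sep, comment = line.partition("!")
    if not sep:
        return None  # no "!" separator, so not an entry line
    tokens = fields.split()
    if len(tokens) < 2:
        return None
    name = tokens[0]
    # Skip numeric fields; the first IUPAC-only token after the name
    # is taken to be the recognition sequence.
    for token in tokens[1:]:
        if IUPAC_ONLY.match(token):
            return name, token.upper(), comment.strip()
    return None

print(parse_entry("GCN4-his3-180 ATGACTCAT ! Science 234: 451-7 (1986)"))
```

Lines without a "!" separator, such as the column header, are skipped rather than guessed at.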
There isn't much annotation in this database. If you need to find out more about these sites, you have to look up the original literature reference. You can also go to the main TFD gopher site and search for the name of the factor to get a bit more information.
The TFD database is stored in the genrundata subdirectory on PMGM. When the program asks which database to use, type in the following:
Program | What to type in
Map | map -data=genrundata:tfd.dat
Motifs | motifs -data=genrundata:tfd.dat
FindPatterns | findpatterns -data=genrundata:tfd.dat
You can copy this file to your own directory using Fetch, and modify the database. You could also use Findpatterns to search for a specific sequence pattern.
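As a rough illustration of what searching for a sequence pattern involves, the Python sketch below expands the IUPAC ambiguity codes used in the entries (for example N for any base, K for G or T) into a regular expression and scans both strands. This is not how the GCG programs are implemented; the function names are made up for illustration, and only plain IUPAC patterns are supported.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def reverse_complement(seq):
    """Reverse-complement a sequence or IUPAC pattern."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def pattern_to_regex(pattern):
    """Expand each IUPAC code into its character class."""
    return "".join(IUPAC[base] for base in pattern.upper())

def find_sites(pattern, sequence):
    """Return sorted (start, strand) hits; start is 1-based on the top strand."""
    sequence = sequence.upper()
    hits = set()
    for strand, pat in (("+", pattern), ("-", reverse_complement(pattern))):
        # A lookahead lets overlapping matches be reported as well.
        regex = re.compile("(?=(" + pattern_to_regex(pat) + "))")
        for match in regex.finditer(sequence):
            hits.add((match.start() + 1, strand))
    return sorted(hits)

# First 60 bases of test.seq and the C/EBP_CS1 pattern from tfd.dat.
top = "ATTACCCCAGAGATTCACCAGAGATTCCAGATACCAGAGACTACCCATTTACCCGAGGGG"
print(find_sites("TKNNGYAAK", top))  # [(1, '-')]
```

Here the pattern matches only on the bottom strand, starting at base 1 of the top strand.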
Here is an example of the output you will get using the Map program:
(Linear) MAP of: test.seq check: 2851 from: 1 to: 117
REFORMAT of: test.seq check: 2851 from: 1 to: 117 April 3, 1997 11:06
(No documentation)
Using Enzyme data from: genrundata:tfd.dat FileCheck: 2301
This file is a composite from the following datasets:
TFD (release 7.6) SITES dataset file, 2/97
Transfac (release 3.1) SITES dataset selected entries, 3/97
References: Nucleic Acids Res 21, 3117-8 (1993).
Nucleic Acids Res 24, 238-41 (1996).
In Transcription Factors: Essential Data (Chichester UK: J Wiley and Sons),
With 5092 enzymes: *
December 23, 1997 22:04 ..
LyF/Ikaros_site
C/EBP_CS1 GATA-1_CS2 REB1-consensus |
| | | |
ATTACCCCAGAGATTCACCAGAGATTCCAGATACCAGAGACTACCCATTTACCCGAGGGG
1 ---------+---------+---------+---------+---------+---------+ 60
TAATGGGGTCTCTAAGTGGTCTCTAAGGTCTATGGTCTCTGATGGGTAAATGGGCTCCCC
STE6.2
UBP1_RS |
GAGA-en | |
GAGA_box/CT_element | |
NIT2-niaD-niiA_(1) | | |
Knirps_site| | | |
AP-2_CS4 GATA-1_CS2 || | | |
| | || | | |
GAAAAAAAAATTAGACCCCAGGATTTAGATACCCAGAGAGAGATTTACACCATATTA
61 ---------+---------+---------+---------+---------+------- 117
CTTTTTTTTTAATCTGGGGTCCTAAATCTATGGGTCTCTCTCTAAATGTGGTATAAT
Enzymes that do cut:
C/EBP_CS1 UBP1_RS AP-2_CS4 STE6.2 GAGA-en GATA-1_CS2 NIT2-niaD-niiA_(1)
REB1-consensus GAGA_box/CT_element Knirps_site LyF/Ikaros_site
To find out more about the different transcription factors, you can search the database file for information about them. For example, to get information about the "C/EBP_CS1" site, type in the commands shown after the prompts below:
pmgm:~ 58% to genrundata
/gcgv9/gcgcore/data/rundata
pmgm:/gcgv9/gcgcore/data/rundata 59 % grep C/EBP_CS1 tfd.dat
C/EBP_CS1 0 TKNNGYAAK 0 ! C/EBP Genes Dev 1: 133-46 (1987)
This will move you over to the directory that has the TFD database, and then you use the "grep" command to search through the database for the name "C/EBP_CS1". The result you get back will be the literature reference that talks about that site. You now need to visit the library.
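If you have copied tfd.dat to your own directory, the same lookup can be sketched in Python. One difference from grep: this version compares the whole first field, so a name that merely contains "C/EBP_CS1" is not returned. The function names and the local file name in the commented usage are assumptions for illustration.

```python
def lookup_site(name, lines):
    """Return the entry lines whose first field is exactly `name`."""
    matches = []
    for line in lines:
        fields = line.split()
        if fields and fields[0] == name:
            matches.append(line.rstrip("\n"))
    return matches

def reference_of(entry):
    """Return the text after the "!" separator (the literature reference)."""
    return entry.partition("!")[2].strip()

# Usage, assuming a local copy named tfd.dat:
# with open("tfd.dat") as handle:
#     for entry in lookup_site("C/EBP_CS1", handle):
#         print(reference_of(entry))
```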
Searching for Regulatory Elements with GCG