Computational Molecular Biology

Molecular Biology Databases

October 1, 2009

Doug Brutlag

Homework Assignment Number 2

Describe your searches and intermediate steps for each of the following tasks. Do NOT send verbatim transcripts. Summarize your work and the results.

1) In the documentation for each protein database (NCBI Protein, UniProt (AKA Swiss-Prot), and PDB), find a document (URL) that lists the attributes of that database's records. Which protein attributes or features are common to all three databases? Which attributes are unique to each database?

2) A) What fraction of the complete yeast genome (Saccharomyces cerevisiae) encodes internal-membrane protein sequences according to the NCBI Protein database? (By internal-membrane proteins I mean proteins internal to a membrane, not membranes internal to the cell.)

B) What fraction of the complete yeast genome (Saccharomyces cerevisiae) encodes internal-membrane protein sequences according to the UniProt (Swiss-Prot) protein database?

C) Please explain any difference between these two results.
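
As an optional aid for question 2, the arithmetic and the counting query can be sketched in Python. This is only a sketch under stated assumptions: the query terms shown are illustrative placeholders rather than the recommended search, the helper names count_query_url and fraction are made up for this example, and the script builds an NCBI E-utilities ESearch URL without contacting the server. Choosing and justifying the actual search terms is still part of the assignment.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_query_url(term, db="protein"):
    """Build an ESearch URL that asks only for the number of matching records."""
    params = {"db": db, "term": term, "rettype": "count"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def fraction(subset_count, total_count):
    """Return subset_count / total_count, rejecting a non-positive total."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return subset_count / total_count


# Illustrative query terms (assumptions, not the expected answer):
# refine the field tags and keywords yourself.
all_yeast = '"Saccharomyces cerevisiae"[Organism]'
membrane_yeast = all_yeast + " AND membrane[All Fields]"

if __name__ == "__main__":
    # Print the two URLs; open them in a browser to read the counts.
    print(count_query_url(all_yeast))
    print(count_query_url(membrane_yeast))
```

Once both counts are in hand, fraction(membrane_count, total_count) gives the value to report for each database. Running the same pair of counts against UniProt makes the comparison asked for in part C explicit.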

3) Using the Entrez Structure browser, find the nearest structural neighbors to the Maltodextrin-Binding Protein (1OMP). How many total neighbors do you find in the database? How many of these represent unique proteins?

4) Use the NCBI Protein database or the UniProt SRS server (at ExPASy) to find a group of protein sequences (between 10 and 100) that you would be interested in analyzing in future homework assignments. Please send us a list of just the DEFINITION or ID lines for the proteins you choose.
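
If the chosen records are saved in GenBank/GenPept flat-file format, the DEFINITION lines can be pulled out with a few lines of Python. The sketch below assumes that format (a DEFINITION keyword at the start of a line, with indented continuation lines); the function name definition_lines and the two sample records are invented purely for illustration and are not real database entries.

```python
def definition_lines(flatfile_text):
    """Collect the DEFINITION field of each record, joining continuation lines."""
    definitions = []
    current = None  # parts of the DEFINITION field being read, if any
    for line in flatfile_text.splitlines():
        if line.startswith("DEFINITION"):
            if current is not None:
                definitions.append(" ".join(current))
            current = [line[len("DEFINITION"):].strip()]
        elif current is not None and line.startswith(" ") and line.strip():
            # Indented line: continuation of the DEFINITION field.
            current.append(line.strip())
        elif current is not None:
            # Any other line ends the DEFINITION field.
            definitions.append(" ".join(current))
            current = None
    if current is not None:
        definitions.append(" ".join(current))
    return definitions


# Made-up sample records, for illustration only.
sample = """LOCUS       EXAMPLE_1                100 aa
DEFINITION  hypothetical example protein, first line
            continued on a second line.
ACCESSION   EXAMPLE_1
//
LOCUS       EXAMPLE_2                200 aa
DEFINITION  another made-up protein.
ACCESSION   EXAMPLE_2
//
"""

if __name__ == "__main__":
    for text in definition_lines(sample):
        print(text)
```

For UniProt flat files the same idea applies to lines beginning with ID, which do not use continuation lines.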

Please send summaries of your results to homework218@cmgm.Stanford.EDU.

Due October 8, 2009.
