Wu, T. D. and Brutlag, D. L. (1996). Discovering Empirically Conserved Amino Acid Substitution Groups in Databases of Protein Families. ISMB-96, 3, 230-240.
Stanford University School of Medicine, Stanford
This paper introduces a method for identifying amino acid substitution groups that are conserved empirically in aligned positions from databases of protein families. Existing approaches view amino acid substitution as a pairwise phenomenon and characterize it using substitution matrices. In contrast, the method presented here identifies subsets of amino acids that are conserved empirically using a conditional distribution matrix, which contains entries for every combination of individual amino acids and subsets of amino acids. Each row in the conditional distribution matrix contains the distribution of amino acids in those aligned positions that contain a given subset of amino acids. The algorithm converts a database of protein families into a conditional distribution matrix and then examines each possible substitution group for evidence of conservation. A substitution group is empirically conserved when it has characteristics of compactness and isolation, meaning that amino acids within the group substitute for one another at a higher frequency than amino acids outside the group. The algorithm is applied to the BLOCKS and HSSP databases. Twenty amino acid substitution groups are found to be conserved empirically in both databases. These groups provide insight into biochemical properties that are conserved in protein evolution.
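The idea of a conditional distribution matrix can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an aligned position "contains" a subset when every amino acid in the subset appears in that column, and it replaces the paper's test for compactness and isolation with a deliberately crude stand-in. The function names (`conditional_distribution`, `conditional_matrix`, `looks_conserved`), the `max_size` cap on subset size, and the toy columns are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def conditional_distribution(columns, subset):
    """Distribution of amino acids over the aligned columns that contain
    every residue in `subset` (assumed reading of "contain a given subset")."""
    subset = frozenset(subset)
    counts = Counter()
    for col in columns:
        if subset <= set(col):
            counts.update(col)  # count every residue in the matching column
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


def conditional_matrix(columns, max_size=2):
    """One row per subset of the observed alphabet, up to `max_size` residues.
    The cap is an assumption made here only to keep the toy example small."""
    alphabet = sorted(set("".join(columns)))
    return {
        frozenset(s): conditional_distribution(columns, s)
        for k in range(1, max_size + 1)
        for s in combinations(alphabet, k)
    }


def looks_conserved(columns, group):
    """Crude stand-in for compactness and isolation (not the paper's test):
    in the row conditioned on each member, every other member of the group
    must be more frequent than every amino acid outside the group."""
    group = frozenset(group)
    outside = set("".join(columns)) - group
    for a in group:
        row = conditional_distribution(columns, {a})
        inside_min = min((row.get(b, 0.0) for b in group - {a}), default=0.0)
        outside_max = max((row.get(c, 0.0) for c in outside), default=0.0)
        if inside_min <= outside_max:
            return False
    return True


# Toy aligned positions, one string per alignment column (made-up data).
columns = ["IVLI", "VLIV", "LIVL", "DEDE", "EDDE"]
matrix = conditional_matrix(columns)
print(matrix[frozenset("DE")])
print(looks_conserved(columns, "IVL"), looks_conserved(columns, "ID"))
```

On this toy input the group I, V, L passes the stand-in check and the group I, D does not, because D never appears in a column that contains I. A real database would call for a statistical criterion rather than a strict inequality on raw frequencies.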