CMU-CS-02-195
Computer Science Department
School of Computer Science, Carnegie Mellon University




Rapid Protein Structure Detection and Assignment
using Residual Dipolar Couplings

Michael A. Erdmann, Gordon S. Rule

December 2002



Keywords: Protein structure, NMR, residual dipolar coupling, resonance assignment, structural homology


Motivation: High-throughput structural proteomics requires fast, robust algorithms for extracting protein structure from sparse experimental data. Current approaches are too slow. Determining the 3D structure of an unknown protein may require 6-12 months, mainly for data interpretation. Determining ligand-induced changes in the structure of a previously known protein may still require weeks of effort. This second problem is of great interest to drug designers, and is our main focus in this paper. A key step is the resonance assignment problem, in which observed NMR peaks must be matched to a protein's atoms.

Contributions: This paper describes two novel procedures, together called PEPMORPH, for inferring structure and assigning resonances: (1) a method for extracting combinatorial protein substructures directly from sparse NMR experiments; (2) a method for matching experimental substructures to known substructures by exploiting the orientational constraints of residual dipolar couplings (RDCs). PEPMORPH reverses the traditional approach, in which NMR resonances are assigned prior to structure determination. As a result, PEPMORPH increases the information available during assignment, speeding up the overall process.

Results: We have tested PEPMORPH on a variety of real proteins deposited in the Protein Data Bank (PDB), using standard synthetic NMR data with a variety of noise levels, and on one protein (Rho130) using real 15N NOESY data and synthetic RDC data. PEPMORPH assigns a very high fraction of the resonances correctly and flags those resonances that cannot be assigned uniquely because of significant structural change. PEPMORPH runs in O(n^3) time, where n is the number of amino acids in the protein, requiring minutes for moderately sized (20-35 kDa) proteins on a 1 GHz PC.

16 pages

