Page 140 - Academic Press Encyclopedia of Physical Science and Technology 3rd BioChemistry
Encyclopedia of Physical Science and Technology EN013H-614 July 27, 2001 10:29
Protein Folding
spontaneously. Thus, all of the information necessary for biological activity is contained in the simple sequence of amino acids as encoded by the DNA. Practically speaking, predicting protein structure, stability, and function from the primary sequence will open myriad opportunities in the areas of medicine (e.g., drug discovery and understanding the molecular basis of disease), industry and manufacturing (e.g., biocatalysis and bioprocessing), and the environment (e.g., bioremediation).

Proteins are linear polymers of amino acids that are linked through amide linkages, commonly called peptide bonds. The “backbone” atoms include the amide linkages separated by a carbon that is derivatized by any one of 20 common side chains. The side chains may be grouped at neutral pH as acidic, basic, hydrophobic, and uncharged hydrophilic according to their chemical nature. Thus, although the backbone of the peptide polymer is a repeating identical unit, the side chains and their distinct properties dictate the nature of the protein. Because a subset of the amino acid side chains is charged at neutral pH (acidic side chains are negative and basic side chains are positive), the protein polymer is a polyelectrolyte. The linear sequence of amino acids is called the primary structure of the protein (Fig. 1). The primary structure dictates the way in which the polypeptide folds into a functional protein, in most cases without instructions from other sources.

Protein families are proteins related by structure or function. A protein family may be structurally diverse but have a particular cluster of amino acids at the active site that defines the class according to some catalytic function (e.g., dehydrogenases and kinases). Alternatively, proteins may have a structural motif that defines the class (e.g., helix–loop–helix motif of the EF-hand calcium-binding proteins). Proteins with identical function in different organisms often have slightly different primary structures (see below). The presence of certain amino acids relative to others in primary sequences allows putative protein sequences from the Human Genome Project, for example, to be classified into general protein families. Whether this initial classification is valid remains to be seen.

To discover the rules of protein folding, two major approaches have emerged: computational and empirical. The computational approach, often termed proteonomics, attempts to predict the structure of a protein based on its sequence by defining a set of rules and criteria for their application. This topic is covered elsewhere in this series. The empirical approach to discovering the rules of protein folding defines global rules for folding based on lessons learned from particular proteins. These two methods are distinctly interwoven. Hypotheses derived from one are testable through the other. In this paper, we will discuss the empirical approach to studying protein folding.

The empirical approach to understanding protein folding has relied heavily on mutational analysis. As mentioned earlier, proteins from different species with identical functions may have slightly different amino acid sequences, or mutations. Often the mutations are conservative, particularly in amino acids that are critical to the structure or function of the protein. Scientists study the different physical properties of these related proteins to gain insight into the role of amino acids in local or global structure and function of the protein. Often mutations are purposely engineered into protein sequences using molecular biological techniques to test hypotheses about roles of certain amino acids in structure or function. Selective substitution of tryptophan into a sequence allows placement of a convenient spectroscopic probe (see below).

Although proteins are very diverse, the one thing that almost all have in common is that they spontaneously adopt a unique and stable tertiary structure. This is an utter miracle of nature given the complexity of these heterogeneous polymers. The study of protein folding is focused on understanding the rules that govern the transition into and the stability of this unique fold. The transition into the tertiary structure is studied by kinetic methods. Thus, kinetic studies ask the question, “By what pathway is the final tertiary structure folded?” Alternatively, equilibrium thermodynamic methods ask “How stable is the final fold and why?” Each of these approaches will be discussed individually.

II. STABILITY OF THE TERTIARY FOLD

Stability of a protein is usually studied by observing the energetics of unfolding transitions given by the equations below:

$$N \rightleftharpoons U \tag{1}$$

$$K_{un} = [U]/[N] \tag{2}$$

$$\Delta G^{\circ}_{un} = -RT \ln K_{un} \tag{3}$$

These equations apply to a simple two-state transition between the native (N) and the unfolded (U) state given by the equilibrium constant $K_{un}$. This is, by definition, a cooperative process without a detectable intermediate species. The denatured or unfolded state of a protein is generally considered to be an ensemble of conformations in which all parts of the protein are exposed to the solvent with a minimum of intramolecular interactions. The denatured state has high conformational entropy and is biologically inactive. The unfolding transition (Eq. (1) and Fig. 2) can be induced by pressure, temperature, extreme pH, and denaturants such as urea and guanidine