A new protein folding screen: Application to the
ligand binding domains of a glutamate and kainate
receptor and to lysozyme and carbonic anhydrase
NEALI ARMSTRONG, ALEXANDRE DE LENCASTRE, and ERIC GOUAUX
Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street,
New York, New York 10032
(Received September 15, 1998; Accepted March 19, 1999)
Abstract
Production of folded and biologically active protein from Escherichia coli derived inclusion bodies can only be
accomplished if a scheme exists for in vitro naturation. Motivated by the need for a rapid and statistically meaningful
method of determining and evaluating protein folding conditions, we have designed a new fractional factorial protein
folding screen. The screen includes 12 factors shown by previous experiments to enhance protein folding and it
incorporates the 12 factors into 16 different folding conditions. By examining a 1/256th fraction of the full factorial,
multiple folding conditions were determined for the ligand binding domains from glutamate and kainate receptors, and
for lysozyme and carbonic anhydrase B. The impact of each factor on the formation of biologically active material was
estimated by calculating factor main effects. Factors and corresponding levels such as pH (8.5) and L-arginine (0.5 M)
consistently had a positive effect on protein folding, whereas detergent (0.3 mM lauryl maltoside) and nonpolar additive
(0.4 M sucrose) were detrimental to the folding of these four proteins. One of the 16 conditions yielded the most folded
material for three out of the four proteins. Our results suggest that this protein folding screen will be generally useful
in determining whether other proteins will fold in vitro and, if so, what factors are important. Furthermore, fractional
factorial folding screens are well suited to the evaluation of previously untested factors on protein folding.
Keywords: bacterial expression; fractional factorial screen; inclusion bodies; protein folding
Harvesting the fruits of whole, partial, and selective genome se-
quencing projects frequently requires the heterologous expression
of genes and subsequent studies of the gene products. Expression
in a bacterial host, such as Escherichia coli, is an economical and
time efficient method of protein production. However, in many
cases expression of foreign proteins in E. coli leads to the produc-
tion of insoluble inclusion bodies. Since a large number of proteins
can be folded from inclusion body material and because protein
production in the form of inclusion bodies has a number of merits,
a method that would answer the question of whether proteins derived
from inclusion body material can be folded into a biologically
relevant conformation would be useful (Chen & Gouaux,
1997; De Bernardez Clark, 1998). Ideally, the method would answer
the folding question with a reasonable degree of confidence
using a small number of experiments. If expression in E. coli is
subsequently deemed untenable, then production of the protein in
more expensive and time consuming systems can be justified. In
this paper, we present an approach to search for folding conditions
and apply it to the folding of four proteins.
The folding buffer should favor the formation of the native state
while minimizing the aggregation of folding intermediates. A wealth
of experimentation has shown that polar additives (such as arginine),
osmolytes, detergents, and chaotropes can minimize aggregation
and increase the yield of biologically active material (see
Rudolph & Lilie, 1996 for a recent review). Other factors affecting
the formation and stability of the folded state are pH, redox envi-
ronment, ionic strength, protein concentration, presence of ligand,
and the mode by which the denaturant concentration is reduced,
i.e., by dilution or dialysis, as examples. It is also known that
proteinaceous chaperones can promote folding in vitro (Cole, 1996).
However, we have devoted our attention to small molecules and
polymers in the experiments described here.
Reprint requests to: Eric Gouaux, Department of Biochemistry and Mo-
lecular Biophysics, Columbia University, 650 West 168th Street, New York,
New York 10032; e-mail: jeg52@columbia.edu.
Abbreviations: GluR2, ionotropic glutamate receptor, AMPA specific,
subtype 2 or B; GluR2-S1S2, ligand binding domain of GluR2; GFB-S1S2,
ligand binding domain of the goldfish kainate binding protein, subtype b;
CAB, carbonic anhydrase B; GuHCl, guanidine hydrochloride; GSSG, oxi-
dized glutathione; GSH, reduced glutathione; FF16, 16 condition fractional
factorial folding screen; AMPA, α-amino-3-hydroxy-5-methylisoxazole-4-
propionic acid; KA, kainate; HPLC, high pressure liquid chromatography;
PCR, polymerase chain reaction; pNAc, p-nitrophenylacetate; PEG, polyeth-
ylene glycol; SEC, size-exclusion chromatography; RT, room temperature.
Protein Science (1999), 8:1475–1483. Cambridge University Press. Printed in the USA.
Copyright © 1999 The Protein Society
Different proteins often require distinct conditions for folding.
For example, Mycoplasma arginine deiminase folds upon dilution
into only 10 mM potassium phosphate at pH 7.0 (Misawa et al.,
1994), whereas a good condition for urokinase folding requires
50 mM Tris, pH 9.0, 1 M GuHCl, 0.2 M L-arginine, 5 mM EDTA,
0.005% Tween 80, 1.25 mM GSSG, and 0.25 mM GSH (Winkler
& Blaber, 1986). In general, determination of folding conditions
has involved a trial and error, one factor at a time approach and
there is scant information available in the literature on general
methods to search for folding conditions. Nevertheless, one ap-
proach involved applying a crystallization screen to the search for
protein folding conditions (Hofmann et al., 1995). However, one
would predict that effective precipitation and crystallization con-
ditions will almost certainly not be useful for protein folding; not
surprisingly, this approach was largely unsuccessful. By contrast,
optimizations of previously determined folding conditions are more
widespread (see Ahn et al., 1997). In an effort to provide a logic
to the search for protein folding conditions, we have developed
fractional factorial folding screens (Chen & Gouaux, 1997). Here
we report an improved version of the original screen and describe
its application to a number of structurally diverse proteins.
Since the in vitro folding of a protein may be influenced by a
number of factors each bounded by two chemically reasonable
levels (i.e., the factor “protein concentration” might have the levels
0.1 and 1.0 mg/mL), an efficient and statistically meaningful method
of searching for folding conditions is desired. However, if one
chooses 12 factors and each factor is assigned two levels, evalu-
ation of the full factorial would require 4,096 experiments. In
many cases, when one is searching for factors that impact the
outcome of a particular process, evaluation of a fraction of the full
factorial is sufficient to determine which factors have the greatest
effects. Indeed, the fractional factorial experiment allows one to
estimate main effects and multifactor interactions, depending on
the resolution of the particular design (Box et al., 1978). Once the
most significant factors have been identified, the folding condi-
tions can be optimized using a subset of the original factors. Al-
though some of the assumptions inherent in fractional factorial
experiments, such as the assumptions that the response (i.e., the
yield of folded protein) is linearly dependent on the level of the
factor and that the factors do not interact, may not be strictly valid,
a screen based on a fractional factorial design nevertheless pro-
vides a powerful tool by which folding conditions can be screened
and factors can be evaluated.
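The run counts quoted above can be checked with a short Python sketch. It is an illustration only: the variable names are ours and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

N_FACTORS = 12   # two-level factors considered in the text
N_RUNS = 16      # number of conditions in a 16-run screen

# Enumerate every +/- combination of the 12 two-level factors.
full_factorial = list(product("+-", repeat=N_FACTORS))
n_full = len(full_factorial)

# Fraction of the full factorial that a 16-run screen samples.
fraction = Fraction(N_RUNS, n_full)

print(n_full)    # 4096
print(fraction)  # 1/256
```

A 16-run screen therefore samples 1/256th of the 4,096 possible combinations, which is why only main effects, and not every interaction, can be estimated from it.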
Here we describe a fractional factorial protein folding screen
that includes 12 factors and a total of 16 experiments (FF16;
Table 1). The effectiveness of the FF16 screen was tested on the
ligand binding domains from the rat GluR2 receptor (GluR2-S1S2;
Chen & Gouaux, 1997; Arvola & Keinänen, 1996), and from the
goldfish kainate binding protein b (GFB-S1S2; Wo & Oswald,
1994), on hen egg white lysozyme, and on bovine carbonic anhydrase
B (CAB). GluR2-S1S2 and GFB-S1S2 have low level amino
acid sequence identity, related biological activities, molecular
weights of ~32 kDa, one disulfide bond, two compact domains,
mixed α/β secondary structures and basic isoelectric points; prior
to the work from this laboratory, folding conditions for neither
GluR2-S1S2 nor GFB-S1S2 had been reported. In contrast, lysozyme
(Dobson et al., 1994; Hevehan & De Bernardez Clark, 1997)
and CAB (Cleland et al., 1992; Wetlaufer & Xie, 1995; Xie &
Wetlaufer, 1996) are both well-studied proteins and have served as
model systems for testing factors and methods for protein folding.
On the one hand, lysozyme (MW of 14.5 kDa) has four disulfide
bonds, a pI of 11.35 and is primarily α-helical. On the other hand,
CAB (30 kDa) has no disulfide bonds, a catalytic metal ion center,
a central β-sheet that forms the core of the protein and an isoelectric
point of 5.9. All four proteins were successfully folded under
multiple conditions from the screen.
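As a sanity check on the design, the +/− patterns listed in Table 1 can be tested for balance (each factor at its “+” level in exactly 8 of the 16 buffers) and for pairwise orthogonality of the factor columns. The Python sketch below is our own illustration: the patterns are transcribed from Table 1, while the helper name `to_signs` is hypothetical.

```python
from itertools import combinations

# +/- patterns for buffers 1-16, transcribed from Table 1. Factor order:
# mode, [protein], polar additive, detergent, pH, Red./Ox., chaotrope,
# ionic strength, divalent cations, PEG, ligand, nonpolar additive.
PATTERNS = [
    "----+--+-++-", "---+-++-+---", "--+--++--+++", "--+++--++--+",
    "-+---+-++-++", "-+-++-+--+-+", "-++-+-+-+-+-", "-+++-+-+-+--",
    "+-----++++-+", "+--+++----++", "+-+-++--++--", "+-++--++--+-",
    "++--++++----", "++-+----+++-", "+++--------+", "++++++++++++",
]

def to_signs(pattern):
    """Convert a '+'/'-' string into a list of +1/-1 integers."""
    return [1 if c == "+" else -1 for c in pattern]

design = [to_signs(p) for p in PATTERNS]   # 16 rows x 12 columns
columns = list(zip(*design))               # 12 factor columns

# Balanced: every factor has eight "+" and eight "-" runs.
balanced = all(sum(col) == 0 for col in columns)

# Orthogonal: the dot product of any two factor columns is zero.
orthogonal = all(
    sum(a * b for a, b in zip(ci, cj)) == 0
    for ci, cj in combinations(columns, 2)
)

print(balanced, orthogonal)  # True True
```

Balance and orthogonality are what allow each main effect to be estimated from the same 16 experiments without being confounded with another factor's main effect.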
Results
Statistical analysis of the data
The main effects of each factor were calculated by summing the
response (i.e., [3H]-AMPA counts in the case of GluR2-S1S2)
obtained when using the “+” level and then when using the “−”
level of the particular factor under consideration (Fig. 2). The sum
of the “−” experiments was then subtracted from the sum of the
“+” experiments and the resulting difference was divided by 8,
i.e., Main Effect = (Σ “+ level” − Σ “− level”)/8 (Box et al.,
1978). As shown in Figure 2, polar additive (arginine) at the “+”
level, i.e., at 0.5 M, pH at the “+” level (pH 8.5) and protein
concentration at the “+” level have some of the strongest favorable
effects on the yield of active protein.
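The main-effect calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the patterns are transcribed from Table 1, the function name `main_effects` is hypothetical, and the response values are synthetic (invented so that only pH matters) rather than measured data.

```python
# +/- patterns for buffers 1-16, transcribed from Table 1.
PATTERNS = [
    "----+--+-++-", "---+-++-+---", "--+--++--+++", "--+++--++--+",
    "-+---+-++-++", "-+-++-+--+-+", "-++-+-+-+-+-", "-+++-+-+-+--",
    "+-----++++-+", "+--+++----++", "+-+-++--++--", "+-++--++--+-",
    "++--++++----", "++-+----+++-", "+++--------+", "++++++++++++",
]
FACTORS = [
    "mode", "protein", "polar additive", "detergent", "pH", "Red./Ox.",
    "chaotrope", "ionic strength", "divalent cations", "PEG", "ligand",
    "nonpolar additive",
]

def main_effects(patterns, responses):
    """Main effect = (sum of '+' responses - sum of '-' responses) / 8."""
    effects = {}
    for j, name in enumerate(FACTORS):
        plus = sum(r for p, r in zip(patterns, responses) if p[j] == "+")
        minus = sum(r for p, r in zip(patterns, responses) if p[j] == "-")
        effects[name] = (plus - minus) / 8
    return effects

# Synthetic responses (NOT data from the paper): a baseline of 100 units,
# raised by 40 at pH 8.5 ('+') and lowered by 40 at the '-' pH level.
responses = [100 + (40 if p[4] == "+" else -40) for p in PATTERNS]

effects = main_effects(PATTERNS, responses)
print(effects["pH"])         # 80.0
print(effects["detergent"])  # 0.0
```

Because the design is balanced, a response that depends on pH alone produces a nonzero main effect for pH only; every other factor sees equal numbers of high-pH and low-pH runs at each of its levels.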
To help determine if a given factor had a positive or negative
effect for each of the four proteins, the main effects for each
protein were scaled in the following manner. For each of the pro-
teins, the values of all factors were scaled such that the main effect
of the factor with the greatest effect was set to 1.0. In the case of
GluR2-S1S2 the factor is pH, for example. Then, the mean main
effects for each factor were calculated. This value is indicated by
the unfilled bars in Figure 3. The mean main effects were then
ranked in order of decreasing magnitude. As shown in Figure 3,
pH, polar additive, chaotrope, and protein concentration have the
largest positive mean main effects. Detergent has the largest negative
mean main effect. Reduction/oxidation potential has an overall
average effect of zero, although the spread in the main effects
is large because lysozyme folds well in the presence of GSH/GSSG,
i.e., the “+” level of Red./Ox. has a large positive effect
(Fig. 2) while GSH/GSSG has a smaller negative effect on the
folding of the other proteins.
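The scaling and averaging step can be illustrated with a small Python sketch. Several points are assumptions on our part: we take "the factor with the greatest effect" to mean the largest absolute main effect, we keep the sign of each effect when scaling, and we rank the means from most positive to most negative. The protein names and numbers are invented for the example and are not values from Figure 2 or Figure 3.

```python
def scale_effects(effects):
    """Divide all main effects by the largest absolute main effect."""
    largest = max(abs(v) for v in effects.values())
    return {name: v / largest for name, v in effects.items()}

def mean_effects(scaled_by_protein):
    """Average each factor's scaled main effect over all proteins."""
    names = list(next(iter(scaled_by_protein.values())))
    n = len(scaled_by_protein)
    return {
        name: sum(s[name] for s in scaled_by_protein.values()) / n
        for name in names
    }

# Invented main effects for two hypothetical proteins, on different scales
# (for example radioligand counts versus enzyme activity units).
raw = {
    "protein A": {"pH": 80.0, "detergent": -20.0, "Red./Ox.": -40.0},
    "protein B": {"pH": 0.5, "detergent": -0.25, "Red./Ox.": 1.0},
}

scaled = {protein: scale_effects(e) for protein, e in raw.items()}
means = mean_effects(scaled)
ranked = sorted(means, key=means.get, reverse=True)

print(means)   # {'pH': 0.75, 'detergent': -0.25, 'Red./Ox.': 0.25}
print(ranked)  # ['pH', 'Red./Ox.', 'detergent']
```

Scaling first makes assays with very different units comparable, and a factor such as Red./Ox. in this toy example shows how opposite effects in different proteins can partly cancel in the mean while the spread remains large.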
GluR2-S1S2
As shown in Figure 1, most of the FF16 screen conditions gave
significant levels of folded GluR2-S1S2 except #2, #5, #9, and
#14. Condition #7 resulted in the highest [3H]-AMPA counts followed
by #16 and #6. The main effects plot for GluR2-S1S2 shown
in Figure 2 illustrates that the following factors had a positive
effect on the folding: pH of 8.5, 0.5 M GuHCl, 0.5 mg/mL protein
concentration, and 0.5 M L-arginine. The inclusion of sucrose,
divalent cations, and PEG had negligible effects while dialysis,
high ionic strength, GSH/GSSG, and detergent had weakly negative
effects. The presence of glutamate in the folding buffer had
a modestly positive consequence. Based on large scale folding
experiments of GluR2-S1S2, which employed folding conditions
similar to #7, the yield was ~10% using OD280 measurements.
GFB-S1S2
Conditions #7, #3, #4, and #11 gave the largest amount of properly
folded GFB-S1S2 as judged by the [3H]-kainate binding results
listed in Table 2. The yield of soluble protein was 10–15% as
Table 1. 16 Condition fractional factorial folding screen (FF16)

| Buffer | Pattern^a | Mode^b | [Protein] (mg/mL)^c | Polar add.^d | Detergent^e | pH^f | Red./Ox.^g | Chao.^h | Ionic str.^i | Dival. cat.^j | PEG (%)^k | Ligand^l | NP add.^m |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −−−−+−−+−++− | Dil. | 0.1 | 0 | 0 | 8.5 | 1 mM DTT | 0 | 250 mM | EDTA | 0.05 | 10 mM | 0 |
| 2 | −−−+−++−+−−− | Dil. | 0.1 | 0 | 0.3 mM | 6.0 | GSH/GSSG | 0.5 M | 10 mM | Mg, Ca | 0 | 0 | 0 |
| 3 | −−+−−++−−+++ | Dil. | 0.1 | 0.5 M | 0 | 6.0 | GSH/GSSG | 0.5 M | 10 mM | EDTA | 0.05 | 10 mM | 0.4 M |
| 4 | −−+++−−++−−+ | Dil. | 0.1 | 0.5 M | 0.3 mM | 8.5 | 1 mM DTT | 0 | 250 mM | Mg, Ca | 0 | 0 | 0.4 M |
| 5 | −+−−−+−++−++ | Dil. | 0.5 | 0 | 0 | 6.0 | GSH/GSSG | 0 | 250 mM | Mg, Ca | 0 | 10 mM | 0.4 M |
| 6 | −+−++−+−−+−+ | Dil. | 0.5 | 0 | 0.3 mM | 8.5 | 1 mM DTT | 0.5 M | 10 mM | EDTA | 0.05 | 0 | 0.4 M |
| 7 | −++−+−+−+−+− | Dil. | 0.5 | 0.5 M | 0 | 8.5 | 1 mM DTT | 0.5 M | 10 mM | Mg, Ca | 0 | 10 mM | 0 |
| 8 | −+++−+−+−+−− | Dil. | 0.5 | 0.5 M | 0.3 mM | 6.0 | GSH/GSSG | 0 | 250 mM | EDTA | 0.05 | 0 | 0 |
| 9 | +−−−−−++++−+ | Dial. | 0.1 | 0 | 0 | 6.0 | 1 mM DTT | 0.5 M | 250 mM | Mg, Ca | 0.05 | 0 | 0.4 M |
| 10 | +−−+++−−−−++ | Dial. | 0.1 | 0 | 0.3 mM | 8.5 | GSH/GSSG | 0 | 10 mM | EDTA | 0 | 10 mM | 0.4 M |
| 11 | +−+−++−−++−− | Dial. | 0.1 | 0.5 M | 0 | 8.5 | GSH/GSSG | 0 | 10 mM | Mg, Ca | 0.05 | 0 | 0 |
| 12 | +−++−−++−−+− | Dial. | 0.1 | 0.5 M | 0.3 mM | 6.0 | 1 mM DTT | 0.5 M | 250 mM | EDTA | 0 | 10 mM | 0 |
| 13 | ++−−++++−−−− | Dial. | 0.5 | 0 | 0 | 8.5 | GSH/GSSG | 0.5 M | 250 mM | EDTA | 0 | 0 | 0 |
| 14 | ++−+−−−−+++− | Dial. | 0.5 | 0 | 0.3 mM | 6.0 | 1 mM DTT | 0 | 10 mM | Mg, Ca | 0.05 | 10 mM | 0 |
| 15 | +++−−−−−−−−+ | Dial. | 0.5 | 0.5 M | 0 | 6.0 | 1 mM DTT | 0 | 10 mM | EDTA | 0 | 0 | 0.4 M |
| 16 | ++++++++++++ | Dial. | 0.5 | 0.5 M | 0.3 mM | 8.5 | GSH/GSSG | 0.5 M | 250 mM | Mg, Ca | 0.05 | 10 mM | 0.4 M |

a +/− factor levels are: Mode, dil = −, dial = +; [Protein], 0.1 mg/mL = −, 0.5 mg/mL = +; Polar Additive, 0 = −, 0.5 M = +; detergent, 0 = −, 0.3 mM = +; pH, 6.5 = −, 8.5 = +; Red./Ox., 1 mM DTT = −, GSH/GSSG = +; chaotrope, 0 = −, 0.5 M = +; ionic strength, 10 mM = −, 250 mM = +; divalent cations, 1 mM EDTA = −, Mg, Ca = +; PEG, 0 = −, 0.05% = +; ligand, 0 = −, 10 mM = +; nonpolar additive, 0 = −, 0.4 M = +.
b This factor was only included for GluR2-S1S2 and GFB-S1S2 folding experiments.
c Lysozyme experiments were performed using protein concentrations of 0.1 and 1.0 mg/mL.
d L-arginine.
e Detergent: lauryl maltoside.
f pH 6.0, 50 mM MES; pH 8.5, 50 mM Tris-HCl; pH was measured at 4 °C.
g 1 mM reduced (GSH) and 0.1 mM oxidized (GSSG) glutathione.
h Guanidine hydrochloride.
i Molar ratio of NaCl to KCl was 25:1.
j 1 mM EDTA or 2 mM MgCl2, 2 mM CaCl2; except for CAB buffers that contained 1 mM ZnCl2 instead of MgCl2 and CaCl2.
k PEG MW(average) = 3,550 Da; the concentration was weight/volume.
l The ligand was L-glutamate for GFB-S1S2 and GluR2-S1S2 buffers; this factor was excluded from the folding buffers for lysozyme and CAB.
m Sucrose.
estimated by SDS-PAGE. The fewest counts were recorded for
conditions #2, #6, #8, #13, and #14. The strongest positive factors
were 0.5 M L-arginine followed by dialysis, pH 8.5, and 1 mM
L-glutamate (Fig. 2). The presence of detergent had a strong negative
effect while glutathione, high ionic strength, PEG, and sucrose
had moderately negative consequences. Interestingly, protein
concentration appeared to have no effect on GFB-S1S2 folding,
showing that a higher protein concentration did not yield more
folded material. This in turn suggests that nonproductive folding
and aggregation occur at protein concentrations above 0.1 mg/mL.
Although most conditions in the FF16 screen were sufficient
to fold GFB-S1S2, only one of the best four folding conditions
was shared with its close relative GluR2-S1S2. Likewise, the
chaotrope and protein concentration factors, which were important
for GluR2-S1S2 folding, were relatively neutral factors for GFB-
S1S2 folding. While these two proteins have many common phys-
ical characteristics and functional properties, these similarities do
not confer equivalence in terms of factor main effects.
Lysozyme
Lysozyme yielded active material under fewer conditions
of the FF16 screen than did GluR2-S1S2 and GFB-S1S2.
This result is probably due to the requirement of lysozyme for
conditions that promote disulfide bond formation and rearrange-
ment. In fact, none of the eight conditions that contained 1 mM
DTT resulted in active lysozyme, indicating that lysozyme folding
is greatly disfavored under reducing conditions. The two best conditions
for lysozyme folding were #11 and #16, respectively. As with
GFB-S1S2, protein concentration was a neutral factor in lysozyme
folding. Since the data presented here measured the total amount of
active protein and conditions with different protein concentrations,
such as #11 (0.1 mg/mL) and #16 (1.0 mg/mL), resulted in similar
Fig. 1. Relative activity chart. The relative activity was calculated with respect to the condition that resulted in the greatest amount
of active material for each protein; all other conditions are given as a percentage of the best condition. Condition #7 resulted in the
largest amount of properly folded GluR2-S1S2, GFB-S1S2, and CAB, whereas condition #11 yielded the greatest amount of active
lysozyme.
Fig. 2. Plot of the factor main effects. The main effects for each factor
were calculated using the equation: (Σ “+ level” − Σ “− level”)/8 (Box
et al., 1978). The factors are plotted in order of decreasing main effects,
starting from the left. A positive value on the chart indicates that the “+”
level (i.e., pH 8.5 or 0.5 M L-arginine) is estimated to enhance protein
folding, whereas a negative value reflects a decrease in folding due to the
“+” level of the factor. The calculations of main effects do not take into
account multifactor interactions, which, if present, may alter the estimation
of factor main effects; to estimate multifactor interactions, higher resolution
screens must be performed. The 1σ uncertainties for estimation of the
main effects are 502 for GluR2-S1S2, 1980 for GFB-S1S2, 0.008 for
lysozyme, and 0.005 for carbonic anhydrase B.
amounts of folded product, condition #11 gave a higher yield of
active lysozyme, based on the amount of starting material. By
comparing the FF16 results to an assay using native lysozyme,
condition #11 had a yield of ~37% and condition #16 yielded
~4% of the activity based on the amount of original material.
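The relationship between these yields and the similar absolute amounts of active lysozyme can be made explicit with a few lines of Python. This is our own back-of-the-envelope illustration using the approximate yields and starting concentrations quoted in the text; the function name is hypothetical.

```python
def active_concentration(start_mg_per_ml, fractional_yield):
    """Active protein concentration = starting concentration x yield."""
    return start_mg_per_ml * fractional_yield

# Approximate values quoted in the text for lysozyme.
c11 = active_concentration(0.1, 0.37)  # condition #11: 0.1 mg/mL, ~37% yield
c16 = active_concentration(1.0, 0.04)  # condition #16: 1.0 mg/mL, ~4% yield

print(round(c11, 3), round(c16, 3))  # 0.037 0.04
```

Both conditions give roughly 0.04 mg/mL of active lysozyme, so they look alike in a total-activity readout even though condition #11 converts about nine times more of its starting material.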
Carbonic anhydrase B
CAB folded to yield catalytically active protein under many of the
conditions in FF16. However, as was determined from the folding
of GluR2-S1S2 and GFB-S1S2, condition #7 clearly gave the high-
est yield per volume of protein folding solution. Both condition #7
and condition #11 resulted in a folding yield of ~60% based on the
amount of starting material. It was striking that the six conditions
for which the highest CAB activity was measured all had protein
concentrations of 0.5 mg/mL, which made high protein concentration
the strongest positive factor. The presence of 0.5 M L-arginine,
pH 8.5, or 0.5 M GuHCl also had a positive effect on CAB folding.
PEG has previously been shown to increase the yields of CAB
folding (Cleland et al., 1992). However, in this screen the presence
of PEG had a moderately negative effect. This disparity could be
due to multifactor interactions, since the folding buffers used by
Cleland et al. included only 1 M GuHCl, 0.5 mg/mL CAB, and
PEG (Cleland et al., 1992).
Discussion
Utilization of protein produced as inclusion bodies from E. coli is
a valuable strategy if the protein folds in vitro. Searching for
folding conditions by trial and error, particularly in light of the
substantial number of factors that may facilitate folding, is a daunt-
ing and potentially fruitless task. To answer the questions of ~1!
whether the protein of interest will fold and ~2! what factors are
most influential, we have designed and tested a fractional factorial
protein folding screen. The screen answers the question of whether
the protein will fold using a small number of highly varied con-
ditions and it facilitates the determination of