Protein (Macromolecular)...
Transcript of Protein (Macromolecular)...
Chemistry 4000 Biocrystallography Slide 1
Protein (Macromolecular) Crystallography
Comparison to Chemical Crystallography
Chemistry 4000 Biocrystallography Slide 2
Differences are all a result of the unique properties of proteins.
Crystallization Crystal Quality
Diffraction Properties X-ray Source
Structure Determination Accuracy of Structure
Figures: exaggerated example of mosaicity; moderately diffracting protein crystal
Chemistry 4000 Biocrystallography Slide 3
Protein Size
● Proteins are large in comparison to most compounds (heteropolymer of amino acids)
– average molar mass is 30,000 g/mol per polypeptide chain
– range between 5000 and 1,000,000 g/mol
● Proteins can be composed of one or more polypeptide chains
– bacterial ribosome contains over 50 polypeptide chains and 3 RNA molecules (molar mass ~2,500,000 g/mol)
Top Left: rifampicin 880 g/mol
Bottom: RNA polymerase 345,000 g/mol DNA binding site
Chemistry 4000 Biocrystallography Slide 4
Protein Structure
● Proteins adopt a single or very few structures – folded states
– Largely determined by non-covalent interactions
● involves regions both close and widely separated in the covalent structure
– Dependent upon aqueous phase
● Non-covalent forces are solution dependent
● Protein structures are typically compact and globular
– Charged and polar residues are located on the surface of proteins
– Interior of proteins is (almost) exclusively hydrophobic
Structure determines function at the molecular level
Chemistry 4000 Biocrystallography Slide 5
Protein Stability
● Folded state is marginally stable (by design?)
– Relatively mild conditions disrupt the folded state
● Temperature above 45°C (mammals)
● pH < 4 or > 9
● Low ionic strength (< 50 mM) or low dielectric solvent
● Proteins are susceptible to spontaneous chemical modification
– oxidation of sulfur, deamidation, hydrolysis
● Proteins have limited solubility
– few proteins can be concentrated to 1 mM
Unfolded State
Folded State
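To put the solubility limit in laboratory units, the short sketch below converts a molar concentration into mg/mL. It only restates the arithmetic implied by the figures on the slides (1 mM, and an average chain of 30,000 g/mol); the function name is an illustrative choice, not something from the course material.

```python
def mass_concentration_mg_per_ml(conc_mM: float, molar_mass_g_per_mol: float) -> float:
    """Convert a molar concentration in mM to a mass concentration in mg/mL."""
    # mM * (g/mol) gives mg/L; dividing by 1000 gives g/L, which equals mg/mL.
    return conc_mM * molar_mass_g_per_mol / 1000.0


# Average polypeptide chain from the slides (30,000 g/mol) at the 1 mM limit.
print(mass_concentration_mg_per_ml(1.0, 30_000.0))
```

For the average chain this works out to 30 mg/mL, which is why 1 mM is already a demanding concentration for most proteins.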
Chemistry 4000 Biocrystallography Slide 6
Protein Crystallization
Difficult !!! The “bottleneck” in protein structure determination
● Must utilize conditions that do not disrupt the folded state
– aqueous solutions
– narrow temperature range (0 – 37°C)
– narrow pH range (4-9)
● Large surface area (protein) and long, slow crystallization process
– days to months
– require novel crystallization methods
● Slow crystallization process increases likelihood of chemical modification
Figure: vapour diffusion crystallization trial
Chemistry 4000 Biocrystallography Slide 7
Protein Crystals
● Crystals are small and have a large unit cell
– 0.1 x 0.1 x 0.1 mm is a typical crystal size
– 100 Å per edge is a typical unit cell
● Contain 30-70% solvent (present as channels)
– Solvent is critical for protein structure and therefore lattice structure
– Solvent is largely disordered
● Restricted number of Space Groups
– amino acids are all L-stereoisomer; no inversion or mirror symmetry
– lower symmetry (¾ of crystals are orthorhombic or lower)
Chemistry 4000 Biocrystallography Slide 8
More Protein Crystals
● Limited contacts between symmetry related molecules
– crystals are mechanically fragile ('crush' as opposed to 'fracture')
– temperature, pressure, X-ray, etc. sensitive
● crystals typically nucleate at interface of solvent:support or solvent:air
● High Mosaicity (0.2 - 1.5°)
– partly due to mechanical fragility?
– elongated spot shape, anisotropic diffraction
Chemistry 4000 Biocrystallography Slide 9
Crystal Packing
Figure panels: YZ plane, XY plane
Unit Cell: 45.95 x 140.02 x 76.30 Å
Space Group: $P2_12_12_1$
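The 30-70% solvent figure quoted earlier can be estimated for this cell with the Matthews coefficient, V_M = V_cell / (Z × M), and the usual approximation solvent fraction ≈ 1 − 1.23 / V_M. The sketch below is a rough illustration only: the slide does not give the molar mass of the crystallized protein, so the 50,000 g/mol used here is a hypothetical value, as is the assumption of one molecule per asymmetric unit (Z = 4 in P2₁2₁2₁).

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in cubic angstroms for a cell with all angles 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume_A3: float, molecules_per_cell: int,
                         molar_mass_g_per_mol: float) -> float:
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume_A3 / (molecules_per_cell * molar_mass_g_per_mol)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction; 1.23 assumes a typical protein density."""
    return 1.0 - 1.23 / v_m


# Cell dimensions from the slide; P2_1 2_1 2_1 has 4 asymmetric units per cell.
volume = orthorhombic_cell_volume(45.95, 140.02, 76.30)
# ASSUMPTIONS: one molecule per asymmetric unit, hypothetical 50,000 g/mol protein.
v_m = matthews_coefficient(volume, molecules_per_cell=4, molar_mass_g_per_mol=50_000.0)
print(f"V_cell = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, "
      f"solvent = {100 * solvent_fraction(v_m):.0f}%")
```

Under these assumptions the crystal would be roughly half solvent. A different molar mass or a different number of molecules per asymmetric unit changes the estimate substantially.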
Chemistry 4000 Biocrystallography Slide 10
Diffraction Properties
Protein crystals diffract poorly (low signal / noise and resolution)
● large unit cell and small crystal size means there are fewer unit cells/crystal and less constructive interference
● proteins are composed of light atoms (H, C, N, O, S) with weak scattering factors
● disordered solvent generates diffuse scattering background (especially between 3.2-4.0 Å)
● solvent and protein atoms have significant thermal motion
– rapid falloff in diffraction as function of resolution
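The first point can be made concrete by counting unit cells. The sketch below uses the typical protein values from the earlier slide (0.1 mm crystal, 100 Å cell edge) and compares them with a small-molecule crystal. The small-molecule numbers (0.3 mm crystal, 10 Å cell edge) are assumed for illustration and do not come from the slides; both crystals are treated as cubes with cubic cells.

```python
ANGSTROM_PER_MM = 1.0e7  # 1 mm = 10^7 angstroms


def unit_cells_in_crystal(crystal_edge_mm: float, cell_edge_A: float) -> float:
    """Number of unit cells in a cubic crystal built from cubic unit cells."""
    cells_per_edge = crystal_edge_mm * ANGSTROM_PER_MM / cell_edge_A
    return cells_per_edge ** 3


protein = unit_cells_in_crystal(0.1, 100.0)        # typical values from the slides
small_molecule = unit_cells_in_crystal(0.3, 10.0)  # ASSUMED comparison values
print(f"protein crystal:        {protein:.1e} unit cells")
print(f"small-molecule crystal: {small_molecule:.1e} unit cells")
print(f"ratio:                  {small_molecule / protein:.0f}x")
```

With these inputs the protein crystal holds about 10^12 unit cells, several orders of magnitude fewer than the assumed small-molecule crystal.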
Chemistry 4000 Biocrystallography Slide 11
More on Diffraction
● Long data collection
– weak diffraction necessitates longer exposures
– large d spacing requires larger crystal to detector distance
– low symmetry (more unique reflections/resolution shell)
● Radiation damage
– crystals are damaged (indirectly) by X-rays and their diffraction changes as a function of time
– longer wavelength Cu Kα radiation is more damaging than shorter wavelengths
● X-rays generate free radicals within solvent
– Freezing minimizes damage BUT requires suitable cryoprotectant
● Glycerol, glycols, sugars, oils, etc.
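The effect of a large cell and low symmetry on the size of a data set can be estimated by counting reciprocal-lattice points inside the resolution sphere, N ≈ (4π/3) V / d³, and dividing by the order of the Laue group. The sketch below is an approximation that ignores systematic absences and reflections on special planes. The cell volume is taken from the Crystal Packing slide; the 2.0 Å limit is an assumed example value.

```python
import math


def reflections_in_sphere(cell_volume_A3: float, d_min_A: float) -> float:
    """Approximate count of reciprocal-lattice points with spacing d >= d_min."""
    return (4.0 / 3.0) * math.pi * cell_volume_A3 / d_min_A ** 3


def unique_reflections(cell_volume_A3: float, d_min_A: float,
                       laue_group_order: int) -> float:
    """Approximate unique reflections: total divided by the Laue group order."""
    return reflections_in_sphere(cell_volume_A3, d_min_A) / laue_group_order


cell_volume = 45.95 * 140.02 * 76.30  # cubic angstroms, from the slides
d_min = 2.0                           # ASSUMED resolution limit in angstroms

# Laue group mmm (orthorhombic) has order 8; Laue group -1 (triclinic) has order 2.
print(f"orthorhombic: {unique_reflections(cell_volume, d_min, 8):,.0f} unique reflections")
print(f"triclinic:    {unique_reflections(cell_volume, d_min, 2):,.0f} unique reflections")
```

For this cell the estimate is on the order of 32,000 unique reflections to 2.0 Å in the orthorhombic case, and four times as many if the same cell were triclinic.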
Chemistry 4000 Biocrystallography Slide 12
X-ray Source
● Cu radiation (not Mo)
– longer wavelength increases spot separation in reciprocal space (required due to large d spacing)
– longer wavelength X-rays are diffracted more efficiently
– protein crystals rarely diffract beyond Cu limit (0.77 Å)
● Rotating (liquid cooled) Anode
– dissipates heat of incident electron beam allowing greater electron flux
– produces more X-ray photons
● Longer crystal to detector spacing
– again due to large crystal d spacing
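Both numerical claims on this slide follow from Bragg's law, λ = 2d sin θ. The smallest measurable spacing is λ/2 (at θ = 90°), and for small angles adjacent reflections along a cell edge a are separated on a flat detector by roughly Dλ/a, where D is the crystal-to-detector distance. The sketch below uses the standard Cu Kα and Mo Kα wavelengths; the 100 mm detector distance is an assumed example value.

```python
CU_K_ALPHA_A = 1.5418  # Cu K-alpha wavelength in angstroms
MO_K_ALPHA_A = 0.7107  # Mo K-alpha wavelength in angstroms


def resolution_limit(wavelength_A: float) -> float:
    """Smallest d spacing reachable: Bragg's law with sin(theta) = 1."""
    return wavelength_A / 2.0


def spot_separation_mm(wavelength_A: float, cell_edge_A: float,
                       detector_distance_mm: float) -> float:
    """Small-angle estimate of the spacing between adjacent spots on the detector."""
    return detector_distance_mm * wavelength_A / cell_edge_A


print(f"Cu limit: {resolution_limit(CU_K_ALPHA_A):.2f} A")
for name, wavelength in (("Cu", CU_K_ALPHA_A), ("Mo", MO_K_ALPHA_A)):
    # 100 A cell edge is the typical value from the slides; 100 mm is ASSUMED.
    sep = spot_separation_mm(wavelength, cell_edge_A=100.0, detector_distance_mm=100.0)
    print(f"{name} K-alpha spot separation: {sep:.2f} mm")
```

Under these assumptions Cu radiation separates neighbouring spots about twice as far as Mo radiation at the same detector distance, and increasing D widens the separation proportionally.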
Chemistry 4000 Biocrystallography Slide 13
Synchrotron X-ray Source
● Elliptical or circular particle accelerators
● Particle deflection produces intense electromagnetic radiation (100-1000 fold more intense)
● Emission wavelength can be easily changed (0.1 fm steps)
● Huge advantage for structure determination
– Multiple Wavelength Anomalous Dispersion
Figure legend: (1) Electron Gun, (2) Linear Accelerator, (3) Booster Ring, (4) Storage Ring, (5) Beamline, (6) X-ray Station
Chemistry 4000 Biocrystallography Slide 14
Structure Determination (MR)
● Molecular Replacement
– requires knowledge of “closely similar” structure
– based upon overlap of Patterson maps from the experimental data and known structure
● intensity data is used to calculate experimental Patterson
● atomic coordinates are used to calculate known Patterson
– determination of superposition matrix allows placement of known structure in unknown unit cell and provides phases
Molecular Replacement (Real space example)
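The reason a Patterson map can be computed from intensities alone is that it is the Fourier transform of |F|², which requires no phases, and its peaks sit at interatomic vectors. The toy sketch below shows this in one dimension with two point atoms. It is not a molecular replacement program; the atom positions, weights and function names are all made up for illustration.

```python
import cmath
import math


def structure_factor(h: int, atoms) -> complex:
    """F(h) for point atoms given as (fractional position, scattering weight)."""
    return sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)


def patterson_1d(u: float, atoms, h_max: int) -> float:
    """1D Patterson function at u, summed over intensities only (no phases)."""
    total = 0.0
    for h in range(-h_max, h_max + 1):
        intensity = abs(structure_factor(h, atoms)) ** 2
        total += intensity * math.cos(2.0 * math.pi * h * u)
    return total


# HYPOTHETICAL two-atom "structure": positions 0.10 and 0.35, weights 6 and 8.
atoms = [(0.10, 6.0), (0.35, 8.0)]
# Search away from the origin peak, which is always the largest.
grid = [k / 100 for k in range(10, 51)]
peak_u = max(grid, key=lambda u: patterson_1d(u, atoms, h_max=20))
print(f"strongest non-origin Patterson peak at u = {peak_u:.2f}")
```

The strongest non-origin peak falls at u = 0.25, the separation between the two atoms. Molecular replacement exploits the same property in three dimensions by comparing the Patterson of a known model with the one calculated from the measured intensities.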
Chemistry 4000 Biocrystallography Slide 15
Structure Determination (MIR)
Multiple Isomorphous Replacement
● exploits stoichiometric binding of heavy atoms
– Heavy atom positions are determined from differences in experimental intensities
– Phases are derived from heavy atom positions
● require at least two unique heavy atom derivatives to solve a novel structure
– $\alpha_{H,hkl}$ from heavy atom position
– $F_{H,hkl} = F_{PH,hkl} - F_{P,hkl}$
● Weak diffraction, thermal motion and non-isomorphism greatly complicate calculation
– low information content of intensities is also a problem
Chemistry 4000 Biocrystallography Slide 16
Harker Construct - From Patterson to Phase
Knowns (measured): $F_{PH,hkl}$, $F_{P,hkl}$
Knowns (calculated): $F_{H,hkl} = F_{PH,hkl} - F_{P,hkl}$ (algebra), $\alpha_{H,hkl}$ (Patterson solution)
STEP 1)
(1) Origin represents tail of vector $F_{P,hkl}$ (structure factor)
(2) Circle of radius $F_{P,hkl}$ represents all possible values of $\alpha_{P,hkl}$
STEP 2)
(1) Adding $F_{H,hkl}$ to $F_{P,hkl}$ yields tail of vector $F_{PH,hkl}$
(2) Circle of radius $F_{PH,hkl}$ represents all possible values of $\alpha_{PH,hkl}$
Figure labels: arbitrary origin; $F_{H,hkl}$ drawn at angle $\alpha_{H,hkl}$
Chemistry 4000 Biocrystallography Slide 17
Harker Construct - From Patterson to Phase
STEP 3)
(1) Intersection of circles represents possible solutions for $\alpha_{PH,hkl}$ and $\alpha_{P,hkl}$ that are consistent with the known value of $\alpha_{H,hkl}$
(2) Two solutions are possible when using a single derivative
STEP 4)
(1) Introducing a second derivative ($F_{PH2,hkl}$) resolves the ambiguity and a unique solution for the phase problem is obtained
Figure labels: $F_{PH,hkl}$, $F_{P,hkl}$, $\alpha_{P,hkl}$, $\alpha_{PH,hkl}$, $F_{PH2,hkl}$
Note: With the errors in real data this is not nearly as straightforward as portrayed
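The four steps can be written algebraically. Because F_PH = F_P + F_H as vectors, the measured amplitudes satisfy |F_PH|² = |F_P|² + |F_H|² + 2|F_P||F_H| cos(α_P − α_H), so each derivative gives two candidate values α_P = α_H ± δ. The sketch below applies this to error-free synthetic data. All amplitudes and phases are invented for illustration, the function names are hypothetical, and real data would need a probabilistic treatment instead of the simple "closest pair" rule used here.

```python
import cmath
import math


def wrap(angle: float) -> float:
    """Map an angle in radians onto the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def phase_candidates(fp_amp: float, fph_amp: float, f_h: complex):
    """Two protein-phase candidates from one derivative (circle intersections)."""
    fh_amp, alpha_h = abs(f_h), cmath.phase(f_h)
    cos_delta = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    # Clamp against rounding; with noisy data the circles may not intersect at all.
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return wrap(alpha_h + delta), wrap(alpha_h - delta)


def resolve_phase(cands_1, cands_2) -> float:
    """Pick the pair of candidates that agree best and return their circular mean."""
    pairs = [(a, b) for a in cands_1 for b in cands_2]
    a, b = min(pairs, key=lambda p: abs(wrap(p[0] - p[1])))
    return wrap(a + wrap(b - a) / 2.0)


# SYNTHETIC, error-free example: the "true" protein phase is 40 degrees.
f_p = cmath.rect(100.0, math.radians(40.0))
f_h1 = cmath.rect(20.0, math.radians(110.0))  # heavy-atom vector, derivative 1
f_h2 = cmath.rect(15.0, math.radians(-60.0))  # heavy-atom vector, derivative 2

# Only amplitudes of F_P and F_PH are "measured"; F_H comes from the heavy-atom model.
cands_1 = phase_candidates(abs(f_p), abs(f_p + f_h1), f_h1)
cands_2 = phase_candidates(abs(f_p), abs(f_p + f_h2), f_h2)
alpha_p = resolve_phase(cands_1, cands_2)
print(f"recovered protein phase: {math.degrees(alpha_p):.1f} degrees")
```

In this example the first derivative gives candidates near 180° and 40°, the second gives 40° and −160°, and only 40° is shared. Note that the rejected candidates differ by just 20°, which hints at why measurement error makes the choice far less clear in practice.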
Chemistry 4000 Biocrystallography Slide 18
Structure Determination (MAD)
Multiple Wavelength Anomalous Dispersion
● conceptually identical to isomorphous replacement
● requires tunable X-ray source (synchrotron)
– data collected at different wavelengths
● remote, inflection and peak of atomic absorption edge
– completely isomorphous**
● molecular biology techniques allow reliable introduction of Se as heavy atom (~2 / 100 residues)
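Beamline energies are usually quoted in keV while the rest of these notes use wavelengths in Å, so a conversion helper is handy when planning a MAD experiment. The sketch below uses E × λ = hc ≈ 12.398 keV·Å. The selenium K-edge value of about 12.658 keV is the tabulated edge energy; the actual peak, inflection and remote settings are chosen from a fluorescence scan of the crystal and are not given on the slides.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV * angstrom


def wavelength_from_energy(energy_keV: float) -> float:
    """Photon wavelength in angstroms for a given energy in keV."""
    return HC_KEV_ANGSTROM / energy_keV


def energy_from_wavelength(wavelength_A: float) -> float:
    """Photon energy in keV for a given wavelength in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_A


SE_K_EDGE_KEV = 12.658  # approximate tabulated Se K absorption edge
print(f"Se K edge:  {wavelength_from_energy(SE_K_EDGE_KEV):.4f} A")
print(f"Cu K-alpha: {energy_from_wavelength(1.5418):.3f} keV")
```

The selenium edge lies near 0.98 Å, well away from the fixed 1.54 Å Cu Kα line, which is why a tunable source is required.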
Chemistry 4000 Biocrystallography Slide 19
Structural Model
Limited resolution cannot identify individual atoms
– cannot determine structure without external information
● require knowledge of bond lengths and angles
● require knowledge of covalent structure of protein
– can determine fit of known fragments of structure to electron density
● orientation of side groups may be ambiguous (carboxamides, imidazoles)
– cannot locate H atoms
Structural model is built into electron density
Chemistry 4000 Biocrystallography Slide 20
Structural Model
3.5 Å 1.8 Å
Final refined model fit to electron density at 3.5 & 1.8 Å resolution
- Note the difference in the definition of the oxygen atoms (red)
Chemistry 4000 Biocrystallography Slide 21
Accuracy
Can we know the structure is accurate?
● In several cases protein structures have been determined at better than 1.0 Å resolution
– validate structure of same protein determined at lower resolution
● Successfully explains wide array of biological data
– existing experimental data can be rationalized using the structure
● Successful at predicting results of biological and physical experiments
– repeatedly proven to be model of choice for designing experiments
● Same structure as determined by independent techniques (NMR, cryoEM) at lower resolution
Chemistry 4000 Biocrystallography Slide 22
Common Indicators of High Quality Structure
● R-factors
– $R_{merge} < 5\%$ for data with $I/\sigma(I) > 2$
– $R_{refine} < 20\%$ for data to 2.0 Å resolution (and $R_{free} < 25\%$)
● Stereochemistry
– Bonds (rmsd) ~ 0.010 Å, Angles (rmsd) ~ 1.2°
– Ramachandran Plot ~90% favored (main chain torsion angles)
● Model
– > 95% of protein atoms fit and ~1 H2O per residue (2.0 Å resolution)
Must explain functional and experimental data !
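For reference, the two R-factors above are simple ratios. The sketch below uses the conventional definitions, R = Σ||F_obs| − |F_calc|| / Σ|F_obs| for refinement and R_merge = Σ_hkl Σ_i |I_i − ⟨I⟩| / Σ_hkl Σ_i I_i for merging repeated measurements. The input numbers are invented, the calculated amplitudes are assumed to be already on the same scale as the observed ones, and R_free is the same refinement formula applied only to reflections held out of refinement.

```python
def r_factor(f_obs, f_calc) -> float:
    """Crystallographic R: summed amplitude disagreement over summed observed amplitude."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_merge(measurements) -> float:
    """R_merge over a dict mapping each reflection to its repeated intensities."""
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# INVENTED example data, for illustration only.
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
repeats = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}

print(f"R       = {100 * r_factor(f_obs, f_calc):.1f}%")
print(f"R_merge = {100 * r_merge(repeats):.1f}%")
```

A practical R_free calculation would call `r_factor` on the test-set reflections only, which is what makes it a check against over-fitting.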
Chemistry 4000 Biocrystallography Slide 23
Structural Visualization
● Structures are complex and require simplified representation
– “Cartoons” of protein provide overall view of structure
– Electrostatic surfaces provide visualization of local regions
Substrate binding requires both shape
and charge complementarity
Chemistry 4000 Biocrystallography Slide 24
Summary
● Proteins have a number of unique properties that affect the ease of production and quality of protein crystals
● Relatively low quality of protein crystals compromises the quality of intensity measurements
● Relatively weak and low resolution intensity measurements increase the difficulty of structure determination and decrease the accuracy of the final structure
● Protein crystallography is not (yet?) a routine technique that can be performed by a qualified technician