AST734

Andrew Boal

25 November 2004

Coming Soon to a Lab Near You…

Image credits: NASA, NPS, and Protein Data Bank

Extreme environments and astrobiology

Numerous extreme terrestrial habitats are seen as potential analogs to life-bearing niches in the solar system

Extreme environments are those that lie outside the conditions of a “mesophilic environment” (T ~30-40 °C, salt concentration <3%, etc.)

These “extreme” environments might model conditions found on Mars, Europa, Titan, and elsewhere

Terrestrial examples include hot springs (high temperature), salt lakes (high salt), deep-sea vents (high pressure), and deserts (low water)

Image credits: NASA, NPS

Extreme environments: microbes in residence

Extremophiles are defined by the type of environment required for growth

There is no overall consensus on the definition of an extreme environment

Organisms that can survive in an extreme environment but do not require those conditions for growth are extremotolerant

Mesophile: Lives in an ambient environment

Thermophile: Temp. > ~45 °C

Psychrophile: Temp. < ~20 °C

Barophile: High pressure

Xerophile: Low water content

Halophile: Salt content > 3-10%

Acidophile: pH < 5

Alkaliphile: pH > 9

Radiophile: High amounts of radiation

Image credit: CDC
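The numerical cutoffs in this list can be turned into a small lookup. The sketch below is only illustrative: the function name `classify_environment` is hypothetical, the halophile cutoff uses the lower end (3%) of the quoted 3-10% range, and pressure, water content, and radiation are left out because the list gives no numbers for them.

```python
def classify_environment(temp_c, salt_percent, ph):
    """Return the labels suggested by the approximate cutoffs in the list above."""
    labels = []
    if temp_c > 45:            # Thermophile: Temp. > ~45 C
        labels.append("thermophile")
    elif temp_c < 20:          # Psychrophile: Temp. < ~20 C
        labels.append("psychrophile")
    if salt_percent > 3:       # Halophile: assumed lower bound of the 3-10% range
        labels.append("halophile")
    if ph < 5:                 # Acidophile: pH < 5
        labels.append("acidophile")
    elif ph > 9:               # Alkaliphile: pH > 9
        labels.append("alkaliphile")
    # Nothing extreme was flagged, so treat the conditions as mesophilic.
    return labels or ["mesophile"]


hot_acid_spring = classify_environment(temp_c=80, salt_percent=1, ph=2)
ambient_pond = classify_environment(temp_c=35, salt_percent=1, ph=7)
```

Because there is no consensus definition of an extreme environment, the cutoffs should be treated as adjustable parameters rather than fixed boundaries.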

Biogeography

Biogeography is the study of the environmental distribution of species

One can explore several isolated, analogous extreme environments, which may not allow transport of microbes between them, to develop a better understanding of microbial evolution

But what about a deeper look?

Map credit: CIA World Factbook; Image credits: NPS

Molecular components of cells

The predominant molecular components of cells include lipids, nucleotides, and proteins

The ability of these molecules to function is directly related to molecular shape, which is influenced by the environment, so…

Nucleotides: protein blueprints and fabrication

Proteins: do the work of the cell

Lipids: provide cell membranes

Biomolecular structural endemism

The Big Questions:

Are there molecular structures which are endemic in an environment?

If so, how and why are those structures arrived at?

Photo Credits: National Park Service Web pages

What are biomolecules?

Biomolecular structure

Biomolecular structure is determined by a combination of covalent and noncovalent bonds

Covalent bonds are static entities which are little affected by the environment

Noncovalent bonds (hydrophobic interactions, hydrogen bonding, and electrostatic attraction) exist in a dynamic equilibrium, and thus can be attenuated by factors such as temperature, ion content, and pH

Biomolecules must be both somewhat flexible and somewhat rigid to function properly; therefore the forces that hold the molecular shape must attain a balance with the environment

Too static: function is compromised

Balance: structure and function preserved

Too dynamic: structure is compromised

Lipid structure

Lipids are made up of a hydrophilic (water-loving) head group and a hydrophobic (water-fearing) tail

In cell membranes, lipids pack to form a bilayer so that the heads are in water and the tails are mixed together

[Figure: chemical structure of a phospholipid bilayer in water (H2O), with the hydrophilic head group and hydrophobic tail labeled]

Lipids in thermal environments

Lipids from thermophilic archaea have a dramatically different chemical structure

[Figure: lipid structures labeled “Mesophile lipid bilayer”, “Thermophile Archaea bilayer”, and “Hyperthermophile Archaea lipid”]

Thermophile Archaea bilayer:

Increased hydrocarbon branching gives increased hydrophobicity

Head-tail linkages are ethers, not esters, and are chemically more robust

Hyperthermophile Archaea lipid:

Backbone of both layers is chemically connected, again increasing stability

DNA and RNA: chemistry

DNA and RNA are polymers of nucleotides (oligo- or polynucleotides)

Nucleotides are composed of a nucleobase attached to a sugar, which is linked into the phosphate backbone

Sugars:

Ribose is in RiboNucleic Acid (RNA)

Deoxyribose is in DeoxyriboNucleic Acid (DNA)

[Figure: ribose and deoxyribose structures, each showing the backbone attachment points and the nucleobase]

The extra -OH (alcohol) in ribose makes it much less chemically stable

Nucleobases:

Thymine (DNA only)

Uracil (RNA only)

Adenine

Guanine

Cytosine

[Figure: chemical structures of the five nucleobases, each attached to a sugar]

Nucleobases are cyclic structures which are basic (like ammonia)

DNA and RNA: polynucleotide structure

DNA and RNA structure is based on hydrophobic interactions and hydrogen bonding

Hydrogen bonding is a weak interaction where two electronegative elements “share” a hydrogen atom (note that carbon-hydrogen bonds do not take part in hydrogen bonding)

[Figure: Thymine:Adenine (T:A) and Guanine:Cytosine (G:C) base pairs attached to the polynucleotide backbone; dashed lines indicate hydrogen bonds]

Polynucleotide backbone has charged phosphate groups which are hydrophilic

Center of duplex is hydrophobic
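The T:A and G:C pairing rules on this slide are enough to work out which strand will pair with a given DNA sequence. The sketch below is a minimal illustration; the function names are hypothetical, and it relies on the standard convention (not stated on the slide) that the two strands of a duplex run antiparallel, so the partner strand is written as the reverse complement.

```python
# Watson-Crick partners for DNA, as drawn on the slide (T:A and G:C).
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(seq):
    """Return the strand that pairs with seq, written 5'->3' (reverse complement)."""
    return "".join(PARTNER[base] for base in reversed(seq.upper()))


def forms_duplex(strand_a, strand_b):
    """True if strand_b is the exact Watson-Crick partner of strand_a."""
    return complement_strand(strand_a) == strand_b.upper()


partner = complement_strand("GATTACA")  # short illustrative sequence
```

A sequence that contains a stretch followed later by its own reverse complement can fold back on itself, which is the origin of hairpin structures.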

DNA: secondary structure

Base pairing determines the nature of the secondary structure

The basic elements of DNA secondary structure are the duplex (which is by far the most prevalent), the junction, and the hairpin

[Figure: example base sequences folded into a junction, a duplex, and a hairpin]

DNA melting

One of the easiest ways to measure DNA stability is to obtain a “melting curve”, which is a spectroscopic measurement of duplex unzipping

[Figure: Representation of DNA melting by duplex unzipping or unwinding]

Figure taken from: Drukker, K., et al. J. Phys. Chem. B 2000, 104, 6108-6111

[Figure: Example of a DNA melting curve obtained spectroscopically]
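A melting curve is usually summarized by its melting temperature, Tm, taken here as the temperature at which half of the duplex has melted. The sketch below shows one way to read Tm off a curve by linear interpolation. The data are synthetic: the two-state sigmoid, the 62 °C midpoint, and the 4 °C width are assumptions chosen only to produce a curve to work with, not measurements from the cited figure.

```python
import math


def fraction_melted(temp_c, tm_c, width_c):
    """Synthetic two-state melting curve (a logistic function of temperature)."""
    return 1.0 / (1.0 + math.exp(-(temp_c - tm_c) / width_c))


def estimate_tm(temps, fractions):
    """Interpolate the temperature where the melted fraction crosses 0.5."""
    for i in range(len(temps) - 1):
        f0, f1 = fractions[i], fractions[i + 1]
        if f0 < 0.5 <= f1:
            t0, t1 = temps[i], temps[i + 1]
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve does not cross a melted fraction of 0.5")


temps = list(range(20, 101, 5))                     # degrees C, 5-degree steps
curve = [fraction_melted(t, tm_c=62.0, width_c=4.0) for t in temps]
tm_estimate = estimate_tm(temps, curve)
```

With real spectroscopic data the absorbance would first be normalized to a melted fraction between 0 and 1 before applying the same interpolation.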

Stability of DNA in extreme environments

Main determinant of DNA stability is the fraction of G:C base pairs in a given oligonucleotide sequence

The primary difference between an A:T and a G:C base pair is that G:C has three hydrogen bonds (A:T has two), and is thus more stable

Other factors include hydrophobicity and interaction between salt and the DNA backbone

[Figure: plot of Tm (axis ticks 40 to 90) against DNA G:C content fGC (0.1 to 0.9) at 69 mM, 220 mM, and 1020 mM NaCl, shown alongside the A:T and G:C base pair structures]

Data taken from: Owczarzy, R., et al. Biochemistry 2004, 43, 3537-3554.
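The G:C fraction discussed on this slide is easy to compute from a sequence, along with the total number of base-pair hydrogen bonds in the corresponding duplex. The sketch below is a minimal illustration with hypothetical function names; it counts three hydrogen bonds per G:C pair and two per A:T pair, and it does not attempt to predict Tm, which also depends on salt and the other factors listed.

```python
# Hydrogen bonds contributed by the base pair that each base forms.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def gc_fraction(seq):
    """Fraction of positions in seq that are G or C (fGC)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def duplex_hydrogen_bonds(seq):
    """Total base-pair hydrogen bonds when seq is paired with its complement."""
    return sum(H_BONDS[base] for base in seq.upper())


example = "ATGC"  # short illustrative sequence
example_fgc = gc_fraction(example)
example_bonds = duplex_hydrogen_bonds(example)
```

Two sequences of equal length can then be ranked by fGC as a first, rough indicator of relative duplex stability.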

The many faces of RNA

RNA is primarily involved in protein synthesis and comes in three major types:

Messenger RNA (mRNA) is made by transcription of DNA and lists the amino acid sequence of a protein

Ribosomal RNA (rRNA) forms the skeleton of the ribosome, the machine which makes proteins

Transfer RNA (tRNA) transports amino acids into the ribosome

[Figure labels: Growing protein; Amino acid]

Structure of tRNA

tRNA is a good molecule to explore for environmental studies

Like DNA, RNA secondary structure has elements such as duplexes, loops, bulges, and hairpins

tRNA molecules are usually fairly small (fewer than 100 nucleotides)

tRNA has a relatively simple secondary structure

tRNA usually exists as a free molecule in the cell

[Figure: secondary structure of a tRNA drawn from its base sequence]

Stability of tRNA

The stability of tRNA can be measured spectroscopically, like DNA, but can also be calculated

Calculated free energy is obtained by factoring in the strength of noncovalent interactions in a folded and unfolded tRNA and is expressed as the free energy of complex formation, ∆Gf (NOTE: a lower, i.e. more negative, ∆Gf value indicates increased stability; formation is more favorable)

Initial ∆Gf values and predicted secondary structure can be calculated from raw sequence data:

E. coli: GGGGCTATAGCTCAGCTGGGAGAGCGCTTGCATGGCATGCAAGAGGtCAGCGGTTCGATCCCGCTTAGCTCCACCA

E. coli: ∆Gf = -28.9 kcal/mol

Thermoplasma acidophilum: GGGCCGGTAGATCAGAGGTAGATCGCTTCCTTGGCATGGAAGAGGcCAGGGGTTCAAATCCCCTCCGGTCCA

T. acidophilum: ∆Gf = -30 kcal/mol

Calculated ∆Gf values of the GGC codon tRNA from E. coli and T. acidophilum
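To get a feel for what the 1.1 kcal/mol difference between these two ∆Gf values means, the sketch below converts it into a ratio of formation equilibrium constants using K = exp(-∆Gf/RT). The two ∆Gf values come from the slide; the temperature of 298.15 K and the function names are assumptions made only for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

# Calculated free energies of formation from the slide, in kcal/mol.
DELTA_G_F = {"E. coli": -28.9, "T. acidophilum": -30.0}


def more_stable(values):
    """Name of the entry with the lowest (most negative) free energy of formation."""
    return min(values, key=values.get)


def equilibrium_ratio(dg_a, dg_b, temp_k=298.15):
    """K_a / K_b for two free energies of formation, from K = exp(-dG / RT)."""
    return math.exp(-(dg_a - dg_b) / (R_KCAL * temp_k))


winner = more_stable(DELTA_G_F)
ratio = equilibrium_ratio(DELTA_G_F["T. acidophilum"], DELTA_G_F["E. coli"])
```

The ratio depends on the temperature chosen, so a comparison at each organism's own growth temperature would give a different number.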

Proteins: amino acids and primary structure

Primary structure is determined by covalent amide bonds between individual amino acids

Amino acids

[Figure: general amino acid structure with labels: amino functionality; acid functionality; side chain “R-group”, which defines the chemical and physical nature of the amino acid]

Examples of amino acids

Alanine (A, Ala): slightly hydrophobic

Leucine (L, Leu): strongly hydrophobic

Serine (S, Ser): hydrophilic

Glutamic acid (E, Glu): hydrophilic, negative

Lysine (K, Lys): hydrophilic, positive

[Figure: chemical structure of a peptide chain]

A peptide or protein is a chain of 10-1000 amino acids
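The side-chain properties of the five example amino acids can be tabulated and applied to a short peptide. The sketch below is illustrative only: it covers just the five residues named on the slide, uses the standard one-letter code E for glutamic acid, and treats “negative” and “positive” as charges of -1 and +1. The function names and the example peptide are hypothetical.

```python
# (three-letter code, character, charge) for the five example residues.
SIDE_CHAIN = {
    "A": ("Ala", "hydrophobic", 0),
    "L": ("Leu", "hydrophobic", 0),
    "S": ("Ser", "hydrophilic", 0),
    "E": ("Glu", "hydrophilic", -1),
    "K": ("Lys", "hydrophilic", +1),
}


def net_charge(peptide):
    """Sum of side-chain charges over a one-letter peptide sequence."""
    return sum(SIDE_CHAIN[res][2] for res in peptide.upper())


def hydrophobic_fraction(peptide):
    """Fraction of residues whose side chain is listed as hydrophobic."""
    peptide = peptide.upper()
    hits = sum(SIDE_CHAIN[res][1] == "hydrophobic" for res in peptide)
    return hits / len(peptide)


peptide = "AKKLE"  # hypothetical five-residue peptide
peptide_charge = net_charge(peptide)
peptide_hydrophobic = hydrophobic_fraction(peptide)
```

Residues outside the five listed raise a `KeyError`, which makes it obvious when the table needs to be extended.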

Proteins: secondary structure

Secondary structures (folds) are defined by hydrogen bonding and steric interactions of the side chains

The β-sheet is a linear arrangement of amino acids

Structure is defined by inter-strand hydrogen bonds, less by sterics of side chains

Sheets can be parallel or anti-parallel, defined by orientation of the backbone

The α-helix is a coil of a peptide chain and has 3.6 residues per helical turn

Primary interactions are hydrogen bonding between residues along the helical axis and steric interactions between side chains

Other, but far less common, peptide folds include the coiled-coil, random coil, bulge, β-turn, 3₁₀ helix, 2₇ helix, π-helix, β-barrel, and so on…
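The figure of 3.6 residues per helical turn quoted on this slide fixes the basic geometry of the helix. The sketch below works out the rotation per residue and the number of turns for a helix of a given length. The rise of 1.5 Å per residue is a standard textbook value that does not appear on the slide, and the function name is hypothetical.

```python
RESIDUES_PER_TURN = 3.6            # from the slide
RISE_PER_RESIDUE_ANGSTROM = 1.5    # textbook value, not from the slide


def helix_geometry(n_residues):
    """Turns, rotation per residue, and axial length for an ideal helix."""
    return {
        "turns": n_residues / RESIDUES_PER_TURN,
        "degrees_per_residue": 360.0 / RESIDUES_PER_TURN,
        "length_angstrom": n_residues * RISE_PER_RESIDUE_ANGSTROM,
    }


geometry = helix_geometry(18)  # an 18-residue helix makes about five turns
```

The 100° rotation per residue means side chains three to four positions apart sit on the same face of the helix, which is where the steric interactions between side chains arise.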

Proteins: tertiary and quaternary structure

Tertiary Structure: the overall shape of a folded protein

Quaternary Structure: the assembly of multiple protein units into a larger structure

Tertiary and quaternary structure is defined almost entirely by noncovalent interactions

[Figures: protein structure images, including panels labeled “Top View” and “Side View”]

Protein model systems: α-helices

The α-helix is a common protein structural element which can be readily studied

α-Helices are the secondary structural element most susceptible to sequence and environmental factors, and the stability of helices is related to the stability of the overall protein

Structural stability is measured by spectroscopically observing helix unfolding

Like DNA melting, helix (and protein) stability is related to a structural denaturation

Graph taken from: Whitington, S. J., et al. Biochemistry 2003, 42, 14690-14695.

As for tRNA, ∆Gf can be calculated for helices or can be measured using circular dichroism spectroscopy by employing the relationship ∆Gf = -RT ln K, where K can be measured from the spectrum
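The relationship ∆Gf = -RT ln K can be applied directly once K is known. The sketch below assumes a simple two-state model in which K is the ratio of folded to unfolded populations, so that K = f/(1 - f) for a folded fraction f estimated from the spectrum. That two-state assumption, the function names, and the example fraction are illustrative and are not taken from the slide.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def k_from_fraction_folded(fraction_folded):
    """Two-state equilibrium constant K = folded / unfolded."""
    if not 0.0 < fraction_folded < 1.0:
        raise ValueError("fraction_folded must lie strictly between 0 and 1")
    return fraction_folded / (1.0 - fraction_folded)


def delta_g_from_k(k, temp_k):
    """Free energy of formation in kcal/mol from dG = -RT ln K."""
    return -R_KCAL * temp_k * math.log(k)


# Hypothetical example: a helix that is 90% folded at 25 C (298.15 K).
dg_example = delta_g_from_k(k_from_fraction_folded(0.9), temp_k=298.15)
```

A helix that is exactly half folded has K = 1 and ∆Gf = 0, which is the same midpoint condition used to define a melting temperature.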

Example of environment related structural differences

One example is the study of the helices of RecA

RecA is a protein involved with DNA repair, cell division, and other processes and is found in all environments

[Figure: Crystal structure of RecA from E. coli]

Crystal structure of RecA from E. coli was used as a template

RecA sequences from 29 proteins were aligned with that of E. coli, allowing for the determination of helical fragments

There are 10 helical regions in RecA

∆Gf values for these sequences were calculated and analyzed

This work was published as: Petukov, M. et al. Proteins: Structure, Function, and Genetics 1997, 29, 309-320

Thermophile helices are more stable

Calculated ∆Gf values indicated that helices of thermophile origin were more stable than mesophile helices

Eight of the thermophile helices were found to be more stable; these helices are likely related to STRUCTURAL stability

No change was found for two helices, both of which are directly involved in interactions with DNA and other proteins; these helices likely need to retain flexibility for FUNCTIONAL stability

Interestingly, total helix stability was found to be the same value if the optimal temperature for protein activity is taken into account; this is again related to the need for molecular flexibility

[Figure: total helix ∆Gf at 20 °C, 37 °C, and 80 °C for T. thermophilus (80 °C), E. coli (37 °C), and P. aeruginosa (20 °C)]

Biomolecular structural endemism

The Big Questions:

Are there molecular structures which are endemic in an environment?

If so, how and why are those structures arrived at?

Photo Credits: National Park Service Web pages

Study roadmap

Model Studies

Synthesis of short RNA and peptide sequences

Study structure of these molecules in lab-generated extreme (thermal/salt/pressure) environments

Computer models of these systems

Bioinformatics

Develop comprehensive listing of known protein/RNA sequences from public database

Search for environment-specific structural elements

Sample Collection

Identify environments for study (Hawaii lakes, Chile: Andes and Patagonia?)

Travel/Sample Collection/Data Analysis

Environments to be explored

Initial work will be carried out in Hawaiian lakes

These include Lake Kauhako (Moloka’i), Lake Wai’ele’ele (Maui), and Green Lake and Lake Waiau (Hawai’i)

These lakes are relatively accessible and will provide a ready data set that we will use to develop our sampling and analysis methodologies

This data set will also establish part of the mesophile baseline

South American Lake Environments

South America, specifically the Andes and Patagonia, has numerous extremophilic environments

South American lakes are less well studied from the biogeographical viewpoint, so we will be able to describe new environments

These environments are also geographically isolated from other extreme environments, which will allow for greater geographic variability

Other possible environments include deep-sea trenches and subglacial lakes (UH collaborations)

What we will look at: “adaptive” proteins

Proteins which serve a function adapted to the environment

Antifreeze protein: inhibits ice crystal formation

Potassium channel: transports K+ into the cell

Mechanosensitive channel: responds to osmotic stress

ATP sulfurylase: critical in sulfate-reducing bacteria

What we will look at: “conserved” proteins

Conserved proteins are those which would be expected to be more similar across environments because their function is ubiquitous

ATPase: synthesis of ATP, a cell energy source

DNA gyrase: involved in DNA packaging

Rhodopsin: light sensing and transduction

Pyruvate kinase: involved in glycolysis

Planned Methodologies: Bioinformatics and Sample Collection

Data that will be collected in the environment

Environmental DNA will be used to establish the biodiversity of a site as well as provide information regarding molecular sequences

Physical factors will also be taken into account, including temperature, salinity, nutrient composition, etc.

Bioinformatics is the term used to describe the mining of biological sequence and structural databases

The initial work here will be to develop a database of molecular sequences correlated with the organism of origin (which will tell us the nature of the environments they came from)

These sequences will then be examined for environment-specific structural motifs

This database will help to establish environmental targets and can be modified by biogeographical studies
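One possible shape for the sequence database described on this slide is a list of records that tie each sequence to its organism and environment, grouped by environment before searching for shared motifs. The sketch below is a guess at such a layout and not the planned implementation: the record fields, the function name, and the short sequence strings are all hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    organism: str      # organism of origin
    environment: str   # e.g. "mesophile" or "thermophile"
    molecule: str      # e.g. "tRNA" or "protein"
    sequence: str      # raw sequence text


def group_by_environment(records):
    """Collect records into a dict keyed by environment label."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.environment].append(record)
    return dict(grouped)


# Placeholder entries; the sequences are made up for illustration.
records = [
    SequenceRecord("E. coli", "mesophile", "tRNA", "GGGGCTATAG"),
    SequenceRecord("P. aeruginosa", "mesophile", "tRNA", "GGGTCGATAG"),
    SequenceRecord("T. thermophilus", "thermophile", "tRNA", "GCGGGCGTAG"),
]
by_environment = group_by_environment(records)
```

Physical measurements from sample collection (temperature, salinity, and so on) could be attached to the environment key so that sequences and site conditions stay linked.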

Methodologies: Model systems and computations

Synthesis and physical or computational characterization of model and natural peptides or nucleotide sequences

Model systems include α-helices, tRNA sequences, and more complicated peptides such as the helix bundle (common in membrane proteins)

These studies will provide us with a numerical quantity (∆Gf) for stability as well as molecular-level insight into the mechanism of stability

Other variants of this work include the study of the folding of proteins isolated from the environment and the study of peptide-oligonucleotide interactions

Bringing it all together

We will attempt to establish a relationship between the physical environment, biodiversity, and molecular structure

One way this can be accomplished is to generate plots of stability vs. structural similarity for individual environments

[Figure: schematic plot with axes “Increasing structural similarity” and “Increasing stability or protein activity”; the range along the stability axis indicates the stability window, and the range along the structural similarity axis indicates the variance of structures which are capable of surviving]

A small stability range would indicate that there are rigorous energetic requirements

A small structural similarity range would indicate environment-specific structures

If both values are small, it may indicate that structures evolved to meet the specific requirements of that environment
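The two ranges described on this slide can be computed from a set of (stability, structural similarity) points for one environment. The sketch below is a minimal illustration: the data points, the tolerance values that define “small”, and the function names are all hypothetical, and stability is taken to be a ∆Gf value in kcal/mol.

```python
def value_range(values):
    """Spread between the largest and smallest value."""
    return max(values) - min(values)


def summarize_environment(points, stability_tol, similarity_tol):
    """Ranges of stability and similarity, and whether each counts as small."""
    stability_range = value_range([stability for stability, _ in points])
    similarity_range = value_range([similarity for _, similarity in points])
    return {
        "stability_range": stability_range,
        "similarity_range": similarity_range,
        "tight_stability": stability_range <= stability_tol,
        "tight_similarity": similarity_range <= similarity_tol,
    }


# Made-up (dGf in kcal/mol, similarity score from 0 to 1) points for one site.
hot_spring = [(-30.0, 0.90), (-29.5, 0.92), (-30.2, 0.91)]
summary = summarize_environment(hot_spring, stability_tol=1.0, similarity_tol=0.05)
```

When both flags come out true for an environment, that is the case the slide singles out as possible evidence of structures evolved for that environment.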