GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.


Transcript of GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Page 1: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

GN3502: Bacterial Genetics

Ken Forbes

Medical Microbiology

Page 2: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis

1. “Classical” bacterial genetics

2. New approaches
– Physical mapping of genomes
– Whole genome sequencing
– Functional analysis

3. New perspectives on bacterial genetics
– Origin of species
– Bacterial lifestyles

Page 3: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

“Classical” view of bacteria

• Single chromosome
• May have plasmids and phage
• Simple gene structure
• Genes have recognisable phenotype
• Can do genetics in lab
– gene transfer
  • transformation
  • transduction
  • conjugation
– molecular biology

Page 4: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Classical methods are not adequate

• Bacteria live in many diverse habitats
• Much diversity within a species
• Most genes in most species have not yet been identified

Page 5: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Have most of the genes in any species been identified?

• Traditional genetic and molecular methods have identified a function for only half of the genes in E. coli

• Constraints from
– methodologies
– many genes will not be expressed in the lab

• New approaches needed
– genome oriented
– sequence oriented

Page 6: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis

1. “Classical” bacterial genetics

2. New approaches
– Physical mapping of genomes
– Whole genome sequencing
– Functional analysis

3. New perspectives on bacterial genetics
– Origin of species
– Bacterial lifestyles

Page 7: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis: 2. New approaches

Physical mapping of genomes
– Methods: PFGE, clone libraries
– Discoveries: bacterial genomes (size, shape, replicons)

Whole genome sequencing
– Methods: sequencing strategies
– Discoveries: gene organisation, assigning function

Functional analysis
– Discoveries: new genes
– Methods: for individual genes; for whole genomes (DNA arrays, proteome)

Page 8: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Physical mapping of genomes

• Low resolution restriction enzyme maps of whole genome

• Locate genes on the map using DNA-based techniques

• PHYSICAL map of chromosome not a GENETIC map

Page 9: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Pulsed-Field Gel Electrophoresis

• Restriction map whole chromosome with rare-cutting REs
– complete digests
– partial digests
– double digests

[Figure: restriction map of the chromosome showing sites labelled E, H and S; scale bar 1 Mb]
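The logic of complete and double digests can be sketched in a few lines of Python. This is an illustration only: the function name `digest_circular`, the 4,600 kb chromosome and the site positions for two enzymes labelled E and H are made-up values, not data from the lecture.

```python
def digest_circular(genome_kb, sites_kb):
    """Fragment sizes (kb) from a complete digest of a circular chromosome."""
    cuts = sorted(set(s % genome_kb for s in sites_kb))
    if not cuts:
        return [genome_kb]  # no sites: the circle stays intact
    fragments = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # Distance to the next site, wrapping around the origin;
        # a single site simply linearises the whole circle.
        fragments.append((next_cut - cut) % genome_kb or genome_kb)
    return sorted(fragments, reverse=True)


GENOME_KB = 4600                 # hypothetical chromosome size
SITES_E = [200, 1500, 3100]      # hypothetical sites for enzyme E
SITES_H = [900, 4000]            # hypothetical sites for enzyme H

print("E digest:     ", digest_circular(GENOME_KB, SITES_E))
print("H digest:     ", digest_circular(GENOME_KB, SITES_H))
print("E + H digest: ", digest_circular(GENOME_KB, SITES_E + SITES_H))
```

In practice the problem runs the other way: the fragment sizes are read off the gel and the site order is inferred by comparing the single and double digests. The sketch only shows why each map predicts a distinct set of fragment sizes.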

Page 10: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Pulsed-Field Gel Electrophoresis

• Cultured cells mixed with molten agarose
• Embedded cells
• Incubate with Proteinase K
• Inactivate Proteinase K & wash to remove cell debris
• Trapped HMW DNA

Page 11: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Pulsed-Field Gel Electrophoresis

• Digest with rare-cutting restriction enzyme
• Periodic switching (pulsing) between electrode pairs

[Figure: gel with two electrode pairs (+/-) and the direction of net migration]

Page 12: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Mapping genes on whole genome RE maps

• Hybridize cloned-gene DNA fragment to PFGE fragments
– locate gene on map

[Figure: whole-genome restriction map (sites labelled E, H and S) with the positions of geneA, geneB and geneC marked]

Page 13: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Ordered clone libraries: method

• Make clones of entire genome
– Ø clones of whole genome
  • small (10’s kb) size of inserts means 1000’s clones required to cover whole chromosome
– Bacterial Artificial Chromosomes (BAC)
  • clone in E. coli F plasmid
  • large (100’s kb) size of inserts means fewer clones needed

• Order the clones into contigs
– overlapping clones will cross-hybridise
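The link between insert size and the number of clones needed can be made concrete with the Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is insert size divided by genome size and P is the desired probability that any given sequence is in the library. The formula is not on the slide and is brought in here as an assumption; the 4.6 Mb genome, the 15 kb and 150 kb insert sizes and P = 0.99 are illustrative choices.

```python
import math


def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clarke-Carbon estimate of the library size for a random clone library."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


GENOME_BP = 4_600_000  # assumed genome size (about that of E. coli)

for label, insert_bp in (("10's kb inserts", 15_000), ("BAC inserts", 150_000)):
    n = clones_needed(GENOME_BP, insert_bp)
    print(f"{label} ({insert_bp // 1000} kb): about {n} clones")
```

Under these assumptions the small-insert library needs on the order of a thousand or more clones, while the BAC library needs roughly ten times fewer. Note that this counts random clones, not the minimal ordered set that remains after the clones have been arranged into contigs.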

Page 14: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Ordered clone libraries

• Disadvantages
– not all regions clonable
– labour intensive and expensive

• Advantages
– immortalised source of genomic DNA
– minimally redundant
– easy to find and sub-clone a gene of interest
– identify adjacent genes
– use in genome sequencing projects

Page 15: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Ordered clone libraries: applications

• E. coli K12
– widely used lab strain

• Mycobacterium leprae
– obligate human pathogen
– not cultivable in vitro
– genetic analysis impossible
– ordered clone library allowed molecular genetic analysis

Page 16: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Physical mapping

• Pros
– only need DNA of organism
– standard molecular biology methods used

• Cons
– low resolution
– no phenotypic information about genes

Page 17: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis: 2. New approaches

Physical mapping of genomes
– Methods: PFGE, clone libraries
– Discoveries: bacterial genomes (size, shape, replicons)

Whole genome sequencing
– Methods: sequencing strategies
– Discoveries: gene organisation, assigning function

Functional analysis
– Discoveries: new genes
– Methods: for individual genes; for whole genomes (DNA arrays, proteome)

Page 18: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacterial genomes come in many different sizes

• Range 0.6 Mb – 9 Mb
• Bigger genomes encode more genes
• < 2 Mb: specialist species
– restricted ecological niche (Mycoplasma)
– fastidious growth (Haemophilus influenzae)
– obligate intracellular parasites (Chlamydia)
• 3 – 5 Mb: generalist species
– broad metabolic potential, few organic growth requirements (E. coli)
• > 5 Mb: species with developmental cycles
– (Streptomyces: mycelial growth, spores, complex bioactive compounds)

Page 19: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacterial genomes come in different conformations

• Circular chromosomes
– the traditional view: E. coli

• Linear chromosomes
– Borrelia

• Plasmids
– circular and linear forms

Page 20: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacterial genomes can have several chromosomes

• “Chromosomes must harbour some essential genes”
– ribosomal RNA (rrn)

• “Plasmids should not be required for viability”
– only encode supplementary functions
– can be very large (1-2 Mb)

Page 21: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacterial genomes

• Most species have one chromosome
– eg E. coli
  • 1x circular chromosome with rrn, housekeeping genes

• Some species have 2 chromosomes (a few 3)
– eg Agrobacterium tumefaciens
  • 2x chromosomes each with rrn and housekeeping genes: 1x circular 3 Mb, 1x linear 2 Mb
  • 2x plasmids, circular, 200 kb and 450 kb

Page 22: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Physical mapping: conclusions

• Bacterial genomes are very variable
– chromosome size, conformation, number
– plasmids often very important, but not essential

• Genomes have a large coding capacity
– this reflects bacterial biodiversity
– there are many genes of unknown function
– laboratory analysis imposes constraints on understanding of many genes

• How can you identify all of the genes in a species?

Page 23: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis: 2. New approaches

Physical mapping of genomes
– Methods: PFGE, clone libraries
– Discoveries: bacterial genomes (size, shape, replicons)

Whole genome sequencing
– Methods: sequencing strategies
– Discoveries: genome organisation, identify genes

Functional analysis
– Discoveries: new genes
– Methods: for individual genes; for whole genomes (DNA arrays, proteome)

Page 24: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Whole genome sequencing

• Whole genome sequences now available for
– 300 bacterial species/strains
– most pathogens
– representatives of most bacterial lineages

• Haemophilus influenzae genome published 1995

Page 25: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Whole genome sequencing

• Advantages
– inexpensive
– all of genome seq’ available
– all genes identified

• Requirements
– automated DNA sequencing machines
– massive computing power

• “Factory sequencing”

Page 26: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Fluorescent sequencing

• DNA sequencing reaction
– Sanger terminator chemistry
  • nt chain extension until blocked by terminator nt
– terminator nt has fluorescent dye attached
  • each nt has different colour

Page 27: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Phases of sequencing project

• Primary sequencing phase
– random accumulation of seq’ into contigs

• Linking phase
– contigs linked together using directed sequencing methods

• Polishing phase
– removal of sequence ambiguities from the single contig

• Finished sequence
– analyse, annotate

Page 28: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Genome sequencing strategies

• Total-genome shotgun sequencing
• Primer walking
• Mixed strategy

Page 29: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Total-genome shotgun sequencing

• Shotgun cloning
– shear DNA into random fragments of 1-5 kb
– clone into vector

• Sequencing primers in vector

[Figure: vector carrying a cloned insert, with the sequencing primer sites in the vector]

Page 30: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Total-genome shotgun sequencing: advantages

• Don’t require map of genome
• Sequencing machines at continuous full capacity
• Sequence polishing only done once
• Greater accuracy through multiple coverage
– 6-10 fold genome equivalents
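What 6-10 fold coverage means in terms of reads can be sketched as follows. The calculation uses the idealised Lander-Waterman expectation that a base is missed by every read with probability e^(-coverage); that model, the 500 base read length and the function names are assumptions added for illustration, not part of the lecture.

```python
import math


def reads_for_coverage(genome_bp, read_bp, coverage):
    """Number of random reads needed to reach a given fold coverage."""
    return math.ceil(coverage * genome_bp / read_bp)


def expected_fraction_covered(coverage):
    """Idealised expectation: a base is missed with probability e^-coverage."""
    return 1 - math.exp(-coverage)


GENOME_BP = 4_600_000   # assumed genome size (about that of E. coli)
READ_BP = 500           # assumed read length, not from the lecture

for fold in (1, 3, 6, 10):
    n = reads_for_coverage(GENOME_BP, READ_BP, fold)
    covered = expected_fraction_covered(fold)
    print(f"{fold:>2}x: {n:>6} reads, ~{covered:.4%} of bases covered")
```

The model assumes perfectly random, clonable fragments, so real projects leave more gaps than it predicts; unclonable and repetitive regions are not captured at all.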

Page 31: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Total-genome shotgun sequencing: disadvantages

• Repeat coverage is wasteful
• Can’t clone some genomic regions
• Repetitive regions in genome
– can’t map each to its correct genomic position
– prevents contigs from being joined together
  • other methods required to span across each repeat
• Sequence assembly and analysis can only be done at end of sequencing phase

Page 32: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Primer walking

• Require ordered clone library
• Primer walk along each cloned fragment
– first primer in vector
  • sequence into cloned DNA
– next primer in new seq’
  • sequence further into cloned DNA
– start at each end of cloned fragment
– cycles of: sequencing, polishing, primer design, primer synthesis

Page 33: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Primer walking

• Advantages
– high quality, useable sequence obtained from start
– sequence produced in large contigs
– no repeat coverage
– both strands sequenced

• Disadvantages
– many expensive primers needed
– time lag between walks
– little automation, sequencing machines often idle

Page 34: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Mixed strategy

• Most popular strategy
• Combine advantages of both methods
– initial random-sequencing phase
  • on either whole genome or on set of ordered clones
  • typically 3-6 fold coverage
– final primer-walking over gaps

Page 35: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Ultrahigh throughput sequencing

• Sequencing by Synthesis (SBS)
– eg SOLEXA
– generates short (18-35 base) reads

[Video: SBS chemistry]

Page 36: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Ultrahigh throughput sequencing

• Sequencing by Synthesis (SBS)
– template of tens of millions of individual, clonally amplified DNA fragments
– yields up to 1 gigabase sequence in total
– avoids cloning steps
– inexpensive: £500/bacterial genome
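The figures on these two slides hang together, as a quick calculation shows. The 1 gigabase yield and the 18-35 base read lengths are from the slides; the 4.6 Mb genome is an assumed example and the function names are hypothetical.

```python
TOTAL_BASES = 1_000_000_000   # "up to 1 gigabase" per run, from the slide
GENOME_BP = 4_600_000         # assumed example genome (about that of E. coli)


def mean_depth(total_bases, genome_bp):
    """Average number of times each base of the genome is read."""
    return total_bases / genome_bp


def read_count(total_bases, read_bp):
    """Whole reads of a given length that make up the total yield."""
    return total_bases // read_bp


print(f"mean depth: {mean_depth(TOTAL_BASES, GENOME_BP):.0f}x")
for read_bp in (18, 35):
    print(f"{read_bp} base reads: {read_count(TOTAL_BASES, read_bp):,} reads")
```

Both read lengths give read counts in the tens of millions, in line with the number of template fragments quoted, and a depth far beyond the 6-10 fold used in Sanger shotgun projects.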

Page 37: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis: 2. New approaches

Physical mapping of genomes
– Methods: PFGE, clone libraries
– Discoveries: bacterial genomes (size, shape, replicons)

Whole genome sequencing
– Methods: sequencing strategies
– Discoveries: genome organisation, identify genes

Functional analysis
– Discoveries: new genes
– Methods: for individual genes; for whole genomes (DNA arrays, proteome)

Page 38: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Genome organisation

• Can identify
– all protein and RNA coding genes
– organisation of genes
  • in genome
  • with respect to each other

Page 39: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

E. coli genome

• Traditional genetic and molecular methods have identified 2220 genes in E. coli

Page 40: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

E. coli genome

• Whole genome sequencing has identified 4288 protein coding genes in E. coli genome

Page 41: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

E. coli genome

• genetic map = 100 min
• physical map = 4.6 Mb
• 1 min = 46 kb
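The conversion between the two maps is simple arithmetic, sketched below. It assumes the 46 kb per minute figure applies uniformly around the chromosome, which is only an average, and the map positions used are hypothetical.

```python
GENETIC_MAP_MIN = 100     # length of the genetic map, from the slide
PHYSICAL_MAP_KB = 4600    # 4.6 Mb physical map, from the slide

KB_PER_MIN = PHYSICAL_MAP_KB / GENETIC_MAP_MIN


def minutes_to_kb(minutes):
    """Approximate physical position (kb) of a genetic map position (min)."""
    return (minutes % GENETIC_MAP_MIN) * KB_PER_MIN


def interval_kb(start_min, end_min):
    """Approximate physical distance going clockwise on the circular map."""
    return ((end_min - start_min) % GENETIC_MAP_MIN) * KB_PER_MIN


print(KB_PER_MIN)            # kb per minute
print(minutes_to_kb(25))     # hypothetical marker at 25 min
print(interval_kb(98, 2))    # hypothetical interval spanning the 0/100 min origin
```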

Page 42: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Genome organisation

• >90% of genome codes for genes
• Genes
– identified in genome sequence by
  • Open Reading Frame (ORF)
  • homology to known genes in other spp
• Regulation of gene expression
– promoter and ribosome binding site sequences
– operons and linked genes

Page 43: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Identifying genes: by phenotype

• Genes traditionally identified by genetic analysis
– robust identification of gene by its function

Page 44: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Identifying genes: by DNA homology

• Identify gene by sequence homology
• Need previously characterised gene in another species
– high homology between them
– robust identification of the previously characterised gene
– but new gene may have different biological role

Page 45: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Identifying genes: by Open Reading Frame

• ORF: “a DNA seq with no stop codons”
• Only genes coding for proteins
• Ends of the gene not easily defined
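A minimal ORF scan can be sketched in Python. It is a simplification under stated assumptions: an ORF is taken to run from the first ATG in a frame to the next in-frame stop codon, and the `min_codons` cut-off of 100 is arbitrary. Bacteria also use alternative start codons such as GTG and TTG, which is one reason the true start of a gene is hard to define from sequence alone.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"                      # assumption: ignore GTG/TTG starts
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ATG...stop stretches in the 3 frames of one strand."""
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == START_CODON:
                start = pos
            elif start is not None and codon in STOP_CODONS:
                # Count codons from the start codon up to, not including, the stop.
                if (pos - start) // 3 >= min_codons:
                    yield start, pos + 3
                start = None


def find_orfs(seq, min_codons=100):
    """Scan all six reading frames; '-' coordinates refer to the reverse complement."""
    seq = seq.upper()
    found = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    found += [("-", s, e) for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return found


toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"   # made-up sequence
print(find_orfs(toy, min_codons=3))
```

An ORF found this way is only a candidate gene: it says nothing about genes for RNA products, and short real genes fall below any length cut-off.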

Page 46: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacterial genomes have many genes with no known function

• 60% of genes have a recognisable function
– but the specific role of many is unknown
• 40% of genes have no known function
– 10% found in other species
  • conserved protein families
  • important housekeeping genes?
– 30% unique to each sp
  • determine pathogenicity, lifestyle

Page 47: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis: 2. New approaches

Physical mapping of genomes
– Methods: PFGE, clone libraries
– Discoveries: bacterial genomes (size, shape, replicons)

Whole genome sequencing
– Methods: sequencing strategies
– Discoveries: genome organisation, identify genes

Functional analysis
– Discoveries: new genes
– Methods: for individual genes; for whole genomes (DNA arrays, proteome)

Page 48: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Assigning function to novel genes

• How do you determine the function of genes identified by seq’ rather than by phenotype?

• For individual genes use an appropriate molecular genetic technique
– gene knockouts
– conditional lethal mutations
– control region probes

Page 49: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Assigning function to new genes

• Individual genes
– gene knockouts
– conditional lethal mutations
– control region probes

• Whole genome
– DNA arrays
– proteome analysis

Page 50: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

DNA arrays

• Macroarrays
– DNA fragment probes (eg PCR product)
– one per gene
– array on membrane (10³s)

• Microarrays
– oligonucleotide probes
– several oligonucleotides per gene
– array on glass (10⁵s)

Page 51: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

DNA arrays

• Colour = relative ORF expression
• Intensity = extent of ORF expression

[Figure: two-colour array image; spots show expression in Sample A, in Sample B, or in both samples]
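The two read-outs of a two-colour array spot can be expressed numerically. The sketch below uses the common convention of a log2 ratio for "colour" and the mean log2 signal for "intensity"; that convention, the `floor` value and the spot readings are assumptions for illustration, not values from the lecture.

```python
import math


def spot_summary(sample_a, sample_b, floor=1.0):
    """Return (log2 ratio, mean log2 intensity) for one array spot."""
    a = max(sample_a, floor)   # floor avoids taking the log of zero
    b = max(sample_b, floor)
    log_ratio = math.log2(a / b)          # "colour": which sample dominates
    intensity = 0.5 * math.log2(a * b)    # "intensity": overall expression
    return log_ratio, intensity


# Hypothetical fluorescence readings for three ORFs: (Sample A, Sample B)
spots = {"orf1": (800, 200), "orf2": (500, 500), "orf3": (50, 1600)}

for name, (a, b) in spots.items():
    ratio, level = spot_summary(a, b)
    print(f"{name}: log2(A/B) = {ratio:+.2f}, mean log2 intensity = {level:.2f}")
```

A ratio near zero corresponds to a spot expressed equally in both samples, while large positive or negative ratios mark ORFs expressed mainly in Sample A or Sample B.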

Page 52: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

DNA arrays: applications

• Gene expression (mRNA)
– transcriptome

• Presence/absence of genes (DNA)
– genome polymorphisms

Page 53: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Proteomics

• 2D electrophoresis of cellular proteins
– separate by charge then by size
– AA sequence spot of interest
– refer back to genome sequence

• Characterisation of all expressed proteins

Page 54: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Lecture synopsis

1. “Classical” bacterial genetics

2. New approaches
– Physical mapping of genomes
– Whole genome sequencing
– Functional analysis

3. New perspectives on bacterial genetics
– Origin of species
– Lifestyles

Page 55: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Why have bacteria so many genes?

• 60% have recognisable function
– specific role of many genes unknown
  • eg function known only to the level of enzyme class
• 40% have no known function
– 10% common, conserved gene families
– 30% unique to each species

Page 56: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Some genes are common to many species

• Conserved gene families
• Presumably housekeeping genes
• Potential targets for novel antibacterials

Page 57: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Some genes are unique to one species

• These genes give a sp its unique characteristics
• Allow adaptation to a particular lifestyle
• Virulence genes

Page 58: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

How many genes does a pathogen need?

• Mycobacterium tuberculosis
– mechanism of pathogenesis unknown
– 4.4 Mb genome
– 3994 genes
  • 1/3 known function
  • 1/3 similar proteins
  • 1/3 unknown
– in vivo: 300 genes not required
– in vitro: 3000 genes not required

Page 59: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Some species are apparently “missing genes”

• Many pathogens have complex growth requirements
• Some functions or pathways absent
– genes for some pathways eliminated
  • nutrients supplied by host
– adaptation to niche
  • H. pylori lives in acidic environment of stomach
  • does not ferment sugars (acidic products)
  • does ferment amino acids (alkaline products)

Page 60: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

“Community Genomics Among Stratified Microbial Assemblages in the Ocean's Interior”

DeLong et al. (2006) Science 311, pp. 496-503

• Planktonic microbial communities in Pacific Ocean
– sampled from ocean surface to sea floor
– sequenced 64 million base pairs
– thousands of new genes

• Variations in sequences at different depths
– near the ocean surface
  • photosynthetic and mobile microorganisms
  • more genes for iron uptake
– deeps
  • a predominance of "adhesive" microbes
  • antibiotic synthesis genes

Page 61: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacteria are diverse

• Organisms do not live in isolation
• Organisms interact with host/environment
• Organisms often dependent on each other
– nutrient flow through biological systems
• Use genomics to understand the interaction between spp at gene level

Page 62: GN3502: Bacterial Genetics Ken Forbes Medical Microbiology.

Bacteria are diverse

Stereo micrograph of dental plaque. Nutrient flow from cocci to filamentous bacteria.