A super quick introduction to molecular biology and...

36
A super quick introduction to molecular biology and genomics Héctor Corrada Bravo Dept. of Computer Science Center for Bioinformatics and Computational Biology University of Maryland University of Maryland, Fall 2015

Transcript of A super quick introduction to molecular biology and...

Page 1: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

A super quick introduction to molecular biology and

genomics

Héctor Corrada Bravo Dept. of Computer Science

Center for Bioinformatics and Computational BiologyUniversity of Maryland

University of Maryland, Fall 2015

Page 2: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Key  terms• Genotype/Phenotype  • Cell  • Proteins    • Evolu6on:  inheritance,  selec6on,  varia6on  • DNA/RNA  • Chromosome  • Gene  • Genome  • Replica6on  • Transcrip6on  • Exon/Intron  • Transla6on  • Codon  • Central  Dogma  • Gene  Expression  • Regula6on  • Epigene6cs

Page 3: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Why  are  my  children        such  pigs?

Page 4: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Why  am  I  such  a  pig?

Phenotype,  cells,  metabolism,  protein

Page 5: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Proteins• phenotype:  characteris6cs  (traits)  of  an  organism  • characteris6cs  due  to  cellular  structures  and  ac6vi6es  –mostly  carried  out  by  proteins  

• Examples:

5

alpha-­‐kera7n component  of  hair

insulin regulates  blood  glucose  level

ac7n  &  myosin muscle  contrac7on

hemoglobin oxygen  transport

DNA  polymerase synthesis  of  DNA

DNA  glycosylases DNA  repair

matrix  metalloproteinase extra-­‐cellular  matrix  degrada7on

Page 6: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Gene6cs

• gene:  in  classical  gene6cs  it  was  an  abstract  concept  – a  unit  of  inheritance  passed  from  parent  to  offspring  – specify  proteins  

• genome  refers  to  the  complete  set  of  genes  • genotype:  gene6c  characteris6cs  of  an  individual

6

Page 7: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Hector Corrada Bravo

What is Genomics?

• Study the molecular basis of variation in development and disease

• Using high-throughput experimental methods

• algorithms

• ML

• data management

• modeling

7

cancer

healthy

Page 8: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor
Page 9: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

What  is  Genomics?• Each  cell  contains  a  complete  copy  of  an  organism’s  genome,  or  blueprint  for  all  cellular  structures  and  ac6vi6es.  

• The  genome  is  distributed  along  chromosomes,  which  are  made  of  compressed  and  entwined  DNA.  

• Cells  are  of  many  different  types  (e.g.  blood,  skin,  nerve  cells),  but  all  can  be  traced  back  to  a  single  cell,  the  fer6lized  egg.

Page 10: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Chromosomes

These  are  actually  human.  And  for  a  down  syndrome  pa6ent

Page 11: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

DNA

Watson  and  Crick  1953

DNAs  (Deoxyribonucleic  acids)  are  molecules  to  store  gene6c  informa6on  of  a  living  organism.  

DNA  consists  of  two  polymers  made  from  four  types  of  nucleo6des:  adenine  (A)  guanine  (G),  cytosine  (C)  and  thymine  (T).  

Purines:  A,  G;  Pyrimidines:  C,  T  

Two  polymers  are  complementary  to  each  other  and  from  a  double-­‐helix  structure  

5’-ACCGTTCGACGGTAA-3’   |||||||||||||||   3’-TGGCAAGCTGCCATT-5’

Page 12: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

chromatin

Page 13: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Measurement

• For  a  small  enough  piece,  we  can  measure  the  sequence  of  bases,  referred  to  as  sequencing  

• Human  Genome  Project

Page 14: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

GenomeTCAGTTGGAGCTGCTCCCCCACGGCCTCTCCTCACATTCCACGTCCTGTAGCTCTATGACCTCCACCTTTGAGTCCCTCCTCTCACACCTGACATGAAAAGGCACATGAGGATCCTCAAATACCCCGTGATCAGTCTCAGGGTAGCTCTCATAGCCTGGACAGGGCCCCCCTCGGGGGTTGCGCCCAGGTCCAGGCGGGGGATGCACAGCAACAGTCACCGAAGCAGAAGCCGTCACAGTGGTGATGGGCTGGCAGTAGCTGGGCACAGAGCTGCCCATGGCGGTGGACGTTGGGTTCCGAGGGTTGTGAGAACGGGCCCCACGGGGCCCTGAGCGGTCCCTATTGCTAGGGCCAGAATGCCCTTCAGTAGAAATTTCAAAAGCGTCTCTGCGCGGTCTGTAGGGGGGTGGCCGCAAGCCTTCTCTAGGGGGATCCCTTCGAGGCTGCTGGCCTTGCCGTCCAGGGGACAAGGAGCCAGAGTCCAGGTGGGGCTGTTGCCGAGGGGTCAAGGGAGGCTGATGTCTGGAGTCCGGATGGACCACCTGCAGAGGAGAGACATAGGTCAACACAGGGAGGTAGGATGGTGGTGATGTTCCACCCACAAAAGAAAACCTATTCCTTTAGAAACCTCCAGGATGTGAATCCTGCCTGCACCTGCACAGCTGGCTGGAGGCATATAGCCACTGCCCATAGATCTCAACTTACCCTCACAACCAACTGCCCCCAGGCCTAAGTTCTCTGCCTCAAAACTGCCAAGGCCTGGATAGCCAAGAGCCTGGGTGTCTTGGAAATATGCAACCATAAATAGTAGCTTTTAGAAGTATAAGGCTCCTGTTTCTGGGTCATATTAGTGTTGTTTTCACCTGTCCCCAGCCCTAAGCCAGGTGTGGCCAGAAGCAAATGTACTGTAAGAGCAGAGCAAAAACTTCCACACAGATAGTTCTGTTAGGCAATACATCTCTGCCTGACTATTAGGAATCTGGTTTCTGGGTCCTCTGTACAAAGCTCGGAGCAACACAGTGGCCACATCAATCAAAAGGACCGTGACCAACTTCAAAGTCGGTGAGCTTGTACCTATTTTTAGGCTCCTGCTGAACAGAACCAGATTCACACTACAGCTCAGCAGGGCATCGTCACGGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTGGGGGGGGGGGGTGGACAGAGGACGGGGACACAATTCACTGGCCAGCCCTTCTCTCCTTCAAGGAAGGCTGCTCTAGCCTGGGACTGGAATACACATTTCCTGTAAACATGGTGGGGGCCTCAGGCAAGCCAGAGTTTTGGAGCCTTCCTTAACTCTTCAAGGTGAGCATCTTGACTTGGAGGGTGGGGGTGCGGGTAAGGAAGGAACCTGTGGACTCCTCCCTACAAGACAGAAAAGGAATAAGCCACGAAGACAATAACGATTTTTGTATCAAGCGTCCTCTCCCATTTCAGCTTACCTGACAATGAAATCAAATTCGGACCCTGCAAGCATCAGTACACCCAGCAGAGTGGACACAGCACCGTCCAGAACGGGAGCAAACATGTGCTCCAGAGCGAGCATAGCCCTGTGGTTCTTGTCCCCAATGGCTGTCAGAAAGGCCTGAACAAAGGAGAAAATTGACACGGTCACATTCTGGGTGTGGTAAAGTGCTCAGCTGTGTCTATACTTGGGTTTTGTAT…

Total  amount  of  DNA  in  human  genome:  3  *  109  base  pairs  (bp)

Page 15: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor
Page 16: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Replica6on

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

Page 17: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

Page 18: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

     C            C      C    G    T    A  A    G    T      A    T  T    T      G          

         T    T    G  G    G  T  A  A  T  G  C      

     A  T  G  G      G  T  C  A  A        T  T  A    

   T  T  T      A      G    T      A    G    

         A  A  T      G  T      Cnucleo6des  available  in  cells  

Page 19: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

Page 20: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

T  T  C  G  A  T  T  A  C  G  A

A  A  G  C  T  A  A  T  G  C  T

Page 21: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Genes

Gene Gene Gene Gene Gene

Page 22: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor
Page 23: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Central  Dogma

DNA RNA Proteins

Genes  encode  proteins  which  are  transcribed  into  mRNA  and  translated  into  proteins.

Page 24: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Transcrip6on

C T A G C G C T C  

| | | | | | | | |  G A T C G C G A G

DNA

C U A G C G

RNA  polymerase

mRNA

Page 25: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

http://www.uniprot.org/uniprot/P09238

Page 26: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

Transla6on

h\p://gel.ym.edu.tw/~ycl6/sc2005/images/transla6on.gif

Page 27: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

The genetic code

Page 28: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

gene regulation

Page 29: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

gene regulation

http://string-db.org/version_9_0/newstring_cgi/show_network_section.pl?identifier=9606.ENSP00000279441&all_channels_on=1&network_flavor=evidence&targetmode=proteins

Page 30: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

What  makes  them  different?

Much  human  varia6on  is  due  to  difference  in  ~  6  million  base  pairs  (0.1  %  of  genome)  referred  to  as  SNPs

Page 31: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

TACATAGCCATCGGTANGTACTCAATGATGATAGenomic  DNA: A SNP

G

Single  Nucleo6de  Polymorphism  (SNP)  

Three  genotypes

Page 32: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

TACATAGCCATCGGTAAGTACTCAATGATGATA

AA

ATGTATCGGTAGCCATTCATGAGTTACTACTAT

TACATAGCCATCGGTAAGTACTCAATGATGATAATGTATCGGTAGCCATTCATGAGTTACTACTAT

Mother

Father

Page 33: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

TACATAGCCATCGGTAAGTACTCAATGATGATA

AG

ATGTATCGGTAGCCATTCATGAGTTACTACTAT

TACATAGCCATCGGTAGGTACTCAATGATGATAATGTATCGGTAGCCATCCATGAGTTACTACTAT

Mother

Father

Page 34: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor

TACATAGCCATCGGTAGGTACTCAATGATGATA

GG

ATGTATCGGTAGCCATCCATGAGTTACTACTAT

TACATAGCCATCGGTAGGTACTCAATGATGATAATGTATCGGTAGCCATCCATGAGTTACTACTAT

Mother

Father

Page 35: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor
Page 36: A super quick introduction to molecular biology and genomicsusers.umiacs.umd.edu/~hcorrada/CMSC423/lectures/... · A super quick introduction to molecular biology and genomics Héctor