Lecture #4 : Comparing genes
![Page 1: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/1.jpg)
Lecture #4 : Comparing genes
9/14/09
![Page 2: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/2.jpg)
This week
Homework #2 due on Wed
  Email with questions
  Email me answers or hand in in class
Wed - I will be at Dept of Biology retreat
  Lecture will be given by Kelly O’Quin - expert in phylogenetics
  He will go over homework so it must be done before class
![Page 3: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/3.jpg)
Questions for today
0. More BLAST
1. Where do we get high quality gene sequences?
2. How do genes evolve?
3. How do we compare genes?
![Page 4: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/4.jpg)
How to find genes
Start with genes which are known from model organisms
Use these to pull out genes from genomes
Compare genes to learn about sensory evolution
![Page 5: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/5.jpg)
Blast - Genbank
What database do you want to search?
What do you want to compare?
What program do you want to do the searching?
![Page 6: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/6.jpg)
Types of blast queries

| Query | Database | Type |
|---|---|---|
| Nucleotide | Nucleotide | Blastn, Megablast, Discont megablast |
| Protein | Protein | Blastp, Psi-blast, Phi-blast |
| Translated nucleotide | Protein | Blastx |
| Protein | Translated nucleotide | Tblastn |
| Translated nucleotide | Translated nucleotide | Tblastx |
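The query/database combinations on this slide can be written down as a small lookup. This is only a sketch for keeping the combinations straight: the dictionary and the function name `programs_for` are made up for illustration and are not part of any BLAST software.

```python
# Lookup of BLAST program families by (query type, database type),
# copied from the "Types of blast queries" slide.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["blastn", "megablast", "discontiguous megablast"],
    ("protein", "protein"): ["blastp", "psi-blast", "phi-blast"],
    ("translated nucleotide", "protein"): ["blastx"],
    ("protein", "translated nucleotide"): ["tblastn"],
    ("translated nucleotide", "translated nucleotide"): ["tblastx"],
}


def programs_for(query_type, database_type):
    """Return the programs listed for this combination, or [] if none is listed."""
    key = (query_type.lower(), database_type.lower())
    return BLAST_PROGRAMS.get(key, [])


# A protein query searched against a translated nucleotide database:
print(programs_for("Protein", "Translated nucleotide"))
```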
![Page 7: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/7.jpg)
![Page 8: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/8.jpg)
Defaults
Database
Program
Confirm
![Page 9: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/9.jpg)
Nucleotide BLAST = DNA nucleotide query vs nucleotide database
![Page 10: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/10.jpg)
Choices for programs
Megablast
  Highly similar sequences >95%
  Word length 28
Discontiguous megablast
  Pretty similar seqs
  Word length 11
Blastn
  Dissimilar seqs
  Word length 11
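The lecture uses the NCBI web form, but the same three choices exist in the command-line NCBI BLAST+ `blastn` program through its `-task` option. The sketch below assumes a local BLAST+ install and only builds the command without running it; the file name `query.fasta` and database name `my_genome` are hypothetical placeholders.

```python
# Word lengths from the slide, keyed by the blastn "-task" names.
WORD_SIZE = {
    "megablast": 28,     # highly similar sequences (>95%)
    "dc-megablast": 11,  # discontiguous megablast, pretty similar sequences
    "blastn": 11,        # dissimilar sequences
}


def blastn_command(task, query_fasta, database):
    """Build (but do not run) a blastn command line for one of the three tasks."""
    if task not in WORD_SIZE:
        raise ValueError(f"unknown nucleotide task: {task}")
    return [
        "blastn",
        "-task", task,
        "-word_size", str(WORD_SIZE[task]),
        "-query", query_fasta,  # hypothetical input file
        "-db", database,        # hypothetical local database name
    ]


print(" ".join(blastn_command("megablast", "query.fasta", "my_genome")))
```

The returned list can be handed to `subprocess.run` once BLAST+ and a formatted database are actually available.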
![Page 11: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/11.jpg)
Translated blast = protein query vs translated database
![Page 12: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/12.jpg)
BLAST a genome
Request ID AWJ4D4B7012
![Page 13: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/13.jpg)
![Page 14: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/14.jpg)
![Page 15: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/15.jpg)
![Page 16: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/16.jpg)
![Page 17: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/17.jpg)
BLASTing is fun
This is meant to be enjoyable
Be a genome explorer
  Find out what kind of data is out there
  Find out what kind of data isn’t there
QUESTIONS?????
![Page 18: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/18.jpg)
Q1.
There is so much data in Genbank. How do you find GOOD data?
Example
  Bovine rhodopsin - 1st G protein coupled receptor to be sequenced
  Search Genbank with text
  49 entries
![Page 19: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/19.jpg)
Bovine opsin
![Page 20: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/20.jpg)
Bovine rhodopsin
![Page 21: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/21.jpg)
Searching for genes
Searching by text is fraught with peril
  Genbank has too many links
  Pull up many things that are not what you want
BLAST is a better approach
NCBI has also made records which combine all similar sequences into one
![Page 22: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/22.jpg)
![Page 23: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/23.jpg)
![Page 24: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/24.jpg)
NCBI has done some of the work
They have hand-curated data for some species to make a set of reference sequences
  Nucleotide sequences - NM_xxxxxx
  Protein sequences - NP_xxxxxx
For human rhodopsin
  NM_000539
  NP_000530
These are the gold standard for sequences
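NCBI reference sequence accessions are written with an underscore after the two-letter prefix (NM_ for nucleotide, NP_ for protein), optionally followed by a version such as `.4`. A minimal sketch for telling the two kinds apart is below. The function name `refseq_kind` is made up for illustration, and only the two prefixes from this slide are handled; other RefSeq prefixes exist (for example XM_ and XP_ for predicted models) and are deliberately left out.

```python
import re

# Two-letter prefix, underscore, digits, optional ".version".
REFSEQ_PATTERN = re.compile(r"^(NM|NP)_(\d+)(?:\.(\d+))?$")

KIND_BY_PREFIX = {"NM": "nucleotide", "NP": "protein"}


def refseq_kind(accession):
    """Return 'nucleotide' or 'protein' for NM_/NP_ accessions, else None."""
    match = REFSEQ_PATTERN.match(accession.strip())
    if match is None:
        return None
    return KIND_BY_PREFIX[match.group(1)]


# The human rhodopsin records named on the slide:
for acc in ["NM_000539", "NP_000530"]:
    print(acc, "->", refseq_kind(acc))
```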
![Page 25: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/25.jpg)
Homologene
![Page 26: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/26.jpg)
Homologs
Genes in two organisms that descend from a single gene in their common ancestor
Implies genes perform same function in two organisms
Therefore they can be compared to learn about evolution
![Page 27: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/27.jpg)
Human
Chimp
Macaque
Bushbaby
These 4 primates have many genes which are homologs and have been passed down from the primate ancestor
![Page 28: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/28.jpg)
Homologene search for rhodopsin
![Page 29: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/29.jpg)
Homologene
![Page 30: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/30.jpg)
Three primary sequence portals: 1. NCBI
![Page 31: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/31.jpg)
3. DNA Data Bank of Japan (DDBJ)
![Page 32: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/32.jpg)
2. Ensembl - European Bioinformatics Institute (EBI)
![Page 33: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/33.jpg)
Select just genes
![Page 34: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/34.jpg)
Scroll down to find the gene you want
![Page 35: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/35.jpg)
Location
Orthologues are predicted and linked
Links to transcript and protein
![Page 36: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/36.jpg)
![Page 37: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/37.jpg)
OMIM - Online Mendelian Inheritance in Man
![Page 38: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/38.jpg)
Good places to find genes
Model organisms: NCBI homologene
Genes from models and other organisms: Sanger Ensembl gene families
  NOTE: These are often predicted from genome sequences
  If there is a sequence in NCBI homologene, it may be different (and more accurate) than Sanger predictions
OMIM is a good reference
![Page 39: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/39.jpg)
Q2. How do genes change through time?
Change in actual sequence
  Mutation
  Recombination
Change in frequency of a sequence
  Selection - “survive” better
  Drift - get passed on by chance
  Migration - move between populations
![Page 40: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/40.jpg)
Mutation vs selection
Mutation = sequence change
ATGCCGTGACGT -> ATGCCTTGACGT
Selection/drift/migration = sequence frequency changes across a number of individuals

ATGTG ATGTG ATGTG ATGTG ATGTG ATGTG
ATGTG ATGTG ATGTG ATGTG ATGTG ATGTT

ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT
ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT
ATGTT ATGTG ATGTG ATGTT ATGTT ATGTT
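Both ideas on this slide can be checked with a few lines of Python: where a mutation changed a sequence, and how common each sequence is among a set of individuals. This is a sketch using the sequences shown on the slide. The function names are made up for illustration, and reading the two groups of individuals (12, then 18) as an earlier and a later sample is an assumption.

```python
from collections import Counter
from fractions import Fraction


def point_differences(before, after):
    """List (position, old base, new base) for every site that differs (0-based)."""
    if len(before) != len(after):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(before, after)) if x != y]


def sequence_frequencies(individuals):
    """Fraction of individuals carrying each distinct sequence."""
    counts = Counter(individuals)
    total = len(individuals)
    return {seq: Fraction(n, total) for seq, n in counts.items()}


# Mutation: one site changes within a single sequence.
print(point_differences("ATGCCGTGACGT", "ATGCCTTGACGT"))

# Frequency change: same two sequences, different proportions.
first_group = ["ATGTG"] * 11 + ["ATGTT"] * 1    # 12 individuals on the slide
second_group = ["ATGTG"] * 8 + ["ATGTT"] * 10   # 18 individuals on the slide
print(sequence_frequencies(first_group)["ATGTT"])
print(sequence_frequencies(second_group)["ATGTT"])
```

The mutation creates ATGTT once; selection, drift, or migration then determine whether its share of the population rises or falls.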
![Page 41: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/41.jpg)
Evolution as tinkerer
Changes are typically small
Mutation is source of new sequence
  Not all mutations are created equal
  Some occur more often than others
Other forces shift frequency of particular sequence
![Page 42: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/42.jpg)
Triplet amino acid code

| 2nd base T | 2nd base C | 2nd base A | 2nd base G |
|---|---|---|---|
| F, phe TTT | S, ser TCT | Y, tyr TAT | C, cys TGT |
| F, phe TTC | S, ser TCC | Y, tyr TAC | C, cys TGC |
| L, leu TTA | S, ser TCA | O, stop TAA | J, stop TGA |
| L, leu TTG | S, ser TCG | B, stop TAG | W, trp TGG |
| L, leu CTT | P, pro CCT | H, his CAT | R, arg CGT |
| L, leu CTC | P, pro CCC | H, his CAC | R, arg CGC |
| L, leu CTA | P, pro CCA | Q, gln CAA | R, arg CGA |
| L, leu CTG | P, pro CCG | Q, gln CAG | R, arg CGG |
| I, ile ATT | T, thr ACT | N, asn AAT | S, ser AGT |
| I, ile ATC | T, thr ACC | N, asn AAC | S, ser AGC |
| I, ile ATA | T, thr ACA | K, lys AAA | R, arg AGA |
| M, met ATG | T, thr ACG | K, lys AAG | R, arg AGG |
| V, val GTT | A, ala GCT | D, asp GAT | G, gly GGT |
| V, val GTC | A, ala GCC | D, asp GAC | G, gly GGC |
| V, val GTA | A, ala GCA | E, glu GAA | G, gly GGA |
| V, val GTG | A, ala GCG | E, glu GAG | G, gly GGG |
![Page 43: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/43.jpg)
Mutation causes nucleotide change
What about AA sequence?
Synonymous change
  Syn = same
  AA stays same
Nonsynonymous change
  Not same
  AA changes
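Whether a codon change is synonymous can be read off the triplet code table. The sketch below builds the standard genetic code in Python and classifies a change. It marks stop codons with `*` rather than the letters O, J and B used on the slides, and the name `classify_change` is made up for illustration.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_change(codon_before, codon_after):
    """Return (kind, AA before, AA after) for a change from one codon to another."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    kind = "synonymous" if aa_before == aa_after else "nonsynonymous"
    return kind, aa_before, aa_after


# Second codon of the mutation example ATGCCGTGACGT -> ATGCCTTGACGT:
print(classify_change("CCG", "CCT"))  # both code for proline
# A change in the third position that alters the amino acid:
print(classify_change("TTT", "TTA"))  # phenylalanine -> leucine
```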
![Page 44: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/44.jpg)
Amino acid code

| 2nd base T | 2nd base C | 2nd base A | 2nd base G |
|---|---|---|---|
| F, phe TTT | S, ser TCT | Y, tyr TAT | C, cys TGT |
| F, phe TTC | S, ser TCC | Y, tyr TAC | C, cys TGC |
| L, leu TTA | S, ser TCA | O, stop TAA | J, stop TGA |
| L, leu TTG | S, ser TCG | B, stop TAG | W, trp TGG |
| L, leu CTT | P, pro CCT | H, his CAT | R, arg CGT |
| L, leu CTC | P, pro CCC | H, his CAC | R, arg CGC |
| L, leu CTA | P, pro CCA | Q, gln CAA | R, arg CGA |
| L, leu CTG | P, pro CCG | Q, gln CAG | R, arg CGG |
| I, ile ATT | T, thr ACT | N, asn AAT | S, ser AGT |
| I, ile ATC | T, thr ACC | N, asn AAC | S, ser AGC |
| I, ile ATA | T, thr ACA | K, lys AAA | R, arg AGA |
| M, met ATG | T, thr ACG | K, lys AAG | R, arg AGG |
| V, val GTT | A, ala GCT | D, asp GAT | G, gly GGT |
| V, val GTC | A, ala GCC | D, asp GAC | G, gly GGC |
| V, val GTA | A, ala GCA | E, glu GAA | G, gly GGA |
| V, val GTG | A, ala GCG | E, glu GAG | G, gly GGG |
![Page 45: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/45.jpg)
Amino acid (AA) types
Non-polar: A, F, G, I, L, M, P, V, W
Polar: N, Q, S, T, Y
Charged, +: H, K, R
Charged, -: D, E
Other: C
Often changing AA within a group does not affect protein function
![Page 46: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/46.jpg)
Selection
Stabilizing selection - Acts to keep protein function the same
  Synonymous change more frequent than nonsynonymous
Amino acid changes within a group are much more common than changes between groups
  Non-polar -> non-polar
  Polar -> polar
![Page 47: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/47.jpg)
Similarity matrix
A = alanine
C = cysteine
D = aspartic acid
E = glutamic acid
F = phenylalanine
G = glycine
H = histidine
![Page 48: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/48.jpg)
Comparing sequences
Can do at either nucleotide or AA level
Gather sequences from a bunch of different organisms
Need to align them so that sites which perform the same function can be compared
![Page 49: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/49.jpg)
Aligning sequences
Sequences may differ in length
  Often have differences at amino- or carboxy- terminus of the protein
Need a way to align parts of protein that are performing the same function
![Page 50: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/50.jpg)
Example - RH2 opsin in fishes
Goldfish  MNGTEGNNFYVPLSNR
Medaka    MENGTEGKNFYIPMNNR
Zebrafish MNGTEGSNFYIPMSNR
Killifish MGYGPNGTEGNNFYIPMSNK
Trout     MQNGTEGSNFYIPMSNR
Halibut   MVWDGGIEPNGTEGKNFYIPMSNR
Cod       MRMEANGTEGKNFYIPMSNR
Tetraodon MVWDGGIEPNGTEGKNFYIPMSNR
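To make the idea of aligning concrete, here is a minimal sketch of one classic pairwise method, Needleman-Wunsch global alignment, applied to the zebrafish and halibut fragments above. The lecture does not say which alignment program it uses, so this stands in as an illustration only. The scores (match +1, mismatch -1, gap -1) are arbitrary assumptions; real tools score with similarity matrices and align many sequences at once.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment; returns (aligned a, aligned b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


zebrafish = "MNGTEGSNFYIPMSNR"
halibut = "MVWDGGIEPNGTEGKNFYIPMSNR"
aligned_z, aligned_h, total = needleman_wunsch(zebrafish, halibut)
print(aligned_z)
print(aligned_h)
```

The gaps (`-`) absorb the length difference at the amino terminus so that the shared NGTEG...NFYIPMSNR region lines up column by column.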
![Page 51: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/51.jpg)
Align sequences
Zebrafish M--------NGTEGSNFYIPMSNR
Trout     M------Q-NGTEGSNFYIPMSNR
Medaka    M------E-NGTEGKNFYIPMNNR
Cod       M----RMEANGTEGKNFYIPMSNR
Halibut   MVWDGGIEPNGTEGKNFYIPMSNR
Tetraodon MVWDGGIEPNGTEGKNFYIPMSNR
Goldfish  M--------NGTEGNNFYVPLSNR
Killifish M---GYG-PNGTEGNNFYIPMSNK
          *        *****.***:*:.*:
* identical   : conserved   . semi-conserved
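The `*` marks under the alignment can be recomputed directly: a column gets a `*` when every sequence has the same residue there and no sequence has a gap. The sketch below reproduces only the `*` marks. The `:` and `.` marks depend on a table of similar amino acids that the slide does not give, so they are left out. The name `identity_marks` is made up for illustration.

```python
ALIGNMENT = {
    "Zebrafish": "M--------NGTEGSNFYIPMSNR",
    "Trout":     "M------Q-NGTEGSNFYIPMSNR",
    "Medaka":    "M------E-NGTEGKNFYIPMNNR",
    "Cod":       "M----RMEANGTEGKNFYIPMSNR",
    "Halibut":   "MVWDGGIEPNGTEGKNFYIPMSNR",
    "Tetraodon": "MVWDGGIEPNGTEGKNFYIPMSNR",
    "Goldfish":  "M--------NGTEGNNFYVPLSNR",
    "Killifish": "M---GYG-PNGTEGNNFYIPMSNK",
}


def identity_marks(aligned_sequences):
    """Return '*' for fully identical, gap-free columns and ' ' otherwise."""
    sequences = list(aligned_sequences)
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*sequences):
        residues = set(column)
        identical = len(residues) == 1 and "-" not in residues
        marks.append("*" if identical else " ")
    return "".join(marks)


marks = identity_marks(ALIGNMENT.values())
for name, seq in ALIGNMENT.items():
    print(f"{name:<10}{seq}")
print(" " * 10 + marks)
print(marks.count("*"), "of", len(marks), "columns are identical")
```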
![Page 52: Lecture #4 : Comparing genes](https://reader035.fdocuments.net/reader035/viewer/2022081515/56813e60550346895da86776/html5/thumbnails/52.jpg)
Amino acid (AA) types
Non-polar: A, F, G, I, L, M, P, V, W
Polar: N, Q, S, T, Y
Charged, +: H, K, R
Charged, -: D, E
Other: C
Often changing AA within a group does not affect protein function
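These groups can be applied to the RH2 alignment above to see whether the variable columns mostly stay within one group. The sketch below uses the five groups exactly as listed on this slide. The four category labels and the function name `classify_column` are made up for illustration, and columns containing gaps are simply set aside.

```python
from collections import Counter

AA_GROUPS = {
    "non-polar": "AFGILMPVW",
    "polar": "NQSTY",
    "charged +": "HKR",
    "charged -": "DE",
    "other": "C",
}
GROUP_OF = {aa: name for name, members in AA_GROUPS.items() for aa in members}

ALIGNED = [
    "M--------NGTEGSNFYIPMSNR",  # Zebrafish
    "M------Q-NGTEGSNFYIPMSNR",  # Trout
    "M------E-NGTEGKNFYIPMNNR",  # Medaka
    "M----RMEANGTEGKNFYIPMSNR",  # Cod
    "MVWDGGIEPNGTEGKNFYIPMSNR",  # Halibut
    "MVWDGGIEPNGTEGKNFYIPMSNR",  # Tetraodon
    "M--------NGTEGNNFYVPLSNR",  # Goldfish
    "M---GYG-PNGTEGNNFYIPMSNK",  # Killifish
]


def classify_column(column):
    """Label one alignment column by how its residues relate to the AA groups."""
    residues = set(column)
    if "-" in residues:
        return "gapped"
    if len(residues) == 1:
        return "identical"
    groups = {GROUP_OF[aa] for aa in residues}
    return "within group" if len(groups) == 1 else "between groups"


summary = Counter(classify_column(col) for col in zip(*ALIGNED))
for label, count in summary.items():
    print(label, count)
```

In this fragment most of the gap-free variable columns (I/V, M/L, S/N, R/K) stay within a single group, which fits the slide's point that within-group changes are the common ones.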