Introduction to Bioinformatics 236523/234525
![Page 1: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/1.jpg)
Introduction to Bioinformatics 236523/234525
Lecturer: Prof. Yael Mandel-Gutfreund
Teaching Assistants: Shula Shazman, Idit Kosti
Course web site: http://webcourse.cs.technion.ac.il/236523
![Page 2: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/2.jpg)
What is Bioinformatics?
![Page 3: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/3.jpg)
Course Objectives
• To introduce the bioinformatics discipline
• To familiarize students with the major biological questions that can be addressed by bioinformatics tools
• To introduce the major tools used for sequence and structure analysis and explain in general how they work (including their limitations)
![Page 4: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/4.jpg)
Course Structure and Requirements
1. Class structure
   1. 2 hours lecture
   2. 1 hour tutorial
2. Homework
   • Homework assignments will be given every second week
   • The homework will be done in pairs
   • All 5 of the 5 homework assignments must be submitted
3. A final project will be conducted and submitted in pairs
![Page 5: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/5.jpg)
Grading
• 20% homework assignments
• 80% final project
![Page 6: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/6.jpg)
Literature list
• Gibas, C., Jambeck, P. Developing Bioinformatics Computer Skills. O'Reilly, 2001.
• Lesk, A. M. Introduction to Bioinformatics. Oxford University Press, 2002.
• Mount, D. W. Bioinformatics: Sequence and Genome Analysis. 2nd ed., Cold Spring Harbor Laboratory Press, 2004.
Advanced Reading
• Jones, N. C., Pevzner, P. A. An Introduction to Bioinformatics Algorithms. MIT Press, 2004.
![Page 7: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/7.jpg)
What is Bioinformatics?
![Page 8: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/8.jpg)
What is Bioinformatics?
“The field of science in which biology, computer science, and information technology merge to form a single discipline”
Ultimate goal: to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned.
![Page 9: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/9.jpg)
Central Paradigm in Molecular Biology
Gene (DNA) → mRNA → Protein
21st century:
Genome → Transcriptome → Proteome
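The flow from gene to mRNA to protein can be sketched in a few lines of Python. This is only a minimal illustration: it assumes the standard genetic code, takes the coding strand as input, and ignores splicing and regulation. The toy sequence and the function names `transcribe` and `translate` are made up for this example.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA that matches the coding DNA strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGTGCATCTGACTCCTGAGGAGAAGTAA"  # toy coding sequence (assumption)
mrna = transcribe(gene)
print(mrna)
print(translate(mrna))
```

Doing the same thing for every gene in a genome is what takes us from a single gene, mRNA, and protein to the genome, transcriptome, and proteome.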
![Page 10: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/10.jpg)
From DNA to Genome
Timeline 1955 to 1985 (figure), milestones shown:
• Watson and Crick DNA model
• First protein sequence
• First protein structure
![Page 11: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/11.jpg)
Timeline 1990 to 2000 (figure), milestones shown:
• First genome: Haemophilus influenzae
• Yeast genome
• First human genome draft (2000)
![Page 12: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/12.jpg)
Complete Genomes

| | 2010 | 2005 |
|---|---|---|
| Total | 1379 | 294 |
| Eukaryotes | 133 | 39 |
| Bacteria | 1152 | 235 |
| Archaea | 94 | 23 |
![Page 13: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/13.jpg)
1,000 Genomes Project: Expanding the Map of Human Genetics
Researchers hope the effort will speed up the discovery of the genetic roots of many diseases
![Page 14: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/14.jpg)
25,000 genomes… What’s next? The “post-genomics” era
• Annotation
• Comparative genomics
• Structural genomics
• Functional genomics
Main goal: to understand the living cell
![Page 15: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/15.jpg)
From… 25,000 genomes
To… understanding living cells
![Page 16: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/16.jpg)
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATGCGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAACTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTCAGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGAAGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAATAT GGA CAA TTG GTT TCT TCT CTG AAT .................... TGAAAAACGTA
Annotation
![Page 17: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/17.jpg)
Annotation
Identify the genes within a given sequence of DNA
Identify the sites that regulate the gene
Predict the function
![Page 18: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/18.jpg)
How do we identify a gene in a genome?
A gene is characterized by several features (promoter, ORF, …); some are easier and some are harder to detect.
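One of the easier features to detect is an open reading frame. The sketch below is a deliberately naive scan, not a real gene finder: it assumes an ORF runs from an ATG to the first in-frame stop codon, looks only at the three forward frames, and skips ORFs that never reach a stop. The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Yield (start, end, frame) for ORFs on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop (including ATG).
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG in this frame opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3, frame)
                start = None  # look for the next ATG after this stop


toy = "CCATGGCTTTGGTCTAAGG"  # hypothetical toy sequence
for start, end, frame in find_orfs(toy):
    print(f"frame {frame}: {toy[start:end]} at [{start}, {end})")
```

Practical gene finders also scan the reverse complement and combine the ORF signal with other evidence such as promoters and ribosome binding sites.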
![Page 19: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/19.jpg)
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATGCGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAACTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTCAGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGAAGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAATAT GGA CAA TTG GTT TCT TCT CTG AAT .................................
.............. TGAAAAACGTA
Features marked on the sequence: TF binding site, promoter, transcription start site, ribosome binding site
ORF = Open Reading Frame
CDS = Coding Sequence
![Page 20: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/20.jpg)
Using Bioinformatics approaches for Gene hunting
Relatively easy in simple organisms (e.g. bacteria)
VERY HARD for higher organisms (e.g. humans)
![Page 21: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/21.jpg)
Comparative genomics
![Page 22: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/22.jpg)
How human are chimps?
Comparison between the full drafts of the human and chimp genomes revealed that they differ by only 1.23%
Perhaps not surprising!!!
![Page 23: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/23.jpg)
So where are we different?
Human ATAGCGGGGGGATGCGGGCCCTATACCC
Chimp ATAGGGG--GGATGCGGGCCCTATACCC
Mouse ATAGCG---GGATGCGGCGC-TATACCA
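To make "where are we different" concrete, the aligned rows above can be compared column by column. The sketch below assumes the three rows are already aligned and of equal length, with `-` marking a gap. The function name `pairwise_identity` and the choice to report identity over ungapped columns only are assumptions for this illustration.

```python
def pairwise_identity(a: str, b: str):
    """Return (matches, mismatches, gap_columns) for two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            gaps += 1  # column with a gap in either row
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


alignment = {
    "Human": "ATAGCGGGGGGATGCGGGCCCTATACCC",
    "Chimp": "ATAGGGG--GGATGCGGGCCCTATACCC",
    "Mouse": "ATAGCG---GGATGCGGCGC-TATACCA",
}

for name in ("Chimp", "Mouse"):
    m, mm, g = pairwise_identity(alignment["Human"], alignment[name])
    identity = 100 * m / (m + mm)
    print(f"Human vs {name}: {m} matches, {mm} mismatches, {g} gap columns, "
          f"{identity:.1f}% identity over ungapped columns")
```

On this toy alignment the chimp row differs from the human row in fewer columns than the mouse row does, which matches the expectation for more closely related species.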
![Page 24: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/24.jpg)
And where are we similar?
VERY SIMILAR: conserved between many organisms
VERY DIFFERENT
![Page 25: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/25.jpg)
Functional genomics
![Page 26: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/26.jpg)
TO BE IS NOT ENOUGH
At any time point a gene can be functional or not
![Page 27: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/27.jpg)
From the gene expression pattern we can learn:
• What does the gene do?
• When is it needed?
• What other genes or proteins interact with it?
• …
What's wrong??
![Page 28: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/28.jpg)
Structural genomics
![Page 29: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/29.jpg)
The three-dimensional structure of a protein can tell much more than the sequence alone:
• Protein-ligand complexes
• Functional sites
• Fold
• Evolutionary relationship
• Shape and electrostatics
• Active sites
• Protein complexes
• Biological processes
![Page 30: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/30.jpg)
Resources and Databases
The different types of data are collected in databases:
– Sequence databases
– Structural databases
– Databases of experimental results
All databases are connected
![Page 31: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/31.jpg)
Sequence databases
• Gene database
• Genome database
• Disease-related mutation database
• …
![Page 32: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/32.jpg)
Genome Browsers
Easy “walk” through the genome
![Page 33: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/33.jpg)
Genome Browsers
• UCSC Genome Browser: http://genome.ucsc.edu/
• Ensembl Genome Browser: http://www.ensembl.org
• WormBase: http://www.wormbase.org/
• AceDB: http://www.acedb.org/
• Comprehensive Microbial Resource: http://www.tigr.org/tigr-scripts/CMR2/CMRHomePage.spl
• FlyBase: http://flybase.bio.indiana.edu/
![Page 34: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/34.jpg)
Mutation database
• A single-base difference at a single position between two individuals of the same species (a single-nucleotide polymorphism, SNP)
• Such differences play an important role in differentiation and disease
![Page 35: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/35.jpg)
Sickle Cell Anemia
• Caused by a single-base substitution (A to T) that replaces glutamic acid with valine in the beta chain of hemoglobin
Image source: http://www.cc.nih.gov/ccc/ccnews/nov99/
![Page 36: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/36.jpg)
Healthy Individual
>gi|28302128|ref|NM_000518.4| Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi|4504349|ref|NP_000509.1| beta globin [Homo sapiens]
MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH
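Records like the ones above are in FASTA format: a header line starting with `>` followed by one or more sequence lines. The sketch below is a minimal way to read such text in Python. The function name `parse_fasta` is hypothetical, and reading the accession as the fourth pipe-separated field is an assumption based only on the headers shown here.

```python
def parse_fasta(text: str):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []  # start a new record
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


fasta_text = """>gi|4504349|ref|NP_000509.1| beta globin [Homo sapiens]
MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH
"""

for header, sequence in parse_fasta(fasta_text):
    accession = header.split("|")[3]  # assumed position of the accession
    print(accession, len(sequence))
```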
![Page 37: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/37.jpg)
Diseased Individual
>gi|28302128|ref|NM_000518.4| Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGT
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi|4504349|ref|NP_000509.1| beta globin [Homo sapiens]
MVHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH
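The difference between the healthy and diseased beta globin sequences is easy to miss by eye. A position-by-position comparison finds it directly. The sketch below uses the first line of each protein record shown above. The function name `point_differences` is hypothetical, and the approach assumes the two sequences have equal length (substitutions only, no insertions or deletions).

```python
def point_differences(reference: str, variant: str):
    """Return (1-based position, reference residue, variant residue) per mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


healthy = "MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG"
diseased = "MVHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG"

for position, ref, var in point_differences(healthy, diseased):
    print(f"protein position {position}: {ref} -> {var}")

# The same function works at the codon level: GAG (glutamic acid) vs GTG (valine).
print(point_differences("GAG", "GTG"))
```

The position is counted from the initial methionine. The change is often quoted one position lower because the mature protein lacks that methionine.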
![Page 38: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/38.jpg)
Structure Databases
• 3-dimensional structures of proteins, nucleic acids, molecular complexes, etc.
• 3D data is available thanks to techniques such as NMR and X-ray crystallography
![Page 39: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/39.jpg)
![Page 40: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/40.jpg)
Databases of Experimental Results
• Data such as experimental microarray images: gene expression data
• Proteomic data: protein expression data
• Metabolic pathways, protein-protein interaction data, regulatory networks
• Etc.
![Page 41: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/41.jpg)
Literature Databases
PubMed
Service of the National Library of Medicine
http://www.ncbi.nlm.nih.gov/pubmed/
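Besides the web page, PubMed can be queried programmatically through the NCBI E-utilities. The sketch below only builds an ESearch request URL and does not contact the server. The function name `pubmed_search_url`, the search term, and the default of 20 results are made up for this example.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Build an ESearch URL that would return PubMed IDs matching the term."""
    params = {
        "db": "pubmed",         # database to search
        "term": term,           # free-text query
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(pubmed_search_url("sickle cell anemia"))
```

Fetching the URL, for example with `urllib.request.urlopen`, would return the matching PubMed identifiers. NCBI asks automated clients to limit their request rate.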
![Page 42: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/42.jpg)
Putting it all Together
• Each database contains specific information
• Like biological systems themselves, these databases are interrelated
![Page 43: Introduction to Bioinformatics 236523/234525](https://reader036.fdocuments.net/reader036/viewer/2022070502/56814b50550346895db84a8c/html5/thumbnails/43.jpg)
• GENOMIC DATA: GenBank, DDBJ, EMBL
• ASSEMBLED GENOMES: GoldenPath, WormBase, TIGR
• PROTEIN: PIR, SWISS-PROT
• STRUCTURE: PDB, MMDB, SCOP
• LITERATURE: PubMed
• PATHWAY: KEGG, COG
• DISEASE: LocusLink, OMIM, OMIA
• GENES: RefSeq, AllGenes, GDB
• SNPs: dbSNP
• ESTs: dbEST, UniGene
• MOTIFS: BLOCKS, Pfam, Prosite
• GENE EXPRESSION: Stanford MGDB, NetAffx, ArrayExpress