Biology/Computer Science 251: Introduction to Bioinformatics
![Page 1: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/1.jpg)
Biology/Computer Science 251: Introduction to Bioinformatics
What is bioinformatics?
Definition: “Bioinformatics is nothing but good, sound, regular biology appropriately dressed so that it can fit into a computer” -- Claverie & Notredame, Bioinformatics for Dummies
![Page 2: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/2.jpg)
Class Web Site: http://cs.gettysburg.edu/~leinbach/Bio_CS251/
• This site will contain all important documents related to the class:
  - PowerPoint slides of all lectures
  - Labs
  - Exam Answer Keys (posted after the exam)
  - Homework Assignments and Answer Keys
• Note the updated syllabus contained at this site. It supersedes the one distributed on paper.
![Page 3: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/3.jpg)
![Page 4: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/4.jpg)
Where to start: A brief history of Bioinformatics
Gregor Mendel: 1866 - first described a set of mathematical rules by which the appearance of an organism (its PHENOTYPE) could be related to its inherited genetic makeup (GENOTYPE)
7 pea traits, or characters, studied by Mendel
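As a small illustration of such a rule, the sketch below enumerates a one-gene cross between two heterozygous parents. The allele symbols "A" (dominant) and "a" (recessive) and the function names are assumptions made for this example, not part of the lecture.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Enumerate the equally likely offspring genotypes of a one-gene cross.

    Each parent is a two-letter genotype such as "Aa"; each offspring
    receives one allele from each parent.
    """
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))

def phenotype(genotype, dominant="A"):
    """A genotype shows the dominant phenotype if it carries a dominant allele."""
    return "dominant" if dominant in genotype else "recessive"

# Hypothetical monohybrid cross: both parents are heterozygous.
genotypes = cross("Aa", "Aa")

# Collapse genotypes to phenotypes, keeping the counts.
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))   # genotype counts out of 4 equally likely outcomes
print(dict(phenotypes))  # phenotype counts out of 4 equally likely outcomes
```

Three of the four equally likely outcomes carry at least one dominant allele, which is the 3:1 phenotype ratio associated with Mendel's crosses.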
![Page 5: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/5.jpg)
Two Scientists Who In ~900 Words Reshaped the Way In Which We View Life on Earth
![Page 6: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/6.jpg)
![Page 7: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/7.jpg)
Red blood cells undergo sickling due to a single base change in the DNA of the beta-globin gene. This base change in DNA changes one residue in the protein.
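A minimal sketch of how one base change alters one residue. The short DNA fragment is meant to resemble the start of the beta-globin coding sequence and should be treated as illustrative; the codon table covers only the codons used here, and the function names are assumptions for this example.

```python
# Partial codon table: only the codons that occur in this illustration.
CODON_TABLE = {
    "ATG": "M", "GTG": "V", "CAT": "H", "CTG": "L",
    "ACT": "T", "CCT": "P", "GAG": "E", "AAG": "K",
}

def translate(dna):
    """Translate a coding sequence codon by codon into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def point_mutation(dna, position, new_base):
    """Return a copy of dna with the base at the 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]

# Illustrative fragment (assumed): codons ATG GTG CAT CTG ACT CCT GAG GAG AAG
normal = "ATGGTGCATCTGACTCCTGAGGAGAAG"

# Change the middle base of the seventh codon: GAG (glutamate) -> GTG (valine).
sickle = point_mutation(normal, 19, "T")

print(translate(normal))
print(translate(sickle))
```

The two translations differ at exactly one position, where glutamate (E) is replaced by valine (V).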
![Page 8: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/8.jpg)
Bioinformatics allows the study of gene origins and the evolution of new genes
SCIENCE, 12-22-06
![Page 9: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/9.jpg)
We are now moving into the post-genomic era, as entire genomes are sequenced and made available with individual genome databases
![Page 10: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/10.jpg)
Bioinformatics was born out of databases, constructed to collect protein and DNA sequences
1. Proteins were first: 1960s through 1970s. Margaret Dayhoff, National Biomedical Research Foundation (NBRF), established PIR, the Protein Information Resource, which grew into the PIR-International Protein Sequence Database:
http://www-nbrf.georgetown.edu/pir
2. DNA came second:
1977 - reliable DNA sequencing developed
1980 - European Molecular Biology Laboratory (EMBL) Data Library established
1982 - GenBank established
1984 - DNA Databank of Japan (DDBJ)
Today, GenBank is maintained by the National Center for Biotechnology Information (NCBI). GenBank, EMBL, and DDBJ formed the International Nucleotide Sequence Database Collaboration; data are exchanged on a daily basis.
GenBank: http://www.ncbi.nlm.nih.gov
How fast is the GenBank database growing?
http://www.ncbi.nlm.nih.gov/GenBank/genebankstats.html
![Page 11: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/11.jpg)
The Net (no pun intended) Result is an Astounding Growth of Biological Information
Source: Scrabanek op. cit.
![Page 12: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/12.jpg)
Whole-genome structure/organization can be compared between species
This is a human karyogram, or idiogram
This is a human-mouse synteny map
![Page 13: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/13.jpg)
Bioinformatic and genomic approaches have greatly expanded the study of evolution (molecular evolution)
![Page 14: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/14.jpg)
SCIENCE, 12-22-06
![Page 15: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/15.jpg)
Bioinformatic and genomic approaches allow discovery of new and unsuspected species of organisms that cannot be detected using conventional approaches
Archaeal Richmond Mine Acidophilic Nanoorganism (ARMAN)
![Page 16: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/16.jpg)
Bioinformatics can be applied to study interactions among proteins within the cell, i.e., proteomics
For example, take a look at the Saccharomyces Genome Database (SGD):
http://www.yeastgenome.org/
![Page 17: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/17.jpg)
Protein Structure Data is Growing at a Slower Pace
![Page 18: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/18.jpg)
Protein-protein interaction map for budding yeast
Jeong et al. 2001. Nature 411: 41-2
The color of a node indicates the effect of deleting the corresponding protein:
Red: lethal
Green: non-lethal
Orange: slow growth
Yellow: unknown
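One way to hold such a map in a program is an adjacency structure plus a phenotype label per node. The sketch below uses the color legend above; the protein names (P1 to P5), their interactions, and their deletion phenotypes are invented for illustration and are not data from Jeong et al.

```python
from collections import defaultdict

# Color legend taken from the slide.
COLOR = {
    "lethal": "red",
    "non-lethal": "green",
    "slow growth": "orange",
    "unknown": "yellow",
}

# Hypothetical interaction pairs and deletion phenotypes (not real data).
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
deletion_phenotype = {
    "P1": "lethal",
    "P2": "non-lethal",
    "P3": "slow growth",
    "P4": "non-lethal",
    "P5": "unknown",
}

def build_network(edges):
    """Build an undirected network as a mapping: protein -> set of partners."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return network

network = build_network(interactions)

# Degree = number of interaction partners of each protein.
degree = {protein: len(partners) for protein, partners in network.items()}

# Node color follows the deletion phenotype.
node_color = {protein: COLOR[deletion_phenotype[protein]] for protein in network}

for protein in sorted(network):
    print(protein, degree[protein], node_color[protein])
```

With the degree and the deletion phenotype stored side by side, one can ask whether highly connected proteins tend to be the ones whose deletion is lethal.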
![Page 19: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/19.jpg)
Novel insights from the layering of 3-D protein structure with Protein Interaction Networks
![Page 20: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/20.jpg)
DNA chips (DNA microarrays) can be used to unravel the genetic basis of cancer
[Figure panels: Cancer cells, Normal cells]
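A common way to compare two samples on a microarray is the log2 ratio of the two signal intensities for each gene. The sketch below assumes invented gene names and intensities, and a two-fold cutoff chosen only for illustration.

```python
import math

def log2_ratio(cancer_intensity, normal_intensity):
    """Log2 of the cancer/normal intensity ratio for one spot on the array."""
    return math.log2(cancer_intensity / normal_intensity)

def classify(ratio, cutoff=1.0):
    """Label a gene by its log2 ratio; the two-fold cutoff is an assumption."""
    if ratio >= cutoff:
        return "higher in cancer"
    if ratio <= -cutoff:
        return "lower in cancer"
    return "unchanged"

# Hypothetical spot intensities: gene -> (cancer, normal).
spots = {
    "gene_A": (800.0, 200.0),
    "gene_B": (150.0, 600.0),
    "gene_C": (400.0, 400.0),
}

ratios = {gene: log2_ratio(cancer, normal) for gene, (cancer, normal) in spots.items()}
calls = {gene: classify(ratio) for gene, ratio in ratios.items()}

for gene in sorted(spots):
    print(gene, round(ratios[gene], 2), calls[gene])
```

On the log2 scale a four-fold increase and a four-fold decrease are symmetric (+2 and -2), which is why log ratios are usually preferred to raw ratios.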
![Page 21: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/21.jpg)
![Page 22: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/22.jpg)
One form of non-Hodgkin lymphoma, Diffuse Large B-Cell Lymphoma (DLBCL), is actually two types: the two varieties of DLBCL exhibit very different microarray RNA-expression profiles and very different survival outcomes.
![Page 23: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/23.jpg)
What is bioinformatics?
Definition I: “Bioinformatics is nothing but good, sound, regular biology appropriately dressed so that it can fit into a computer” -- Claverie & Notredame, Bioinformatics for Dummies
Definition II: “A field that involves the building and manipulation of biological databases. In the context of genomics, this means managing massive amounts of sequencing data and providing useful access to and interpretation of the data” -- Weaver, Molecular Biology, 3rd ed.
Definition III: “A field that extracts biological information from large datasets such as sequences, protein interactions, microarrays, etc. This field also includes the area of data visualization” -- Campbell & Heyer, Genomics, Proteomics, and Bioinformatics
![Page 24: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/24.jpg)
Carl’s Definition of Bioinformatics
A study of the algorithms and programs that are used by Molecular Biologists and others in the Biological and Medical Sciences in their quest for understanding protein structure and function in living organisms.
This is just one of many definitions that may be found in textbooks, scientific papers, and on the web. The simplest definition is that it is an interdisciplinary subject drawing on material from Biology, Mathematics, and Computer Science. To me this is like saying that E = mc² has something to do with relativity theory.
![Page 25: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/25.jpg)
Some Implications of this Definition
• An individual studying Bioinformatics needs to have some understanding of the basic ideas of Molecular Biology research.
• They also need to have a familiarity with DNA sequences and how they contribute to 3D Protein Structure as well as gene identification and phylogenetics.
• They need to be familiar with the many “in silico” tools that are used and the parameters that control the output of the programs or algorithmically controlled devices.
• It is important for them to understand the objectives and limitations of both Computer Science and Molecular Biology.
• They need to have some experience with collecting biological data for analysis.
![Page 26: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/26.jpg)
![Page 27: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/27.jpg)
[Diagram labels: Computer Science, Biology, Bioinformatics, Computational Biology, Micro Biology & Medical Science]
(Note the two-way arrow)
![Page 28: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/28.jpg)
Early Prehistory
[Diagram labels: Computer Science, Micro Biology, Bioinformatics]
![Page 29: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/29.jpg)
Late Prehistory
[Diagram labels: Computer Science, Micro Biology, Bioinformatics]
![Page 30: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/30.jpg)
Recent History
[Diagram labels: Computer Science, Micro Biology, Bioinformatics]
![Page 31: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/31.jpg)
As a result, DNA sequencing and Proteomics have had an increasing number of important applications in the life, medical and social sciences.
Pick up any scientific journal that deals with the life or medical sciences, any popular scientific magazine, or, for that matter, any daily newspaper, and you will find an article where DNA or related issues play an important role.
Why, it even makes the comics section:
![Page 32: Biology/Computer Science 251: Introduction to Bioinformatics](https://reader035.fdocuments.net/reader035/viewer/2022062423/568145ec550346895db2ef39/html5/thumbnails/32.jpg)
For example: A fungus, Aspergillus nidulans
http://www.broad.mit.edu/annotation/genome/aspergillus_nidulans/Home.html
What can we do with these databases?
What is the purpose of bioinformatics?
Answer: to make sense out of this: AN 1.29 nt 240000 - 270000
Preview a few tools:
A.n. genome database: “Browse Regions” - “Feature Map” - “Get DNA sequence”
GenBank: “ORF Finder” - “Blastp search” - “Conserved domain” - “Format Results”
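To give a feel for what an ORF-finding tool does with a raw DNA sequence, here is a minimal sketch that scans the three forward reading frames for stretches running from a start codon (ATG) to the first in-frame stop codon. It is not the algorithm used by NCBI's ORF Finder; the function name, the min_codons parameter, and the test sequence are assumptions for illustration, and the reverse strand is ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Find forward-strand open reading frames (ATG ... stop codon).

    Returns a list of (start, end, frame) tuples with 0-based start and
    exclusive end, so dna[start:end] includes both the ATG and the stop codon.
    min_codons counts every codon in the ORF, including start and stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the current ATG in this frame, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Hypothetical toy sequence with a single short ORF.
sequence = "CCATGAAATTTTAGGG"
for start, end, frame in find_orfs(sequence):
    print(frame, start, end, sequence[start:end])
```

A real tool would also scan the reverse complement and apply a much larger minimum length; the translated ORF is what one would then submit to a protein search such as Blastp.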