Bioinformatics Essentials
Stephanie Tatem Murphy
Posted 21-Dec-2015
ATGCATTTCGGTTTACGCCATATAGCTCGGGAATCATGCATCGATCGAGTAGCTAGCTAG
PNSADADNDFEDRLRAGLCDHDKEVQGLQVRCAVUEEHMHKKQQEFENIRLDAQRLEFFAYIFQKEHMKR
DNA, Protein, Model organisms
TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG
What is Bioinformatics?
Which genes are turned off, then on?
Courtesy of Dr. Young Moo Lee, UC Davis
Human Genome Program, U.S. Department of Energy, Genomics and Its Impact on Medicine and Society: A 2001 Primer, 2001
Genome Transcriptome Proteome
Fundamental Dogma
DNA → RNA → Proteins → Pathways → Phenotypes → Populations
GenBank, EMBL, DDBJ
Map Databases
SwissPROT, PIR
PDB
Gene Expression? Clinical Data? Regulatory Pathways? Metabolism? Biodiversity? Neuroanatomy? Development? Molecular Epidemiology? Comparative Genomics?
Although a few databases already exist to distribute molecular information, the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings.
Bob Robbins http://www.esp.org/rjr/canberra.pdf
Gene a b c d e
Art by Yelena Ponirovskaya
…ATGGCCCTGTGGATGCGCCTCCTGCCCCTG…
DNA base sequence = recipe for amino acids
Met-Ala-Leu-Trp-Met-Arg-Leu-Leu-Pro-Leu
Amino acid sequence = protein = trait
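As a minimal sketch of how the base sequence on this slide maps to its amino acids, the Python below reads the DNA three bases at a time and looks each codon up in the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative and not from the original slides, and the reading frame is assumed to start at the first base shown.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Assumes the reading frame starts at position 0. Translation stops at
    the first stop codon; trailing bases that do not fill a codon are ignored.
    Codons containing unexpected characters are reported as "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCCTGTGGATGCGCCTCCTGCCCCTG"))
```

For the slide's sequence this prints `MALWMRLLPL`, the one-letter form of Met, Ala, Leu, Trp, Met, Arg, Leu, Leu, Pro, Leu.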
The Biology Project University of Arizona
http://www.biology.arizona.edu
DNA activity: RFLP, Inheritance http://www.biology.arizona.edu/human_bio/activities/blackett/introduction.html
DNA replication fork http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/03t.html
DNA base pairing http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/08t.html
DNA translation http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/10t.html
The Genetic Code http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/12t.html http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/13t.html
DNA transcription http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/15t.html
Bioinformatics – a Definition
bio – informatics: bioinformatics is conceptualizing biology in terms of molecules and applying “informatics techniques” to understand and organise the information associated with these molecules, on a large scale. In short, bioinformatics is a management information system for molecular biology and has many practical applications.
As submitted to the Oxford English Dictionary.
What is Bioinformatics? N. M. Luscombe, et al. Yale University Method Inform Med 4/2001
The field of science in which biology, computer science, and information technology merge into a single discipline. NCBI, Aug 2001
BIOINFORMATICS
BIOLOGY
COMPUTER SCIENCE
INFORMATION TECHNOLOGY
What’s in a name?
Sequence Analysis
Database Homology Searching
Multiple Sequence Alignment
Homology Modeling
Docking
Protein Analysis
Proteomics
3D Modeling
Sample Registration & Tracking
Integrated Data Repositories
Common Visual Interfaces
Intellectual Property Auditing
Life Science Informatics
Genome Mapping
Bioinformatics Needs
Multidisciplinary teams: biologists, mathematicians, computer scientists, laboratory technicians
Users and developers to use / create:
– scalable database infrastructure
– standards to control vocabulary and annotation
– new ways of visualizing, analyzing and searching data
– new ways of delivering information, tools and results
Faster and larger computer systems
Demo Bioinformatics Company
Onconomics Corporation http://www.bscs.org/onco/default.htm
From nonprofit BSCS Biological Sciences Curriculum Study
Computer Programming | 50 yrs ago | DNA & Protein Structure
Personal Computers / Internet | 20 yrs ago | PCR
World Wide Web | Last 10 yrs | Human Genome Project
All fields use computers (art, law, communication) | Now | Biological Research
Bioinformatics Computer Skills
Growth of Bioinformatics
www.oreilly.com
Why informatics?
Large size of data sets
Allow students to ask questions of data
Integrate current research into classroom
http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
>100,000 species are represented in GenBank
all species 128,941
viruses 6,137
bacteria 31,262
archaea 2,100
eukaryota 87,147
The most sequenced organisms in GenBank
Homo sapiens 10.7 billion bases
Mus musculus 6.5 billion
Rattus norvegicus 5.6 billion
Danio rerio 1.7 billion
Zea mays 1.4 billion
Oryza sativa 0.8 billion
Drosophila melanogaster 0.7 billion
Gallus gallus 0.5 billion
Arabidopsis thaliana 0.5 billion
Updated 8-12-04, GenBank release 142.0
Table 2-2, Page 18
Online datasets for all the Life Sciences
Environment and Ecology
Population http://www.prb.org
Water http://www.waterontheweb.org/ http://www.neptune.washington.edu/
Geography http://nhd.usgs.gov/ http://data.geocomm.com/
Chemistry
Physics
Biology
Anatomy & Physiology
Earth http://www.dlese.org/educators/usingdata.html
Agriculture
Nutrition
Plant http://allometra.com/ath_fasta_mpss.shtml
Data mining requires a testable hypothesis. One can be generated about the function or structure of a gene or protein by identifying similar sequences in better-characterized organisms.
Sequence comparison also helps uncover phylogenetic relationships and evolutionary patterns.
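The idea of finding similar sequences in better-characterized organisms can be sketched in a few lines of Python. This is only a crude, alignment-free stand-in (shared k-mers scored with a Jaccard index), not the method used by real database search tools. The function names, the toy sequences, and the default `k=4` are all assumptions made for illustration.

```python
def kmers(seq, k=4):
    """Return the set of all overlapping substrings of length k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a, seq_b, k=4):
    """Similarity in [0, 1]: shared k-mers divided by all distinct k-mers."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_hits(query, database, k=4):
    """Rank database entries (name -> sequence) from most to least similar."""
    scores = [(name, jaccard(query, seq, k)) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy "database": one near-identical sequence, one unrelated.
toy_database = {
    "gene_x_mouse": "ATGGCCCTGTGGATGCGCTTCCTG",
    "unrelated": "TTTTAAAACCCCGGGGTTTTAAAA",
}
query = "ATGGCCCTGTGGATGCGCCTCCTG"

for name, score in rank_hits(query, toy_database):
    print(f"{name}\t{score:.2f}")
```

A high-scoring hit to a well-studied gene suggests a hypothesis about the query's function that can then be tested in the lab.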
www.tigr.org
Why use Bioinformatics?
What is Bioinformatics? N. M. Luscombe, et al. Yale University Method Inform Med 4/2001
Biotechnology
Did You or Will You Ever?
Ride in a car? Genetically engineered micro-organisms will someday be used to extract oil from rocks. Micro-organisms that break down oil spills are already in use.
Drink tap water? Genetically engineered micro-organisms will someday be used to attract and filter out harmful substances from drinking water.
Have a dog or cat? Vaccines for a number of pet diseases such as rabies will be improved by genetic engineering.
Wear brightly colored clothes? Many clothing dyes can be made less expensively with biotechnology, and will last longer.
Take vitamins? Vitamins can be made more potent, and at lower cost, with biotechnology.
Go to the bathroom? Micro-organisms are already an important part of sewage treatment; genetic engineering will produce bacteria that are more efficient at breaking down wastes.
What Good is Recombinant DNA?
Many people with diabetes need to take a drug called insulin. In the past, this drug was extracted and purified from ground-up animal glands. It took several pounds of cow or pig glands to produce a fraction of an ounce of insulin.
Today, the DNA with the instructions for making insulin can be spliced into a plasmid, and the insulin is produced by bacteria. It’s faster, easier, and cheaper this way.
There are still many technical problems to be solved. Not all gene splices work, and some that do may fail over time.
There are also social and environmental concerns about biotechnology. Some people fear we will upset the balance of nature if “genetically engineered” organisms escape. Others fear that recombinant DNA will be used to influence human size, race, or intelligence.
The best way for people to enjoy the benefits and avoid the problems is to stay informed and up to date about what’s happening in biotechnology.
http://www.chourave.ch/init/kid/cartoon-00.html
How Do You Make Recombinant DNA?
First, you need to isolate a specific bit of DNA with the instructions you want. To do this, you use restriction enzymes that break up DNA strands in specific places.
After you have DNA fragments, you sort them by size, using a gel. DNA is loaded onto the top of the gel, and then electricity is passed through it. This causes the DNA pieces to migrate down, and the small pieces travel further than the large pieces.
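The first two steps, cutting with a restriction enzyme and sorting fragments by size, can be sketched as below. The example assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the G. It treats the DNA as a single linear strand, and the function names and test sequence are hypothetical.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a linear DNA string at every occurrence of a recognition site.

    cut_offset is the number of bases into the site where the cut falls
    (1 for EcoRI, which cuts between G and AATTC). Only one strand is modeled.
    """
    dna = dna.upper()
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


def gel_order(fragments):
    """Order fragments as on a gel: smallest first (it travels furthest)."""
    return sorted(fragments, key=len)


sample = "AAAGAATTCTTTTTTGAATTCCC"  # made-up sequence with two EcoRI sites
for fragment in gel_order(digest(sample)):
    print(len(fragment), fragment)
```

A real gel separates double-stranded fragments and only shows approximate sizes, so this is a simplification of the bench procedure.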
Next, you need to add the DNA fragment into a host. In most research, the host is a plasmid, a ring of DNA found in some bacteria.
The host DNA has to be exposed to restriction enzymes to make split ends that will attach to the fragment. After you mix the new and host DNA fragments, you need to add enzymes that will glue them together.
If you used a plasmid as a host, you need to put it back into a bacterium. When the bacterium replicates itself, it will copy the new DNA too. A small population of “gene-spliced” bacteria can develop into a large population in just a few days.
http://www.gene.com/gene/research/biotechnology
What is an Enzyme?
Enzymes are molecules that speed up biological reactions.
Some characteristics of enzymes:
Enzymes increase the rate of a chemical reaction. For example, the enzyme carbonic anhydrase enables red blood cells to pick up and dump carbon dioxide 1 million times faster than they could without it.
Enzymes aren’t used up by the reaction. They’re not permanently changed as a result of it, so a single enzyme can act thousands of times.
Enzymes are highly specific. Like a wrench that will only fit a 5/16-inch bolt, each enzyme generally works with only a particular kind of molecule.
An enzyme increases the odds that two molecules will meet, so an enzyme is a “matchmaker”.
Why try to Design Better Enzymes?
Enzymes are fragile: they lose their shape (de-nature) if the temperature or acidity goes up even a little. They also de-nature in alcohol or oils.
This is a drag! If you’re adding an enzyme to a laundry detergent you’d like it to function in hot water, with bleach!
As we understand more and more about DNA and how it is de-coded, we can re-write the instructions for making some enzymes. By altering their shapes, we may be able to make enzymes that are sturdier and able to function under harsher conditions. We may even be able to invent some completely new enzymes!
Examples of Enzymes
Subtilisin: This enzyme is added to laundry detergent. It breaks down proteins (like yucky egg yolk stains or gross dried blood) into tiny fragments that can be rinsed away from the fibers of the cloth.
Papain: This enzyme breaks up proteins, and is extracted from the papaya fruit. It’s now added to contact lens cleaner solution to help dissolve away gross crusty things from soft contact lenses.
Ceredase: Several thousand people in the United States have Gaucher disease (low levels of a crucial enzyme that dissolves fatty deposits in the liver, spleen and bone marrow). They suffer from bone pain, fractures, swelling and bleeding. Ceredase is a variation of the enzyme, produced in the laboratory, which can be used to treat the disease.
Vianain: Originally derived from pineapples, this enzyme offers hope to burn victims. It helps prepare burned areas for skin grafts by safely dissolving damaged skin layers that would otherwise have to be removed surgically.
Journals & Books
Public Library of Science - Open Access Journals http://www.plosbiology.org
International Society for Computational Biology – Book Reviews http://www.iscb.org/bioinformaticsBooks.shtml
Free Journals:
Biotechniques http://www.BioTechniques.com
Genomeweb http://www.genomeweb.com
Books:
The Cartoon Guide to Genetics, Larry Gonick & Mark Wheelis, ISBN 0062730991, Harper 1983
Introduction to Bioinformatics, Arthur Lesk, http://www.oup.com/uk/lesk/bioinf, ISBN 0199251967, Oxford 2002
Fundamental Concepts of Bioinformatics, Dan Krane & Michael Raymer, ISBN 0805346333, Benjamin Cummings 2003
Discovering Genomics, Proteomics, & Bioinformatics, A. Campbell & L. Heyer, ISBN 0805347224, Benjamin Cummings 2002
Understanding Biotechnology, George Acquaah, ISBN 0130945005, Pearson Prentice Hall 2004
Understanding Biotechnology, A. Borem, F. Santos, D. Bowen, ISBN 0131010115, Pearson Prentice Hall 2003
Human Genome Project
http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/index.shtml
Genomics and Its Impact on Science and Society: The Human Genome Project and Beyond
U.S. Department of Energy Genome Programs http://doegenomes.org
National Center for Biotechnology Information www.ncbi.nlm.nih.gov
A user’s guide to the human genome
Nature Genetics www.nature.com/ng/ vol 32, pg 1-79, 01 Sep 2002
Introduction: putting it together
Question 8: How can one find all the members of a human gene family?
Question 12: How does a user find characterized mouse mutants corresponding to human genes?
Web resources: Internet resources featured in this guide
Get Schooled for Bioinformatics
• Biology
– Know basics & have sense of biological experimentation
• Computer Science
– Programming C, C++, Perl, JAVA, SAS, CGI
– Database construction UNIX, LINUX
– Algorithm design
• Math/Statistics
– Probability, Experimental design
• Ethics
• “Core Bioinformatics”
– LIMS
– EST clustering
– Sequence analysis & annotation
Fundamental Dogma
DNA → RNA → Proteins → Circuits → Phenotypes → Populations
GenBank, EMBL, DDBJ
Map Databases
SwissPROT, PIR
PDB
Gene Expression? Clinical Data? Regulatory Pathways? Metabolism? Biodiversity? Neuroanatomy? Development? Molecular Epidemiology? Comparative Genomics?
Although a few databases already exist to distribute molecular information, the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings.
Biological Research = To enable the discovery of new biological insights as well as create a global perspective from which unifying principles in biology can be discerned. NCBI, Aug 2001
Ultra-conserved element: only 6 SNPs among mouse, rat, and human
TGATCCCGGACTCTATGAATTATTGATGAGATATGAGCGTTGATTTCCCCTTTCAGGATGCAAACTCCATTATATTGTTAAAATGGCGATTTAATCGTTGAGAATAGCTTTGGTGTGGGTTTTTTCCCCCAACTCATTTGCGCCTCCTTCCTTTTCATTTAACTCTCTTAATTAAATCCTTTAACAGATTTTAATCACTTTTTGGAG
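Counting the positions that differ between species can be sketched as below. This assumes the sequences have already been aligned to equal length. The three short sequences are made-up placeholders, not the real mouse, rat, and human versions of the element above, and `variable_columns` is an illustrative name.

```python
def variable_columns(aligned):
    """Return the 0-based positions where aligned sequences are not all identical.

    `aligned` is a list of equal-length strings (one per species).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [
        position
        for position, column in enumerate(zip(*aligned))
        if len(set(column)) > 1
    ]


# Hypothetical aligned fragments, for illustration only.
human = "TGATCCCGGA"
mouse = "TGATCCCGGA"
rat = "TGACCCCGGA"

differences = variable_columns([human, mouse, rat])
print(len(differences), "variable position(s):", differences)
```

An element counts as ultra-conserved when this list stays very short over a long stretch of sequence.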