Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from...
Transcript of Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from...
![Page 1: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/1.jpg)
Jason WilliamsCold Spring Harbor Laboratory, DNA Learning Center
[email protected]@JasonWilliamsNY
DNALC Live
Barcoding BioinformaticsPart II
![Page 2: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/2.jpg)
DNALC LiveThis is an experiment; give us feedback
on what you would like to see!
![Page 3: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/3.jpg)
DNALC Live• Provide genetics, molecular biology, and bioinformatics learning
resources
• Laboratory and computer demos, short online courses for middle school, high school, and the general public
• Interviews with scientists, help for teachers
• At-home activities, social media contests, and more
![Page 4: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/4.jpg)
dnalc.cshl.edu
DNALC Website and Social Media
dnalc.cshl.edu/dnalc-live
![Page 5: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/5.jpg)
youtube.com/DNALearningCenter
DNALC Website and Social Media
facebook.com/cshldnalc
@dnalc
@dna_learning_center
![Page 6: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/6.jpg)
Barcoding BioinformaticsPart II
![Page 7: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/7.jpg)
Who is this course for?
• Audience(s): US AP Biology (high school grades 10-12) AND Intro undergraduate biology
• Format: 3 sessions (1 per week); ~ 45 minutes each
• Exercises: Follow along with our online bioinformatics tool DNA Subway
• Learning resources: Slides and packet available (teachers can also request the teacher edition)
![Page 8: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/8.jpg)
Course Learning Goals
• Learn how DNA can be used to identify unknown organisms
• Understand how we obtain DNA Sequence and access its quality
• Use BLAST* to compare an unknown DNA Sequence to known sequences
• Compare DNA Sequences using phylogenetics
*AP Bio (Lab 3 – Comparing DNA Sequences)
![Page 9: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/9.jpg)
Lab Setup• We will be using DNA Subway – You can get a free account at cyverse.org
(optional)
![Page 10: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/10.jpg)
Barcoding BioinformaticsPart II
(Sequence cleaning and BLAST)
![Page 11: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/11.jpg)
Steps for today’s session
• Recap on our experimental dataset
• Review of sequence quality
• Sequence cleaning and pairing
• Introduction to BLAST
![Page 12: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/12.jpg)
Recap of the dataset
![Page 13: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/13.jpg)
Steps to DNA Barcoding
ACGAGTCGGTAGCTGCCCTCTGACTGCATCGAATTGCTCCCCTACTACGTGCTATATGCGCTTACGATCGTACGAAGATTTATAGAATGCTGCTACTGCTCCCTTATTCGATAACTAGCTCGATTATAGCTACGATG
Organism is sampled DNA is extracted “Barcode” amplified
Sequenced DNA is compared with DNA in a barcode database
![Page 14: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/14.jpg)
Example barcoding experiment
Mary Acheampong,Bobby Glover, and Marisa VanBrakle
Mentor: Allison GranberryHostos-Lincoln Academy of Science, The Bronx
2012 UBP Grand Prize Winners
![Page 15: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/15.jpg)
Example barcoding experiment
![Page 16: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/16.jpg)
Aedes adult Anopheles adult Culex adult
By Muhammad Mahdi Karim - Own work, GFDL 1.2, https://commons.wikimedia.org/w/index.php?curid=11185617
By Jim Gathany - (PHIL), ID #5814. https://commons.wikimedia.org/w/index.php?curid=799284 By Muhammad Mahdi Karim - Own work, GFDL 1.2, https://commons.wikimedia.org/w/index.php?curid=7673048
Aedes larva Anopheles larva Culex larva
Photograph by Michele M. Cutwa, University of Florida.Photograph by Michelle Cutwa-Francis, University of Florida.
![Page 17: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/17.jpg)
![Page 18: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/18.jpg)
Why does this matter?
Aedes:
● Chikungunya● Dengue fever● Lymphatic filariasis● Rift Valley fever● Yellow fever● Zika
Anopheles:
● Malaria● Lymphatic filariasis
Culex:
● Japanese encephalitis● Lymphatic filariasis● West Nile fever
![Page 19: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/19.jpg)
Experimental components/design
• We have DNA from unknown mosquito samples• We can obtain DNA from known samples
Materials
![Page 20: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/20.jpg)
Experimental components/design
• We have DNA from unknown mosquito samples• We can obtain DNA from known samples
Materials
Hypothesis• We can use computational methods (BLAST/phylogenetic analysis) to
infer the species
![Page 21: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/21.jpg)
Experimental components/design
• We have DNA from unknown mosquito samples• We can obtain DNA from known samples
Materials
Hypothesis• We can use computational methods (BLAST/phylogenetic analysis) to
infer the species
Controls• We have sensitivity controls (sequence quality, BLAST parameters)• We have outgroup sequences (non-mosquito, negative controls) and
known samples (positive controls)
![Page 22: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/22.jpg)
Review of sequencing and quality
![Page 23: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/23.jpg)
https://dnalc.cshl.edu/resources/animations/cycseq.html
DNA Sequencing
![Page 24: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/24.jpg)
Chromatogram/Electropherogram
![Page 25: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/25.jpg)
Some sequence examples…
![Page 26: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/26.jpg)
Phred scores…
Phred Score Error (bases miscalled) Accuracy10 1 in 10 90%20 1 in 100 99%30 1 in 1,000 99.9%40 1 in 10,000 99.99%50 1 in 100,000 99.999%
![Page 27: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/27.jpg)
If 99% was good enough
Photo credithttp://www.personal.psu.edu/sxt104/class/99percent.html
![Page 28: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/28.jpg)
A note on controls
At what temperature does ice (H2O) + Chemical “X”
melt?
![Page 29: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/29.jpg)
A note on controls
Positive control: What does the effect look like if present?
![Page 30: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/30.jpg)
A note on controls
Positive control: What does the effect look like if present?
Negative control: What does the effect look like if absent?
![Page 31: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/31.jpg)
A note on controls
Positive control: What does the effect look like if present?
Negative control: What does the effect look like if absent?
Sensitivity control: Across what range of values can I measurethe effect?
![Page 32: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/32.jpg)
A note on controls
Positive control Negative controlPhoto creditshttps://commons.wikimedia.org/wiki/File:Water_in_a_beaker.JPGhttp://www.chem.uiuc.edu/webfunchem/temperature/Temp10.htmhttps://www.dreamstime.com/photos-images/alcohol-thermometer.html
Sensitivity control
![Page 33: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/33.jpg)
A note on controls
Positive control Negative controlPhoto creditshttps://commons.wikimedia.org/wiki/File:Water_in_a_beaker.JPGhttp://www.chem.uiuc.edu/webfunchem/temperature/Temp10.htmhttps://en.clipdealer.com/vector/media/A:17494508?
Sensitivity control
![Page 34: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/34.jpg)
Phred are our measure of quality (signal/noise)
Lower score = more noise than signal
![Page 35: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/35.jpg)
Bi-directional sequencing
Photo credithttps://www.omicsonline.org/articles-images/CMBO-2-108-g003.html
![Page 36: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/36.jpg)
Reverse complementation
Photo credithttps://biology.stackexchange.com/questions/56304/manual-primer-design-for-a-gene-on-the-reverse-strand
![Page 37: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/37.jpg)
Reverse complementation
Photo credithttps://image.slidesharecdn.com/pcrprimerdesignenglishversion-160317161103/95/pcr-primer-design-english-version-10-638.jpg?cb=1458231192
![Page 38: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/38.jpg)
Clean up and consensus
![Page 39: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/39.jpg)
Introduction to BLAST
![Page 40: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/40.jpg)
Basic Local Alignment Search Tool
• An algorithm for searching a database of sequences
![Page 41: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/41.jpg)
Basic Local Alignment Search Tool
• An algorithm for searching a database of sequences
• “Google for DNA” (although works with any biologicalsequence, and started before Google ~1985)
![Page 42: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/42.jpg)
Basic Local Alignment Search Tool
• An algorithm for searching a database of sequences
• “Google for DNA” (although works with any biologicalsequence, and started before Google ~1990 vs 1998)
• NCBI is the most popular interface, but this is software thatcan be run anywhere (including Subway)
![Page 43: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/43.jpg)
Warning: Analogy(useful for discussion but not the whole picture)
![Page 44: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/44.jpg)
BLAST algorithm analogy
Query sequenceACTGACATCGGGGTGCTACG
Database
![Page 45: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/45.jpg)
BLAST algorithm analogy
Query sequenceACTGACATCGGGGTGCTACG
Database
TGCTAACTGACTGCTAACTGAGACTG
TGCTAAC
TGCTAACTAC
TGCTAACTGAG
TGCTAACTGAG
ACTGTGCTAAC
![Page 46: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/46.jpg)
BLAST algorithm analogy
Photo credithttps://www.nlm.nih.gov/about/2018CJ.html
![Page 47: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/47.jpg)
BLAST algorithm analogy – searching by “word”
Break the Query sequenceInto “words” (k-mers)
ACT GAC ATC GGG GTG CTA CG
Database
TGCTAACTGACTGCTAACTGAGACTG
TGCTAAC
TGCTAACTAC
TGCTAACTGAG
TGCTAACTGAG
ACTGTGCTAAC
![Page 48: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/48.jpg)
BLAST algorithm analogy – searching by “word”
Break the Query sequenceInto “words” (k-mers)
ACT GAC ATC GGG GTG CTA CG
Database
ACT… TCT … GCT …
![Page 49: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/49.jpg)
Let’s BLAST a sequence
>mosquito-1FCTTTAAGTATATTAATTCGTGCTGAATTAAGTCACCCAGGGATATTTATTGGAAATGATCAAATTTATAACGTAATTGTTACAGCTCATGCATTTATTATAATTTTTTTTATAGTAATACCAATTATAATTGGAGGATTTGGAAATTGATTAGTTCCTTTAATATTAGGAGCTCCTGATATAGCATTTCCTCGAATAAATAATATAAGTTTTTGAATATTACCTCCTTCTTTAACTCTACTACTTTCTAGTTCAATAGTAGAAAATGGAGCAGGGACAGGATGAACAGTTTATCCTCCTCTTTCATCAGGAACAGCACATGCTGGAGCTTCTGTTGATTTAGCAATTTTCTCTCTTCATTTAGCAGGGATTTCATCTATTTTAGGAGCAGTAAATTTTATTACTACTGTTATTAATATACGATCATCTGGAATTACTTTAGATCGATTACCTTTATTTGTTTGATCTGTAGTAATTACTGCTATTTTATTACTTTTATCTCTTCCTGTATTAGCTGGAGCTATTACTATATTATTAACTGATCGAAATTTAAATACTTCCTTCTTTGACCCAATTGGAGGAGGAGA
https://blast.ncbi.nlm.nih.gov/Blast.cgi
![Page 50: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/50.jpg)
BLAST and controls
Photo credithttps://fasta.bioch.virginia.edu/biol4230/lects/biol4230_4_blast2.pdf
![Page 51: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/51.jpg)
BLAST algorithm analogy – searching by “word”
The Query sequenceIs aligned to a Subject (a sequence in the database)
Q: ACTGAC–ATCGGGGTGCTACG||| ||| |||| | || |||| ||
S: ACTGACCATCGGAGTGCTACG
![Page 52: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/52.jpg)
BLAST algorithm analogy – alignment
Photo credithttps://www.ncbi.nlm.nih.gov/books/NBK62051/
![Page 53: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/53.jpg)
Let’s do a BLAST
![Page 54: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/54.jpg)
Some BLAST definitions
• Max Score: Highest alignment score (according to a formula)
![Page 55: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/55.jpg)
Some BLAST definitions
• Max Score: Highest alignment score (according to a formula)
• Query Cover: % of the query length included in aligned segment
![Page 56: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/56.jpg)
Some BLAST definitions
• Max Score: Highest alignment score (according to a formula)
• Query Cover: % of the query length included in aligned segment
• E value: The number of alignments expected by chance with the calculated score or better
![Page 57: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/57.jpg)
Some BLAST definitions
• Max Score: Highest alignment score (according to a formula)
• Query Cover: % of the query length included in aligned segment
• E value: The number of alignments expected by chance with the calculated score or better
• Per. Identity: Highest % identity for a set of aligned segments to the same subject sequence.
![Page 58: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/58.jpg)
Does BLAST tell me what species I have identified?
![Page 59: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/59.jpg)
No*
![Page 60: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/60.jpg)
(Some) Limitations to BLAST
• Homology: BLAST is trying to indicate which homologous (related by ancestry) sequences are found in the database
![Page 61: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/61.jpg)
(Some) Limitations to BLAST
• Homology: BLAST is trying to indicate which homologous (related by ancestry) sequences are found in the database
• Data base coverage: BLAST returns its best result; that is not guaranteed to be the true result
![Page 62: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/62.jpg)
(Some) Limitations to BLAST
• Homology: BLAST is trying to indicate which homologous (related by ancestry) sequences are found in the database
• Data base coverage: BLAST returns its best result; that is not guaranteed to be the true result
• Locus resolution: Barcodes are often good for genus-level resolution
![Page 63: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/63.jpg)
A note on resolution (and controls)
Photo credithttp://physwiki.apps01.yorku.ca/index.php?title=Main_Page/BPHS_4090/microscopy_I
![Page 64: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/64.jpg)
Next time:
Multiple sequence alignments and phylogenetics
![Page 65: Barcoding Bioinformatics Part II · Experimental components/design • We haveDNA from unknownmosquitosamples • We can obtain DNA from known samples Materials Hypothesis • We](https://reader033.fdocuments.net/reader033/viewer/2022060313/5f0b61d77e708231d4303e61/html5/thumbnails/65.jpg)
dnalc.cshl.edu
DNALC Website and Social Media
dnalc.cshl.edu/dnalc-live