Filling in: Ioannis Pandis, PhD (i.pandis@ic.ac.uk) CO341: Introduction to Bioinformatics Prof. Yi-Ke...
![Page 1: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/1.jpg)
Filling in:
Ioannis Pandis, PhD
CO341: Introduction to Bioinformatics
Prof. Yi-Ke Guo (yg@ic.ac.uk)
![Page 2: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/2.jpg)
Sequencing and Genomics
DNA Sequencing
Sequencing Analysis
Gene Expression
Gene Expression Analysis
Functional Genomics
![Page 3: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/3.jpg)
DNA Structure
Double Helix (Crick & Watson)
– 2 coiled matching strands
– Backbone of sugar phosphate pairs
Nitrogenous Base Pairs
– Roughly 20 atoms in a base
– Adenine Thymine [A,T]
– Cytosine Guanine [C,G]
– Weak bonds (can be broken)
– Form long chains called polymers
Read the sequence on 1 strand
– GATTCATCATGGATCATACTAAC
![Page 4: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/4.jpg)
Differences in DNA
[Figure labels: 2% tiny; Roughly 4%; Share Material]
DNA differentiates:
– Species/race/gender
– Individuals
We share DNA with
– Primates, mammals
– Fish, plants, bacteria
Genotype
– DNA of an individual
– Genetic constitution
Phenotype
– Characteristics of the resulting organism
– Nature and nurture
![Page 5: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/5.jpg)
Genes
Chunks of DNA sequence
– Between 600 and 1200 bases long
– 22,000 human genes, 100,000 genes in tulips
Large percentage of human genome
– termed “junk”: does not code for proteins
“Simpler” organisms such as bacteria
– Are much more “evolved” (have hardly any junk)
– Viruses have overlapping genes (zipped/compressed)
Often the active part of a gene is split into exons
– Separated by introns
![Page 6: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/6.jpg)
Transcription
Take one strand of DNA
Write out the counterparts to each base
– G becomes C (and vice versa)
– A becomes T (and vice versa)
Change Thymine [T] to Uracil [U]
You have transcribed DNA into messenger RNA
Example:
Start: GGATGCCAATG
Intermediate: CCTACGGTTAC
Transcribed: CCUACGGUUAC
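The transcription recipe on this slide can be sketched in a few lines of Python. This is only an illustration of the slide's simplified rule (complement each base, then swap T for U); it ignores strand direction, and the function names `complement` and `transcribe` are made up for this example.

```python
# Base-pairing rule from the slide: A<->T and G<->C.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna):
    """Write out the counterpart of each base on one DNA strand."""
    return dna.upper().translate(DNA_COMPLEMENT)


def transcribe(dna):
    """Complement the strand, then change Thymine [T] to Uracil [U]."""
    return complement(dna).replace("T", "U")


if __name__ == "__main__":
    start = "GGATGCCAATG"
    print("Start:       ", start)
    print("Intermediate:", complement(start))   # CCTACGGTTAC
    print("Transcribed: ", transcribe(start))   # CCUACGGUUAC
```

Running it on the slide's start sequence reproduces the intermediate and transcribed strings shown above.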
![Page 7: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/7.jpg)
The Synthesis of Proteins
Instructions for generating Amino Acid sequences
– (i) DNA double helix is unzipped
– (ii) One strand is transcribed to messenger RNA
– (iii) RNA acts as a template: ribosomes translate the RNA into the sequence of amino acids
Amino acid sequences fold into a 3D molecule
Gene expression
– Every cell has every gene in it (has all chromosomes)
– Which ones produce proteins (are expressed) & when?
![Page 8: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/8.jpg)
Genetic Code
How the translation occurs
Think of this as a function:
– Input: triples of base letters (codons)
– Output: amino acid
– Example: ACC becomes threonine (T)
Gene sequences end with:
– TAA, TAG or TGA
![Page 9: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/9.jpg)
Example Synthesis
TCGGTGAATCTGTTTGAT
Transcribed to: AGCCACUUAGACAAACUA
Translated to: SHLDKL
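The example synthesis can be reproduced with a small codon lookup, treating the genetic code as a function from codons to amino acids as the previous slide suggests. This is a sketch, not a reference implementation: the table is the standard genetic code written in the usual U, C, A, G ordering, and the names `CODON_TABLE`, `transcribe` and `translate` are chosen for this example only.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G
# (third base changing fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Slide recipe: complement each base and write U instead of T."""
    return dna.upper().translate(str.maketrans("ACGT", "UGCA"))


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("TCGGTGAATCTGTTTGAT")
    print(mrna)             # AGCCACUUAGACAAACUA
    print(translate(mrna))  # SHLDKL
```

The three stop codons listed on the Genetic Code slide (TAA, TAG, TGA) appear here in their RNA form (UAA, UAG, UGA) and map to `*`.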
![Page 10: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/10.jpg)
Evolution of Genes: Inheritance
Evolution of species
– Caused by reproduction and survival of the fittest
But actually, it is the genotype which evolves
– Organism has to live with it (or die before reproduction)
– Three mechanisms: inheritance, mutation and crossover
Inheritance: properties from parents
– Embryo has cells with 23 pairs of chromosomes
– Each pair: 1 chromosome from father, 1 from mother
– Most important factor in offspring’s genetic makeup
![Page 11: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/11.jpg)
Evolution of Genes: Mutation
Genes alter (slightly) during reproduction
– Caused by errors, from radiation, from toxicity
– 3 possibilities: deletion, insertion, substitution
Substitution: ACGTTGACTC → ACGATGACTT
Deletion: ACGTTGACTC → ACGTGACTC
Insertion: ACGTTGACTC → AGCGTTGACTC
– Frameshift: ACGTTGACTC → AGCGTTGACTC
Mutations are categorised into:
– Neutral or
– Deleterious
A single change can have a massive effect on translation
– Causes a different protein conformation
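To see why an insertion causes a frameshift while a substitution does not, it helps to split the slide's sequences into codons. The sketch below uses the slide's own example strings; the helper names `codons` and `changed_codons` are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete three-letter codons, dropping leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def changed_codons(before, after):
    """Count codon positions that differ between two sequences."""
    return sum(a != b for a, b in zip(codons(before), codons(after)))


if __name__ == "__main__":
    original = "ACGTTGACTC"
    substituted = "ACGATGACTT"   # substitution example from the slide
    inserted = "AGCGTTGACTC"     # insertion (frameshift) example from the slide

    print(codons(original))      # ['ACG', 'TTG', 'ACT']
    print(codons(substituted))   # ['ACG', 'ATG', 'ACT']
    print(codons(inserted))      # ['AGC', 'GTT', 'GAC']
```

The substitution alters one complete codon, whereas the single inserted base shifts the reading frame and changes every codon after it.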
![Page 12: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/12.jpg)
Evolution of Genes: Crossover (Recombination)
DNA sections are swapped
– From male and female genetic input to offspring DNA
![Page 13: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/13.jpg)
Sequencing for Medical Study
Phenotype
Genotype
Hypothesis
Test Hypothesis by Genetic Manipulation
![Page 14: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/14.jpg)
Typical Cycle of the Study
Phenotype: Two groups: 1. Develop colorectal cancer at young age; 2. Do not
Genotype: Mutation in APC gene
Hypothesis: APC is a tumor suppressor gene
Test Hypothesis by Genetic Manipulation: Delete APC in mouse; control: isogenic APC+
![Page 15: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/15.jpg)
Technologies Required
Phenotype: Observation
Genotype: ?Sequencing? (in 2005: $9 million/genome, not feasible)
Hypothesis: Reading/Thinking
Test Hypothesis by Genetic Manipulation: Gene Deletion/Replacement
![Page 16: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/16.jpg)
Things are changing rapidly: bp per dollar increases exponentially with time
Adapted from Shendure et al 2004
In 1980, the sequencing cost per finished bp ≈ $1.00
In 2003, the sequencing cost per finished bp ≈ $0.01
>>> a 100-fold reduction in 20-25 years
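The 100-fold figure, and what "exponentially" implies for the time it takes the cost to halve, can be checked with a short calculation. This is a sketch that assumes a constant exponential decline between the two data points on the slide; the function names are illustrative.

```python
import math


def fold_reduction(cost_start, cost_end):
    """How many times cheaper the end cost is than the start cost."""
    return cost_start / cost_end


def halving_time_years(cost_start, cost_end, years):
    """Years needed for the cost to halve, assuming a steady exponential decline."""
    return years / math.log2(cost_start / cost_end)


if __name__ == "__main__":
    fold = fold_reduction(1.00, 0.01)
    halving = halving_time_years(1.00, 0.01, 2003 - 1980)
    print(f"{fold:.0f}-fold reduction, cost halves roughly every {halving:.1f} years")
```

Under that assumption, the per-bp cost halved roughly every three and a half years between 1980 and 2003.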
![Page 17: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/17.jpg)
History of DNA Sequencing
1870: Miescher: Discovers DNA
1940: Avery: Proposes DNA as ‘Genetic Material’
1953: Watson & Crick: Double Helix Structure of DNA
1965: Holley: Sequences Yeast tRNA(Ala)
1970: Wu: Sequences Cohesive End DNA
1977: Sanger: Dideoxy Chain Termination; Gilbert: Chemical Degradation
1980: Messing: M13 Cloning
1986: Hood et al.: Partial Automation
1990: Cycle Sequencing; Improved Sequencing Enzymes; Improved Fluorescent Detection Schemes
2002: Next Generation Sequencing; Improved enzymes and chemistry; Improved image processing
1928: ???
Efficiency (bp/person/year) values shown on the timeline: 1; 15; 150; 1,500; 15,000; 25,000; 50,000; 200,000; 50,000,000; 100,000,000,000 (2008)
Adapted from Eric Green, NIH; adapted from Messing & Llaca, PNAS (1998)
![Page 18: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/18.jpg)
History of DNA Sequencing
Adapted from Eric Green, NIH; adapted from Messing & Llaca, PNAS (1998)
Griffith's experiment, reported in 1928 by Frederick Griffith
![Page 19: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/19.jpg)
![Page 20: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/20.jpg)
Sanger Sequencing (Chain-termination Methods)
DNA is fragmented
Cloned to a plasmid vector
Cyclic sequencing reaction
Separation by electrophoresis
Readout with fluorescent tags
![Page 21: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/21.jpg)
Basics of the “old” technology
Clone the fragmented DNA.
Generate a ladder of labeled (colored) molecules that differ by 1 nucleotide.
Separate mixture on some matrix.
Detect fluorochrome by laser.
Interpret peaks as a string of DNA.
Strings are 500 to 1,000 letters long.
Assemble all strings into a genome.
The Process Is Sequential
![Page 22: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/22.jpg)
Human genome: 3 ∙ 10^9 bp (1x coverage)
Time: 10x coverage × 3 ∙ 10^9 bp ÷ (2 ∙ 10^6 bp/day) = 15,000 days ≈ 40 years
Cost: 10x coverage × 3 ∙ 10^9 bp × $0.001/bp = $30 million
That is what the old technology takes
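The arithmetic behind these two estimates is easy to reproduce. The sketch below plugs in the slide's own figures (genome size, coverage, daily throughput, per-bp cost); the constant and function names are illustrative.

```python
GENOME_BP = 3_000_000_000   # 3 ∙ 10^9 bp, from the slide
COVERAGE = 10               # 10x coverage, from the slide
BP_PER_DAY = 2_000_000      # 2 ∙ 10^6 bp/day, from the slide
COST_PER_BP = 0.001         # $0.001 per bp, from the slide


def sequencing_days(genome_bp, coverage, bp_per_day):
    """Days needed to read the genome `coverage` times at a fixed daily throughput."""
    return genome_bp * coverage / bp_per_day


def sequencing_cost(genome_bp, coverage, cost_per_bp):
    """Total cost of reading the genome `coverage` times at a fixed price per bp."""
    return genome_bp * coverage * cost_per_bp


if __name__ == "__main__":
    days = sequencing_days(GENOME_BP, COVERAGE, BP_PER_DAY)
    cost = sequencing_cost(GENOME_BP, COVERAGE, COST_PER_BP)
    print(f"{days:,.0f} days, about {days / 365:.0f} years")
    print(f"${cost:,.0f}")
```

This gives 15,000 days (about 41 years, which the slide rounds to 40) and $30,000,000.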
![Page 23: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/23.jpg)
New Generation Sequencing
![Page 24: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/24.jpg)
Basics of the “new” technology
Get DNA and fragment it.
Attach all fragments to glass slides.
Perform amplification by some form of PCR.
Sequence ALL these fragments in PARALLEL using chain termination or other methods such as pyro-sequencing.
Extend and amplify signal with some color scheme.
Detect fluorochrome by microscopy.
Interpret series of spots as short strings of DNA.
Strings are 30-300 letters long.
Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day).
Map or align strings to one or many genomes.
Making Millions of Short Sequence Reads in Parallel
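A rough feel for "millions of short reads" comes from the usual definition of coverage, C = N × L / G (number of reads times read length, divided by genome size). The sketch below solves for N. The 3 ∙ 10^9 bp genome and 10x coverage are borrowed from the old-technology slide, the 1,200,000,000 letters/day comes from this slide, and the 100-letter read length is an assumption picked from the 30-300 range.

```python
import math


def reads_needed(genome_bp, coverage, read_length):
    """Number of reads N such that N * read_length / genome_bp reaches the coverage."""
    return math.ceil(genome_bp * coverage / read_length)


def days_needed(genome_bp, coverage, letters_per_day):
    """Days of sequencing at a fixed daily throughput."""
    return genome_bp * coverage / letters_per_day


if __name__ == "__main__":
    genome = 3_000_000_000        # from the earlier slide
    coverage = 10                 # from the earlier slide
    read_length = 100             # assumption: a value inside the 30-300 range
    throughput = 1_200_000_000    # letters/day, from this slide

    print(f"{reads_needed(genome, coverage, read_length):,} reads")
    print(f"{days_needed(genome, coverage, throughput):.0f} days")
```

With these inputs the estimate is 300 million reads and 25 days, against roughly 40 years for the sequential process.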
![Page 25: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/25.jpg)
Technology Overview: Solexa/Illumina Sequencing
http://www.illumina.com/
![Page 26: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/26.jpg)
Immobilize DNA to Surface
Source: www.illumina.com
![Page 27: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/27.jpg)
Sequence Colonies
![Page 28: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/28.jpg)
Sequence Colonies
![Page 29: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/29.jpg)
Call Sequence
![Page 30: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/30.jpg)
From Debbie Nickerson, Department of Genome Sciences, University of Washington, http://tinyurl.com/6zbzh4
![Page 31: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/31.jpg)
Sequence Alignment
Meyerson et al, 2011
![Page 32: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/32.jpg)
So, how fast is cost going down?
2006: $10 million
2008: $100,000
2009: $10,000
2010: $5,000
2012: $1,000
???: $100
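One way to answer the slide's question is to turn each pair of consecutive data points into an average per-year fold reduction. The sketch below does that for the dated figures on the slide (the undated $100 entry is left out); the names `COSTS` and `annual_fold_reduction` are illustrative.

```python
# (year, cost per genome in dollars), as listed on the slide
COSTS = [(2006, 10_000_000), (2008, 100_000), (2009, 10_000), (2010, 5_000), (2012, 1_000)]


def annual_fold_reduction(year_start, cost_start, year_end, cost_end):
    """Average factor by which the cost shrinks per year between two data points."""
    return (cost_start / cost_end) ** (1 / (year_end - year_start))


if __name__ == "__main__":
    for (y0, c0), (y1, c1) in zip(COSTS, COSTS[1:]):
        print(f"{y0}-{y1}: about {annual_fold_reduction(y0, c0, y1, c1):.1f}x cheaper per year")
    (first_year, first_cost), (last_year, last_cost) = COSTS[0], COSTS[-1]
    overall = annual_fold_reduction(first_year, first_cost, last_year, last_cost)
    print(f"{first_year}-{last_year}: about {overall:.1f}x cheaper per year on average")
```

Over 2006 to 2012 as a whole the cost falls 10,000-fold, or a little under 5x per year on average.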
![Page 33: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/33.jpg)
Informatics
Informatics challenge: ample applications
– All genomics research can be done uniformly through sequencing (with the help of proper assay design)
– Bioinformatics turns the sequencer into a universal genomics interpreter
– Not a challenge, rather a big opportunity!!!
For Edison, the phonograph was not primarily designed for playing music but …….
![Page 34: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/34.jpg)
One Stone, Many Birds: NGS May Enable a Uniform Bioinformatics
Mapped Position: Structure/functionality (Mapping)
BP Variant: SNP & Mutation Pattern (Detecting)
Read Numbers: Quantified Abundance (Counting)
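The "counting" idea, where the number of reads mapped to a region stands in for its abundance, can be sketched with a simple tally. Everything in this example is hypothetical: the gene labels, the list of mapped reads and the function names are invented to show the principle, not taken from the slides.

```python
from collections import Counter


def count_reads(mapped_genes):
    """Tally how many reads were mapped to each gene."""
    return Counter(mapped_genes)


def relative_abundance(counts):
    """Convert raw read counts into fractions of all mapped reads."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical output of a mapping step: one gene label per mapped read.
    mapped = ["geneA", "geneB", "geneA", "geneA"]
    counts = count_reads(mapped)
    print(counts)                      # Counter({'geneA': 3, 'geneB': 1})
    print(relative_abundance(counts))  # {'geneA': 0.75, 'geneB': 0.25}
```

Real pipelines normalise further (for example by gene length), which this toy example leaves out.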
![Page 35: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/35.jpg)
Match These Sequences
How do we match this sequence:
gattcagacctagct
With this sequence:
gtcagatcct
![Page 36: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/36.jpg)
Possible Answers
1. gattcagacctagct (no indels) gtcagatcct
2. gattcaga-cctagct (with indels) g-t-cagatcct
3. gattcagacctagc-t (no overhang) gtcagatcct
4. gattcagacctagct (with overhang) gtcagatcct
![Page 37: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/37.jpg)
Sequence Matching Algorithms #1
Without indels
Hamming distance
Scoring schemes
– Certain changes in sequence more likely
– Due to chemical properties of the residues
BLAST algorithm
– Idea: match local regions and expand
– Seven part process
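Matching without indels can be illustrated by sliding the shorter sequence along the longer one and counting mismatches (the Hamming distance) at each offset. This is a brute-force sketch using the two sequences from the "Match These Sequences" slide; it is not BLAST, and the function names are chosen for this example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def best_ungapped_offset(long_seq, short_seq):
    """Try every offset without overhang; return (offset, distance) of the best one."""
    best = None
    for offset in range(len(long_seq) - len(short_seq) + 1):
        window = long_seq[offset:offset + len(short_seq)]
        distance = hamming(window, short_seq)
        if best is None or distance < best[1]:
            best = (offset, distance)
    return best


if __name__ == "__main__":
    print(best_ungapped_offset("gattcagacctagct", "gtcagatcct"))  # (2, 4)
```

For the slide's sequences the best placement without indels or overhang starts at offset 2 and still leaves 4 mismatches, which is why alignments with indels are considered next.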
![Page 38: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/38.jpg)
Sequence Matching Algorithms #2
With indels
Drawing of Dotplots
Dynamic Programming
– (getting from A to B) Quickest route to Z + Quickest route from Z
Example sequences: VPFLLMMVLG and VPFMMLG
[Figure: route graph with nodes A, B, C, D, E, F, G and Z]
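The dynamic-programming idea (the best route to the end is built from best routes to intermediate points) can be sketched as a global alignment in the style of Needleman-Wunsch. The slides do not name a specific algorithm or scoring scheme, so the method and the scores used here (match +1, mismatch -1, gap -1) are assumptions for illustration, applied to the slide's two example strings.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with indels; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = global_align("VPFLLMMVLG", "VPFMMLG")
    print(total)
    print(top)
    print(bottom)
```

With these scores the two strings align as `VPFLLMMVLG` over `VPF--MM-LG`: seven matches and three gaps, for a total of 4.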
![Page 39: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/39.jpg)
Searching Databases
We have ways to score how well two sequences match
Now want to use this in databases
– Given a known gene sequence
– Which genes in the database are closely related
Have to worry about:
– Repeated subsequences biasing matches
– Accuracy and significance of matches
– Sensitivity and specificity (false positives and false negatives)
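Sensitivity and specificity can be made concrete by comparing the set of genes a search reports against the set that is truly related. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the gene identifiers and the function name `search_quality` are hypothetical.

```python
def search_quality(reported, truly_related, database):
    """Return (sensitivity, specificity) of a database search result."""
    tp = len(reported & truly_related)             # related and reported
    fp = len(reported - truly_related)             # reported but not related (false +)
    fn = len(truly_related - reported)             # related but missed (false -)
    tn = len(database - reported - truly_related)  # correctly left out
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical database of ten genes, four of which are truly related to the query.
    database = {f"g{i}" for i in range(1, 11)}
    truly_related = {"g1", "g2", "g3", "g4"}
    reported = {"g1", "g2", "g3", "g5"}
    sensitivity, specificity = search_quality(reported, truly_related, database)
    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In this toy case one related gene is missed and one unrelated gene is reported, giving a sensitivity of 0.75 and a specificity of about 0.83. The function assumes the database contains at least one related and one unrelated gene, otherwise a denominator would be zero.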
![Page 40: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/40.jpg)
Functional Genomics—Transcriptomics
Transcriptome – the complete set of coding and non-coding RNA molecules in a cell at a particular time: Varies between cell types
Transcriptomics – the study of the transcripts in a cell, cell type, organism, etc.
![Page 41: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/41.jpg)
Methods for Transcriptomics
Microarray-based:
– High-throughput gene expression profiling
– Hybridization of labeled cDNAs to an array of complementary DNA probes
– Measurement of expression levels based on hybridization intensity
Sequence-based:
– Full-length cDNA (FLcDNA) sequencing: complete sequencing of cDNA clone
– Expressed sequence tag (EST) sequencing: Single-pass sequencing of cDNA clone
– Serial Analysis of Gene Expression (SAGE): Short sequence tags at 3’ end of transcript; tags concatenated and sequenced
NGS enables whole transcriptome sequencing: Sequence Census Method
![Page 42: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/42.jpg)
Machine Learning
Machine learning (inductive reasoning)
– Automatic proposing of hypotheses based on data
– Has many applications in bioinformatics, such as microarray analysis
Example: predictive toxicology
– Given: set of toxic drugs and a set of non-toxic drugs
– Given: background information (chemistry, etc.)
– Produces: hypothesis why drugs are toxic/toxic mechanism
Overview of machine learning
– Aims, techniques, methodologies, representations
– Artificial neural networks
– Support vector machines, etc.
![Page 43: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/43.jpg)
Machine Learning
Larrañaga et al. 2005
![Page 44: Filling in: Ioannis Pandis, PhD i.pandis@ic.ac.uk CO341: Introduction to Bioinformatics Prof. Yi-Ke Guo (yg@ic.ac.uk)](https://reader036.fdocuments.net/reader036/viewer/2022062517/56649ef65503460f94c09cbb/html5/thumbnails/44.jpg)
QUESTIONS? The End!