Adapter and quality trimming
Mick Watson, Director of ARK-Genomics
The Roslin Institute
ADAPTER TRIMMING
Illumina technology
• Watch a video? http://www.youtube.com/embed/45vNetkGspo
Illumina technology
Bridge Amplification
Key point:
• Sequence from Illumina may contain adapters
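To make the key point concrete, here is a minimal sketch of 3’ adapter removal in Python. It is an illustration only, not how cutadapt works internally: it uses exact matching, whereas real tools tolerate sequencing errors. The function name `trim_adapter`, the `min_overlap` parameter and the adapter string (the prefix commonly quoted for Illumina TruSeq adapters) are assumptions made for this example.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter, and everything after it, from a read.

    Exact matching only; this is a teaching sketch, not a real trimmer.
    """
    # Case 1: the whole adapter occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends part-way through the adapter, so only a
    # prefix of the adapter is present at the 3' end of the read.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter found: return the read unchanged.
    return read


# Assumed adapter sequence for illustration; check your own library prep.
ADAPTER = "AGATCGGAAGAGC"

print(trim_adapter("ACGTACGT" + ADAPTER + "TTTT", ADAPTER))  # ACGTACGT
print(trim_adapter("ACGTACGTAGATC", ADAPTER))                # ACGTACGT
```

The second call shows why a partial match at the end of the read matters: a short insert can leave only the first few adapter bases in the read.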
QUALITY TRIMMING
Quality trimming
• Take every read
• Remove bases at the 3’ end (usually) or the 5’ end (sometimes) that are below a quality threshold
• Either remove everything after the first bad base
• Or remove everything after the point where the average quality within a sliding window falls below the threshold
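The two strategies on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm sickle actually uses: it assumes Phred+33 quality encoding, and the threshold of 20, the window of 4 and the function names are hypothetical choices for the example. The sliding-window version cuts at the start of the first window whose average falls below the threshold.

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding (an assumption; older data may use +64).
    """
    return [ord(char) - offset for char in quality]


def trim_first_bad_base(seq, quality, threshold=20):
    """Cut the read at the first base whose quality is below threshold."""
    for i, score in enumerate(phred_scores(quality)):
        if score < threshold:
            return seq[:i], quality[:i]
    return seq, quality


def trim_sliding_window(seq, quality, threshold=20, window=4):
    """Cut the read at the start of the first window whose average
    quality is below threshold."""
    scores = phred_scores(quality)
    # Reads shorter than the window are treated as a single window.
    window = min(window, len(scores))
    if window == 0:
        return seq, quality
    for start in range(len(scores) - window + 1):
        if sum(scores[start:start + window]) / window < threshold:
            return seq[:start], quality[:start]
    return seq, quality


# 'I' is Phred 40 and '#' is Phred 2 under Phred+33.
print(trim_first_bad_base("ACGTACGT", "II#IIIII"))  # ('AC', 'II')
print(trim_sliding_window("ACGTACGT", "II#IIIII"))  # ('ACGTACGT', 'II#IIIII')
print(trim_sliding_window("ACGTACGT", "IIII####"))  # ('ACG', 'III')
```

The first two calls show the practical difference: a single isolated bad base ends the read under the first strategy, while the sliding window tolerates it because the surrounding bases keep the average high.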
Paired-end and mate-pair
A Paired-end (700bp): 2 x 100bp reads approx. 500bp apart
B Mate-pair (3000bp): 2 x 50bp reads approx. 3000bp apart
Paired reads
• Paired reads are represented by TWO fastq files
• Often named the same, with _1.fastq and _2.fastq
• Or R1.fastq and R2.fastq
• Order of reads matters
• Read 1 in file 1 is paired with read 1 in file 2
• Etc
• What happens if your quality trimmer removes a read from one file but not the other?
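One way to answer that question is to decide the fate of each pair as a unit, so the two output files never drift out of step. The sketch below shows the idea in Python on already-trimmed reads held in memory; it is not sickle's implementation. The function name `split_pairs`, the `(name, sequence, quality)` tuple layout and the `min_length` cut-off are assumptions made for the example.

```python
def split_pairs(reads_1, reads_2, min_length=20):
    """Keep two read lists in sync after trimming.

    reads_1 and reads_2 are lists of (name, sequence, quality) tuples
    in matching order. A pair is kept only if BOTH reads are still at
    least min_length long; a lone survivor goes to a 'singles' list.
    """
    kept_1, kept_2, singles = [], [], []
    for read_1, read_2 in zip(reads_1, reads_2):
        ok_1 = len(read_1[1]) >= min_length
        ok_2 = len(read_2[1]) >= min_length
        if ok_1 and ok_2:
            # Both mates survive: write them at the same position.
            kept_1.append(read_1)
            kept_2.append(read_2)
        elif ok_1:
            singles.append(read_1)
        elif ok_2:
            singles.append(read_2)
        # If neither mate survives, the pair is dropped entirely.
    return kept_1, kept_2, singles


reads_1 = [("a/1", "ACGTAC", "IIIIII"), ("b/1", "AC", "II"), ("c/1", "ACGTA", "IIIII")]
reads_2 = [("a/2", "TTTTTT", "IIIIII"), ("b/2", "GGGGGG", "IIIIII"), ("c/2", "", "")]

kept_1, kept_2, singles = split_pairs(reads_1, reads_2, min_length=4)
print([r[0] for r in kept_1])   # ['a/1']
print([r[0] for r in kept_2])   # ['a/2']
print([r[0] for r in singles])  # ['b/2', 'c/1']
```

Because reads are only ever added to the two paired lists together, read N in the first output is still paired with read N in the second.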
Paired-end aware software?
• We will use sickle to trim on quality
– It is paired-end aware
• We will use cutadapt to remove adapters
– It is not paired-end aware