Tutorial 6

High Throughput Sequencing

HTS tools and analysis

• Visualization - IGV

• Analysis platform – Galaxy

• Tuning up the pipelines

Working with IGV

http://www.broadinstitute.org/igv

Why and how to work with IGV

Base qualities, comparison between samples

False positive indels

Same mapping statistics – different meaning

What might cause this low percentage of mapping?

The sample contains a high percentage of contamination

The sample is very different from the reference genome
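The mapping percentage itself is easy to compute from a SAM file. Below is a minimal sketch, assuming standard tab-separated SAM text in which the second column is the FLAG and bit 0x4 marks an unmapped read. The function name `mapping_percentage` and the example records are hypothetical. The number alone cannot tell contamination apart from a divergent sample, which is why the reads still need to be inspected.

```python
def mapping_percentage(sam_lines):
    """Return the percentage of primary reads that are flagged as mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Skip secondary (0x100) and supplementary (0x800) alignments
        # so that each read is counted once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


# Hypothetical records: two mapped reads and two unmapped reads.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII",
    "r3\t16\tchr1\t50\t30\t5M\t*\t0\t0\tGGGCC\tIIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tCCCCC\tIIIII",
]

print(mapping_percentage(example))  # prints 50.0
```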

One image is worth a thousand words…

Structural Variations

Large deletion in the sample compared to the reference genome

Galaxy

Use your account name and password to log in to Galaxy.

Uploading data to Galaxy

Mapping, filtering and conversion to BAM

Mapping

Filter SAM file

Convert SAM to BAM
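What the SAM filtering step does can be sketched in a few lines. This is an illustration only, not the Galaxy tool's implementation: it assumes standard SAM columns (FLAG in column 2, MAPQ in column 5), and the function name `filter_sam`, the default threshold of 20 and the example records are all hypothetical.

```python
def filter_sam(sam_lines, min_mapq=20):
    """Keep header lines and mapped reads with MAPQ >= min_mapq."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)  # header lines are always kept
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x4:
            continue  # drop unmapped reads
        if mapq < min_mapq:
            continue  # drop low-confidence mappings
        kept.append(line)
    return kept


# Hypothetical records for illustration.
records = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII",
    "r3\t0\tchr1\t80\t5\t5M\t*\t0\t0\tGGGCC\tIIIII",
    "r4\t16\tchr1\t50\t37\t5M\t*\t0\t0\tGGGCC\tIIIII",
]

filtered = filter_sam(records)
print([line.split("\t")[0] for line in filtered])
```

Outside Galaxy, the conversion step is commonly done with `samtools view -b`, which writes the same records in the compressed binary BAM format.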

Variant calling

Create pileup

Find variants
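The two variant-calling steps can be illustrated with a toy example. The sketch below assumes ungapped reads given as (0-based start, sequence) pairs, so it cannot detect indels, and it uses a plain count-based rule instead of the statistical models real callers apply. The names `build_pileup` and `find_variants`, the thresholds and the data are hypothetical.

```python
from collections import Counter, defaultdict


def build_pileup(reads):
    """Count the bases observed at each reference position."""
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    return pileup


def find_variants(pileup, reference, min_depth=3, min_fraction=0.5):
    """Report positions where a non-reference base reaches min_fraction."""
    variants = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        ref_base = reference[pos]
        for base, count in counts.most_common():
            if base != ref_base and count / depth >= min_fraction:
                variants.append((pos, ref_base, base, depth))
                break
    return variants


# Hypothetical data: three of the four reads covering position 4 carry T, not A.
reference = "ACGTACGTAC"
reads = [
    (0, "ACGTAC"),
    (2, "GTTCGT"),
    (3, "TTCGTA"),
    (4, "TCGTAC"),
]

pileup = build_pileup(reads)
print(find_variants(pileup, reference))  # prints [(4, 'A', 'T', 4)]
```

Raising `min_depth` to 5 removes the call, since only four reads cover that position. This is one way low coverage hides a real variant.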

Tuning up the pipelines

1 mismatch per read

5 mismatches per read

How can mapping parameters affect the results

False positives vs. true negatives

3-bases insertion
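The effect of the mismatch limit can be shown with a toy ungapped matcher. This is a sketch for illustration, not how real aligners work: it slides the read along the reference and counts mismatches, and the names `best_ungapped_hit` and `maps` and the sequences are hypothetical. The read below differs from its true origin at three bases.

```python
def best_ungapped_hit(read, reference):
    """Return (start, mismatches) of the best ungapped placement, or None."""
    best = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


def maps(read, reference, max_mismatches):
    """True if the read can be placed within the mismatch limit."""
    hit = best_ungapped_hit(read, reference)
    return hit is not None and hit[1] <= max_mismatches


# Hypothetical sequences: the read is reference[5:17] with three substitutions.
reference = "GATTACAGGCTTCAAGTCCGATGCA"
read = "CATGCTACAACT"

print(maps(read, reference, max_mismatches=1))  # prints False
print(maps(read, reference, max_mismatches=5))  # prints True
```

A strict limit leaves a divergent read unmapped, so real variants in that region can be missed. A permissive limit places the read, but it also admits more spurious placements that can show up as false variants.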

One pipeline for all projects?

How can you tune your analysis?

• Try different programs.

• Mapping:
– Change mapping parameters
– Use non-unique mappings
– Don’t filter duplicates

• Variants:
– Change variant filtration
– Change variant merging (penetrance, different heredity, low coverage in one individual…)
– Look for bigger variants: big insertions/deletions, inversions, copy number variations, etc.

• Gene expression:
– Change the test threshold