CLC bio presentation at 5th SFAF 6/3/2010

CLC bio: A Comprehensive Platform for NGS Data Analysis. Saul A. Kravitz, PhD, Director of Consulting Services


My presentation at the 5th Sequencing, Finishing and Analysis in the Future meeting (SFAF -- http://www.lanl.gov/conferences/finishfuture/2010SFAF_Meeting_Guide.pdf), June 3, 2010

Transcript of CLC bio presentation at 5th SFAF 6/3/2010

Page 1: CLC bio presentation at 5th SFAF 6/3/2010

CLC bio

A Comprehensive Platform for NGS Data Analysis

Saul A. Kravitz, PhD

Director of Consulting Services

Page 2: CLC bio presentation at 5th SFAF 6/3/2010

Before the Flood

2005: $5M Human genome – 19 sequencer years

Sample Prep, Sequencing, Analysis, Science

Page 3: CLC bio presentation at 5th SFAF 6/3/2010

Nextgen Sequencing Revolution

Sample Prep, Sequencing, Analysis, Science

2010: $6k Human genome ~1 sequencer day

Help!!

Page 4: CLC bio presentation at 5th SFAF 6/3/2010

Bioinformatics Challenges

•Data Analysis Tools for Biomedical Researchers

•GUI-driven

•HPC integration

•Unprecedented data volumes

•Rapid technology change, applications growth

•Multi-platform data integration

•No one-size-fits-all solutions

•Rapid customization and adaptation

Page 5: CLC bio presentation at 5th SFAF 6/3/2010

CLC bio NGS Analysis Platform

CLC Genomics Workbench: Easy to use, wizard-driven desktop software

CLC Genomics Server: Enterprise solution

CLC Assembly Cell: High performance NGS algorithms

Developer SDK: Workbench and Server customization

Page 6: CLC bio presentation at 5th SFAF 6/3/2010

Swiss Army Knife of NGS Analysis

Genomics, Transcriptomics, Epigenomics

RNA-Seq, miRNA, ChIP-Seq

Read Mapping, De Novo Assembly, SNP/DIP Detection

Visualization

File Format Conversion

Desktop Solutions, Enterprise Solutions

Traditional Bioinformatics

Intuitive GUI, SDK

Tools Integration

High Performance

Page 7: CLC bio presentation at 5th SFAF 6/3/2010

Why not use free tools?

•Are tools free or “free”?

•Tools vs solutions

•True cost of ownership

•Ease of Use

•Tools integration

•Support

Page 8: CLC bio presentation at 5th SFAF 6/3/2010

Small RNA Analysis (in Beta soon)

•Identify and filter/trim adapters

•Annotate using miRBase and other resources

- target species of interest

•Merge/group by mature, precursor/reference

•Fully integrated with expression analysis
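The adapter filter/trim step listed above can be illustrated with a minimal Python sketch. It is not CLC bio's implementation: the function name `trim_adapter`, the `min_overlap` rule, and the example sequences are assumptions made for illustration, and only exact (mismatch-free) adapter matches are handled.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 5) -> str:
    """Remove a 3' adapter (exact match only) from a read.

    Looks for the full adapter anywhere in the read first, then for a
    prefix of the adapter hanging off the 3' end of the read.
    """
    # Full adapter present: keep everything before it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Partial adapter at the 3' end: try the longest overlap first.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


# Hypothetical example sequences, chosen only to exercise the function.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
INSERT = "TAGCTTATCAGACTGATGTTGA"
trimmed_full = trim_adapter(INSERT + ADAPTER, ADAPTER)
trimmed_partial = trim_adapter(INSERT + ADAPTER[:7], ADAPTER)
```

A production trimmer would also tolerate sequencing errors in the adapter and discard reads that become too short after trimming; both are left out here to keep the sketch short.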

Page 9: CLC bio presentation at 5th SFAF 6/3/2010

De Novo Assembler

• Human assembly of 38x Illumina paired-end

• CLC quality equivalent to ABySS

• CLC: 7 hrs, 1 node, 42 GB of RAM

• ABySS: 80 hrs, 21 nodes, 336 GB of RAM

• Metagenomics Assembly

• MetaHIT dataset MH0041: 40M 75 bp paired-end reads

• 3 hrs on desktop, 6 GB of RAM

• Higher N50 and Total Contig Size than Reported
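For readers unfamiliar with the two metrics named on this slide, the following Python sketch shows how N50 and total contig size are commonly computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative assumptions, not values from the MetaHIT or human assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly size.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 80, 50, 30, 20, 10, 10]
total_contig_size = sum(example_lengths)
example_n50 = n50(example_lengths)
```

Because N50 depends on the total size, the two numbers are best reported together, as the slide does.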

Page 10: CLC bio presentation at 5th SFAF 6/3/2010

Viral Sequencing at JCVI (See Nadia Fedorova’s Poster!)

• Amplify and Barcode using SISPA, 454 + Illumina Sequencing

• Depth of coverage sometimes >1000x

• De novo Assembly of Consensus for all Segments

• For each segment:

• Map reads from each technology independently using best full length reference from NCBI, call variations

• Update reference with variations confirmed by multiple technologies

• Map reads using updated reference and all reads

• Convert to consed, analyze, order Sanger closure reactions

Source: Jessica Hostetler, Nadia Fedorova, Tim Stockwell, Danny Katzel
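The "update reference with variations confirmed by multiple technologies" step can be sketched in Python as follows. This is an illustration under stated assumptions, not the JCVI or CLC bio pipeline: variants are restricted to SNPs, positions are 0-based, a variant counts as confirmed only when every technology reports the identical call, and the function names and data layout are hypothetical.

```python
def confirmed_variants(calls_by_technology):
    """Keep only SNPs called identically by every technology.

    calls_by_technology maps a technology name to a dict of
    {0-based position: (ref_base, alt_base)}.
    """
    call_sets = [set(calls.items()) for calls in calls_by_technology.values()]
    if not call_sets:
        return {}
    return dict(set.intersection(*call_sets))


def update_reference(reference, variants):
    """Apply confirmed SNPs to the reference and return the new sequence."""
    bases = list(reference)
    for pos, (ref_base, alt_base) in variants.items():
        # Guard against calls made against a different reference.
        if bases[pos] != ref_base:
            raise ValueError(f"reference mismatch at position {pos}")
        bases[pos] = alt_base
    return "".join(bases)


# Hypothetical calls from two independent mappings of one segment.
reference = "ACGTACGT"
calls = {
    "454": {2: ("G", "A"), 5: ("C", "T")},
    "Illumina": {2: ("G", "A"), 6: ("G", "C")},
}
confirmed = confirmed_variants(calls)
updated = update_reference(reference, confirmed)
```

The updated sequence would then serve as the reference for the final mapping that uses all reads together.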

Page 11: CLC bio presentation at 5th SFAF 6/3/2010

Why CLC bio Tools?

• CLC handled hybrid sequencing technologies directly

• Very biased coverage confounded other assemblers that expect random arrival stats. CLC didn’t seem to suffer from biased coverage.

• Very accurate SNP calls in areas of deep coverage.

Tim Stockwell, Director of Viral Informatics, J. Craig Venter Institute

Page 12: CLC bio presentation at 5th SFAF 6/3/2010

Targeted Resequencing QC

•Assessment of targeted sequencing technology

•Coverage Statistics for Targeted Regions

•Very short schedule, limited bioinformatics staff

•Plug-in development leveraging CLC tools to automate the process and meet short deadline

•QC Report now available as plug-in
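As an illustration of what "coverage statistics for targeted regions" can mean in practice, here is a minimal Python sketch. It does not reproduce the CLC plug-in: the function name, the per-base depth list, the half-open 0-based intervals, and the `min_depth` threshold of 20 are all assumptions made for the example.

```python
def target_coverage_stats(depth, targets, min_depth=20):
    """Summarize per-base depth over targeted regions.

    depth   : list of per-base read depths along the reference
    targets : list of (start, end) half-open, 0-based intervals
    Returns one dict per target with the mean depth and the fraction
    of bases covered at >= min_depth.
    """
    stats = []
    for start, end in targets:
        region = depth[start:end]
        if not region:
            raise ValueError(f"empty target region {start}-{end}")
        covered = sum(d >= min_depth for d in region)
        stats.append({
            "start": start,
            "end": end,
            "mean_depth": sum(region) / len(region),
            "fraction_at_min_depth": covered / len(region),
        })
    return stats


# Hypothetical depths and target intervals, for illustration only.
example_depth = [0, 5, 10, 30, 40, 50, 25, 0, 0, 10]
example_targets = [(2, 6), (6, 10)]
report = target_coverage_stats(example_depth, example_targets)
```

A QC report would typically add the same summary for off-target bases, so that enrichment efficiency can be judged alongside on-target depth.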

Page 13: CLC bio presentation at 5th SFAF 6/3/2010

Professional Services

•Developing customized solutions

•Integration with LIMS, workflows, DB

•Bioinformatics Algorithm Development

•Cloud and Grid Integration

•Data Analysis

Page 14: CLC bio presentation at 5th SFAF 6/3/2010

Thank you for listening

Saul A. Kravitz, PhD

skravitz@clcbio.com

(301) 355-0813

Questions?