September 27-29, 2010 Rhode Island Convention Center ......Biochemistry and Molecular Biology,...

Second-Generation Applications to Third-Generation Progress

Cambridge Healthtech Institute’s Fourth Annual

Cambridge Healthtech Institute’s Third Annual

Evolution of Next-Generation Sequencing

Register by August 20th and Save up to $200

September 27-29, 2010

Rhode Island Convention Center

Providence, RI

Deciphering the Sequencing Data Deluge

Next-Generation Sequencing Data Management

Corporate Sponsors

Corporate Support Sponsors

Premier Sponsors

Pre-Conference Short Courses

Sunday, September 26 | 2:00 – 5:00 pm

Pre-Conference Short Course 1*

1:30 – 2:00 pm Short Course Registration

2:00 Opening Remarks

Stan Gloss, Founding Partner and Managing Director, BioTeam, Inc.

2:10 Next-Generation Sequencing Analysis and Beyond

Michele Clamp, Senior Consultant, BioTeam, Inc.

DNA sequencing technology is moving at a lightning pace. The last few years have seen the cost of sequencing drop by an order of magnitude, and the next two years seem likely to deliver another huge change. So much change in such a short period of time calls for a radical rethinking of how we store, mine and analyze data. This talk will include:

- Designing effective sequencing strategies

- Dealing with increasing amounts of sequencing data

- Efficient first pass analysis methods for read mapping and assembly

- Getting the most from RNAseq data

- Downstream annotation

2:45 Galaxy: Making NGS Analyses Accessible for All

Daniel Blankenberg, Ph.D., Postdoctoral Research Associate, Biochemistry & Molecular Biology, Pennsylvania State University

The recent rapid proliferation of DNA sequencing technology has enabled any investigator, for a modest cost, to produce enormous amounts of sequence data; however, working with this large-scale sequencing data poses significant challenges for even the largest institutions, let alone individual investigators and small labs. Here, we present Galaxy, an open-source analysis framework that is available as a free public service and can be effortlessly deployed on either private hardware or cloud resources. The Galaxy platform empowers transparent and reproducible research by providing interactive access to popular tools, including those that allow manipulation of raw sequencing reads, mapping, peak calling, genomic interval operations, visualization at genome browsers and more, as well as a point-and-click workflow system. Using Galaxy, a user without computational expertise can, for example, perform a complete ChIP-Seq analysis beginning with raw sequencing reads and continuing through visualizing called peaks at reference or custom-built genome browsers, all without leaving the familiar interface of a web browser.

3:20 Refreshment Break

3:55 Practical Bioinformatics for an Academic Next-Gen Sequencing Core Lab

Stuart Brown, Ph.D., Associate Professor, Center for Health Informatics and Bioinformatics, New York University School of Medicine

In an academic biomedical research institution, the next-gen sequencing core lab often requires extensive bioinformatics support. Many investigators who are interested in using next-gen sequencing technology in their research projects do not have the necessary bioinformatics skills in their laboratory to analyze and interpret the data. At NYU, our Genome Sequence Informatics group participates in experimental design, grant writing, data QC, primary data processing (basecalling, alignment to reference genomes, parsing and reformatting primary sequence data output files), and extensive downstream data analysis projects (ChIP-seq peak calling, variant discovery, differential gene expression, epigenomics, metagenomics, etc.). We also work closely with IT to develop data storage and delivery systems.

4:30 Closing Panel Discussion

5:00 pm End of Short Course

Sunday, September 26 | 2:00 – 5:00 pm

Pre-Conference Short Course 2* (sponsored)

Target Enrichment for NGS

Next-generation sequencing technology has increased the ability to sequence DNA in a massively parallel manner. Nevertheless, routine genetic screens in large cohorts of individuals remain cost-prohibitive through whole genome sequencing approaches. Agilent Technologies has developed the SureSelect platform, a portfolio of sample-preparation products that enable next-generation sequencing users to focus analysis on particular genomic loci with substantial cost savings.

2:00 Opening Remarks

2:10 Exome Capture and Sequencing in Identifying Mutations Responsible for Rare Recessive Disorders

Jacek Majewski, Ph.D., Assistant Professor, Canada Research Chair, Human Genetics, McGill University and Genome Quebec Innovation Centre

2:50 Laboratory Methods and Analysis Considerations for Efficient Analysis of the Human Exome

Shawn Levy, Ph.D., Faculty Investigator, HudsonAlpha Institute for Biotechnology

3:30 Refreshment Break

3:55 Detection of Inherited Mutations for Breast and Ovarian Cancer Using Genomic Capture and Massively Parallel Sequencing

Tom Walsh, Ph.D., Research Assistant Professor, Medical Genetics, University of Washington

4:35 Closing Panel Discussion

Tuesday, September 28 | 6:00 – 9:00 pm

Conference Short Course 3* (sponsored)

Cloud Computing

This course will cover real-world use cases across drug discovery and design, collaboration, next-generation sequencing, proteomics, software as a service, and bioinformatics. It will explore how the life sciences are using cloud computing, the challenges and effectiveness of the approach, how it can save an organization money, and regulatory compliance. Dinner will be served.

5:45 pm Buffet Dinner Selection

6:10 Opening Remarks: Life Sciences and Cloud

Jason Stowe, CEO, Cycle Computing

6:20 High-Throughput Genome Sequence Analysis on the Cloud

Toby Bloom, Ph.D., Director of Informatics, Genome Sequencing Center, Broad Institute

Next-gen sequencing technologies generate huge amounts of data. Analyzing that data requires a level of IT support not available in many small research facilities, and presents numerous challenges for collaborators needing to share that data across multiple research centers. We will discuss our efforts to address these challenges through use of cloud computing and our results thus far.

*Separate registration required

6:50 Molecular Modeling Research in the Cloud at Schrodinger

Peter S. Shenkin, Ph.D., Vice President, Schrodinger

Schrodinger is a leading researcher in computational molecular modeling. As such, we develop software that runs on the Cloud and we also use the Cloud for internal research, drug-discovery collaborations with commercial partners, and infrastructure needs. In this talk, we describe our experience with scientific computing on the Cloud and present the pros and cons of moving collaboration applications like e-mail, documents, and calendaring to the Cloud.

7:20 Cloud Computing at U. Penn ITMAT Bioinformatics Facility

Angel Pizarro, Director, University of Pennsylvania ITMAT Bioinformatics Facility

The cloud provides a valuable tool for doing bioinformatics research, and this presentation will discuss various operational issues in using Cloud Computing for aiding and abetting high-throughput biomedical research. The focus will be “what is the simplest way to get the job done”, using recent projects involving next-generation sequencing and proteomics. We will cover a specific example of bootstrapping a system to install, configure and start a simple Ruby-based Map-Reduce system, Cloud Crowd, as well as how to get the data to and from the cloud.

7:50 NGS Workflows on the Cloud: From PoC to Production

David Powers (formerly of Eli Lilly, Evangelist), Senior Analyst, Business Development, Cycle Computing

In this talk we will compare and contrast real-life workflows that have been moved to the cloud. We will dive into the technical implementation details to explore the workflows, how we deal with data synchronization, and how we leverage cloud services in AWS to process large data. Use cases will include secondary/tertiary analysis, data archival, and re-analysis. Comparisons between internal clusters for small/large sites and cloud-based methods will also be reviewed.

8:20 Using the Cloud for Interactive Data Analysis of Biopharma Market Data

Pieter Sheth-Voss, Ph.D., Senior Research Director, Quintiles Market Intelligence and Analytics

In this talk, we describe our experiences developing Provenance®, a cloud-based platform for interactive data visualization and modeling developed within Quintiles Market Intelligence and Analytics for analysis of complex healthcare market data. Specifically, we discuss the challenges we set out to address with the cloud. These include (a) handling a diversity of data models, (b) provisioning a high-speed database, (c) ensuring security, (d) minimizing latency, (e) planning for scalability, and (f) synchronizing data with external sources. We also discuss a few practical aspects within our subject domain that may be more broadly relevant. Our goal is to provide practical insights for researchers seeking to use the cloud for their own compute-intensive end user applications. On the commercial side of biotech/pharmaceutical development, extensive analysis of market data is often required to identify and characterize the market opportunities to help guide clinical development decisions, such as trial design and research priorities. Market intelligence may include both “primary” data collected directly from physicians, patients, payers and other constituents, as well as “secondary” data including patient insurance claims, chart audits, electronic health records, and prescription data.

8:50 Closing Remarks

Jason Stowe, CEO, Cycle Computing

9:00 pm End of Short Course

*Separate registration required

Next-Generation Sequencing Data Management

Monday, September 27, 2010

7:30 am Conference Registration and Morning Coffee

Plenary Keynote Session

8:30 Chairperson’s Remarks

8:40 African Genomes: Charting Human Diversity

Stephan Schuster, Ph.D., Professor, Biochemistry and Molecular Biology, Pennsylvania State University

As the cradle of mankind, Africa harbors much greater genetic diversity than human populations on other continents. Sequencing and analysis of African genomes will allow assessment of the breadth of human genetic diversity and benefit research into common and rare variant diseases.

9:20 Family Genome Sequencing as an Approach to Disease Genetics

David J. Galas, Ph.D., Professor and Senior Vice President, Institute for Systems Biology

Whole genome sequencing of families adds a new dimension to genetic analysis. The power of these data in increasing accuracy, in identifying rare variants and in evaluating genetic models for segregating traits is substantial. The complementarity of these new methods with population-based and linkage approaches signals new opportunities and challenges for computational methods and for integrating systems biology with genetic analysis. The benefit of these approaches will be substantial in, but not limited to, human genetics of disease and health.

10:00 Coffee Break

10:30 Finding a Needle in a Haystack

W. Richard McCombie, Ph.D., Professor, Cold Spring Harbor Laboratory

11:10 What Might Ubiquitous Sequencing Mean?

Keith Batchelder, Ph.D., Founder and CEO, Genomic Healthcare Strategies

• How will it be used in ways we expect? In ways we don’t expect?

-- Beyond cancer – in your local Walmart

-- Multiple times for single individuals

• Will money or data be more important?

• Where will sequencing take place?

• What will the challenges be for next-gen companies?

• What are some strategies that might be used by next-gen companies to be successful?

11:50 Close of Morning Session

12:00 Luncheon Presentation (sponsored)

Advances in Target Enrichment using Agilent’s SureSelect Platform

Emily Leproust, Ph.D., Director, Applications & Chemistry R&D, Agilent Technologies

Routine genetic screens in large cohorts of individuals remain cost-prohibitive through whole genome sequencing approaches. To this end, Agilent Technologies has developed the SureSelect platform, a portfolio of sample preparation products enabling users to focus analysis on particular genomic loci with substantial cost savings. We will demonstrate the flexibility and functionality of the SureSelect in-solution method through targeted sequence analysis of: (i) subsets of the human genome such as the exome and disease-focused designs, and (ii) custom content ranging in size, complexity, and chromosomal location. We discuss performance with respect to capture efficiency, uniformity, reproducibility of enrichment, and ability to detect SNPs.

Sequencing Tech Expo

What is your sequencing goal? Platforms are moving away from the one-size-fits-all approach and maturing into application-based methods. CHI’s Sequencing TechExpo showcases industry leaders as they present the technological attributes and bioresearch applications of their latest platforms.

2:00 Chairperson’s Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

2:00-5:00 Sponsored Seminars

Additional sponsorship opportunities available.

LIFE TECHNOLOGIES TECH EXPO - Speaker and Title to be announced

ILLUMINA TECH EXPO

Dramatic increases in throughput and yield continue to drive sample preparation development for both the Illumina Genome Analyzer and HiSeq platforms at the Broad Institute. Here, we report the following:

• Improvements to sample prep automation and QC.

• Automation of enriched library quantification, normalization, and denaturation to minimize cluster amplification fail rate.

• Optimization of the cluster amplification and sequencing processes to maximize data yield and quality.

• Our experience with the implementation of Illumina’s latest hardware, software, and kit releases, including cBot, the “95G” configuration for GAIIx, and the HiSeq platform.

As a result of the above improvements, while yield and throughput have increased, both error rates and rework rates have decreased. Moreover, efforts to automate upstream sample prep and to attain target cluster densities with high repeatability and reproducibility are applicable to both the GA and HiSeq platforms.

ROCHE TECH EXPO – A Discussion of the Technology and Importance of Massively Parallel Long-Read Sequencing in the Challenging Applications of Metagenomic and cDNA Studies

Clotilde Teiling, 454 Genome Sequencing, Roche Applied Science

The $6,000 Genome and Beyond

Michael Rhodes, Ph.D., Senior Manager Product Applications, Life Technologies

Life Technologies is the world’s leading sequencing company, offering a complete portfolio of solutions - from capillary instruments through to ultra high throughput sequencing with the SOLiD™ PI technology and SOLiD 4 hq upgrades available soon. Various sequencing technologies will be discussed, including the SOLiD™ system which currently generates up to 100 gigabases of mapped data in a single run. Some applications successfully carried out on the SOLiD system include: complete cancer genome re-sequencing, single cell transcriptome analysis, interrogation of methylation status and de novo sequencing. The latest applications will be presented - including results, data analysis and automation solutions.

5:00 Interactive Panel Discussion with Sequencing Leaders

Moderator: Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

As the genome unit price of NGS platforms continues to tumble, excitement is growing about the scientific and commercial potential of sequencing systems. This Interactive Panel Discussion continues the traditional “roast” by the moderator and panelists, while trading insights on the latest NGS scientific and technological advances. Full audience participation is warmly encouraged!

5:30 Welcoming Reception and Grand Opening of Exhibit Hall

7:00 Close of Day

Tuesday, September 28, 2010

7:30 am Breakfast Presentation

Data Intensive Discovery Initiative – Gene/Environment Interaction Analysis on Hybrid FPGA Data Intensive Supercomputers

Murali Ramanathan, Associate Professor of Pharmaceutical Sciences, University at Buffalo, SUNY

In the bio-pharma research domain, rapid advancement in next-gen sequencing instruments, imaging systems, simulations, and discovery-to-market processes is generating vast data sets that need to be analyzed effectively to create new knowledge. Identifying critical interactions between many variables is a key problem in many applications. Massively parallel FPGA database appliances parallelize both data storage and processing. They have been widely accepted for business/Internet search applications. Database appliances analyze massive data sets rapidly, delivering hundred-fold or even thousand-fold speedups over server-based architectures.

8:15 Successful Sequencing Discussion Groups

Grab a cup of coffee and join a facilitated discussion group focused around specific themes. This unique session allows conference participants to exchange ideas and experiences and to develop future collaborations around a focused topic.

Designing Data Infrastructure, Management and Storage

9:15 Chairperson’s Remarks

9:20 Challenges in Analysis and Management of Sequencing Data at Large Scale

Matthew Trunnell, Ph.D., Manager, Research Computing, The Broad Institute of MIT and Harvard

The Broad Institute’s adoption of second-generation sequencing technologies over the past three years has driven a 25-fold growth in the size of its data repositories, placing new pressures on its IT infrastructure, motivating different approaches to data analysis, and demanding a new level of attention to issues of data management. As second and third-generation sequencing technologies continue to drive down the cost of sequencing, IT and informatics expenses comprise an increasingly large fraction of total sequencing costs to the point where consideration of data life cycle management must explicitly weigh financial factors against scientific expectations. This talk will discuss The Broad Institute’s ongoing adjustment to data management at this new multi-petabyte scale and will also briefly highlight some new approaches to data analysis being actively explored.

9:50 Setting Up the Data Infrastructure of NGS Data

Samuel Fulcomer, Assistant Director, Center for Computation and Visualization, Brown University

Using Brown’s system as a case study, I will discuss strategies for designing the low-level storage and I/O systems for a parallel computing environment. After providing a short survey of the I/O behaviors of some typical informatics applications, I will then present a case for changing the demarcation of the application/system interface, and moving much more of data management into an active repository system better able to optimize storage and caching for application access patterns.

10:20 Improving Sequence Data Utility (sponsored)

Heather Koshinsky, Ph.D., CSO and Co-Founder, Eureka Genomics

What is most useful: longer reads, paired end reads, mate pair reads, or just more reads no matter what they are? The logic behind each type of sequence data will be explored, along with specific examples of how using the best type of data for a specific project goal can dramatically increase the success of the project.

10:35 Networking Coffee Break, Poster and Exhibit Viewing

Pipeline Management

11:15 Managing Change in a Next-Generation Sequencing Pipeline

Gregg TeHennepe, Senior Manager, Research Liaison, Information Technology, The Jackson Laboratory

There are multiple stages and dozens of components in a next-generation sequencing pipeline. Each component, from chemistries to image analysis to base calling algorithms, has regular updates that improve its quality and/or reliability. Ensuring that these changes are handled in a reliable way is critical to preventing failed runs and tens of thousands of dollars in lost chemicals, samples, and staff time.

11:45 Composite Priority Scores for Next-Generation Sequence Interpretive Analysis: Order from Disorder

James Lyons-Weiler, Ph.D., Scientific Director, Senior Research Scientist, Genomics and Proteomics Core Laboratories, University of Pittsburgh

This presentation will describe the development of novel information scores for establishing priority hits for follow-up validation at the base locus level. We have adopted two next-generation sequencing platforms; for whole genome sequencing, the error rates, while low, result in tens of thousands of potential false positives to screen through. Our novel score, Ambiguity, is an entropy-based score that appears to be super-critical for ruling out false positive leads.

12:15 pm Close of Session

12:30 Luncheon Presentation (sponsored)

From Deluge to Discovery: Achieving Greater Research Productivity with Powerful Yet Simple Storage Solutions

Chris Blessington, Senior Director of Marketing and Communications, Isilon

Next-generation sequencing is currently experiencing unprecedented growth of critical file-based research data, which traditional storage was simply not designed to manage. Though next-gen sequencing relies heavily on massive data stores, the IT staffs tasked with supporting this research are often small and without dedicated storage technicians, further complicating the problem. However, there is a solution. Scale-out storage accelerates discovery while simplifying management and reducing costs. Through real-world examples, including multiple other organizations presenting at NGSDM, attendees will learn best practices for leveraging scale-out storage in next-gen sequencing environments to generate significantly greater research productivity at much lower cost.

I’ve Bought My Sequencer – Now What the Heck Do I Do with the Data?

As never before, the sequencing data deluge has partnered platform manufacturers and software developers with NGS end-users. Join software developers and NGS users in a unique session that partners the user’s data deluge with a company’s software solution, leading to comprehensible biological results.

2:00 Chairperson’s Remarks

2:05 Partnering Company #1 (sponsored)

Deciphering Microbial Diversity in Complex Environments - A Case Study

Scot E. Dowd, Ph.D., Director, Research and Testing Laboratory, Director, Pathogenius Diagnostics

Determining the microbial diversity and composition of complex environments has been advanced greatly by next generation pyrosequencing approaches. There are several important tools for estimating microbial diversity. These often involve complex workflows of multiple alignments, distance matrices, and modeling. Here we provide an interesting approach to evaluating the diversity and evenness of environments using DNASTAR’s SeqMan NGen assembler as an example of the flexibility of this sequence assembly engine.

2:35 Partnering Company #3 (sponsored)

RNAseq as a Tool to Investigate the Genetic Risk for Alcoholism

Richard A. Radcliffe, Ph.D., Associate Professor of Pharmacology, Department of Pharmaceutical Sciences, Anschutz Medical Campus, University of Colorado Denver

Acute sensitivity to alcohol is thought to be an important genetically mediated risk factor for the development of alcoholism. Using RNAseq methods, gene expression and coding region differences were examined from the brains of mouse lines that were selectively bred for extreme differences in acute alcohol sensitivity. The results provide insights into the genes, pathways, and networks that contribute to individual differences in risk for developing alcoholism.

3:05 Partnering Company #2 (sponsored)

A Novel Web 2.0 Solution to Address the Sequencing Data Analysis Bottleneck

Andreas Sundquist, Ph.D., Co-Founder and CEO, DNAnexus

DNA sequencing output has completely outstripped the pace of Moore’s law, making a data analysis bottleneck inevitable. At the same time, sequencing costs have plummeted, bringing large-scale sequencing projects within the grasp of more institutions. New, easily accessible informatics solutions are needed to enable researchers to fully harness the advancements in DNA sequencing technology for practical use. DNAnexus will present an integrated, web-based sequence analysis platform powered by cloud computing that removes the data analysis bottleneck from next-generation sequencing.

3:20 Refreshment Break, Poster and Exhibit Viewing

4:15 Mapping Genomes in 3D

Erez Lieberman-Aiden, Ph.D., Fellow, Harvard Society of Fellows, Harvard University

This presentation describes Hi-C, a novel technology for probing the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We used Hi-C to construct spatial proximity maps of the human genome at a resolution of 1 Mb. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.

4:45 ChIP-seq: Data Analysis and Applications to Epigenetics

Peter Park, Ph.D., Assistant Professor, Pediatrics, Harvard Medical School

ChIP-seq combines chromatin immunoprecipitation (ChIP) with next-generation sequencing to identify protein-DNA interactions and chromatin modifications genome-wide. A number of practical issues in analysis of ChIP-seq data will be discussed, including experimental design, detection of binding sites, and determination of whether a sufficient depth of sequencing has been achieved. Application of ChIP-seq to the study of X-chromosome dosage compensation and stem cell differentiation will be described.

5:15 Technology Spotlight (sponsored)

Talk title and speaker to be announced

5:30 Close of Day and Short Course Registration

6:00-9:00 Dinner Short Course* SC3: Cloud Computing (sponsored)

See the Short Course 3 listing for details

*Separate registration required

Wednesday, September 29, 2010

7:30 am Breakfast Presentation (sponsored)

Next Gen Data Management for Next Gen Life Sciences

Will McGrath, Business Development Manager, Life Sciences, Quantum

The advent of next generation sequencers is contributing orders of magnitude more data to sift through, analyze, and share, increasing the complexity of genomic sequencing workflows. Added complexity means added risk, jeopardizing time to discovery. A tightly integrated, scalable high-performance computing platform with intelligent data management options could make all the difference. Learn how to deploy an end-to-end data management infrastructure that handles the most demanding Next Gen Sequencing workflows, so your scientists’ drive toward the next major medical breakthrough or discovery goes unimpeded.

Sequencing in the Clouds

8:15 Chairperson’s Remarks

8:20 Cloud Computing: A New Business Paradigm for Biomedical Information Sharing

Arnon Rosenthal, Ph.D., Principal Scientist, Cognitive Tools and Data Management, The MITRE Corporation

We aim to help biomedical informaticists decide whether, where, and how to employ cloud technology. Because the Cloud literature has become extensive, we emphasize considerations for biomedical laboratories, and especially data sharing consortia. Methodologically, we formulate our analyses in terms of the component technologies that typically constitute a cloud, bypassing the cacophony of competing definitions; this formulation also allows us to analyze alternatives (e.g., an institution’s data center, or grid computing) that employ some of the component technologies. Our comparisons are based on two primary criteria: flexibility in establishing or extending collaborations, and the protection accorded to shared data and an organization’s other systems. In view of the many studies showing cost savings (in some cases), we conclude that cloud technology is an attractive option for data sharing consortia because a) the cost of ownership is low, b) new members can easily be added to the consortium and c) security concerns are no worse than the alternatives.

8:50 Very Large Scale Metagenome Analysis with MG-RAST

Folker Meyer, Ph.D., Computational Biologist, Mathematics and Computer Science Division, Argonne National Laboratory

With metagenomic data sets growing both in size and abundance, resources for processing, analysis and comparison of metagenomic data sets are a critical component supporting many groups around the planet in adopting shotgun metagenomics. This talk will present MG-RAST, a platform used by hundreds of groups to upload, process and analyze over 170 gigabases of metagenomic data. A key aspect of the presentation will be data mining in metagenomic data sets and tools for the comparison of metagenomes.

9:20 Addressing the Challenges Faced in Providing Analysis Services for Next-Generation Sequencing Data

Chris Hemmerich, M.S., Biological Database Unit Leader, Center for Genomics and Bioinformatics, Indiana University

Analyses of next-generation sequencing data require bioinformatics expertise and computational resources beyond what a small biology lab may possess. Sequencing centers have these resources; however, customizing in-house pipelines to meet the requirements of individual biologists is exceptionally time consuming. We are addressing this problem through intuitive web tools that allow collaborating biologists to customize and run pipelines across distributed grid or cloud computing resources.

9:50 Practical Considerations for Sequencing Analysis and Bioinformatics in the Cloud (sponsored)

David Powers, Expert - Life Sciences (formerly of Eli Lilly), Cycle Computing

Jason A. Stowe, CEO, Cycle Computing

Next Generation Sequencers give scientists access to larger amounts of data at lower cost. With the $1000 genome around the corner, the cloud offers easy, inexpensive access to computing and storage resources for sequence alignment, SNP detection, and tertiary analysis or re-analysis pipelines. However, having access to processing power and scalable resources is only the first step to running analysis and bioinformatics on the cloud. This talk will discuss the practical considerations, including storage, required bandwidth, and people costs, for running compute nodes internally or in the cloud. Specific, first-hand use cases for running analysis and bioinformatics in various environments will be discussed, covering the benefits of different approaches. Based upon the experience presented, we’ll also discuss larger trends in this area.

10:05 Morning Coffee, Poster and Exhibit Viewing

10:45 Cloud BioLinux: Pre-Configured and On-Demand High Performance Computing for the Genomics Community
Konstantinos Krampis, Ph.D., Bioinformatics Engineer, J. Craig Venter Institute
Cloud BioLinux is a publicly available virtual machine that runs on cloud computing platforms, including Amazon EC2 and the open-source Eucalyptus cloud. During this talk, we will first give an overview of cloud platforms and how they can provide small labs with access to the high performance bioinformatics computing required to work with next-generation sequencing data. We will then demonstrate how users can access the bioinformatics tools included with Cloud BioLinux by starting virtual machines on Amazon's EC2 cloud computing platform, authenticating, and interacting with the virtual servers.

11:15 Speaker to be Announced

Transcriptome Analysis

11:45 Discovering Genes and Alternative Transcripts in Deep RNA Sequencing
Jean Thierry-Mieg, Ph.D., Director of Research, CNRS; Staff Scientist, NLM/NCBI, NIH
Danielle Thierry-Mieg, D.Sc., Research Fellow, CNRS; Staff Scientist, NLM/NCBI, NIH
Massively parallel sequencing of RNA samples allows us to characterize with great accuracy the structure of the expressed genes and to discover and quantify new genes and new alternative transcripts. Yet there are many pitfalls and sources of error. We will describe how the NCBI AceView program exploits this massive amount of short cDNA sequences, recursively determines optimal discontinuous alignments to the genome, discovers new exons, new introns and new genes, and identifies the SNPs.

12:30 Close of Morning Session

12:40 Luncheon Presentation (Sponsored)
State-of-the-Art in Whole- and Multi-Genome Analysis: A Discussion and Demonstration of Critical Requirements
Don Gregory, Ph.D., Director of the Field Application Scientist Group, GenomeQuest
Dr. Gregory will outline use cases for multi-genome analysis (MGA), including: Disease/Normal Study, Population Genetics, Pharmacogenomics, and Propensity. He will also explore science questions enabled by MGA, such as "How prevalent is this variation?" and "How can I prioritize the observed variations?" Lastly, he will review and demonstrate key requirements of MGA solutions, including: scalability to whole-genome reads, interactive querying of sequence comparison results, and integrated access to public datasets.

1:55 Chairperson’s Remarks

2:00 Analysis of Expression, Splicing and Gene Fusions in Human Prostate Tumors by Deep Sequencing
Serban Nacu, Ph.D., Postdoctoral Researcher, Genentech, Inc.
In a pilot study, we sequenced the transcriptomes of three prostate tumors and their matched normals, and performed a comprehensive analysis of expression, splicing, SNPs and mutations. We developed several computational tools and techniques, including an alignment program with the ability to directly detect gene fusions. The biological findings include a new class of differentially expressed transcripts and multiple novel fusions.

2:30 Sequencing of Human Cancer Transcriptomes to Discover Novel Gene Fusions
Christopher Maher, Ph.D., Research Investigator, Michigan Center for Translational Pathology, Center for Computational Medicine and Biology, Department of Pathology, University of Michigan
Characterization of specific genomic aberrations in cancers has led to the identification of several successful therapeutic targets. Therefore, we have employed next-generation transcriptome sequencing to elucidate putative "driver" gene fusions that may be hidden by non-specific aberrations. This talk will focus on our bioinformatics approach, the identification and characterization of novel gene fusions, and the implications of transcriptome sequencing for improved cancer therapeutics.

3:00 Exploring Bacterial Transcriptomes in the Age of High Throughput Sequencing
Jonathan Livny, Ph.D., Research Scientist, Broad Institute of MIT and Harvard; Instructor, Brigham and Women's Hospital/Harvard Medical School
HTS-based bacterial transcriptomic approaches present significant technical and analytical challenges that have limited their utilization and hindered the extraction of biological insights from the large and complex datasets they produce. To address these challenges, we are developing improved protocols for constructing and sequencing bacterial cDNA libraries and more effective and accessible computational tools and infrastructures for visualizing and analyzing HTS transcriptomics data. I will present a summary of these improved experimental and analytical approaches as well as some examples of how HTS transcriptomics is being utilized to explore various aspects of bacterial physiology and evolution.

3:30 Networking Refreshment Break, Exhibit & Poster Viewing

1000 Genomes Project
Shared session with Evolution of Next-Generation Sequencing Conference

4:00 Poster Award (Sponsored)

4:00 Detecting Rare Genetic Variants in the Large-Scale 1000 Genomes Exome Resequencing Project
Fuli Yu, Ph.D., Assistant Professor, Human Genome Sequencing Center, Baylor College of Medicine
The 1000 Genomes Pilot 3 Project aims to generate high coverage data primarily in the coding regions of approximately 1,000 selected genes from ~900 individuals. From this sequencing program we expect to identify essentially all variants present in the targeted exons, using exome capture technologies combined with different next-generation sequencing platforms. The key challenge in SNP discovery is to distinguish true individual variants from sequencing errors. We have developed Atlas-SNP2 at BCM-HGSC, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets.

4:30 Impact of the 1000 Genomes Project on the Next Wave of Pharmacogenomic Discovery
M. Eileen Dolan, Ph.D., Professor, Medicine, University of Chicago
The 1000 Genomes Project aims to provide detailed genetic variation data on >1000 genomes from worldwide populations using next-generation sequencing technologies. Some of the samples utilized for the 1000 Genomes Project are the International HapMap samples, which are composed of lymphoblastoid cell lines (LCLs) derived from individuals of different world populations. The detailed map of human genetic variation promised by the 1000 Genomes Project will allow a more in-depth analysis of the contribution of genetic variation to drug response. Future studies utilizing this new resource can greatly enhance our understanding of the genetic basis of drug response and other complex traits.

5:00 Integrated Analysis of Human Resequencing Data from Multiple Sequencing Platforms
David Craig, Ph.D., Associate Director, Neurogenomics; Investigator, Neurobehavioral Research Unit, TGen
I will present integrated analysis pipelines and software tools for analyzing next-generation sequencing data using multiple sequencing technologies. Specific focus will be on integrating SOLiD and Illumina datatypes, leveraging complementary strengths of both platforms. We will present whole-genome sequence analysis both in the context of 1000 Genomes and within our own whole-genome sequencing studies.

5:30 Close of Conference

Sponsorship & Exhibit Information
Sponsorship Opportunities Include:

Sponsored Presentation
Present your scientific research and solutions for 15 or 30 minutes as part of the conference program, ensuring that your audience is seated and ready to listen.

Luncheon Presentation
Invite session delegates to enjoy lunch on your company's behalf while you give a 30-minute presentation. Your workshop is concluded with 15 minutes of Q&A, allowing you to interact with your customer base.

Invitation Only Networking Receptions
CHI will invite all delegates from a specific conference program to a private reception at the host hotel. Cocktails and hors d'oeuvres will be served in a setting conducive to networking. These receptions are available on a first-come, first-served basis. Presentation opportunities are limited, so reserve your talk today to ensure participation!

Exhibitor Information
The exhibit hall presents an excellent opportunity to network with prominent scientists and executives who attend the event to learn about cutting-edge research and technologies in their field. Exhibiting will allow your company to meet hard-to-reach prospects face to face, and pave the way for future sales. Exhibit space fills up quickly, so reserve yours today!

Other Promotional Opportunities
• Conference Tote Bags
• Corporate Branding Packages
• Tote Bag Inserts
• Refreshment Breaks
• And more!

We can customize any opportunity to meet your current marketing objectives and budget. To find out more about our comprehensive sponsorship and exhibit packages, please contact:

Stacey Squatrito
Manager, Business [email protected]

Hotel & Travel Information
Conference Venue:
Rhode Island Convention Center
One Sabin Street
Providence, RI 02903

Host Hotel:
The Westin Providence
One West Exchange Street
Providence, RI 02903
Phone: 401-598-8000
Fax: 401-598-8200

Discounted Room Rate: $169 s/d
Discounted Cut-off Date: August 31, 2010

Please visit our website to make your reservations online or call the hotel directly to reserve your sleeping accommodations. Identify yourself as a Cambridge Healthtech Institute conference attendee to receive the reduced room rate. Reservations made after the cut-off date or after the group room block has been filled (whichever comes first) will be accepted on a space-and-rate-availability basis. Rooms are limited, so please book early.

Airline & car rental discounts have been set up with American Airlines and Hertz, so please visit our website for details and links to those sites.

Flight Discounts:
To receive a 5% or greater discount on all American Airlines flights, please use one of the following methods:

• Call 1-800-433-1790. Use Conference code A5490AI

• Go online at www.aa.com. Enter Conference code A5490AI in promotion discount box

• Contact Wendy Levine, Great International Travel, at 1-800-336-5248 ext. 137

Car Rental Discounts:
Special discount rentals have been established with Hertz for this conference. Please use one of the following methods:

• Call HERTZ, 800-654-3131. Use our Hertz Convention Number (CV): 04KL0001

• Go online at www.hertz.com. Use our Hertz Convention Number (CV): 04KL0001

SUNDAY, SEPTEMBER 26, 2010
1:30 pm Short Course Registration

2:00 - 5:00 Recommended Pre-Conference Short Course*

SC2: Target Enrichment for NGS
See page 2 for details

*Separate registration required

MONDAY, SEPTEMBER 27, 2010
Sessions shared with Next-Generation Sequencing Data Management Conference
See pages 2 & 3 for agenda

TUESDAY, SEPTEMBER 28, 2010
7:30 am Breakfast Presentation (Sponsored)
NanoString's nCounter Analysis System: High-Throughput NGS Data Validation for mRNA, microRNA, and Genomic Copy Number Variation Studies
Sean Ferree, Ph.D., Director of Product Development, NanoString Technologies, Inc.
The nCounter Analysis System is a rapid and cost-effective method for direct, digital, highly multiplexed quantification of hundreds of different nucleic acid species in a single, amplification-free assay with sensitivity comparable to qPCR. The technology is the perfect companion for validation of whole genome deep sequencing studies. The assays are compatible with a wide variety of clinical sample types, including purified mRNA, miRNA or genomic DNA, crude cell and tissue lysates, blood collections, and FFPE-extracted material.

8:15 Successful Sequencing Discussion Groups

Grab a cup of coffee and join a facilitated discussion group focused around specific themes. This unique session allows conference participants to exchange ideas and experiences and to develop future collaborations around a focused topic.

Evolving Sequencing Methods to Enable Genomic Research

9:15 Chairperson's Remarks

9:20 Third Generation Sequencing Pipeline for Cancer
Raphael Bueno, Ph.D., The Thoracic Surgery Oncology Laboratory and Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School
This presentation will discuss the use of next-generation sequencing data for both transcriptome and genome elucidation of solid tumors. It will also review and compare multiple platforms for deep sequencing of tumor genomes and the identification of multiple types of mutations.

9:50 Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome
Job Dekker, Ph.D., Associate Professor, Gene Function and Expression, University of Massachusetts Medical School
To probe the spatial arrangement of genomes we developed Hi-C, a method that combines 3C and high-throughput sequencing to map three-dimensional chromatin interactions in an unbiased, genome-wide fashion. Application of Hi-C to the human genome revealed novel folding principles. For instance, open and closed chromatin are spatially segregated, forming two genome-wide compartments. The contents of the compartments are dynamic: changes in chromatin state and/or expression correlate with movement from one compartment to the other.

10:20 Closing the Gap in Time from Raw Data to Real Science - Science as a Service (ScaaS) (Sponsored)
Justin H. Johnson, Director of Bioinformatics, Edge Bio
Next-generation sequencing has drastically changed the traditional infrastructure within the sequencing community. There are several technologies that show promise, but it is not always intuitive where to start. This uncertainty is compounded by the fact that commonly used bioinformatics tools are difficult to build and maintain and require vast amounts of compute resources. Many solutions and platforms, such as cloud computing, promise to address one or a small number of these challenges, but their own inherent challenges are rarely discussed. Edge Bio will present information, research and case studies on how they facilitate Science as a Service (ScaaS) to the community through a technology-agnostic approach, taking each individual project from conception and design through informatics and analysis.

10:35 Networking Coffee Break, Poster and Exhibit Viewing

11:15 Generation Microarray and Sequencing Technologies for Genome-Wide Dosage Assays in Yeast and Man
Corey Nislow, Ph.D., Assistant Professor, Donnelly Centre for Cellular and Biomolecular Research, University of Toronto
We use comprehensive collections of cells in which the gene dosage is systematically altered. We have utilized the Yeast Knock Out collection (YKO) and combined it with a collection of yeast overexpressing each gene to screen thousands of bioactive small molecules genome-wide. I will present several compelling drug-target interactions that have been uncovered using a combination of these gene-dose screens, as well as provide an overview of our efforts to develop novel microarray and next-generation sequencing technologies to accelerate the tempo of these chemogenomic assays.

11:45 Bind-n-Seq: High-Throughput Analysis of in vitro Protein-DNA Interactions Using Massively Parallel Sequencing
Artem Zykovich, Ph.D. Candidate, Genome Center/Pharmacology, University of California, Davis
Here we introduce Bind-n-Seq, a new high-throughput method for analyzing protein-DNA interactions in vitro, with several advantages over current methods. The procedure has three steps: (1) binding proteins to randomized oligonucleotide DNA targets, (2) sequencing the bound oligonucleotide with massively parallel technology, and (3) finding motifs among the sequences. These results present Bind-n-Seq as a highly rapid and parallel method for determining in vitro binding sites and relative affinities.

12:15 pm Close of Session

12:30 Luncheon Presentation (Sponsored)

Enabling Efficient Next Generation Sequencing Based Research with a Versatile LIMS
Michael Kuzyk, Ph.D., Product Manager, Omics Labs
An in-depth look at UW – Northwest Genome Center and USC – Epigenome Center, both of which considered building their own lab information management system and then chose to purchase a commercial solution instead. With very tight timelines, building an in-house system was not an option for UW. They required exceptional sample traceability while dealing with evolving protocols. USC was challenged with managing the high volume of sample data being generated by numerous projects; they needed a centralized system to manage their genomics/next-gen lab efficiently. They also required a system that was versatile enough to integrate seamlessly with their complex custom data analysis pipeline.

Efficient Data Analysis

Service Providers Share their Tips ‘N Tricks

An outsource provider allows immediate access to sequencing technologies that have taken years of resources and expertise to develop. The decision to use an outsource provider should be based upon factors such as speed of service, increased efficiency, accuracy, quality, reliability, reproducibility, and convenience. In this session, service providers share their tips 'n tricks that enable their successful sequencing service projects.

2:00 Chairperson’s Remarks

Evolution of Next-Generation Sequencing

2:05 Service Provider #1: SPRIworks - An Automated Sample Preparation System For All Second Generation Sequencing Platforms
William Donahue, Ph.D., Manager, Molecular Biology, Beckman Coulter Genomics
The emergence of second generation sequencing technologies has enabled scientists to generate vastly increased data sets from their sequencing experiments. The dramatic increase in throughput, however, is supported by workflows that are relatively complicated and labor intensive. The SPRIworks sample preparation systems I, II and III enable automated sample preparation for each platform. Data will be shown highlighting the capability of SPRIworks for sample multiplexing along with its applicability to other workflows such as RNA-Seq and ChIP-Seq.

2:35 Service Provider #2: Complete Human Genome Sequencing for Large-Scale Disease Studies
Steve Lincoln, Vice President, Scientific Applications, Complete Genomics
Complete Genomics' sequencing platform is a combination of technology advancements in DNA library preparation, nanoarrays, sequencing assay chemistry, instruments and software. This affordable, turnkey sequencing service provides human disease researchers with the ability to conduct comprehensive genetic studies of human diseases, and allows complete human genome sequence data to inform new methods of disease prevention and treatment.

3:05 Service Provider #3: Covering your Bases: Cost Effectiveness of Targeted Re-sequencing with 2nd Generation Sequencers
Victor Weigman, Ph.D., Computational Biologist, Expression Analysis
While the availability of deep-sequencing platforms is growing in step with cost reduction, full genomic analysis is still cost-prohibitive. Expression Analysis is approaching this by offering varied genomic re-sequencing platforms along with multiplexing to allow for high resolution sequencing of your regions of interest for a fraction of the cost. We will discuss our experiences in optimizing sample barcoding along with an evaluation of various genome capture technologies. Secondary to these issues is determining the filtering for variant calling that leads to the highest true-positive rate. We'll discuss strategies for variant filtering along with prioritizing candidates for association follow-ups.

3:35 Coffee Break, Poster and Exhibit Viewing

NGS Expands Genomic Horizons from Prokaryotes to Eukaryotes

4:15 The Human Microbiome Project: Next-Generation Sequencing and Analysis of the Indigenous Microbiota
Vincent B. Young, M.D., Ph.D., Associate Professor, Microbiology and Immunology, University of Michigan
The goals of the NIH-initiated Human Microbiome Project are to generate resources enabling comprehensive characterization of the human microbiota and analysis of its role in human health and disease. The diversity of the indigenous microbiota and the shortcomings of traditional culture-based microbiologic methods have led to extensive use of molecular methods to characterize the extent, dynamics and function of the microbiome. We will use a Demonstration Project investigating the role of the gastrointestinal microbiota in the pathogenesis of ulcerative colitis to illustrate the use of next-generation sequencing in the Human Microbiome Project.

4:45 Expanding Frontiers in Viral Genomics through the Application of Ultra-Deep Sequencing Technologies
Matthew R. Henn, Ph.D., Director, Viral Genomics, The Broad Institute of MIT and Harvard
Viral diseases such as HIV have an enormous impact on human health worldwide, while phage play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. The development of high-throughput sequencing and annotation strategies is transforming the ability to study at high resolution the diversity landscape of viruses and their impact on disease and ecosystem function. Here we describe the application of high-throughput sequencing, assembly and population profiling strategies based on 454 and Illumina technologies that are tuned to the specific needs of viral and phage sequencing.

5:15 Lymphocyte Monitoring by High-Throughput DNA Sequencing
Scott Boyd, M.D., Ph.D., Assistant Professor, Pathology, Stanford University
The repertoire of immune receptors generated by genomic DNA rearrangements in B and T cells enables recognition of diverse threats to the host organism. Deep sequencing of immune receptor loci can provide direct detection and tracking of immune diversity and expanded clonal lymphocyte populations. We have applied this approach to monitor lymphoid malignancies following therapy, and to study physiological immune responses to vaccination, as well as a variety of immune-mediated non-malignant disorders.

5:45 Close of Day and Pre-Conference and Short Course Registration

6:00-9:00 Dinner Short Course*

SC3: Cloud Computing
See page 2 for details
*Separate Registration Required

WEDNESDAY, SEPTEMBER 29, 2010
7:30 am Breakfast Presentation (Sponsored)
Enabling World Class Research through Innovative Data Management Strategies
Mark Fahey, Scale Out NAS & Back Up Recovery and Archive Sales Specialist, Hewlett Packard

Targeted Sample Enrichment

8:15 Chairperson's Remarks

8:20 Targeted Analysis of DNA Methylation Using Hybrid Capture of Bisulfite Converted DNA
James B. Hicks, Ph.D., Research Professor, Genetics, Cold Spring Harbor Laboratory
Targeted sequencing of bisulfite treated DNA by hybrid capture requires that the array be capable of trapping the converted forms of both methylated and unmethylated sequences. Unmethylated regions will have all C residues converted to T after sequencing, while other regions will have an unknown number of C residues in CpG dinucleotides protected from bisulfite conversion by cytosine methylation. We have published an array capture strategy for detecting DNA methylation at the base pair level using next generation sequencing. We will describe strategies for extending and optimizing the capture of both methylated and unmethylated sequences and the mapping of the resulting information-depleted sequence output.
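The conversion rule described in this abstract can be illustrated with a minimal Python sketch. It is not the speaker's capture design; the function names and the example sequence are hypothetical, and only the forward strand is modeled. It shows why a capture array has to cover more than one converted form of the same region.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence expected after bisulfite treatment and sequencing.

    Unmethylated C reads as T; a methylated C (0-based positions supplied
    by the caller) is protected and still reads as C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def extreme_capture_targets(seq):
    """Return the two extreme converted forms of a region:
    fully unmethylated, and methylated at every CpG cytosine."""
    seq = seq.upper()
    cpg_cytosines = {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}
    return bisulfite_convert(seq), bisulfite_convert(seq, cpg_cytosines)


unmethylated, cpg_methylated = extreme_capture_targets("ACGTCCGA")
print(unmethylated)    # every C converted
print(cpg_methylated)  # CpG cytosines protected
```

Real regions fall anywhere between these two extremes, which is the "unknown number of C residues" the abstract refers to.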

8:50 Optimization of Sample Preparation for Next-Generation Sequencing
Niall Lennon, Ph.D., Assistant Director, Process Development, The Broad Institute of MIT and Harvard
Long reads and short sequencing run times make the 454 platform a powerful tool for de novo assembly of small genomes, metagenomic profiling and amplicon sequencing. The challenge is that these applications require a small number of reads from large numbers of samples. Overcoming this challenge requires a combination of automation, molecular barcoding and process improvement. Through application of these changes, it is possible to process hundreds of samples in a time- and cost-effective manner.

9:20 Extending Read Lengths and Accuracy of Next-Generation Sequencing by Molecular Tagging and Subassembly
Joseph Hiatt, M.D., Ph.D. Candidate, Genome Sciences, University of Washington
Major advances in cost and throughput associated with next-generation DNA sequencing are currently offset by significant trade-offs with respect to read length and base-calling accuracy. We have developed a method called subassembly that overcomes these limitations by using unique sequence tags to identify individual sample molecules at the beginning of library construction. These tags then direct grouping of sequencing reads and thereby facilitate accurate reconstruction of a consensus sequence for each sample molecule. Using this approach, we have used the Illumina platform to yield highly accurate (Q40) subassembled reads with effective lengths as long as 700 bp.
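The tag-directed grouping and consensus idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the subassembly method itself: it assumes all reads sharing a tag already start at the same position, so a per-position majority vote stands in for the real assembly of overlapping reads. The function names and example data are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_tag(tagged_reads):
    """Group (tag, read) pairs so that reads from one sample molecule sit together."""
    groups = defaultdict(list)
    for tag, read in tagged_reads:
        groups[tag].append(read)
    return dict(groups)


def majority_consensus(reads):
    """Per-position majority vote over reads assumed to share a start position.

    The consensus is truncated to the shortest read in the group.
    """
    length = min(len(read) for read in reads)
    return "".join(
        Counter(read[i] for read in reads).most_common(1)[0][0]
        for i in range(length)
    )


reads = [("AAT", "ACGT"), ("AAT", "ACGA"), ("AAT", "ACGT"), ("GGC", "TTTT")]
for tag, group in group_by_tag(reads).items():
    print(tag, majority_consensus(group))
```

The single miscalled base in the second read is outvoted by the other two reads carrying the same tag, which is the intuition behind the accuracy gain the abstract describes.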

9:50 LabChip XT - Advanced Nucleic Acid Fractionation for Next-Generation Sequencing Sample Preparation (Sponsored)
Isaac Meek, Associate Director of Marketing, Caliper Life Sciences
Reproducible and scalable sizing and isolation has become a bottleneck for many applications of new sequencing technologies. Caliper Life Sciences has developed and commercialized instruments that utilize microfluidics to achieve rapid and high resolution electrophoretic separations. We have now developed a commercial solution, the LabChip XT, that will simplify and improve nucleic acid fractionation. By using intersecting microfluidic channels, optical detection and computer control, we can automatically extract a target band during separation and route the selected material to a collection well. The presented material will describe the fundamentals of our microfluidics-based solution.

10:05 Morning Coffee, Poster and Exhibit Viewing

10:45 Speaker to be Announced

11:15 MBD-Isolated Genome Sequencing Provides a High-Throughput and Comprehensive Survey of DNA Methylation in the Human Genome
David Serre, Ph.D., Assistant Staff, Genomic Medicine Institute, Cleveland Clinic
We describe a novel technique, MBD-isolated Genome Sequencing, which combines precipitation of methylated DNA by the recombinant methyl-CpG binding domain of MBD2 protein and massively parallel sequencing of the isolated DNA. We utilized this approach to study three isogenic cancer cell lines with varying degrees of DNA methylation. We successfully detected previously known methylated regions in these cells and identified hundreds of novel methylated regions. This technique can be applied to any biological setting to identify differentially methylated regions at the genomic scale.

11:45 De novo Variant Detection with Whole Exome Capture
Stephan Sanders, Ph.D., Postdoctoral Associate, Yale University
The number of potentially disease-causing variants discovered by whole exome sequencing can be overwhelming. To identify variants with a high likelihood of being deleterious, we sequenced probands along with their parents to detect de novo events. While this strategy drastically reduces the number of variants to consider, detecting de novo events with next generation sequencing poses additional challenges: first, the need to improve specificity due to the rarity of de novo events (approximately two per exome), and second, the need to ensure high quality sequence in every family member.
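A trio filter of the kind this abstract alludes to can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's pipeline: genotypes are written as unphased "0/0", "0/1" or "1/1" strings, the record layout is hypothetical, and the depth threshold of 20 is an arbitrary placeholder for the "high quality sequence in every family member" requirement.

```python
def is_de_novo_candidate(child, mother, father, min_depth=20):
    """Flag a site where the child carries an alternate allele that neither parent shows.

    Each argument is a dict with 'gt' (e.g. "0/1") and 'depth' (read depth).
    The site is rejected unless every family member reaches min_depth,
    because low coverage in a parent can hide an inherited allele.
    """
    family = (child, mother, father)
    if any(member["depth"] < min_depth for member in family):
        return False
    child_has_alt = "1" in child["gt"].split("/")
    parents_are_ref = mother["gt"] == "0/0" and father["gt"] == "0/0"
    return child_has_alt and parents_are_ref


site = {
    "child": {"gt": "0/1", "depth": 42},
    "mother": {"gt": "0/0", "depth": 35},
    "father": {"gt": "0/0", "depth": 12},  # too shallow to trust
}
print(is_de_novo_candidate(site["child"], site["mother"], site["father"]))
```

With roughly two true events expected per exome, a strict requirement on all three family members trades some sensitivity for the specificity the abstract calls for.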

12:15 pm Close of Session

12:25 Luncheon Presentation (Sponsored)
Making Whole Human Genome Sequencing Routine
Tim Harkins, Ph.D., Director R&D, SOLiD Collaborations
As sequencing technologies increase their throughput, it is now possible to readily sequence a whole human genome. As we sequence more human genomes we are gaining not only new insights into human biology but also an understanding of how critical certain technology features are, beyond just sheer data mass. One critical factor is sequencing accuracy. Although using consensus accuracy can overcome some inherent error rates, a floor is ultimately reached, thereby limiting the ability to identify either low frequency events like somatic mutations or SNVs when using low coverage techniques to support GWAS-based projects. We have developed a new chemistry that incorporates error-correcting codes that enable the SOLiD platform to achieve up to five 9's within a single sequencing read. This improvement will be presented applied to a diverse set of whole human genomes.

NGS Further Elucidates the Genomic Basis of Disease

1:55 Chairperson's Remarks

2:00 Multiple Approaches for Transcriptome Analysis of a Novel 3D Breast Carcinoma Model
Raymond R. Mattingly, Ph.D., Associate Professor, Pharmacology, Wayne State University
We have applied and cross-validated whole genome microarray, digital gene expression (DGE), and RNA-Seq analyses to explore the networks and pathways that underlie ductal carcinoma in situ (DCIS) of the breast. The analyses have been applied to a novel and tractable model of DCIS in 3D overlay culture.

2:30 Massively Parallel DNA Sequencing to Characterize Cancers of the Head and Neck
David I. Smith, Ph.D., Consultant and Professor, Laboratory Medicine and Pathology, Mayo Clinic
We have utilized massively parallel ligation-based DNA sequencing to characterize a group of head and neck cancers. We used a methodology to examine the complete transcriptome of a group of matched normal-tumor samples that also preserved the strandedness of each transcript. We also performed mate-pair sequencing of the DNA isolated from these sample normal-tumor pairs. The transcriptome sequencing enabled us to characterize genes with aberrant expression during the development of head and neck cancer. We found that there were allelic expression imbalances associated with copy number alterations.

3:00 Exome Sequencing of Multigenerational Mendelian Families
Stephan Züchner, M.D., Associate Professor, Human Genetics and Neurology; Director, Center for Human Molecular Genomics, University of Miami Miller School of Medicine
Exome sequencing is an applicable genomic tool that allows the majority of coding variation in a given individual to be obtained with reasonable effort. We have begun to apply the technique to relatively small, dominant, and multigenerational Mendelian families with axonopathies in order to identify novel causative genes. Results and challenges of this ongoing work will be presented.

3:30 Refreshment Break, Poster and Exhibit Viewing

1000 Genomes Project

4:00 - 5:00 Session shared with Next-Generation Sequencing Data Management Conference
See page 4 for agenda

5:30 Close of Conference

Cambridge Healthtech Institute encourages attendees to gain further exposure by presenting their work in the poster sessions.

To secure a poster board and inclusion in the conference materials, your abstract must be submitted and approved, and your registration paid in full, by September 8, 2010. Register online, or by phone, fax or mail. Indicate that you would like to present a poster and you will receive abstract submission instructions via email.

I am interested in presenting a poster at:
o Evolution of Next-Generation Sequencing
o Next-Generation Sequencing Data Management
Title

Present a Poster and Save $50!

HOW TO REGISTER:
Online: www.healthtech.com
Email: [email protected]
Phone: 781-972-5400
Fax: 781-972-5425

Additional Registration Details

Each registration includes all conference sessions, posters and exhibits, food functions, and a link to the conference proceedings.

Handicapped Equal Access
In accordance with the ADA, Cambridge Healthtech Institute is pleased to arrange special accommodations for attendees with special needs. All requests for such assistance must be submitted in writing to CHI at least 30 days prior to the start of the meeting.

Substitution/Cancellation Policy
In the event that you need to cancel a registration, you may:
• Transfer your registration to a colleague within your organization.
• Credit your registration to another Cambridge Healthtech Institute program.
• Request a refund minus a $100 processing fee per conference.
• Request a refund minus the cost ($750) of ordering a copy of the CD.

NOTE: Cancellations will only be accepted up to two weeks prior to the conference. Program and speakers are subject to change.

Video and/or audio recording of any kind is prohibited onsite at all CHI events.

CHI Insight Pharma Reports
A series of diverse reports designed to keep life science professionals informed of the salient trends in pharmaceutical technology, business, clinical development, and therapeutic disease markets. For a detailed list of reports, visit InsightPharmaReports.com, or contact Rose LaRaia at [email protected], 781-972-5444.

Barnett Educational Services
Barnett is a recognized leader in clinical education, training, and reference guides for life science professionals involved in the drug development process. For more information, visit www.barnettinternational.com.

Yes! I would like to receive a FREE eNewsletter subscription to:
o The latest industry news, commentary and highlights from Bio•IT World
o Innovative management in clinical trials

chimediagroup.com

Group Discounts

Special rates are available for multiple attendees from the same organization. Contact David Cunningham at 781-972-5472 to discuss your options and take advantage of the savings.

REGISTRATION INFORMATION
o Mr. o Ms. o Mrs. o Dr. o Prof.

Name ____________________________________________________________________________________________________
Job Title ________________________________________________ Div./Dept. ________________________________________
Company _________________________________________________________________________________________________
Address __________________________________________________________________________________________________
City/State/Postal Code ____________________________________ Country __________________________________________
Telephone ________________________________________________________________________________________________

How would you prefer to receive notices from CHI? Email: o Yes o No Fax: o Yes o No
Email* _____________________________________________________________ Fax ___________________________________

* Email is not a mandatory field. However, if you omit your email address, you will not receive notification about online access to pre-conference presenter materials, conference updates, networking opportunities and requested eNewsletters.

CONFERENCE PRICING

Please select which conference you will most likely attend:
o Evolution of Next-Generation Sequencing (September 27 – 29)
o Next-Generation Sequencing Data Management (September 27 – 29)

                                                       Commercial    Academic, Government, Hospital-affiliated
Advance Registration Discount until August 20, 2010    o $1995       o $875
Registrations after August 20, 2010 and on-site        o $2195       o $945

SHORT COURSES            Commercial    Academic, Government, Hospital-affiliated
Single Short Course      o $695        o $395
Two Short Courses        o $995        o $695

Please make your short course selection:
o NGS Data (Sept. 26)
o SC2 Target Enrichment (Sept. 26)
o SC3 Cloud Computing (Sept. 28)

POSTER DISCOUNT o $50 off

o I cannot attend but would like to purchase the combined conference CD for $750 (plus shipping). Massachusetts delivery will include sales tax.

o Please send information on exhibiting and opportunities to present workshops.

PAYMENT INFORMATION

o Enclosed is a check or money order payable to Cambridge Healthtech Institute, drawn on a U.S. bank, in U.S. currency.

o Invoice me, but reserve my space with credit card information listed below.

Invoices unpaid two weeks prior to conference will be billed to credit card at full registration rate. Invoices must be paid in full and checks received by the deadline date to retain registration discount. If you plan to register on site, please check with CHI beforehand for space availability.

o Please charge: o AMEX (15 digits) o Visa (13-16 digits) o MasterCard (16 digits)

Card # ___________________________________________________ Exp Date ________________________________________
Cardholder ________________________________________________________________________________________________
Signature _________________________________________________________________________________________________
Cardholder’s Address (if different from above) _________________________________________________________________
City/State/Postal Code ______________________________________________________________________________________
Country __________________________________________________________________________________________________

Cambridge Healthtech Institute 250 First Avenue, Suite 300, Needham, Massachusetts 02494

T: 781-972-5400 or toll-free in the U.S. 888-999-6288
F: 781-972-5425 • www.healthtech.com

Key Code 109100F

September 27-29, 2010 • Rhode Island Convention Center • Providence, RI

Evolution of Next-Generation Sequencing & Next-Generation Sequencing Data Management
