Post on 30-Jun-2020
Machine Learning Applied to Single-Molecule Electronic DNA Mapping for Structural Variant Verification in Human Genomes
Bready, B.; Davis, J.; Grinberg, B.; Kaiser, M.; Oliver, J.; Sage, J.; Seward, L. Nabsys 2.0 LLC, Providence, RI 02903
Using SV-Verify, thousands of hypotheses can be queried in a single analysis. In the human genome NA24385, a set of 9,000 putative deletion calls ≥300 bp, with varying levels of support from different technologies, was evaluated. The distributions of the resulting posterior probability values for all considered deletions asserted by 1 (n = 4443), 2 (n = 600), 3 (n = 691), and 4 (n = 244) technologies are shown in the left panel. The tails of the distributions indicate that SV-Verify is able to discriminate between accurate calls (enriched in the 4 Tech set) and inaccurate calls (prevalent in the 1 Tech set). ROC curves were used to convert the posterior probability output from each SVM to a specificity value for each putative call. Here we employed a specificity threshold of 0.9 (sp90) to confirm a putative deletion. The percentage of evaluated deletions confirmed at sp90, filtered by putative deletion size and by the number of technologies making the call, is shown in the right panel. SV-Verify demonstrated good sensitivity across all considered putative deletion size ranges, down to 300 bp.
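As a rough illustration of the sp90 step, the sketch below maps a posterior probability to a specificity value using the scores of negative controls, then confirms calls at or above a specificity threshold. It is a minimal sketch and not the SV-Verify implementation: the function names, the empirical-specificity definition (fraction of negative controls scoring below the posterior), and the example scores are all assumptions made for illustration.

```python
from bisect import bisect_left


def specificity_at(posterior, negative_scores_sorted):
    """Empirical specificity if the decision threshold were set at `posterior`:
    the fraction of negative controls scoring strictly below it (assumed definition)."""
    if not negative_scores_sorted:
        raise ValueError("need at least one negative control score")
    below = bisect_left(negative_scores_sorted, posterior)
    return below / len(negative_scores_sorted)


def confirm_calls(posteriors, negative_scores, min_specificity=0.9):
    """Return True for each putative call whose specificity meets the threshold."""
    negatives = sorted(negative_scores)
    return [specificity_at(p, negatives) >= min_specificity for p in posteriors]


# Hypothetical posterior probabilities, for illustration only.
negatives = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.70]
calls = [0.95, 0.50, 0.12]
print(confirm_calls(calls, negatives))  # [True, False, False]
```

With these made-up scores, only the call with posterior 0.95 lies above at least 90% of the negative controls, so it is the only one confirmed at sp90.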
SV-Verify training and workflow
The Nabsys SV-Verify software package provides an efficient, robust pipeline for the systematic and automated evaluation of putative SVs. SVM training was accomplished using reference material from the National Institute of Standards and Technology (NIST) Genome in a Bottle (GIAB) consortium for a well-characterized human genome, NA12878. Deletion calls ≥300 bp asserted by multiple technologies were parsed into four classes and used to train four distinct SVMs. The relationship between specificity and sensitivity for each respective SVM is graphed as a receiver operating characteristic (ROC) curve shown to the right. To evaluate hypothesized SVs, Nabsys single-molecule reads, a base reference map, and variant reference maps are used as inputs to HD-Mapping. SV-Verify utilizes the four unique SVMs tailored to different classes of structural variation (i.e., size, type, and complexity) to output a posterior probability for each putative variant. Posterior probabilities and ROC curves are then used to determine calls at a given specificity threshold.
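The relationship between sensitivity and specificity described above can be sketched as follows. This is an illustrative computation of ROC points from classifier scores for known deletions and negative controls; it assumes a call is positive when its score is at or above the threshold, and the function name and the scores are hypothetical rather than taken from SV-Verify.

```python
def roc_points(positive_scores, negative_scores, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold.

    Sensitivity: fraction of true deletions scoring >= threshold.
    Specificity: fraction of negative controls scoring < threshold.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need both positive and negative scores")
    points = []
    for t in thresholds:
        true_pos = sum(1 for s in positive_scores if s >= t)
        true_neg = sum(1 for s in negative_scores if s < t)
        points.append(
            (t, true_pos / len(positive_scores), true_neg / len(negative_scores))
        )
    return points


# Hypothetical posterior probabilities for one SVM class.
positives = [0.9, 0.8, 0.6, 0.3]
negatives = [0.1, 0.2, 0.4, 0.7]
for t, sens, spec in roc_points(positives, negatives, [0.25, 0.5, 0.75]):
    print(f"threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the trade-off each ROC curve summarizes.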
To construct whole genome maps, high molecular weight DNA is isolated using solution-phase, commercially available kits. The genomic DNA is tagged in a sequence-specific manner through an enzymatic nicking reaction. As single molecules pass through the detector, the DNA backbone and attached tags are sensed as changes in the resistance of the detector. The resulting data indicate the time between tagged sites on each DNA backbone. The temporal events are converted to distance-based events, where the distance between adjacent tags (termed an “interval”) is reported in base pairs.
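The time-to-distance conversion can be illustrated with a short sketch. It assumes a single constant translocation velocity per molecule, which is a simplification; the poster does not describe how velocity is estimated, and the function name and the 1 Mbp/s value used here are illustrative only.

```python
def intervals_bp(tag_times_s, velocity_bp_per_s):
    """Convert tag detection times (seconds) into tag-to-tag intervals (bp).

    Assumes constant velocity along the molecule (a simplifying assumption).
    """
    if velocity_bp_per_s <= 0:
        raise ValueError("velocity must be positive")
    times = sorted(tag_times_s)
    return [
        round((later - earlier) * velocity_bp_per_s)
        for earlier, later in zip(times, times[1:])
    ]


# Hypothetical tag times for one molecule moving at 1 Mbp/s.
print(intervals_bp([0.0, 0.0005, 0.002, 0.0023], 1_000_000))  # [500, 1500, 300]
```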
Single-molecule electronic detection
The accuracy of Nabsys single-molecule read mapping is central to the effective detection of SVs across a range of sizes using SV-Verify. To demonstrate the mapping accuracy of the Nabsys platform, single-molecule data were collected for E. coli MG1655 nicked with Nt.BspQI and mapped to the high-quality reference. As shown in the plot above, there is a high degree of agreement between the expected reference interval sizes and the consensus interval sizes generated through Nabsys read mapping. The linear relationship observed (R² = 0.9999) extended down to intervals as small as 300 bp, well below the diffraction limit of optical mapping approaches. De novo assembly of these data resulted in 3 maps that spanned 99.4% of the reference with 0 false positives and 0 false negatives for intervals >500 bp. These results highlight the accuracy and resolution of Nabsys HD-Mapping, which forms the foundation of the SV-Verify pipeline.
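For readers who want to reproduce this kind of agreement check, the sketch below fits an ordinary least-squares line to paired reference and consensus interval sizes and reports the coefficient of determination. The function name and the interval values are hypothetical, not the E. coli data from the poster.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical interval sizes in bp, for illustration only.
reference = [300, 800, 1300, 5000, 20000]
consensus = [310, 805, 1310, 5000, 19980]
slope, intercept, r_squared = linear_fit(reference, consensus)
print(f"y = {slope:.4f}x + {intercept:.2f}, R^2 = {r_squared:.4f}")
```

A slope near 1 and a small intercept indicate that consensus intervals track the reference without systematic scaling or offset.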
Underlying technology biases can significantly impact the accuracy of SV calls, particularly as the size of a given SV exceeds read length. To investigate this phenomenon, we determined the percentages of evaluated putative deletion calls made using Illumina or PacBio data (underlying data type; may include several data sets and several callers) that were confirmed at sp90 or refuted using a posterior probability threshold of ≤0.1 by SV-Verify in various size ranges (see below). The number of calls per category is indicated above each bar. As expected based on read length, the percentage of Illumina deletion calls confirmed by SV-Verify decreased as a function of deletion size, while the longer PacBio read lengths translated to more consistent calls across a range of deletion sizes, highlighting the importance of long-range information for large SV call accuracy.
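The per-size-range tally described above can be sketched as follows. The size ranges and the two thresholds (specificity ≥0.9 to confirm, posterior probability ≤0.1 to refute) come from the poster; the function names, the "unresolved" label for calls meeting neither criterion, the choice to test confirmation first, and the example calls are assumptions for illustration.

```python
from collections import Counter

# (lower bound, upper bound or None, label), in bp.
SIZE_BINS = [
    (300, 499, "300-499"),
    (500, 999, "500-999"),
    (1000, 2999, "1000-2999"),
    (3000, None, ">=3000"),
]


def size_bin(size_bp):
    """Return the label of the size range containing size_bp, or None."""
    for low, high, label in SIZE_BINS:
        if size_bp >= low and (high is None or size_bp <= high):
            return label
    return None


def classify(posterior, specificity):
    """Confirmed at sp90, refuted at posterior <= 0.1, otherwise unresolved."""
    if specificity >= 0.9:
        return "confirmed"
    if posterior <= 0.1:
        return "refuted"
    return "unresolved"


def percent_by_bin(calls):
    """calls: iterable of (size_bp, posterior, specificity) tuples.

    Returns {(size label, outcome): percent of evaluated calls in that size range}.
    """
    totals, outcomes = Counter(), Counter()
    for size_bp, posterior, specificity in calls:
        label = size_bin(size_bp)
        if label is None:
            continue  # below 300 bp: outside the evaluated range
        totals[label] += 1
        outcomes[(label, classify(posterior, specificity))] += 1
    return {key: 100.0 * n / totals[key[0]] for key, n in outcomes.items()}


# Hypothetical calls from one technology.
calls = [
    (350, 0.97, 0.95),
    (420, 0.05, 0.10),
    (450, 0.50, 0.60),
    (450, 0.98, 0.99),
    (5000, 0.02, 0.05),
]
for key, pct in sorted(percent_by_bin(calls).items()):
    print(key, f"{pct:.1f}%")
```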
Conclusions
ASHG 2017 Booth 753
The Nabsys HD-Mapping platform combined with the SV-Verify software package provides a high-throughput, fully automated tool for the evaluation of structural variation in human genomes. SV-Verify can clearly distinguish between accurate and inaccurate deletion calls as small as 300 bp in size. The results presented here highlight the critical need for orthogonal technologies with a broad effective size range for accurate characterization of SVs on a genome-wide scale.
We thank Justin Zook, Ph.D., and Marc Salit, Ph.D., of the NIST Genome-Scale Measurement Group for welcoming our participation in GIAB.
NA24385 putative deletion call set evaluation using SV-Verify
Illumina PacBio
Evaluation of Illumina and PacBio calls
Mapping of single-molecule reads to evaluate thousands of putative SVs simultaneously
ROC curves resulting from each support vector machine
The importance of structural variation in human disease and the difficulty of detecting structural variants larger than 50 base pairs have led to the development of several long-read sequencing technologies and optical mapping platforms. Frequently, multiple technologies and ad hoc methods are required to obtain a consensus regarding the location, size, and nature of a structural variant, with no single approach able to reliably bridge the gap in variant sizes between those readily detected using NGS technologies and the largest rearrangements observed with optical mapping. Often, structural variants larger than 10 kilobases are not detected.
To address this unmet need, we have developed a new software package, SV-Verify™, which utilizes data collected with the Nabsys High Definition Mapping (HD-Mapping™) system to perform hypothesis-based verification of putative deletions. We demonstrate that whole genome maps, constructed from data generated by electronic detection of tagged DNA hundreds of kilobases in length, can be used effectively to facilitate calling of structural variants ranging in size from 300 base pairs to hundreds of kilobase pairs. SV-Verify implements hypothesis-based verification of putative structural variants using supervised machine learning. Machine learning is realized using a set of support vector machines, capable of concurrently testing several thousand independent hypotheses. We describe support vector machine training, utilizing 1089 deletions and 4637 negative controls from a well-characterized human genome. Plots delineating the specificity versus sensitivity of each of the support vector machines will be presented. We subsequently applied the trained classifiers to another human genome, evaluating >5000 putative deletions, demonstrating high sensitivity and specificity for deletions from 300 base pairs to hundreds of kilobases. Over 78% of deletions called by three or more technologies were confirmed by SV-Verify.
[Figure: sample preparation and detection workflow. Panels: High molecular weight DNA isolation (35-500 kb molecules); Sequence-specific tag attachment; Tagged sample introduction into instrument; Single-molecule tag detection at a velocity of >1 Mbp/s. Detector trace: V versus t, showing the DNA backbone and sequence-specific tags.]
[Figure legend, ROC curves: Deletions 300-499 bp; Deletions 500-999 bp; Deletions spanning multiple intervals; Deletions ≥1000 bp.]
Advantages of Nabsys electronic detection:
• Long-range information
• Easy to multiplex without cross-talk
• Highly scalable
• High resolution, direct detection of 300 bp intervals
• Low, stochastic single-molecule false-positive and false-negative rates
• Electrophoretic and hydrodynamic control of access to detector
• Highly sensitive detection enables tag detection during translocation
• Wider range of useful DNA lengths as compared to optical methods

[Figure a: E. coli MG1655 Nt.BspQI consensus vs. reference interval sizes. x-axis: Reference Interval Size (bp), 0 to 50,000; y-axis: Consensus Interval Size (bp), 0 to 50,000; expanded view of 300 to 1800 bp. Linear fit: y = 0.9983x + 10.52, R² = 0.9999.]
[Figure: Posterior probability distributions parsed by number of asserting technologies. Panels a and b. x-axis: Posterior Probability (bin center), 0.05 to 0.95; y-axis: Fraction of Total Deletions, 0.00 to 0.50. Series: 1 Tech, 2 Tech, 3 Tech, 4 Tech.]
[Figure: Call sensitivity parsed by number of asserting technologies for different deletion sizes. x-axis: Putative Deletion Size (bp), in ranges 300-499, 500-999, 1000-2999, ≥3000; y-axis: Percent of Evaluated Confirmed, 0% to 100%. Series: 1 Tech, 2 Tech, 3 Tech, 4 Tech.]
[Figure: Percent of evaluated putative deletion calls confirmed or refuted by SV-Verify, by underlying technology. Panels: a. Illumina; b. PacBio; c. Bionano; d. Complete Genomics. x-axis: Putative Deletion Size (bp), in ranges 300-499, 500-999, 1000-2999, ≥3000; y-axis: Percent of Evaluated, 0% to 100%. Series: Confirmed, Refuted. Call counts printed above the bars, in extraction order: 562, 139, 454, 255, 778, 1024, 540, 935, 465, 50, 164, 33, 234, 71, 228, 59, 11, 1, 67, 5, 192, 29, 208, 39, 263, 14, 93, 5, 161, 30, 232, 192.]