Combined Analysis of Nucleic Acids and Protein for … · Combined Analysis of Nucleic Acids and...

3
Combined Analysis of Nucleic Acids and Protein for Cancer Research Jared Isaac 1 , Mazi Mohiuddin 2 , David Sarracino 2 , Mousumi Rath 3 , Amol Prakash 4 , Haiping Liu 5 , Sharmini Muralitharan 5 , Jennifer Freeland 1 , Shen Luan 2 , and Colin Toombs 6 1 Anatomical Pathology Division (APD) Kalamazoo, MI; 2 Chromatography Mass Spectrometry Division Cambridge, MA; 3 Genetic Sciences Division South San Francisco, CA; 4 Optys Boston, MA; 5 APD Fremont, CA; 6 APD Runcorn, GB Contact: jared.isaac@thermofisher.com Poster Note 64806 ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data. A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression. The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format. MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm 3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole- Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data. RESULTS Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower. Figure 1. Nucleic Acid and Protein Yield AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50 UM_FFPE 99.20% 46/50 CTRL UM CTRL UM CTRL UM Figure 2. RNA NGS QC Parameters Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Transcript of Combined Analysis of Nucleic Acids and Protein for … · Combined Analysis of Nucleic Acids and...

Page 1: Combined Analysis of Nucleic Acids and Protein for … · Combined Analysis of Nucleic Acids and ... Combined Analysis of Nucleic Acids and Protein for Cancer Research ... a median

Combined Analysis of Nucleic Acids and Protein for Cancer Research Jared Isaac1, Mazi Mohiuddin2, David Sarracino2, Mousumi Rath3, Amol Prakash4, Haiping Liu5, Sharmini Muralitharan5, Jennifer Freeland1 , Shen Luan2, and Colin Toombs6 1Anatomical Pathology Division (APD) Kalamazoo, MI; 2Chromatography Mass Spectrometry Division Cambridge, MA; 3Genetic Sciences Division South San Francisco, CA; 4Optys Boston, MA; 5APD Fremont, CA; 6APD Runcorn, GBContact: [email protected]

Po

ster No

te 64

80

6

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Page 2: Combined Analysis of Nucleic Acids and Protein for … · Combined Analysis of Nucleic Acids and ... Combined Analysis of Nucleic Acids and Protein for Cancer Research ... a median

2 Combined Analysis of Nucleic Acids and Protein for Cancer Research

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Page 3: Combined Analysis of Nucleic Acids and Protein for … · Combined Analysis of Nucleic Acids and ... Combined Analysis of Nucleic Acids and Protein for Cancer Research ... a median

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

Jared Isaac 1, Mazi Mohiuddin 2, David Sarracino 2, Mousumi Rath 3, Amol Prakash 4, Haiping Liu 5, Sharmini Muralitharan 5, Jennifer Freeland 1 , Shen Luan 2 and Colin Toombs 6 Anatomical Pathology Division (APD) Kalamazoo, MI 1 Chromatography Mass Spectrometry Division Cambridge, MA 2 Genetic Sciences Division South San Francisco, CA 3 Optys Boston, MA 4 APD Fremont, CA 5 APD Runcorn, GB 6 ; contact: [email protected]

RESULTS

Figure 1. Nucleic Acid and Protein Yield. In comparison to CTRL extraction, UM extraction of RNA (400 ng) and Protein (10 μg) yield is comparable, however DNA (30 ng) yield is much lower.

Figure 1. Nucleic Acid and Protein Yield

ABSTRACT This study was undertaken to determine the feasibility of isolation of RNA, DNA and Protein from a single formaldehyde fixed paraffin embedded (FFPE) section to detect lung cancer biomarkers. Correlation of RNA expression and DNA mutations by Next Generation Sequencing (NGS) and Protein expression by liquid chromatography mass spectrometry (LC-MS) from the same tissue section is expected to empower researchers with confirmatory, quantitative data.

A proprietary Unified Method (UM) (Thermo Fisher Scientific, Kalamazoo, MI) for isolating RNA, DNA and Proteins from a single FFPE section was developed and used to analyze RNA expression, DNA mutations and Protein expression from a clinical sample. The RNA and DNA were analyzed using the Thermo Scientific™ Ion AmpliSeq™ NGS platform and the Proteins were analyzed on a Thermo Scientific™ LC-MS system. A Proprietry Integrated Bioinformatics Software (PIBS) (Thermo Fisher Scientific, Kalamazoo, MI) was developed to correlate RNA expression, DNA mutations and Protein Expression.

The PIBS correlates RNA and Protein Quantification, DNA mutations and morphological information, and it is scalable to handle replicate datasets from Proteogenomics workflows. Researchers will be able to correlate data across profiles and identify differentially expressed markers. Proteogenomics data can be analyzed with histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest.

INTRODUCTION Immunohistochemistry (IHC) probes are a commonly used testing technology. 20-35% of IHC qualitative testing is equivocal, leading to repeated testing and increased costs (1, 2). Evaluation of lung cancer (LCa) is dependent on determination of cancer subtype: Adenocarcinoma (AC) or Squamous Carcinoma (SC) (3). This study leverages the power of a proprietary Unified Method (UM) for extraction of biomolecules prior to Next Generation Sequencing (NGS) for RNA and DNA expression and mutational analysis and Liquid Chromatography Mass Spectrometry (LC-MS) for protein detection. The potential of this technology is to provide information for a clinical researcher to understand mutations and the resulting protein expression in a single combined testing format.

MATERIALS AND METHODS Sample Preparation LCa Adenocarcinoma purchased from Asterand Biosciences, Detroit, MI. Tissue was grossed into 3 cm3 sections and fixed with 10% neutral buffered formalin (NBF) for 24 hr at RT. After fixation, tissues were processed under standard conditions, paraffin embedded (FFPE) and stored at 4˚C. ~30 mg fresh frozen (FF) pieces were used as controls (CTRL). 7 μM curls were sectioned and processed with the UM conditions and 4 μm sections adhered to glass slides for IHC staining. IHC IHC was performed on the Thermo Scientific™ Lab Vision™ Autostainer 360-2D and PT Module and Thermo Scientific™ UltraVision™ Quanto Detection System HRP DAB per manufacturer’s recommendations. Nucleic Acid (NA) and Protein Quality Control Extracted NA and Protein were QC assayed using the Thermo Scientific™ Qubit® 3.0, Thermo Scientific™ Nanodrop® 2000c, and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). NGS NA were processed on the Ion AmpliSeq workflow using the Thermo Scientific™ Ion AmpliSeq™ RNA and DNA Cancer 50 gene Panels on the Thermo Scientific™ Ion Personal Genome Machine™ (PGM). LC-MS Peptides from digested proteins were cleaned with Thermo Scientific™ HyperSep™ Retain CX SPE cartridges and injected in 98:2:0.2 Water:Methanol:Formic Acid into a Thermo Scientific™ Dionex™ Ultimate™ 3000 HPLC with Thermo Scientific™ Dionex™ Acclaim™ 120 C18 Reversed-Phase LC column. Data were acquired in 60 minute runs with data dependent analysis on a Thermo Scientific™ Q Exactive™ HF (QEHF) hybrid quadrupole-Orbitrap MS with Pierce PRTK standards run before and after each sample. The QEHF was used with the following settings: IT= 100 ms, AGC=3e6, 120,000 resolution, and a mass range of 350-1500 m/z. Bioinformatics NGS sequencing was processed using Thermo Scientific™ Ion Reporter™ software and LC-MS data were analyzed with Thermo Scientific™ Proteome Discoverer™ software. A Proprietry Integrated Bioinformatics Software (PIBS) was developed to perform an integrated analysis of Laser Capture Microdisection (LCM) and Proteogenomic data.

DISCUSSION NA and Protein were isolated with the UM from a single thin FFPE section to perform NGS and LC-MS. FFPE samples prepared with the UM performed similarly in comparison to FF controls. For the Ion AmpliSeq RNA Cancer Panel, 99.20% reads were on target, 46/50 targets were detected at 100X, and RNA expression was measured with predictive relevance to LCa. Data from the Ion AmpliSeq Cancer Hotspot Panel v2 replicate runs was comparable, indicating that the workflow is stable, reproducible & scalable. Both FF and FFPE sample types processed with the workflow indicated comparable performance with regards to total reads, % mapped target reads and barcode uniformity. IHC detected positives that were not present (p40/p63) and a negative that was not present (CK20) for histological subtype while the MS Discovery data predicted the correct subtype in accordance with Pathologist’s assessment but only detected MET and not EGFR, HER2 or KRAS. The protein & peptide concordance between the FF replicates (3) is greater than the FFPE replicates (3), which could be attributed to tumor heterogeneity or variability in the tissue fixation process. The overlap of RNA expression and DNA mutation status between UM FFPE and FF samples was higher than for protein expression. The PIBS has been developed to correlate RNA and Protein Quantification, DNA mutations and morphological information, demo datasets indicate that it is scalable to handle replicate datasets from Proteogenomics workflows. Cancer researchers will be able to use the PIBS to correlate data across profiles and identify differentially expressed markers to confirm histological subtype as well as the presence of drug targets for research use. Development of targeted LC-MS research methods is expected to increase the correspondence between FF and FFPE datasets as well as the detection of future biomarkers of interest. CONCLUSIONS • A unified method (UM) for extraction of RNA, DNA and Proteins has been developed. • Data generated using LCa tissue indicates reproducible and biologically interpretable information. • The PIBS empowers cancer researchers by providing detailed information about Protein, DNA & RNA markers along

with the morphological information about the tissue samples. • The PIBS can provide confirmation of protein expression of multiple targets as opposed to a single antibody based IHC

stain. • Combined extraction of Nucleic Acid and Protein from the same section of FFPE tissue enables Proteogenomics &

Translational Research For Research Use Only. Not for use in diagnostic procedures. REFERENCES 1. Makary MA et al. The British Journal of Medicine (2016). 2. Newman-Toker, et al. Diagnostic Errors More Common, Costly, and Harmful than Treatment Mistakes. News Release, Johns Hopkins University College of Medicine (2013). 3. Kim, MI Jin et al. Annals of Diagnostic Pathology (2013).

TRADEMARKS/LICENSING © 2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others.

Combined Analysis of Nucleic Acids and Protein for Cancer Research

AmpliSeq RNA Cancer 50 gene Targeted Panel (NGS) % reads on Target Targets detected at 100X UM_FF 99.39% 48/50

UM_FFPE 99.20% 46/50

CTRL UM CTRL UM CTRL UM

Figure 2. RNA NGS QC Parameters

Figure 2. RNA NGS QC Parameters. Using the RNA Cancer 50 Panel, a median read length of 112 bp, 99.2% reads on target and 46/50 reads were detected at 100X for FFPE samples extracted using the UM in comparison to FF.

Figure 3. RNA and DNA Coverage Expression Profile

Figure 3. RNA and DNA Coverage Expression Profile. Using targeted Cancer 50 NGS panels, RNA and DNA coverage expression profiles between FF and FFPE were comparable in reads per million (rpm) in log2 scale.

Figure 4. IHC Detection of LCa Biomarkers

Figure 4. IHC Detection of LCa Biomarkers. IHC results indicate positivity for CK7, EGFR, KRAS, MET, NAPSIN A, p63, and TTF-1

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS

Figure 5. Concordance of FF and FFPE Identified peptides and proteins by Discovery LC-MS. Three FF and FFPE LCa samples each were processed with the UM and analyzed by discovery LC-MS. Venn diagrams show the triplicate overlap in FF (A) and FFPE (B) identified protein groups, FF(C) and FFPE (D) identified peptide groups, and FF versus FFPE identified protein groups (E; 753 and 688, respectively) and identified peptide (F; 3685 and 2746, respectively).

Protein UM LCa

Biomarker IHC LC/MS AC SC

CK5/CK6a/6b - - - +

CK7 + + + - CK20 - + + - c-MET + + ? ? EGFR + - ? ? HER2 - - ? ? KRAS + - + -

Napsin A + + + - p63/p40 + - - +

TTF1 + - + -

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers

Figure 6. Comparison of IHC and LC-MS Detection of LCa Biomarkers. In comparison to IHC, UM LC-MS did not identify peptides corresponding to EGFR, KRAS and TTF-1. The absence of p63 may not be a negative finding as the IHC result may be a false positive since an adenocarcinoma does not express p63 from the literature. Failure of IHC to detect CK20 could lead to incorrect subtyping which was detected by LC-MS. The expected expression of the LCa biomarkers comparing adenocarcinoma (AC) to squamous (SC) subtype is summarized as well.

A B

C D

E F

Figure 7. Schematic of UM LCM Workflow with PIBS

Figure 7. Schematic of UM LCM Workflow with PIBS. Demonstration of developed PIBS illustrating spatial localization of UM gene and protein expression color coded to LCa tissue micro dissected using the Arcturus LCM. The PIBS provides correlation between RNASeq expression and Protein Quantitation coupled with the DNA mutations, and it is scalable to handle replicate datasets both from the proteomics and genomics workflows. Users will be able to correlate data across profiles, identify differentially expressed markers and generate interpretation reports.

PN64806-EN 0616S

Africa +43 1 333 50 34 0Australia +61 3 9757 4300Austria +43 810 282 206Belgium +32 53 73 42 41Brazil +55 11 2730 3006Canada +1 800 530 8447China 800 810 5118 (free call domestic)

400 650 5118

Denmark +45 70 23 62 60Europe-Other +43 1 333 50 34 0Finland +358 10 3292 200France +33 1 60 92 48 00Germany +49 6103 408 1014India +91 22 6742 9494Italy +39 02 950 591

Japan +81 6 6885 1213Korea +82 2 3420 8600Latin America +1 561 688 8700Middle East +43 1 333 50 34 0Netherlands +31 76 579 55 55 New Zealand +64 9 980 6700 Norway +46 8 556 468 00

Russia/CIS +43 1 333 50 34 0Singapore +65 6289 1190Sweden +46 8 556 468 00 Switzerland +41 61 716 77 00Taiwan +886 2 8751 6655UK/Ireland +44 1442 233555USA +1 800 532 4752

www.thermofisher.com©2016 Thermo Fisher Scientific Inc. All rights reserved. The 2100 Bioanalyzer is a registered trademark of Agilent Technologies. ISO is a trademark of the International Standards Organization. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is presented as an example of the capabilities of Thermo Fisher Scientific products. It is not intended to encourage use of these products in any manners that might infringe the intellectual property rights of others. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult your local sales representative for details.

Thermo Fisher Scientific, Sunnyvale, CA USA is ISO 9001Certified.