The Development of an LC-FAIMS-MS Metabolomics Workflow … · Segregate Samples as Per CF TIC in...

1
Alasdair Edge 1 ; Grant Stewart 2 ; Kayleigh Arthur 1 ; Lauren Brown 1 ; Chris Hodkinson 1 ; Aditya Malkar 1 ; Paul Nasca 1 ; Marc van der Schee 1 In non-targeted ‘omics applications liquid or gas chromatography (LC or GC) is typically combined with mass spectrometry (MS) for analysis of complex biological matrices. Molecular features can be missed by LC-MS due to: Trace component suppression by chemical noise. Chromatographically unresolved isomeric species. The optimised FAIMS scan settings of CF -0.9 to 3.1 Td at DF 240 Td at 2 CF scans s -1 were used to analyse: n=10 urine samples from renal cancer patients n=10 urine samples from healthy individuals Raw spectra were used to generate features lists for each CF using XCMS using parameters: method="centWave",scanrange=c(0,504), peakwidth=c(5,30), ppm=25, mzdiff=c(-0.001), snthresh=3, integrate=2, fitgauss=FALSE Produced feature-extracted and feature-aligned data. An unsupervised, pareto scaled principal component analysis (PCA) plot for data obtained at CF 1.1 Td is shown in Figure 9. Pareto scaling used for the analysis as consistent with published untargeted metabolomic studies 2 The S-plot, a plot of covariance and correlation based on the features making up the OPLS-DA model was also plotted. Trend plots for features contributing towards class separation with the most confidence were plotted (Figure 11). Next step is to expand the data set to a statistically significant # of patients and validate the identified features by correlating multivariate analysis to raw data, as well as cross referencing to patient data to identify potential confounders. The PCA plot shows partial separation indicating differences in the data sets. A supervised o-partial least squares-discriminant analysis (OPLS-DA), highlighting the within class and between class differences, also showed separation (Figure 10). 1. Introduction 5. LC-ultraFAIMS-MS Workflow Validation The optimisation of untargeted profiling of the urinary metabolome using LC-FAIMS-MS and the data analysis workflow required to produce 3D nested data sets has been described. The optimised workflow has been applied to demonstrate proof of principle metabolic assessment of urine samples from a renal cell cancer cohort vs. healthy controls. Following further validation, the LC-ultraFAIMS-MS features list will be used to build a predictive model, or classifier, from which a probability of disease can be predicted. Combining FAIMS into a metabolomic workflow offers increased peak capacity for untargeted metabolite profiling to diagnose disease or stratify an individual’s treatment. 5. Conclusions Optimisation of nested FAIMS data acquisition focussed on: # and range of FAIMS CF settings to give optimal FAIMS separation. # of data points within the timescale of the chromatographic peaks. Optimal sensitivity via chromatographic peak heights and # of TOF scans s -1 . The effect of the different MS scan rates and CF ranges on feature detection was investigated. Acquiring data at 2 CF scans s -1 increased the # of features detected, in all cases, despite the decrease in peak intensity associated with higher acquisition speeds (Figure 7b, c). Increasing to 3 CF scans s -1 did not further increase the # of features detected. The # of features detected by scanning the FAIMS at a rate of 1 CF scan s -1 and 2 CF scans s -1 was compared. At a S:N of 3, more features were detected using a 2 CF scans s -1 scan rate (Figure 8a). Optimal DF was determined based on selectivity and sensitivity. The CF vs features plot (Figure 5) shows good coverage across the analytical space in the range -1 to +3 Td. More features were detected in the higher CF region at the higher DFs. 240 Td was selected for further experiments based on widest distribution of detected features across the CF range. As the CF scan is synchronised with the MS acquisition, the # of data points across a chromatographic peak is dependent on the # of MS scans s -1 and the CF range. Increased MS scan rate means more data points can be acquired over a given CF range. Alternatively, a smaller CF step size could be applied, increasing the # of data points across the CF peak whilst maintaining the # of CF data points across LC peak. Faster ToF scan rates reduced peak intensity (Figure 6). 4. LC-ultraFAIMS-MS optimisation Figure 1. LC-MS and LC-ultraFAIMS-MS feature determination Figure 5. Detected features at DFs of 230, 240 and 250 Td Figure 9. Unsupervised PCA (Pareto Scaling) plot at 1.1 Td CF Figure 10. OPLS-DA plot at 1.1 Td CF Figure 11. Example Trend plots of features contributing towards class separation Figure 6. (a) Raw EIC for m/z 273 across all CFs (black) 10 spectra s -1 and (red) 20 spectra s -1 , showing twice as many CF scans across the LC peak and (b) deconvoluted LC-MS peak at CF of 1.6 Td at 20 spectra s -1 Figure 7. Comparison of # of features detected in experiments 3 a-f as described in Table 1 To determine if this result was accurate, or a result of increased noise using the faster scan rate, the analysis was repeated at a S:N of 10 (Figure 8b). Whilst total # of features decreased, the faster scan rate again produced more features than the slow scan rate. 2 CF scans s -1 scan rate was used for analysis. Figure 8. Comparison of features detected at 1 CF scan s-1 and 2 CF scans s-1 at (a)S:N of 3 and (b) S:N of 10 Figure 2. (a) ultraFAIMS chip and system (b) installed on Agilent 6230 ToF-MS Figure 3. Integrated LC-ultraFAIMS-MS analysis Figure 4. LC-ultraFAIMS-MS feature extraction workflow www.owlstonemedical.com/ultraFAIMS The Development of an LC-FAIMS-MS Metabolomics Workflow and its Application to the Discrimination of Benign from Malignant Renal Cell Masses References 1. Arthur, KA; Turner, MA; Reynolds, JC; Creaser, CS.; Anal. Chem., 2017, 89, 3452 2. van den Berg, RA; Hoefsloot, HCJ; Westerhuis, JA; Smilde, AK; van der Werf, MJ; BMC Genomics, 2006, 7, 142 1 Owlstone Medical Ltd., 162 Cambridge Science Park, Cambridge, CB4 OGH, UK, 2 Academic Urology Group, University of Cambridge, Box 43, Addenbrookes Hospital, Cambridge Biomedical Campus, Hills Road, Cambridge, CB2 0QQ For further information, email: [email protected] Analysis was performed on a 6230 TOF-MS and a 1290 series LC (Agilent, Santa Clara, US) combined with an ultraFAIMS device (Owlstone Medical Ltd, Cambridge, UK). The key dimensions of the ultraFAIMS device are the 100 μm electrode gap and 700 μm path length The small scale is key to the ability to integrate into the LC-MS workflow; an entire CF scan s-1 can be achieved, making the scanning approach compatible with chromatographic timescales. The CF values are synchronised with MS acquisition so a mass spectrum is acquired for each CF (Figure 3). LC was performed using a 2. Methods The workflow for feature extraction from LC-FAIMS-MS files is detailed in Figure 4. A feature was defined as a unique identifier for each component of a m/z, a retention time (tR) and a compensation field (CF). 3. Data analysis Table 1: LC-ultraFAIMS-MS optimisation experiments Orthogonality between field asymmetric ion mobility spectrometry (FAIMS), LC and MS provides additional unique compound identifiers with detection of features based on (Figure 1): Retention time FAIMS dispersion and compensation fields (DF and CF) Mass-to-charge (m/z) RT 1 Identified Feature Identified Feature 1 Identified Feature 2 m/z 1 + = RT 1 LC-ultraFAIMS-MS LC-MS DF+ CF 1 m/z 1 DF+ CF 2 m/z 2 + + = RT 1 + + = Chromatographic Peak Mass Spectra 1 per CF Compensation Field Scan (nCFs per s) LC peak width Incorporate FAIMS into LC-MS M a s s S p e c t r o m e t r y F i e l d A s y m m e t r i c W a v e f o r m L i q u i d C h r o m a t o g r a p h y I o n M o b i l i t y S p e c t r o m e t r y MS scan rate (s -1 ) per s m/z start m/z end Index Start CF (Td) End CF (Td) N CF steps N CF actual CF step size (Td) N Repeats Start DF (Td) End DF (Td) N DF Steps 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 250 250 0 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 240 240 0 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 230 230 0 24 2 80 1500 0 -0.9 4.1 10 12 0.5 1007 Test 1 0 Test 1 10 1 80 1500 0 -0.9 3.1 8 10 0.5 503 Test 1 0 Test 1 20 2 80 1500 0 -0.9 3.1 8 10 0.5 1007 Test 1 0 Test 1 18 1 80 1500 0 -0.9 3.1 16 18 0.25 503 Test 1 0 Test 1 6 1 80 1500 0 -0.9 3.1 4 6 1 503 Test 1 0 Test 1 12 2 80 1500 0 -0.9 3.1 4 6 1 1007 Test 1 0 Test 1 18 3 80 1500 0 0 -0.9 3.1 4 6 1 1511 Test 1 Test 1 Test 1a 1b 1c 2 3a 3b 3c 3d 3e 3f No. of features 3d 3e 3f No. of features 3a 3b 3c 0 500 1000 1500 2000 2500 3000 3500 4000 4500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 No. of features CF (Td) -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 CF (Td) -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 CF (Td) 3a 3b 3c 3d 3e 3f (a) (b) (c) A recent study 1 reported a threefold increase in features detected in non-targeted profiling of human urine with the addition of FAIMS to LC-MS analysis. Here we describe an optimised workflow to produce three-dimensional metabolomics data sets and its application to the metabolic assessment of indeterminate masses using urine samples from a renal cell cancer (RCC) cohort (DIAMOND study; NRES East Of England 03/018). (a) (b) reversed phase Poroshell 120 EC-C18 column, 2.1 x 100 mm, particle size 2.7 μm (Agilent Technologies). LC-FAIMS-MS (Agilent.d Files) Convert All Extracted TIC to .mzML Intermediate .mz5 Files Segregate Samples as Per CF TIC in Foldets CF 1 .mzML Feature List (1) Extract TICs Corresponding to Various CF Values Done via in-house Python Script and Proteowizard Feature Extraction using XCMS in R CF 2 .mzML Feature List (2) Feature Extraction using XCMS in R 500 0 1000 1500 2000 2500 3000 3500 4000 4500 -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 3.6 4.1 # of features CF 250 240 230 0 100000 200000 300000 400000 500000 600000 700000 250 260 270 280 290 Retention time (s) CF 1.6 Td 3a 3b (b) Intensity (a) Intensity 259 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 2.2 2.4 x10 5 260 261 262 263 264 265 266 267 268 269 270 271 Counts vs. Acquisition Time (sec) +ESI EIC(273.1390) MS(all) 1705_FAIMS-Optimisation_MS4_FAIMS3b_014.d 272 273 274 275 276 277 278 279 280 281 282 (a) (b) 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% Sum of features No. of 3 9372 25 643 8011 4488 4141 10409 32 704 8905 4488 5084 No. of 2 Common features No. of 1 Unique features 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% Sum of features No. of 3 1384 1 62 1257 991 328 1470 3 64 1333 991 406 No. of 2 Common features No. of 1 Unique features 2 CF/s 1 CF/s Mid 2 CF/s 1 CF/s Mid 1500 1000 500 0 -500 -1000 -1500 -2000 -2500 -2000 -1500 -1000 -500 0 500 1000 1500 t[2] t[1] ] = 0.186 R2X[2] = 0.152 Ellipse: Hotelling’s T2 (95%) Control RenalCancer SIMCA-22Aug_RenalCancer.M2 (PCA-X) Colored according to classes in M2 1000 500 -500 0 -1000 -1500 -2000 -1500 -1000 -500 0 500 1000 1500 t[2] t[1] ] = 0.1233 R2Xo[2] = 0.119 Ellipse: Hotelling’s T2 (95%) Control RenalCancer SIMCA-22Aug_RenalCancer.M3 (OPLS-DA) Scaled proportionally to R2X Colored according to classes in M3 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 -0.3 -0.2 -0.1 0 0.1 0.2 p(corr)[1] XVar(M167T10) p[1] R2X[1] = 0.123 Obs ID ($ClassID) SIMCA-22Aug_RenalCancer.M3 (OPLS-DA) -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 -0.3 -0.2 -0.1 0 0.1 0.2 p(corr)[1] p[1] R2X[1] = 0.123 SIMCA-22Aug_RenalCancer.M3 (OPLS-DA) Control 10000 20000 30000 40000 50000 Average -2 std. dev. 2 std. dev. 3 std. dev. -3 std. dev. 60000 70000 80000 90000 RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer SIMCA-22Aug_RenalCancer.M3 (OPLS-DA) Average -2 std. dev. 2 std. dev. 3 std. dev. -3 std. dev. XVar(M167T10_1) Obs ID ($ClassID) Control 40000 45000 50000 55000 60000 65000 70000 75000 80000 RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer Control RenalCancer SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)

Transcript of The Development of an LC-FAIMS-MS Metabolomics Workflow … · Segregate Samples as Per CF TIC in...

Page 1: The Development of an LC-FAIMS-MS Metabolomics Workflow … · Segregate Samples as Per CF TIC in Foldets CF 1 .mzML Feature List (1) Extract TICs Corresponding to Various CF Values

Alasdair Edge1; Grant Stewart2; Kayleigh Arthur1; Lauren Brown1; Chris Hodkinson1;Aditya Malkar1; Paul Nasca1; Marc van der Schee1

In non-targeted ‘omics applications liquid or gas chromatography (LC or GC) is typically combined with mass spectrometry (MS) for analysis of complex biological matrices. Molecular features can be missed by LC-MS due to: • Trace component suppression by chemical noise.• Chromatographically unresolved isomeric species.

The optimised FAIMS scan settings of CF -0.9 to 3.1 Td at DF 240 Td at 2 CF scans s-1 were used to analyse: • n=10 urine samples from renal cancer patients• n=10 urine samples from healthy individualsRaw spectra were used to generate features lists for each CF using XCMS using parameters:• method="centWave",scanrange=c(0,504), peakwidth=c(5,30),

ppm=25, mzdi�=c(-0.001), snthresh=3, integrate=2, fitgauss=FALSE

Produced feature-extracted and feature-aligned data.An unsupervised, pareto scaled principal component analysis (PCA) plot for data obtained at CF 1.1 Td is shown in Figure 9.• Pareto scaling used for the analysis as consistent with published

untargeted metabolomic studies2

The S-plot, a plot of covariance and correlation based on the features making up the OPLS-DA model was also plotted. • Trend plots for features contributing towards class separation

with the most confidence were plotted (Figure 11).

Next step is to expand the data set to a statistically significant # of patients and validate the identified features by correlating multivariate analysis to raw data, as well as cross referencing to patient data to identify potential confounders.

The PCA plot shows partial separation indicating di�erences in the data sets. A supervised o-partial least squares-discriminant analysis (OPLS-DA), highlighting the within class and between class di�erences, also showed separation (Figure 10).

1. Introduction 5. LC-ultraFAIMS-MS Workflow Validation

• The optimisation of untargeted profiling of the urinary metabolome using LC-FAIMS-MS and the data analysis workflow required to produce 3D nested data sets has been described.

• The optimised workflow has been applied to demonstrate proof of principle metabolic assessment of urine samples from a renal cell cancer cohort vs. healthy controls.

• Following further validation, the LC-ultraFAIMS-MS features list will be used to build a predictive model, or classifier, from which a probability of disease can be predicted.

• Combining FAIMS into a metabolomic workflow o�ers increased peak capacity for untargeted metabolite profiling to diagnose disease or stratify an individual’s treatment.

5. Conclusions

Optimisation of nested FAIMS data acquisition focussed on:• # and range of FAIMS CF settings to give optimal FAIMS

separation.• # of data points within the timescale of the chromatographic

peaks.• Optimal sensitivity via chromatographic peak heights and # of

TOF scans s-1.

The e�ect of the di�erent MS scan rates and CF ranges on feature detection was investigated.• Acquiring data at 2 CF scans s-1 increased the # of features

detected, in all cases, despite the decrease in peak intensity associated with higher acquisition speeds (Figure 7b, c).

• Increasing to 3 CF scans s-1 did not further increase the # of features detected.

The # of features detected by scanning the FAIMS at a rate of 1 CF scan s-1 and 2 CF scans s-1 was compared.• At a S:N of 3, more features were detected using a 2 CF scans s-1

scan rate (Figure 8a).

Optimal DF was determined based on selectivity and sensitivity.• The CF vs features plot (Figure 5) shows good coverage across

the analytical space in the range -1 to +3 Td.• More features were detected in the higher CF region at the higher

DFs. 240 Td was selected for further experiments based on widest distribution of detected features across the CF range.

As the CF scan is synchronised with the MS acquisition, the # of data points across a chromatographic peak is dependent on the # of MS scans s-1 and the CF range.• Increased MS scan rate means more data points can be acquired

over a given CF range.• Alternatively, a smaller CF step size could be applied, increasing

the # of data points across the CF peak whilst maintaining the # of CF data points across LC peak.

• Faster ToF scan rates reduced peak intensity (Figure 6).

4. LC-ultraFAIMS-MS optimisation

Figure 1. LC-MS and LC-ultraFAIMS-MS feature

determination

Figure 5. Detected features at DFs of 230, 240 and 250 Td

Figure 9. Unsupervised PCA (Pareto Scaling) plot at 1.1 Td CF

Figure 10. OPLS-DA plot at 1.1 Td CF

Figure 11. Example Trend plots of features contributing towards class separation

Figure 6. (a) Raw EIC for m/z 273 across all CFs (black) 10 spectra s-1 and (red) 20 spectra s-1, showing twice as many CF scans across the LC peak and (b)

deconvoluted LC-MS peak at CF of 1.6 Td at 20 spectra s-1

Figure 7. Comparison of # of features detected in experiments 3 a-f as described in Table 1

To determine if this result was accurate, or a result of increased noise using the faster scan rate, the analysis was repeated at a S:N of 10 (Figure 8b).• Whilst total # of features decreased, the faster scan rate again

produced more features than the slow scan rate. • 2 CF scans s-1 scan rate was used for analysis.

Figure 8. Comparison of features detected at 1 CF scan s-1 and 2 CF scans s-1 at (a)S:N of 3 and (b) S:N of 10

Figure 2. (a) ultraFAIMS chip and system (b) installed on Agilent 6230 ToF-MS

Figure 3. Integrated LC-ultraFAIMS-MS analysis

Figure 4. LC-ultraFAIMS-MS feature extraction workflow

www.owlstonemedical.com/ultraFAIMS

The Development of an LC-FAIMS-MS Metabolomics Workflow and its Applicationto the Discrimination of Benign from Malignant Renal Cell Masses

References1. Arthur, KA; Turner, MA; Reynolds, JC; Creaser, CS.; Anal. Chem.,

2017, 89, 34522. van den Berg, RA; Hoefsloot, HCJ; Westerhuis, JA; Smilde, AK; van

der Werf, MJ; BMC Genomics, 2006, 7, 142

1Owlstone Medical Ltd., 162 Cambridge Science Park, Cambridge, CB4 OGH, UK, 2Academic Urology Group, University of Cambridge, Box 43,Addenbrookes Hospital, Cambridge Biomedical Campus, Hills Road, Cambridge, CB2 0QQ

For further information, email: [email protected]

Analysis was performed on a 6230 TOF-MS and a 1290 series LC (Agilent, Santa Clara, US) combined with an ultraFAIMS device (Owlstone Medical Ltd, Cambridge, UK).

• The key dimensions of the ultraFAIMS device are the 100 µm electrode gap and 700 µm path length

• The small scale is key to the ability to integrate into the LC-MS workflow; an entire CF scan s-1 can be achieved, making the scanning approach compat ib le with chromatographic timescales.

• The CF values are synchronised with MS acquisition so a mass spectrum is acquired for each CF (Figure 3).

• LC was performed using a

2. Methods

The workflow for feature extraction from LC-FAIMS-MS files is detailed in Figure 4.A feature was defined as a unique identifier for each component of a m/z, a retention time (tR) and a compensation field (CF).

3. Data analysis

Table 1: LC-ultraFAIMS-MS optimisation experiments

Orthogonality between field asymmetric ion mobility spectrometry (FAIMS), LC and MS provides additional unique compound identifiers with detection of features based on (Figure 1):• Retention time• FAIMS dispersion and

compensation fields (DF and CF)

• Mass-to-charge (m/z)

RT1IdentifiedFeature

IdentifiedFeature1

IdentifiedFeature2

m/z1+ =

RT1

LC-ultraFAIMS-MS

LC-MS

DF+CF1

m/z1

DF+CF2

m/z2

+ + =

RT1 + + =

ChromatographicPeak

Mass Spectra1 per CF

CompensationField Scan

(nCFs per s)

LC peak width

IncorporateFAIMS into

LC-MS

Mass Spectrometry

Field Asym

metric W

aveform

Liquid Chromatography

Ion Mobility Spectrom

etry

MS scanrate (s-1)

per s m/zstart

m/zend

Index StartCF (Td)

EndCF (Td)

N CFsteps

N CFactual

CF stepsize (Td)

NRepeats

StartDF (Td)

EndDF (Td)

N DFSteps

12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 250 250 0

12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 240 240 0

12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 230 230 0

24 2 80 1500 0 -0.9 4.1 10 12 0.5 1007 Test 1 0Test 1

10 1 80 1500 0 -0.9 3.1 8 10 0.5 503 Test 1 0Test 1

20 2 80 1500 0 -0.9 3.1 8 10 0.5 1007 Test 1 0Test 1

18 1 80 1500 0 -0.9 3.1 16 18 0.25 503 Test 1 0Test 1

6 1 80 1500 0 -0.9 3.1 4 6 1 503 Test 1 0Test 1

12 2 80 1500 0 -0.9 3.1 4 6 1 1007 Test 1 0Test 1

18 3 80 1500 0 0-0.9 3.1 4 6 1 1511 Test 1 Test 1

Test

1a

1b

1c

2

3a

3b

3c

3d

3e

3f

No

. of

feat

ures

3d3e3f

No

. of

feat

ures

3a3b3c

0

500

1000

1500

2000

2500

3000

3500

4000

4500

0

500

1000

1500

2000

2500

3000

3500

4000

4500

0

500

1000

1500

2000

2500

3000

3500

4000

4500

-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

No

. of

feat

ures

CF (Td)-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

CF (Td)-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

CF (Td)

3a3b3c3d3e3f

(a) (b) (c)

A recent study1 reported a threefold increase in features detected in non-targeted profiling of human urine with the addition of FAIMS to LC-MS analysis.Here we describe an optimised workflow to produce three-dimensional metabolomics data sets and its application to the metabolic assessment of indeterminate masses using urine samples from a renal cell cancer (RCC) cohort (DIAMOND study; NRES East Of England 03/018).

(a) (b)

reversed phase Poroshell 120 EC-C18 column, 2.1 x 100 mm, particle size 2.7 µm (Agilent Technologies).

LC-FAIMS-MS(Agilent.d Files)

Convert All ExtractedTIC to .mzML

Intermediate .mz5Files

Segregate Samples asPer CF TIC in Foldets

CF 1 .mzML

Feature List (1)

Extract TICsCorresponding toVarious CF Values

Done via in-housePython Script and

Proteowizard

FeatureExtraction using

XCMS in R

CF 2 .mzML

Feature List (2)

FeatureExtraction using

XCMS in R

500

0

1000

1500

2000

2500

3000

3500

4000

4500

-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 3.6 4.1

# o

f fe

atur

es

CF

250

240

230

0

100000

200000

300000

400000

500000

600000

700000

250 260 270 280 290

Retention time (s)

CF 1.6 Td

3a

3b

(b)

Inte

nsit

y

(a)

Inte

nsit

y

259

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2

2.4

x105

260 261 262 263 264 265 266 267 268 269 270 271

Counts vs. Acquisition Time (sec)

+ESI EIC(273.1390) MS(all) 1705_FAIMS-Optimisation_MS4_FAIMS3b_014.d

272 273 274 275 276 277 278 279 280 281 282

(a) (b)100%

90%

80%

70%

60%

50%

40%

30%

20%

10%

0%

Sum offeatures

No. of 3

937

2

25 64

3

8011

44

88

414

1

104

09 32

704

890

5

44

88

5084

No. of 2 Commonfeatures

No. of 1 Uniquefeatures

100%

90%

80%

70%

60%

50%

40%

30%

20%

10%

0%

Sum offeatures

No. of 3

1384

1 62

1257

99

1

328

1470

3

64

1333 99

1

40

6

No. of 2 Commonfeatures

No. of 1 Uniquefeatures

2 CF/s

1 CF/s

Mid

2 CF/s

1 CF/s

Mid

1500

1000

500

0

-500

-1000

-1500

-2000

-2500-2000 -1500 -1000 -500 0 500 1000 1500

t[2]

t[1]] = 0.186 R2X[2] = 0.152 Ellipse: Hotelling’s T2 (95%)

Control

RenalCancer

SIMCA-22Aug_RenalCancer.M2 (PCA-X)Colored according to classes in M2

1000

500

-500

0

-1000

-1500-2000 -1500 -1000 -500 0 500 1000 1500

t[2]

t[1]] = 0.1233 R2Xo[2] = 0.119 Ellipse: Hotelling’s T2 (95%)

Control

RenalCancer

SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)Scaled proportionally to R2X

Colored according to classes in M3

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

-0.3 -0.2 -0.1 0 0.1 0.2

p(c

orr

)[1]

XV

ar(M

167T

10)

p[1]R2X[1] = 0.123

Obs ID ($ClassID)

SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

-0.3 -0.2 -0.1 0 0.1 0.2

p(c

orr

)[1]

p[1]R2X[1] = 0.123

SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)

Co

ntr

ol

10000

20000

30000

40000

50000Average

-2 std. dev.

2 std. dev.

3 std. dev.

-3 std. dev.

60000

70000

80000

90000

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)

Average

-2 std. dev.

2 std. dev.

3 std. dev.

-3 std. dev.

XV

ar(M

167T

10_1

)

Obs ID ($ClassID)

Co

ntr

ol

40000

45000

50000

55000

60000

65000

70000

75000

80000

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

Co

ntr

ol

Ren

alC

ance

r

SIMCA-22Aug_RenalCancer.M3 (OPLS-DA)