Post on 30-Mar-2018
Tackling the data analysis challenge for characterisation of biotherapeutics
Carsten P Sönksen, Ph.D., Novo Nordisk
CASSS AT 2015, Berlin, March 2015
Personal background: Carsten P Sönksen
Experience:
• ~19 years with mass spectrometry
• ~14 years in an industrial setting
Responsibility:
• Senior Research Scientist, responsible for MS-based protein characterisation of biopharmaceuticals
• Dept.: CMC - analytical support, Novo Nordisk
• Implemented Genedata Expressionist for vendor-independent data processing and analysis
Case study: Characterisation of the charged isoforms of stressed IgG4
• Stressed IgG4 sample was analysed by iCIEF
• Preparative isoelectric focusing (Agilent) of fractions for characterisation by MS
• MS-characterisation by
  • Intact mass analysis LC-MS (Waters)
  • Tryptic peptide map LC-MS/MS (Thermo)
• Raw data processing, analysis, and reporting
  • Genedata Expressionist for Mass Spectrometry
iCIEF electropherograms of stressed sample and isolated preparative fractions
Challenge: Several peaks in each fraction
[Figure: overlaid iCIEF electropherograms of the load and the preparative fractions Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2 and Basic 3; x-axis: pI, 6.4 to 7.4]
Intact LC-MS data: Single sample view vs. overlay
Weak patterns become clearer
[Figure: intact mass spectrum of Acidic 3 alone, and overlay of Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1 and Basic 2; x-axis: Mass]
Peak annotations:
• Main form: +2 G0F, - 2 K, 2 Pyro-glu
• - F: - 146 Da
• - 2 Pyro-glu: + 34 Da
• + K: + 128 Da
• + 2 K: + 256 Da
• + G: + 162 Da
• + 2 G: + 324 Da
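The mass shifts annotated on the intact-mass overlay can be cross-checked against standard average residue masses. The sketch below is only an illustration: it assumes that "F" is fucose, "G" is galactose (a hexose), "K" is a lysine residue, and that pyro-Glu forms from N-terminal glutamine by loss of NH3 (which is consistent with the +34 Da quoted for two missing pyro-Glu). The function name `annotate` and the 2 Da tolerance are hypothetical choices, not part of the original workflow.

```python
# Average residue / neutral-loss masses in Da (standard values, rounded).
AVERAGE_MASS = {
    "fucose": 146.14,   # deoxyhexose residue, C6H10O4
    "hexose": 162.14,   # galactose residue, C6H10O5
    "lysine": 128.17,   # lysine residue, C6H12N2O
    "ammonia": 17.03,   # NH3 lost when N-terminal Gln cyclises (assumption)
}

# Expected shifts relative to the main form (+2 G0F, -2 K, 2 pyro-Glu).
EXPECTED_SHIFTS = {
    "-F": -AVERAGE_MASS["fucose"],
    "-2 pyro-Glu": 2 * AVERAGE_MASS["ammonia"],
    "+K": AVERAGE_MASS["lysine"],
    "+2 K": 2 * AVERAGE_MASS["lysine"],
    "+G": AVERAGE_MASS["hexose"],
    "+2 G": 2 * AVERAGE_MASS["hexose"],
}


def annotate(delta_da, tolerance_da=2.0):
    """Return the labels whose expected shift lies within the tolerance."""
    return [
        label
        for label, shift in EXPECTED_SHIFTS.items()
        if abs(shift - delta_da) <= tolerance_da
    ]


for label, shift in EXPECTED_SHIFTS.items():
    print(f"{label:>12}: {shift:+.2f} Da")
```

For example, `annotate(128.0)` returns `["+K"]`, while a shift that matches none of the listed modifications returns an empty list and would need further investigation.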
Quality check of tryptic peptide maps (LC-MS/MS): Box Plot analysis
1: Average signal intensity analysis
- check for differences
- check for abnormal runs
2: Set “Acidic 2” and “Basic 3” on the “watch list”
[Figure: box plots of signal intensity per sample: Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2, Basic 3]
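A box plot check of this kind can be approximated outside any vendor tool. The following is a minimal sketch, not the Genedata Expressionist implementation: the run names, the intensity values and the 0.3 log10 threshold are all hypothetical, and the rule "flag a run whose median deviates from the median of all runs" is just one plausible way to build a watch list.

```python
import math
import statistics


def box_stats(intensities):
    """Five-number summary of log10 intensities for one LC-MS/MS run."""
    logs = sorted(math.log10(v) for v in intensities if v > 0)
    q1, median, q3 = statistics.quantiles(logs, n=4)
    return {"min": logs[0], "q1": q1, "median": median, "q3": q3, "max": logs[-1]}


def watch_list(runs, max_shift=0.3):
    """Flag runs whose median log10 intensity is far from the overall median."""
    medians = {name: box_stats(values)["median"] for name, values in runs.items()}
    centre = statistics.median(medians.values())
    return sorted(name for name, m in medians.items() if abs(m - centre) > max_shift)


# Hypothetical peptide intensities per run (illustration only).
runs = {
    "run_1": [1.0e5, 2.0e5, 4.0e5, 8.0e5],
    "run_2": [1.0e4, 2.0e4, 4.0e4, 8.0e4],  # roughly tenfold lower load
    "run_3": [1.1e5, 2.1e5, 3.9e5, 8.2e5],
    "run_4": [0.9e5, 1.9e5, 4.2e5, 7.9e5],
    "run_5": [1.0e6, 2.0e6, 4.0e6, 8.0e6],  # roughly tenfold higher load
}

flagged = watch_list(runs)
print(flagged)
```

Working on log10 intensities keeps the comparison robust against a few very intense peptides; flagged runs are not discarded, only examined more carefully in later steps.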
Guided vs. blind analysis: Only a fraction of the peaks are identified (red crosses)
[Figure: “Acidic 3” peptide map shown as TIC chromatogram and as TIC 2D heat map]
• ~345 of 530 charge clusters were not identified!
• Further identification analysis is needed.
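One way to picture a guided analysis is as a lookup of detected masses against a list of expected peptide masses within a ppm tolerance; anything left over is a candidate for blind analysis. The sketch below works under that assumption only. The masses, the 10 ppm tolerance and the function names are hypothetical, and the real software may use a different matching logic.

```python
def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def split_identified(observed_masses, expected_masses, tol_ppm=10.0):
    """Separate observed masses into matched and unmatched lists."""
    identified, unidentified = [], []
    for mass in observed_masses:
        hit = any(abs(ppm_error(mass, ref)) <= tol_ppm for ref in expected_masses)
        (identified if hit else unidentified).append(mass)
    return identified, unidentified


# Hypothetical masses in Da (illustration only).
expected = [1000.5000, 1500.7500, 2000.0000]
observed = [1000.5050, 1500.9000, 2000.0100, 3000.3000]

identified, unidentified = split_identified(observed, expected)
print(identified, unidentified)

# Share of unidentified charge clusters quoted on the slide: 345 of 530.
print(f"{345 / 530:.0%} of charge clusters unidentified")
```

With the figures from the slide, roughly two thirds of the charge clusters remain unexplained after the guided step.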
Many unidentified high-intensity peaks still need to be identified
[Figure: two panels, labelled “Mass 5556.89, Charge 4” and “Mass 5556.89, Charge 3”]
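Seeing the same mass at two charge states is a useful consistency check, because both ions must deconvolute to one neutral mass. The sketch below shows the arithmetic, assuming that 5556.89 is a neutral mass in Da and that the ions are protonated (the slide states neither); the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and charge state."""
    return charge * (mz - PROTON_MASS)


for z in (4, 3):
    print(f"charge {z}: m/z = {mz_from_mass(5556.89, z):.4f}")
```

Under these assumptions the two ions would appear near m/z 1390.23 (charge 4) and m/z 1853.30 (charge 3), and both convert back to the same neutral mass.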
Are the detected modifications valid? Check modifications in the mass spectral data
[Figure: raw data and cleaned data views; annotated species: dominant form (HC aa 309-324), succinimide form, dehydrated form (?), deamidated form]
Absence of the C-terminal lysine (K)
[Figure: two panels across the fractions Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2 and Basic 3: peptide with C-terminal lysine on the HC, and peptide without C-terminal lysine on the HC]
Unbiased statistical analysis of charged isoforms
Contrast analysis: only the HC C-terminal lysine peptides account for the difference between the preparative fractions, at a significance of approximately P < 0.05
Distribution of deamidated and succinimide forms
[Figure: two panels across the fractions Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2 and Basic 3]
Check signal in raw data
Δ = 0.984 Da
Δ = 1.003 Da
Deamidated
Succinimide
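The two Δ values are close but not identical, which is why the check has to be made in the raw data. Numerically, 0.984 Da matches the monoisotopic shift of deamidation and 1.003 Da matches the spacing between 12C and 13C isotope peaks. The sketch below derives both from standard atomic masses; reading the slide's Δ values this way is an interpretation, and the function names are hypothetical.

```python
# Monoisotopic atomic masses in Da.
H, N, O = 1.007825, 14.003074, 15.994915
C13_MINUS_C12 = 1.003355  # spacing between neighbouring isotope peaks

SHIFTS = {
    "deamidation": O - N - H,        # Asn -> Asp/isoAsp: -NH2 replaced by -OH
    "succinimide": -(N + 3 * H),     # Asn -> succinimide: loss of NH3
    "13C isotope spacing": C13_MINUS_C12,
}


def closest_shift(delta_da):
    """Label of the reference shift nearest to an observed mass difference."""
    return min(SHIFTS, key=lambda label: abs(SHIFTS[label] - delta_da))


def mz_gap(charge):
    """m/z distance between a deamidated monoisotopic peak and the
    first 13C isotope peak of the unmodified peptide."""
    return (C13_MINUS_C12 - SHIFTS["deamidation"]) / charge


for label, shift in SHIFTS.items():
    print(f"{label:>20}: {shift:+.4f} Da")
print(closest_shift(0.984), "|", closest_shift(1.003))
print(f"gap at charge 2: {mz_gap(2):.4f} m/z")
```

The gap of about 0.019 Da (divided by the charge on the m/z axis) shows why a deamidated peak can hide under the isotope envelope of the unmodified peptide unless the resolution is sufficient.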
Distribution of N-glycosylation forms
[Figure panels: G0F, G1F, No N-glyco, G0]
Conclusion
• Loss of lysine at the C-terminus of the heavy chain explains the basic isoforms
• Deamidation and succinimide formation explain only part of the acidic isoforms
• Neutral modifications such as G0F are fractionated as well
[Figure: annotated electropherogram, x-axis 6.4 to 7.4; peak annotations: C-term Lysine; +2 G0F, - 2 K, 2 Pyro-glu; Deamidation (twice); G1F G0]
Genedata Expressionist has become our standard platform for biopharmaceutical characterisation
Why did we look beyond vendor software?
• So much data, so little time …
• Instruments from 5 vendors -> 5 software packages to learn …
• Automating standardized work and using free time to dig deeper into interesting peaks ...
Workflow-based system enables:
- Reporting of data, results and parameters
- Swift analysis of samples in parallel
- Fast iterations for reanalysis
- Excellent visualisation of data
- Confirmation of results in raw data
Data analytical recommendations
• Overlaying data processed in parallel is a simple, powerful way to verify patterns
• Box plot analysis is a simple and fast way to check peptide load integrity between samples
• Automation saves time on routine tasks, which can then be spent on unidentified high-intensity peaks
• Always verify conclusions by looking at both the processed and the raw data
• Sophisticated visualization aids unbiased and complete characterization
Thanks to
• Brian Kristensen, Novo Nordisk
• Ingelise Fabrin, Novo Nordisk
• Leif H. Bagger, Novo Nordisk
• Arnd Brandenburg, Genedata