
Page 1: AN EXPLORATION INTO THE PREDICTION OF VENTRICULAR …web.cecs.pdx.edu/~edam/Reports/2000/MTwete.pdf · A range of data to include Sinus Rhythm (SR), spontaneous Ventricular Tachycardia


AN EXPLORATION INTO THE PREDICTION OF VENTRICULAR CARDIAC ARRHYTHMIA USING TIME SERIES PREDICTION TECHNIQUES

Myles Twete, PSU ECE510LD 6/10/2000

ABSTRACT: This investigation focuses on the feasibility of short-term prediction of cardiac arrhythmia from data given in human cardiac electrograms (ECGs) for the purpose of identifying methods and tools useful in predicting the onset of life-threatening ventricular arrhythmia, particularly fibrillation (VF).

Definitions and Acronyms

AAEL: Ann Arbor Electrogram Library
AT: Atrial Tachycardia
AF, AFb: Atrial Fibrillation
AFl: Atrial Flutter
ECG, EGM, IEGM: Electrocardiogram, Electrogram, Internal Electrogram
ICD: Implantable Cardioverter-Defibrillator
NS: Non-sustained (e.g. NSVT, non-sustained ventricular tachycardia)
PM: Pacemaker
SR: Sinus Rhythm (normal heartbeat)
SVT: Supra-ventricular tachycardia (e.g. originating in atrium)
VF, VFb: Ventricular Fibrillation
VFt: Ventricular Flutter

Overview of Problem Domain and Summary of Goals

The prediction of the onset of ventricular fibrillation (VF) in the human heart has been said to be the “Holy Grail” of cardiac electrotherapy. Advanced implantable cardioverter-defibrillators (ICDs) continue to improve on arrhythmia discrimination and classification techniques (e.g. distinguishing supra-ventricular tachycardia, or SVT, which is due to a high atrial rate, from true VT) in order to avoid misclassification and mistreatment. The primary focus of these devices is on detection and confirmation of arrhythmia. Anticipating the onset of arrhythmia in advance is beyond their capability. If a reliable technique were developed to predict the onset of VT or VF under a constraint-limited scenario, the need for and frequency of high-voltage shock delivery by ICDs could be substantially reduced (given appropriate low-voltage “pacing” capture techniques).

This project investigates the possibility that the onset of arrhythmia can be predicted, confidently, under certain prior constraints (e.g. VT already detected) which can be identified by a typical ICD. It suggests applying either Neural Networks (e.g. GRNN, Time Delay NN or other ANNs) or Time Series Prediction techniques to ECG datasets for cardiac events which exhibit identifiable VT arrhythmia or arrhythmia precursors and which may or may not lead to the deadly VF. These techniques will be investigated and applied to a subset of the available arrhythmia datasets (e.g. Ann Arbor Electrogram Library data). Once a prediction model has been developed (or a neural network trained), new “test” data can be applied to the model and the results examined. The value of a prediction technique for implantable applications is determined by the extent to which its predictions are both accurate and sufficiently early. But first, it’s a good idea to look at the data.

Data Sources and Descriptions

For this investigation, EGM data was required. There are many sources of EGM data, including the MIT-BIH database, data from PhysioNet and the Ann Arbor Electrogram Libraries (AAEL, Ann Arbor, MI, USA). Since the AAEL data was available on CD from my employer, it was advantageous to start there. The AAEL data comes in a compressed format, which requires several steps to convert into the ASCII form desirable for data input and analysis tools such as Matlab or NeuralWare. A DOS batch file was created to split the multiple channels (7 channels) of data and convert them to ASCII (see Attachment “A”).

The AAEL data includes up to 7 cardiac monitoring channels, including internal leads for High Right Atrium (HRA) and Right Ventricular Apex (RVA). Additionally, up to 3 surface lead monitoring channels are included. For this investigation, internal ventricular and atrial channels (with bipolar pacemaker leads) were used for two reasons. First, the signals are localized and less noisy than external or unipolar lead signals. Second, it is desirable to have results applicable to implantable devices, which have direct access to these signals.

A range of data including Sinus Rhythm (SR), spontaneous Ventricular Tachycardia (sp-VT), spontaneous Ventricular Fibrillation (sp-VF) and Non-Sustained VT and VF was desired for this investigation. Data with Ventricular Extra-Systole (VES) events and Supra-Ventricular Tachyarrhythmia (SVT, a fast ventricular rate due to tracking a high-frequency atrium or AV node) are also useful for investigating and developing methods for analysis and prediction.

Table 1 Ann Arbor Electrogram Library Files observed in this investigation

AAEL File | Arrhythmia | Duration | Atrial Rhythm | Patient Stats
A176105 | Sp VPD (3); Induced VFt | 10; 110 (250cyc) | Sinus | M,62yr,CAD, no drugs
A176230 | Sp VFt; P SR/VPDs; SR/NSVT | 240 (125cyc) | Sinus | Same
A185641 | Sp SR/VPDs | 96 | |
A191118 | Induced VT | 810 (46cyc) | Sinus | M,63yr,CAD, Meds: Endainide, Mexiletine
A198A61 | Sp SR/VPDs | 85 | | M,85yr,CAD, Meds: Procainamide
A208480 | Sp Afb | 200 | Afb | M,52yr,CAD,AICD, No drugs
A212A05 | Sp SR/VPDs | 50 | | M,85yr,CAD, no drugs
A213A20 | Sp SR/VPDs (NSVT-PM) | 25 | segs NSVT during SR | M,53yr,CAD/bypass, No drugs
A213A25 | SR/VPDs | | | Same
A213A80 | Sp VT-PM | 240-400 (11cyc) | Sinus | Same
A213B88 | Sp SR; Sp VT-PM; Sp VT-RBB; P SR/VPDs | 6; 8; 70; 70 | Sinus; Sinus | M,63,CAD/bypass, Drugs: Quinidine
A214704 | Sp SR/VPDs | 85 | | Same
A227B07 | Sp SR | 100 | | M,53,CAD, Meds: Verapamil
A227B39 | Sp SR/NSVT; Sp SR/NSVT | 490 (5cyc); 490 (6cyc) | Sinus; Sinus | Same
A230B87 | Sp SR/VPDs | 175 | | M,61,Normal, Meds: isoproterenol, Verapamil
A239140 | Sp Afb | 115 | Afb | M, 69yr, no drugs
A241546 | I VT; P,CV SR | 10; 27 | Sinus; SR -> Afb | M,36yr,CAD, Meds: Procainamide
A245168 | Sp SR | 520 (75cyc) | | F,55yr, no drugs
A245423 | Sp VT | 270 (55cyc) | A-dislodged | Same
A288086 | Sp Afb | 25 | | M,73yr,no drugs
A291A38 | Sp SVT | 460 (101cyc) | | M,69yr, no drugs
A291A98 | Sp SR | 670 (45cyc) | | Same
A330093 | Sp SR | 790 (40cyc) | | M,65, no drugs
A330217 | Sp VT; Sp NSVT | 380 (50cyc); 380 | | Same
A343220 | Sp SR | 108 | | F,56yr,CAD, Meds: Amiodarone
A344020 | Sp SR/VPDs | 135 | | M,72yr,CAD, Meds: Amiodarone


Preliminary Data Observations and Complications

Looking at the ECG datasets considered in Table 1 and the descriptions associated with them, several observations can be made:
• ECG data is noisy; stochastic processes are involved, which complicates the modeling and prediction problem
• Noise is not necessarily Gaussian (per articles read)
• Pseudo-periodic data contains outliers (e.g. PVC, VES events, which are not necessarily precursors to continuous arrhythmia)
• ECG data is non-stationary (IEGM mean potentials and variance change)
• VT is not a prerequisite for VFb (per clinical experts), which limits the usefulness of the VT-already-detected constraint
• AAEL arrhythmias are mostly induced (ALL VFs in the datasets were induced, so no onset data is available)
• Subjects are mostly male and elderly (bias in the data and hence in any results)
• Datasets are limited to relatively short (30 sec) time segments, making it difficult to track the effect of low-frequency influences (e.g. renal flow)

Despite the limitations presented by the non-ideal datasets, there is much to gain from looking at the data. Perhaps the best place to start is with what a fairly normal heartbeat looks like, taken from the datasets in the AAEL. To limit the number and types of signals to consider, this investigation primarily uses the ventricular IEGM (vIEGM) signal, with the atrial IEGM signal used where it suits the particular method or goal. For example, while the vIEGM by itself yields the basic heart rate as the RR-interval between successive QRS complexes, it yields no direct measure of the AV-delay, which can be very useful analytically and provides direct evidence of ventricular tracking of the atrium (if an A-event corresponds with each V-event). At higher atrial rates, the AV-node begins conducting only every second atrial beat (a 2:1 ratio), then at still higher rates every third (3:1), and so on. This forms a natural protection mechanism whereby the ventricle is kept from being excited at the high rates possibly present in the atrium, especially during AFl or AF.

First, a look at approx. 30 seconds of a typical AAEL Sinus Rhythm ventricular IEGM signal:

[Plot: a227b07\c4.txt; x-axis: Samp# (ms); y-axis: Ampl]

Figure 1 Normal ventricular sinus rhythm, 30 cycles

This vIEGM sequence shows 30 complete heartbeat cycles for a 53-year-old male cardiac patient. As a data sequence and at this scale, however, it is difficult to see much more than the pseudo-periodicity of the IEGM. Plotting the same sequence as a synchronized composite of all 30 complete cycles gives a better idea of what is occurring on a beat-by-beat basis (Figure 2). From this representation, several things are clear. First, the cycles are very similar to one another (useful in modeling/bounding sinus rhythm, SR). Second, a considerable amount of high-frequency baseline noise exists. Finally, the heartbeat period (RR-interval) from the vIEGM, while fairly constant, varies by approx. +/-5% over just this 30-beat period, with few if any visible cues in the preceding noisy signal. Of note, however, is the fact that the vIEGM T-wave (the negative-going bump in the middle of the figure) varies substantially from cycle to cycle. One question comes to mind: Could the shape of the T-wave (amplitude, width, slope) provide an early clue of impending shortening of the RR-interval? If so, T-wave measurements could be used as a feature in any prediction algorithm.

[Plot: a227b07\c4.txt; x-axis: Rel. Samp# (ms); y-axis: Ampl]

Figure 2 Composite of the same 30 ventricular sinus rhythm cycles
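
The beat detection behind Figures 1 and 2 can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used for the plots: it assumes one sample per millisecond (as the axis labels suggest), borrows the threshold (180) and refractory period (200) values from Attachment “B”, and runs on a synthetic spike train rather than AAEL data. The function names are hypothetical.

```python
def detect_beats(egm, thresh, refrac):
    """Return the sample index of each detected beat peak.

    A beat starts when the signal first exceeds `thresh`; its peak is the
    largest sample within the next `refrac` samples, and detection is then
    blanked for `refrac` samples after that peak.
    """
    beats = []
    i = 0
    while i < len(egm):
        if egm[i] > thresh:
            window_end = min(i + refrac, len(egm))
            peak = max(range(i, window_end), key=lambda k: egm[k])
            beats.append(peak)
            i = peak + refrac  # skip the blanking (refractory) period
        else:
            i += 1
    return beats


def intervals_ms(beats, ms_per_sample=1.0):
    """Beat-to-beat intervals (RR for ventricular, PP for atrial events)."""
    return [(b - a) * ms_per_sample for a, b in zip(beats, beats[1:])]


def composite_cycles(egm, beats):
    """Cut the signal into one segment per complete cycle, each starting at a peak."""
    return [egm[a:b] for a, b in zip(beats, beats[1:])]


def relative_spread(intervals):
    """Half the max-min range divided by the mean, e.g. 0.05 for +/-5%."""
    mean = sum(intervals) / len(intervals)
    return (max(intervals) - min(intervals)) / (2 * mean)


# Synthetic example (NOT AAEL data): four narrow spikes on a flat baseline.
egm = [0.0] * 3000
for p in (100, 1050, 1980, 2950):
    egm[p - 1], egm[p], egm[p + 1] = 200.0, 300.0, 150.0

beats = detect_beats(egm, thresh=180, refrac=200)
rr = intervals_ms(beats)
cycles = composite_cycles(egm, beats)
print(beats, rr, len(cycles), relative_spread(rr))
```

Overlaying the segments returned by `composite_cycles` on a common relative time axis gives a composite plot of the kind shown in Figure 2.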


Taking a further look at the beat-by-beat RR- (and PP-) intervals (Figure 3) over the 30-second sequence, we can see that the PP- and RR-intervals track quite well. This is perhaps as expected, given that the ventricular beat is driven primarily through the AV-node (see the AV-delay in Figure 4) and that the heart rate is slow enough for direct tracking of the atrium by the ventricle.

[Plot: A227b07\c3.txt; x-axis: EGM Start Samp# (ms); y-axis: PP [red] & RR [blue] (ms)]

Figure 3 PP and RR intervals for same 30 SR cycles

[Plot: a227b07\c3.txt; x-axis: EGM Start Samp# (ms); y-axis: AV delay (ms)]

Figure 4 AV delay sequence for the same 30 cycles
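
An AV-delay sequence like the one in Figure 4 can be derived from the detected atrial and ventricular event times. The Python sketch below is only an illustration under stated assumptions: event times are sorted sample indices at one sample per millisecond, and the `max_delay` cutoff of 400 ms is a hypothetical parameter chosen for the example, not a value from this investigation.

```python
from bisect import bisect_right


def av_delays(a_events, v_events, max_delay=400):
    """For each ventricular event, the delay since the latest atrial event.

    A ventricular event with no atrial event in the preceding `max_delay`
    samples gets None, marking a beat that is not tracking the atrium.
    """
    delays = []
    for v in v_events:
        k = bisect_right(a_events, v) - 1  # latest A-event at or before v
        if k >= 0 and v - a_events[k] <= max_delay:
            delays.append(v - a_events[k])
        else:
            delays.append(None)
    return delays


def tracks_atrium(a_events, v_events, max_delay=400):
    """True if the events pair up 1:1, each V following its A within max_delay."""
    if len(a_events) != len(v_events):
        return False
    return all(0 <= v - a <= max_delay for a, v in zip(a_events, v_events))


# Synthetic event times (NOT AAEL data), in samples (= ms).
a_events = [0, 950, 1900]
v_events = [200, 1148, 2102]
print(av_delays(a_events, v_events), tracks_atrium(a_events, v_events))
```

A run of None entries, or a failed 1:1 check, would point to ventricular beats that do not originate from atrial conduction, as in the VT cases discussed later.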

Again, of particular note is the self-similar periodic time sequence of each heartbeat in normal sinus rhythm. Some questions arise: How does the self-similar periodic sequence change under arrhythmia conditions? Are PP, RR, AV-delay and other parameters which are easily derived from the IEGM signals useful in any prediction algorithm? What other useful indicators are available or can be derived? Typical indicators used in ICDs to identify arrhythmia include PP and RR intervals, Sudden Onset, and amplitude and period stability. Are these also useful in “predicting” arrhythmia? Taking a look now at a composite of a ventricular arrhythmia sequence (Figure 5), it is clear that in addition to the observed shortened RR-intervals, the downward slope of the QRS complex and the T-wave (the negative dip after the QRS complex) also vary considerably. Perhaps the R-wave downslope could provide early information prior to the actual RR-interval decrease.

[Plot: a330217\c4.txt; x-axis: Rel. Samp# (ms); y-axis: Ampl]

Figure 5 Non-sustained Spontaneous VT

[Plot: A176230\c3.txt; x-axis: EGM Start Samp# (ms); y-axis: PP [red] & RR [blue] (ms)]

Figure 6 PP and RR intervals showing ongoing VT with sinus atrial rhythm
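
The ICD indicators mentioned earlier (Sudden Onset and period stability) can be computed directly from an RR-interval sequence. The definitions in the following Python sketch are assumptions made for illustration only: real devices use manufacturer-specific criteria, and the window length and 20% drop fraction here are hypothetical values, not parameters taken from any ICD or from this investigation.

```python
def sudden_onset(rr, n_avg=4, drop_fraction=0.2):
    """Indices where RR falls by more than `drop_fraction` relative to the
    mean of the preceding `n_avg` intervals (assumed onset criterion)."""
    hits = []
    for i in range(n_avg, len(rr)):
        baseline = sum(rr[i - n_avg:i]) / n_avg
        if rr[i] < (1.0 - drop_fraction) * baseline:
            hits.append(i)
    return hits


def stability(rr, window=4):
    """Max-min spread of RR within each sliding window (assumed stability measure)."""
    return [max(rr[i:i + window]) - min(rr[i:i + window])
            for i in range(len(rr) - window + 1)]


# Synthetic RR sequence in ms (NOT AAEL data): sinus rhythm, then a fast run.
rr = [800, 810, 790, 800, 490, 480, 495, 500]
print(sudden_onset(rr), stability(rr))
```

Note that both measures only respond once the RR-interval has already shortened, which is why they serve as detection and confirmation criteria rather than as predictors.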


Delay Coordinate Embedding

Much has been said in the time series prediction literature about time-delay coordinates and delay coordinate embedding as a tool for identifying the dimensionality of underlying determinism in linear, non-linear and chaotic systems. Delay coordinate embedding has also been shown to be an essential part of several prediction techniques. Other representations include derivative coordinates, integral coordinates, phase-delay (T-D embedding with T=1) and a host of techniques used to try to show whether an underlying system is chaotic or not (e.g. Lyapunov exponents, Correlation Dimension, Entropy, etc.). Time-delay coordinates re-represent the original data as an array of delay vectors, where the embedding dimension “m” is the number of delayed coordinates in each vector.

With non-stationary and noisy data as in Figure 7, generating T-D phase diagrams as in Figure 8 is complicated by both factors. In this case, the noise in the data was reduced by taking a running average over several samples prior to generating the embedded coordinates. To handle the non-stationarity, subtracting a dynamic (moving) average from the data instead of the long-term average could give a better adjustment for T-D phase plots. However, it may be that the time-varying information in the non-stationary aspect of the data is itself useful in understanding the underlying dynamics of the system and perhaps even useful in prediction.
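
The steps described above (running-average smoothing, subtraction of a moving average, and construction of the delay vectors) can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names, not the code used to produce the phase diagrams; the smoothing and baseline window widths are not stated in this report, so they are left as parameters.

```python
def moving_average(x, width):
    """Trailing running average over up to `width` samples."""
    out = []
    total = 0.0
    for i, value in enumerate(x):
        total += value
        if i >= width:
            total -= x[i - width]  # drop the sample that left the window
        out.append(total / min(i + 1, width))
    return out


def detrend(x, width):
    """Subtract a dynamic (moving) average instead of the long-term mean."""
    baseline = moving_average(x, width)
    return [value - base for value, base in zip(x, baseline)]


def delay_embed(x, m, tau):
    """Rows of delay coordinates [x[n-(m-1)*tau], ..., x[n-tau], x[n]]."""
    start = (m - 1) * tau
    return [[x[n - (m - 1 - j) * tau] for j in range(m)]
            for n in range(start, len(x))]


# Tiny example: m=2 and tau=2 pair each sample with the one two steps earlier.
print(delay_embed([0, 1, 2, 3, 4, 5], m=2, tau=2))
```

With one sample per millisecond, the 8 ms delay and m=2 used for the phase diagrams in this section correspond to `delay_embed(x, m=2, tau=8)`; plotting the second column against the first gives an EGM(n) versus EGM(n-tdelay) diagram.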

[Plot: A176230\c4.txt; x-axis: Rel. Samp# (ms); y-axis: Ampl]

Figure 7 VT case showing non-stationary nature of EGM w/arrhythmia

[Plot: A176230\c3.txt; x-axis: EGM(n-tdelay); y-axis: EGM(n)]

Figure 8 Corresponding Time Delay Embedding Phase Diagram with t=8ms, m=2dim

Compare this with the resulting T-D phase diagram for the earlier sinus rhythm (AAEL A227B07):

[Plot: a227b07\c3.txt; x-axis: EGM(n-tdelay); y-axis: EGM(n)]

Figure 9 T-D Phase Diagram for Ventricular Sinus Rhythm, tdelay=8ms, m=2


Another Spontaneous VT episode represented as a 2-d T-D phase plot is seen in Figure 10 below. While this case of VT doesn’t exhibit as much of a problem with non-stationarity as Figure 8, interestingly, it bears little resemblance to either diagram and includes high frequency content the others do not.

[Plot: A213A80\c3.txt; x-axis: EGM(n-tdelay); y-axis: EGM(n)]

Figure 10 Another T-D 2-d Phase Plot of Spontaneous VT; Tdelay = 8ms, m=2


Still, none of these spontaneous VT test cases display the onset of VT (let alone VF, the original prediction goal of this investigation) from a sinus rhythm, a necessary precondition if we are to identify precursors to VT for prediction. In fact, only two of the cases in the AAEL set of files (A213B88 and A227B39) show sinus ventricular rhythm prior to VT.

Onset of VT (A213B88 case)

The first case, A213B88, is shown as a T-D phase plot in Figure 11. Again, there is little similarity with the earlier T-D phase plots. This case file contains approximately 6 seconds of sinus rhythm (shown in red) prior to the onset of 8 seconds of polymorphic VT (shown in green), followed by steady VT (shown in blue).

[Plot: A213B88\c3.txt; x-axis: EGM(n-tdelay); y-axis: EGM(n)]

Figure 11 T-D Phase Plot for Onset of VT; Red: SR, Green: PM-VT, Blue: RBB-VT

A look at the PP and RR-intervals for this case shows steady sinus rhythm (SR) for the atrium while the ventricle takes a not-so-steady dive towards tachyarrhythmia (Figure 12). Is the presence of the sudden fluctuations in the RR-interval sufficient to predict onset of VT?

[Plot: A213B88\c3.txt; x-axis: EGM Start Samp# (ms); y-axis: PP [red] & RR [blue] (ms)]

Figure 12 PP- and RR-intervals for A213B88; atrium displaying steady SR


Non-Sustained VT (A227B39 case)

For the case of A227B39, two brief episodes of non-sustained VT (NSVT) follow sinus rhythm and end in sinus rhythm (Figure 13). This is a good case to study for self-terminating VT, a situation in which intervention should not be considered necessary. The corresponding PP- and RR-intervals (Figure 14) show the two instances of sudden onset of non-sustained VT. Many patients experience such conditions without dire consequence, but could the frequency, duration or shape of this arrhythmia be useful in predicting steady arrhythmia?

[Plot: A227b39\c4.txt; x-axis: Samp# (ms); y-axis: Ampl]

Figure 13 Ventricular Sinus Rhythm with 2 episodes of NSVT

[Plot: A227b39\c3.txt; x-axis: EGM Start Samp# (ms); y-axis: PP [red] & RR [blue] (ms)]

Figure 14 PP- and RR-intervals for the SR-NSVT-SR-NSVT-SR case

The corresponding T-D phase plot is shown below in Figure 15 (VT shown in green; blue is the final SR segment). Interesting here is the fact that only the VT segments continue upwards to the right past the (150, 150) point. Clearly the shape of the NSVT (green) portion of the vIEGM data is substantially different in phase space from that of the sinus rhythm. But is there an early indicator of this onset? Again, the T-waves clearly vary in the composite vIEGM with the cycle sequence, but do they give any early indication? It is common to use sudden onset of a high rate as confirmation of real VT, but by that point VT already exists, so RR intervals alone do not appear adequate as a predictor without accepting “errors of commission”.

[Plot, panel 1: A227b39\c3.txt; x-axis: EGM(n-tdelay); y-axis: EGM(n)]
[Plot, panel 2: A227b39\c4.txt; x-axis: Rel. Samp# (ms); y-axis: Ampl]

Figure 15 T-D Phase Plot and composite vIEGM for SR-NSVT-SR-NSVT case (Green: NSVT; Blue/Red: SR)


Prediction Techniques Exploration

First of all, the data available in the AAEL datasets, including the best data observed thus far, do not provide an adequate basis for developing prediction techniques. Nevertheless, applying prediction techniques to some of this data can help in understanding their limitations and inform their later application to usable data. For convenience and for its applicability as a general regressor, I chose the General Regression Neural Network (GRNN), an artificial neural network (ANN) configuration, as implemented in NeuralWare. It was applied to short-term prediction of the actual vIEGM, given a previously trained GRNN and a test input data point in delay coordinates.

GRNN method applied to dataset A227B39 (Sinus Rhythm-NSVT-Sinus-NSVT-Sinus)

As training data, the first 12000 data point vectors of a 2-d T-D phase delay matrix (EGMval(n-Td), EGMval(n)) were used to train the GRNN. The output value is the next vIEGM value in the sequence, EGMval(n+1). The maximum number of patterns in the GRNN pattern layer was set to 400. Initially, the training cluster radius and test Gaussian sigma were set to 0.25 and 1, respectively. Later these were changed to 0.1 and 0.4, respectively, and the network was tested with the 18000 remaining data points in the dataset. The results for predicting the next value for each input pair showed 98.6% correlation and an RMS error of 0.0204. The actual and estimated values for these 18 seconds are shown below, with red being the actual (Y) EGM values and black the estimated (Yest) EGM values:

[Plot: "Y vs Yest, GRNN, ~80 clusters, Radius = .1, Test Sigma = .4"; x-axis: test sample#; y-axis: Y values]

Figure 16 Single-step predictions for GRNN trained w/12000 datapoints (A227B39a)
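
For readers without access to NeuralWare, the single-step experiment can be approximated with a small clustered GRNN written from scratch. The Python sketch below is an assumption-laden stand-in, not the NeuralWare implementation: it uses the standard GRNN estimate (a Gaussian-weighted average of training targets), a simple first-fit clustering by radius, a synthetic sine series scaled to [-1, 1] in place of EGM data, and arbitrary radius and sigma values. All class and function names are hypothetical.

```python
import math


class GRNN:
    """Clustered general regression network (kernel-weighted average of targets)."""

    def __init__(self, radius, sigma, max_patterns=400):
        self.radius = radius              # cluster radius used during training
        self.sigma = sigma                # Gaussian kernel width used for prediction
        self.max_patterns = max_patterns  # cap on pattern-layer units (must be >= 1)
        self.centers = []                 # one input vector per pattern unit
        self.sum_y = []                   # sum of targets assigned to each unit
        self.count = []                   # number of vectors assigned to each unit

    def _nearest(self, x):
        return min(range(len(self.centers)),
                   key=lambda k: math.dist(x, self.centers[k]))

    def train(self, inputs, targets):
        for x, y in zip(inputs, targets):
            k = self._nearest(x) if self.centers else None
            near = k is not None and math.dist(x, self.centers[k]) <= self.radius
            if near or len(self.centers) >= self.max_patterns:
                self.sum_y[k] += y        # fold into the nearest existing unit
                self.count[k] += 1
            else:
                self.centers.append(list(x))
                self.sum_y.append(y)
                self.count.append(1)

    def predict(self, x):
        num = den = 0.0
        for c, a, b in zip(self.centers, self.sum_y, self.count):
            w = math.exp(-math.dist(x, c) ** 2 / (2.0 * self.sigma ** 2))
            num += a * w
            den += b * w
        if den == 0.0:                    # query far from every unit: use overall mean
            return sum(self.sum_y) / sum(self.count)
        return num / den


def make_patterns(series, tau):
    """Inputs (x[n-tau], x[n]) and targets x[n+1], as in the 2-d T-D setup."""
    idx = range(tau, len(series) - 1)
    return [[series[n - tau], series[n]] for n in idx], [series[n + 1] for n in idx]


def rms_error(y, yest):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yest)) / len(y))


def correlation(y, yest):
    my, me = sum(y) / len(y), sum(yest) / len(yest)
    cov = sum((a - my) * (b - me) for a, b in zip(y, yest))
    return cov / math.sqrt(sum((a - my) ** 2 for a in y) *
                           sum((b - me) ** 2 for b in yest))


# Synthetic stand-in for a normalized EGM (NOT AAEL data).
series = [math.sin(2.0 * math.pi * n / 50.0) for n in range(600)]
inputs, targets = make_patterns(series, tau=8)
net = GRNN(radius=0.1, sigma=0.1)
net.train(inputs[:400], targets[:400])
estimates = [net.predict(x) for x in inputs[400:]]
print(len(net.centers), rms_error(targets[400:], estimates),
      correlation(targets[400:], estimates))
```

A smaller cluster radius produces more pattern units and a closer fit, while a larger sigma smooths the estimate across neighbouring units, which mirrors the trade-off between the two parameter settings tried above.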

In a second look at applying GRNN to prediction, the GRNN was trained with the same 12000 samples, but with 2 outputs to match the 2 inputs (delayed and latest). This allows incremental prediction on test data: a test data pair is applied to the input to generate a predicted output pair, which is then applied to the input, and the process is repeated. This GRNN was trained using a cluster radius of 0.2 and a Gaussian kernel of 0.3 for testing. Both train and test sets showed correlation of 97% or better with only 38 clusters in the pattern layer. Whereas the former GRNN test applied each of the remaining data points of the A227B39 file (18000 samples) to individually predict each next value, a better test of the model is to use the result of each prediction as the input for the next prediction. This was tried for a point in 2-d T-D phase coordinates for A227B39, and the predicted points were plotted against the actual data (Figure 17), beginning from a point in an R-wave at approx. (103, 230); see the uppermost peak of the red/blue SR cycles in the phase plot. The predictor does fairly well for 3 iterative predictions, then the predicted next-step direction takes a right-hand 90-degree turn away from the data. After only 8 predictions (9 data points), it is clear that the predictor (black curve) has gotten stuck and begins to fall behind the actual data (red curve).


[Plot: "Next 9 pts pred. by GRNN vs. Actual: A227B39 w/init. coords = [103.289101, 229.847397]"; x-axis: EGM(n-tdelay); y-axis: EGM(n)]

Figure 17 Step-by-step fwd prediction using GRNN-trained model (initial point is in R-wave)

This result is at least partially due to the use of 2-d T-D phase coordinates to train the GRNN, which, when clustering the training data, cannot distinguish between trajectories whose points cross the same region of space (see Figure 15). Thus, ECG trajectories crossing near (103, 230), and the next two predicted points, clustered together appropriately to yield a next-point prediction that was substantially right in direction and magnitude (following the red/blue trajectories up to the upper right of Figure 15). Once near the point (200, 200), however, the composite clustering of training data was not so clear. Enough diverse trajectories crossed this area in the T-D phase diagram to effectively create a local attractor near (190, 190) (see where the red/blue trajectories collide on the way down with the green VT trajectories). So clearly the 2-d T-D model is inadequate in regions where trajectories cross closely enough that their predictions are clustered together and lose any value. Decreasing the width of the GRNN Gaussian kernels might help, but doing so reduces the benefits of clustering and still provides no guarantee that these trajectories do not cross. Separating the trajectories adequately may require 4 or more dimensions in the T-D phase diagram.
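
The effect of crossing trajectories on iterated prediction can be reproduced on a toy series. The Python sketch below is an illustration only: it replaces the two-output NeuralWare GRNN with a single-output Gaussian kernel regression (no clustering) that feeds each prediction back through a history buffer, and it uses a synthetic four-sample cycle rather than EGM data. All names and the sigma value are assumptions.

```python
import math


def kernel_predict(train_x, train_y, query, sigma):
    """Gaussian-weighted average of training targets (unclustered GRNN estimate)."""
    num = den = 0.0
    for x, y in zip(train_x, train_y):
        w = math.exp(-math.dist(x, query) ** 2 / (2.0 * sigma ** 2))
        num += y * w
        den += w
    return num / den if den > 0.0 else sum(train_y) / len(train_y)


def make_patterns(series, m, tau):
    """m-dimensional delay vectors ending at x[n], each paired with x[n+1]."""
    idx = range((m - 1) * tau, len(series) - 1)
    xs = [[series[n - (m - 1 - j) * tau] for j in range(m)] for n in idx]
    ys = [series[n + 1] for n in idx]
    return xs, ys


def iterate_forecast(train_x, train_y, history, m, tau, steps, sigma):
    """Predict `steps` values, feeding each prediction back in as input."""
    h = list(history)  # needs at least (m-1)*tau + 1 samples
    for _ in range(steps):
        n = len(h) - 1
        query = [h[n - (m - 1 - j) * tau] for j in range(m)]
        h.append(kernel_predict(train_x, train_y, query, sigma))
    return h[len(history):]


# Toy series: the value 0 is followed by +1 on one pass and by -1 on the next,
# so in one dimension the trajectory "crosses itself" at 0.
series = [0.0, 1.0, 0.0, -1.0] * 10
history = [0.0, 1.0, 0.0, -1.0, 0.0]

x1, y1 = make_patterns(series, m=1, tau=1)
stuck = iterate_forecast(x1, y1, history, m=1, tau=1, steps=4, sigma=0.1)

x2, y2 = make_patterns(series, m=2, tau=1)
tracked = iterate_forecast(x2, y2, history, m=2, tau=1, steps=4, sigma=0.1)

print(stuck)    # averages the conflicting continuations and stays near 0
print(tracked)  # the extra delay coordinate separates the two passes
```

In the lower-dimensional embedding the predictor averages the conflicting continuations and stalls at the crossing point, much like the local attractor described above; adding one delay coordinate removes the ambiguity for this toy case. Whether 3 or 4 dimensions suffice for real EGM data remains an open question.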

Conclusions

Clearly there is much to learn and explore with ECG and EGM data. Among the challenges are the diversity of both signal and noise, the broad spectrum of ‘normal’ and arrhythmia patterns, the highly dynamic nature of the data, and the difficulties and constraints surrounding data from implantable devices. More exploration in applying time delay embedding to ECG data, both in the common prediction techniques and in the neural network context, should at least prove interesting. The limited results obtained in this brief look at the problem domain indicate that, with careful consideration of the limitations posed by both the data and the model, a limited amount of prediction should be feasible with neural, typical time-series prediction or other techniques.

A primary conclusion here is the importance of gathering much longer sets of data which span the transition from pre-arrhythmia through arrhythmia onset and on to full arrhythmia (or not). Without an adequate dataset, developing and proving a suitable predictor is risky at best.

Finally, further application of the GRNN technique to a delay coordinate representation of dimension 3 or greater would be interesting and might avoid getting stuck as seen above. This should be explored. Also, a more careful look at developing a set of patterns or cues which could be used as predictor model inputs is necessary. Throughout this investigation and report, many questions were raised regarding possible early indicators of arrhythmia. These questions should be explored either through the literature or by testing on the data. Application of nearest-neighbor techniques and stochastic methods, and consideration of real hardware filtering of signals, were not addressed herein and are open to investigation.

Footnotes and References

Tim Sauer, “Time Series Prediction by Using Delay Coordinate Embedding,” pp. 175-196, in Time Series Prediction: Forecasting the Future and Understanding the Past, Eds. Weigend and Gershenfeld, Addison-Wesley, 1993.


ATTACHMENT “A” – Batch File to convert AAEL ECG data to ASCII format

mkdir %2
cd %2
copy d:\sig\%1\%2.sig
d:\utils\splitup s=%2
rename %2.c0 %20.sig
rename %2.c1 %21.sig
rename %2.c2 %22.sig
rename %2.c3 %23.sig
rename %2.c4 %24.sig
rename %2.c5 %25.sig
rename %2.c6 %26.sig
d:\utils\sig2bin s=%20
rename %20.bin c0.bin
d:\utils\sig2bin s=%21
rename %21.bin c1.bin
d:\utils\sig2bin s=%22
rename %22.bin c2.bin
d:\utils\sig2bin s=%23
rename %23.bin c3.bin
d:\utils\sig2bin s=%24
rename %24.bin c4.bin
d:\utils\sig2bin s=%25
rename %25.bin c5.bin
d:\utils\sig2bin s=%26
rename %26.bin c6.bin
del *.sig
d:\utils\bin2asc <c0.bin >c0.txt
d:\utils\bin2asc <c1.bin >c1.txt
d:\utils\bin2asc <c2.bin >c2.txt
d:\utils\bin2asc <c3.bin >c3.txt
d:\utils\bin2asc <c4.bin >c4.txt
d:\utils\bin2asc <c5.bin >c5.txt
d:\utils\bin2asc <c6.bin >c6.txt
del *.bin
cd ..

ATTACHMENT “B” - MATLAB code to generate arrays and a/v IEGM, a/v IEGMcomp, PP/RR intervals, AV delay plots

% AAEL File: A227b39
% CASE:
% Sinus rhythm ONLY

% Bipolar Lead, Right Ventricle (channel 5, file c4.txt)
egmfile = 'A227b39\c4.txt';
%input variables to pass to 'displegm'
egmstart=0;
egmstop=0;
thresh=180;
refrac=200;
%generate EGM plots (continuous and per-cycle composite)
[EGMfull,EGM,EGMcyc,indexV,RR]=dispegmV(egmfile,egmstart,egmstop,thresh,refrac);
%generate plot of complete RR intervals vs sample#
[RR,indexV]=RRanal(RR,egmfile,indexV);

% Bipolar Lead, Right Atrium (channel 4, file c3.txt)
egmfile = 'A227b39\c3.txt';
%input variables to pass to 'displegm'
thresh=180;
refrac=200;
%generate EGM plots (continuous and per-cycle composite)
[EGMA,EGMAcyc,indexA,PP]=dispegmA(egmfile,egmstart,egmstop,thresh,refrac);
%generate plot of complete PP intervals vs sample#
[PP,indexA]=PPanal(PP,egmfile,indexA);
PPRRanal(PP,RR,egmfile,indexA,indexV);

ATTACHMENT “C” – Function to generate RR intervals and plot Composite vIEGM for the set of heartbeats

% ECE510LD Learning From Data
% Written by Myles Twete 05/2000
% Short-range Heart ECG Prediction Project
% this function performs the following:
% 1) Brings in the data from a single ASCII text ECG file
%    derived from one of the ANN ARBOR ECG Library datafiles 'egmfile'
%    Data is stored as 'EGMfull'. Input data also includes 'egmstart'
%    egmstop, thresh (amplitude threshold which identifies a Vpeak),
%    and refrac (blanking period after a detected V-peak).
% 2) Plots a sub-sequence of the EGM based on 'egmstart' and 'egmstop'
%    For egmstart = 0, it will be forced to 1.
%    For egmstop = 0, it will be forced to length(EGM)--ie. max length
% 3) Locates the QRS complex ventricular wave peak for each heartbeat
%    and creates an ECG array (EGMcyc), which is then plotted against
%    the maximum inter-beat period for the samples in the selected range.
% 4) Outputs EGMfull,EGM,sampnum,EGMcyc,relsamp,indexV,RR and RRmax
%    for use in calling function or file.
function [EGMfull,EGM,EGMcyc,indexV,RR]=displegm(egmfile,egmstart,egmstop,thresh,refrac);
[EGMfull] = dlmread(egmfile,'\t');
N = length(EGMfull);
% default for egmstart is "1" if passed as egmstart = 0;
if egmstart == 0
    egmstart = 1;
end
% default for egmstop is "N" if passed as egmstop = 0;
if egmstop == 0
    egmstop = N;
end
viewlen = egmstop-egmstart+1;
sampnum=linspace(egmstart,egmstop,viewlen);
EGM = EGMfull(egmstart:egmstop);
% shift EGM by the mean of EGMfull
EGM = EGM-mean(EGMfull);
% Plot EGM vs sampnum
figure(1);
plot(sampnum,EGM,'r');
title(egmfile);
xlabel('Samp# (ms)');
ylabel('Ampl');
% Peak Detection
% default amplitude threshold if passed as 0;
if thresh == 0
    thresh = 500; % peak detect threshold
end
new = 1;
% default refractory if passed as 0;
if refrac == 0
    refrac = 250; % refractory period
end
indexV=1;
% determine the QRS complex location for each heartbeat
for i = 1:viewlen
    if EGM(i)>thresh
        delta = i-indexV(length(indexV));
        if delta > refrac
            new = 1;
        end
        if new
            indexV = [indexV, i];
            new = 0;
        end
    end
end
% number of cycles, including first (w/o peak) and last(not ending w/peak)
cycles = length(indexV);
% compute a vector of RR intervals for all cycles, complete or not
for i=1:cycles
    if i< cycles
        RR(i) = indexV(i+1)-indexV(i); % all but last cycle
    else
        RR(i)= viewlen-indexV(i); %last,perhaps incomplete, cycle
    end
end

%compute the maximum RR interval
RRmax = max(RR);
%range to create an array of EGMs over these RR intervals
relsamp = [1:RRmax];
%declare the EGMcyc array to cover #cycles and range to RRmax
EGMcyc=zeros(cycles,RRmax);
%create the EGMcyc array
%outer 1:length(indexV)
for i=1:cycles
    cyc = indexV(i);
    %make each vector dimension = RRmax
    for j=1:RRmax
        %handle the last cycle, which will be short
        if i==cycles
            if (cyc+j-1) < viewlen
                EGMcyc(i,j)= EGM(cyc+j-1);
            else
                EGMcyc(i,j) = 0;
            end
        %not the last cycle
        else
            if (cyc+j-1) < indexV(i+1)
                EGMcyc(i,j)= EGM(cyc+j-1);
            else
                if j~=1
                    EGMcyc(i,j) = EGMcyc(i,j-1); %HOLD if end of RR
                else
                    EGMcyc(i,j)=0;
                end
            end
        end
    end
end

%now print the composite of all the complete ventricular cycles
figure(2);
plot(relsamp,EGMcyc(2:cycles-1,relsamp),'r');
title(egmfile);
xlabel('Rel. Samp# (ms)');
ylabel('Ampl');

ATTACHMENT “D” – Matlab code to generate Delay Coordinate array and plot

% ECE510LD---myles twete
% ECG Prediction Project, Time Delay Embedding Vector Creation
function [EGMdel] = TDembed(EGM,egmfile,tdelay,Dembed);
% smooth the EGM for plotting with a running average; as written, the
% window is the 3 samples i-2..i (not 5), applied in place from sample 5 on
for i=1:length(EGM)
    if i>=5
        EGM(i)=mean(EGM(i-2:i));
    end
end
% define the delayed embed
c=tdelay*(Dembed-1)+1;
d = length(EGM);
embedrange = [d-c+1:d];
EGMdel = zeros(d-c+1,Dembed);
for i=c:d
    minval = i-c+1;
    k=tdelay;
    if k==0
        k=1;
    end
    EGMdel(minval,:) = EGM(i-tdelay*(Dembed-1):k:i)';
end
%plot(embedrange,EGMdel(testdelay,embedrange),'r');
figure(9);
if Dembed == 1
    plot(EGMdel(:,1),'r');
    xlabel('EGM(n)');
elseif Dembed==2

    plot(EGMdel(:,1),EGMdel(:,2),'r');
    xlabel('EGM(n-tdelay)');
    ylabel('EGM(n)');
elseif Dembed == 3
    plot3(EGMdel(:,1),EGMdel(:,2),EGMdel(:,3),'r');
    grid on;
    view(20,20);
    xlabel('EGM(n-2*tdelay)');
    ylabel('EGM(n-tdelay)');
    zlabel('EGM(n)');
end
title(egmfile);

ATTACHMENT “D” – GRNN Next-Point-prediction Matlab plot code (data results “*.nnr” from Neuralware)

% ECE510LD---myles twete
% Project GRNN prediction,
% using a Gaussian Kernel
function grnnproj
% NeuralWare results from datatst.nnr
[ARRAY2] = sortrows(dlmread('A227B39b.nnr','\t'),1);
Ntst = [1:length(ARRAY2(:,1))];
Xtst = ARRAY2(:,2);
Ytst = ARRAY2(:,3);
% Now plot the best fit vs the original data
figure(2);
plot(Ntst,Xtst,'-k',Ntst,Ytst,'-r');
title('Y vs Yest, GRNN, ~80 clusters, Radius = .1, Test Sigma = .4');
xlabel('test sample#');
ylabel('Y values');
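
The code above only plots results exported from NeuralWare; the GRNN estimate itself is not shown. For readers without that tool, the following Python sketch shows the basic Gaussian-kernel form of a GRNN (a kernel-weighted average of stored targets) applied to next-point prediction from delay vectors. It is an illustration under stated assumptions, not a reproduction of the NeuralWare run: it stores every training pattern rather than roughly 80 clusters, the function names and parameter values are hypothetical, and `sigma` only loosely corresponds to the "Test Sigma" quoted in the plot title.

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """Gaussian-kernel weighted average of train_y (basic GRNN estimate)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, query)) / (2.0 * sigma ** 2))
        for x in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed (query far from all patterns): use the mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

def next_point_patterns(series, tdelay, dembed):
    """Build (delay vector, next sample) training pairs from a series."""
    span = tdelay * (dembed - 1)
    xs, ys = [], []
    for n in range(span, len(series) - 1):
        xs.append([series[n - span + k * tdelay] for k in range(dembed)])
        ys.append(series[n + 1])
    return xs, ys

# Hypothetical usage on a synthetic sine wave (not ECG data).
series = [math.sin(2 * math.pi * n / 40) for n in range(200)]
xs, ys = next_point_patterns(series, tdelay=3, dembed=3)
query = [series[-7], series[-4], series[-1]]   # most recent delay vector
estimate = grnn_predict(xs, ys, query, sigma=0.05)
```

A smaller `sigma` makes the estimate follow the closest stored patterns more tightly, while a larger one averages over more of the training set; clustering the patterns first, as the NeuralWare run did, trades some accuracy for a smaller stored set.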