Probabilistic Models for Automated
ECG Interval Analysis in Phase 1 Studies
Technical Report BSP 08-01
Nicholas P. Hughes and Lionel Tarassenko
Institute of Biomedical Engineering
University of Oxford
ORCRB, Off Roosevelt Drive
Oxford OX3 7DQ, UK
Email: [email protected]
Tel: +44 (0)1865 617674
Fax: +44(0)1865 617701
Abstract
The accurate measurement and assessment of the various ECG intervals, and in particular the QT interval, is
currently the gold standard for evaluating the cardiac safety of new drugs. Automated methods for ECG interval
analysis offer the potential for a more extensive evaluation of ECG data from clinical trials, and hence a more
robust assessment of drug safety. In this paper we consider the use of hidden Markov models (HMMs) for this
task, with a focus on the analysis of ECGs from phase 1 studies. We introduce a novel sample-wise encoding of
the ECG based on the undecimated wavelet transform and derivative features, and show how the robustness of the
HMM segmentations can be greatly improved through the use of state duration constraints. Most significantly, we
show how the probabilistic nature of the HMM can be leveraged to provide a confidence measure in the model
segmentations. This enables our algorithm to automatically highlight any ECG interval measurements which are
potentially unreliable. The performance of our approach is evaluated on a substantial ECG data set from a clinical
study, where we demonstrate that our algorithm is able to produce QT interval measurements to a similar level of
accuracy as human experts.
Hughes & Tarassenko Technical Report BSP 08-01
INTRODUCTION
Drug safety is an area of increasing concern for both pharmaceutical companies and regulatory agencies. Of
particular importance is the propensity for some non-cardiac drugs (such as antihistamines) to induce potentially
fatal heart rhythms. The primary means by which this aspect of cardiac safety is assessed in clinical drug trials is
through the analysis of the 12-lead electrocardiogram (ECG), and in particular, the various timing intervals within
each individual heartbeat.
Fig. 1 shows a typical ECG waveform and the standard ECG intervals occurring within a given beat: namely, the
PR interval, QRS duration (QRSd) and QT interval. Of particular significance is the QT interval, which is defined
as the time from the start of the QRS complex to the end of the T wave (i.e. Toff − Q). This value corresponds
to the total duration of electrical activity (both depolarisation and repolarisation) within the ventricles in a given
heartbeat. The importance of the QT interval stems from its link to the cardiac arrhythmia torsade de pointes [1].
This arrhythmia is known to occur when the QT interval is prolonged significantly (so-called long QT syndrome),
which in turn can lead to ventricular fibrillation and sudden cardiac death.
[FIGURE 1 about here.]
Recently, the propensity for some non-cardiac drugs to prolong the QT interval, and hence cause torsade de
pointes, has begun to emerge [2]. This issue was first highlighted in the case of the antihistamine terfenadine,
which had the side-effect of significantly prolonging the QT interval in a number of patients. Unfortunately this
effect was not adequately characterised in the clinical trials and only came to light after a large number of people
had unexpectedly died whilst taking the drug.
In order to address the issue of drug-induced QT prolongation, the U. S. Food and Drug Administration (FDA)
and Health Canada recently introduced a new type of phase 1 clinical study known as a “thorough phase 1 QT/QTc
study” [3]. These studies are now a mandatory requirement for all new drug submissions and must be performed
before approval can be granted by a regulatory body. A key requirement of the thorough QT study is the collection
of a large number of ECGs (typically 20,000 or more), and the need for highly accurate and reliable QT interval
measurements.
At the present time, ECG interval measurements for clinical trials are carried out manually by expert analysts.
This procedure is both costly and labour intensive. Furthermore, manual ECG interval measurements are
susceptible to occasional mistakes by the analysts. These drawbacks typically limit the total number of electrocardiograms
which can be successfully analysed in a given trial. Automated methods for ECG interval analysis therefore offer
the potential for a more extensive evaluation of ECG data from clinical trials, and hence a more robust assessment
of drug safety.
Previous Approaches to Automated ECG Analysis
Despite the advantages offered by automated approaches to ECG interval analysis, current automated systems are
not sufficiently reliable to be used for the assessment of ECG data from clinical trials. The algorithms which
underlie these systems typically fall into one of two categories: rule-based approaches and model-based approaches.
Most rule-based algorithms divide the ECG segmentation problem into a number of distinct stages. In the
first stage, the locations of the R peaks in the 10-second ECG signal are estimated using a QRS detector, such as
the Pan and Tompkins algorithm [4]. Once the R peaks have been identified, the next step is to search forwards
and backwards from each peak to locate the QRS complex boundaries (i.e. the Q and J points). These points are
typically determined as the first points (searching forwards or backwards from the peak) where the signal (or its
derivative) crosses a pre-determined threshold [5].
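This kind of forward/backward search from an R peak can be sketched as follows. This is an illustrative Python implementation, not the algorithm of [5]: the threshold fraction `frac` and the window `win_ms` are hypothetical choices.

```python
import numpy as np

def qrs_boundaries(ecg, r_peak, fs=500, frac=0.1, win_ms=100):
    """Search backwards/forwards from an R peak for the Q and J points,
    taken here as the first samples where the absolute derivative drops
    below a fraction of its maximum within a window around the QRS.
    frac and win_ms are illustrative values, not ones from the report."""
    slope = np.abs(np.diff(ecg))
    w = int(win_ms * fs / 1000)
    lo, hi = max(r_peak - w, 0), min(r_peak + w, len(slope))
    thresh = frac * slope[lo:hi].max()

    q = r_peak
    while q > lo and slope[q - 1] >= thresh:   # backwards search -> Q point
        q -= 1
    j = r_peak
    while j < hi - 1 and slope[j] >= thresh:   # forwards search -> J point
        j += 1
    return q, j
```

On a clean beat the search stops where the signal flattens out on either side of the QRS complex; on noisy signals the threshold crossing is far less stable, which is one motivation for the template and wavelet methods described next.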
Following the identification of the QRS boundaries, the next stage is to search for the peaks of the P and T
waves. These can be estimated as the points with the maximum absolute amplitude (relative to the Q or J points)
occurring within a given window to the left and the right of the R peak [5]. An alternative approach is to use
template matching techniques to locate the peaks [6]. The advantage of this approach is that it offers an increased
degree of robustness to noise in the ECG signal, compared with thresholding methods. The drawback however is
that it requires the definition of one or more templates for the given waveform feature.
Following the determination of the P and T wave peaks, the final stage is to locate the P and T wave boundaries.
Threshold methods represent the most common approach for this problem. The standard thresholding algorithm
for ECG segmentation, known as “ECGPUWAVE”, applies a series of adaptive thresholds to the ECG derivative in
order to determine the waveform feature boundaries [7]. In the case of the T wave offset, the associated threshold
is first estimated from the peak of the ECG signal derivative within a given window (following the T wave peak).
The derivative value at this point is then scaled according to a pre-determined scaling factor k. The resulting value
then defines the threshold for detecting the T wave offset in the ECG derivative.
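A rough sketch of this adaptive-threshold scheme is given below. The scaling factor k = 0.3 and the search window are illustrative assumptions, not the values used by ECGPUWAVE.

```python
import numpy as np

def t_offset_threshold(ecg, t_peak, k=0.3, fs=500, win_ms=160):
    """Adaptive-threshold T-offset detection in the spirit of ECGPUWAVE:
    scale the peak post-T-wave derivative by a factor k, then take the
    first sample where the falling derivative drops below that threshold.
    k and win_ms are illustrative values, not ECGPUWAVE's."""
    end = min(t_peak + int(win_ms * fs / 1000), len(ecg) - 1)
    d = np.abs(np.gradient(ecg))[t_peak:end]
    peak = int(np.argmax(d))          # steepest point of the T-wave downslope
    thresh = k * d[peak]              # adaptive threshold, scaled from the peak slope
    for i in range(peak, len(d)):
        if d[i] < thresh:
            return t_peak + i         # T-wave offset, in samples
    return end
```

Because the threshold is derived from the signal's own derivative, the method adapts to T waves of different slopes; its weakness, as with all derivative thresholding, is sensitivity to high-frequency noise.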
An alternative rule-based approach to thresholding is given by the tangent method. This approach was first
introduced more than fifty years ago as a manual technique for T wave offset determination [8]. More recently it has
been used as the basis for a number of automated techniques [6]. The tangent method determines the end of the T
wave as the point of intersection between the (estimated) isoelectric baseline and the tangent to the downslope of
the T wave (or the upslope for inverted T waves). The tangent itself is computed at the point of maximum
(absolute) gradient following the peak of the T wave. The isoelectric baseline is typically estimated from the region of
baseline between the P wave offset and the QRS complex onset.
A significant disadvantage of the tangent method is that it is sensitive to the amplitude of the T wave. In
particular, large amplitude T waves can cause the tangent to intersect the isoelectric baseline before the true end of
the T wave. As a result, the automated QT intervals from the tangent method can significantly underestimate the
“true” QT interval values (as determined by an expert analyst). To overcome this problem, Xu and Reddy apply
a “nonlinear correction” factor (based on the T wave amplitude) to the T wave offset determined by the tangent
method [6].
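The tangent construction itself can be sketched as follows. This is an illustrative implementation: the search window is an assumption, the baseline is supplied by the caller, and real T waves are of course not piecewise linear.

```python
import numpy as np

def t_offset_tangent(ecg, t_peak, baseline, fs=500, win_ms=120):
    """Tangent method: take the tangent at the point of steepest (absolute)
    gradient after the T-wave peak and intersect it with the isoelectric
    baseline. win_ms bounds the search and is an illustrative choice."""
    end = min(t_peak + int(win_ms * fs / 1000), len(ecg) - 1)
    grad = np.gradient(ecg)
    seg = grad[t_peak:end]
    k = t_peak + int(np.argmax(np.abs(seg)))   # steepest point on the downslope
    m = grad[k]
    # tangent line: y = ecg[k] + m * (x - k); solve for y = baseline
    return k + (baseline - ecg[k]) / m         # T-wave offset, in (fractional) samples
```

Note how the amplitude sensitivity discussed above arises directly from the geometry: a taller T wave with the same terminal slope produces a steeper tangent that reaches the baseline earlier.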
The application of wavelet transforms to the analysis of the ECG has been considered by a number of authors.
In the context of ECG segmentation, wavelets are generally used in combination with standard thresholding
methods [9, 10] (i.e. the thresholding is carried out in the wavelet domain). The advantage of wavelet thresholding
approaches over similar methods which operate on the ECG signal itself (or its derivative) is that the resulting
segmentations tend to be less sensitive to the effects of noise and artefact in the signal.
An alternative to the rule-based approaches to ECG signal analysis is to construct a statistical model of the
ECG. This approach was first investigated by Coast et al., who demonstrated that hidden Markov models (HMMs)
can be applied successfully to the problem of cardiac arrhythmia analysis [11]. To address this problem, the
authors trained hidden Markov models (with Gaussian observation densities) to recognise ECG beats from three
different types of heart rhythms. For each model, a bandpass-filtered version of a single channel ECG signal was
used as the input observation sequence to the HMM. Separate HMMs were trained on ECG beats produced by a
normal heart rhythm, a ventricular heart rhythm, and a supraventricular heart rhythm. The three HMMs were then
connected together to form a single “global” model for continuous ECG analysis.
For HMM training, Coast et al. used a patient-specific training strategy to estimate the model parameters for
each of the three HMMs. In particular, for a given patient, the individual HMMs were trained using ECG data
recorded from that patient only. In the first step of the training procedure, three ECG beats associated with each of
the three different heart rhythms were identified manually in the ECG recording for the given patient. These beats
were then segmented manually and the annotated ECG beats for each type of heart rhythm were used to initialise
the parameters of the corresponding model. The model parameters for each HMM were then refined by running
the EM algorithm on the three ECG beats associated with that particular model.
More recently, Andreao et al. have applied HMMs to the problem of ECG beat segmentation and classification
for online ambulatory ECG analysis [12]. The authors used multiple hidden Markov models, with different HMMs
employed to model different ECG waveform morphologies. Each individual HMM consisted of a left-right Markov
chain with 3 states used for each of the isoelectric baseline, P wave and QRS complex regions of the ECG, 2 states
for the PQ and ST segments, and 6 states for the T wave. The number of states was chosen in proportion to the
average duration of each ECG feature. The observation models for each of the hidden states were chosen to be
Gaussian densities.
For each model, Andreao et al. used a continuous wavelet transform representation of the ECG signal as the
input observation sequence to the HMM. Specifically, the authors employed a Mexican Hat wavelet function, and
used only the wavelet coefficients evaluated at the dyadic scales 2^2, 2^3, and 2^4. The HMMs were then trained in an
unsupervised manner with the EM algorithm, using ECG data taken from the MIT-BIH QT Database.
A limitation of the approach of Andreao et al. is that the wavelet representation used does not encode the full
spectral content of each ECG. By selecting the coefficients from only a small number of scales of the continuous
wavelet transform, the resulting encoding only captures the ECG signal content at the particular frequency bands
corresponding to the given scales. As a result, waveform feature boundaries which consist of sharp transitions or
“discontinuities” in the ECG signal (such as the Q and J points), and which therefore give rise to spectral content
over a wide range of different frequency bands, may not be detected as accurately as possible (compared with an
HMM trained on a complete representation of the data).
In the context of ECG interval analysis, a more serious limitation of the approaches of both Coast et al. and
Andreao et al. is the use of unsupervised learning to estimate the hidden Markov model parameters. Although this
is a common approach in many applications of HMMs (particularly in the field of automatic speech recognition,
where it is not feasible to collect large volumes of labelled speech data for supervised learning), it is not guaranteed
that the resulting model will produce ECG segmentations consistent with those of expert analysts. In particular,
unsupervised learning with the EM algorithm merely seeks to find the model parameters which maximise the
likelihood of the data. These parameter estimates may be very different from those which provide optimal segmentation
performance (with respect to expert measurements). Thus, a more appropriate strategy is to estimate the HMM
parameters in a supervised manner using ECG data annotated by expert cardiologists. This is the approach taken
in this paper.
Shortcomings of Current Automated Systems
As discussed previously, current automated systems are not sufficiently robust to be used for the analysis of ECG
data from clinical drug studies. This lack of robustness can be traced back to the approach to automated ECG
interval analysis which underlies all current automated systems. Specifically, traditional ECG interval analysis
algorithms are designed to compute ECG interval measurements regardless of the suitability for measurement of
the particular signal under consideration.
This represents a considerable drawback since not all the ECG signals recorded during the course of a clinical
study are suitable for accurate and reliable interval measurements (either by machine or by human expert). As
noted by Fenichel et al., “Some ECGs are technically defective (with muscle artifact, misplaced leads, or other
problems), and one should not attempt to derive useful data from them. [...] Even in carefully conducted studies,
about 1 in every 30+ ECGs does not lead to meaningfully measurable data.” [1].
The issue of the “suitability for measurement” of ECG signals is complicated since there exists a continuous
spectrum between extreme cases (such as ECG “signals” recorded when the electrodes have been disconnected
from the machine) and those where the signal is affected by only a small amount of noise. As such, “it is very
difficult (if not impossible) to suggest any criteria for quick visual inspection of the ECGs in order to distinguish
cases in which the automatic assessment should be excluded” [13].
Since a standard automated system will provide ECG interval measurements for any input ECG signal (regardless
of the ECG signal quality or waveform normality), it is not possible therefore to differentiate between those
automated measurements which are reliable and those which are not. This is the fundamental problem which
hinders current approaches to automated ECG interval analysis and prevents automated systems from being utilised in
the analysis of ECG data from clinical drug studies.
Overview and Contributions of this Paper
In this paper we consider the use of probabilistic models for automated ECG interval analysis, with a focus on the
analysis of 10-second 12-lead ECGs from thorough phase 1 QT/QTc studies. In particular, we show how hidden
Markov models can be successfully applied to this problem through a careful consideration of the choice of ECG
representation and model architecture. Most significantly, we show how the probabilistic generative nature of the
HMM can be leveraged to provide a confidence measure in the model segmentations.
Motivated by the observation independence assumption inherent in the HMM framework, we introduce a novel
sample-wise ECG encoding that is well suited to analysis by an HMM. This encoding, which is based on a
combination of the undecimated wavelet transform and its associated derivative features, provides a representation of the
ECG which facilitates accurate segmentations by a hidden Markov model. In addition, we analyse the problem of
“double-beat segmentations” by an HMM, and show how the use of minimum state duration constraints serves to
improve significantly the robustness of the model segmentations.
Although the use of hidden Markov models for ECG analysis has been explored previously in the literature [11,
12], the full probabilistic description these models provide has not previously been fully exploited. To this end, we
show how an HMM can be used to provide a statistical confidence measure in its analysis of a given ECG waveform.
The level of confidence in the model segmentations can be assessed by evaluating the HMM joint log likelihood for
each segmented ECG beat, with respect to the beat duration. The resulting confidence measures can then be used
to differentiate between normal ECG waveforms (for which automated measurements may be reliably inferred),
and abnormal or unusual ECG waveforms (for which automated measurements are frequently unreliable). Thus,
by utilising a confidence-based approach to automated ECG interval analysis, we can automatically highlight those
waveforms which are least suitable for analysis by machine (and thus most in need of analysis by a human expert).
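The idea can be illustrated with a minimal sketch. The per-sample normalisation of the joint log likelihood and the way the threshold is chosen are illustrative assumptions, not the report's exact procedure.

```python
def beat_confidence(joint_loglik, n_samples):
    """Confidence score for one segmented beat: the joint log likelihood of
    the Viterbi segmentation, normalised by the beat duration in samples so
    that beats of different lengths are comparable. Sketch only."""
    return joint_loglik / n_samples

def flag_unreliable(confidences, threshold):
    """Indices of beats whose confidence falls below a threshold, which
    might for example be chosen from the scores of known-good beats."""
    return [i for i, c in enumerate(confidences) if c < threshold]
```

Beats that the model finds improbable under its learned statistics of normal morphology receive low scores and are routed to a human expert rather than reported automatically.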
We evaluate our approach on a substantial ECG data set from a thorough phase 1 QT/QTc study, and compare
the performance with that of a standard threshold-based algorithm for ECG analysis. The results demonstrate the
ability of our algorithm to overcome many of the problems associated with existing automated systems.
Furthermore, we show that our algorithm is able to produce QT interval measurements to a similar level of accuracy as
those of expert ECG analysts (taking into account the inter-analyst variability inherent in manual QT
measurements).
THOROUGH PHASE 1 QT/QTc STUDIES
Overview
The phase 1 clinical study represents the first time that a drug being tested is administered to humans. The primary
aim of a phase 1 study is to assess the safety of the drug. In particular, the study is designed to assess what happens
to the drug in the human body, i.e. how it is absorbed, metabolised, and excreted. In addition, the phase 1 study
aims to uncover any possible side-effects which may occur as the dosage levels of the drug are increased.
An important aspect of phase 1 studies is that they are usually carried out in a relatively small number of healthy
normal volunteers, rather than the target population for the drug (who are the focus of subsequent phase 2 and phase
3 studies). Healthy normal volunteers are typically males between the ages of 18 and 45 years old, or females who
are not of child-bearing potential.
The thorough or definitive phase 1 QT/QTc study is a specific type of phase 1 study designed to provide a
robust assessment of the effect of a drug on the QT and QTc intervals [3]. These studies are characterised by the
following requirements:
• A large number of ECGs recorded at multiple time points, both at baseline and on drug
• A wide range of different doses of the drug under study
• The use of a positive control to assess the sensitivity of the study
In common with standard phase 1 studies, thorough phase 1 QT/QTc studies are usually carried out in healthy
normal volunteers. As a result, the electrocardiogram signals recorded over the course of a thorough QT study
are generally normal and exhibit standard waveform morphologies. In some instances however, the drug under
consideration may affect the morphology of the ECG in an unusual manner (particularly at high dose levels).
An important aspect of the thorough QT study is that it is designed to detect small changes in the QT/QTc
interval, rather than instances of torsade de pointes (or other cardiac arrhythmias) which may very occasionally
result from a prolongation of the QT interval. The reason for this focus is that torsade de pointes is extremely rare,
and hence little reassurance is provided by the lack of observation of this arrhythmia during the clinical trials [1].
However, because torsade de pointes is preceded by QT interval prolongation, measurement of the QT interval (and
any drug-induced changes thereof) can be used as a “surrogate marker” to characterise the pro-arrhythmic potential
of the drug under consideration.
Since the thorough QT study requires that multiple ECGs are recorded at multiple time points, both at baseline
and on drug (or placebo), the total number of ECGs is typically much greater than that of a standard phase 1 study.
In particular, a thorough QT study typically produces between 10,000 and 20,000 standard 10-second 12-lead
ECGs, although some studies can give rise to more than 40,000 ECGs [14].
At the present time, the ECG signals resulting from a thorough phase 1 QT/QTc study must be analysed
manually by human experts. The focus of this analysis is on the accurate measurement and assessment of the
various ECG intervals. The precision with which the ECG intervals are measured is of particular importance
because of the need to detect small changes in the QT interval. This requirement, together with the large quantity
of ECG data generated by such studies, is driving an increasing demand for robust and reliable automated ECG
interval analysis technologies.
Data Set
The data set used in this paper is taken from the placebo arm of a thorough phase 1 QT/QTc study. The data set
consists of approximately 20,000 10-second 12-lead ECGs which were recorded (at a sampling rate of 500 Hz)
from 380 healthy normal volunteers.
All of the ECGs collected in the study were analysed manually by a group of expert analysts. Specifically, each
expert analyst was assigned to a unique subset of the subjects in the study and analysed only the ECG waveforms
from these subjects.
For each 10-second 12-lead ECG, the longest lead was chosen for ECG interval measurements. With this
approach, the expert analyst first determined (by visual inspection) the lead which contained the longest QT interval
values, and this lead was then selected for ECG beat annotation. Three consecutive beats were annotated by the
expert (in a small number of cases only two beats could be annotated), who identified the following points within
each beat:
• P wave onset (Pon)
• QRS onset (Q)
• R peak (R)
• QRS offset (J)
• T wave offset (Toff )
The leads most frequently annotated in the study were limb lead II and chest lead V2. These two leads are also
commonly employed in ECG interval analysis when measurements must be performed on a specific lead (which
is selected in advance) across an entire ECG data set. Given the importance of leads II and V2 for ECG interval
analysis, we focus therefore in the remainder of this paper on the development of probabilistic models for ECG
signals recorded from these two leads. However, all of the techniques presented in this paper can be applied to any
given lead of the ECG. Table 1 shows the number of unique ECG signals and annotated beats for leads II and V2
of the thorough QT study data set.
[TABLE 1 about here.]
Note that, whilst a number of previous papers on automated ECG analysis have made use of publicly available
ECG databases such as the MIT-BIH QT Database [15] and the Common Standards for Electrocardiography (CSE)
Database [16], the ECG signals contained in these databases are not representative of those which are typically
encountered in phase 1 clinical studies. This is because these ECGs were recorded from patient populations
(as opposed to healthy normals), and therefore include a variety of beat morphologies, as well as other cardiac
abnormalities. Hence, these databases are most appropriate for assessing the robustness of ECG analysis algorithms
designed to work with a wide range of both normal and abnormal ECG waveforms. By contrast, the focus of this
paper is on the development of techniques for the accurate measurement and assessment of the QT interval in large
volumes of ECG data from healthy normals. Hence, we choose to evaluate our approach on the thorough QT data
set described above.
PROBABILISTIC MODELLING OF THE ECG
Before we can apply the probabilistic modelling approach to the problem of automated ECG interval analysis, it is
necessary to decide upon a suitable choice for the model itself. Specifically, we require a model that can be used
to segment the ECG signal and which can also take advantage of the sequence ordering which exists between the
waveform features of normal ECG heartbeats. These requirements motivate the use of a hidden Markov model for
this problem.
In this section we provide an overview of hidden Markov models and the associated algorithms for learning and
inference. We then compare and contrast a number of different model architectures for ECG interval analysis, and
discuss the key assumptions inherent in the HMM framework and the validity of these assumptions in the context
of ECG signal modelling.
Overview of Hidden Markov Models
A hidden Markov model, or HMM, is a probabilistic model which describes the statistical relationship between an
observable sequence O and an unobservable or “hidden” state sequence S. The hidden state itself is discrete and
governed by an underlying Markov chain. The observation values however may be either continuous or discrete in
nature [17].
An HMM (with K hidden states) is defined by the following three parameters: an initial state distribution π, a
state transition matrix A (with elements a_ij), and a set of observation probability distributions b_k (for each state k).
The first two parameters govern the Markov model which describes the statistical properties of the hidden states.
The observation or “emission” probability distributions provide the link between the statistical properties of the
observable values and the associated hidden states.
It is often useful to consider a hidden Markov model from a generative perspective. That is, we can consider the
HMM as providing a “bottom up” description of how the observed sequence O is produced or generated. Viewed
as a generative model, the operation of an HMM is as follows:
❶ Select the initial state k by sampling from the initial state distribution π
❷ Generate an observation value from this state by sampling from the associated observation distribution b_k
❸ Select the state at the next time step according to the transition matrix A
❹ Return to step ❷
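Steps ❶–❹ translate directly into code. The sketch below assumes scalar Gaussian observation densities (as in the models of Coast et al. discussed earlier); the parameter values passed in are the caller's, not anything fitted to ECG data.

```python
import numpy as np

def sample_hmm(pi, A, means, stds, T, seed=0):
    """Generate T observations from an HMM with Gaussian emission
    densities: sample the initial state from pi (step 1), emit from that
    state's Gaussian (step 2), transition according to A (step 3), repeat."""
    rng = np.random.default_rng(seed)
    pi, A = np.asarray(pi), np.asarray(A)
    states = np.empty(T, dtype=int)
    obs = np.empty(T)
    s = rng.choice(len(pi), p=pi)                 # step 1: initial state
    for t in range(T):
        states[t] = s
        obs[t] = rng.normal(means[s], stds[s])    # step 2: emit observation
        s = rng.choice(len(pi), p=A[s])           # step 3: next state
    return states, obs
```

Running the generator with a left-right transition matrix produces waveform-like sequences that dwell in each state for a geometrically distributed number of samples, which is exactly the duration behaviour examined later in the paper.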
Hidden Markov models can be seen as generalisations of various statistical models [18]. One such view is
to consider an HMM as a form of temporal mixture model. With a standard (static) mixture model, each data
point is considered to have been “generated” by one of the K mixture components independently of the other data
points [19]. If we now relax the strong assumption of statistical independence between data points and allow the
individual mixture components to possess Markovian dynamics, the result is a hidden Markov model (where the
hidden state at each time step corresponds to the particular mixture component which is active at that time step).
From a purely statistical perspective, an HMM (with parameter set λ) defines a joint probability distribution
over observation sequences O and hidden state sequences S, i.e. p(O, S | λ). Given this joint distribution, an
important quantity of interest is the conditional distribution P(S | O, λ). In particular, it is often of interest
to find the particular state sequence S* which maximises this conditional distribution. This corresponds to the
state sequence which is most likely to have “generated” the given observation sequence, and therefore enables us
to associate particular observation “segments” with particular model states (i.e. signal segmentation). However,
before the model can be used for this purpose, we require a method to learn the “optimal” model parameters from
a given data set (such that the HMM provides a useful statistical model of the data). We now briefly consider the
solution to each of these problems in turn.
Learning in HMMs
The learning problem in hidden Markov models is concerned with the estimation of the “optimal” model parameters
given a particular training data set. In this paper we focus on the case ofsupervisedlearning, in which the training
data set is labelled (or “annotated”), i.e. it consists of both the observation sequences and the corresponding hidden
state sequences (which generated the observed data). In the case of unsupervised learning (where the data set
consists of observation sequences only), the EM algorithm can be used to estimate the HMM parameters [19, 17].
In supervised learning, we can estimate the initial state distribution by evaluating the proportion of the hidden
state sequences that commence in each of the given model states at the first time step of each sequence. Denoting
the total number of hidden state sequences which commence in state i at the first time step by n_init(i), we then have
the following estimator for the ith element of the initial state distribution:

\pi_i = \frac{n_{\mathrm{init}}(i)}{\sum_{k=1}^{K} n_{\mathrm{init}}(k)} \qquad (1)
For some applications, this estimator may not be appropriate. For example, the training data set used in this
paper consists of annotated ECG waveforms which commence with the onset of the P wave. Hence, using the
estimator given in Eq. (1) will result in the P wave state being assigned an initial state probability of one, and the
remaining states an initial state probability of zero. Since a 10-second ECG recording can commence during any
phase of the cardiac cycle, this estimator for π is clearly inappropriate. In order to overcome this problem, the
following modified estimator can be used:

\pi_i = \frac{n(i)}{\sum_{k=1}^{K} n(k)} \qquad (2)

where n(i) is the total number of times that state i occurs in the set of hidden state sequences. Thus, Eq. (2)
estimates the initial state probability for each model state i as the fraction of times the model occupies state i
compared to the total number of state occupancies.
To estimate the elements of the transition matrix, we evaluate the proportion of each possible state transition
over all the hidden state sequences. Denoting the total number of transitions from state i to state j over all the
hidden state sequences by n_trans(i, j), we have the following estimator for the (i, j)th element of the transition
matrix:

a_{ij} = \frac{n_{\mathrm{trans}}(i, j)}{\sum_{k=1}^{K} n_{\mathrm{trans}}(i, k)} \qquad (3)
The exact estimator for the parameters of the observation models depends on the specific functional form chosen
for these models. Typical choices for the observation models are Gaussian or Gaussian mixture densities. However,
the general estimation procedure for the observation models is straightforward. For each hidden state i, we first
“extract” all the observation sequences associated with this particular state. The parameters of the observation
model for state i are then estimated from this particular subset of the observation data.
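Putting Eqs. (2) and (3) together with Gaussian observation models, supervised estimation reduces to counting and averaging. A sketch (the function and variable names are our own, and the Gaussian emission choice is one of the "typical choices" mentioned above):

```python
import numpy as np

def supervised_hmm_fit(state_seqs, obs_seqs, K):
    """Supervised HMM estimation from labelled sequences: initial-state
    probabilities from overall state occupancy (Eq. 2), transitions from
    transition counts (Eq. 3), and one Gaussian per state fitted to the
    observations annotated with that state."""
    occ = np.zeros(K)
    trans = np.zeros((K, K))
    per_state = [[] for _ in range(K)]
    for s_seq, o_seq in zip(state_seqs, obs_seqs):
        for t, (s, o) in enumerate(zip(s_seq, o_seq)):
            occ[s] += 1                       # state occupancy counts
            per_state[s].append(o)            # observations labelled with state s
            if t + 1 < len(s_seq):
                trans[s, s_seq[t + 1]] += 1   # transition counts
    pi = occ / occ.sum()
    A = trans / trans.sum(axis=1, keepdims=True)
    means = np.array([np.mean(v) for v in per_state])
    stds = np.array([np.std(v) for v in per_state])
    return pi, A, means, stds
```

In practice one would add smoothing for states with few counts (a row of all-zero transition counts would otherwise divide by zero), but the counting structure is as shown.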
Inference in HMMs
The inference problem in hidden Markov models can be viewed as one of determining the “optimal” state sequence
for a given observation sequence. In particular, we are interested in computing the single most probable state
sequence given the observed data, i.e.:
S* = argmax_S { P(S | O, λ) } = argmax_S { p(S, O | λ) }    (4)
which follows from Bayes' theorem. Hence it suffices to find the state sequence S* which maximises the joint
distribution p(S, O | λ). The solution to this problem is provided by the Viterbi algorithm [20, 17].
In the “forward pass” of the Viterbi algorithm, we compute the quantity δ_t(i), given by:

δ_t(i) = max_{s_1 s_2 ⋯ s_{t−1}} { p(s_1 s_2 ⋯ s_t = i, O_1 O_2 ⋯ O_t | λ) }    (5)
This is the likelihood of the most probable state sequence that accounts for the first t observations and ends in
state i at time t. This quantity can be computed in an efficient manner using the following recursion:

δ_{t+1}(i) = max_j { δ_t(j) a_ji } b_i(O_{t+1})    (6)
It is also useful to record the particular “maximising” state which maximises the value of δ_{t+1}(i), i.e.:

ψ_{t+1}(i) = argmax_j { δ_t(j) a_ji }    (7)
Having computed Eqs. (6) and (7) for all times t and states i, we can then compute the optimal value for the
hidden state at the final time step as s*_T = argmax_i { δ_T(i) }. Using this value, we can work back to uncover the
optimal state value at the previous time step t = T − 1. This is easily found as the previous “maximising state” for
δ_T(s*_T), i.e. ψ_T(s*_T). Based on this value, we can then follow a similar procedure to uncover the optimal state value
at time step T − 2. This general backtracking approach can be performed successively as a “look up” procedure
using the stored ψ values to uncover the full optimal hidden state sequence S*:

s*_t = ψ_{t+1}(s*_{t+1}),    t = T − 1, T − 2, ⋯ , 1    (8)
For a fully-connected (i.e. ergodic) HMM with K states, the time complexity of the Viterbi algorithm is
O(TK²). In many practical applications however, it is common to use an HMM architecture in which only certain
pairs of states are connected. In this case, when computing Eq. (6) we need only maximise over those
states from which state i can be reached. When the HMM state space is large but the average state connectivity is
small, computing δ_t(i) in this manner can lead to a considerable computational saving. This is particularly the case
when using an HMM with built-in “duration constraints”, as considered later in this paper.
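The forward pass and backtracking described above can be sketched in log-domain NumPy as follows (our own illustrative implementation, not the authors' code; the small offset inside the logarithms guards against log(0)):

```python
import numpy as np

def viterbi(pi, A, log_b):
    """Most probable state path (Eqs. 4-8).
    pi: (K,) initial probabilities; A: (K, K) transition matrix;
    log_b: (T, K) log observation probabilities log b_i(O_t)."""
    T, K = log_b.shape
    log_A = np.log(A + 1e-300)
    delta = np.zeros((T, K))
    psi = np.zeros((T, K), dtype=int)
    delta[0] = np.log(pi + 1e-300) + log_b[0]
    for t in range(1, T):
        # scores[j, i] = delta_{t-1}(j) + log a_ji, cf. Eqs. (6) and (7)
        scores = delta[t - 1][:, None] + log_A
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(K)] + log_b[t]
    states = np.zeros(T, dtype=int)
    states[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):      # backtracking, Eq. (8)
        states[t] = psi[t + 1][states[t + 1]]
    return states
```

Working in the log domain avoids the numerical underflow that would otherwise occur when multiplying many small probabilities over a long observation sequence.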
HMM Architectures for ECG Interval Analysis
We can assign the various regions of the ECG to specific HMM states in a number of different ways. Fig. 2(a)
shows a three-state model with separate states∗ for the PQ, QT, and baseline regions of the ECG. Alternatively, to
estimate the QT interval only, we could use a simple two-state model with a single state for the QT interval and
an additional “catch-all” state (X) for the remainder of the ECG, as shown in Fig. 2(b). The advantage of this type
of model architecture is that the compact state space results in a smaller number of model parameters (compared
with larger state space models) and more efficient inference. The disadvantage of this approach however is that by
modelling multiple waveform features within a single state (e.g. the QRS complex and the T wave), the accuracy
of the model segmentations is typically sub-optimal, since the observation models are spread over a wider region of
the data space [22].
[FIGURE 2 about here.]
The basic HMM architecture that will be used in this paper is the five-state HMM shown in Fig. 2(c). The
model consists of separate hidden states for the following regions of the ECG: Pon → Q (PQ/PR interval), Q → R,
R → J, J → Toff (JT interval), and Toff → Pon (Baseline). The primary advantage of this model is that it enables
all the ECG interval measurements of interest (i.e. QT, PR, QRSd and RR) to be derived from the resulting state
segmentations. Furthermore, by utilising separate states for the different waveform features that make up the QT
∗For the purposes of clarity, we use the HMM state label “PQ” to refer to the region of the ECG from the onset of the P wave to the onset of the QRS complex. However, the timing interval defined by this region is commonly referred to as the “PR interval” (see Fig. 1) [21].
interval, the HMM provides a better generative model of the ECG, which in turn helps to improve the accuracy of
the segmentations produced by the model [22].
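To make the connection between a state path and the interval measurements concrete, a sketch along the following lines could be used (entirely our own illustration: the state labels, function name and the mapping of states to measurements are assumptions based on the five-state architecture described above, not the authors' code):

```python
import numpy as np

# Hypothetical labels for the five states of Fig. 2(c)
PQ, QR, RJ, JT, BASELINE = range(5)

def intervals_from_path(states, fs):
    """Derive PR, QRS and QT measurements (in ms) from one beat of a
    sample-wise state path, given the sampling rate fs in Hz."""
    states = np.asarray(states)
    def dur(*labels):
        # duration (ms) of the samples assigned to the given state labels
        return 1000.0 * np.isin(states, labels).sum() / fs
    return {
        "PR": dur(PQ),          # Pon -> Q
        "QRS": dur(QR, RJ),     # Q -> J
        "QT": dur(QR, RJ, JT),  # Q -> Toff
    }
```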
Validity of HMM Assumptions for ECG Modelling
The HMM framework for signal modelling is based on a number of key assumptions. We now consider the two
most important assumptions in turn and discuss their validity in the context of ECG signal modelling.
Observation Independence Assumption
In the standard formulation of a hidden Markov model, the observation values “within” a given state are considered
to be independent and identically distributed (i.i.d.). Hence, when an observation value is generated by a particular
state at a given time step, it is generated independently of any previous samples which may have been generated
from that same state in previous time steps. As a result of this assumption, the conditional probability p(O_{1:T} |
S_{1:T}, λ) can be factorised as:

p(O_{1:T} | S_{1:T}, λ) = ∏_{t=1}^{T} p(O_t | S_t, λ)    (9)
The assumption of statistical independence between successive observations (within a state) is perhaps the
most serious limitation of the standard hidden Markov model, and presents a significant challenge to the use of
HMMs for ECG signal modelling. In particular, ECG samples from a given waveform feature clearly exhibit
strong correlations over the range of the feature. Indeed, it is these statistical dependencies between successive
samples which give rise to the characteristic waveform patterns that make up a typical ECG signal. By ignoring
such dependencies, the standard hidden Markov model is unlikely to provide a faithful statistical description of the
ECG. Furthermore, since these statistical dependencies give rise to the shape of the waveform features, they are
essential in determining the feature boundaries for the purpose of ECG interval analysis.
A number of different approaches have been suggested to overcome the assumption of statistical independence
between observations within a given HMM state. One possibility is to make use of an autoregressive (AR) model
for the observation density within each HMM state [17, 23]. With this approach, the AR model is used to model
the observations within a given state, and the corresponding observation probabilities are then derived from the
Gaussian noise process for the AR residuals. However, the performance of this form of HMM is governed by the
degree to which the observations from each model state can successfully be represented as an AR process.
An alternative approach to incorporating information from temporal dependencies within each model state is
to encode this information in the observation vector itself. This is the dominant approach in modern speech
recognition research with hidden Markov models. In particular, “contextual” information from neighbouring samples of
speech is captured in two different ways. Firstly, the speech signal is encoded using a technique based on the
short-time Fourier transform. The resulting feature vectors therefore incorporate spectral information over the length
of the window centred on each signal sample. Secondly, the dependencies between adjacent feature vectors are
captured through the use of derivative features or so-called “delta coefficients” [24]. The use of time-frequency
transforms and derivative features to provide a more suitable representation of the ECG for hidden Markov
modelling is discussed in the following section.
Transition Stationarity Assumption
The transition stationarity assumption concerns the time-invariant nature of the state transition probabilities.
Thus, for any two times t1 and t2, we assume that:

P(s_{t1+1} = j | s_{t1} = i) = P(s_{t2+1} = j | s_{t2} = i)    (10)
An important consequence of time-invariant transition probabilities is that the individual HMM state durations
follow a geometric distribution. This distribution is a poor match to the true state duration distributions of many
real-world signals, including the ECG, and can lead to physiologically implausible segmentations. This issue, and
its associated remedy, are discussed in more detail in the Duration Constraints section.
SAMPLE-WISE ECG ENCODING
In this section we describe the combination of the undecimated wavelet transform and associated derivative features,
which we use to generate a sample-wise encoding of the ECG for subsequent hidden Markov modelling.
Wavelet Representations
Wavelets are a class of functions that are well localised in both the time and frequency domains, and which form
a basis for all finite energy signals. The atoms used in wavelet analysis are generated by scaling and translating a
single mother wavelet ψ. The general wavelet transform (WT) of a signal x ∈ L²(ℝ) is given by:

W(s, τ) = (1/√s) ∫_{−∞}^{+∞} x(t) ψ*((t − τ)/s) dt    (11)
where W(s, τ) are the wavelet coefficients at scale s and time-shift τ, and ψ* is the complex conjugate of
the mother wavelet. The specific way in which the mother wavelet is scaled and translated leads to a number of
different wavelet transform algorithms.
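To make Eq. (11) concrete, a single coefficient can be evaluated numerically for a sampled signal (an illustrative sketch under our own assumptions: the function name is ours, the wavelet is assumed real so that ψ* = ψ, and the Mexican-hat wavelet is used only as an example):

```python
import numpy as np

def wavelet_coefficient(x, t, s, tau, psi):
    """Numerically approximate W(s, tau) of Eq. (11) for a sampled
    signal x evaluated on the uniform grid t, with real wavelet psi."""
    atom = psi((t - tau) / s) / np.sqrt(s)  # scaled, translated atom
    dt = t[1] - t[0]
    return np.sum(x * atom) * dt            # Riemann approximation of the integral

# Example wavelet: second derivative of a Gaussian ("Mexican hat")
mexican_hat = lambda u: (1.0 - u**2) * np.exp(-u**2 / 2.0)
```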
Table 2 shows a comparison of the key properties of the three main types of wavelet transform. The continuous
wavelet transform, or CWT, makes use of the set of continuous scales and time-shifts of the mother wavelet.
Specifically, the CWT is defined for all positive scales s ∈ ℝ⁺ and all time-shifts τ ∈ ℝ. This leads to a highly redundant
representation of the original signal, since successive wavelet atoms (across neighbouring scales or time-shifts) are
strongly correlated. Although this redundancy can be advantageous for visualisation (where the “richness” of the
coefficients aids the visual interpretation), it can be problematic for many signal analysis and pattern recognition
techniques which depend upon compact low-dimensional representations of the data. For example, if we wish to
model the pdf of the wavelet coefficients using a Gaussian mixture model, the number of model parameters which
must be estimated for each Gaussian is proportional to the square of the dimension of the data.
[TABLE 2 about here.]
An alternative to the CWT is given by the discrete wavelet transform, or DWT, which decomposes the signal
of interest over wavelet atoms that are mutually orthogonal. This is achieved by using only dyadic scales and
time-shifts in the wavelet transform (the term dyadic refers to values based on “powers of two”). The wavelet
representation produced by the DWT is considerably more efficient than that for the CWT since the signal is
projected onto orthogonal basis functions. Furthermore, the DWT can be computed in a computationally efficient
manner using a multirate filter bank structure [25].
Although the DWT is well suited to applications such as data compression, it suffers from a number of
important drawbacks which limit its use for signal modelling and segmentation problems. Firstly, the DWT does not
provide a constant-length vector of wavelet coefficients for each individual time sample. Instead, the coefficients
corresponding to different scales (or levels) of the DWT algorithm are governed by a sampling rate which is unique
to that particular scale. This stems from the restriction that the wavelet atoms are translated in time by integer
multiples of the (dyadic) scale, which leads to different numbers of wavelet coefficients at different scales.
An important consequence of the variation in the sampling of the coefficients at different levels of the DWT is
that the wavelet coefficients are not translation-invariant. Translation-invariance refers to the property that “when
a pattern is translated, its numerical descriptors should be translated but not modified” [25]. For many pattern
recognition applications, it is important to construct representations of the input data that are robust with respect to
translations or shifts of the input data. In the case of the DWT however, translated (i.e. circularly shifted) versions
of a given signal can correspond to wavelet representations which are numerically very different, even after the
translations have been corrected for [26].
Given the preceding discussion, we can identify four properties which must be satisfied if a given time-
frequency representation is to be effective for the task of ECG signal modelling and segmentation:
• Constant-length vector of coefficients at each time sample
• Compact low-dimensional representation at each time sample
• Translation-invariance of coefficients
• Computationally efficient implementation
A representation which meets all of the above conditions is provided by the undecimated wavelet transform [27,
28, 26]. The undecimated wavelet transform, or UWT, restricts the scale parameter to the dyadic range s = 2^k
(for k ∈ ℤ) but allows the time-shift parameter τ to take on any value, i.e. τ ∈ ℝ. This results in a representation
which is redundant in the time domain, but orthogonal in the frequency domain†. More formally, the set of wavelet
atoms employed by the UWT algorithm forms a frame or an overcomplete basis.
Importantly, the UWT provides a constant-length vector of wavelet coefficients for each time sample of the
original signal. These coefficients are guaranteed to be translation-invariant, such that a (circular) time-shift of the
signal will result in an identical (circular) time-shift of the wavelet coefficients [25]. In addition, for a signal of
length T, the UWT can be computed in O(T log T) operations using a fast filter bank algorithm [28]. These properties make
the UWT an effective representation for ECG signal modelling and segmentation.
†More precisely, for any given scale the UWT atoms at different time-shifts will not (in general) be orthogonal; however, for any given time-shift the UWT atoms at different scales will be orthogonal.
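An undecimated transform of this kind can be sketched with the “à trous” trick of upsampling the filters at each level (a toy implementation of ours using circular convolution; the Haar filter pair in the example is purely illustrative and is not the wavelet used in the paper). The test below checks the translation-invariance property discussed above: circularly shifting the signal circularly shifts the coefficients.

```python
import numpy as np

def uwt(x, h, g, levels):
    """Undecimated wavelet transform via the 'a trous' algorithm.
    h, g: lowpass/highpass filter pair. Circular convolution yields a
    constant-length, translation-invariant coefficient vector per sample."""
    T = len(x)
    a = np.asarray(x, dtype=float)
    details = []
    for j in range(levels):
        # upsample the filters by inserting 2^j - 1 zeros between taps
        hj = np.zeros(len(h) * 2**j); hj[::2**j] = h
        gj = np.zeros(len(g) * 2**j); gj[::2**j] = g
        A = np.array([np.dot(hj, a[(np.arange(len(hj)) + t) % T]) for t in range(T)])
        D = np.array([np.dot(gj, a[(np.arange(len(gj)) + t) % T]) for t in range(T)])
        details.append(D)   # one detail band per level
        a = A               # lowpass output feeds the next level
    return np.stack(details, axis=1)   # shape (T, levels): sample-wise encoding
```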
Choice of Wavelet
The first step in applying the undecimated wavelet transform to the ECG is to choose an appropriate wavelet for the
analysis. For most practical problems of interest, the choice of wavelet depends primarily on the particular
application at hand. In the specific case of time-series segmentation however, we can identify a number of requirements
that an effective wavelet should satisfy:
• Good time localisation
• Appropriate degree of smoothness for the signal under consideration
• Realisation as a conjugate mirror filter (CMF) pair
• CMFs possess a near-linear phase response
The first requirement can be achieved by simply minimising the width of the wavelet (and hence the length of
the corresponding filters). The second requirement is designed to ensure that the wavelet provides a good match to
the characteristic features of the signal under consideration. In general, wavelets with a longer width tend to offer
a greater degree of smoothness, whereas wavelets with a shorter width are better suited to the analysis of sharp
transients [28]. In the case of the ECG, we wish to detect both sharp transients (i.e. the onset and offset of the QRS
complex) as well as smoother features (i.e. the end of the T wave). Hence the smoothness of the wavelet is a
compromise between these two competing requirements.
The third requirement relates to the type of wavelet transform algorithm which can be used with the given
wavelet. In particular, if we wish to perform an undecimated (or discrete) wavelet transform, then the wavelet must
admit a realisation as a conjugate mirror filter pair (that is, a lowpass and a highpass filter satisfying particular
conditions [25]). The final requirement of near-linear phase filters is designed to ensure that the wavelet coefficients
at different scales of the UWT algorithm can be aligned in time by means of a simple (circular) time-shift.
Taken together, the above requirements imply that a suitable wavelet for ECG analysis should possess a
relatively short width, offer a reasonable amount of smoothness, and provide a CMF pair which offer a close
approximation to linear phase filters. In practice, the first wavelet of the Coiflet family, commonly termed C(6) since the
corresponding filters are of length 6, satisfies all of these requirements [28].
Previous work on the application of wavelets to ECG signal analysis has primarily focused on wavelets which
are either the first or second derivative of a Gaussian. In particular, Li et al. [9] and Sahambi et al. [10] used the first
derivative of a Gaussian as an analysing wavelet to decompose the ECG signal. More recently, Andreao et al. [12]
used the second derivative of a Gaussian (known as the Mexican Hat function) to decompose the ECG signal.
Unfortunately, wavelets defined as a derivative of a Gaussian cannot be realised as conjugate mirror filters.
This restricts their use to the continuous wavelet transform only. We can however approximate the first and
second derivatives of a Gaussian using wavelets which can be realised as conjugate mirror filters. In particular, the
biorthogonal spline wavelet BIOR(6,2), with lowpass and highpass filters of length 6 and 2 respectively, provides
a reasonable approximation to the first derivative of a Gaussian. Similarly, the “least asymmetric” Daubechies
wavelet with a filter length of 12, commonly termed LA(12), provides a useful approximation to the second
derivative of a Gaussian.
Fig. 3 shows the three different wavelets which we investigate in this paper for ECG analysis (see Experiments
section). To encode an ECG signal with the UWT (for a given wavelet basis), we compute the wavelet coefficients
at seven different levels/scales of the UWT filter bank. The result of this encoding is a 7-D vector of wavelet
transform coefficients for each time sample in the ECG signal.
[FIGURE 3 about here.]
Derivative Features
A significant limitation of the standard wavelet representation of a signal is the manner in which phase information
about the signal is encoded. To illustrate this point, consider the first level UWT coefficients produced in the region
of the T wave of an ECG. If the filters associated with the given wavelet are approximately linear phase (as is
the case with many commonly used wavelet families), then the corresponding impulse responses will be close to
symmetrical. If we now assume that the T wave itself is approximately symmetrical, then the resulting wavelet
coefficients at the first level of the UWT analysis will also be approximately symmetrical about the peak of the
T wave. As a result, the phase of the ECG signal at each time sample is not well characterised by the wavelet
coefficients.
The issue of phase encoding with wavelet representations is particularly significant in the context of automated
ECG interval analysis. Since we wish to detect the onset and offset points of the various waveform features as
accurately as possible, it is essential that the chosen representation of the ECG enables the automated system to
differentiate between the upslope and the downslope of each of the ECG features.
A particularly simple and effective technique for encoding a measure of phase from a signal is to make use of
the signal derivative. Since this quantifies the local gradient of the signal at each time point, it can be used to aid
in the differentiation of the various regions of a given waveform. The use of derivative features (or so-called “delta
coefficients”) has been found to produce a substantial improvement in the performance of HMM-based automatic
speech recognition systems [24, 29].
In order to implement derivative features in practice, it is necessary to first smooth the signal of interest so that
the estimate of the derivative is not unduly affected by high frequency noise. However, if the signal is to be
encoded with a wavelet transform, then one or more frequency bands from the encoding can be used in place of
the original signal for the derivative computations. This approach is advantageous since the frequency bands can
be chosen to minimise the effects of noise and emphasise the underlying signal content.
In order to incorporate derivative information into the UWT encoding of the ECG, we augmented the 7-D
wavelet coefficient vector at each time sample with the derivative of the coefficients from levels 4, 5 and 6 (i.e. scales
2⁴, 2⁵ and 2⁶) of the UWT filter bank. These levels correspond to the frequency range of 3 Hz to 34 Hz, which
in turn corresponds to the dominant spectral content of the ECG. The derivative of the wavelet coefficients at each
level was computed using a simple first-order central difference:

Δ(s, τ) = [W(s, τ + 1) − W(s, τ − 1)] / 2    (12)
where Δ(s, τ) is the derivative feature associated with the wavelet coefficient at scale s and time-shift τ.
Incorporating derivative features into the UWT encoding leads to a 10-D feature vector of UWT coefficients and
the associated derivative features at each time sample.
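The central difference of Eq. (12), applied across all levels at once, can be written as follows (an illustrative sketch; the boundary handling is our assumption, since the paper does not specify it):

```python
import numpy as np

def delta_features(W):
    """First-order central-difference derivative features, Eq. (12).
    W: (T, L) array of UWT coefficients, one column per level."""
    d = np.zeros_like(W, dtype=float)
    d[1:-1] = (W[2:] - W[:-2]) / 2.0   # Delta(s,tau) = (W(s,tau+1) - W(s,tau-1)) / 2
    d[0], d[-1] = d[1], d[-2]          # replicate at the boundaries (our assumption)
    return d
```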
Fig. 4 illustrates the impact of derivative features on the accuracy of the T wave offset segmentation for a
particular ECG signal. The upper plots show the T wave region of the ECG, together with the T wave offset
point determined by a.) an HMM with the UWT encoding only (upper left plot), and b.) an HMM with the UWT +
derivative features encoding (upper right plot). In each case the T wave offset point determined by an expert analyst
is also shown. For the HMM with the UWT encoding only, it is evident that the model infers the T wave offset
close to the peak of the T wave, and much earlier than the true T wave offset location. By incorporating derivative
features into the ECG encoding however, the accuracy of the T wave offset segmentation is considerably improved.
[FIGURE 4 about here.]
In order to understand how derivative features serve to improve the accuracy of the segmentations produced by
an HMM, it is useful to consider the log observation probabilities for the various HMM states. The lower plots in
Fig. 4 show, for each HMM, the difference between the log observation probabilities for the JT interval state and
those for the Baseline state, evaluated over the course of the T wave (shown in the upper plots). For the HMM
without derivative features (lower left plot), the log observation probabilities for the Baseline state exceed those for
the JT interval state significantly before the actual end of the T wave. This results in a substantial error in the T
wave offset location inferred by the Viterbi algorithm. Conversely, for the HMM with derivative features (lower
right plot), the log observation probabilities for the JT interval state exceed those for the Baseline state throughout
the downslope of the T wave, resulting in a significantly more accurate segmentation.
In the Experiments section we present detailed results comparing the performance of hidden Markov models
trained with different wavelet encodings, and demonstrate the improvement in ECG segmentation accuracy which
can be gained with derivative features.
DURATION CONSTRAINTS FOR ROBUST SEGMENTATIONS
As indicated previously, a significant limitation of the standard HMM framework is the manner in which state
durations are modelled. In particular, for a given state i with self-transition coefficient a_ii, the probability mass
function of the state duration d is a geometric distribution, given by:

P_i(d) = (a_ii)^{d−1} (1 − a_ii)    (13)
In the case of the ECG, the geometric distribution provides only a poor match to the statistical characteristics
of the true state durations. Fig. 5 shows the duration distribution curves for the JT interval, based on i.) a gamma
density fitted to the JT interval durations derived from the expert analyst annotations (dashed line), and ii.) the
geometric duration distribution for the corresponding HMM state (solid line). It is evident that much of the probability
mass of the geometric distribution lies to the left of the minimum duration for the JT interval.
[FIGURE 5 about here.]
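For intuition, the geometric pmf of Eq. (13) can be tabulated directly (illustrative numbers only: the value a_ii = 0.98 is our example, not a parameter fitted in the paper):

```python
import numpy as np

a_ii = 0.98                        # example self-transition probability
d = np.arange(1, 301)              # durations 1..300 samples
p = a_ii ** (d - 1) * (1 - a_ii)   # geometric duration pmf, Eq. (13)
mean_duration = 1.0 / (1 - a_ii)   # expected run length in samples
# Note the mode is always at d = 1: the shortest runs are the most probable,
# whereas physiological state durations have a strictly positive minimum.
```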
This mismatch between the statistical characteristics of the model and those of the ECG gives rise to the
problem of double-beat segmentations, an example of which is shown in Fig. 6. Such segmentations occur when
the model incorrectly infers two (or more) beats when only a single beat is present in a particular region of an
ECG [22]. The “rogue” beat segmentations are characterised by hidden state sequences of a very short duration
(typically much shorter than the minimum state duration observed with genuine ECG signals). This in turn leads
to automated ECG interval measurements which are highly unreliable.
[FIGURE 6 about here.]
There are two possible approaches to improving the duration characteristics of a hidden Markov model, and
hence the robustness of the associated ECG segmentations. The first approach is to restrict each of the hidden
states within the model such that they can only generate “runs” of states which satisfy a given minimum duration
constraint. The second approach is to relax the geometric duration distribution for each HMM state and to enable
the model to represent arbitrary state duration distributions. This latter approach leads to a class of statistical model
known as a hidden semi-Markov model, or HSMM [30]. Although these models offer considerable flexibility for
state duration modelling, in practice the accuracy of the resulting ECG segmentations is very similar to that
produced by an HMM with minimum state duration constraints [22]. Furthermore, the improved duration modelling
of the HSMM comes at the expense of a considerable increase in the time complexity of the associated Viterbi
inference procedure. We therefore focus in this paper on the use of duration constraints for improving the robustness
of the HMM segmentations.
Fig. 7 illustrates how a minimum state duration constraint can be incorporated into the architecture of an
HMM. For a given state k with a minimum duration of d_min(k) (where d_min(k) > 1), we augment the model
with d_min(k) − 1 additional states directly preceding the original state k. Each additional state has a self-transition
probability of zero, and a probability of one of transitioning to the state immediately to its right. Thus, taken together,
these states form a simple linear Markov chain, where each state in the chain is occupied for at most one time
sample (during any run through the chain).
[FIGURE 7 about here.]
An important feature of this chain is that the observation probability distribution for each additional state is
tied to the distribution for the original state k. Thus the observations associated with the d_min(k) states corresponding
to the original state k are governed by a single set of parameters (shared by all d_min(k) states). The use of
parameter “tying” serves to reduce the number of free parameters in the resulting HMM.
From Fig. 7 it is easy to understand the effect of the duration constraints upon the generative nature of the
model. Once the model has transitioned into state k₁, it is then forced to emit d_min(k) samples from the observation
probability distribution associated with state label k before it has the opportunity to enter a state with a different
observation distribution. The result is that the model is prevented from generating “runs” of state k whose duration
is less than d_min(k).
In practice, the use of an appropriate set of minimum state duration constraints with an HMM prevents the
model from inferring double-beat segmentations, and leads to a considerable improvement in the robustness of the
associated ECG interval measurements. Table 3 shows the number of single-beat and double-beat segmentations
on leads II and V2 of the test set (see Experiments section for more details), for a standard HMM and an HMM
with duration constraints. For both leads II and V2, the standard HMM produces a significant number of
double-beat segmentations. However, by incorporating duration constraints into the HMM architecture, such rogue beat
segmentations are eliminated.
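The state-expansion construction of Fig. 7 can be sketched as follows (our own illustrative implementation of the idea, not the authors' code; the returned label map records which expanded states share a tied observation model):

```python
import numpy as np

def expand_min_durations(A, d_min):
    """Expand a K-state transition matrix into an equivalent model where
    state k is preceded by d_min[k]-1 tied 'chain' states, enforcing a
    minimum run length of d_min[k] (cf. Fig. 7).
    Returns the expanded matrix and a map expanded state -> original label,
    which can be used to tie the observation models of each chain."""
    K = len(d_min)
    # expanded index of the first chain state of each original state
    offsets = np.cumsum([0] + [d_min[k] for k in range(K - 1)])
    M = int(sum(d_min))
    A_exp = np.zeros((M, M))
    label = np.zeros(M, dtype=int)
    for k in range(K):
        start, end = offsets[k], offsets[k] + d_min[k] - 1  # 'end' is original state k
        label[start:end + 1] = k
        for m in range(start, end):
            A_exp[m, m + 1] = 1.0       # chain states: forced forward transition
        for j in range(K):              # original state keeps its transitions,
            A_exp[end, offsets[j]] = A[k, j]  # entering state j at the head of its chain
    return A_exp, label
```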
[TABLE 3 about here.]
CONFIDENCE MEASURES FOR HMM SEGMENTATIONS
The main focus of the work presented so far has been on improving the accuracy and robustness of the ECG
segmentations produced by a hidden Markov model. This is clearly an important and worthwhile goal, since any
automated system for ECG interval analysis must reach a certain performance level if it is to be used successfully
in a real-world setting.
In practice however, it is not realistic to expect the model (or any other automated system) to maintain a
consistently high level of segmentation accuracy for every possible ECG waveform with which it might be presented.
Thus, we must accept that the model may produce unreliable segmentations when analysing ECG waveforms which
are sufficiently novel with respect to the waveforms on which the model was trained.
More generally, it is often the case that only a subset of the heartbeats contained within a given ECG data set
are suitable for accurate and reliable ECG interval measurements (either by an expert or by an automated system).
ECG signals often contain beats which are corrupted by noise or artefact, such that the various intervals cannot be
measured to a high degree of accuracy [1]. In this case, it is essential that the automated system has the ability to
highlight those beats for which it is least confident in its analysis.
We now focus therefore on the problem of automatically detecting unreliable ECG segmentations, and hence
improving the level of sophistication of the ECG analysis produced by the trained model. More generally, we wish
to enable the model to detect automatically those ECG beats which are least suited to analysis by machine, and
thus most in need of analysis by an expert cardiologist. In this way, we aim to combine the advantages of automated
analysis with those of manual analysis.
Assessing Confidence Using the Joint Log Likelihood
In the context of automated ECG interval analysis, an effective confidence measure should provide an assessment
of the quality of the waveform segmentations produced by the automated system, and hence the reliability of the
associated interval measurements. More generally, we can view the confidence measure as quantifying the
“normality” of the ECG waveform under consideration (with respect to the waveforms in the training set). Intuitively,
we should have more confidence in the segmentations of ECG waveforms which are similar to those on which the
model was trained, compared with the segmentations of waveforms which are markedly different from those on
which the model was trained.
By formulating the confidence measure in terms of the normality of the ECG waveform, we can utilise the
full probabilistic nature of the models considered in this paper in order to derive a suitable quantitative
definition. Specifically, since the models considered in this paper are generative models, they inherently define a joint
probability distribution over observation signals and hidden state sequences, i.e. p(O, S | λ). This probability
distribution can therefore be exploited in order to assess the normality of a given ECG waveform and the associated
segmentation produced by the model.
More formally, for an observation signal O with corresponding Viterbi state sequence S*, we can compute the
joint log likelihood as:

log p(O_{1:T}, S*_{1:T} | λ) = log π_{s*_1} + ∑_{t=1}^{T−1} log a_{s*_t s*_{t+1}} + ∑_{t=1}^{T} log b_{s*_t}(O_t)    (14)
In practice, a 10-second ECG signal will typically contain a number of heartbeats and hence a number of
interval measurements. We would therefore like to compute a separate log likelihood value for each individual
heartbeat in an ECG (with onset time t1 and offset time t2). This can be achieved by simply redefining the limits
of the summation in Eq. (14):
log p(O_{t1:t2}, S*_{t1:t2} | λ) = log π_{s*_{t1}} + Σ_{t=t1}^{t2−1} log a_{s*_t, s*_{t+1}} + Σ_{t=t1}^{t2} log b_{s*_t}(O_t)    (15)
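As a concrete illustration, Eq. (15) can be computed directly from the Viterbi state path and the model's log probabilities. The sketch below assumes these quantities are already available as arrays; the names `log_pi`, `log_A`, `log_B` and `beat_log_likelihood` are our own, not the report's.

```python
import numpy as np

def beat_log_likelihood(log_pi, log_A, log_B, path, t1, t2):
    """Joint log likelihood of Eq. (15) for one beat spanning samples t1..t2.

    log_pi : (K,) log initial-state probabilities
    log_A  : (K, K) log transition matrix
    log_B  : (T, K) log observation densities, log_B[t, k] = log b_k(O_t)
    path   : (T,) Viterbi state sequence S*
    """
    ll = log_pi[path[t1]]                       # log pi term at the beat onset
    for t in range(t1, t2):                     # transition terms, t = t1 .. t2-1
        ll += log_A[path[t], path[t + 1]]
    for t in range(t1, t2 + 1):                 # observation terms, t = t1 .. t2
        ll += log_B[t, path[t]]
    return ll
```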
Modelling the Joint Log Likelihood
When assessing the joint log likelihood value for a given ECG heartbeat, we must also consider the duration of
the beat under consideration (i.e. t2 − t1). To understand why this is necessary, it is useful to consider a simple
example. Consider an HMM λ which generates a signal O of length T according to the hidden state sequence S.
The joint log likelihood of the signal and the state sequence is then given by Eq. (14). If at the following time step
the model transitions to hidden state s_{T+1} and generates a signal sample O_{T+1} (by sampling from the observation
density for that state), then the joint log likelihood will change by an amount log a_{s_T, s_{T+1}} + log b_{s_{T+1}}(O_{T+1}).
Crucially, the most significant contribution to the change in the log likelihood will be due to the observation
probability term b_{s_{T+1}}(O_{T+1}). This is because the transition probabilities span the range [0, 1], whereas the
observation probabilities (which are densities) span the range [0, ∞). As a result, the dynamic range of the log
observation probabilities is much greater than that of the log transition probabilities.
Given the preceding analysis, we might expect an approximately linear relationship between the joint log
likelihood log p(O_{t1:t2}, S*_{t1:t2} | λ) and the duration of the ECG beat under consideration (t2 − t1). Fig. 8 shows
a scatter plot of the joint log likelihood values against the beat duration for the annotated ECG beats from the
training set for lead V2. As expected, there is a linear trend to the data points shown in the scatter plot.
[FIGURE 8 about here.]
Given this linear trend, we can model the relationship between the log likelihood and the beat length using
standard linear regression techniques [31]. In particular, we assume a simple linear model of the form:

log p(O_{t1:t2}, S*_{t1:t2} | λ) = υ0 + υ1 (t2 − t1) + ε    (16)

where υ0 and υ1 are the parameters of the model (the intercept and gradient, respectively), and ε is a Gaussian
noise process with zero mean and variance σ². These parameters can be estimated in a standard manner using
least-squares.
From the linear regression model we can evaluate the lower confidence bound at the 99% level of significance
(indicated by the dashed line in Fig. 8). This lower bound then provides a suitable threshold which can be used
to determine if ECG beats segmented by the model are sufficiently novel as to cast doubt on the reliability of the
associated interval measurements.
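A minimal sketch of this fitting procedure follows, assuming a Gaussian residual model so that the one-sided 99% lower bound is approximated by the mean regression line minus 2.326 residual standard deviations (the report may instead use the exact regression prediction interval; the function names are ours):

```python
import numpy as np

def fit_confidence_model(durations, log_likelihoods):
    """Least-squares fit of Eq. (16): returns (v0, v1, sigma),
    i.e. intercept, gradient, and residual standard deviation."""
    X = np.column_stack([np.ones_like(durations, dtype=float), durations])
    coeffs, *_ = np.linalg.lstsq(X, log_likelihoods, rcond=None)
    v0, v1 = coeffs
    resid = log_likelihoods - X @ coeffs
    sigma = resid.std(ddof=2)          # two fitted parameters
    return v0, v1, sigma

def lower_bound(d, v0, v1, sigma, z=2.326):
    """Approximate one-sided 99% lower bound at beat duration d
    (z = 2.326 is the one-sided 99% Gaussian quantile)."""
    return v0 + v1 * d - z * sigma
```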
Normalised Confidence Measures
In a practical setting, it is desirable to produce a normalised confidence measure for each ECG beat. If each log
likelihood value is normalised to the range [0, 1] (where a value of 0 represents minimum confidence, and a value of
1 represents maximum confidence), then the normalised values can be used directly (without the need for graphical
display of the log likelihoods) and can be readily interpreted by cardiologists.
In order to normalise each log likelihood value, we need to map the unbounded log likelihood range to the
normalised range [0, 1]. Specifically, we require a normalisation mapping which is appropriate for the distribution
of log likelihood values for each beat duration. Such a mapping can be achieved through the use of the sigmoid
function [19]:

y = 1 / (1 + e^{−ν(x−µ)})    (17)
where x is the input to the function (i.e. the joint log likelihood value), y is the output of the function (i.e. the
normalised confidence value), and µ and ν are parameters which govern the characteristic form of the function.
In order to set appropriate values for the sigmoid parameters (at a given beat duration), we must first select two
distinct normalised confidence values corresponding to two distinct log likelihood values. We set the normalised
confidence value at the 99% lower confidence bound to 0.7, and the normalised confidence value at the mean
regression line to 0.85. Given these values, we can then rearrange Eq. (17) and solve for the sigmoid parameters µ
and ν at a given beat duration.
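Rearranging Eq. (17) gives logit(y) = ν(x − µ), so the two anchor points determine µ and ν in closed form. A sketch (the function names are ours):

```python
import math

def logit(y):
    """Inverse of the sigmoid: log(y / (1 - y))."""
    return math.log(y / (1.0 - y))

def sigmoid_params(x_lower, x_mean, y_lower=0.7, y_mean=0.85):
    """Solve Eq. (17) for (mu, nu) from two anchor points:
    the 99% lower bound maps to 0.7, the mean regression line to 0.85."""
    nu = (logit(y_mean) - logit(y_lower)) / (x_mean - x_lower)
    mu = x_lower - logit(y_lower) / nu
    return mu, nu

def confidence(x, mu, nu):
    """Normalised confidence value of Eq. (17)."""
    return 1.0 / (1.0 + math.exp(-nu * (x - mu)))
```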
With the above procedure, we can convert the log likelihood value (at a given beat duration) to a normalised
confidence value by simply applying the sigmoid mapping function. If we then set the "confidence threshold value"
(i.e. the confidence value below which we will label segmented beats as low confidence) to 0.7, we can automatically
exclude from any subsequent analysis those ECG beats whose segmentations are deemed to be unreliable.
In the following Section, we evaluate the performance of our confidence-based approach to ECG interval
analysis on the thorough QT study data set.
EXPERIMENTS
Construction of Training and Test Sets
In order to evaluate the performance of HMMs for automated ECG interval analysis, it is desirable to use independent
training and test sets. In particular, if we wish to assess how an HMM-based automated system is likely to
perform in practice when used to analyse ECG data from a clinical trial, we should evaluate the performance of the
model on ECGs from subjects who were not present in the training set. This reflects the way in which we would
expect the automated system to be used in practice (the model could be "re-trained" for each new study; however,
this would require manual annotations for a subset of the ECG data from each study). In this way, we can evaluate
the ability of the trained model to generalise to ECG data from new subjects.
Given the above, we partitioned the thorough QT study data set into approximately equally-sized training and
test sets in the following manner. For each of leads II and V2, the ECGs in the data set with manual annotations on
that particular lead were first selected together with the corresponding subject identifiers. From this subset of the
ECGs in the study, the 10-second ECG signals from a random selection of half of the subjects were used to form
the training set. The 10-second ECGs recorded from the remaining subjects were then used to form the test set.
Following the partitioning, the training and test sets for each lead consisted of approximately 6000 annotated ECG
beats, with the ECGs in each set recorded from different subjects.
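A subject-wise split of this kind can be sketched as follows (illustrative only; the function name and the (subject_id, record) data layout are our assumptions, not the report's code):

```python
import random

def split_by_subject(ecgs, train_fraction=0.5, seed=0):
    """Partition ECG records into training and test sets so that no
    subject contributes records to both. `ecgs` is a list of
    (subject_id, record) pairs."""
    subjects = sorted({sid for sid, _ in ecgs})
    rng = random.Random(seed)
    rng.shuffle(subjects)                       # random selection of subjects
    n_train = int(len(subjects) * train_fraction)
    train_ids = set(subjects[:n_train])
    train = [r for r in ecgs if r[0] in train_ids]
    test = [r for r in ecgs if r[0] not in train_ids]
    return train, test
```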
ECG Encoding Schemes
We evaluated four different sample-wise ECG representations, namely:
1. Bandpass filtered ECG time-series
2. Short-time Fourier transform (STFT)
3. Undecimated wavelet transform (UWT)
4. UWT + derivative features
The bandpass filtered ECG time-series representation was formed by simply bandpass filtering each ECG signal
using a linear-phase FIR filter with lower and upper half-power cutoff frequencies at 1.5 Hz and 35 Hz respectively.
For the short-time Fourier transform, we used a sliding window procedure to form a sample-wise encoding of
the ECG power spectrum. Specifically, for each time sample we first applied a Hann window (of length 256 samples,
centred on the given sample) to the signal and then computed the power spectrum of the resulting windowed
signal using the FFT. This procedure was repeated for successive ECG samples (by sliding the window along by
one sample) until the entire 10-second ECG signal had been fully encoded. Finally, we formed an 8-dimensional
representation of the power spectrum (at each time sample) by averaging "blocks" of consecutive coefficients.
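This sliding-window encoding can be sketched as below. It is a simplified illustration: the edge handling (replicate padding) and the exact block-averaging layout are our assumptions, not details given in the report.

```python
import numpy as np

def stft_encoding(signal, win_len=256, n_bands=8):
    """Sample-wise STFT encoding: a Hann window centred on each sample,
    FFT power spectrum, then averaging blocks of consecutive frequency
    bins down to `n_bands` features per sample."""
    half = win_len // 2
    window = np.hanning(win_len)
    padded = np.pad(np.asarray(signal, dtype=float), half, mode="edge")
    n_bins = win_len // 2                       # positive-frequency bins
    features = np.empty((len(signal), n_bands))
    for t in range(len(signal)):
        seg = padded[t:t + win_len] * window    # window centred on sample t
        power = np.abs(np.fft.rfft(seg))[:n_bins] ** 2
        features[t] = power.reshape(n_bands, -1).mean(axis=1)
    return features
```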
For the undecimated wavelet transform encoding, we computed the wavelet coefficients at seven different
levels/scales of the UWT filter bank (generating a 7-D wavelet coefficient vector at each time sample). As discussed
previously, we experimented with three different wavelet bases: the C(6) Coiflet wavelet, the BIOR(6,2) biorthogonal
wavelet, and the LA(12) "least asymmetric" Daubechies wavelet. The latter two wavelets were chosen due to
their similarity to the first and second derivatives of a Gaussian, respectively (see Choice of Wavelet section).
Finally, we selected the wavelet basis with the best overall test set performance (based on the segmentation
accuracy for the Q, R and Toff points), and augmented the 7-D wavelet coefficient vector at each time sample with
the derivative of the coefficients from levels 4, 5 and 6 of the UWT filter bank (see Derivative Features section).
This resulted in a 10-D encoding of UWT coefficients and the associated derivative features.
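To illustrate the overall shape of this encoding, the sketch below computes an à trous (undecimated) transform using a simple Haar filter pair in place of the report's Coiflet/biorthogonal/least-asymmetric bases, and appends derivatives of levels 4-6. The filter choice and boundary handling are our simplifications.

```python
import numpy as np

def uwt_encoding(signal, n_levels=7):
    """Undecimated ("a trous") wavelet transform with a Haar filter pair
    (illustrative substitute for the report's wavelet bases). At level k
    the filters are upsampled by 2**k and no decimation is performed, so
    every level yields one coefficient per input sample."""
    x = np.asarray(signal, dtype=float)
    lo = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Haar low-pass
    hi = np.array([1.0, -1.0]) / np.sqrt(2.0)   # Haar high-pass
    approx, details = x, []
    for k in range(n_levels):
        step = 2 ** k
        g = np.zeros(step * (len(lo) - 1) + 1)  # zero-stuffed ("holey") filters
        h = np.zeros_like(g)
        g[::step] = lo
        h[::step] = hi
        details.append(np.convolve(approx, h, mode="same"))
        approx = np.convolve(approx, g, mode="same")
    W = np.stack(details, axis=1)               # (T, 7) wavelet coefficients
    dW = np.gradient(W[:, 3:6], axis=0)         # derivatives of levels 4-6
    return np.hstack([W, dW])                   # (T, 10) encoding
```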
Model Training and Testing
For each ECG encoding scheme, we trained a hidden Markov model in a supervised manner on the annotated
ECG waveforms from the training set (we trained separate models for leads II and V2). We used the five-state
HMM shown in Fig. 2(c) and incorporated minimum duration constraints into the model architecture as described
previously. In particular, the minimum duration for each model state was set to 0.8 of the minimum duration present
in the training set for that state (this value was found to be effective in preventing double-beat segmentations whilst
allowing accurate segmentations of short ECG waveform features). The initial state distribution and transition
matrix for the HMM were then estimated according to Eqs. (2) and (3). We used Gaussian mixture models (GMMs)
with full covariance matrices as the observation densities for each of the hidden states, due to their flexibility
in modelling multi-modal data. The parameters for each GMM were estimated using a combined model order
selection and EM algorithm due to Figueiredo & Jain [32].
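The supervised estimates of the initial-state distribution and transition matrix are relative-frequency counts over the labelled state sequences, in the spirit of the Eqs. (2) and (3) referenced above. A sketch (names ours; no smoothing of zero counts):

```python
import numpy as np

def estimate_hmm_parameters(state_sequences, n_states):
    """Supervised (counting) estimates of the initial-state distribution
    pi and transition matrix A from labelled state sequences."""
    pi = np.zeros(n_states)
    A = np.zeros((n_states, n_states))
    for seq in state_sequences:
        pi[seq[0]] += 1.0                       # count initial states
        for s, s_next in zip(seq[:-1], seq[1:]):
            A[s, s_next] += 1.0                 # count transitions
    pi /= pi.sum()
    A /= A.sum(axis=1, keepdims=True)           # row-normalise (assumes every state occurs)
    return pi, A
```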
The final stage of the model training procedure consisted of learning the HMM confidence model from the
annotated ECG waveforms. This was achieved by first computing the joint log likelihood value, according to
Eq. (15), for each annotated ECG beat in the training set. A linear regression model was then fit to the set of {beat
duration, log likelihood} pairs, and the mean regression line and 99% lower confidence bound determined. For a
segmented beat from the test set, a normalised confidence measure could then be computed by mapping the log
likelihood score through a sigmoid function fit to the mean regression and lower confidence values at the given beat
duration (see Confidence Measures section for further details).
In the model testing phase, the trained HMM was used to segment each 10-second ECG signal in the test set.
The segmentation was performed by using the Viterbi algorithm to infer the most probable underlying sequence
of hidden states for the given signal. Note that the full 10-second ECG signal was processed, as opposed to
just the manually annotated ECG beats, in order to mimic the way an automated system would be used for ECG
interval analysis in practice. Next, for each segmented ECG beat, we computed the normalised HMM confidence
measure using the learnt confidence model, as described previously. The model segmentations corresponding to
the particular beats which had been manually annotated were then extracted, and the differences between the model
segmentations and the corresponding expert analyst annotations were computed for those beats with a confidence
measure ≥ 0.7 (i.e. high confidence beats only).
The overall training and testing procedure (for each of leads II and V2) is summarised in Fig. 9.
[FIGURE 9 about here.]
In terms of computational complexity, our algorithm consists of three key stages: 1.) computation of the
undecimated wavelet transform, 2.) computation of the Viterbi algorithm, and 3.) computation of confidence
measures. Each of these stages can be performed in an efficient manner. In particular, for an ECG signal of length
T, the undecimated wavelet transform of the signal can be evaluated in O(T log T), and the model segmentations
using the Viterbi algorithm in O(T K C_avg), where K and C_avg are the total number of hidden states in the HMM
and the average state connectivity, respectively. Furthermore, the joint log likelihood values required for the HMM
confidence measures can be computed in O(T). Thus, our method is computationally efficient and can be readily
used in practice with very large data sets of ECG signals (such as those collected in thorough QT studies).
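A log-domain Viterbi recursion of the kind used here can be sketched as follows. The dense formulation below costs O(T K²); restricting the inner maximisation to each state's connected predecessors (sparse, left-to-right transitions) yields the O(T K C_avg) figure quoted above. The function name is ours.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most probable hidden state path, computed in log space.

    log_B[t, k] = log b_k(O_t); delta and psi follow the standard
    Viterbi recursion (see Rabiner's tutorial notation)."""
    T, K = log_B.shape
    delta = np.empty((T, K))
    psi = np.zeros((T, K), dtype=int)
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)          # best predecessor of each state
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):              # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path
```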
Results
Table 4 shows the mean errors (and the corresponding standard deviations in parentheses) between the hidden
Markov model segmentations and the expert analyst annotations on leads II and V2 of the test set. For each ECG
annotation point, the smallest mean error (across the six different encodings) is highlighted in bold.
[TABLE 4 about here.]
In our experiments we found that the use of the bandpass filtered ECG time-series as the observation sequence
for an HMM led to inaccurate and unreliable segmentations on the test set. In particular, the trained model tended
to produce very early P wave onset and T wave offset segmentations, most noticeably on the ECGs from lead II.
This poor performance is most likely a result of the considerable overlap in the distributions of the signal samples
for the various ECG wave shapes, which in turn leads to a lack of discrimination between the Gaussian mixture
observation models for the corresponding HMM states.
The three ECG transform encodings (STFT, UWT, and UWT + derivatives) each resulted in a substantial
improvement in the model segmentation performance compared with the filtered time-series representation. For
both leads II and V2, there is a marked improvement in the accuracy of the segmentations with the UWT encoding,
compared with the STFT. This highlights the advantage of using an adaptive time-frequency representation over a
fixed window approach.
For the three different wavelet bases investigated, the BIOR(6,2) wavelet resulted in the best overall performance.
Most significantly, the BIOR(6,2) encoding led to a considerable improvement in the R peak accuracy,
compared with that for the C(6) and LA(12) encodings. These results can be understood in terms of the similarity
of the BIOR(6,2) wavelet to the first derivative of a Gaussian (see Fig. 3 previously). The latter is known to be
optimal for the detection of discontinuities in signals and images (which have been corrupted by Gaussian noise),
and is commonly used for edge detection in image processing [33].
The use of derivative features with the UWT BIOR(6,2) encoding adds a further level of accuracy to the HMM
segmentations. This improvement is most pronounced for lead V2, for which the accuracy of the segmentations for
both the QRS onset and T wave offset points is improved significantly. For lead II, there is a small improvement in
the T wave offset error with the addition of derivative features. This indicates that the BIOR(6,2) wavelet alone is
able to capture most of the salient information in the ECG signal for this particular lead.
When assessing the performance of the trained HMMs, it is advantageous to consider the inter-analyst variability
inherent in the expert analyst ECG annotations. Given that a total of 21 expert analysts were used to annotate the
thorough QT data set, we would naturally expect a degree of variability in the manual QT measurements produced
by different analysts. Previous investigations into this issue have shown that the inter-analyst variability (defined
as the standard deviation of the differences in the QT intervals, for two analysts measuring the same set of ECG
waveforms) is typically of the order of 10 ms [34, 35].
For the HMM trained with the UWT BIOR(6,2) + derivative features encoding, the standard deviation of the
errors in the QT interval (i.e. automated QT - manual QT) for leads II and V2 is 12.4 ms and 10.4 ms, respectively.
Taking the inter-analyst variability in the manual QT measurements to be approximately 10 ms, we can conclude
that the model for both leads is performing close to the optimum level possible (given the variability inherent in the
data set annotations).
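The statistic in question is simply the sample standard deviation of the paired automated-minus-manual QT differences; a sketch (function name ours):

```python
import numpy as np

def qt_error_stats(auto_qt, manual_qt):
    """Mean and sample standard deviation (in ms) of the
    automated-minus-manual QT differences, the statistic compared
    against the ~10 ms inter-analyst variability in the text."""
    diff = np.asarray(auto_qt, dtype=float) - np.asarray(manual_qt, dtype=float)
    return diff.mean(), diff.std(ddof=1)
```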
Analysis of Confidence Measures
Fig. 10 shows a plot of the cumulative distribution of the HMM confidence measures for the manually annotated
beats from lead II of the test set. A notable feature of the cumulative confidence distribution is that only a small
proportion of the manually annotated ECG beats have a low confidence value. This is to be expected, however, since
the beats chosen for manual analysis by an expert will typically correspond to clean ECG waveforms with little
noise or artefact. Applying a confidence threshold of 0.7 results in 7.4% of the model segmentations for ECG beats
in the lead II test set being labelled as low confidence, and 8.3% of the beats in the lead V2 test set.
[FIGURE 10 about here.]
Fig. 11 shows a selection of the low confidence beats (i.e. confidence < 0.7) from the HMM segmentations of
the full 10-second ECG signals for the lead II test set. In each plot, the confidence value specified is an average of
the confidence values for the beat segmentations shown. For the ECGs in plots A, B and C, the Viterbi algorithm
was unable to locate any valid beats in the 10-second ECG signal. This was because the state observation probabilities
were extremely small, and thus δt(i) for each state (as computed in the Viterbi algorithm) was equal to
zero for much of the duration of the signal. As a result, it was not possible to segment these ECG signals and the
confidence values were therefore set to zero.
[FIGURE 11 about here.]
The ECGs shown in Fig. 11 exhibit a variety of different ECG noise sources and waveform abnormalities. In
particular, we can identify the following in each of the respective plots:
A. Poor electrode contact
B. Electrode contact noise
C. High-frequency mains interference
D. Low amplitude noisy T waves
E. Large amount of movement artefact (between the second and third beats)
F. Abnormal QRS complex morphology
Fig. 12 shows a selection of the high confidence beats (i.e. confidence ≥ 0.7) from the 10-second ECG signals
for the lead II test set. Again, the confidence value specified in each plot is an average of the confidence values for
the beat segmentations shown. We can see from Fig. 12 that the high confidence beats are associated with noise-free
ECG signals which contain well-defined QRS complexes and T waves. Visual inspection of the model annotations
(i.e. the Q and Toff points) for these beats indicates that the model is performing to a high degree of accuracy.
[FIGURE 12 about here.]
Comparison with Threshold-based Algorithm
We compared the performance of our HMM algorithm for automated ECG interval analysis (using the UWT
BIOR(6,2) + derivative features encoding scheme) with that of ECGPUWAVE – the standard thresholding algorithm
for ECG segmentation [7] (see Introduction). In order to compare and contrast the performance of the
two approaches, we produced Bland-Altman plots of agreement between the automated QT intervals (for each of
the two methods) and the corresponding manual QT intervals (as determined by the expert analysts). The Bland-Altman
plot displays the difference between two sets of measurements against the mean of the measurements [36].
This enables any systematic differences between the measurement techniques to be visualised, together with any
trend in the differences.
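The quantities plotted in a Bland-Altman analysis are straightforward to compute; a sketch with limits of agreement at ±2 standard deviations of the differences (function name ours):

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman summary for two sets of paired measurements [36]:
    per-pair means (x-axis), differences (y-axis), the mean difference
    (bias), and limits of agreement at +/- 2 standard deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mean = (a + b) / 2.0
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return mean, diff, bias, (bias - 2.0 * sd, bias + 2.0 * sd)
```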
Fig. 13 shows the Bland-Altman plots for the HMM segmentations (with confidence ≥ 0.7) and the ECGPUWAVE
segmentations, evaluated on lead V2 of the test set. The plot for the ECGPUWAVE algorithm shows a
considerable number of outliers, indicated by the large cloud of data points which lie to the upper right of the main
data density. These points correspond to significant over-estimates of the QT interval by the algorithm. In contrast,
the data points on the HMM plot are well clustered around the mean difference, and only a relatively small
number of outliers lie outside the two-standard-deviation range.
[FIGURE 13 about here.]
DISCUSSION
The introduction of the thorough phase 1 QT/QTc study, and the large volumes of ECG data which are now being
generated as a result, has led to a growing demand for accurate and reliable methods for ECG interval analysis. In
this context, automated systems offer the potential for a more extensive evaluation of ECG data from clinical trials,
and hence a more robust assessment of drug safety.
In this paper we have presented a new approach to automated ECG interval analysis based on the use of
probabilistic models. One of the most powerful features of this approach is the ability to learn a model for ECG
segmentation from expert annotated ECG data. Such data-driven learning allows us to sidestep neatly the problem
of having to specify an explicit rule for determining the end of the T wave in the electrocardiogram (as with, for
example, threshold-based methods).
Perhaps the most important aspect of the probabilistic modelling approach is the ability of the model to generate
a statistical confidence measure in its analysis of a given ECG waveform. Current automated ECG interval
analysis systems are limited by their inability to differentiate between normal ECG waveforms (for which automated
measurements may be reliably inferred), and abnormal or unusual ECG waveforms (for which automated
measurements are frequently unreliable). By utilising a confidence-based approach to automated ECG interval
analysis, however, we can automatically highlight those waveforms which are least suited to analysis by machine
(and thus most in need of analysis by a human expert). This strategy therefore provides an effective way to combine
the twin advantages of manual and automated ECG interval analysis.
Although the use of hidden Markov models for ECG analysis has been explored previously in the literature [11,
12], the joint probabilistic description these models provide has not previously been fully exploited. We have
shown how the confidence in the model segmentations can be assessed by evaluating the HMM joint log likelihood
log p(S*, O) with respect to the beat duration. Furthermore, by normalising this log likelihood value in an appropriate
manner, it is possible to derive normalised confidence values which can be readily interpreted by cardiologists and
medical practitioners.
In order for the automated ECG interval analysis system to produce both accurate segmentations and robust
and reliable confidence measures, it is necessary that the HMM provides a good model of the statistical
characteristics of normal ECG waveforms (i.e. that the trained HMM is a suitable description of ECG waveform
normality). Much of the research described in this paper is driven by the requirement to improve the "fit" between the
statistical properties of the standard hidden Markov model and those of the electrocardiogram signal.
Motivated by the observation independence assumption inherent in the HMM framework, we have demonstrated
that a combination of the undecimated wavelet transform and associated derivative features (evaluated over
a number of UWT scales) provides a representation of the ECG which is well suited to analysis by an HMM.
Furthermore, the robustness of the segmentations produced by an HMM can be improved considerably by incorporating
minimum state duration constraints into the model architecture. In this way the "duration constrained"
HMM is prevented from inferring waveform segmentations that are physiologically implausible.
The results on the thorough QT data set demonstrate that our algorithm is able to produce QT interval measurements
to a similar level of accuracy as expert analysts (taking into account the inherent variability in the
measurements of different experts). However, to quantify more precisely the sensitivity of our automated system, it
would be necessary to evaluate the performance of the model on an ECG data set with a small drug-induced change
in the QT/QTc interval (i.e. 5 to 10 ms). This level of change can be brought about by the drug moxifloxacin, which
is commonly used as a "positive control" in thorough QT studies. Demonstrating the sensitivity of our approach on
such data is a key focus of our current research.
ACKNOWLEDGEMENTS
This work was supported in part through the Engineering and Physical Sciences Research Council (EPSRC) under
Grant GR/N14248/01 and the UK Medical Research Council (MRC) under Grant No. D2025/31, and in part by
Oxford BioSignals Ltd. The authors would like to thank Jay Mason, Iain Strachan, Mahesan Niranjan, Tony
Robinson and Steve Young for many helpful comments on this work, and Covance Inc. for providing the clinical
data.
References
[1] R. R. Fenichel, M. Malik, C. Antzelevitch, M. Sanguinetti, D. M. Roden, S. G. Priori, J. N. Ruskin, R. J. Lipicky, and L. R. Cantilena. Drug-induced torsades de pointes and implications for drug development. Journal of Cardiovascular Electrophysiology, 15(4):475–495, 2004.
[2] R. R. Shah. The significance of QT interval in drug development. British Journal of Clinical Pharmacology, 54:188–202, 2002.
[3] FDA and Health Canada. The clinical evaluation of QT/QTc interval prolongation and proarrhythmic potential for non-antiarrhythmic drugs. Draft 4 (June 10, 2004), 2004.
[4] J. Pan and W. J. Tompkins. A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, 32(3):230–236, 1985.
[5] S. H. Meij, P. Klootwijk, J. Arends, and J. R. T. C. Roelandt. An algorithm for automatic beat-to-beat measurement of the QT-interval. In Computers in Cardiology, pages 597–600, 1994.
[6] Q. Xue and S. Reddy. Algorithms for computerized QT analysis. Journal of Electrocardiology, 30:181–186, 1998. Supplement.
[7] P. Laguna, R. Jane, and P. Caminal. Automatic detection of wave boundaries in multilead ECG signals: validation with the CSE database. Computers and Biomedical Research, 27:45–60, 1994.
[8] E. Lepeschkin and B. Surawicz. The measurement of the Q-T interval of the electrocardiogram. Circulation, VI:378–388, September 1952.
[9] C. Li, C. Zheng, and C. Tai. Detection of ECG characteristic points using wavelet transforms. IEEE Transactions on Biomedical Engineering, 42(1):21–28, 1995.
[10] J. S. Sahambi, S. N. Tandon, and R. K. P. Bhatt. An automated approach to beat-by-beat QT-interval analysis. IEEE Engineering in Medicine and Biology Magazine, 19(3):97–101, 2000.
[11] D. A. Coast, R. M. Stern, G. G. Cano, and S. A. Briller. An approach to cardiac arrhythmia analysis using hidden Markov models. IEEE Transactions on Biomedical Engineering, 37(9):826–835, 1990.
[12] R. V. Andreao, B. Dorizzi, and J. Boudy. ECG signal analysis through hidden Markov models. IEEE Transactions on Biomedical Engineering, 53(8):1541–1549, 2006.
[13] M. Malik. Errors and misconceptions in ECG measurement used for the detection of drug induced QT interval prolongation. Journal of Electrocardiology, 37:25–33, 2004. Supplement.
[14] S. Grisanti. The thorough phase I ECG. Applied Clinical Trials, October, 2004.
[15] P. Laguna, R. G. Mark, A. Goldberger, and G. B. Moody. A database for evaluation of algorithms for measurement of QT and other waveform intervals in the ECG. In Computers in Cardiology, pages 673–676, 1997.
[16] J. L. Willems, P. Arnaud, J. H. van Bemmel, P. J. Bourdillon, R. Degani, B. Denis, I. Graham, F. M. Harms, P. W. Macfarlane, G. Mazzocca, J. Meyer, and C. Zywietz. A reference data base for multilead electrocardiographic computer measurement programs. Journal of the American College of Cardiology, 10(6):1313–1321, 1987.
[17] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.
[18] S. Roweis and Z. Ghahramani. A unifying review of linear Gaussian models. Neural Computation, 11:305–345, 1999.
[19] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[20] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Transactions on Information Theory, IT-13:260–269, April 1967.
[21] A. R. Houghton and D. Gray. Making Sense of the ECG. Arnold, 1997.
[22] N. P. Hughes. Probabilistic models for automated ECG interval analysis. PhD thesis, University of Oxford, 2006.
[23] A. M. Fraser and A. Dimitriadis. Forecasting probability densities by using hidden Markov models with mixed states. In A. S. Weigend and N. A. Gershenfeld, editors, Time Series Prediction: Forecasting the Future and Understanding the Past, pages 265–282. Addison-Wesley, 1994.
[24] S. Furui. Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech and Signal Processing, 34:52–59, 1986.
[25] S. Mallat. A Wavelet Tour of Signal Processing. Academic Press, 2nd edition, 1999.
[26] R. R. Coifman and D. L. Donoho. Translation-invariant de-noising. In A. Antoniadis and G. Oppenheim, editors, Wavelets and Statistics, volume 103 of Lecture Notes in Statistics, pages 125–150. Springer-Verlag, 1995.
[27] G. P. Nason and B. W. Silverman. The stationary wavelet transform and some statistical applications. In A. Antoniadis and G. Oppenheim, editors, Wavelets and Statistics, volume 103 of Lecture Notes in Statistics, pages 281–300. Springer-Verlag, 1995.
[28] D. B. Percival and A. T. Walden. Wavelet Methods for Time Series Analysis. Cambridge University Press, 2000.
[29] S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland. The HTK Book. Cambridge University, 3.2.1 edition, 2002.
[30] N. P. Hughes, L. Tarassenko, and S. J. Roberts. Markov models for automated ECG interval analysis. In S. Thrun, L. Saul, and B. Schoelkopf, editors, Advances in Neural Information Processing Systems 16, pages 611–618, Cambridge, MA, 2004. MIT Press.
[31] R. J. Freund and W. J. Wilson. Regression Analysis: Statistical Modelling of a Response Variable. Academic Press, 1998.
[32] M. A. T. Figueiredo and A. K. Jain. Unsupervised learning of finite mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):381–396, 2002.
[33] J. Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986.
[34] J. Morganroth and H. M. Pyper. The use of electrocardiograms in clinical drug development: part 2. Clinical Research Focus, 12(6):19–25, 2001.
[35] I. Savelieva, G. Yi, X. Guo, K. Hnatkova, and M. Malik. Agreement and reproducibility of automatic versus manual measurement of QT interval and QT dispersion. The American Journal of Cardiology, 81:471–477, 1998.
[36] J. M. Bland and D. G. Altman. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i:307–310, 1986.
TABLE 1. Number of 10-second ECG signals and annotated beats for leads II and V2 of the thorough QT study data set.

Lead   Num. of 10-second ECGs   Num. of annotated ECG beats
II     4135                     12291
V2     4109                     12307
TABLE 2. A comparison of the three main types of wavelet transform.

WAVELET TRANSFORM TYPE   Scale s, Time-shift τ              Translation-invariant   Representation
Continuous (CWT)         s ∈ R+, τ ∈ R                      Yes                     Highly redundant
Discrete (DWT)           s = 2^k, τ = 2^k l, (k, l) ∈ Z²    No                      Orthogonal basis
Undecimated (UWT)        s = 2^k, τ ∈ R, k ∈ Z              Yes                     Frame
TABLE 3. The effect of duration constraints on the HMM segmentations.

(a) Lead II test set (5930 beats)

MODEL                           Number of segmentations of type:
                                Single-beat   Double-beat
Standard HMM                    5196          734
HMM with duration constraints   5930          0

(b) Lead V2 test set (6742 beats)

MODEL                           Number of segmentations of type:
                                Single-beat   Double-beat
Standard HMM                    6343          399
HMM with duration constraints   6742          0
TABLE 4. Mean errors in milliseconds (standard deviations in parentheses) between the "duration-constrained" HMM segmentations (with different ECG encodings) and the manual expert annotations, for leads II and V2 of the test set.

(a) Lead II test set (5930 beats)

ECG encoding                    Mean segmentation error (ms)
                                Pon              Q              R              J              Toff
BP filtered time-series         -461.1 (196.7)   -31.5 (59.6)   -62.1 (60.7)   -98.3 (60.6)   -208.2 (65.7)
STFT                            -8.3 (25.4)      -40.5 (33.6)   -29.2 (69.4)   41.3 (30.6)    -46.0 (44.5)
UWT [C(6) wavelet]              -0.8 (16.6)      -2.6 (4.7)     4.4 (10.4)     5.6 (7.3)      3.8 (19.5)
UWT [LA(12) wavelet]            -1.3 (15.5)      -3.7 (5.1)     5.8 (11.2)     4.4 (7.7)      2.3 (17.2)
UWT [BIOR(6,2) wavelet]         -1.0 (16.4)      -3.0 (4.4)     0.2 (1.5)      5.6 (7.1)      2.4 (12.6)
UWT [BIOR(6,2)] + derivatives   1.6 (8.7)        -3.6 (5.0)     0.4 (1.6)      4.5 (7.5)      -2.2 (11.2)

(b) Lead V2 test set (6742 beats)

ECG encoding                    Mean segmentation error (ms)
                                Pon              Q              R              J              Toff
BP filtered time-series         -94.8 (254.6)    -3.3 (45.6)    8.2 (3.8)      -17.3 (17.5)   -76.0 (42.6)
STFT                            -10.7 (35.7)     -27.2 (31.8)   21.2 (57.2)    18.7 (28.7)    -27.4 (36.1)
UWT [C(6) wavelet]              -7.8 (23.3)      -3.5 (4.0)     1.9 (3.2)      3.9 (6.9)      -5.2 (16.2)
UWT [LA(12) wavelet]            -10.4 (19.3)     -4.2 (4.2)     2.2 (3.1)      5.3 (7.1)      -7.0 (16.6)
UWT [BIOR(6,2) wavelet]         -4.8 (28.3)      -2.0 (4.0)     0.5 (4.1)      3.6 (6.7)      3.8 (12.0)
UWT [BIOR(6,2)] + derivatives   -5.6 (22.7)      -1.4 (3.9)     1.0 (2.7)      2.0 (6.7)      1.7 (9.7)
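The entries of Table 4 are signed errors (automated boundary position minus the manual annotation) summarised by their mean and standard deviation. A minimal sketch of this computation, with invented values for illustration:

```python
import numpy as np

# Hypothetical paired boundary annotations (ms) for one fiducial point,
# e.g. Toff: each automated value paired with the expert's annotation.
# These numbers are invented for illustration only.
automated = np.array([402.0, 398.0, 415.0, 407.0])
manual    = np.array([400.0, 401.0, 412.0, 405.0])

errors = automated - manual   # signed segmentation error per beat (ms)
mean_err = errors.mean()      # the table entry
sd_err = errors.std(ddof=1)   # sample SD; the paper does not state its convention

print(f"mean error = {mean_err:.1f} ms (SD {sd_err:.1f} ms)")
# → mean error = 1.0 ms (SD 2.7 ms)
```

A signed (rather than absolute) error preserves the direction of any systematic bias, which is why negative means in the table indicate boundaries placed earlier than the expert's.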
List of Figures

1  A typical human ECG waveform together with the standard ECG intervals.  44
2  Hidden Markov model architectures for ECG interval analysis.  45
3  The three different wavelets considered in this paper (C = Coiflet, BIOR = Biorthogonal, LA = Least Asymmetric).  46
4  The impact of derivative features on the accuracy of the T wave offset segmentation. The upper plots show the T wave region of an ECG, together with the T wave offset point (blue dash-dot line) determined by a.) an HMM with the UWT encoding only (upper left plot), and b.) an HMM with the UWT + derivative features encoding (upper right plot). In each case the T wave offset point determined by an expert analyst is also shown (red dashed line). The corresponding lower plots show, for each HMM, the difference between the log observation probabilities for the JT interval and Baseline states evaluated over the course of the T wave.  47
5  A comparison of the duration distribution curves for the JT interval, based on i.) a gamma density fitted to the JT interval durations derived from the expert analyst annotations (dashed line), and ii.) the geometric duration distribution for the corresponding HMM state (solid line).  48
6  An example of a double-beat segmentation by a hidden Markov model. The first beat segmentation inferred by the model (in blue) correctly identifies the onset of the P wave, but incorrectly "locates" a QRS complex (of duration one sample) and the T wave offset within the PR segment. The second beat segmentation inferred by the model (in green) then correctly identifies the boundaries of the QRS complex and the offset of the T wave in the ECG signal.  49
7  Graphical illustration of incorporating a duration constraint into an HMM (the dashed box indicates that the observation densities for all the states in the chain are tied).  50
8  A scatter plot of the joint log likelihood values against the beat duration for the annotated ECG beats from lead V2 of the training set.  51
9  Schematic illustration of the model training and testing procedure.  52
10 Cumulative distribution of the HMM confidence measures for the manually annotated ECG beats from lead II of the test set.  53
11 A selection of the low confidence ECGs (i.e. confidence < 0.7) from the lead II test set.  54
12 A selection of the high confidence ECGs (i.e. confidence ≥ 0.7) from the lead II test set.  55
13 Bland-Altman plots of agreement between automated (HMM - left plot, ECGPUWAVE - right plot) and manual QT interval measurements for lead V2 of the test set. The solid horizontal line indicates the mean QT difference, and the dashed lines indicate 2 x standard deviation error bars.  56
[Figure: ECG waveform with the boundary points Pon, Q, J and Toff; the Baseline, P wave, QRS complex and T wave regions; and the PR, QRSd and QT intervals.]

FIGURE 1. A typical human ECG waveform together with the standard ECG intervals.
[Figure: (a) HMM for PR and QT interval segmentation, with states PQ, QT and B; (b) HMM for QT interval segmentation, with states QT and X; (c) HMM for full ECG interval segmentation, with states PQ, QR, RJ, JT and B.]

FIGURE 2. Hidden Markov model architectures for ECG interval analysis.
[Figure: the wavelet functions ψ(t) for the C(6), BIOR(6,2) and LA(12) wavelets.]

FIGURE 3. The three different wavelets considered in this paper (C = Coiflet, BIOR = Biorthogonal, LA = Least Asymmetric).
[Figure: (a) Without derivative features; (b) With derivative features. Upper plots: ECG signal (T wave region) with the Toff points from the HMM and the analyst; lower plots: JT log observation probabilities minus Baseline log observation probabilities.]

FIGURE 4. The impact of derivative features on the accuracy of the T wave offset segmentation. The upper plots show the T wave region of an ECG, together with the T wave offset point (blue dash-dot line) determined by a.) an HMM with the UWT encoding only (upper left plot), and b.) an HMM with the UWT + derivative features encoding (upper right plot). In each case the T wave offset point determined by an expert analyst is also shown (red dashed line). The corresponding lower plots show, for each HMM, the difference between the log observation probabilities for the JT interval and Baseline states evaluated over the course of the T wave.
[Figure: duration probability P(d) against state duration d (ms), for the HMM (solid line) and the analyst-derived gamma fit (dashed line).]

FIGURE 5. A comparison of the duration distribution curves for the JT interval, based on i.) a gamma density fitted to the JT interval durations derived from the expert analyst annotations (dashed line), and ii.) the geometric duration distribution for the corresponding HMM state (solid line).
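The mismatch shown in Figure 5 can be reproduced in a small sketch: a standard HMM state implies a geometric duration distribution whose mode is at d = 1, whereas a gamma density fitted to the analyst-derived durations peaks near the mean JT duration. The mean duration, shape and scale values below are illustrative assumptions, not the fitted values from the paper.

```python
import math

def geometric_duration(d, a_self):
    # Implicit state duration of a standard HMM: geometric with
    # self-transition probability a_self; mean duration 1/(1 - a_self).
    return (a_self ** (d - 1)) * (1 - a_self)

def gamma_duration(d, k, theta):
    # Gamma density (shape k, scale theta) -- the form fitted to the
    # analyst-derived JT durations in Figure 5.
    return d ** (k - 1) * math.exp(-d / theta) / (math.gamma(k) * theta ** k)

# Match the two on an illustrative mean JT duration of 300 samples.
mean_d = 300.0
a_self = 1.0 - 1.0 / mean_d      # geometric mean = 1/(1 - a_self) = 300
k, theta = 25.0, mean_d / 25.0   # gamma mean = k * theta = 300

# The geometric density decreases monotonically, so it is largest at d = 1,
# while the gamma density peaks near the mean -- the mismatch that
# motivates the duration constraints.
assert geometric_duration(1, a_self) > geometric_duration(300, a_self)
assert gamma_duration(300.0, k, theta) > gamma_duration(1.0, k, theta)
```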
[Figure: ECG beat with two sets of HMM boundary points (Pon, Q, J, Toff) inferred within a single true beat.]

FIGURE 6. An example of a double-beat segmentation by a hidden Markov model. The first beat segmentation inferred by the model (in blue) correctly identifies the onset of the P wave, but incorrectly "locates" a QRS complex (of duration one sample) and the T wave offset within the PR segment. The second beat segmentation inferred by the model (in green) then correctly identifies the boundaries of the QRS complex and the offset of the T wave in the ECG signal.
[Figure: state k, with self-transition probability akk and observation density bk, expanded into a chain of states k1, k2, ..., kd-1, kd; the states in the dashed box share the tied observation density bk, and the final state kd retains the self-transition akk.]

FIGURE 7. Graphical illustration of incorporating a duration constraint into an HMM (the dashed box indicates that the observation densities for all the states in the chain are tied).
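The construction of Figure 7 can be sketched as a transition-matrix expansion: replacing state k by a chain of d tied sub-states forces the model to spend at least d samples in the state before it can exit. This is a sketch of the general technique only; the paper's actual state topology and minimum durations are not reproduced here.

```python
import numpy as np

def expand_state(d_min, a_self):
    """Transition matrix for one HMM state expanded into a chain of d_min
    tied sub-states: the first d_min - 1 sub-states advance
    deterministically, and only the last may self-loop or exit, so the
    state cannot be occupied for fewer than d_min samples."""
    A = np.zeros((d_min, d_min + 1))   # extra column = probability of exiting
    for i in range(d_min - 1):
        A[i, i + 1] = 1.0              # forced advance along the chain
    A[-1, d_min - 1] = a_self          # self-loop on the final sub-state
    A[-1, d_min] = 1.0 - a_self        # exit after at least d_min steps
    return A

A = expand_state(d_min=5, a_self=0.9)
assert A.shape == (5, 6)
assert np.allclose(A.sum(axis=1), 1.0)   # each row is a valid distribution
```

Because all sub-states share the same (tied) observation density, the expansion changes only the duration behaviour of the state, not its emission model.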
[Figure: HMM joint log likelihood p(O, S*) for each beat against beat duration (ms), showing the annotated ECG beats from the training set and the lower confidence bound for the linear regression.]

FIGURE 8. A scatter plot of the joint log likelihood values against the beat duration for the annotated ECG beats from lead V2 of the training set.
Training:
1. Encode training set ECGs using UWT + derivative features.
2. Incorporate duration constraints into the HMM architecture.
3. Learn the HMM parameters {A, b, pk} using supervised learning estimators.
4. Compute HMM log likelihood scores for the annotated ECG beats.
5. Learn the HMM confidence model by fitting a linear regression to the log likelihoods.

Testing:
1. Encode test set ECGs using UWT + derivative features.
2. Segment the 10-second ECG signals using the HMM + Viterbi algorithm.
3. Compute the HMM confidence measure for each segmented ECG beat.
4. Extract the HMM segmentations for the ECG beats which were manually annotated.
5. Compute the segmentation errors for the high confidence ECG beats only.

FIGURE 9. Schematic illustration of the model training and testing procedure.
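The confidence-model stage of training (the regression of joint log likelihood on beat duration shown in Figure 8) can be sketched as follows. The regression itself follows the paper; the particular `confidence` scoring function and `margin` value below are hypothetical placeholders, as the paper's exact confidence formula is not reproduced here.

```python
import numpy as np

# Invented (beat duration, joint log likelihood) training pairs; the real
# model is fitted to the annotated beats of the training set.
durations = np.array([600.0, 700.0, 800.0, 900.0, 1000.0])
loglik    = np.array([1.1e4, 1.3e4, 1.5e4, 1.7e4, 1.9e4])

# Linear trend of log likelihood against beat duration.
slope, intercept = np.polyfit(durations, loglik, 1)

def confidence(duration, ll, margin=2e3):
    # Hypothetical scoring rule: position of the beat's log likelihood
    # relative to a lower bound placed `margin` below the regression line,
    # clipped to [0, 1]. The paper's exact confidence measure differs.
    lower = slope * duration + intercept - margin
    return float(np.clip((ll - lower) / margin, 0.0, 1.0))

assert abs(confidence(800.0, 1.5e4) - 1.0) < 1e-6   # beat on the trend line
assert confidence(800.0, 1.2e4) < 1e-6              # beat far below the bound
```

Regressing on beat duration matters because longer beats accumulate more per-sample log probabilities, so a raw likelihood threshold would unfairly penalise slow heart rates.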
[Figure: number of ECG beats against confidence measure (0 to 0.9).]

FIGURE 10. Cumulative distribution of the HMM confidence measures for the manually annotated ECG beats from lead II of the test set.
[Figure: six ECG beats with the Q and Toff points marked. Plot A: Confidence = 0.00; Plot B: Confidence = 0.00; Plot C: Confidence = 0.00; Plot D: Confidence = 0.16; Plot E: Confidence = 0.34; Plot F: Confidence = 0.43.]

FIGURE 11. A selection of the low confidence ECGs (i.e. confidence < 0.7) from the lead II test set.
[Figure: six ECG beats with the Q and Toff points marked, with confidence values 0.70, 0.73, 0.77, 0.84, 0.89 and 0.92.]

FIGURE 12. A selection of the high confidence ECGs (i.e. confidence ≥ 0.7) from the lead II test set.
[Figure: difference between automated and manual QT (ms) against the average of automated and manual QT (ms), for the hidden Markov model (left) and ECGPUWAVE (right).]

FIGURE 13. Bland-Altman plots of agreement between automated (HMM - left plot, ECGPUWAVE - right plot) and manual QT interval measurements for lead V2 of the test set. The solid horizontal line indicates the mean QT difference, and the dashed lines indicate 2 x standard deviation error bars.
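The quantities plotted in Figure 13 are standard Bland-Altman statistics: per-beat differences against per-beat means, with the mean difference (bias) and mean ± 2 SD limits of agreement. A minimal sketch with invented QT values:

```python
import numpy as np

# Invented paired QT measurements (ms): automated (HMM) vs manual.
auto_qt   = np.array([400.0, 420.0, 380.0, 410.0, 430.0])
manual_qt = np.array([404.0, 418.0, 384.0, 412.0, 428.0])

means = (auto_qt + manual_qt) / 2.0   # x-axis of the Bland-Altman plot
diffs = auto_qt - manual_qt           # y-axis
bias = diffs.mean()                   # solid horizontal line
sd = diffs.std(ddof=1)                # sample SD of the differences
upper, lower = bias + 2 * sd, bias - 2 * sd   # dashed 2 x SD error bars

print(f"bias = {bias:.1f} ms, limits of agreement = [{lower:.1f}, {upper:.1f}] ms")
# → bias = -1.2 ms, limits of agreement = [-7.3, 4.9] ms
```

Plotting differences against the pairwise mean, rather than against either method alone, avoids spurious correlation between the error and the reference measurement.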