A Simple and Efficient Spectral Features for Breathing and Snoring Sound Classification

Xiang Sun a, Jin Young Kim a,*, Yonggwan Won a, Jung-Ja Kim b, and Kyung-Ah Kim c

a Dept. of ECE, College of Eng., Chonnam National University, Gwangju, 500-757, S. Korea

b Dept. of Biomedical Eng., College of Eng., Chonbuk National University, Jeonju, S. Korea

c Dept. of Biomedical Eng., College of Medicine, Chonbuk National University, Jeonju, S. Korea

Abstract. This paper proposes an efficient method for detecting snoring and related events (expiration, inspiration and silence) in sleep sound recordings. The feature vector consists of the normalized mean and standard deviation of the energy in 3 sub-bands. The features are based on acoustic properties of snoring sounds, which our experiments confirm to be effective for snoring detection. Classification is then performed with a Support Vector Machine (SVM). A database of approximately 32 hours was recorded from subjects with an acknowledged snoring habit. The method is evaluated by classifying the different events in the sleep sound recordings and comparing the results with the ground truth. The algorithm achieved an accuracy of 97.40% for snore, 99.90% for breath and 100% for silence.

Keywords: Snoring detection, Obstructive Sleep Apnea, Support Vector Machine

1. Introduction

Sleep quality is one of the most important factors in evaluating a person’s health. Obstructive sleep apnea (OSA), a respiratory disorder, has been reported to be common [1]. OSA is characterized by repetitive pauses in breathing during sleep and is usually associated with the hypoxemia syndrome, which can lead to harmful consequences such as daytime tiredness, an increased risk of stroke and cardiovascular disease, and even sudden apnea [2, 3]. However, individuals with OSA are rarely aware of their breathing difficulty, so the symptoms may last for years or even decades without being identified.

* Corresponding author. E-mail: [email protected]

In recent years, several studies have shown a relationship between snoring and OSA, which is usually accompanied by loud and heavy snoring. Snoring is the most common symptom of OSA, occurring in 70% to 95% of patients [4]. Earlier studies [5] indicated that snoring may play a key role in diagnosis and in differentiating between healthy subjects and OSA patients. A patient’s sleep acoustic characteristics can be analyzed from whole-night polysomnography (PSG) [6] records, but this requires a full-night examination during which the individual is connected to numerous devices in the diagnosis room. PSG is therefore expensive and difficult to provide for every patient, and a fast and efficient method for snore and non-snore detection is urgently needed.

Several snoring detection methods with encouraging performance have been developed. Duckitt, Tuomi and Niesler [7] applied speech recognition techniques to snoring detection (using 6 simple snoring subjects): Mel-frequency cepstral coefficient (MFCC) features were extracted and classified with a hidden Markov model (HMM), achieving a classification rate of 89%. Cavusoglu et al. [8] applied Principal Component Analysis (PCA) to reduce 15-dimensional sub-band spectral energy vectors to 2 principal components and used robust linear regression (RLR) for classification; the detection accuracy was 90.2% (using 15 subjects for design and testing respectively). Dafna et al. [9] proposed a Gaussian mixture model (GMM)-based method for snoring detection that uses 40-dimensional feature vectors composed of MFCC, time-domain and energy features, and reported a detection rate of 98.1% for snore and non-snore.

To develop a fast and efficient method for snoring detection, we propose a simple feature set for classification: the normalized mean and standard deviation of sub-band spectral energy, which show a clear distinction between snore and the other classes. A Support Vector Machine (SVM) is then used to classify each frame, achieving an accuracy of 97.40% for snore, 99.90% for breath and 100% for silence.

Section 2 describes the methods: the overall structure is given in sub-section 2.1, and the snoring signal analysis and feature extraction in sub-section 2.2. Section 3 describes the database and the experimental results, and Section 4 gives the conclusion.

2. Methods

2.1. Overall structure

The raw sleep recordings are processed by the proposed detection system shown in Fig. 1. The system consists of the following steps.

1) As shown in Fig. 1, the breathing sound signal is segmented into frames of constant size, and spectral features are computed from each frame.

2) In the training stage, the features, each labeled as silence, snore, inspiration or expiration, are fed to the SVM classifier; training yields the support vectors for each class.

3) In the testing stage, each input feature vector is assigned to one of the breathing modes by the SVM classifier.

Fig. 1. The block diagram of the snore detection system.
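The framing step in 1) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes NumPy, and it takes the 16 kHz sampling rate, the 128 ms Hamming window and the 50% overlap from the values stated elsewhere in the paper. The function name `frame_signal` and the choice to drop the incomplete tail of the signal are assumptions made for this example.

```python
import numpy as np

def frame_signal(x, fs=16000, win_ms=128, overlap=0.5):
    """Split a 1-D signal into Hamming-windowed frames, one frame per row."""
    win_len = int(round(fs * win_ms / 1000))      # 2048 samples at 16 kHz
    hop = int(round(win_len * (1.0 - overlap)))   # 1024 samples, i.e. 64 ms
    if len(x) < win_len:
        return np.empty((0, win_len))
    # Assumption: samples after the last complete frame are discarded.
    n_frames = 1 + (len(x) - win_len) // hop
    window = np.hamming(win_len)
    # Index matrix: row j selects samples [j*hop, j*hop + win_len).
    idx = hop * np.arange(n_frames)[:, None] + np.arange(win_len)[None, :]
    return np.asarray(x, dtype=float)[idx] * window

one_second = np.zeros(16000)           # placeholder for real audio samples
frames = frame_signal(one_second)
print(frames.shape)                    # (number of frames, samples per frame)
```

Each row of the returned array is one windowed frame that can be passed to a spectral feature extractor.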

2.2. Feature extraction

Fig. 2. Spectrogram of a sequence of snoring and other sounds.

(Annotated episodes: 1. Inspiration. 2. Expiration. 3. Snore.)

From observations of snoring samples in the frequency domain, Beck et al. [10] indicate that snore and breath usually have strong energy in the range of 64-800 Hz. By investigating the spectra in that band, we found the following more distinctive characteristics.

1) Snoring signals have harmonic components (HCs), which are very clear in the frequency range of 50-300 Hz.

2) In the range of 300-550 Hz, the HCs of snoring signals are mixed with breathing noise of nearly equal strength.

3) In the range of 550-800 Hz, breathing noise dominates the HCs, even though HCs are still visible in the snoring case.

4) Breathing sounds without snoring have strong power in the range of 300-800 Hz.

5) Compared with inspiration, expiration produces a slightly stronger and shorter noise burst.

Considering the above observations, we set critical frequencies at 50 Hz, 300 Hz, 550 Hz and 800 Hz. That is, the band of interest, 50-800 Hz, is divided into three sub-bands by the critical frequencies. The features must also reflect the tonal property of snoring signals. Tonal signals have spectral peaks and show high kurtosis in the frequency domain. However, kurtosis is time-consuming to compute, so we adopt the standard deviation as a simpler feature suited to breathing and snoring detection.
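The following small sketch illustrates why the standard deviation can stand in for kurtosis as a tonality cue. It is not taken from the paper: the two spectra are made-up values, one flat and noise-like, the other with a few strong peaks, and the helper names are hypothetical. Both measures rank the peaked spectrum higher, while the standard deviation needs only second-order moments.

```python
from statistics import fmean, pstdev

def normalized_std(power):
    """Standard deviation of a power spectrum divided by its mean."""
    return pstdev(power) / fmean(power)

def kurtosis(power):
    """Fourth standardized moment (non-excess kurtosis)."""
    m, s = fmean(power), pstdev(power)
    return fmean([((p - m) / s) ** 4 for p in power])

# Made-up example spectra (assumptions for illustration only).
noise_like = [1.0, 1.2, 0.8, 1.1, 0.9] * 6                   # flat, breath-like
tonal = [10.0 if k % 10 == 0 else 0.1 for k in range(30)]    # three harmonic peaks

for name, spec in (("noise-like", noise_like), ("tonal", tonal)):
    print(f"{name}: std/mean = {normalized_std(spec):.2f}, "
          f"kurtosis = {kurtosis(spec):.2f}")
```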

Based on the background above, a 6-dimensional feature vector is proposed, consisting of the normalized mean and standard deviation of 3 sub-bands, 50-300 Hz, 50-550 Hz and 50-800 Hz, computed for each frame. For comparison, we also experimented with non-overlapping sub-bands: 50-300 Hz, 300-550 Hz and 550-800 Hz. Experimentally, the overlapping sub-bands perform better because expiration and snore are similar in the 550-800 Hz sub-band. The advantage of the proposed sub-bands is shown in Fig. 3. Each frame is obtained with a 128 ms Hamming window and 50% overlap. The features of the jth frame are then calculated as:

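$$\mu_i = \frac{E\left[\,|F(j,f)|^2 \mid f \in B_i\,\right]}{E\left[\,|F(j,f)|^2 \mid f \in B\,\right]} \qquad (1)$$

$$\sigma_i = \frac{\mathrm{Std}\left[\,|F(j,f)|^2 \mid f \in B_i\,\right]}{E\left[\,|F(j,f)|^2 \mid f \in B\,\right]} \qquad (2)$$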

where B1 = [50, 300], B2 = [50, 550], B3 = [50, 800] and B = [50, 8000] (in Hz), and F(j, f) is the fast Fourier transform of the jth frame at frequency f. A median filter is applied after computing the energy of each frame. Finally, the 6-dimensional feature vector is applied to the SVM classifier for training.
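A minimal sketch of this feature computation is given below, assuming NumPy and frames that are already Hamming-windowed. It follows the band definitions in the text, but several details are assumptions not stated in the paper: the median filter is applied to each feature trajectory across frames, its length (5 frames) is arbitrary, and the function names are hypothetical.

```python
import numpy as np

SUB_BANDS_HZ = [(50, 300), (50, 550), (50, 800)]   # B1, B2, B3 from the text
FULL_BAND_HZ = (50, 8000)                           # B as printed in the text

def median_filter_1d(x, size=5):
    """Sliding median along axis 0 with edge padding (size: assumed, odd)."""
    pad = size // 2
    padded = np.pad(x, [(pad, pad)] + [(0, 0)] * (x.ndim - 1), mode="edge")
    windows = np.stack([padded[k:k + len(x)] for k in range(size)], axis=0)
    return np.median(windows, axis=0)

def subband_features(frames, fs=16000, smooth=5):
    """Return an (n_frames, 6) array: [mu_1, mu_2, mu_3, sigma_1, sigma_2, sigma_3]."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # |F(j, f)|^2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    in_full = (freqs >= FULL_BAND_HZ[0]) & (freqs <= FULL_BAND_HZ[1])
    # Normalizer: mean power over B; the small constant guards silent frames.
    ref = power[:, in_full].mean(axis=1) + 1e-12
    means, stds = [], []
    for lo, hi in SUB_BANDS_HZ:
        band = power[:, (freqs >= lo) & (freqs <= hi)]
        means.append(band.mean(axis=1) / ref)                # normalized mean
        stds.append(band.std(axis=1) / ref)                  # normalized std
    feats = np.stack(means + stds, axis=1)
    # Assumption: the median filter smooths each feature over time.
    return median_filter_1d(feats, smooth) if smooth > 1 else feats

# Usage with a synthetic 200 Hz tone standing in for a snore-like frame.
t = np.arange(2048) / 16000
tone = np.hamming(2048) * np.sin(2 * np.pi * 200 * t)
print(subband_features(np.tile(tone, (7, 1))).shape)
```

For a tone inside B1 the normalized mean is largest in the narrowest band, since the same energy is averaged over fewer frequency bins.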

3. Experimental Results

A portable recorder was used to make an 8-hour overnight sleep recording of each of four subjects at a sampling frequency of 16 kHz with 16 bits per sample. All subjects have an acknowledged snoring habit, giving approximately 32 hours of recordings. The database, which contains sufficient snoring episodes for our analysis and experiments, is categorized into 4 classes: expiration, inspiration, snore and silence. To assess the feasibility of our method for snoring and apnea detection, only these primary classes in the database are selected.

To obtain the experimental data, three main steps are employed:

1) From each of the four 8-hour recordings, randomly extract episodes belonging to each class and annotate them.

2) Combine the extracted episodes into 4 new audio datasets, one per class, so that each dataset contains only one type of sound.

3) Divide each new audio dataset into 2 parts: the first half for training and the second half for testing.

The experiment with the proposed approach was conducted on the datasets shown in Table 1. The table gives the data length of each class; the length in each cell is also converted into a number of frames using a frame size of 128 ms with 50% overlap. The training and testing datasets come from the same individuals but do not overlap.
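The frame counts in Table 1 are consistent with dividing each duration by the 64 ms hop (128 ms window, 50% overlap) and rounding down. The short check below works under that assumption; the function name is hypothetical, and integer milliseconds are used to avoid floating-point rounding.

```python
HOP_MS = 64  # 128 ms window with 50% overlap gives a 64 ms hop

def seconds_to_frames(seconds, hop_ms=HOP_MS):
    """Number of whole hops in the given duration (assumed rounding: floor)."""
    return (seconds * 1000) // hop_ms

total_seconds = {"Expiration": 360, "Inspiration": 320, "Silence": 410, "Snore": 670}
for name, total in total_seconds.items():
    half = total // 2   # first half for training, second half for testing
    print(name, seconds_to_frames(total), seconds_to_frames(half))
```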

Fig. 3. The feature details in different sub-bands.

(a) and (b) show the feature values of the sub-band 550-800 Hz; (c) and (d) show the feature values of the sub-band 50-800 Hz, for the same signal sequence as in Fig. 2.

The 6-dimensional feature vectors of the experimental datasets in Table 1 are computed and applied to the SVM for training and classification. The results are then compared with the ground truth to calculate the accuracy; the confusion matrix is shown in Table 2. For comparison with the proposed features, Mel-frequency cepstral coefficient (MFCC) features are also implemented. MFCC is recommended as the best feature for audio event detection [11], and several related works have used MFCC with good performance. The MFCC feature vector is 42-dimensional, including the log energy, the 0th cepstral coefficient, and the delta and delta-delta coefficients. These features are also calculated every 64 ms with a 128 ms Hamming window on the same datasets and applied to the same SVM classifier. The results using the MFCC features are shown in Table 3.
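The training and evaluation procedure can be sketched with scikit-learn as below. This is a hedged illustration, not the authors' setup: the paper does not state the SVM kernel or any feature scaling, so the RBF kernel and the `StandardScaler` step are assumptions, and the data are synthetic clusters standing in for real 6-dimensional frame features.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

CLASSES = ["expiration", "inspiration", "silence", "snore"]

def train_and_evaluate(X_train, y_train, X_test, y_test):
    """Fit an SVM and return it with a row-normalized confusion matrix in %."""
    # Assumptions: RBF kernel and standardization (not specified in the paper).
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    model.fit(X_train, y_train)
    cm = confusion_matrix(y_test, model.predict(X_test), labels=CLASSES)
    # Each row is a ground-truth class and sums to 100%.
    return model, 100.0 * cm / cm.sum(axis=1, keepdims=True)

def make_split(n, seed):
    """Synthetic stand-in data: one tight 6-D cluster per class."""
    rng = np.random.default_rng(seed)
    centers = 5.0 * np.eye(4, 6)
    X = np.vstack([c + 0.1 * rng.standard_normal((n, 6)) for c in centers])
    y = np.repeat(CLASSES, n)
    return X, y

X_tr, y_tr = make_split(50, seed=0)
X_te, y_te = make_split(50, seed=1)
model, cm_percent = train_and_evaluate(X_tr, y_tr, X_te, y_te)
print(np.round(cm_percent, 2))
```

With real data, `X_tr` and `X_te` would hold the per-frame feature vectors of the first and second halves of each class dataset.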

Table 1
Training and testing datasets information.

              Total length           Training              Testing
Expiration    360 s (5625 frames)    180 s (2812 frames)   180 s (2812 frames)
Inspiration   320 s (5000 frames)    160 s (2500 frames)   160 s (2500 frames)
Silence       410 s (6406 frames)    205 s (3203 frames)   205 s (3203 frames)
Snore         670 s (10468 frames)   335 s (5234 frames)   335 s (5234 frames)

Table 2
Results using proposed feature (rows: ground truth; columns: classification result).

              Expiration   Inspiration   Silence    Snore
Expiration    95.40%       4.60%         0.00%      0.00%
Inspiration   35.20%       64.60%        0.00%      0.20%
Silence       0.00%        0.00%         100.00%    0.00%
Snore         2.60%        0.00%         0.00%      97.40%

Table 3
Results using MFCC feature (rows: ground truth; columns: classification result).

              Expiration   Inspiration   Silence    Snore
Expiration    14.70%       75.60%        0.00%      9.70%
Inspiration   0.10%        99.90%        0.00%      0.00%
Silence       0.00%        5.80%         94.20%     0.00%
Snore         0.00%        5.00%         0.00%      95.00%

Both experiments show that most of the errors occur between expiration and inspiration. However, when expiration and inspiration are merged into a single breath class, the proposed feature gives a breath detection accuracy of 99.90%, while the accuracies for silence and snore are 100% and 97.40% respectively. Moreover, a comparison of Tables 2 and 3 indicates that the proposed feature outperforms MFCC in this snoring classification problem: the MFCC-based detection rates are 94.86% for breath, 94.2% for silence and 95% for snore.
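The merged breath accuracies can be reproduced from the confusion matrices, assuming they are averages over the expiration and inspiration rows weighted by the number of test frames in Table 1. The sketch below works under that assumption (the function name is hypothetical). It gives roughly 99.91% for the proposed feature and 94.87% for MFCC, which agree with the reported 99.90% and 94.86% up to rounding.

```python
# Test frames per class, from Table 1.
TEST_FRAMES = {"expiration": 2812, "inspiration": 2500}

def breath_accuracy(rows, frames=TEST_FRAMES):
    """Frame-weighted share (%) of breath frames classified as either breath class."""
    correct = 0.0
    for cls, n in frames.items():
        as_breath = rows[cls]["expiration"] + rows[cls]["inspiration"]
        correct += n * as_breath / 100.0
    return 100.0 * correct / sum(frames.values())

# Expiration and inspiration rows of Table 2 (proposed) and Table 3 (MFCC).
proposed = {"expiration": {"expiration": 95.40, "inspiration": 4.60},
            "inspiration": {"expiration": 35.20, "inspiration": 64.60}}
mfcc = {"expiration": {"expiration": 14.70, "inspiration": 75.60},
        "inspiration": {"expiration": 0.10, "inspiration": 99.90}}

print(f"proposed: {breath_accuracy(proposed):.2f}%")
print(f"MFCC:     {breath_accuracy(mfcc):.2f}%")
```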

4. Discussion and conclusion

In this paper we proposed simple and efficient features for snoring detection. The features are 6-dimensional: the normalized mean power and standard deviation of 3 sub-bands. The performance of the proposed system is encouraging and comparable with that of the MFCC-based system, MFCC being recommended as the best feature. The advantages of the proposed system are its low computational load, low cost and ease of implementation in applications such as bedside devices installed in the patient’s home and smartphone applications.

In the future, we will implement post-processing based on this method, apply the results to snoring episode detection, and explore other feature extraction methods on a more comprehensive database.

5. References

[1] M. R. Mannarino, F. D. Filippo, and M. Pirro, “Obstructive sleep apnea syndrome,” European Journal of Internal Medicine, vol. 7, pp. 586-593, 2012.

[2] A. Tarasiuk, S. G. Dotan, T. Simon, T. Tal, A. Oksenberg, and H. Reuveni, “Low socioeconomic status is a risk factor for cardiovascular disease among adult obstructive sleep apnea syndrome patients requiring treatment,” Chest, vol. 130, pp. 766-773, 2006.

[3] H. K. Yaggi, J. Concato, W. N. Kernan, J. H. Lichtman, L. M. Brass, and V. Mohsenin, “Obstructive Sleep Apnea as a Risk Factor for Stroke and Death,” New England Journal of Medicine, vol. 353, no. 19, pp. 2034-2041, 2005.

[4] M. Partinen and T. Telakivi, “Epidemiology of obstructive sleep apnea syndrome,” Sleep, vol. 15, pp. S1-4, 1992.

[5] N. Ben-Israel, A. Tarasiuk, and Y. Zigel, “Nocturnal sound analysis for the diagnosis of obstructive sleep apnea,” Proceedings of International Conference of IEEE Engineering in Medicine and Biology Society 2010, pp. 6146-6149, 2010.

[6] R. Agarwal and J. Gotman, “Digital tools in polysomnography,” Journal of Clinical Neurophysiology, vol. 19(2), pp. 136-43, 2002.

[7] W. D. Duckitt, S. K. Tuomi, and T. R. Niesler, “Automatic detection, segmentation and assessment of snoring from ambient acoustic data,” Physiological Measurement, vol. 27, pp. 1047-1056, 2006.

[8] M. Cavusoglu, M. Kamasak, O. Erogul, T. Ciloglu, Y. Serinagaoglu, and T. Akcam, “An efficient method for snore/nonsnore classification of sleep sounds,” Physiological Measurement, vol. 28, pp. 841-853, 2007.

[9] E. Dafna, A. Tarasiuk, and Y. Zigel, “Automatic Detection of Whole Night Snoring Events Using Non-Contact Microphone,” PLoS ONE, vol. 8, no. 12, pp. e84139, 2013.

[10] R. Beck, M. Odeh, A. Oliven, and N. Gavriely, “The acoustic properties of snores,” European Respiratory Journal, vol. 8(12), pp. 2120-8, 1995.

[11] T. Kinnunen and H. Li, “An overview of text-independent speaker recognition: From features to supervectors,” Speech Communication, vol. 52, pp. 12-40, 2010.
