Abeyratne Et Al. - 2013 - Obstructive Sleep Apnea Screening by Integrating Snore Feature Classes


[Abeyratne2013]

IOP PUBLISHING PHYSIOLOGICAL MEASUREMENT


Physiol. Meas. 34 (2013) 99–121 doi:10.1088/0967-3334/34/2/99

Obstructive sleep apnea screening by integrating snore feature classes

U R Abeyratne 1 , S de Silva 1 , C Hukins 2 and B Duce 2

1 School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, Brisbane, Australia
2 Sleep Disorder Center, Princess Alexandra Hospital, Brisbane, Australia

E-mail: [email protected]

Received 16 August 2012, accepted for publication 15 November 2012
Published 23 January 2013
Online at stacks.iop.org/PM/34/99

Abstract
Obstructive sleep apnea (OSA) is a serious sleep disorder with high
community prevalence. More than 80% of OSA sufferers remain undiagnosed.
Polysomnography (PSG) is the current reference standard used for OSA
diagnosis. It is expensive, inconvenient and demands the extensive involvement
of a sleep technologist. At present, a low cost, unattended, convenient OSA
screening technique is an urgent requirement. Snoring is almost always
associated with OSA and is one of the earliest nocturnal symptoms. With
the onset of sleep, the upper airway undergoes both functional and structural
changes, leading to spatially and temporally distributed sites conducive to
snore sound (SS) generation. The goal of this paper is to investigate the
possibility of developing a snore based multi-feature class OSA screening
tool by integrating snore features that capture functional, structural, and
spatio-temporal dependences of SS. In this paper, we focused our attention
on the features in voiced parts of a snore, where quasi-repetitive packets of
energy are visible. Individual snore feature classes were then optimized using
logistic regression for optimum OSA diagnostic performance. Subsequently,
all feature classes were integrated and optimized to obtain optimum OSA
classification sensitivity and specificity. We also augmented snore features with
neck circumference, which is a one-time measurement readily available at no
extra cost. The performance of the proposed method was evaluated using snore
recordings from 86 subjects (51 males and 35 females). Data from each subject
consisted of 6–8 h long sound recordings, made concurrently with routine PSG
in a clinical sleep laboratory. Clinical diagnosis supported by standard PSG was
used as the reference diagnosis to compare our results against. Our proposed
techniques resulted in a sensitivity of 93 ± 9% with specificity 93 ± 9% for
females and sensitivity of 92 ± 6% with specificity 93 ± 7% for males at an
AHI decision threshold of 15 events/h. These results indicate that our method
holds the potential as a tool for population screening of OSA in an unattended
environment.

0967-3334/13/020099+23$33.00 © 2013 Institute of Physics and Engineering in Medicine Printed in the UK & the USA

Keywords: obstructive sleep apnea, community screening, snoring


(Some figures may appear in colour only in the online journal)

1. Introduction

Obstructive Sleep Apnea (OSA) syndrome is characterized by the recurring full or partial
occlusion of the upper airway (UA) during sleep. Full closure of the UA is defined
as obstructive apnea and partial closure is termed hypopnea. Intense snoring and excessive
daytime sleepiness are commonly observed among individuals suspected of OSA.
OSA is a highly prevalent disease leading to a range of downstream complications such
as increased risk of ischaemic heart disease (Baldwin et al 2005), stroke, type II diabetes,
neurocognitive dysfunction, and increased vulnerability to accidents. If diagnosed early, its
devastating effects can be thwarted. However, 93% of women and 82% of men suffering from OSA
remain undiagnosed (Young et al 1997).
With the onset of sleep, the UA encounters a number of physiological changes including
the relaxation of dilator muscle activity, associated mechanics, and the increased latency of
the reflex muscle activity of the UA (Indu and David 2003). The UA becomes unstable due to
altered arrangements of muscle functions. This instability and impaired muscle tone narrow
the UA, causing soft tissues to vibrate. All these developments may affect the subtle balance
between dilating and collapsing forces of the UA that determines the UA patency (Baldwin
et al 2001, Bar et al 2003, Malhotra and White 2002).
OSA, by definition, is closely related to the UA patency and snoring is the earliest
cardinal sign of OSA. OSA, UA patency and snoring are delicately coupled, interrelated and
inseparable from one another. SS generation is an integrated acoustic process of the UA,
and it could happen in any membranous region of the UA. As such, alterations of the UA
should be embedded in the SS. SS have a complex shape and composition. The UA muscle
activity, anatomy, degree of the constrictions, sites of snoring and gender are prime factors
that influence snore properties.
In the past, many researchers have attempted to derive snore features to characterize
different aspects of OSA. This approach is advantageous, as snoring is a direct manifestation of
the UA properties that occur during OSA events, unlike other methods that have to circumvent
instrumentation problems and uncertainties associated with surrogate measures. Cavusoglu
et al (2008) investigated changes of snoring behavior in the time domain by analyzing inter-snore
episode association to characterize the OSA condition. Fiz et al (1996) explored the changes
of SS in the frequency domain. Time and frequency properties of SS were investigated by
Soler et al (2007). Although these methods affirmed that SS carries information about OSA, they
gave mixed results in diagnosing the disease.
In our group’s previous work, Lee et al (2000) explored the phenomenon of snore
phase-coupling to diagnose apnea. Abeyratne et al (2005) proposed capturing the UA vibration
properties by characterizing the periodicity of SS through the estimation of pitch-derived
parameters, and reported a sensitivity (specificity) of 91% (67%) in diagnosing
OSA at an AHI decision threshold of 10. Lee et al (2000), Ng et al (2008) and de Silva
et al (2011) explored the use of formant frequencies as a feature to characterize the upper airway
in OSA. de Silva et al (2011) showed that recurrence-based features are efficient in OSA
diagnosis. Abeyratne et al (2007) suggested capturing SS origin information, which is
spatially and temporally distributed, using higher order statistics (HOS) based techniques.
Commonly used second order statistics based techniques may be insufficient to adequately
describe it. Karunajeewa et al (2011) showed the strong applicability of HOS to the diagnosis
of OSA in spite of the associated computational complexities. Inspired by the success of the
HOS work, Ghaemmaghami et al (2009) proposed a simpler feature to characterize the upper
airway by quantifying the deviation of snores from a Gaussian distribution. Karunajeewa
et al (2011) proposed capturing auditory properties of SS to diagnose OSA using Mel
frequency cepstral coefficients (MFCC), motivated by their performance in speech recognition
applications. Emoto et al (2011) investigated the information content in higher-frequency
regions of the snore spectrum. Our group initially focused on the problem of developing
individual snore features to diagnose OSA based either on physiological observations of the
UA or on biomechanical considerations.
Individual features developed over many years led to the conclusion that snore sound
analysis carries vast potential in diagnosing sleep apnea without using sensors requiring
physical contact with patients. The prime aim of this paper is to develop a new technique by
integrating snore derived features that represent different perspectives of the OSA process.
In this paper, snore features corresponding to several feature classes, namely pitch,
recurrence, formant, HOS, non-Gaussianity and MFCC, were derived, focusing on the
voiced parts of a snore. Features were optimized to obtain maximum OSA diagnostic
sensitivity/specificity in individual feature classes using logistic regression techniques. Next,
snore feature classes were integrated and the feature set was optimized to obtain maximum OSA
diagnostic sensitivity/specificity performance. We used receiver operating characteristics
(ROC) curves to find optimum threshold values and all the derived models were
cross-validated. These techniques were applied to male and female data sets separately. We also
investigated the impact of neck circumference on our model performance. The use of our
intensity independent objective definition (Abeyratne et al 2005) for snoring enabled us to
fully automate the technology. Our methods do not require the presence of sleep technologists
and use simple non-contact snore acquisition instrumentation. The proposed methods,
unlike standard PSG, are also free from time consuming, subjective manual analysis of a
massive amount of data.

2. Method

An overall summary of the methods used in this paper is presented in figure 1. We explain
the OSA screening technique developed by us in sections 2.1–2.5. Snore related sounds
(SRS) were segmented and voiced snoring segments (VSS) (refer to the appendix for VSS
definition) were identified. The use of voiced snoring segments is one of the novelties of this
paper, compared to existing work. OSA diagnostic snore parameters were calculated for all
feature classes. Optimum snore features were selected using a logistic regression technique
for individual feature classes as well as for integrated feature class scenarios for optimum
OSA classification performance. Selected snore features were then applied to develop the
classification models. These logistic regression based models were applied to male, female and
male/female combined subjects separately for the work of this paper. Sensitivity/specificity
values were calculated for different apnea hypopnea index (AHI) thresholds for OSA/non-OSA
classification. Subsequently, neck circumference (NC) was added to the optimum snore features
based model to investigate its impact on model performance. Our results were cross validated
by repeating the same process with randomly selected training and testing sets in all scenarios.

[Figure 1 block diagram: snore related sounds (SRS) -> identification of voiced snoring segments (VSS) -> OSA diagnostic feature estimation (pitch/recurrence/formant/HoS/non-Gaussian/MFCC) -> logistic regression (LR) based feature optimization for individual feature classes and for integrated feature classes -> LR modelling based OSA/non-OSA classification.]

Figure 1. Block diagram showing the overall summary of the method proposed in this paper to
classify subjects into OSA/non-OSA. This process consists of two main parts with the first part
covering the identification of snore derived features in individual feature classes as well as in
integrated feature classes, to obtain optimum sensitivity/specificity performance in OSA/non-OSA
classification. The second part involves adaptation of logistic regression modeling technique for
OSA/non-OSA grouping.

2.1. Snore related sound data acquisition


SRS were acquired with a high fidelity, CD-quality computerized data acquisition system. Two
matched low noise microphones having a hypercardioid beam pattern (Model NT3, RODE®,
Sydney, Australia) were used for recording. They were placed 50 cm away from each other
behind the patient’s head and equal gain was set. The nominal distance from the microphone to
the mouth of the patient was 50 cm, but could vary between 40 cm and 70 cm due to patient
movements. Although our recording setup consisted of two microphones, we discovered that a
single channel of data was sufficient for the work of this paper. An A/D converter unit (Model
Mobile-Pre USB, M-Audio®, CA, USA) and a low-end, professional quality pre-amplifier
were used for SRS acquisition, with the amplification, filtering and A/D conversion done
within the M-Audio system. The SRS was digitized at a sampling rate of 44.1 k samples/s to
obtain the best sound quality. However, the proposed method did not rely on the sound
intensity and the results were independent of the mouth-to-microphone distance.

2.2. Subject database and routine PSG data acquisition


Table 1 summarizes the details of our subjects who were referred to the sleep laboratory for the
diagnosis of suspected OSA. All of them were naive to oral surgery, uvulopalatopharyngoplasty
surgery and CPAP. A typical SRS recorded in a sleep laboratory is approximately five to 10 h
in duration and may sometimes contain more than five thousand snore episodes (SE).

Table 1. Subject variable comparison between female/male/female and male combined groups
for AHI decision threshold = 15, 30.

Group          Subset     No of subjects   AHI           NC             Age           BMI
0 ≤ AHI < 15   Female     19               6.2 ± 4.8     32.95 ± 3.65   49.7 ± 9.9    27.02 ± 8.94
               Male       12               8.7 ± 4.0     42.04 ± 5.21   47.3 ± 12.3   33.1 ± 6.74
               Combined   31               7.8 ± 4.6     39.66 ± 4.26   46.5 ± 10.7   31.99 ± 8.09
AHI > 15       Female     16               41.7 ± 27.3   39.58 ± 4.74   52.2 ± 13.5   35.7 ± 10.80
               Male       39               45.5 ± 24.8   44.86 ± 3.46   53.5 ± 13.8   33.08 ± 5.34
               Combined   55               44.4 ± 25.4   43.32 ± 4.53   53.1 ± 13.6   33.84 ± 7.34
0 ≤ AHI < 30   Female     22               9.7 ± 6.7     37.5 ± 3.68    48.4 ± 11.7   29.96 ± 8.5
               Male       23               14.1 ± 7      42.97 ± 4.92   48.5 ± 13.1   31.9 ± 6.26
               Combined   45               11.9 ± 7.1    40.18 ± 4.67   48.5 ± 12.3   30.91 ± 7.47
AHI > 30       Female     13               51.8 ± 27.4   41.66 ± 3.92   49.6 ± 12.8   40.62 ± 9.17
               Male       28               55.5 ± 22.0   45.20 ± 3.45   55 ± 13.6     34.06 ± 4.96
               Combined   41               54.4 ± 23.5   44.20 ± 3.88   53.5 ± 13.4   35.91 ± 6.96

AHI: apnea-hypopnea index; NC: neck circumference; BMI: body mass index.

For the work of this paper, we gathered PSG data from patients who were subjected
to routine PSG testing at Princess Alexandra Hospital, which recorded data using clinical
PSG equipment (Model Siesta, Compumedics®, Sydney, Australia). We obtained ethical
clearance approvals from Princess Alexandra Hospital, Brisbane, and from the University of
Queensland, Australia to use the collected data for research purposes. A sleep technician,
having over 9700 h of experience, prepared the patient, placed electrodes, and set up the
instrument. The recording montage included: EEG electrodes, which were placed at the
C3/A2 position and its backup derivation C4/A1, left and right EOG, chin EMG, nasal
pressure, oxygen saturation via finger pulse oximetry (Nonin Xpod), thoracic and abdominal
movement (inductance plethysmography), left and right leg movement (piezoelectric sensors),
ECG, body position and sound microphones. PSG analysis was accomplished according to
the following guidelines. Sleep staging was done following Rechtschaffen and Kales (1968)
criteria. The recommendations of ASDA (1992) were adhered to in EEG arousals scoring.
Flemons et al (1999) recommendations were followed in event definition and measurement
techniques. The recommendations of Atlas Task Force of ASDA (1993) were adopted for the
recording and scoring of leg movements. Apart from PSG data, an expert-edited annotation
file with an event-by-event description of features such as apneas and sleep stages as well as
simultaneously recorded SRS were collected.

2.3. Snore sound segmentation and VSS identification

We used a pattern recognition algorithm developed by our group (Karunajeewa et al 2008,
de Silva et al 2011) to classify SS into snore, breathing and silence. Figure 2 shows a typical
snore segment that describes VSS. SEs were scrutinized to identify VSS based on the presence
of pitch information (refer to the appendix for SE and VSS definitions). We applied a simple
time domain criterion to discard unwanted background sounds, such as bed sounds, duvet
sounds and speech sounds, from further analysis; such sounds can be found even in a
controlled hospital environment.

[Figure 2 plots: (a) waveform of a snore episode, magnitude versus time (s), with inspiration and expiration marked; (b) enlarged portion of the waveform, magnitude versus time (s), with silence, unvoiced snore segment and voiced snore segment marked.]

Figure 2. (a) Snore episode as defined through (I)–(III) in the appendix. (b) An enlarged portion
within the snore episode, showing the voiced snore segment, unvoiced snore segment and
silence segments.

2.4. OSA/non-OSA classification features


We identify that the mechanism of snoring is analogous to the generation of human speech.
Snoring is a consequence of the changes in the configuration and properties of the UA, and it
carries vital information about the dynamic state of the UA. The UA acts as an acoustic filter
during snore generation in a similar way to the vocal tract in the case of speech. The
human snore generation process can be explained in terms of a source vibration model while
speech generation can be explained in terms of a source/vocal-tract model (Abeyratne et al
2005). Inspired by these analogies, we customized speech processing techniques for SS
analysis. The literature has shown the plausibility of adapting speech processing
techniques to snore based OSA diagnosis (Fiz et al 1996, Abeyratne et al 2001, Ng et al
2008). Incorporating an objective snore definition, Abeyratne et al (2005) introduced a
fully automated overnight SS analysis technique using digital speech processing techniques. In
this paper, we derived OSA diagnostic features based on the snore generation process and also
on ideas borrowed from speech analysis. In sections 2.4.1–2.4.6, we describe the OSA diagnostic
features in detail.

2.4.1. Feature class I: pitch based feature estimation. Let the ith voiced snore segment (VSSi)
consist of Li data blocks, each of length D, so that the length of VSSi is D × Li. By
definition (see the appendix), VSSi will have blocks having pitch period values {μij} < D
{where μij : j = 1, 2, . . . , Li}. We term μij the intra-snore pitch series of VSSi. We calculated
the mean (mμi), standard deviation (sdμi), skewness (skμi) and kurtosis (kuμi) values for all VSS.
Subsequently, we estimated the mean and standard deviation of each of the above variables
({memμ, stmμ}, {mesdμ, stsdμ}, {meskμ, stskμ} and {mekuμ, stkuμ}, respectively). For the work of
this paper, we augmented the above features with the group pitch variation probability (GPVP)
feature borrowed from our previous work (de Silva et al 2011). GPVP captures and quantifies
pitch variation while discounting the effects introduced by VSS length variation. Refer to de
Silva et al (2011) for further details on this feature estimation procedure. For the remainder of
the paper, we refer to these features as the ‘pitch feature class’.
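
The two-level summary described above (four statistics per VSS, then the mean and standard deviation of each statistic across VSS) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary keys and the use of population (biased) moment estimators are hypothetical choices, since the text does not specify the estimators, and GPVP is not included.

```python
import math
import statistics


def moments(values):
    """Return mean, standard deviation, skewness and kurtosis of a sequence."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population (biased) central moments: an assumption, the paper does not
    # state which estimator was used.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = math.sqrt(m2)
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurt = m4 / m2 ** 2 if m2 > 0 else 0.0
    return mean, sd, skew, kurt


def pitch_feature_class(pitch_series_per_vss):
    """Summarise the intra-snore pitch series of every VSS into eight features.

    pitch_series_per_vss: one list of pitch periods (mu_ij) per VSS.
    Keys follow the paper's naming loosely (mem_mu ~ me m-mu, stm_mu ~ st m-mu, ...).
    """
    per_vss = [moments(series) for series in pitch_series_per_vss]
    features = {}
    for idx, name in enumerate(["m", "sd", "sk", "ku"]):
        column = [row[idx] for row in per_vss]
        features["me" + name + "_mu"] = statistics.fmean(column)
        features["st" + name + "_mu"] = statistics.pstdev(column)
    return features
```

For example, `pitch_feature_class([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])` returns eight values, one mean and one standard deviation for each of the four per-VSS statistics.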

2.4.2. Feature class II: recurrence based feature estimation. The extent of the deterministic
structure present in the signal can be quantified using a normalized recurrence time probability
density entropy, which is commonly used in speech disorder analysis (Max et al 2007).
Considering the similarities in speech/snore production, we extended these techniques to
OSA detection using SS. We calculated the mean normalized recurrence time probability density
entropy (mR) for all VSS. Subsequently, we estimated the mean (meR), standard deviation
(stR), skewness (skR) and kurtosis (kuR) of mR. We also added the quantified recurrence
probability density entropy feature developed in our previous work (refer to de Silva et al (2011)
for feature estimation details) to the recurrence based feature vector. In the rest of the paper,
we refer to the above feature vector as the recurrence feature class.

2.4.3. Feature class III: formant based feature estimation. We recognize that the acoustic
properties of SS, which depend on the physical dimensions of the UA when it is constricted
during apnea events, may reveal the acoustic structure of the UA. The first formant frequency
(F1) is a resonance frequency of the UA. We calculated the mean (mF1) and standard deviation
(stF1) of the first formant frequency for all VSS (refer to de Silva et al (2011) for details of
the formant computation method). Next, we computed the mean (mmF1), standard deviation (smF1),
skewness (skmF1) and kurtosis (kmF1) of mF1, and the skewness (sksF1) and kurtosis (ksF1) of stF1.
We also estimated the ratio of VSS with values below 400 Hz (b400), between 400 and 800 Hz
(bet4800) and above 800 Hz (a800) to total VSS. These formant based features are referred
to as the formant feature class for the work in this paper.
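
As an illustration of the three band-ratio features, the sketch below counts the fraction of VSS falling in each band. It assumes that the "values" compared against 400 Hz and 800 Hz are the per-VSS mean first formant frequencies (mF1), and that values exactly at 400 Hz or 800 Hz belong to the middle band; both are assumptions, and the function name is hypothetical.

```python
def formant_band_ratios(mean_f1_per_vss):
    """Fraction of VSS with mean F1 below 400 Hz, in 400-800 Hz, and above 800 Hz."""
    total = len(mean_f1_per_vss)
    if total == 0:
        raise ValueError("at least one VSS is required")
    below = sum(1 for f in mean_f1_per_vss if f < 400.0)
    above = sum(1 for f in mean_f1_per_vss if f > 800.0)
    # Boundary values (exactly 400 Hz or 800 Hz) are counted in the middle band;
    # this tie-breaking rule is an assumption.
    between = total - below - above
    return {
        "b400": below / total,
        "bet4800": between / total,
        "a800": above / total,
    }
```

By construction the three ratios sum to one for each subject.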

2.4.4. Feature class IV: higher order statistics based feature estimation. Inspired by the
source/vocal tract model for human speech synthesis, we presume that SS can be modeled as
a convolution of source signals that represent the acoustical energy and the total airway response
(TAR), which captures the acoustical signature of the UA. In quantifying TAR, we estimated
the bispectrum {Ci(f1, f2)} of TAR for all VSSi. In these feature calculations, we used the
property that, due to the symmetry of the bispectrum in the (f1, f2) plane, the non-redundant
region is confined to the principal domain triangle {PTi(f1, f2)}. Therefore, using the principal
domain triangle, we estimated the diagonal slice Di(fn) {n = 1, . . . , N, where N is the number of
frequency bins in Di(fn)}. Di(fn) is a one-dimensional distribution of amplitude as a function of
frequency representing the spectral composition of TAR (Abeyratne 1999). We computed the
center frequency (Fci), standard deviation of frequency (Fdi), symmetry coefficient (Fsi), ratio
of total band amplitude for a given frequency band (Rp500, Rp800, Rp1000), ratio of total band
amplitude for a given frequency band to the total amplitude outside that band (Rp501, Rp801,
Rp1001), and the mean and variance of TAR using Di(fn). In previous studies, Perez-Padilla et al
(1993) mentioned that OSA patients had a spectral peak above 800 Hz and Fiz et al (1996) reported
that benign snorers had spectral peaks below 500 Hz and OSA patients had peaks below 1000 Hz.
Considering these findings, we selected 500, 800 and 1000 Hz as our frequency bands for analysis
(refer to Karunajeewa et al (2011) for detailed information on the feature development and the
methodology adopted). These 12 features are referred to as the HoS feature class in the rest of
the paper.

2.4.5. Feature class V: non-Gaussianity based feature estimation. VSS were pre-emphasized
to compensate for the roll-off in the spectrum, as is common in speech analysis. The pre-processed
VSS were used to derive a new measure computed segment by segment (segment length D), called the
non-Gaussianity score (NGS), which is a quantitative measure of the deviation from Gaussianity
of a segment. The NGS was estimated using a method centered on the normal probability plot,
which is a plot of the midpoint positions of a given data segment versus the theoretical quantiles
of a normal distribution (Shama et al 2007). If the distribution of the data under consideration
is normal, the plot will be linear. Other distributions will lead to plots that deviate from
linearity depending on the nature of the actual distribution. The normal probability plot is
often used as a powerful qualitative tool in visualizing the ‘Gaussianity’ of a given set of
data. We computed the mean (gm), standard deviation (gsd) and skewness (gsk) of the NGS for
each VSS. We derived three OSA diagnostic parameters by estimating the mean of gm, gsd and
gsk. A further six parameters were derived by estimating the ratio of the total number of VSS to
(1) the number of VSS having gm > mean(gm), (2) the number of VSS having gm values in
between mean(gm) ± standard deviation(gm), (3) the number of VSS having gsd > mean(gsd),
(4) the number of VSS having gsd values in between mean(gsd) ± standard deviation(gsd),
(5) the number of VSS having gsk > mean(gsk), (6) the number of VSS having gsk values
in between mean(gsk) ± standard deviation(gsk). We refer to these features as the non-Gaussian
feature class for the remainder of this paper. Refer to Ghaemmaghami et al (2009) for NGS
parameter estimation details.
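
The idea of scoring how far a normal probability plot departs from a straight line can be illustrated with the sketch below. It is not the NGS formula of Ghaemmaghami et al (2009), which is not reproduced in this paper; it is a stand-in that assumes midpoint plotting positions (i + 0.5)/n and uses 1 − r², where r is the correlation between the sorted samples and the theoretical normal quantiles. The function name is hypothetical.

```python
from statistics import NormalDist


def normal_plot_deviation(segment):
    """Return 1 - r^2 of the normal probability plot (0 means perfectly linear)."""
    n = len(segment)
    if n < 3:
        raise ValueError("need at least three samples")
    ordered = sorted(segment)
    std_normal = NormalDist()
    # Theoretical standard-normal quantiles at midpoint plotting positions.
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_x = sum(ordered) / n
    mean_q = sum(quantiles) / n
    sxx = sum((x - mean_x) ** 2 for x in ordered)
    sqq = sum((q - mean_q) ** 2 for q in quantiles)
    if sxx == 0.0:
        # Constant segment: treated as showing no measurable deviation (assumption).
        return 0.0
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(ordered, quantiles))
    r_squared = (sxq * sxq) / (sxx * sqq)
    return 1.0 - r_squared
```

A segment whose samples follow a normal distribution gives a score near zero, while heavy-tailed or skewed segments give larger scores. Per-VSS statistics such as gm, gsd and gsk would then be computed from the scores of the blocks within each VSS.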

2.4.6. Feature class VI: Mel frequency cepstral coefficients (MFCC). Motivated by the
effectiveness of MFCC in automatic speech recognition systems and following our previous
exploration (Karunajeewa et al 2011), we adopted MFCC parameters for the analysis of voiced
components of snores. MFCC encode the sound in a way that mimics the function and capabilities of
the human ear (Davis and Mermelstein 1980). Each VSS was split into blocks of 30 ms.
Thirteen Mel cepstral coefficients were estimated using triangular Mel filter banks.
We calculated the mean MFCC for each VSS and then estimated the overall mean MFCC
(mMFCCi, i = 1, . . . , 13) values using all VSS.

2.4.7. Neck circumference as an important risk factor. Obesity is one of the few controllable
risk factors associated with OSA. Pharyngeal airway size reduction with increasing weight
may raise the propensity for OSA. Davies et al (1992) and Mortimore et al (1998) have argued
that NC indirectly reveals the geometrical properties of the UA. NC is a common clinical
variable that is measured only once, before the subject goes to bed for the PSG test.

2.5. OSA/non-OSA classification model


A classification model that is capable of discriminating between benign and apneic snorers using
snore parameters is now required. Snore parameters consist of a pool of independent variables
and one dependent variable with categorical outcomes, namely OSA and non-OSA. Logistic
regression (LR) is a multi-parametric method devised for dichotomous outcomes and frequently
used in the medical literature (Epstein et al 2002, Steven et al 2001, Timmerman et al 2005).
LR analysis demands no assumptions about the distribution of the independent variables,
unlike other methods used in health sciences such as linear discriminant analysis. It is also not
feasible to check the adherence to assumptions of independent variables continually. Hence,
it is assumed that LR is a more flexible and robust method in the case of violations of these
assumptions. The relative ease of interpretation of the results also makes logistic regression
one of the most appropriate techniques for OSA/non-OSA classification.
For the work of this paper, the dependent variable Z is assumed to be equal to ‘zero’
(Z = 0) for non-OSA subjects and ‘one’ (Z = 1) for OSA subjects. A model is derived using
logistic regression to estimate the probability of the outcome Z = 1 for a set of n predictor
independent variables as follows:

\[
P_n(Z = 1 \mid x_1, x_2, \ldots, x_n) = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)},
\]

where β m (m = 0, 1, 2, . . . , n) are the model parameters estimated by the maximum likelihood
method.
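
The logistic model above can be evaluated numerically as in the following sketch, which takes an already estimated coefficient vector and one subject's feature values. The function name and argument layout are hypothetical; the coefficients themselves would come from the maximum likelihood fit, which is not shown here.

```python
import math


def osa_probability(betas, features):
    """P(Z = 1 | x_1..x_n) for betas = [beta_0, ..., beta_n] and features = [x_1, ..., x_n]."""
    if len(betas) != len(features) + 1:
        raise ValueError("expected one intercept plus one coefficient per feature")
    linear = betas[0] + sum(b * x for b, x in zip(betas[1:], features))
    # Numerically stable evaluation of exp(t) / (1 + exp(t)).
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)
```

A linear predictor of zero gives a probability of 0.5, and the probability increases monotonically with the linear predictor.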

2.5.1. Model parameter estimation. The subject database was divided into two parts termed
as training and testing data sets. 70% of the subjects were randomly chosen for the training
set in such a way as to achieve an approximately uniform distribution. The remaining subjects
were then utilized for testing. We termed one such realization of training and testing data set
as a classification data set. By repeating the above procedure, we developed 50 classification
data sets and an arbitrary kth set was denoted by Ck, k = 1, 2, . . . , 50. It must be noted that
training and testing sets were independent and mutually exclusive from one another.
The training data set was used to derive the model parameters (β m: m = 0, . . . , n)
and the final model was determined by adopting a stepwise approach using the MATLAB
statistical toolbox version 7.5. In the first step, we considered all available independent
variables to derive model parameters. For the second step, we used the likelihood statistic to
find the important predictor variables for the final model. The final model includes only the
optimum independent variables that facilitate the classification. The resulting model parameters
were used to estimate the probability (Pn) for each subject, who was then classified as belonging
to either of the two classes using a probability threshold Pthre.
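
The repeated random partitioning into training and testing sets can be sketched as below. The paper does not say how the "approximately uniform distribution" was achieved; this sketch assumes one possible reading, namely that the 70/30 split is made separately within the OSA and non-OSA groups. The function name, the fixed seed and the rounding rule are hypothetical.

```python
import random


def make_classification_sets(labels, n_sets=50, train_fraction=0.7, seed=0):
    """Return n_sets pairs (train_indices, test_indices), split within each class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    sets = []
    for _ in range(n_sets):
        train, test = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Number of training subjects taken from this class (rounding is an assumption).
            cut = round(train_fraction * len(shuffled))
            train.extend(shuffled[:cut])
            test.extend(shuffled[cut:])
        sets.append((sorted(train), sorted(test)))
    return sets
```

Within each returned pair the training and testing indices are disjoint and together cover all subjects; different pairs are independent random draws and generally overlap with one another.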

2.5.2. OSA/non-OSA classification model performance evaluation. To assess the classification
of an unknown subject as OSA/non-OSA, the ‘true clinical diagnosis’ of the patient is required.
We considered the clinical diagnosis to be positive for OSA at a particular decision threshold
if the PSG derived AHI > AHI threshold (AHI threshold = 15, 30); otherwise the diagnosis
was regarded as negative. For the work of this paper, the clinical diagnosis obtained using
PSG was considered the absolute truth. The LR analysis based
classification (OSA or non-OSA) of the subject was compared to the ‘absolute truth’, and
the decision was recorded as one of (i) true positive, (ii) true negative,
(iii) false positive or (iv) false negative.
Sensitivity/specificity values were calculated for training and testing data sets for different
AHI decision thresholds. We optimized Pthre value to get the optimum sensitivity and specificity
values. We found the optimal Pthre value by capitalizing on the most widely used Receiver
Operating Characteristics (ROC) curve techniques (Dwyer 1996 and Metz 1978). ROC curves
were plotted by computing the sensitivity and specificity values by changing Pthre values
in steps of 0.03. To provide further insights on how the model would perform in a clinical
application we computed the positive predictive value (PPV) and the negative predictive value
(NPV).
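
The evaluation described above can be sketched as follows: the confusion counts give sensitivity, specificity, PPV and NPV for one value of Pthre, and the threshold is swept in steps of 0.03 to trace the ROC curve. The rule used here to pick the "optimum" threshold (maximising sensitivity + specificity) is an assumption, as the paper does not state its criterion; all function names are hypothetical.

```python
def diagnostic_metrics(truth, probabilities, p_thre):
    """Sensitivity, specificity, PPV and NPV when P >= p_thre is called OSA (Z = 1)."""
    tp = fp = tn = fn = 0
    for z, p in zip(truth, probabilities):
        predicted = 1 if p >= p_thre else 0
        if predicted == 1 and z == 1:
            tp += 1
        elif predicted == 1 and z == 0:
            fp += 1
        elif predicted == 0 and z == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def sweep_thresholds(truth, probabilities, step=0.03):
    """Evaluate the metrics for P_thre = 0, step, 2*step, ... (not exceeding 1)."""
    n_steps = int(1.0 / step)
    return [(k * step, diagnostic_metrics(truth, probabilities, k * step))
            for k in range(n_steps + 1)]


def best_threshold(truth, probabilities, step=0.03):
    """Pick the first threshold maximising sensitivity + specificity (assumed criterion)."""
    if 0 not in truth or 1 not in truth:
        raise ValueError("both OSA and non-OSA subjects are required")
    results = sweep_thresholds(truth, probabilities, step)
    best = max(results, key=lambda item: item[1]["sensitivity"] + item[1]["specificity"])
    return best[0]
```

Plotting sensitivity against 1 − specificity over the swept thresholds gives the ROC curve for one classification data set.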

2.5.3. Cross validation of results. We repeated the process explained in section 2.5.1
and generated fifty mutually exclusive independent classification data sets for all the
combinations based on snore features, gender and different AHI decision thresholds. All
classification data sets (Ck, k = 1, 2, . . . , 50) were analyzed independently and corresponding
sensitivity/specificity values calculated. We further cross validated our results by effectively
applying our dataset.

Table 2. Female HoS feature class LR model parameter optimization (for AHI decision threshold
= 15).

         All snore features based model          Selected snore features based model
         β           SE        p value           β          SE        p value
Con      −36.983     13.739    0.007             −10.271    6.718     0.126
F1       −0.021      0.006     0.001             −0.011     0.004     0.007
F2       −0.001      0.009     0.974
F3       0.029       0.010     0.005             0.026      0.011     0.018
F4       22.596      11.002    0.040
F5       −100.174    28.716    0.001             −74.163    27.173    0.006
F6       −19.198     8.983     0.033
F7       0.004       0.001     0.006             0.001      0.001     0.098
F8       −0.048      0.014     0.001             −0.016     0.006     0.011
F9       0.029       0.011     0.006             −0.011     0.007     0.012
F10      0.955       0.669     0.153
F11      18.922      8.678     0.029             20.377     9.504     0.032
F12      1.746       0.635     0.006             1.344      0.573     0.019

HoS: higher order statistics, β: coefficients, SE: standard error, Con: constant or intercept, p-value: indicates the
overall contribution of corresponding variable to model definition, F1–F12: HoS based features.

3. Results and discussion

In this section, we show the results obtained for diagnosing OSA for individual feature
classes as well as for integrated feature classes using the LR based technique. We used
a two-step approach for LR based model development. In the first step, the training data
sets of independent snore parameters were applied to the logistic regression algorithm, which
identified the parameters that gave the best OSA diagnostic performance. In the
second step, the chosen parameters were applied to the training data set to derive the LR
model parameters. Thereafter, the LR model was applied to the mutually exclusive testing
data set to obtain OSA diagnosis sensitivity and specificity.
It should be noted that in this paper we compute features from voiced snore segments
(VSS), which is a novel approach compared to previous work.
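As an illustration of how a fitted LR model is applied to a test subject, the sketch below uses the standard logistic regression form, P = 1 / (1 + exp(−(β0 + Σ βi Fi))). The 0.5 decision cutoff, the function names, and the toy coefficient and feature values are assumptions made for this example; they are not taken from the paper.

```python
import math


def lr_probability(intercept, coefficients, features):
    """Probability from a logistic regression model, computed in a numerically stable way."""
    z = intercept + sum(beta * features[name] for name, beta in coefficients.items())
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(probability, cutoff=0.5):
    """Map a probability to a class label; the cutoff is an assumed value."""
    return "OSA" if probability >= cutoff else "non-OSA"


# Toy model with hypothetical coefficients and feature values.
p = lr_probability(-1.0, {"F1": 2.0, "F3": 0.5}, {"F1": 1.0, "F3": 2.0})
label = classify(p)
```

In practice the cutoff would be chosen from the ROC curve of the training data rather than fixed at 0.5.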

3.1. Optimum parameter selection


Table 2 shows LR model parameters obtained for the female training data set in HoS feature
class at AHI decision threshold = 15. The left hand side of table 2 shows the LR model
parameters derived for all snore features. The independent snore features that facilitate
the optimum OSA/non-OSA classification were selected by analyzing the corresponding
p value of the snore feature. The right hand side of table 2 shows the model parameter
values derived when the optimum feature set is used. The model based on all features resulted in a
sensitivity (specificity) of 72% (85%), while the model based on the selected optimum features led
to an improvement of 9% (0.1%) in sensitivity (specificity), thus showing the better overall fit
of the optimum feature based model.
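The p value based selection can be illustrated with the values on the left hand side of table 2. The text does not state the cutoff that was used; the value 0.03 below is an assumption, chosen only because it reproduces the eight features that appear on the right hand side of table 2. The function name `select_features` is hypothetical.

```python
# p values of the all-features model (left hand side of table 2).
P_VALUES_ALL = {
    "F1": 0.001, "F2": 0.974, "F3": 0.005, "F4": 0.040,
    "F5": 0.001, "F6": 0.033, "F7": 0.006, "F8": 0.001,
    "F9": 0.006, "F10": 0.153, "F11": 0.029, "F12": 0.006,
}


def select_features(p_values, cutoff):
    """Keep the features whose p value is strictly below the cutoff."""
    return [name for name, p in p_values.items() if p < cutoff]


# Assumed cutoff; not stated in the paper.
selected = select_features(P_VALUES_ALL, cutoff=0.03)
```

With the more conventional cutoff of 0.05, F4 and F6 would also be retained, so the selection is sensitive to this choice.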

3.2. Female OSA diagnosis


Snore features were estimated for the training and testing data sets from all feature classes. The
process described in section 3.1 was followed for all feature classes independently. Figure 3
shows the receiver operating characteristics curves (ROC) drawn for different snore feature

Figure 3. Receiver operating characteristic curves drawn for a classification training data set
of female subjects for individual feature classes in OSA/non-OSA separation at AHI decision
threshold = 15 (HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients).

Table 3. Female OSA detection performance for 50 testing data sets when considered for individual
snore feature classes (for AHI decision threshold = 15).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.78 ± 0.03 80 ± 14.56 66 ± 9.25 0.8 ± 0.14 0.66 ± 0.09
Recurrence 0.70 ± 0.04 61.2 ± 4.79 80.4 ± 2.82 0.61 ± 0.04 0.80 ± 0.02
Formant 0.75 ± 0.07 80 ± 0.0 80 ± 0.0 0.8 ± 0.00 0.8 ± 0.00
HoS 0.79 ± 0.03 81.6 ± 5.48 68.4 ± 9.97 0.81 ± 0.05 0.68 ± 0.09
Non-Gaussian 0.62 ± 0.03 69.6 ± 13.54 60 ± 0.0 0.69 ± 0.13 0.6 ± 0.00
MFCC 0.79 ± 0.08 71.2 ± 10.02 72 ± 9.89 0.71 ± 0.10 0.72 ± 0.09
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

classes for training data sets. Table 3 summarises the results derived for individual feature
classes at AHI decision threshold = 15 for the testing data set by applying the model parameters
derived from the training data set.
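The AUC values reported in the tables summarise each ROC curve as a single number. The paper does not say how the area was computed; the sketch below uses the standard rank-based equivalence, in which the AUC equals the probability that a randomly chosen OSA subject receives a higher model score than a randomly chosen non-OSA subject, with ties counted as one half. The function name and the example scores are hypothetical.

```python
def auc(scores_osa, scores_non_osa):
    """Area under the ROC curve via pairwise comparison of model scores."""
    wins = 0.0
    for s_pos in scores_osa:
        for s_neg in scores_non_osa:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_osa) * len(scores_non_osa))


# Hypothetical LR output probabilities for two OSA and two non-OSA subjects.
example_auc = auc([0.9, 0.4], [0.6, 0.1])
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation.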
In case of female subjects at AHI decision threshold = 15, individual snore feature class
based methods resulted in mean OSA detection sensitivities in the range 61–81% while holding
the specificities from 60% to 80%.
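For reference, the four performance measures reported in the tables follow from the confusion counts of a testing data set by their standard definitions. The sketch below is illustrative only; the counts are hypothetical and the function name is an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard definitions; sensitivity and specificity in percent, PPV and NPV as fractions."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # OSA subjects correctly detected
        "specificity": 100.0 * tn / (tn + fp),  # non-OSA subjects correctly rejected
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
    }


# Hypothetical testing data set: 5 OSA and 5 non-OSA subjects.
m = diagnostic_metrics(tp=4, fn=1, tn=4, fp=1)
```

The function assumes every denominator is non-zero, that is, the testing data set contains both classes and the model predicts both classes at least once.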
We methodically integrated snore feature classes and the process described in section 3.1
was followed separately for all cases (I–VII). Table 4 displays the mean OSA detection
sensitivity/specificity obtained for female data sets when snore feature classes were integrated
at AHI decision threshold = 15. The left hand side of the table shows feature class mix denoted
by case numbers from I to VII.
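The seven cases appear consistent with one model that uses all six feature classes (case I) and six models that each leave one class out (cases II–VII). A minimal sketch of enumerating such combinations is given below; which class each of cases II–VII omits is not recoverable here, so the ordering produced by this code is an assumption and need not match the case numbering in the tables.

```python
from itertools import combinations

FEATURE_CLASSES = ["Pitch", "Recurrence", "Formant", "HoS", "Non-Gaussianity", "MFCC"]


def integration_cases(classes):
    """Full set first, then every subset that omits exactly one class."""
    cases = [tuple(classes)]
    cases.extend(combinations(classes, len(classes) - 1))
    return cases


cases = integration_cases(FEATURE_CLASSES)
```

Each tuple names the feature classes whose features would be pooled before the selection and model fitting steps of section 3.1 are repeated.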
The model integrating all snore feature classes (case I) resulted in 92 ± 9.89% sensitivity while
holding specificity at 91.2 ± 11.54%. Across cases I–VII, the integrated models resulted in
sensitivities in the range 84–92% and specificities in the range 86–91%.
Table 5 summarizes the mean OSA detection sensitivity/specificity values obtained for
female data sets for individual feature classes at the decision threshold AHI = 30. Table 5
Table 4. OSA detection performance for 50 testing data sets for integrated snore feature classes (for AHI decision threshold = 15).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.97 ± 0.03 92 ± 9.89 91.2 ± 11.54 0.92 ± 0.09 0.91 ± 0.11
II x x x x x 0.91 ± 0.09 89.6 ± 10.09 88.4 ± 12.18 0.89 ± 0.10 0.88 ± 0.12
III x x x x x 0.88 ± 0.04 84.0 ± 8.08 87.6 ± 9.80 0.84 ± 0.08 0.87 ± 0.09
IV x x x x x 0.94 ± 0.06 90.0 ± 10.10 89.2 ± 11.57 0.90 ± 0.10 0.89 ± 0.11
V x x x x x 0.94 ± 0.02 84.8 ± 8.62 86.4 ± 9.42 0.84 ± 0.08 0.86 ± 0.09
VI x x x x x 0.95 ± 0.04 87.2 ± 9.69 90.4 ± 10.09 0.87 ± 0.09 0.90 ± 0.10
VII x x x x x 0.94 ± 0.05 88.4 ± 9.97 86.4 ± 9.42 0.88 ± 0.09 0.86 ± 0.09
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence, For: formant, HoS: higher order statistics,
Non-Gau: non-Gaussianity, MFCC: Mel frequency cepstrum coefficients.


Figure 4. Receiver operating characteristic curves drawn for one classification training data set
of male subjects for individual feature classes in OSA/non-OSA separation at AHI decision threshold
= 15 (HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients).

Table 5. Female OSA detection performance for 50 testing data sets when considered for individual
feature classes (for AHI decision threshold = 30).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.74 ± 0.08 64 ± 9.89 74 ± 12.28 0.64 ± 0.09 0.74 ± 0.12
Recurrence 0.73 ± 0.04 61.2 ± 4.79 80 ± 0.0 0.61 ± 0.04 0.80 ± 0.00
Formant 0.75 ± 0.03 68.4 ± 9.97 80 ± 0.0 0.68 ± 0.09 0.8 ± 0.00
HoS 0.82 ± 0.08 80 ± 0.00 84.8 ± 8.62 0.80 ± 0.00 0.84 ± 0.08
Non-Gaussian 0.63 ± 0.05 61.20 ± 4.79 65.6 ± 9.07 0.61 ± 0.04 0.65 ± 0.09
MFCC 0.72 ± 0.06 66.0 ± 9.25 78.8 ± 4.79 0.66 ± 0.09 0.78 ± 0.04
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

illustrates that individual feature class based OSA detection led to sensitivities in the range
61–80% and specificities in the range 65–80%.
Table 6 shows the results obtained for integrated snore feature classes for female data sets at
the AHI decision threshold = 30. Table 6 indicates that mean OSA detection sensitivity (specificity)
remains in the range 84–93% (85–92%) for females at the decision threshold AHI = 30.
According to tables 3 and 5, none of the individual feature class based models produced good
results for OSA detection at AHI decision thresholds = 15 or 30. However, according to tables 4
and 6, integration yielded an average improvement in the range 11–23% in sensitivity and 11–26% in
specificity at the AHI decision threshold = 15. The corresponding improvements at AHI decision
threshold = 30 are 13–23% in sensitivity and 12–20% in specificity. In summary, we have observed a
coherent pattern in OSA detection results at the AHI decision thresholds = 15 and 30 due to snore
feature class integration.
Table 6. OSA detection performance for 50 testing data sets for integrated snore feature classes (for AHI decision threshold = 30).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.97 ± 0.03 93.2 ± 9.57 92.4 ± 11.34 0.93 ± 0.09 0.92 ± 0.11
II x x x x x 0.95 ± 0.07 86.4 ± 9.42 86.8 ± 15.96 0.86 ± 0.09 0.86 ± 0.15
III x x x x x 0.93 ± 0.05 88.0 ± 9.89 85.6 ± 9.07 0.88 ± 0.09 0.85 ± 0.09
IV x x x x x 0.92 ± 0.04 88.0 ± 9.89 86.0 ± 9.25 0.88 ± 0.09 0.86 ± 0.09
V x x x x x 0.94 ± 0.03 84.0 ± 8.08 92.0 ± 9.89 0.84 ± 0.08 0.92 ± 0.09
VI x x x x x 0.95 ± 0.03 88.8 ± 10.02 91.2 ± 10.02 0.88 ± 0.10 0.91 ± 0.10
VII x x x x x 0.92 ± 0.05 86.8 ± 9.57 89.2 ± 10.06 0.86 ± 0.09 0.89 ± 0.10
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence, For: formant, HoS: higher order statistics, Non-Gau: non-Gaussianity,
MFCC: Mel frequency cepstrum coefficients.


Figure 5. Receiver operating characteristics curves drawn for classification training data sets for
integrated feature classes in OSA/non-OSA separation at AHI decision threshold = 15.

Table 7. Male OSA detection performance for 50 testing data sets when considered for individual
feature classes (for AHI decision threshold = 15).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.75 ± 0.08 79 ± 6.8 64.5 ± 8.2 0.79 ± 0.06 0.64 ± 0.08
Recurrence 0.65 ± 0.06 64 ± 15.6 64.8 ± 10.4 0.64 ± 0.15 0.64 ± 0.10
Formant 0.55 ± 0.05 66.7 ± 13.7 48.2 ± 10.3 0.66 ± 0.13 0.48 ± 0.10
HoS 0.69 ± 0.04 67.5 ± 7.9 65.4 ± 9.6 0.67 ± 0.07 0.65 ± 0.09
Non-Gaussian 0.62 ± 0.05 68.7 ± 8.4 63.1 ± 9.1 0.68 ± 0.08 0.63 ± 0.09
MFCC 0.68 ± 0.07 60.7 ± 12.3 73.4 ± 12.9 0.60 ± 0.12 0.73 ± 0.12
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

3.3. Male OSA diagnosis

We adopted the same method for male subject data analysis as used for female subjects. Figure 4
shows the receiver operating characteristics curves (ROC) drawn for different snore feature
classes for training data sets of male subjects. Table 7 contains OSA detection performance
results for male subjects at the AHI decision threshold = 15.
Individual feature class based analysis resulted in OSA detection sensitivities in the range
60–79% while holding specificities from 48% to 73% for males at the AHI decision threshold = 15.
Table 8 presents the results we obtained for integrated snore feature classes for male data sets
at the AHI = 15 decision threshold. It demonstrates that integration of snore feature classes
resulted in a mean sensitivity in the range 85–90% while holding specificity in the range 83–91%.
The model integrating all feature classes (case I) achieved a sensitivity of 90.2% while holding
a specificity of 91.7%.
Tables 9 and 10 show the OSA diagnostic performance we obtained for individual feature classes
and integrated feature class cases, respectively, at AHI decision threshold = 30.
Table 8. OSA detection performance for 50 testing data sets for integrated snore feature classes (for AHI decision threshold = 15).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.93 ± 0.03 90.2 ± 5.23 91.7 ± 7.12 0.90 ± 0.05 0.91 ± 0.07
II x x x x x 0.87 ± 0.05 87.7 ± 3.98 87.7 ± 6.46 0.87 ± 0.03 0.87 ± 0.06
III x x x x x 0.92 ± 0.03 85.5 ± 6.85 83.7 ± 8.16 0.85 ± 0.06 0.83 ± 0.08
IV x x x x x 0.91 ± 0.05 88.0 ± 4.34 88.0 ± 6.68 0.88 ± 0.04 0.88 ± 0.06
V x x x x x 0.91 ± 0.04 85.0 ± 7.14 84.5 ± 9.93 0.85 ± 0.07 0.84 ± 0.09
VI x x x x x 0.91 ± 0.05 87.5 ± 3.98 88.2 ± 6.88 0.87 ± 0.03 0.88 ± 0.06
VII x x x x x 0.92 ± 0.05 87.0 ± 6.16 86.2 ± 8.14 0.87 ± 0.06 0.86 ± 0.08
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence, For: formant, HoS: higher order statistics, Non-Gau: non-Gaussianity,
MFCC: Mel frequency cepstrum coefficients.


Table 9. Male OSA detection performance for 50 testing data sets when considered for individual
feature classes (for AHI decision threshold = 30).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.72 ± 0.08 69.7 ± 14.08 68.0 ± 11.01 0.69 ± 0.14 0.68 ± 0.11
Recurrence 0.55 ± 0.04 64.75 ± 5.46 60.85 ± 7.53 0.64 ± 0.05 0.60 ± 0.07
Formant 0.49 ± 0.03 53 ± 5.95 53.14 ± 10.81 0.53 ± 0.05 0.53 ± 0.10
HoS 0.61 ± 0.05 63 ± 2.47 64 ± 7.76 0.63 ± 0.02 0.64 ± 0.07
Non-Gaussian 0.52 ± 0.07 50.5 ± 2.47 72.57 ± 3.91 0.50 ± 0.02 0.72 ± 0.03
MFCC 0.57 ± 0.03 64.7 ± 4.85 58.57 ± 4.32 0.64 ± 0.04 0.58 ± 0.04
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

Table 10. OSA detection performance for 50 testing data sets for integrated snore feature classes
(for AHI decision threshold = 30).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.96 ± 0.04 91.2 ± 3.23 92.4 ± 9.80 0.91 ± 0.03 0.92 ± 0.09
II x x x x x 0.86 ± 0.04 86.4 ± 5.79 84.5 ± 13.44 0.86 ± 0.05 0.84 ± 0.13
III x x x x x 0.93 ± 0.05 85.5 ± 5.59 85.8 ± 11.83 0.85 ± 0.05 0.85 ± 0.11
IV x x x x x 0.92 ± 0.04 83.4 ± 5.48 86.0 ± 9.25 0.83 ± 0.05 0.86 ± 0.09
V x x x x x 0.93 ± 0.05 84.9 ± 5.60 86.6 ± 11.22 0.84 ± 0.05 0.86 ± 0.11
VI x x x x x 0.94 ± 0.05 86.0 ± 5.78 88.9 ± 11.35 0.86 ± 0.05 0.88 ± 0.11
VII x x x x x 0.93 ± 0.05 85.6 ± 5.80 88.2 ± 11.14 0.85 ± 0.05 0.88 ± 0.11
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence,
For: formant, HoS: higher order statistics, Non-Gau: non-Gaussianity, MFCC: Mel frequency cepstrum coefficients.

Individual feature class based methods resulted in sensitivities in the range 50–69% with
specificities in the range 53–72%. Snore feature class integrated methods resulted in sensitivities
in the range 83–91% while holding specificities in the range 85–92%.
Tables 7, 8, 9 and 10 show that snore feature class integration based methods provide better OSA
diagnostic results for male subjects at both AHI = 15 and 30 decision thresholds.

3.4. Male/female combined OSA diagnosis


We combined male and female subjects into one group and analysed the data set following
the same procedure as was performed with male and female subject data sets. Figure 5 shows
the receiver operating characteristics curves (ROC) drawn for integrated snore feature classes
for training data sets of male, female and male/female combined subjects at AHI decision
threshold = 15. Tables 11 and 12 show the results we obtained for individual feature classes
and integrated feature class cases respectively for AHI = 15 decision threshold.
Tables 11 and 12 show that the integration of snore feature classes leads to an average
sensitivity improvement in the range 29–56% and an average specificity improvement of 7–20%.
Tables 13 and 14 show the diagnostic sensitivity/specificity we achieved for individual feature
classes and integrated feature class cases, respectively, for AHI = 30 decision threshold. They
show that individual feature class based methods resulted in OSA diagnostic sensitivities in the
range 31–66% while holding specificities in the range 67–78%. Snore feature class integration
based methods reported sensitivities in the range 84–88% at specificities in the range 81–84%.

Table 11. Male/female combined OSA detection performance for 50 testing data sets when
considered for individual feature classes (for AHI decision threshold = 15).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.64 ± 0.05 57.23 ± 6.05 68.30 ± 7.39 0.57 ± 0.06 0.68 ± 0.07
Recurrence 0.55 ± 0.04 38.30 ± 10.36 73.38 ± 10.79 0.38 ± 0.10 0.73 ± 0.10
Formant 0.49 ± 0.05 34.3 ± 5.64 72.3 ± 9.9 0.34 ± 0.05 0.72 ± 0.09
HoS 0.41 ± 0.03 27.84 ± 5.57 79.38 ± 4.23 0.27 ± 0.05 0.79 ± 0.04
Non-Gaussian 0.40 ± 0.04 33.07 ± 4.18 63.84 ± 7.33 0.33 ± 0.04 0.63 ± 0.07
MFCC 0.57 ± 0.05 51.07 ± 7.08 67.69 ± 9.82 0.51 ± 0.07 0.67 ± 0.09
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

Overall, we have observed a coherent pattern in OSA detection performance due to snore feature
class integration across AHI decision thresholds (15, 30) and subject groups (male, female,
male/female combined). Integrating all snore feature classes (case I, tables 6 and 10) gave, at
AHI decision threshold = 30, a mean sensitivity of 93.2% and mean specificity of 92.4% for females,
and 91.2% and 92.4%, respectively, for males.

3.5. Impact of gender and neck circumference on OSA diagnostic performance

OSA is markedly more prevalent in males. Neck circumference (NC) has consistently been
reported to be associated with OSA, and it is usually measured during clinical examination of
the patient. In our method, snore features were augmented with NC because it incurs no extra cost
or complication, yet provides us with an opportunistic feature to improve the diagnostic
performance of the snore-sound based technology. Mean sensitivity (specificity) values for 50
classification data sets are presented in table 15 for AHI decision threshold = 15. Table 15
shows that the augmentation with NC resulted in mean sensitivity (specificity) of 92.2% (93.1%)
for males and 93.6% (93.6%) for females. The male/female combined data set resulted in a
sensitivity of 87.8% while holding a specificity of 87.2% at the AHI decision threshold = 15.
Table 16 provides performance statistics of our algorithm at the decision threshold AHI = 30
when snore features were augmented with NC. These methods resulted in mean sensitivity
(specificity) of 92.8% (93.2%) for males and 94.4% (93.6%) for females. The male/female combined
data set resulted in a sensitivity of 88.9% while holding a specificity of 85.5% at the AHI
decision threshold = 30.
Figure 6 displays the mean OSA diagnostic performance for female data sets for individual
feature classes, integrated feature classes, and integrated feature classes augmented with NC
at AHI decision threshold = 15. These observations verify that the incorporation of NC enhanced
our method's diagnostic performance. Figure 7 shows mean OSA diagnostic sensitivity/specificity
for the male/female combined, male, and female data sets for integrated snore feature classes
with and without NC augmentation at AHI decision threshold = 15. These results indicate that
augmenting our feature vector with NC improved our method's mean diagnostic sensitivity by 2.05%
and specificity by 1.44% for males, while the corresponding values for females were 0.4% and 1.2%.
The male/female combined data set mean sensitivity and specificity were each improved by 0.9% at
AHI decision threshold = 15.
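The improvements quoted for the male and combined groups are differences in percentage points between the case I rows of tables 8 and 12 and the corresponding rows of table 15. The short sketch below reproduces that arithmetic; the dictionary and function names are chosen for this example only.

```python
# (sensitivity %, specificity %) at AHI decision threshold = 15.
SNORE_ONLY = {"male": (90.2, 91.7), "combined": (86.9, 86.3)}          # case I, tables 8 and 12
SNORE_PLUS_NC = {"male": (92.25, 93.14), "combined": (87.84, 87.23)}   # table 15


def improvement(group):
    """Percentage-point gain in (sensitivity, specificity) after adding NC."""
    (sen0, spe0), (sen1, spe1) = SNORE_ONLY[group], SNORE_PLUS_NC[group]
    return round(sen1 - sen0, 2), round(spe1 - spe0, 2)


male_gain = improvement("male")
combined_gain = improvement("combined")
```

For males this gives gains of 2.05 and 1.44 percentage points, and for the combined group 0.94 and 0.93, which round to the 0.9% quoted in the text.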
The work discussed in this paper was limited to a database consisting of subjects who were
referred to a sleep clinic for suspected OSA. Our database has very few subjects with an AHI
below 5. However, this algorithm was tested on two AHI thresholds (AHI
Table 12. OSA detection performance for 50 testing data sets for integrated snore feature classes (for AHI decision threshold = 15).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.91 ± 0.01 86.9 ± 5.43 86.3 ± 4.73 0.86 ± 0.05 0.86 ± 0.04
II x x x x x 0.86 ± 0.02 84.6 ± 3.80 85.0 ± 3.93 0.84 ± 0.03 0.85 ± 0.03
III x x x x x 0.88 ± 0.02 83.6 ± 5.52 83.8 ± 5.65 0.83 ± 0.05 0.83 ± 0.05
IV x x x x x 0.89 ± 0.02 86.0 ± 4.30 85.8 ± 3.91 0.86 ± 0.04 0.85 ± 0.03
V x x x x x 0.88 ± 0.02 84.1 ± 4.50 84.6 ± 4.91 0.84 ± 0.04 0.84 ± 0.04
VI x x x x x 0.90 ± 0.02 86.7 ± 4.67 87.0 ± 4.77 0.86 ± 0.04 0.87 ± 0.04
VII x x x x x 0.90 ± 0.02 85.8 ± 3.59 86.4 ± 4.27 0.85 ± 0.03 0.86 ± 0.04
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence, For: formant, HoS: higher order statistics,
Non-Gau: non-Gaussianity, MFCC: Mel frequency cepstrum coefficients.


Table 13. Male/female combined OSA detection performance for 50 testing data sets when
considered for individual feature classes (for AHI decision threshold = 30).
AUC Sensitivity (%) Specificity (%) PPV NPV
Pitch 0.68 ± 0.02 66.28 ± 3.82 68.16 ± 4.01 0.66 ± 0.03 0.68 ± 0.04
Recurrence 0.64 ± 0.03 65.42 ± 2.64 67.5 ± 2.52 0.65 ± 0.02 0.67 ± 0.02
Formant 0.55 ± 0.05 51.42 ± 3.22 71.16 ± 6.77 0.51 ± 0.03 0.71 ± 0.06
HoS 0.46 ± 0.02 37 ± 2.77 73.66 ± 7.21 0.37 ± 0.02 0.73 ± 0.07
Non-Gaussian 0.50 ± 0.03 31 ± 3.98 78 ± 9.33 0.31 ± 0.03 0.78 ± 0.09
MFCC 0.58 ± 0.05 49.9 ± 8.07 68.8 ± 12.30 0.49 ± 0.08 0.68 ± 0.12
HoS: higher order statistics, MFCC: Mel frequency cepstrum coefficients, AUC: area under curve, PPV: positive
predictive value, NPV: negative predictive value.

Table 14. OSA detection performance for 50 testing data sets for integrated snore feature classes
(for AHI decision threshold = 30).
Case Pit Rec For HoS Non-Gau MFCC AUC Sensitivity (%) Specificity (%) PPV NPV
I x x x x x x 0.91 ± 0.03 88.4 ± 7.51 84.9 ± 5.45 0.88 ± 0.07 0.84 ± 0.05
II x x x x x 0.85 ± 0.04 86.8 ± 6.09 82.9 ± 4.13 0.86 ± 0.06 0.82 ± 0.04
III x x x x x 0.88 ± 0.02 85.4 ± 7.50 83.1 ± 5.13 0.85 ± 0.07 0.83 ± 0.05
IV x x x x x 0.89 ± 0.03 84.4 ± 9.04 81.4 ± 5.51 0.84 ± 0.09 0.81 ± 0.05
V x x x x x 0.89 ± 0.03 84.8 ± 9.19 81.5 ± 6.01 0.84 ± 0.09 0.81 ± 0.06
VI x x x x x 0.90 ± 0.03 85.7 ± 7.93 83.0 ± 5.93 0.85 ± 0.07 0.83 ± 0.05
VII x x x x x 0.90 ± 0.03 85.6 ± 8.53 82.6 ± 5.76 0.85 ± 0.08 0.82 ± 0.05
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value, Pit: pitch, Rec: recurrence,
For: formant, HoS: higher order statistics, Non-Gau: non-Gaussianity, MFCC: Mel frequency cepstrum coefficients.

Table 15. OSA detection performance for 50 testing data sets when considered only snore derived
features augmented with neck circumference (for AHI decision threshold = 15).
AUC Sensitivity (%) Specificity (%) PPV NPV

Male 0.94 ± 0.03 92.25 ± 6.12 93.14 ± 7.20 0.92 ± 0.06 0.93 ± 0.07
Female 0.98 ± 0.01 93.6 ± 9.42 93.6 ± 9.42 0.93 ± 0.09 0.93 ± 0.09
Male/female combined 0.92 ± 0.02 87.84 ± 4.68 87.23 ± 4.28 0.87 ± 0.04 0.87 ± 0.04
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value.

Table 16. OSA detection performance for 50 testing data sets when considered only snore derived
features augmented with neck circumference (for AHI decision threshold = 30).
AUC Sensitivity (%) Specificity (%) PPV NPV

Male 0.96 ± 0.04 92.8 ± 3.71 93.20 ± 9.57 0.92 ± 0.03 0.93 ± 0.09
Female 0.98 ± 0.02 94.4 ± 9.07 93.6 ± 11.02 0.94 ± 0.09 0.93 ± 0.11
Male/female combined 0.92 ± 0.03 88.92 ± 6.48 85.52 ± 4.95 0.88 ± 0.06 0.85 ± 0.04
AUC: area under curve, PPV: positive predictive value, NPV: negative predictive value.

threshold = 15 and 30). For this reason, this algorithm was not extensively tested on healthy
snorers.
Another limitation of the work is that we recorded snore sounds in a sleep laboratory of a
hospital. Therefore, the results reported here may not directly apply to home-based recordings.

Figure 6. Female subject results obtained at AHI decision threshold = 15. This figure
indicates that integration of snore feature classes leads to an improved diagnostic
sensitivity/specificity performance. (For: formant, Rec: recurrence, HoS: higher order statistics,
Pit: pitch, Sno: integrated snore feature classes, Gau: non-Gaussianity, Sno+NC: integrated snore
feature classes augmented with neck circumference (NC), Sen: sensitivity, Spe: specificity).

Figure 7. Impact on OSA diagnostic performance due to incorporation of gender and neck
circumference (NC) into our algorithms. Sensitivity/specificity values for the combined, male and
female subject groups are shown. This diagram shows that augmentation with NC and gender
improves the diagnostic performance of our algorithm.

In order to extend the work to a home environment we need to further study the influence of
external sounds on our algorithms.

4. Conclusion

In this paper, we propose a snore based OSA screening method where SRS were recorded in a
controlled hospital environment. All night SRS recordings were automatically segmented to
identify snore episodes using a pattern recognition algorithm. Snore episodes were used to
derive new features from voiced segments of snores, in consideration of the snore generation
process and the UA properties during apnea events. Snore feature classes, individually as well
as in integration, were analyzed using logistic regression modeling techniques to classify
subjects into OSA/non-OSA classes. We also incorporated freely and readily available gender and
neck circumference information into our model, which enhanced the OSA diagnosis
sensitivity/specificity without adding further complexity to the technique. The proposed methods
were applied to our database (51 males and 35 females) and the results were cross validated. We
achieved mean OSA detection sensitivities in the range 92.2–93.6% and mean specificities in the
range 93.1–93.6% at AHI decision threshold = 15; the corresponding values at AHI decision
threshold = 30 are 92.8–94.4% and 93.2–93.6%, respectively. These results establish the potential
of developing a snore based OSA detection tool with an acceptable sensitivity/specificity
performance. The methods proposed in this paper allow overnight snore sounds of any length to be
recorded. Automation of the recording and the post data analysis process makes the services of a
sleep technologist unnecessary. The automated, non-contact, low cost and unattended nature of
this method makes it well suited to population screening of OSA.

Acknowledgment

This work is partially supported by the Australian Research Council under grant DP120100141
to URA.

Appendix

Objective definition for snoring episodes (SE).


(I) We define a term ‘Breath Record’ as the snore related sound data originated from the
patient from the start of an inspiration to the corresponding end of expiration.
(II) We define a term ‘Snoring Episode’ (SE) as a Breath Record with at least one portion of
it containing sound with a detectable pitch. The part with detectable pitch is termed as
‘Voiced snoring segment (VSS)’. The rest of the SE containing sound without pitch is
classified as ‘Unvoiced snoring segment (UVSS)’.
(III) A Breath Record that is not a Snoring Episode is called a ‘Pure-Breath Record’.
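
The three definitions above can be expressed as a small data structure. This is only an illustrative sketch: the class and attribute names are hypothetical, and the pitch detection that produces the voiced segments is assumed to have been done elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class BreathRecord:
    """Sound from the start of an inspiration to the end of the corresponding expiration."""
    # (start_s, end_s) spans with a detectable pitch, i.e. voiced snoring segments (VSS).
    voiced_segments: list = field(default_factory=list)

    def is_snoring_episode(self):
        """A Breath Record is a Snoring Episode if it has at least one voiced segment."""
        return len(self.voiced_segments) > 0

    def label(self):
        return "Snoring Episode" if self.is_snoring_episode() else "Pure-Breath Record"


quiet_breath = BreathRecord()
snore = BreathRecord(voiced_segments=[(0.4, 0.9)])  # hypothetical timing in seconds
```

Everything in a Snoring Episode outside its voiced segments corresponds to the unvoiced snoring segment (UVSS) of definition (II).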

References

Abeyratne U R 1999 Blind reconstruction of non-minimum phase systems from 1-D oblique slices of the bispectrum
IEEE Proc. Vis. Image Signal Process. 146 253–64
Abeyratne U R, Karunajeewa A S and Hukins C 2007 Mixed-phase modeling in snore sound analysis Med. Biol. Eng.
Comput. 45 791–806
Abeyratne U R, Patabandi C K K and Puvanendran K 2001 Pitch-jitter analysis snoring signals in the diagnosis of
obstructive sleep apnea IEEE EMBC: Proc. 23rd Annu. Int. Conf. IEEE Engineering in Medicine and Biology
Society (Istanbul, Turkey) pp 2072–5
Abeyratne U R, Wakwella A S and Hukins C 2005 Pitch jump probability measure for the analysis of snoring sound
in apnea Physiol. Meas. 26 779–98
ASDA (The Sleep Disorders Atlas Task Force of the American Sleep Disorders Association) 1992 EEG arousals:
scoring rules and examples Sleep 15 173–84
ASDA (The Sleep Disorders Atlas Task Force of the American Sleep Disorders Association) 1993 Recording and
scoring leg movements Sleep 16 748–59
Baldwin C M, Bell I R, Guerra S and Quan S F 2005 Obstructive sleep apnea and ischemic heart disease in
southwestern US veterans: implications for clinical practice Sleep Breath 9 111–8
Baldwin C M, Griffitch K A, Nieto F J, O’Connor G T, Walsleben J A and Redline S 2001 The association of
sleep-disordered breathing and sleep symptoms with quality of life in the sleep heart health study Sleep
24 96–105
Obstructive sleep apnea screening by integrating snore feature classes 121

Bar A, Pillar G, Dvir I, Sheffy J, Schnall R P and Lavie P 2003 Evaluation of a portable devise based on peripheral
arterial tone for unattended home sleep studies Chest 123 695–703
Cavusoglu M, Ciloglu T, Serinagaoglu Y, Kamasak M, Erogul O and Akcam T 2008 Investigation of sequential
properties of snoring episodes for obstructive sleep apnea identification Physiol. Meas. 29 879–98
Davies R J, Ali N J and Stradling J R 1992 Neck circumference and other clinical features in the diagnosis of the
obstructive sleep apnoea syndrome Thorax 47 101–5
Davis S and Merlmestein P 1980 Comparison of parametric representations for monosyllabic word recognition in
continuously spoken sentences IEEE Trans. Acoust. Speech Signal Process. 28 357–66
de Silva S, Abeyratne U R and Hukins C 2011 A Method to screen obstructive sleep apnea using multi-variable
non-intrusive measurements Physiol. Meas. 32 445–65
Dwyer A J 1996 In pursuit of a piece of the ROC Radiology 201 621–25
Emoto T, Abeyratne U R, Akutagawa M, Konaka S and Kinouchi Y 2011 High frequency region of the snore spectra
carry important information on the disease of sleep apnea J. Med. Eng. Technol. 35 425–31
Epstein E, Skoog L, Isberg P, De Smet F, De Moor B, Olofsson P, Gudmundsson S and Valentin L 2002 An algorithm
including results of gray scale and power Doppler ultrasound examination to predict endometrial malignancy in
women with postmenopausal bleeding Ultrasound Obstet. Gynecol. 20 370–76
Fiz J A, Abad J, Jane R, Riera M, Mannanas M A, Caminal P, Rodenstein D and Morera J 1996 Acoustic analysis of
snoring in patients with simple snoring and obstructive sleep apnea Eur. Respir. J. 9 2365–70
Flemons W W et al (American Academy of Sleep Medicine Task Force) 1999 Sleep related breathing disorders
in adults: recommendations for syndrome definition and measurement techniques in clinical research Sleep
22 667–89
Ghaemmaghami H, Abeyratne U R and Hukins C 2009 Normal probability testing for snore signals for diagnosis of
obstructive sleep apnea Proc. IEEE Conf. on Eng Med. Biol. Soc. (USA, 3–6 Sep.) pp 5551–4
Indu A and David M R 2003 The upper airway in sleep: physiology of the pharynx Sleep Med. Rev. 7 9–33
Karunajeewa A S, Abeyratne U R and Hukins C 2008 Silence-breathing-snore classification from snore related sounds
Physiol. Meas. 29 227–43
Karunajeewa A S, Abeyratne U R and Hukins C 2011 Multi-feature snore sound analysis in obstructive sleep apnea-
hypopnea syndrome Physiol. Meas. 32 83–97
Lee T H, Abeyratne U R, Puvanendran K and Goh K L 2000 Formant-structure and phase-coupling analysis of human
snoring sounds for the detection of obstructive sleep apnea Computer Methods in Biomechanics and Biomedical
Engineering: 3 ed J Middletion, M L Jones and G N Pande (Amsterdam: Gordon and Breach)
Malhotra A and White D P 2002 Obstructive sleep apnoea Lancet 360 237–45
Max A L et al 2007 Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection Biomed.
Eng. Online 6 23
Metz C E 1978 Basic principles of ROC analysis Semin. Nucl. Med. 8 283–98
Mortimore I L et al 1998 Neck and total body fat deposition in non-obese and obese patients with sleep apnea
compared with that in control subjects Am. J. Respir. Crit. Care Med. 157 280–83
Ng A K, Koh T S, Baey E, Lee T H, Abeyratne U R and Puvendran K 2008 Could formant frequencies of snore sound
be an alternative means for the diagnosis of obstructive sleep apnea? Sleep Med. 9 894–98
Perez-Padilla J R, Slawinski E, Difrancesco L M, Feige R R, Remmers J E and Whitelaw W A 1993 Characteristics
of the snoring noise in patients with and without occlusive sleep-apnea Am. Rev. Respir. Dis. 147 635–44
Rechtchaffen A and Kales A 1968 A manual of standardized terminology: techniques and scoring system for sleep
stages of human subjects UCLA Brain Information Service/Brain Research Institute, Los Angeles, CA , USA
Shama K, Krishna A and Cholayya N U 2007 Study of harmonics-to-noise ratio and critical-band energy spectrum
of speech as acoustic indicators of laryngeal and voice pathology EURASIP J. Adv. Signal Process 2007
Soler J S, Jane R, Fiz J A and Morera J 2007 Automatic classification of subjects with and without sleep apnea through
snoring analysis Proc. 29th IEEE Conf. on Engineering in Medicine and Biology Society (Lyon, France, 23–26
Aug.) pp 6093–6
Steven C B, Halbert W and Beatrice A G 2001 Logistic regression in the medical literature: standards for use and
reporting, with particular attention to one medical domain J. Clin. Epidemiol. 54 979–85
Timmerman D et al 2005 Logistic regression model to distinguish between the benign and malignant adnexal mass
before surgery: a multicenter study by the international ovarian tumor analysis group J. Clin. Oncol. 23 8794–801
Young T, Evans L, Finn L and Palta M 1997 Estimation of the clinically diagnosed proportion of sleep apnea syndrome
in middle aged men and women Sleep 20 705–6
