

Developing an Apnea/Hypopnea Diagnostic Model Using SVM




Among sleep-related disorders, sleep apnea has received particular attention. It is the most common respiratory disorder, in which respiration ceases repeatedly, and it can lead to serious health disorders and even mortality. Polysomnography is currently the standard method for diagnosing this disease, but it is costly and time-consuming. The present study aimed to analyze vital signals to diagnose sleep apnea using machine learning algorithms.

Material and Methods:

This analytical–descriptive study was conducted on 50 patients (11 normal, 13 mild, 17 moderate and 9 severe patients) in the sleep clinic of Imam Khomeini hospital. Initially, data pre-processing was carried out in two steps (noise elimination and a moving average algorithm). Next, using the singular value decomposition method, 12 features were extracted from the airflow signal. Finally, to classify the data, SVMs with quadratic, polynomial and RBF kernels were trained and tested.


After applying different kernel functions to the SVM, the RBF kernel showed the best performance. Under 10-fold cross-validation, the mean accuracies obtained for the normal, apnea, and hypopnea classes were 92.74%, 91.70%, and 93.26%, respectively.


The results show that, in online applications or in applications where both the volume and time of computation and the accuracy of the result matter, the disease can be diagnosed with acceptable accuracy using machine learning algorithms.


Apnea means cessation of breathing; when it occurs during sleep, it is called sleep apnea [1]. In adults, apnea is defined as the cessation of breathing for more than 10 seconds, and hypopnea as a 50% reduction in airflow from baseline during sleep for more than 10 seconds, accompanied by a drop in arterial oxygen of more than 4% or by arousal [2]. The disorder appears as central respiratory distress (36%), obstructive apnea (12%), or a combination of abnormal central respiration and obstructive apnea in the remaining cases [3]. Sleep apnea is common: an estimated 4% of men and 2% of women aged 30 to 60 suffer from it [4, 5]. Approximately 93% of women and 82% of men with moderate to severe sleep apnea syndrome have not been clinically diagnosed [6]. From the third to the seventh decade of life, the prevalence of this disorder rises from 2% to 36% and from 4% to 50% among men and women, respectively [7]. Exhaustion and extreme drowsiness are among the most common symptoms; they play an important role in reducing a person's efficiency during the day and are followed by irreparable outcomes, and even death, in some cases. Other symptoms include:

High blood pressure, cardiovascular disorders, stroke, depression, increased risk of driving accidents, lower efficiency and poor performance, lack of motivation, impaired concentration, low tolerance, poor endurance, and difficulty in problem-solving [4, 8-10].

Given the importance of this disorder and the availability of known treatments, diagnosis is imperative. Polysomnography (PSG), an overnight sleep test, is the standard method for diagnosing sleep apnea. In this method, the patient spends an entire night in a sleep clinic under the care of a patient care technician or a trained sleep disorder specialist. Electrodes and sensors are placed on different parts of the body associated with sleep disorders, such as the head, face, chest, and legs, so that the patient's bodily functions during sleep can be examined [11]. The vital signals monitored and recorded during sleep with this method include electroencephalography (EEG), electrocardiography (ECG), electromyography (EMG), electrooculography (EOG), heart rate variability (HRV), snoring sounds, oxygen saturation (SpO2), nasal and oral airflow, and thoracic and abdominal motion. However, this test is expensive and only a limited number of health centers offer it, so ways of diagnosing patients with suspected sleep apnea have been studied [12-14]. Several methods have been used for this purpose.

Four feedforward neural networks with different structures were used to separate respiratory cessation from slowed respiration: two networks were applied directly to the airflow signal, and the other two used abdominal and thoracic signals in addition to airflow [15].

Using the non-linear dynamic information present in HRV signals, Nguyen et al. classified apnea and healthy cases [16]. Hassan et al. used statistical and spectral features extracted from ECG signals to separate healthy and apnea cases [17]. In another study, non-linear and statistical SpO2 features of 320 patients were extracted to classify apnea-hypopnea severity using AdaBoost, a Bayesian multilayer perceptron, linear discriminant analysis, and 1-vs-all logistic regression [18]. In [19], the artificial bee colony (ABC) algorithm was used to extract coefficient features from EEG signals for apnea diagnosis, with least squares and machine learning used for classification. The present study aimed at the smart diagnosis of apnea-hypopnea based on classification algorithms. Considering its low cost, high accuracy, and convenient setup, and given that sleep apnea affects various vital signals, the airflow signal was chosen for processing in this research.


This study is analytical–descriptive. The data were collected from 50 patients in the sleep clinic of Imam Khomeini hospital, in line with the results of previous studies and expert opinions. Sleep duration was between 7 and 9 hours, and the signals were recorded at a sampling frequency of 100 Hz with 16-bit resolution; the diagnostic results for these recordings had been determined previously. Among the 50 individuals, 10 normal, 13 mild, 17 moderate, and 9 severe cases were identified.

After the appropriate signal was chosen and the noise was eliminated with a Chebyshev Type II filter, ten-second windows were selected, in line with the definition of apnea, so that the end of the window preceding an apnea event coincides with the start of that event. Then, to increase accuracy and cover the whole signal, the window was shifted in one-second steps until the start of the window coincided with the end of the apnea event.
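The windowing step can be illustrated with a minimal sketch. It assumes a plain list of samples at 100 Hz, ten-second windows, and a one-second shift, as described above; the function name `sliding_windows` and the synthetic signal are hypothetical, and the alignment of windows to annotated apnea events is not modeled here.

```python
def sliding_windows(signal, fs=100, window_s=10, step_s=1):
    """Return (start_index, window) pairs for fixed-length windows.

    fs is the sampling frequency in Hz; consecutive windows are
    shifted by step_s seconds and each holds window_s * fs samples.
    """
    size = window_s * fs
    step = step_s * fs
    return [
        (start, signal[start:start + size])
        for start in range(0, len(signal) - size + 1, step)
    ]


# 30 s of synthetic samples at 100 Hz (placeholder values, not patient data)
signal = list(range(30 * 100))
windows = sliding_windows(signal)
print(len(windows), "windows of", len(windows[0][1]), "samples each")
```

With a one-second shift, neighbouring windows overlap by nine seconds, which is why the number of windows grows much faster than the number of annotated events.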

Since apnea events differ in duration, a 30-second window at a sampling frequency of 100 Hz was considered for each apnea event. To count the normal windows, the number of apnea events was multiplied by 1000. In this way, all windows to be passed to the classifier were determined. In this research, the moving average algorithm was applied before feature extraction. In statistics, the moving average is a widely used technique for analyzing time series [20].

This technique is used to reduce short-term oscillations and to reveal the long-term behavior of a time series. Mathematically, the moving average is an instance of convolution, and from a signal processing point of view it acts as a filter. The extracted features are shown in Table 1.
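To make the link between the moving average and convolution concrete, here is a small sketch. The window length `k` is an assumption, since the text does not state the length used in the study, and `moving_average` is a hypothetical helper name.

```python
def moving_average(x, k):
    """Simple moving average of x, written as a convolution with a
    length-k box kernel whose weights are all 1/k.

    Only positions where the kernel fully overlaps the signal are
    returned, so the output has len(x) - k + 1 samples.
    """
    if k < 1 or k > len(x):
        raise ValueError("k must be between 1 and len(x)")
    kernel = [1.0 / k] * k
    return [
        sum(kernel[j] * x[i + j] for j in range(k))
        for i in range(len(x) - k + 1)
    ]


# Illustrative input; k=3 is an assumed window length
smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], k=3)
print(smoothed)
```

A larger `k` suppresses more of the short-term oscillation but also blurs short events, so the choice of `k` trades smoothness against time resolution.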

After feature extraction, statistical analysis was used to compare the data and the features. For this purpose, a t-test with a significance level of α = 0.05 was applied to determine whether the extracted features differed.


Support vector machine (SVM) is a machine learning method and a binary classifier. For two classes, the method attempts to find a hyperplane whose distance from each class is maximal. The data points closest to the hyperplane are used to measure this distance and are therefore called support vectors. The method consists of two stages, training and testing. At the end of the training phase, the generalization ability of the trained model is assessed using test data [21]. An algorithmic explanation of the SVM model is as follows:

If $D = \{(x_i, y_i)\}_{i=1}^{m}$ is a data set consisting of $m$ samples $x_i$ with labels $y_i \in \{-1, 1\}$ from two classes (healthy and unhealthy), numerous hyperplanes can be considered to separate the two classes. The most appropriate choice is the hyperplane that creates the largest margin between the two classes. By definition, the margin is the sum of the distances from the separating hyperplane to the nearest points of each of the two classes. The balance between the margin and the error on misclassified samples is controlled by a positive constant $C$, which is fixed in advance. It can be shown that the decision function $f(x)$ is expressed as follows:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{m} \lambda_i y_i x^{T} x_i + b\right)$$


$\lambda_i$ are the corresponding Lagrange coefficients. The data points with nonzero Lagrange coefficients are called support vectors. These support vectors lie on the boundary between the two classes. In practice, using a linear classifier to separate nonlinear data causes a significant decrease in efficiency, so it is better to use nonlinear classifiers. This is easily done by mapping the data to a feature space of higher dimension, so that:

$$x \in \mathbb{R}^{d} \rightarrow z(x) \equiv (\phi_1(x), \ldots, \phi_n(x)) \in \mathbb{R}^{n}$$


The relationships of the linear classifier can now be written in this new space. As a result, the decision function takes the following form:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{m} \lambda_i y_i z^{T}(x) z(x_i) + b\right)$$


A key point about the SVM is that the only quantity that must be calculated for decision making is the inner product $z^{T}(x) z(x_i)$. For convenience, the kernel function $K$ is introduced:

$$z^{T}(x) z(y) = \sum_{i=1} \alpha_i \varphi_i(x) \varphi_i(y) = K(x, y)$$


where $\{\alpha_i\}_{i=1}^{m}$ and $\{\varphi_i\}_{i=1}$ are a set of numbers and a set of real functions, respectively. In this way, the decision function takes the following form:

$$f(x) = \operatorname{sign}\left[\sum_{i=1}^{m} \lambda_i y_i K(x, x_i) + b\right]$$


$\lambda_i$ is obtained by solving an equation like equation (3), except that $D_{i,j} = y_i y_j K(x_i, x_j)$. Several kernel functions can be used to map the data into different spaces. The most common kernels are linear, polynomial, and RBF [22, 23]. The mathematical formulas of these kernel functions are as follows:

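Linear Kernel: $K(x_i, x) = x \cdot x_i$

Polynomial Kernel: $K(x_i, x) = (x \cdot x_i + c)^{d}$, $c > 0$

Radial Basis Function (RBF) Kernel: $K(x_i, x) = \exp\left(-\frac{\lVert x_i - x \rVert^{2}}{2\delta^{2}}\right)$ ($\delta$ is a positive real number).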
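The three kernels and the kernelized decision function can be sketched in a few lines of plain Python. This is an illustration only: the support vectors, labels, coefficients $\lambda_i$, and bias $b$ below are made-up values rather than a trained model, and the default parameters `c`, `d`, and `delta` are assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(xi, x):
    return dot(x, xi)


def polynomial_kernel(xi, x, c=1.0, d=2):
    # c > 0 and the degree d are free parameters (assumed defaults)
    return (dot(x, xi) + c) ** d


def rbf_kernel(xi, x, delta=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / (2.0 * delta ** 2))


def decision(x, support_vectors, labels, lambdas, b, kernel):
    """sign(sum_i lambda_i * y_i * K(x, x_i) + b); a sum of exactly 0 maps to +1."""
    s = sum(
        lam * y * kernel(sv, x)
        for sv, y, lam in zip(support_vectors, labels, lambdas)
    )
    return 1 if s + b >= 0 else -1


# Hypothetical two-point "model", for illustration only
svs = [[0.0, 0.0], [2.0, 2.0]]
ys = [-1, 1]
lams = [1.0, 1.0]
print(decision([2.0, 1.5], svs, ys, lams, 0.0, rbf_kernel))
print(decision([0.1, 0.0], svs, ys, lams, 0.0, rbf_kernel))
```

Because the decision function touches the data only through `kernel`, swapping the kernel changes the feature space without changing any other part of the code.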


To use the SVM in multi-class mode, there are two strategies: one against all (OAA) and one against one (OAO). In OAO, a binary SVM is used for each possible pair of classes, so for $n$ classes there are $n(n-1)/2$ binary classifiers. In the OAA method, each SVM separates the data of one class from all other classes, so for $n$ classes there are $n$ binary classifiers. In both methods, the final label of a data point is determined by the maximum bias method [24]. In this study, the OAA method was used with three classes. Label 1 is assigned to the data of the class of interest and label 0 to the data of the other classes. This is repeated for each class, and the best kernel for the SVM is selected accordingly.
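A short sketch of the bookkeeping behind the two strategies follows. The helper names are hypothetical, and picking the class with the largest decision value is a common convention assumed here for illustration; it is not necessarily the exact rule used in the study.

```python
def n_binary_classifiers(n_classes, strategy):
    """Number of binary SVMs needed for a multi-class problem."""
    if strategy == "OAO":
        return n_classes * (n_classes - 1) // 2
    if strategy == "OAA":
        return n_classes
    raise ValueError("strategy must be 'OAO' or 'OAA'")


def oaa_labels(y, positive_class):
    """Relabel a multi-class target: 1 for the class of interest, 0 otherwise."""
    return [1 if label == positive_class else 0 for label in y]


def oaa_predict(scores):
    """Pick the class whose one-against-all classifier scored highest.

    scores maps class name -> decision value of that class's binary SVM
    (assumed combination rule).
    """
    return max(scores, key=scores.get)


y = ["normal", "apnea", "hypopnea", "apnea"]
print(n_binary_classifiers(3, "OAA"), n_binary_classifiers(3, "OAO"))
print(oaa_labels(y, "apnea"))
print(oaa_predict({"normal": -0.4, "apnea": 0.9, "hypopnea": 0.2}))
```

For three classes both strategies need three binary classifiers; the difference in cost only appears from four classes onward.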


The analyzed airflow signals came from 34 male and 16 female patients (32 ≤ age ≤ 64, 24 ≤ BMI ≤ 35). Applying the windowing method to the 50 files (10 normal, 13 mild, 17 moderate, and 9 severe cases) yielded a total of 8880 randomly selected windows, including 3000 normal windows, each described by 12 features. The results indicated a significant difference among windows with apnea, with hypopnea, and with normal breathing. For the SVM classifier, 70% of these cases were assigned to training and 30% to testing. The SVM models were built with linear, polynomial, and RBF kernel functions based on all of the features. Cross-validation is one of the most effective ways to evaluate a classifier: it divides the labeled data into several subsets. Here, 10-fold cross-validation was used to divide the data set into ten independent subsets; each time, one subset was used as test data and the others as training data. Each data point is therefore used once for testing and nine times for training, so the whole data set is covered for both training and testing. Accuracy, sensitivity, and specificity are the measures used to evaluate classification performance, and they were calculated by averaging the results of the SVM iterations. The following relationships illustrate how each of these criteria is calculated.
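The 10-fold scheme described above can be sketched with the standard library alone. The generator name `k_fold_indices` and the fixed seed are assumptions made for reproducibility of the illustration; the sample count of 8880 is the number of windows reported in the text.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation.

    Indices are shuffled once, split into k disjoint folds, and each
    fold serves as the test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


splits = list(k_fold_indices(8880, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Averaging a metric over the ten test folds gives the kind of mean accuracy, sensitivity, and specificity that the evaluation reports.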

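Table 1

Extracted features: time-domain, frequency-domain, and non-linear

| Characteristic | Equation | Variable | Description |
|---|---|---|---|
| By time | $f_1 = E(x) = \mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ | Mean | A statistical measurement of data distribution |
| By time | $f_2 = E[(x - \mu)^2] = \sigma^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \mu)^2$ | Variance | Data scattering |
| By time | $f_3 = \frac{1}{\sigma^3} E[(x - \mu)^3]$ | Skewness | Data distribution symmetry |
| By time | $f_4 = \frac{1}{\sigma^4} E[(x - \mu)^4]$ | Kurtosis | Data peak statistical measurement |
| By frequency | $f_5, f_6, f_7, f_8$ | Mean, Variance, Skewness, Kurtosis | Time features computed in the frequency domain |
| By frequency | $f_9 = 0.5 \sum_{f_j = 0\,\mathrm{Hz}}^{0.5 f_s} \mathrm{PSD}(f_j)$ | Median frequency | This frequency is defined for components with 99% of whole signal power |
| By frequency | $f_{10}$ | Spectral entropy | For measuring spectrum disarray |
| Non-linear | $f_{11} = \mathrm{SampEn}(m, r, N) = -\ln\frac{A^{m}(r)}{B^{m}(r)}$ | Sample entropy | $m$: pattern length, $r$: similarity measure, $N$: sample frequency |
| Non-linear | $d_i = 1$ if $\left[(x_{i+2} - x_{i+1})^2 + (x_{i+1} - x_i)^2\right]^{1/2} < \rho$, and $d_i = 0$ otherwise | Central tendency | Non-linear measurement of the second derivative |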
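As an illustration of the time-domain rows of Table 1 and of the central tendency indicator $d_i$, the following sketch computes them for a single window. Two points are assumptions: the expectations $E[\cdot]$ in the skewness and kurtosis are taken as plain $1/N$ averages, and the threshold `rho` is a free parameter whose value is not given in the text.

```python
import math


def time_features(x):
    """Mean, variance (N-1 denominator), skewness and kurtosis of one window.

    Assumes the window is not constant, otherwise sigma is zero and the
    skewness and kurtosis are undefined.
    """
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / (n - 1)
    sigma = math.sqrt(var)
    m3 = sum((v - mu) ** 3 for v in x) / n  # E[(x - mu)^3] as a 1/N average
    m4 = sum((v - mu) ** 4 for v in x) / n  # E[(x - mu)^4] as a 1/N average
    return {
        "mean": mu,
        "variance": var,
        "skewness": m3 / sigma ** 3,
        "kurtosis": m4 / sigma ** 4,
    }


def ctm_indicators(x, rho):
    """d_i = 1 when the second-order difference radius is below rho, else 0."""
    return [
        1 if math.sqrt((x[i + 2] - x[i + 1]) ** 2 + (x[i + 1] - x[i]) ** 2) < rho else 0
        for i in range(len(x) - 2)
    ]


features = time_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
print(ctm_indicators([0.0, 0.0, 0.0, 5.0, 5.0], rho=1.0))
```

Applying `time_features` to the power spectrum of a window instead of the raw samples would give the frequency-domain counterparts $f_5$ to $f_8$ under the same assumptions.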

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$


Here TP is the number of true positives, TN true negatives, FP false positives, and FN false negatives for each set. In all of the tables, Acc denotes accuracy, Sens sensitivity, and Spec specificity. Tables 2, 3, and 4 show the averages over 10 iterations of the SVM for each kernel function, with all of the features introduced into the model.
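The three criteria translate directly into code. The counts in the example are hypothetical and are not taken from the study; they only show how the percentages are obtained from a confusion matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity, in percent, from confusion counts."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


# Hypothetical counts for one class treated as "positive" in an OAA setting
metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
print({name: round(value, 2) for name, value in metrics.items()})
```

In the one-against-all setting, each class gets its own set of counts, which is why the tables report one row of Acc, Sens, and Spec per class.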

Table 2

Evaluation criteria for linear kernel function

| Types of apnea | Acc (%) | Sens (%) | Spec (%) |
|---|---|---|---|
| Normal | 88.60 | 79.59 | 91.66 |
| Apnea | 83.41 | 80.24 | 85.71 |
| Hypopnea | 86.52 | 77.77 | 90.76 |

Table 3

Evaluation criteria for polynomial kernel function

| Types of apnea | Acc (%) | Sens (%) | Spec (%) |
|---|---|---|---|
| Normal | 89.11 | 77.55 | 93.05 |
| Apnea | 84.45 | 81.48 | 86.60 |
| Hypopnea | 83.93 | 76.19 | 87.69 |

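Table 4

Evaluation criteria for RBF kernel function (all values in %)

| Types of apnea | δ = 1: Acc | δ = 1: Sens | δ = 1: Spec | δ = 10: Acc | δ = 10: Sens | δ = 10: Spec | δ = 20: Acc | δ = 20: Sens | δ = 20: Spec |
|---|---|---|---|---|---|---|---|---|---|
| Normal | 92.22 | 85.71 | 94.44 | 92.74 | 89.79 | 95.74 | 90.15 | 81.63 | 93.05 |
| Apnea | 89.11 | 87.65 | 90.17 | 91.70 | 90.12 | 92.85 | 86.52 | 83.95 | 88.39 |
| Hypopnea | 90.67 | 84.12 | 93.84 | 93.26 | 88.88 | 95.38 | 88.08 | 80.95 | 91.53 |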

The results show that the RBF kernel function with δ = 10 is more accurate and more efficient for classifying patients into the three categories than the other kernel functions, which indicates a nonlinear pattern in the data of this study. The proposed method achieved accuracies of 92.74%, 91.70%, and 93.26% for normal, apnea, and hypopnea, respectively.


Nowadays, early and accurate diagnosis of many diseases is of critical importance. To this end, diagnosing and classifying diseases with modern computer technology has many advantages. In recent years, computational intelligence has been used increasingly to solve problems that have no specific solution or cannot be solved easily. The basis of intelligent methods is to use the knowledge embedded in the data, to extract the inherent relationships within it, and to generalize to other conditions [25].

In this study, a method for diagnosing different types of apnea based on data preprocessing and SVM was proposed. First, data preprocessing was carried out in two steps, noise elimination and windowing. Afterward, 12 temporal, frequency, and non-linear features were extracted from the airflow signals. In the next stage, the data were classified into three categories using a multi-class OAA SVM. To train the SVM, linear, RBF, quadratic, and polynomial kernels were used. The results showed that the RBF kernel function was the most appropriate, with accuracies of 92.74%, 91.70%, and 93.26% for normal, apnea, and hypopnea, respectively.

Morillo et al. investigated 17 features extracted from 117 SpO2 signal recordings (70 for network training and 47 for testing) using a stochastic neural network to identify hypopnea-apnea syndrome, obtaining a sensitivity of 90.9% and a specificity of 84% [26]. In [27], 30 single-channel airflow recordings were used with an algorithm for identifying hypopnea-apnea events, and accuracy, sensitivity, and specificity of 95%, 90%, and 96%, respectively, were obtained. Lee et al. applied a rule-based algorithm to 50 airflow signals for the automatic diagnosis of hypopnea-apnea. In this algorithm, preprocessing consists of an intermediate filter, domain calculation, and detection of the hypopnea-apnea area. The confusion matrix was used to evaluate the statistical performance of the algorithm, and the calculated sensitivity was 86.4% [28].

Gutiérrez-Tobal et al. diagnosed hypopnea-apnea from 317 single-channel recordings by extracting spectral and nonlinear features and applying the AdaBoost algorithm. Accuracies of 86.5%, 81%, and 83.3% were obtained for AHI = 5, 15, and 30, respectively [29].

The results of the comparison and evaluation indicate that the proposed method has higher diagnostic accuracy than the other methods. In this paper, the high accuracy of the proposed method is attributed to the preprocessing of the input data and to the selection of SVM classifiers with an appropriate kernel. The SVM also performs well on high-dimensional data, and the trade-off between complexity and error is controlled.

Such studies can be analyzed and used in future research. Because of the low cost and high speed of the process, they can also be cost-effective. One limitation of this research is the small number of patients used to develop the model, which should be developed with a larger number of patients in future studies. Establishing a data center that collects data from the various active sites in this area is an appropriate topic for further work. In addition, the use of optimization and other classification algorithms can help develop this study further. Another limitation is that machine learning algorithms behave like black boxes: the input data and the decisions can be observed, but the procedure that connects them is not obvious. This makes it difficult for humans to adapt to the algorithm and to challenge its prediction in a new scenario.


The results of this study indicate that intelligent methods are beneficial for analyzing vital signals in the diagnosis of patients with apnea-hypopnea. They allow faster and more accurate diagnosis in hospitals, in homes, and in areas with weak health and medical facilities, and they support the provision of health care services to these patients. An apnea-hypopnea diagnosis algorithm with a user-friendly interface, implemented as a mobile, web, or desktop application, can be a good substitute for PSG because of its low cost and its accessibility in any location.


The authors declare no conflicts of interest regarding the publication of this study.


No financial interests related to the material of this manuscript have been declared.


1. Roehrs T, Kapke A, Roth T, Breslau N. Sex differences in the polysomnographic sleep of young adults: A community-based study. Sleep Med 2006;7(1):49–53.
2. Chung K. Use of the Epworth Sleepiness Scale in Chinese patients with obstructive sleep apnea and normal hospital employees. J Psychosom Res 2000;49(5):367–72.
3. Quan S, Gillin JC, Littner M, Shepard J. Sleep-related breathing disorders in adults: Recommendations for syndrome definition and measurement techniques in clinical research. Sleep 1999;22(5):667–89.
4. Coren S. Sleep thieves: An eye-opening exploration into the science and mysteries of sleep. Simon & Schuster; 1996.
5. Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S. The occurrence of sleep-disordered breathing among middle-aged adults. N Engl J Med 1993;328(17):1230–5.
6. Young T, Evans L, Finn L, Palta M. Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women. Sleep 1997;20(9):705–6.
7. Tufik S, Santos-Silva R, Taddei JA, Bittencourt LRA. Obstructive sleep apnea syndrome in the Sao Paulo epidemiologic sleep study. Sleep Med 2010;11(5):441–6.
8. Aguiar M, Valenca J, Felizardo M, Caeiro F, Moreira S, Staats R, et al. Obstructive sleep apnoea syndrome as a cause of road traffic accidents. Rev Port Pneumol 2009;15(3):419–31.
9. Peppard PE, Young T, Palta M, Skatrud J. Prospective study of the association between sleep-disordered breathing and hypertension. N Engl J Med 2000;342(19):1378–84.
10. Yaggi HK, Concato J, Kernan WN, Lichtman JH, Brass LM, Mohsenin V. Obstructive sleep apnea as a risk factor for stroke and death. N Engl J Med 2005;353(19):2034–41.
11. Penzel T, Kemp B, Klosch G, Schlogl A, Hasan J, Varri A, et al. Acquisition of biomedical signals databases. IEEE Eng Med Biol Mag 2001;20(3):25–32.
12. Patil SP, Schneider H, Schwartz AR, Smith PL. Adult obstructive sleep apnea: Pathophysiology and diagnosis. Chest 2007;132(1):325–37.
13. Bennett J, Kinnear W. Sleep on the cheap: The role of overnight oximetry in the diagnosis of sleep apnoea hypopnoea syndrome. Thorax 1999;54(11):958–9.
14. Flemons WW, Littner MR, Rowley JA, Gay P, Anderson WM, Hudgel DW, et al. Home diagnosis of sleep apnea: A systematic review of the literature. An evidence review cosponsored by the American Academy of Sleep Medicine, the American College of Chest Physicians, and the American Thoracic Society. Chest 2003;124(4):1543–79.
15. Várady P, Micsik T, Benedek S, Benyó Z. A novel method for the detection of apnea and hypopnea events in respiration signals. IEEE Trans Biomed Eng 2002;49(9):936–42.
16. Nguyen HD, Wilkins BA, Cheng Q, Benjamin BA. An online sleep apnea detection method based on recurrence quantification analysis. IEEE J Biomed Health Inform 2014;18(4):1285–93.
17. Hassan AR, Haque MA. Computer-aided obstructive sleep apnea screening from single-lead electrocardiogram using statistical and spectral features and bootstrap aggregating. Biocybernetics and Biomedical Engineering 2016;36(1):256–66.
18. Gutiérrez-Tobal GC, Álvarez D, Crespo A, Del Campo F, Hornero R. Evaluation of machine-learning approaches to estimate sleep apnea severity from at-home oximetry recordings. IEEE J Biomed Health Inform 2019;23(2):882–92.
19. Taran S, Bajaj V. Sleep apnea detection using artificial bee colony optimize hermite basis functions for EEG signals. IEEE Transactions on Instrumentation and Measurement 2019;69(2):608–16.
20. Stevenson M, Porter JE. Fuzzy time series forecasting using percentage change as the universe of discourse. International Journal of Mathematical, Computational, Physical, Electrical and Computer Engineering 2009;3:464–7.
21. Steinwart I, Christmann A. Support vector machines. Springer Science & Business Media; 2008.
22. Vapnik V, Chervonenkis A. The necessary and sufficient conditions for consistency in the empirical risk minimization method. Pattern Recognition and Image Analysis 1991;1(3):283–305.
23. Goh K-S, Chang E, Cheng K-T. SVM binary classifier ensembles for image classification. International conference on Information and knowledge management. ACM; 2001.
24. Ghassemian H, Landgrebe DA. Object-oriented feature extraction method for image data compaction. IEEE Control Systems Magazine 1988;8(3):42–8.
25. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism 2017;69:S36–S40.
26. Morillo DS, Rojas JL, Crespo LF, León A, Gross N. Poincaré analysis of an overnight arterial oxygen saturation signal applied to the diagnosis of sleep apnea hypopnea syndrome. Physiol Meas 2009;30(4):405–20.
27. Ciołek M, Niedźwiecki M, Sieklicki S, Drozdowski J, Siebert J. Automated detection of sleep apnea and hypopnea events based on robust airflow envelope tracking in the presence of breathing artifacts. IEEE J Biomed Health Inform 2015;19(2):418–29.
28. Lee H, Park J, Kim H, Lee K-J. New rule-based algorithm for real-time detecting sleep apnea and hypopnea events using a nasal pressure signal. J Med Syst 2016;40(12):282.
29. Gutiérrez-Tobal GC, Álvarez D, del Campo F, Hornero R. Utility of AdaBoost to detect sleep apnea-hypopnea syndrome from single-channel airflow. IEEE Trans Biomed Eng 2016;63(3):636–46.
