Screening for abnormal heart sounds and murmurs
by implementing Neural Networks
by
Claude Visagie
Thesis presented at the University of Stellenbosch
in partial fulfilment of the requirements for the
degree of
Master of Science in Mechanical Engineering
Department of Mechanical Engineering
University of Stellenbosch
Private Bag X1, 7602 Matieland, South Africa
Study leader: Prof. C. Scheffer
April 2007
Copyright © 2007 University of Stellenbosch
All rights reserved.
Stellenbosch University http://scholar.sun.ac.za
Declaration
I, the undersigned, hereby declare that the work contained in this thesis is my own original
work and that I have not previously in its entirety or in part submitted it at any university
for a degree.
Signature: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
C. Visagie
Date: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Abstract
Screening for abnormal heart sounds and murmurs
by implementing Neural Networks
C. Visagie
Department of Mechanical Engineering
University of Stellenbosch
Private Bag X1, 7602 Matieland, South Africa
Thesis: MScEng (Mech)
April 2007
This thesis is concerned with the testing of an “auscultation jacket” as a means of recording
heart sounds and electrocardiography (ECG) data from patients. A classification system
based on Neural Networks, which is able to discriminate between normal and abnormal heart
sounds and murmurs, has also been developed. The classification system uses the recorded
data as training and testing data. This classification system is proposed to serve as an aid to
physicians in diagnosing patients with cardiac abnormalities. Seventeen normal participants
and 14 participants who suffer from valve-related heart disease have been recorded with the
jacket. The “auscultation jacket” shows great promise as a wearable health monitoring
aid for application in rural areas and in the telemedicine industry. The Neural Network
classification system is able to differentiate between normal and abnormal heart sounds
with a sensitivity of 85.7% and a specificity of 94.1%.
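The reported sensitivity and specificity can be related to the group sizes with a short calculation. The sketch below is illustrative only: the abstract gives percentages and group sizes, not the underlying counts, so the values 12 of 14 and 16 of 17 are an assumption, chosen because they are the only whole-number counts for these group sizes that reproduce 85.7% and 94.1%. The function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of abnormal cases flagged as abnormal
    specificity = tn / (tn + fp)  # fraction of normal cases passed as normal
    return sensitivity, specificity


# Assumed counts (not stated in the abstract): 14 abnormal participants of
# whom 12 are flagged, and 17 normal participants of whom 16 are passed.
sens, spec = sensitivity_specificity(tp=12, fn=2, tn=16, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints: sensitivity = 85.7%, specificity = 94.1%
```

With groups this small, a single additional misclassified participant shifts either figure by roughly six to seven percentage points.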
Uittreksel
Sifting vir abnormale hartklanke en geruise
deur die implementering van Neurale Netwerke
(“Screening for abnormal heart sounds and murmurs
by implementing Neural Networks”)
C. Visagie
Departement Meganiese Ingenieurswese
Universiteit van Stellenbosch
Privaatsak X1, 7602 Matieland, Suid Afrika
Tesis: MScIng (Meg)
April 2007
This thesis deals with the testing of a “stethoscope jacket” as a means of obtaining heart
sounds and electrocardiography (ECG) data from patients. A classification system
based on Neural Networks, which uses the data recorded with the jacket
as training and testing data, has also been developed. Seventeen normal participants and 14 participants
who suffer from valve-related heart disease were recorded with the jacket. The “stethoscope jacket”
shows great potential as a wearable health monitoring system, specifically for use in
remote areas and in the telemedicine industry. The classification system is able
to discriminate between normal and abnormal heart sounds and murmurs with a sensitivity of
85.7% and a specificity of 94.1%, and is intended as an aid to physicians in diagnosing
cardiac abnormalities.
Acknowledgements
I would like to express my sincere gratitude to the following people and organisations who
have contributed to making this work possible:
• To my parents, Wally and Juanita, thank you for granting me this opportunity.
• To my promoter, Prof. Cornie Scheffer. Thank you for your great leadership and
giving me the freedom to do the research in my own way and allowing me to perform
to the best of my abilities.
• I would like to thank Dirk Koekemoer and Hugo Pienaar from GeoAXon. Thanks to
Dirk for thinking of this great concept, thereby giving me an opportunity to perform
this research. Thanks to Hugo for all the technical help. Without your help this would
have been a long, hard struggle.
• Sebastian Stärz, for designing and building a great prototype jacket. Without it this
research would not have been possible.
• Thanks to Dr. Wayne Lubbe for sourcing the patients and performing the auscultation
and ECG examinations. Miranne, thank you for performing the echo-cardiograms so
diligently and thank you to Prof. Doubell for his willingness to work with us on this
research project.
• To Gert-Jan, thanks for the initial help with LaTeX and thanks to Adriana for all the
help with the construction of the jacket and putting us in touch with the right people.
It is much appreciated. Thank you also to Carine’s mother, Lucinda, for all the moral
support.
• Thanks to Dr Renier Verbeek for helping us finalise the positions of the stethoscopes
on the jacket.
Dedications
To Carine
Contents
Declaration
Abstract
Uittreksel
Acknowledgements
Dedications
Contents
List of Figures
List of Tables
Nomenclature
1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Thesis outline
2 Literature review
2.1 The cardiovascular system
2.2 The cardiac cycle
2.3 Heart sounds
2.4 Auscultation and phonocardiography
2.5 Previous research
3 Hardware and Data Acquisition
3.1 Stethoscopes
3.2 Auscultation jacket
3.3 Data acquisition
4 Methodology
4.1 Denoising of recorded data
4.2 Feature extraction
5 Feature selection and classification
5.1 Feature selection
5.2 Artificial Neural Network classification
5.3 Construction and training of the neural network
6 Conclusions and Recommendations
6.1 Data analysis and classification system
6.2 Recommendations concerning the auscultation jacket
6.3 Application to telemedicine?
6.4 Other applications
Appendices
A Relevant technologies
A.1 Electrocardiogram (ECG)
A.2 Echo-cardiography
A.3 Impedance cardiography (ICG)
B Data sheets and data tables
C Gradient descent algorithm
List of References
List of Figures
2.1 Cardiovascular circulatory system [10]
2.2 Frontal-section of the human heart showing the internal anatomy [10]
2.3 Single cardiac cycle showing S1 and S2
2.4 Actual heart valve positions together with auscultation positions [15]
2.5 FFT of recorded heart sound
2.6 STFT of recorded heart sound
2.7 Wigner distribution of recorded heart sound
2.8 Choi-Williams Distribution of recorded heart sound
2.9 Continuous Wavelet Transform (CWT) of recorded heart sound
2.10 Discrete Wavelet Transform (DWT) of recorded heart sound
2.11 Graphical presentation of FWT implementation
3.1 Conventional analogue stethoscope [45]
3.2 Standard condenser microphone [46]
3.3 Back electret condenser microphone [47]
3.4 Digital stethoscope used in study
3.5 Inside view of digital stethoscope
3.6 Stethographics Inc. multi-channel stethograph [48]
3.7 Tapuz Medical Technology Ltd. ECG belt [49]
3.8 Medes - VTMAN project [50]
3.9 Initial positions for stethoscopes and electrodes in jacket
3.10 Final positions for stethoscopes and electrodes in jacket
3.11 Front piece of jacket (inside)
3.12 Front piece of jacket (outside)
3.13 Back piece of jacket (inside)
3.14 Back piece of jacket (outside)
3.15 Dual stethoscopes
3.16 Sample 12-lead ECG recorded with auscultation jacket
3.17 S2 at expiration
3.18 S2 at inspiration
3.19 28-port USB hub used to connect stethoscopes to computer
4.1 Implemented methodology
4.2 Denoising methods implemented
4.3 Original recorded signal
4.4 Bandpass filtered signal
4.5 Flowdiagram of wavelet threshold denoising technique
4.6 Wavelet threshold denoised signal
4.7 Cycle extraction flowdiagram
4.8 Original recorded ECG
4.9 ECG signal after first-derivative operator applied
4.10 ECG signal after first-derivative operator and smoothing MA filter
4.11 Algorithm to set all values below 15% of maximum value to zero
4.12 Algorithm to detect QRS starting-points and end-points
4.13 Feature extraction process
4.14 Procedure followed to extract S1 and S2
4.15 Identified S1’s for a patient with a normal heart
4.16 Identified S1’s for a patient with an abnormal heart
4.17 Extracted S1 for a patient with a normal heart
4.18 Extracted S1 for a patient with an abnormal heart
4.19 Start of S2 identified for a patient with an abnormal heart
4.20 Extracted portion of signal for S2 extraction
4.21 Shannon energy envelope of a patient with an abnormal heart
4.22 Identified peaks in the Shannon energy envelope of a patient with an abnormal heart
4.23 Extracted second heart sound for an abnormal patient
4.24 FFT of S1 for a normal and an abnormal patient
4.25 FFT of S2 for a normal and an abnormal patient
4.26 Three heart cycles of abnormal patient to illustrate S1 beat-to-beat variation
4.27 ECG of patient that suffers from atrial fibrillation (see lead V1)
4.28 CWT coefficients with peaks indicated that correspond to A2 and P2 of the second heart sound
4.29 Ejection systolic murmur of a patient suffering from aortic stenosis, showing the crescendo-decrescendo nature of the murmur
4.30 Pansystolic murmur of a patient suffering from mitral regurgitation
4.31 Systole extracted from the cardiac cycle showing three sections for which rms-value are calculated to determine shape of murmur
4.32 Systole extracted from normal patient with subsections indicated
4.33 FFT of each subsection in systolic region of cardiac cycle for a normal patient
4.34 Diastole extracted from abnormal patient with subsections indicated
4.35 FFT of each subsection in diastolic region of cardiac cycle for an abnormal patient
4.36 Splitting of heart cycle
4.37 Cardiac cycle shown with extra sounds search areas
5.1 Feature reduction and ANN training and testing methodology
5.2 Values of feature that exhibited the greatest degree of separation between the normal and abnormal groups
5.3 Values of feature that exhibited the smallest degree of separation between the normal and abnormal groups
5.4 ANN with an input layer and an output layer
5.5 ANN with one hidden layer
5.6 $f(x) = \frac{2}{1+e^{-ax}} - 1$ for a = 1, 2.5, 5
5.7 ROC curve for classification scheme used
6.1 Recorded GeoAxon ECG showing artifacts that prohibited cycle extraction
6.2 QRS-peaks with artifacts that resulted in wrongly extracted cycles
6.3 Recording of normal patient at 2nd right intercostal space showing noise generated by insufficient contact between stethoscope and skin
6.4 Recording of normal patient at 4th right intercostal space showing that less noise is generated with sufficient contact between stethoscope and skin
6.5 Denoised recording showing that no information could be extracted due to poor original recording
6.6 Denoised recording showing sufficient information to be extracted
6.7 CWT of S2 showing multiple peaks
A.1 Schematic of the electrical system of the human heart [75]
A.2 Normal ECG wave
A.3 V1-V6 positions [77]
A.4 Configuration for ECG electrodes on wrists and feet [77]
A.5 Configuration for ECG electrodes on shoulders and hips [77]
A.6 Heart axes as viewed by different leads
A.7 Potential difference between two sides of ventricular muscle mass is zero when there is no depolarisation wave, and positive when depolarisation moves towards the positive electrode [78]
A.8 Spread of atrial depolarisation and repolarisation waves and resulting deflections in ECG tracing [78]
A.9 Spread of ventricular depolarisation wave showing resulting deflections in ECG tracing [78]
A.10 Standard bipolar limb leads for a 12-lead ECG configuration [78]
A.11 Einthoven’s Triangle and the Axial Reference System [78]
A.12 Unipolar augmented limb leads position [78]
A.13 Precordial unipolar chest leads positions [78]
A.14 Sweeping of echo-cardiography transducer beam and how resulting image is formed [79]
A.15 Echo-cardiogram of normal heart showing different chambers [80]
A.16 Electrode positions for ICG measurements
C.1 Contour plot of function, showing how gradient descent algorithm steps towards the minimum function value
List of Tables
4.1 Energy values of different components in extracted second heart sound signal
4.2 Calculate P-R intervals for normal and abnormal patients
4.3 RMS-values of different sections of systole of an abnormal patient
4.4 Average power of different sections to search for extra heart sounds (Abnormal)
4.5 Average power of different sections to search for extra heart sounds (Normal)
5.1 Network outputs for network with 15 hidden neurons and 2, 3, 4 and 5 input features
5.2 Selected features and their respective SOF
5.3 Neural network training algorithm parameters
B.1 Extracted features and their respective SOF
B.2 Extracted features and their respective SOF (continued)
Nomenclature
Constants
exp = 2.718 281 828
$j = \sqrt{-1}$
Variables
AN approximation at level N
A2 aortic component of the second heart sound
α regularisation parameter or momentum term (indicated in text)
b bias in ANN
C Fourier coefficient magnitude
CW (t, ω) Choi-Williams distribution of a time-domain signal
CWT (b, a) Continuous Wavelet Transform of a signal
DN detail at level N
δwrj correction term in estimation of unknown weights
E energy of a signal
ε error function of ANN
εp error function of weights in ANN
Fratio ratio of frequency bands
f frequency
fanalysis frequency intervals
fs sampling frequency
g(n) ECG signal after first-order derivative operator and MA filter have been
applied
g1(n) ECG signal after first-order derivative operator has been applied
$h^{*}\left(\frac{t-b}{a}\right)$ complex conjugate of wavelet function
i index operator
J cost function of ANN
k counter in algorithms
kr number of neurons in layer r of ANN
Ld length of diastole
Ls length of systole
M size of moving-average filter
M1 first group of identified murmur
M2 second group of identified murmur
µ learning rate of ANN
N length of signal or size of FFT (indicated in text)
n index operator
n(t) noise signal
o(t) original uncorrupted signal
P power of a signal
P2 pulmonary component of the second heart sound
Q vector containing indices of start-points of QRS wave of ECG
QRSstart start of QRS-complex in ECG wave
RMS(f) root-mean-square value of a function f
r layer in ANN
S indices at fs = 2000 Hz
S vector containing indices of end-points of QRS wave of ECG
S1, S2, S3, S4 first, second, third and fourth heart sounds
S1duration duration of the first heart sound
s indices at fs = 100 Hz
s(t) time-domain signal
sign(x) signum operator on a signal x
σ factor to reduce interference terms or standard deviation
T threshold value
Tn noise threshold value
TP-wave duration of atrial depolarisation wave in ECG
TP-Q interval time between start of P-wave and start of QRS-complex in ECG
TP-Q segment time between end of P-wave and start of QRS-complex in ECG
TQRS duration of QRS-wave in ECG
TQ-T interval time between start of the QRS-wave and end of the T-wave in ECG
TQ-Tc corrected Q-T interval duration according to Bazett’s formula
TRR R-R interval duration in ECG wave
Ts signal threshold value
TS-T segment time between end of QRS-wave and start of T-wave in ECG
TT-wave duration of ventricular repolarisation wave in ECG
τ time step at which window function is centred
VN ECG chest electrode position for N = 1, 2, . . . , 6
vrj input to activation function of node j in layer r in ANN
W (t, ω) Wigner distribution of a time-domain signal
wrjk weight in ANN from node k in layer r − 1 to node j in layer r
wrj (new) new estimate of unknown weight
wrj (old) current estimate of unknown weight
w∗(t− τ) complex conjugate of window function
X(f) Fourier coefficient
x signal value
x vector of indices of values of g greater than zero or distribution mean
x(i) individual sample i of recorded signal
xmax maximum value of recorded signal
x(t) time-domain signal
yrj output of node j in layer r in ANN
Abbreviations
ANN artificial neural network
AR auto-regressive
ARMSCOR Armaments Corporation of South Africa
AV atrio-ventricular
CHR Committee for Human Research
CI cardiac index - normalised
CVD cardiovascular disease
CWD Choi-Williams distribution
CWT continuous wavelet transform
DFT discrete Fourier transform
DWT discrete wavelet transform
dbn Daubechies wavelet of order n
ECG electrocardiography
EF ejection fraction estimate
EPCI ejection phase contractility index
ES ejection sound
FFT fast Fourier transform
FNF false-negative fraction
FPF false-positive fraction
FT Fourier transform
FWT fast wavelet transform
HMM hidden Markov model
HR heart rate
ICA independent component analysis
ICG impedance cardiography
IDWT inverse discrete wavelet transform
ISDN integrated services digital network
ISI inotropic state index
kg kilogram
MA moving-average
MC midsystolic click
MRC Medical Research Council
msec millisecond
OS opening snap
OTRI Online Telemedicine Research Institute
PC personal computer
PCA principal component analysis
PEP pre-ejection period
RMSS RSA Military Steering Committee
RR respiratory rate
SARS severe acute respiratory syndrome
SI stroke index - normalised
SOF statistical overlap factor
STFT short-time Fourier transform
SURE Stein’s unbiased estimate of risk
TFC thoracic fluid conductivity
TNF true-negative fraction
TPF true-positive fraction
USB universal serial bus
VET ventricular ejection time
VSAT very small aperture terminal
WD Wigner distribution
Chapter 1
Introduction
1.1 Motivation
Every 12 minutes someone in South Africa suffers a heart attack; every 12 minutes someone
suffers a stroke. One in three men and one in four women will have a heart condition
before the age of 60 [1]. According to World Health Organisation estimates from 2003,
cardiovascular disease accounts for approximately 16.7 million deaths, or over 29% of all
deaths globally [2]. The mortality rate in South Africa due to cardiovascular disease (CVD)
is 199 per 100 000 people and the total mortality rate is 481 per 100 000 people¹ [3]. Thus
CVD accounts for 41% of all deaths in South Africa. In the United States of America,
5 million people are diagnosed with valvular heart disease each year.
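The 41% figure follows directly from the two South African mortality rates quoted above, since both are expressed per 100 000 people:

```latex
\[
  \frac{\text{CVD mortality rate}}{\text{total mortality rate}}
  = \frac{199}{481} \approx 0.414 \approx 41\%
\]
```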
These facts alone show that CVD is a major global threat and that any development that aids
the prevention of these diseases is of great importance. While CVD is on the increase,
the ability of physicians to diagnose heart disease by auscultation is decreasing [4].
Proficiency in auscultation is difficult to attain, since heart and lung sounds are
short-duration sounds and several sounds occur in a short time interval [5]. Also, the human
ear is poorly suited to cardiac auscultation and does not enable the physician to obtain
both qualitative and quantitative information about heart sounds [6]. For this reason, any
means that helps physicians make better diagnoses will prove extremely beneficial.
Tavel [7] evaluated the use of electronic stethoscopes and visual displays of heart sounds
and concluded that they can aid physicians in diagnosis and can also be used in
educational settings. For example, the acquired signals can be stored, played back at
a later stage and transmitted to distant sites. According to Tavel, the application of signal
analysis also shows promise for clinical application in cases such as the assessment of the
severity of aortic stenosis and in the differentiation between innocent and organic murmurs
[7].
¹ These statistics are pre-2002.
Many pathological conditions of the cardiovascular system cause murmurs and aberrations
in heart sounds long before they are reflected in other symptoms, such as changes
in the electrocardiogram (ECG) signal [8]. Early detection of these sounds is therefore
critical to the diagnosis and adequate treatment of patients who suffer from these types of
cardiovascular diseases.
Auscultation with a stethoscope is familiar to anyone who has visited
a physician. It all began in the early 1800s when the French physician René Laennec had to
examine a female patient who showed symptoms of heart disease. According to Dr. Laennec,
“the patient’s age and sex did not permit direct application of the ear to the chest”, as was
the norm in examining heart and lung sounds in those days [9]. Determined to do his utmost
for the patient, Laennec rolled up a sheet of paper to form a tube, pressed one end against
the patient’s chest and held his ear to the other end. He later said that he “was surprised
and gratified at being able to hear the beating of the heart with much greater clearness and
distinctness than ever before” [9].
The first “electronic stethoscope” was developed in 1910 by S.G. Brown in London. He
was actually trying to overcome a problem in long distance telephony where the telephone
signals could not be transmitted further than 20 miles [9]. He developed a repeater, amplifier
and receiver that would allow transmission over 50 miles and further. As an experiment,
heart sounds were transmitted to physicians in various parts of London and all of them
reported that the received sounds were just as clear as when they were physically examining
the patient. Mr. Brown concluded that “this trial proved that it is now possible for a
specialist to examine a patient in the country and to arrive at a correct diagnosis” [9].
In many rural areas very few or no health care facilities exist. According to the Medical
Research Council of South Africa, the South African government is “committed to providing
basic health care to all South African citizens” and “to achieve this goal, the government
has identified telemedicine as a strategic tool for facilitating the delivery of equitable health
care and educational services”. Just as Mr. Brown did in the early 1900s, it is our aim to
deliver recorded data from patients in rural areas to physicians in faraway locations to aid
in the diagnosis and treatment of these people.
1.2 Objectives
The primary objective of this project is to:
• Develop a classification scheme based on Neural Networks to screen for abnormal heart
sounds. The system has to use the data recorded by an “auscultation jacket”. The
data has to be denoised and useful information (features) has to be extracted from
the recordings. The system must be tested with unknown data.
The secondary objective of this project is to:
• Test and validate the concept of the auscultation jacket as a means to record all
the necessary information from a patient and determine the validity of its use as an
application in the telemedicine industry.
1.3 Thesis outline
Chapter 2 presents a literature review covering the basic principles of the cardiovascular
system and how the different heart sounds are produced, and gives an introduction to
auscultation and phonocardiography. The current methods used to analyse heart sounds, such as the
Fourier Transform and the Wavelet Transform, are discussed, as well as other techniques.
Neural Networks and their application to heart sound classification are briefly discussed and
other classification techniques used for heart sounds are also mentioned.
In Chapter 3 the hardware used in the recording procedure is discussed. The development
of the auscultation jacket is also discussed in some detail.
Chapter 4 discusses the methodology used in the analysis of the recorded heart sounds.
This includes the denoising of the data, the extraction of individual cycles from the data by
using the ECG signal and the extraction of the features used in the classification system.
Some of these features include the extraction of the individual first and second heart sounds,
the extraction of the time difference between the different components of the second heart
sound and the extraction of extra heart sounds.
The theory behind Neural Networks and their application to this study is presented
in Chapter 5. The construction and training of the Neural Network as well as Statistical
Overlap Factor (SOF) as a feature reduction technique are discussed.
The report concludes in Chapter 6 with an evaluation of the techniques used in the
feature extraction process and an evaluation of the classification system. Application to the
telemedicine industry is discussed.
Chapter 2
Literature review
This chapter presents an overview of the human circulatory system, how the heart works,
how heart sounds are produced and where to listen for them. Signal processing techniques
used in the analysis of heart sounds are discussed, as well as methods implemented in
different classification schemes to differentiate between normal and abnormal heart sounds
and between individual pathologies.
2.1 The cardiovascular system
Two circuits exist through which blood flows in the human body, namely the systemic circuit
and the pulmonary circuit. Both of these circuits begin and end at the heart. The pulmonary
circuit carries blood to and from the lungs while the systemic circuit carries blood to and
from the rest of the body. Figure 2.1 shows a schematic view of the circulatory system.
These two circuits are interconnected, so blood that passes through one circuit has to
pass through the other as well.
There are three types of vessels that transport blood. Arteries (efferent vessels) carry
blood away from the heart while veins (afferent vessels) carry blood to the heart. Capillaries
are small, thin-walled vessels between the smallest arteries and veins that permit the
exchange of nutrients and gases between the blood and the surrounding tissues [10].
The human heart is situated in the middle of the chest with the apex (bottom) shifted
slightly to the left. The heart consists of four chambers: the left and right atria and the
left and right ventricles. Each atrium and its corresponding ventricle are separated by an
atrioventricular (AV) valve. The right atrium and right ventricle are separated by the
tricuspid valve and the left atrium and left ventricle are separated by the mitral (bicuspid)
valve. The two ventricles and the arteries that carry blood from them are also separated by
valves. The right ventricle and the pulmonary artery are separated by the pulmonary valve,
while the left ventricle and the aorta are separated by the aortic valve. A frontal section of
the heart is shown in Figure 2.2.
Figure 2.1: Cardiovascular circulatory system [10]
The right atrium receives deoxygenated blood from the body via the superior and inferior
venae cavae. From the right atrium the blood is pumped through the tricuspid valve to the
right ventricle, from where it goes through the pulmonary valve into the pulmonary artery,
which takes the blood to the lungs where it receives oxygen. The oxygenated blood is
transported to the left atrium via the pulmonary veins. The oxygenated blood is pumped
through the mitral valve to the left ventricle. When the left ventricle contracts, the blood
is pumped through the aortic valve into the aorta, from where it is distributed to the rest
of the body.
2.2 The cardiac cycle
The cardiac cycle can be divided into two phases for any chamber of the heart. These
two phases are known as systole (contraction) and diastole (dilation). During systole, the
chamber pushes blood into an adjacent chamber or arterial trunk. During diastole, the
chamber relaxes and is filled with blood.
Figure 2.2: Frontal-section of the human heart showing the internal anatomy [10]
A cardiac cycle begins with atrial systole, which lasts for approximately 100 msec. At
this time, the ventricles are partially filled with blood and the atrial contraction fills the
ventricles. After the 100 msec of atrial systole, ventricular systole and atrial diastole begin.
Ventricular systole lasts for 275 msec and atrial diastole for 700 msec. During ventricular
systole, the pressure in the ventricles increases and forces the mitral and tricuspid valves
shut. The high pressures also force open the pulmonary valve and the aortic valve and the
blood flows into the pulmonary artery and aorta. At this point, ventricular diastole begins
and the ventricles as well as the atria are in diastole. The pressures in the ventricles decline
and fall below the pressures in the pulmonary artery and aorta, and the pulmonary valve and
aortic valve close as a result. As ventricular pressure continues to fall, it
drops below the pressure in the atria and the mitral and tricuspid valves open, allowing
blood to flow from the major veins through the atria to the relaxed ventricles. When atrial
systole begins another cardiac cycle, the total time that has passed from the start of the
previous atrial systole is approximately 800 msec. The ventricles are roughly 70% filled at
this time [10]. Atrial systole contributes a relatively small amount to ventricular volume
and this explains why individuals that have severely damaged atria can continue to lead
normal lives, while damage to one or both ventricles can leave the heart unable to maintain
adequate cardiac output [10].
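The timing figures quoted above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are assumptions, the three durations are the approximate values quoted from [10], and the ventricular diastole and heart rate are derived here rather than stated in the text.

```python
# Approximate phase durations of one cardiac cycle in milliseconds,
# as quoted in the text from [10].
ATRIAL_SYSTOLE_MS = 100
ATRIAL_DIASTOLE_MS = 700
VENTRICULAR_SYSTOLE_MS = 275

# One full cycle is the atrial systole followed by the atrial diastole.
cycle_ms = ATRIAL_SYSTOLE_MS + ATRIAL_DIASTOLE_MS

# Derived (not stated in the text): the ventricles are in diastole for
# the part of the cycle that is not ventricular systole.
ventricular_diastole_ms = cycle_ms - VENTRICULAR_SYSTOLE_MS

# Derived: heart rate in beats per minute for this cycle length.
heart_rate_bpm = 60_000 / cycle_ms

print(cycle_ms, ventricular_diastole_ms, heart_rate_bpm)
```

Under these figures the ventricles spend roughly two thirds of each cycle in diastole, which is the interval in which S3 and S4, when present, would be expected.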
2.3 Heart sounds
There are four different heart sounds known as S1, S2, S3 and S4. S1 and S2 are the
normal sounds one associates with a heartbeat: in the “lubb-dupp” sound, “lubb” corresponds
to S1 and “dupp” corresponds to S2. Contradictory
explanations exist as to the origin of these sounds. It was historically believed that S1
and S2 were produced solely by the closure of the mitral and tricuspid valves and of the aortic and
pulmonary valves, respectively. More recently it has been accepted that the externally recorded
heart sounds are produced by vibrations of the whole cardiovascular system triggered by
pressure gradients [11].
Figure 2.3: Single cardiac cycle showing S1 and S2 (axes: Time [sec], Amplitude)
According to Rangayyan [11], S1 can be split into four parts. The first component is
due to the initial contraction of the ventricles as they move blood towards the atria thereby
sealing the AV valves (mitral and tricuspid valves). The second component of S1 can be
attributed to the closure of these valves and the resulting deceleration of the blood that is
moved to the atria by the contraction of the ventricles. The aortic and pulmonary valves
then open as a result of the increased pressure in the ventricles and the third component
of S1 may be attributed to the oscillation of blood between the root of the aorta and the
ventricles. The fourth component of S1 may be due to turbulence of blood flowing through
the aorta.
S2 is caused by the closure of the aortic and pulmonary valves. The primary vibrations
of S2 occur in the arteries due to the deceleration of the blood as the aortic and pulmonary
valves close, but the ventricles and atria also vibrate due to transmission of vibrations
through the blood, valves, etc. Figure 2.3 shows a single normal cardiac cycle where S1 and
S2 have been indicated.
The third heart sound (S3) can sometimes be heard and is due to the sudden termination
of the ventricular rapid-filling phase. S3 is usually low-pitched and best heard at the apex of
the heart [12]. If a third heart sound is heard in healthy young adults, it is usually diagnosed
as “physiological”. This is especially prevalent in athletes that have a slow pulse and a large
stroke volume¹. In older patients, the presence of a third heart sound usually indicates
impaired ventricular function if there are other signs of cardiac failure [12]. A third heart
sound can originate from either ventricle and the one responsible is usually deduced from
the circumstances, rather than the quality of the sound [12].
¹ The stroke volume is the amount of blood ejected by a ventricle during a single heartbeat [10].
The fourth heart sound (S4) occurs at the same time as, and is due to, atrial systole.
It can be heard only in the presence of a sinus rhythm². Phonocardiography can detect
a quiet S4 in many normal subjects but it tends to become particularly prominent when
a hypertrophied³ left atrium pumps blood through an unobstructed mitral valve into a
stiff left ventricle. These conditions are most often fulfilled in ischaemic heart disease or
systemic hypertension [12]. S4 is usually a low-pitched sound and best heard at the apex
of the heart. In patients with a sufficiently slow heart rate it is sometimes possible to make
out fourth, first, second and third heart sounds, but as the heart rate increases, the third
and fourth sounds tend to merge [12].
2.4 Auscultation and phonocardiography
Heart auscultation (listening to heart and lung sounds of a patient through a stethoscope)
is the primary method by which physicians diagnose a patient as having an underlying
pathology associated with heart diseases. When auscultating a patient, one listens at specific
locations on the thorax and back. Only the thorax positions will be discussed here, since we
are dealing with heart sounds. Lung sounds are heard when auscultating the back. Figure
2.4 shows the positions of the actual valves as well as the auscultation positions. The aortic
valve is situated in the middle of the chest between the aorta and left ventricle but is best
heard in the second intercostal space (between the 2nd and 3rd ribs) to the right of the
sternum (the bone in the middle of the chest). The pulmonary valve is situated between
the pulmonary artery and right ventricle and is best heard in the second intercostal space
to the left of the sternum. The tricuspid valve, situated between the right atrium and right
ventricle is best heard at the fifth intercostal space (between the 5th and 6th ribs) just to
the left of the sternum, while the mitral valve that separates the left atrium and ventricle
is also best heard in the fifth intercostal space but further to the left of the sternum.
² Sinus rhythm is a term used in medicine to describe the normal beating of the heart, as measured by
an electrocardiogram (ECG) [13].
³ Hypertrophy is the enlargement or overgrowth of an organ or part of the body due to the increased
size of the constituent cells. Hypertrophy occurs in the biceps and heart due to increased work [14].
Figure 2.4: Actual heart valve positions together with auscultation positions [15]
There is, however, a widespread belief that the skill of auscultation is of secondary
importance since the same information can be obtained through newer technological means
[16]. Auscultation nevertheless remains a primary method by which to diagnose patients
because of the higher costs and limited availability of other screening procedures such as
an electrocardiogram (ECG) and an echo-cardiogram. Together with the overall bedside
examination, the use of the stethoscope is not only cost-effective, but also cannot be fully
replaced by alternative technological methods [7]. However, auscultation is a very difficult
skill to acquire and the necessary skills to make a proper diagnosis take years of practice
to develop [17]. The human ear is also poorly suited for cardiac auscultation [6]. In addition, the
conventional stethoscope cannot store sounds or play them back, offer a visual display, process
the acoustic signal or transmit the sounds simultaneously to multiple listeners [7].
Phonocardiography is the graphical recording of the vibrations caused by the beating
human heart. A microphone or piezo-electric sensor is placed on the thorax of a patient
and the vibrations caused by the beating heart are recorded and displayed as a sound wave.
Having digital recordings of patients’ heart sounds will prove beneficial in a multitude of
ways. First of all, it can be played back simultaneously to multiple listeners, which is ideal
for the training of auscultation skills. The teaching of cardiac auscultation skills seems
to be a difficult process as noted in [7], where it is stated that (referring to the lack of
after-recording playback):
“This lack of a common ‘audio platform’ is the most serious obstacle to effective
teaching of cardiac auscultation, a deficiency that has reached serious proportions
throughout our educational institutions.”
The unnecessary referrals of patients with innocent murmurs⁴, etc. to cardiac specialists
by general practitioners pose a big problem, since they cost both parties concerned extra
money and time. According to de Vos [18], any unnecessary
referrals should be minimised because:
1. Specialists are a very scarce and expensive resource that should be used only when
required.
2. The distribution of specialists and medical practitioners is not in proportion to the
regional demographic composition. The distribution of specialists is economically
driven, with poorer regions having a much larger people-to-specialist ratio.
3. The anxiety of the patient and family can be minimised if unnecessary referrals are
eliminated.
⁴ Innocent heart murmurs are murmurs found in people with normal hearts and are harmless [18].
2.5 Previous research
2.5.1 Signal processing techniques
A multitude of different techniques have been implemented to analyse and characterise heart
sounds. These include Fourier analysis, Short-time Fourier analysis, Wigner distributions,
Choi-Williams distributions and Wavelet analysis.
The Fourier Transform (FT) is used to determine which frequencies are contained in a
given time-domain signal. Fourier coefficients are indicative of the frequency content of a
signal and are calculated by:
\[ X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2j\pi ft}\, dt \tag{2.5.1} \]
At a more practical level, the Discrete Fourier Transform (DFT) is implemented to
calculate the Fourier coefficients for discrete signals. Discrete signals comprise most of the
signals one works with, since they are recorded by a computer. The formula by which the
Fourier coefficients are calculated for discrete signals is:
\[ X(m) = \sum_{k=0}^{N-1} x(k)\, e^{-2j\pi mk/N} \tag{2.5.2} \]
where m = 0, 1, ..., N/2 and N is the size of the FT one wishes to calculate. The value
N effectively determines the resolution of the FT. For example, if you have a signal that
has been sampled at fs = 2000 Hz, the frequencies at which the FT will be calculated are
determined by [19]:
\[ f_{\mathrm{analysis}}(m) = \frac{m f_s}{N} \tag{2.5.3} \]
For example, if you perform an 8-point FT on your data, the first frequency term will be
calculated at a frequency of (1 × 2000)/8 = 250 Hz, the second frequency term will be calculated
at a frequency of (2 × 2000)/8 = 500 Hz, etc. If you decide to perform a 512-point FT on your
signal instead, the first frequency term will be calculated at a frequency of (1 × 2000)/512 = 3.91
Hz, the second frequency term will be calculated at a frequency of (2 × 2000)/512 = 7.81 Hz, etc.
Thus the larger N, the better the resolution of the FT that is calculated. However, the size
of the FT that you wish to calculate is bounded by the length of the signal that is being
analysed.
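The worked example above can be reproduced numerically. The following Python sketch, which assumes NumPy is available, evaluates equation 2.5.3 for the 8-point and 512-point cases; the function name `analysis_frequencies` is introduced here purely for illustration.

```python
import numpy as np

def analysis_frequencies(n_points, fs):
    """Return f_analysis(m) = m * fs / N for m = 0 .. N/2 (equation 2.5.3)."""
    m = np.arange(n_points // 2 + 1)
    return m * fs / n_points

fs = 2000.0  # sampling frequency in Hz, as in the example in the text
coarse = analysis_frequencies(8, fs)    # 8-point FT
fine = analysis_frequencies(512, fs)    # 512-point FT

# First two non-zero analysis frequencies for each transform size.
print(coarse[1], coarse[2])
print(round(fine[1], 2), round(fine[2], 2))

# NumPy's own helper produces the same frequency grid.
assert np.allclose(coarse, np.fft.rfftfreq(8, d=1.0 / fs))
```

The spacing between neighbouring analysis frequencies is fs/N, so going from N = 8 to N = 512 shrinks it from 250 Hz to about 3.91 Hz.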
The Fourier coefficients are calculated for a set of pre-defined frequencies, as determined
by equation 2.5.3. At each frequency, the time-domain signal is multiplied by a complex
exponential function and integrated over all time to yield the corresponding Fourier coefficient.
If the Fourier coefficient is relatively large, the time-domain signal contains a major
component of the frequency that is currently under consideration. Should the Fourier coefficient
be relatively small, the contribution of the frequency under consideration is small. If
the signal does not contain a component of a specific frequency, the Fourier coefficient will
be zero. The complex exponential function \(e^{-2j\pi ft}\) is defined as:
\[ e^{-2j\pi ft} = \cos(2\pi ft) - j\sin(2\pi ft) \tag{2.5.4} \]
This definition implies that any time-domain signal can be represented as a sum of sine
and cosine functions at specific frequencies. The FT of a signal is computed in a fast and
efficient manner by the Fast Fourier Transform (FFT), which is an algorithm developed by
J.W. Cooley and J.W. Tukey in 1965. The details of the algorithm will not be discussed
here and can be found in [20].
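To make the relation between equation 2.5.2 and the FFT concrete, the sketch below evaluates the DFT sum directly and compares it with NumPy's FFT. It is a minimal illustration under stated assumptions: the test signal (a 250 Hz cosine sampled at 2000 Hz) and the function name `naive_dft` are hypothetical and are not taken from the recordings used in this study.

```python
import numpy as np

def naive_dft(x):
    """Direct evaluation of X(m) = sum_k x(k) * exp(-2j*pi*m*k/N)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    m = k.reshape(-1, 1)
    # Row m of this matrix holds the complex exponential for analysis index m.
    kernel = np.exp(-2j * np.pi * m * k / n)
    return kernel @ x

fs = 2000.0
n = 64
t = np.arange(n) / fs
# Hypothetical test signal: a 250 Hz cosine, which falls exactly on bin m = 8.
x = np.cos(2 * np.pi * 250.0 * t)

X = naive_dft(x)
peak_bin = int(np.argmax(np.abs(X[: n // 2 + 1])))
print(peak_bin, peak_bin * fs / n)

# The FFT returns the same coefficients; it only computes them faster.
assert np.allclose(X, np.fft.fft(x))
```

The direct sum needs on the order of N² multiplications, whereas the FFT needs on the order of N log N, which is why the FFT is used in practice.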
The frequency information is very important, since different actions of the heart (e.g.
the opening or closing of valves) will produce sounds at different frequencies. It is thus
critical to have the frequency information contained in a heart sound at your disposal in
order to identify certain pathologies. Bhatikar et al. [21] used the FT coefficients as input
to a classification scheme differentiating between innocent and pathological murmurs and
obtained a sensitivity of 83% and a specificity of 90%. Refer to Section
2.5.2 for a definition of sensitivity and specificity.
The major drawback of Fourier analysis is the fact that all temporal information in the
signal is lost [22]. The FT can only be applied to a signal if it is assumed that the signal is
stationary [6]. A stationary signal is defined as a signal whose statistical properties do not
change with time [23]. Heart sounds exhibit extremely non-stationary characteristics and
Fourier analysis is thus not suited for the analysis of these signals [24]. Figure 2.5 shows
the FFT of the denoised normal heart recording in Figure 2.3. The recording was done at
the 4th right intercostal space.
Figure 2.5: FFT of recorded heart sound (axes: Frequency [Hz], Magnitude)
In an effort to correct the disadvantage (that temporal information is lost) of the FT,
the Short-Time Fourier transform (STFT) was developed. The STFT is implemented by
performing the FT on only a small part of the signal. The signal under analysis is subdivided
into a number of small records where it is assumed that each sub-record is stationary. The
signal is multiplied by a short-duration time window that is centered on the time instant of
interest. This is called windowing. The window is subsequently slid along the time axis to
cover the entire duration of the signal and to obtain an estimate of the spectral content of
the signal at every time instant. The formula by which the STFT is computed is:
\[ X(\tau, \omega) = \int_{-\infty}^{\infty} \left[ x(t)\, w^{*}(t - \tau) \right] e^{-j2\pi ft}\, dt, \qquad \omega = 2\pi f \tag{2.5.5} \]
The STFT cannot track very sensitive changes in the time direction [25] and hence is not
suitable for the analysis of the non-stationary and rapidly changing heart signals. However,
Turkoglu et al. [26] used the STFT to calculate the features that were used as input into
their classification algorithm for heart sounds. The authors used a back propagation neural
network as their classification scheme and obtained a correct classification rate of 94% for
normal heart sounds and 95.9% for abnormal heart sounds. The STFT of the recorded
signal in Figure 2.3 is shown in Figure 2.6. The window that was used is a Hanning window
with a duration of 64 ms and an overlap between windows of 32 ms.
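The windowing procedure described above can be sketched in a few lines of NumPy. This is a hedged illustration rather than the code used to produce Figure 2.6: the window length (64 ms) and overlap (32 ms) follow the text, but the sampling frequency of 2000 Hz, the function name `stft` and the synthetic two-tone test signal are assumptions.

```python
import numpy as np

def stft(x, fs, window_ms=64.0, overlap_ms=32.0):
    """Short-time Fourier transform using a sliding Hann (Hanning) window."""
    n_win = int(round(window_ms * 1e-3 * fs))        # window length in samples
    hop = n_win - int(round(overlap_ms * 1e-3 * fs))  # step between windows
    window = np.hanning(n_win)
    starts = np.arange(0, len(x) - n_win + 1, hop)
    # Each row is one windowed sub-record, assumed stationary.
    frames = np.stack([x[s:s + n_win] * window for s in starts])
    spectrum = np.fft.rfft(frames, axis=1)
    times = (starts + n_win / 2) / fs                 # window centres in seconds
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs)
    return times, freqs, spectrum

fs = 2000.0                      # assumed sampling frequency in Hz
t = np.arange(1600) / fs         # 0.8 s record
# Hypothetical non-stationary signal: 50 Hz for the first 0.4 s, then 200 Hz.
x = np.where(t < 0.4, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

times, freqs, S = stft(x, fs)
dominant = freqs[np.argmax(np.abs(S), axis=1)]  # strongest frequency per window
print(S.shape, freqs[1], dominant[0], dominant[-1])
```

With a 64 ms window the frequency spacing is 1/0.064 s = 15.625 Hz, so the dominant frequency in each window can only be located to within about one such bin. Shortening the window improves the time localisation but coarsens the frequency grid, which is the trade-off behind the limitation noted in [25].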
Figure 2.6: STFT of recorded heart sound (axes: Time [sec], Frequency [Hz])
The Wigner Distribution (WD) is another technique that provides a two-dimensional
view of the frequency and the temporal information of the signal under analysis. It provides
better resolution than the STFT, but is limited by the appearance of cross-terms. These
cross-terms are due to the non-linear behaviour of the WD and bear no physical meaning
[6]. The WD has also been evaluated by Bentley et al. [27] as a time-frequency technique
to extract information from recorded native and bioprosthetic heart sounds. The WD is
calculated by:
\[ W(t, \omega) = \int x^{*}\!\left(t - \tfrac{1}{2}\tau\right) x\!\left(t + \tfrac{1}{2}\tau\right) e^{-j\omega\tau}\, d\tau \tag{2.5.6} \]
Figure 2.7 shows the WD of the signal in Figure 2.3. It can be seen that at 0.4 sec
there is a component present that is not physically present in the recorded sound. This is
one of the cross-terms mentioned previously. Thus the WD is unsuitable for analysis since these
cross-terms may alter the information extracted from the signal.
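The origin of such a cross-term can be reproduced on a synthetic signal. The sketch below is an illustration under stated assumptions, not the computation behind Figure 2.7: it uses a simple discrete form of the WD, a hypothetical sampling frequency of 500 Hz and two artificial Gaussian-windowed 50 Hz bursts centred at 0.2 s and 0.6 s, so that the signal is essentially zero at the midpoint of 0.4 s.

```python
import numpy as np

def wigner_distribution(x):
    """Discrete Wigner distribution W[n, k] of a complex signal x.

    For each time index n, the product x[n + m] * conj(x[n - m]) is formed
    for all lags m that stay inside the record and is Fourier transformed
    over m. Bin k corresponds to the frequency k * fs / (2 * N), because
    the two factors move apart by two samples per unit of lag.
    """
    x = np.asarray(x, dtype=complex)
    n_samples = len(x)
    w = np.zeros((n_samples, n_samples))
    for n in range(n_samples):
        max_lag = min(n, n_samples - 1 - n)
        m = np.arange(-max_lag, max_lag + 1)
        product = np.zeros(n_samples, dtype=complex)
        # Lag m is stored at index m modulo N so that lag 0 sits at index 0.
        product[m % n_samples] = x[n + m] * np.conj(x[n - m])
        w[n] = np.fft.fft(product).real
    return w

fs = 500.0                   # hypothetical sampling frequency in Hz
t = np.arange(400) / fs      # 0.8 s record

def burst(centre, f0=50.0, sigma=0.02):
    """Gaussian-windowed complex tone centred at `centre` seconds."""
    return np.exp(-0.5 * ((t - centre) / sigma) ** 2) * np.exp(2j * np.pi * f0 * t)

x = burst(0.2) + burst(0.6)
w = wigner_distribution(x)

mid, peak = 200, 100         # sample indices of 0.4 s and 0.2 s
signal_energy_mid = abs(x[mid]) ** 2
cross_term = np.max(np.abs(w[mid]))
auto_term = np.max(np.abs(w[peak]))
print(signal_energy_mid, cross_term, auto_term)
```

Although the signal itself is negligible at 0.4 s, the distribution has a large oscillating component there, produced by the product of the two bursts. This is the same mechanism that produces the spurious component between S1 and S2 described above.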
Figure 2.7: Wigner distribution of recorded heart sound (axes: Time [sec], Frequency [Hz])
The Choi-Williams Distribution (CWD) is another technique capable of displaying time-frequency
information of heart sound signals. The CWD is calculated by [28]:
\[ CW(t, \omega) = \sqrt{\frac{2}{\pi}} \int\!\!\int_{-\infty}^{\infty} \frac{\sigma}{|\tau|}\, e^{-2\sigma^{2}(s-t)^{2}/\tau^{2}}\, x\!\left(s + \frac{\tau}{2}\right) x^{*}\!\left(s - \frac{\tau}{2}\right) e^{-j2\pi\omega\tau}\, ds\, d\tau \tag{2.5.7} \]
Figure 2.8: Choi-Williams Distribution of recorded heart sound (axes: Time [sec], Frequency [Hz])
The difference between the CWD and the WD is the use of the kernel function
\(\sqrt{2/\pi}\,(\sigma/|\tau|)\, e^{-2\sigma^{2}(s-t)^{2}/\tau^{2} - j2\pi\omega\tau}\).
In the WD the kernel function is \(e^{-j\omega\tau}\). The use of σ in the
CWD kernel function reduces the interference problems without reducing the resolution
[27]. Figure 2.8 shows the CWD of the signal in Figure 2.3. The value for σ used in these
calculations was σ = 6.061. It can be seen that the interference at 0.4 sec is significantly
reduced in the CWD, while the resolution still remains significantly better than the STFT.
Wavelet analysis provides a time-scale representation instead of a time-frequency representation
of the signal under analysis. Scale can be thought of as the inverse of frequency,
where the low scales constitute the high-frequency components and the high scales the
low-frequency components. When switching between frequency and scale, the scale cannot
simply be inverted to yield the frequency. Instead, one has to think in terms of pseudo-frequencies
to determine which frequencies a specific scale represents. To calculate the
pseudo-frequency associated with a specific scale, equation 2.5.8 can be used:
\[ F_a = \frac{F_c}{a\,\Delta} \tag{2.5.8} \]
where Fc is the wavelet centre frequency, a is the specified scale, ∆ is the sampling period
and Fa is the pseudo-frequency corresponding to scale a. An attempt is made to associate with
each wavelet a purely
periodic signal that captures the main oscillations of the wavelet. This is done to simplify
the subsequent analysis of the frequency content of the wavelet, since this signal contains
the main frequency component of the wavelet. The frequency of this signal is the wavelet
centre frequency, Fc, and this frequency maximises the FFT of the wavelet modulus [29].
Figure 2.9: Continuous Wavelet Transform (CWT) of recorded heart sound (absolute values of the coefficients Ca,b; axes: time (or space) b, scales a)
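As a small worked example of equation 2.5.8, the sketch below converts a few scales to pseudo-frequencies. It assumes that ∆ denotes the sampling period, which is how this relation is usually stated, and uses a sampling frequency of 2000 Hz. The centre frequency of 0.7 is a hypothetical placeholder, since the true value depends on the wavelet chosen.

```python
import numpy as np

def pseudo_frequency(scales, centre_frequency, sampling_period):
    """Equation 2.5.8: F_a = F_c / (a * delta)."""
    scales = np.asarray(scales, dtype=float)
    return centre_frequency / (scales * sampling_period)

fs = 2000.0   # assumed sampling frequency in Hz
fc = 0.7      # hypothetical wavelet centre frequency (depends on the wavelet)
scales = np.array([15, 50, 100])

frequencies = pseudo_frequency(scales, fc, 1.0 / fs)
print(frequencies)
```

Under these assumptions the scale range 15 to 100 would map to roughly 93 Hz down to 14 Hz, which illustrates that low scales correspond to high frequencies and high scales to low frequencies.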
Calculating the wavelet transform consists of breaking up a signal into shifted and scaled
versions of an original (mother) wavelet, similar to Fourier analysis which breaks up the
original signal into sinusoids of different frequencies. The continuous wavelet transform
(CWT) is calculated by:
\[ CWT(b, a) = \frac{1}{\sqrt{a}} \int h^{*}\!\left(\frac{t - b}{a}\right) s(t)\, dt \tag{2.5.9} \]
An original mother wavelet is chosen from a pre-defined set of wavelets, or a custom
wavelet can also be constructed. The wavelet is then stepped through the signal, multiplied
with the signal at every time instant of interest and integrated to yield a wavelet coefficient.
The scale of the wavelet is then changed to compress or dilate it. The new wavelet is then
stepped through the signal again, multiplied by the signal and integrated to yield wavelet
coefficients. This process is repeated for the set of scales that one has decided upon. If
the coefficient that has been calculated is relatively large, the signal contains a component
that is similar to the wavelet at that specific scale. The CWT of the signal in Figure 2.3,
computed for scales 15 to 100, is shown in Figure 2.9.
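The stepping, multiplying and integrating procedure described above can be sketched as follows. This is a minimal illustration of equation 2.5.9 on a sample grid and not the routine used to produce Figure 2.9: the Mexican hat mother wavelet, the function names and the synthetic test signal are assumptions chosen to keep the example short.

```python
import numpy as np

def mexican_hat(u):
    """Mexican hat (Ricker) shape, used here only as an example mother wavelet."""
    return (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)

def cwt(signal, scales, wavelet=mexican_hat, support=5.0):
    """Evaluate equation 2.5.9 for every scale a and every shift b (in samples).

    The sampled wavelet must be shorter than the signal for every scale.
    """
    signal = np.asarray(signal, dtype=float)
    coefficients = np.zeros((len(scales), len(signal)))
    for row, a in enumerate(scales):
        half_width = int(np.ceil(support * a))
        k = np.arange(-half_width, half_width + 1)
        daughter = wavelet(k / a) / np.sqrt(a)   # scaled wavelet, 1/sqrt(a) factor
        # Correlating with the daughter wavelet is the same as convolving
        # with its time-reversed complex conjugate.
        coefficients[row] = np.convolve(signal, np.conj(daughter[::-1]), mode="same")
    return coefficients

# Hypothetical test signal: the mother wavelet itself at scale 20, centred on sample 500.
n = 1000
s = mexican_hat((np.arange(n) - 500) / 20.0)

scales = [5, 10, 20, 40]
C = cwt(s, scales)
best_row, best_shift = np.unravel_index(np.argmax(np.abs(C)), C.shape)
print(scales[best_row], best_shift)
```

The largest coefficient occurs at the scale and shift where the wavelet matches the signal, which is the sense in which a large coefficient indicates similarity between the signal and the wavelet at that scale.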
The Discrete Wavelet Transform (DWT) computes the wavelet coefficients for a dyadic
scale sequence. This means that the wavelet coefficients are only calculated for scales that
are powers of 2, e.g. 2^1, 2^2, 2^3, etc., that is, for scales = 2, 4, 8, 16, etc.
The resolution of the DWT is not as good as the resolution of
the CWT, but the computation time is far shorter since the coefficients are not calculated
for every scale. Nevertheless, the analysis is as accurate as that of the CWT [29]. Figure 2.10
shows the DWT of the signal in Figure 2.3. The wavelet used was the Daubechies wavelet
of order 7, and the breakdown level was also level 7. This means that the coefficients were
calculated for scales of 2^1, 2^2, ..., 2^7.
Figure 2.10: Discrete Wavelet Transform (DWT) of recorded heart sound (absolute coefficients; axes: Samples, Level)
Mallat developed an efficient way to implement the DWT by using the subband coding
scheme [30]. This is known as the Fast Wavelet Transform (FWT). The signal under
analysis is broken down into low-frequency (approximations) and high-frequency (details)
components by passing the signal through a low-pass and a high-pass filter respectively. At each
breakdown level, the signal bandwidth is split in half. For example, if you have a signal
sampled at 2000 Hz, the maximum frequency present in the signal is 1000 Hz according
to the Nyquist criterion. This means that after the first set of filters in the DWT, the
approximations will contain the components between 0-500 Hz and the details will contain
the components between 500-1000 Hz. For the following breakdown level, the approximation
of the previous level is broken down further, yielding another set of approximations and
details. The approximation of this level contains the frequency components between 0-250
Hz and the details the frequency components between 250-500 Hz. This process continues
until only one sample remains. The signal has to be downsampled at each
level to ensure that the number of samples at the breakdown level is half the number of
samples contained in the signal that is passed through the filters. If this is not done, one
ends up with twice the amount of data that one started with, since convolving a signal with
a filter yields the same number of samples as the original signal. Every second sample is
thus kept to ensure the correct sizes at each level. This process is explained graphically in
Figure 2.11.
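Figure 2.11: Graphical presentation of FWT implementation (the signal f(n) passes through a high-pass (HP) and a low-pass (LP) filter, each followed by downsampling by 2; the HP branch gives D1 [500-1000 Hz] and the LP branch is split again into D2 [250-500 Hz] and A2 [0-250 Hz])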
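The filter-and-downsample scheme of Figure 2.11 can be sketched with the simplest possible filter pair. The example below uses the Haar wavelet purely for brevity, which is a substitution: the analysis in this study used a Daubechies wavelet of order 7, whose filters are longer. The function names and the random test signal are assumptions.

```python
import numpy as np

def haar_step(signal):
    """One FWT level: low-pass and high-pass filtering followed by downsampling by 2."""
    signal = np.asarray(signal, dtype=float)
    even, odd = signal[0::2], signal[1::2]
    approximation = (even + odd) / np.sqrt(2.0)  # low-pass branch
    detail = (even - odd) / np.sqrt(2.0)         # high-pass branch
    return approximation, detail

def fwt(signal, levels):
    """Return the final approximation and the details [D1, D2, ...].

    The signal length must be divisible by 2**levels.
    """
    approximation = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approximation, detail = haar_step(approximation)
        details.append(detail)
    return approximation, details

def band_edges(fs, levels):
    """Nominal frequency bands in Hz of D1..D_levels and the final approximation."""
    top = fs / 2.0  # Nyquist frequency
    bands = {}
    for level in range(1, levels + 1):
        bands[f"D{level}"] = (top / 2.0, top)
        top /= 2.0
    bands[f"A{levels}"] = (0.0, top)
    return bands

rng = np.random.default_rng(0)
x = rng.standard_normal(16)      # hypothetical 16-sample signal
a2, (d1, d2) = fwt(x, levels=2)

print(len(d1), len(d2), len(a2))
print(band_edges(2000.0, 2))
```

Because every level keeps only every second sample, the total number of coefficients (8 + 4 + 4) equals the length of the input, and for an orthogonal wavelet such as Haar the signal energy is preserved across the decomposition.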
In the literature reviewed, wavelets have been used extensively to denoise phonocardiogram
signals or highlight certain features in the signals. Debbal and Bereksi-Reguig [25]
showed that the number of major components present in each part of the second heart
sound (A2 and P2) can be identified. The frequency range and the localisation of these
sounds can also be determined by use of the continuous wavelet transform. Doppler heart
sounds were decomposed using wavelet analysis and certain components were used as input
to a neural-network based classification scheme [26]. These authors obtained a correct
classification rate of 94% using these components. Messer et al. [24] studied the effects of different
wavelets on denoising recorded heart sounds. They found that certain wavelets from the
Coiflet, Daubechies and Symlet families provide the best results. The best denoising results
were obtained by implementing wavelet analysis together with averaging⁵.
Other techniques that have been implemented to extract information from heart sounds,
include the Hilbert Transform [24] and homomorphic filtering [31].
2.5.2 Classification techniques
Artificial Neural Networks (ANNs) are the primary tool implemented in the classification of
heart sounds [32; 33; 34; 35; 36] although other techniques, such as Hidden Markov Models
(HMMs), have been implemented as well [37]. ANNs are adaptive systems that can model
complex non-linear systems [38]. Refer to Chapter 5 for a detailed discussion of ANNs. As
an example, Cathers [32] used the heart sound amplitude envelope as input to the ANN.
The output of the system was a 3 x 1 column vector. The sequence \([0\;\,0\;\,1]^{T}\)
denoted a normal heart sound whereas the sequence \([1\;\,0\;\,1]^{T}\)
denoted a systolic murmur. The ANN is thus trained to give these outputs for the correct
inputs, so that when it is presented with similar input data, it will give the same output,
classifying the heart sound as either normal, or as a systolic murmur, for instance.
⁵ Averaging: a number of points in a signal are replaced by the average of those points.
In a study performed by Bhatikar et al. [21] to distinguish between innocent and pathological
murmurs, the input to the ANN was the frequency spectrum of the heart sound that
consisted of the 252 bins in the discrete energy spectrum with a range of 0-252 Hz and a
bin-size of 1 Hz. In this case, the output was a single number, either 0 or 1, where 0 indicated
an innocent murmur and 1 indicated a pathological murmur. The network consisted
of 252 input neurons, 15 hidden layer neurons and 1 output neuron. The authors obtained a
sensitivity of 83% and a specificity of 90%. Sensitivity and specificity
are defined as:
\[ \text{Sensitivity} = \frac{\text{\# of true positives}}{\text{\# of true positives} + \text{\# of false negatives}} \tag{2.5.10} \]
\[ \text{Specificity} = \frac{\text{\# of true negatives}}{\text{\# of true negatives} + \text{\# of false positives}} \tag{2.5.11} \]
Sensitivity specifies the percentage of unhealthy patients that are recognised as unhealthy
and specificity specifies the percentage of healthy patients that are recognised as
healthy [34].
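Equations 2.5.10 and 2.5.11 translate directly into code. In the sketch below the confusion counts are hypothetical and were chosen only to illustrate the arithmetic; they do not come from any of the studies cited here.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of unhealthy patients recognised as unhealthy (equation 2.5.10)."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of healthy patients recognised as healthy (equation 2.5.11)."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical screening result: 50 unhealthy and 40 healthy patients.
tp, fn = 45, 5    # unhealthy patients flagged / missed
tn, fp = 36, 4    # healthy patients cleared / wrongly flagged

print(sensitivity(tp, fn), specificity(tn, fp))
```

For a screening aid the two figures carry different costs: a low sensitivity means that pathological cases are missed, whereas a low specificity leads to the unnecessary referrals discussed in Section 2.4.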
In other studies, Leung et al. [36] obtained a sensitivity of 97.3% and a specificity
of 94.4% in classifying innocent and pathological systolic murmurs. The authors used a
probability neural network in classifying their data.
Akay et al. [35] achieved a sensitivity of 85.5% and a specificity of 88.9% in detecting
coronary artery disease. The authors used a Fuzzy Min-Max Neural Network in their study.
The fuzzy min-max classification neural network is an on-line supervised learning classifier
that is based on hyperbox fuzzy sets. Tripathy [39] used a feed-forward neural network
trained with the backpropagation algorithm to differentiate between normal heart sounds
and certain pathologies. A correct classification rate of 81.86% was obtained.
HMMs have mainly been used in the field of speech recognition. When working with
HMMs, one has a sequence of observable events, or observed vectors, that have been generated
by a Markov model. The Markov model consists of a set of states and these states
produce observation vectors. In the Markov model, the state changes every time
unit and each time the state changes, an observation vector is generated that depends on
the probability of that observation vector being produced. The transition from one state to
another is also determined by the probability that such a transition will occur. The HMM
then calculates the best sequence of states that maximises the probability of generating the
specific observation sequence. El-Hanjouri et al. [37] achieved a correct recognition rate of
99.1% in classifying pathological heart sounds by implementing HMMs. HMMs were also
used in [40] to segment heart sounds into their constituent parts.
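Finding the state sequence that best explains an observation sequence is commonly done with the Viterbi algorithm. The sketch below is a generic illustration for a small discrete HMM; whether the cited studies used this exact algorithm is not stated here, and the two-state model, its probabilities and the observation sequence are hypothetical.

```python
import numpy as np

def viterbi(observations, start_prob, transition, emission):
    """Most probable state sequence of a discrete HMM, computed in the log domain."""
    start_prob = np.asarray(start_prob, dtype=float)
    transition = np.asarray(transition, dtype=float)
    emission = np.asarray(emission, dtype=float)
    n_steps, n_states = len(observations), len(start_prob)
    with np.errstate(divide="ignore"):   # log(0) = -inf is acceptable here
        log_start = np.log(start_prob)
        log_trans = np.log(transition)
        log_emit = np.log(emission)
    score = np.zeros((n_steps, n_states))
    back = np.zeros((n_steps, n_states), dtype=int)
    score[0] = log_start + log_emit[:, observations[0]]
    for t in range(1, n_steps):
        # candidates[i, j]: best score of being in state i at t-1 and moving to j.
        candidates = score[t - 1][:, None] + log_trans
        back[t] = np.argmax(candidates, axis=0)
        score[t] = np.max(candidates, axis=0) + log_emit[:, observations[t]]
    # Trace the best path backwards from the most probable final state.
    path = [int(np.argmax(score[-1]))]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return path, float(np.max(score[-1]))

# Hypothetical model: state 0 = "silence", state 1 = "sound";
# observation 0 = low signal energy, observation 1 = high signal energy.
start = [0.8, 0.2]
trans = [[0.8, 0.2],
         [0.2, 0.8]]
emit = [[0.9, 0.1],
        [0.1, 0.9]]
obs = [0, 0, 1, 1, 0]

path, log_prob = viterbi(obs, start, trans, emit)
print(path, log_prob)
```

The same idea underlies the segmentation application mentioned above: if the states are taken to represent parts of the cardiac cycle, the recovered state sequence assigns each time unit to one of those parts.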
Other classification techniques implemented in phonocardiogram analysis include
decision-tree classifiers. Pavlopoulos et al. [41] achieved a correct classification rate of 90% in
discriminating between aortic stenosis and mitral regurgitation.
Voss et al. [42] achieved a correct classification rate of 100% for patients suffering
from moderate or severe aortic stenosis and a correct classification rate of 75% for patients
suffering from mild aortic stenosis. Their decision-making scheme was based on a linear
discriminant function being applied to the feature vectors extracted from the heart sounds.
Bentley et al. [27] used Bayes’ decision rule in classifying their data as either normal or
abnormal. They obtained a correct classification rate of between 61% and 100%, depending
on which feature extraction method was followed.
Chapter 3
Hardware and Data Acquisition
This chapter describes in broad terms the procedure followed in the design of the auscultation
jacket. The hardware implemented and the procedure followed in recording the patient data
are also discussed.
3.1 Stethoscopes
Heart sounds had to be recorded from patients in order to develop an automated
screening procedure capable of differentiating between normal and abnormal heart sounds.
There are different methods of obtaining the heart sounds from a patient: they can be
recorded with a digital stethoscope or with an accelerometer. Accelerometers are not as widely
used as digital stethoscopes, but have been implemented in studies to record heart and lung
sounds [43].
Conventional analogue stethoscopes are mainly used for auscultating patients in hospitals
and clinics. A conventional analogue stethoscope, shown in Figure 3.1, simply converts sound
waves into pressure waves that can be heard and processed by the human ear. Digital
stethoscopes can work on two different principles: (a) implementing a microphone to convert
the acoustic waves generated by the beating heart to electrical signals; (b) using a
piezo-electric crystal to convert the sound waves to electrical signals. Most phonocardiograph
transducers implement the crystal piezo-electric or dynamic piezo-electric microphones [44].
For this project, the digital stethoscopes implemented were designed and supplied by GeoAxon.
The stethoscopes made use of a condenser microphone to convert the pressure waves to
electrical signals; the microphone used was the Panasonic WM-61B back electret condenser
microphone. These microphones have a range of 20 Hz to 20000 Hz and a flat frequency response
up to 5000 Hz. The data sheet for the microphone is given in Appendix B.
Figure 3.1: Conventional analogue stethoscope [45]
Figure 3.2: Standard condenser microphone [46]
A normal condenser microphone uses a capacitor to produce a change in voltage. The
diaphragm forms one of the plates of the capacitor. As sound waves reach the diaphragm, it
moves back and forth, thereby changing the distance between the two plates of the capacitor.
As the distance decreases, the capacitance increases, producing a charge current; when the
distance increases, the capacitance decreases and a discharge current is produced. The change
in voltage across a resistor is measured and converted to an audible sound (refer to Figure 3.2).
In order to produce the charge or discharge current a voltage is required, which is normally
supplied by a battery in the microphone or by phantom power [46].
In the back electret microphone a dielectric material is placed behind the diaphragm
on the backplate of the microphone housing (refer to Figure 3.3). This dielectric material
serves as the capacitor. The only difference between a normal condenser microphone and
an electret condenser microphone is that the latter does not require an external voltage
source to produce the charge and discharge currents, since the voltage is manufactured into
the dielectric material [46].
The stethoscopes used in this study are USB-enabled. They connect to the PC via the USB
connection and each stethoscope is registered by the computer as a separate recording
device. The analogue-to-digital conversion of the signal takes place in the stethoscope
itself. These digital signals were then recorded by a computer, using recording software.
The digital stethoscopes used in the study are shown in Figure 3.4. Figure 3.5 shows the
electronic chip inside the digital stethoscope. The recorded signals were sampled at 2000 Hz
with a resolution of 16 bits.
Figure 3.3: Back electret condenser microphone [47]
Figure 3.4: Digital stethoscope used in study
Figure 3.5: Inside view of digital stethoscope
3.2 Auscultation jacket
When auscultating a patient or recording the heart sounds of a patient, only one position
is normally listened to or recorded at a specific time. This is not necessarily a deficiency,
but during the research it was decided to obtain a “snapshot” of the heart (or lungs) of
a patient by simultaneously recording the heart and lung sounds at the positions where a
physician would normally auscultate a patient. To achieve this, 21 digital stethoscopes were
embedded into a jacket to record the heart and lung sounds of a patient.
The result is the “auscultation jacket” that is capable of recording the heart and lung
sounds at all the necessary positions simultaneously, as well as a 12-lead ECG (refer to
Figure 2.4). An Impedance Cardiogram (ICG) was also built into the jacket but, owing to
unforeseen hardware problems, the ICG could not be recorded with the jacket and had to
be recorded separately. Please refer to Appendix A for a detailed explanation of ECG and
ICG technology.
3.2.1 Previous approaches
Similar approaches to the auscultation jacket have been developed previously. These include
designs from companies such as Stethographics Inc., Tapuz Medical Technology Ltd. and
Medes.
Stethographics Inc. developed a multi-channel stethograph that consists of 14 stethoscopes
embedded within a sponge. The sponge is placed behind the patient’s back and two
separate stethoscopes are placed on the patient’s chest. All recordings are done
simultaneously, which makes true comparisons possible and provides the basis for
three-dimensional analysis and display. Figure 3.6 shows the multi-channel stethograph from
Stethographics Inc.
Tapuz Medical Technology Ltd. developed a universal ECG electrode belt, which has
six ECG electrodes moulded into the structure. The electrode positions correspond to the
positions where the chest electrodes would be placed for a normal 12-lead ECG. Sockets
for fitting the leads onto the electrodes are provided. Figure 3.7 shows this ECG belt.
Medes is a French organisation that has been working on the VTMAN project. The
objective is to enhance the autonomy of patients by integrating medical equipment in the
patients’ clothes. This achievement should significantly reduce the medical follow-up of pa-
tients who are medically dependent and should contribute to optimising medical procedures
[50]. Figure 3.8 shows some of the different elements of the VTMAN project.
3.2.2 Design procedure
In the design of the jacket, it was difficult to decide where the positions for the stethoscopes
and electrodes in the jacket should be. The positions of the stethoscopes should coincide
with the auscultation locations as explained in section 2.4, as well as some of the positions
where the ECG electrodes should be placed. These positions proved to be extremely difficult
to pin down, since they differ from person to person.
Figure 3.6: Stethographics Inc. multi-channel stethograph [48]
Figure 3.7: Tapuz Medical Technology Ltd. ECG belt [49]
Figure 3.8: Medes - VTMAN project [50]
The final positions were decided upon with the help of a medical doctor, Dr Renier
Verbeek. A tight-fitting shirt was worn by a volunteer of average build and the normal
auscultation positions were indicated on the shirt. These markings resulted in 18 possible
auscultation/ECG/ICG positions on the torso and 14 auscultation/ECG positions on the
back. The positions marked were converted to a sketch and can be seen in Figure 3.9.
It was decided that not all of these positions were necessary and the final positions
decided upon are shown in Figure 3.10. These positions are the most likely auscultation
positions, as well as the positions of the ECG and ICG electrodes. These positions were
thus deemed sufficient to obtain all data necessary to make a proper diagnosis.
The physical size of the jacket was based on anthropometric data obtained from the RSA
Military Steering Committee (RMSS), which forms part of the Armaments Corporation of
South Africa (ARMSCOR). The data used is contained in the RSA-Mil-Std. 127 Volume
1 and measurements based on the 50th percentile were used. It was thus ensured that the
jacket would fit the greater part of the South African population.
Figure 3.9: Initial positions for stethoscopes and electrodes in jacket
Figure 3.10: Final positions for stethoscopes and electrodes in jacket
3.2.3 The auscultation jacket
The jacket consists of a neck piece, front piece, back piece and two side pieces. The neck
piece contains two stethoscopes with electrodes embedded and two smaller electrodes for
ICG purposes. The left side piece contains two stethoscopes with electrodes embedded.
The top stethoscope serves a dual purpose: it serves as the electrode at the V6 position
for the ECG, as well as the electrode that measures the impedance during the ICG. The
bottom stethoscope serves as one of the electrodes that generate the small current that is sent
through the thorax during the ICG procedure. The right side piece contains one stethoscope
with an electrode embedded and one stethoscope casing with only an electrode. The front
piece contains seven stethoscopes. Five of these stethoscopes have electrodes embedded,
since they serve the purpose of the V1-V5 electrodes needed for the ECG purposes. The
back piece contains 12 stethoscopes that are situated in a symmetrical pattern for the
recording of the lung sounds. Figures 3.11 and 3.12 show the inside and outside of the front
piece of the jacket. The inside and outside of the back piece are shown in Figures 3.13 and
3.14 respectively.
Some of the stethoscopes are stereo stethoscopes, i.e. two stethoscopes connect to the
PC via one USB cable. One stethoscope uses the left channel and the other the right channel
in the recording procedure. Figure 3.15 shows a pair of these stethoscopes. This was done
so that the total number of USB cables running to the PC remains at a minimum. Please
refer to [51] for a detailed explanation of the procedure followed during the design of the
jacket.
Figure 3.11: Front piece of jacket (inside)
Figure 3.12: Front piece of jacket (outside)
Figure 3.13: Back piece of jacket (inside)
Figure 3.14: Back piece of jacket (outside)
Figure 3.15: Dual stethoscopes
The electrodes necessary for the ECG and ICG purposes were built into the jacket, except
for two of the limb leads necessary for the ECG. If an auscultation position coincided with
the position for an electrode, the electrode was simply embedded into the stethoscope. This
proved to be a satisfactory approach, since the ECG recorded with the jacket was of a good
standard. The system used for the ECG recording was supplied by IQteq and the system
for the ICG recording by Hemo Sapiens Inc. A sample ECG recording of a normal patient
is shown in Figure 3.16.
3.3 Data acquisition
The aim was to record the heart and lung sounds of 60 male volunteers, 30 of them patients
with no heart problems and 30 of them patients with valve-related heart disease or some form
of auscultatory abnormality. The design of the jacket did not accommodate the recording
of female test subjects. A study protocol, explaining the goals as well as the procedures
followed during the course of the study, was submitted to the Committee for Human Research
(CHR) of the University of Stellenbosch. The study protocol was approved by the CHR
and the project number is N06/02/030.
To establish whether a participant belonged to the normal or abnormal study group,
each participant had to be individually examined by a physician for any auscultatory ab-
normalities and undergo a 12-lead ECG and an echo-cardiogram. All patients had to sign a
consent form before the recording and examination procedure took place. Each patient was
also fitted with a 3-lead ECG, built by GeoAxon, that could be recorded simultaneously with
the stethoscope data. This served as a trigger to determine when S1 was produced.
The further use of this information is explained in Chapter 4.
The inclusion criteria for the participants were:
• 60 kg < mass of participant < 110 kg
• Participants above 18 years of age
• Participants who had given written informed consent
• Patients determined fit to participate in this study by a physician
• Patients with a normal echo-cardiogram
• Patients with an abnormal echo-cardiogram
• Any person willing to participate in this study, i.e. any person from the Mechanical
Engineering Department, Tygerberg Hospital, or any person who was aware of the
study and wished to take part
The exclusion criteria for the participants were:
• Mass of participant < 60 kg or mass of participant > 110 kg
• Participants without written informed consent
• Patients found physically unfit for the study
• Subordinates to any of the researchers
Figure 3.16: Sample 12-lead ECG recorded with auscultation jacket
Unfortunately, due to time limits, not all sixty volunteers could be recorded. Thirty-four
normal volunteers were recorded, but the data of only 17 of them could be used in the
analysis procedure. One reason is that the positions in which the patients were recorded
were changed during the study, once it was realised that not all of the positions initially
recorded were necessary. Ideally all the volunteers had to be recorded in the same positions
so as not to bias the results in any way. Another reason for omitting some of the recorded
data is that some of it was still too noisy to extract any meaningful information, despite
the denoising procedure. Of the 21 abnormal patients recorded, the data of only 14 could be
used, either because the recorded heart sounds were too noisy despite the denoising
procedure, or because the ECG recorded with GeoAxon’s ECG was too noisy and artifacts
were produced that corrupted the data.
Each participant was fitted with the jacket and was asked to lie on his back on an
examination bed. The patients were recorded in 3 supine positions:
1. Patient breathing normally
2. Patient holding breath at end of expiration
3. Patient holding breath at end of inspiration
Figure 3.17: S2 at expiration (amplitude against time in seconds, with the A2 and P2 components marked)
Figure 3.18: S2 at inspiration (amplitude against time in seconds, with the A2 and P2 components marked)
Positions 2 and 3 were recorded to determine whether the splitting of S2 widens during
inspiration or during expiration. As was explained in section 2.3, the second heart sound
is produced by the closure of the aortic and pulmonary valves (refer to Figure 2.2), in that
order. In some cases it is possible to hear the split between the closure of these two valves.
During expiration the two components are virtually inseparable, but during inspiration the
pulmonary component, P2, tends to be delayed [12]. During inspiration, the intrathoracic
pressure decreases below atmospheric pressure to allow air into the lungs. This drop in
intrathoracic pressure leads to the expansion of the lungs, the cardiac chambers and the
superior and inferior venae cavae [52]. Due to this expansion of the chambers, the pressures
inside the right atrium and the veins leading to the right atrium decrease. The venous return
can be defined as:
$$VR = \frac{P_V - P_{RA}}{R_V} \tag{3.3.1}$$
where $VR$ is the venous return¹, $P_V$ is the pressure inside the vena cava, $P_{RA}$ is the
pressure inside the right atrium and $R_V$ is the venous resistance [52]. It can easily be seen
from equation 3.3.1 that a decrease in $P_{RA}$ will lead to an increase in $VR$. It has to be
noted that the decrease in $P_V$ during inspiration has to be smaller than the decrease in
$P_{RA}$ to facilitate an increase in venous return.
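The condition on the two pressure drops can be made explicit with a short worked step. The notation $\Delta P_V$ and $\Delta P_{RA}$ for the pressure changes during inspiration is introduced here for illustration only, and the venous resistance $R_V$ is assumed to remain constant:

$$\Delta VR = \frac{(P_V + \Delta P_V) - (P_{RA} + \Delta P_{RA})}{R_V} - \frac{P_V - P_{RA}}{R_V} = \frac{\Delta P_V - \Delta P_{RA}}{R_V}$$

Both changes are negative during inspiration, so $\Delta VR > 0$ requires $\Delta P_V > \Delta P_{RA}$, that is $|\Delta P_V| < |\Delta P_{RA}|$: the pressure in the vena cava must fall by less than the pressure in the right atrium.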
Due to this increase in venous return, the right atrium and ventricle fill slightly more
with blood and consequently it takes slightly longer to eject this blood during systole.
Because of the slightly longer ejection period, the pulmonary valve stays open a bit longer
and this leads to a split in the second heart sound that is audible in normal healthy people.
Splitting of the second heart sound occurs in patients with heart disease as well and may
be a good indicator of whether heart disease is present. Figures 3.17 and 3.18 show the
extracted second heart sound of a patient at expiration and inspiration respectively. The
aortic and pulmonary components are shown (A2 and P2 respectively) and it can be seen
that the split increased noticeably with inspiration.
¹ Venous return is defined as the amount of blood reaching the right atrium during a single beat in the
cardiac cycle [52].
Splitting of S2 occurs when there is obstruction to emptying of the right ventricle, as
in pulmonary stenosis (when the pulmonary valve does not open properly). It may also
occur if there is delayed electrical activation of the right ventricle as in right bundle branch
block [12]. These diseases tend to increase the split of S2, but the split still increases with
inspiration and decreases with expiration.
Fixed splitting of S2 also occurs, e.g. in patients who suffer from atrial septal defect.
In these patients there is a hole in the muscle wall (septum) that separates the atria from
each other. This causes blood to flow from one atrium to the
other. In cases such as these, the right ventricle has to pump harder to compensate for the
loss of blood to the left atrium. This causes the pulmonary valve to stay open longer, but
the split is fixed during inspiration and expiration since the blood moves between the atria
permanently.
Reversed splitting of S2 can also occur. That is when the pulmonary valve closes before
the aortic valve and the split increases during expiration and decreases during inspiration.
This commonly occurs in patients who suffer from hypertrophic obstructive cardiomyopathy.
This is when a portion of the myocardium (the heart muscle) is enlarged. It may also occur
in left bundle branch block and in aortic stenosis, but can be difficult to detect since the
aortic component is very soft due to the rigidity of the stenosed valve [12].
The amplitude values of the recorded heart sounds were normalised by:
$$x(i) = \frac{x(i)}{x_{\max}} \tag{3.3.2}$$
where $x(i)$ is the current value under consideration and $x_{\max}$ is the maximum value of
the specific recording. The recording software recorded the values in decibels, but after the
normalisation procedure all recorded values were between -1 and 1. All the stethoscopes were
first connected to a custom-built 28-port USB hub from which only four cables connected to
the PC. These four cables then transmitted all the recorded information to the computer.
The hub is shown in Figure 3.19.
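The normalisation of equation 3.3.2 can be sketched in a few lines. This is a minimal illustration and not the code used in the study; the function name `normalise` is hypothetical, and the maximum is taken here as the largest absolute sample value, which is an assumption made so that the result is guaranteed to lie between -1 and 1.

```python
def normalise(samples):
    """Scale a recording so that its largest absolute sample has magnitude 1.

    Assumption: x_max in equation 3.3.2 is interpreted as max(|x(i)|).
    """
    peak = max(abs(value) for value in samples)
    if peak == 0:
        return list(samples)  # silent recording: nothing to scale
    return [value / peak for value in samples]
```

For example, `normalise([2.0, -4.0, 1.0])` scales every sample by the peak magnitude of 4.0.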
After the recording procedure, the ICG recording was performed on each patient. ICG
technology is based upon a drop in the thoracic resistance during each beat of the heart. The
resistance drops because, during ejection of the blood, the red blood cells align themselves in
a parallel fashion thereby making the blood more conductive [53]. Please refer to Appendix
A for a detailed explanation of ICG technology.
Figure 3.19: 28-port USB hub used to connect stethoscopes to computer
Chapter 4
Methodology
The analysis procedure of the recorded heart sounds will be discussed in this chapter. This
includes denoising of the recorded data, preprocessing of the data and the feature extraction.
The whole process is outlined in Figure 4.1 and each section will be discussed separately.
Figure 4.1: Implemented methodology
4.1 Denoising of recorded data
Recording of the data took place at Tygerberg Hospital in the Western Cape. During the
recording procedure the recorded signals were contaminated by noise and this had to be
removed. It was attempted to keep the environment as quiet as possible but some noise was
still recorded. The noise was due to voices, people in the halls of the hospital as well as the
physical examination taking place right next to the recording area. The denoising methods
used are shown schematically in Figure 4.2.
Figure 4.2: Denoising methods implemented
The frequencies of interest in recorded heart sounds are in the range of 50 - 650 Hz.
The recorded signals were therefore bandpass filtered, with a lower cut-off frequency of
25 Hz and an upper cut-off frequency of 700 Hz. Figure 4.3 shows the original recorded
signal and Figure 4.4 shows the signal after it was bandpass filtered. The signal was sent
through the filter, reversed and sent back through the filter so that no phase distortion
would be present at the end of the filtering process. The signal shown was recorded
with the auscultation jacket at the 4th left intercostal space. The recorded ECG signals
(IQteq and GeoAxon) were also filtered. These signals were low-pass filtered with a 4th
order Butterworth filter with a cut-off frequency of 40 Hz so as to remove any electrical
noise that might have occurred.
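The forward-backward filtering step described above can be illustrated with a small sketch. A simple first-order smoothing filter is used here as a stand-in, since the design of the bandpass filter is not specified in the text; the function names and the parameter `alpha` are assumptions made for this example only.

```python
def one_pole_lowpass(samples, alpha):
    """Causal first-order filter: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    output, previous = [], 0.0
    for value in samples:
        previous = alpha * value + (1.0 - alpha) * previous
        output.append(previous)
    return output


def zero_phase(samples, alpha):
    """Filter forwards, reverse, filter again and reverse back.

    The delay introduced by the first pass is cancelled by the second pass,
    so the combined response is symmetric in time (no phase distortion).
    """
    forward = one_pole_lowpass(samples, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]
```

Applied to a single impulse, the one-pass output trails only after the impulse, whereas the two-pass output is symmetric about it. The magnitude response is applied twice, so the effective attenuation is the square of that of a single pass.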
Figure 4.3: Original recorded signal (amplitude against time in seconds)
Figure 4.4: Bandpass filtered signal (amplitude against time in seconds)
Figure 4.5: Flow diagram of wavelet threshold denoising technique (original signal → DWT → approximation (A) and detail (D) coefficients → threshold applied to each → IDWT → denoised signal)
4.1.1 Wavelet threshold denoising
After the signals were bandpass filtered, they were denoised with the wavelet threshold
denoising method. The wavelet threshold denoising technique is a relatively simple and
effective technique for denoising data. It has been used by de Vos [18] and Messer et al. [24]
in denoising phonocardiograms. When using the wavelet threshold technique, the signal
is first decomposed to a specified level by the Discrete Wavelet Transform (DWT). The
threshold is then applied to the approximation and detail coefficients. The signal is
reconstructed by the Inverse Discrete Wavelet Transform (IDWT) and the denoised signal is
produced. The process is illustrated in Figure 4.5.
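The decompose, threshold and reconstruct sequence can be sketched as follows. The sketch uses a single-level Haar transform purely for brevity; the wavelet family and decomposition level used in the study are not stated in this section, so both are assumptions, as are the function names.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT of an even-length signal: (approximation, detail)."""
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient sets."""
    root2 = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / root2, (a - d) / root2])
    return signal


def denoise(signal, threshold):
    """DWT, zero every coefficient whose magnitude is not above the threshold, IDWT."""
    approx, detail = haar_dwt(signal)
    approx = [c if abs(c) > threshold else 0.0 for c in approx]
    detail = [c if abs(c) > threshold else 0.0 for c in detail]
    return haar_idwt(approx, detail)
```

With a threshold of zero the signal is reconstructed unchanged (up to rounding), which is a convenient check that the forward and inverse transforms match.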
Two types of thresholding techniques exist: hard thresholding and soft thresholding.
Hard thresholding is defined as ($x$ being the signal value, $T$ being the threshold value) [24]:
$$x = \begin{cases} x & \text{if } |x| > T \\ 0 & \text{if } |x| \le T \end{cases} \tag{4.1.1}$$
and soft thresholding is defined as ($x$ being the signal value, $T$ being the threshold value)
[24]:
$$x = \begin{cases} \operatorname{sign}(x)\,(|x| - T) & \text{if } |x| > T \\ 0 & \text{if } |x| \le T \end{cases} \tag{4.1.2}$$
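The difference between the two rules is easiest to see side by side. The following sketch implements equations 4.1.1 and 4.1.2 for a single coefficient; the function names are chosen for this illustration only.

```python
import math


def hard_threshold(x, t):
    """Equation 4.1.1: keep x unchanged if |x| > t, otherwise return 0."""
    return x if abs(x) > t else 0.0


def soft_threshold(x, t):
    """Equation 4.1.2: shrink x towards zero by t if |x| > t, otherwise return 0."""
    if abs(x) <= t:
        return 0.0
    return math.copysign(abs(x) - t, x)
```

Hard thresholding leaves the surviving coefficients untouched, whereas soft thresholding also reduces their magnitude by the threshold value, which avoids a discontinuity at $|x| = T$.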
Figure 4.6: Wavelet threshold denoised signal (amplitude against time in seconds)
Coefficients of the DWT whose magnitudes lie below the threshold value are removed and the
remaining coefficients are then used to reconstruct the denoised signal. This is a very powerful
concept, because signals with energy concentrated in a small number of wavelet dimensions
will have coefficients that are relatively large compared to any other signal present that
has its energy concentrated over a larger number of wavelet dimensions [22]. Applying the
thresholding operation to the DWT will, therefore, effectively remove any unwanted signal
or noise, even if the instantaneous frequency spectra of the two signals overlap.
When using the threshold denoising technique, the following assumptions are made [54]:
1. The recorded signal is modelled as $x(t) = o(t) + n(t)$, where $x(t)$ is the recorded
signal, $o(t)$ is the original uncorrupted signal and $n(t)$ is the noise signal.
2. The energy of the original signal is effectively captured in transform values whose
magnitudes lie above a specified threshold $T_s > 0$.
3. The transform values of the noise signal have magnitudes that lie below a specified
threshold $T_n$ that satisfies the condition $T_n < T_s$.
The result of denoising by the wavelet threshold method for the signal shown in Figure
4.4 is shown in Figure 4.6. For the denoising of the recorded data, Stein’s Unbiased Estimate
of Risk (SURE) was used to calculate the threshold value. The calculated threshold value
was multiplied by 0.3 to reduce the value, since the larger value resulted in some signals
losing the first or second heart sound. In some instances this value was still a bit large
and removed some murmur information as well, but setting this value lower resulted in
more signals that were not denoised properly. It was thus decided to adhere to the selected
threshold and apply that to every signal processed.
Figure 4.7: Cycle extraction flowdiagram
A second technique was used to calculate a threshold value. The standard deviation,
σ, of the first-level detail coefficients was calculated. It is assumed that most of the noise
is captured at this level, since it contains the high-frequency components. The threshold
is then set to T = 4σ as suggested in [54]. The larger of the two calculated threshold
values was chosen as the value to use in the denoising procedure.
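The selection between the two candidate thresholds can be sketched as below. The SURE threshold is not computed here; it is passed in as a number, and the function names, the use of the population standard deviation and the default arguments are assumptions made for this illustration.

```python
import statistics


def sigma_threshold(detail_coeffs, factor=4.0):
    """T = factor * sigma, with sigma the standard deviation of the
    first-level detail coefficients (population form assumed)."""
    return factor * statistics.pstdev(detail_coeffs)


def choose_threshold(sure_threshold, detail_coeffs, scale=0.3):
    """Return the larger of the scaled SURE threshold and the 4-sigma threshold."""
    return max(scale * sure_threshold, sigma_threshold(detail_coeffs))
```

Taking the larger of the two values means that the 4σ rule acts as a lower bound on the threshold whenever the scaled SURE value turns out to be very small.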
4.1.2 Cycle extraction
After the recorded signals had been denoised, four cycles of heart sounds were extracted.
This was done by identifying the QRS-peaks in the ECG recordings to identify the onset
of a heartbeat cycle. The process is outlined in Figure 4.7. The detection algorithm used
is based upon a weighted and squared first-derivative operator and a moving-average (MA)
filter and is described by Rangayyan [11]. The first-derivative operator accentuates the areas
of greatest change (the QRS-peaks) and attenuates the slow-varying components. The MA
filter smooths the signal further to attenuate any small artifacts or noise that may still be
present. The signal after applying the filtered-derivative operator is:
$$g_1(n) = \sum_{i=1}^{N} |x(n-i+1) - x(n-i)|^2 \,(N-i+1) \tag{4.1.3}$$
where $x(n)$ is the ECG signal and $N$ is the width of a window within which first-order
differences are computed, squared and weighted by the factor $(N-i+1)$. Further smoothing
of the result was performed by an MA filter over $M$ points. The signal after application of
the MA filter is:
$$g(n) = \frac{1}{M} \sum_{j=0}^{M-1} g_1(n-j) \tag{4.1.4}$$
The recorded ECG signals were resampled to fs = 100 Hz, since this makes the calcula-
tion considerably faster. The window widths were set to N = 1 and M = 5. In Rangayyan
[11] the window widths were set to N = M = 8. This, however, resulted in the QRS-peaks
being smoothed to such a degree that the start of S1 was calculated incorrectly by roughly
100 msec. The window widths were thus changed to avoid this. Figures 4.8, 4.9 and 4.10
show the original recorded ECG signal recorded by GeoAxon’s ECG, the signal after the
first-derivative operator and the signal after smoothing with the MA filter respectively.
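Equations 4.1.3 and 4.1.4 can be written out directly as two short functions. This is a sketch rather than the implementation used in the study; the function names are hypothetical, and samples before the start of the record are simply left out of the sums, which is an assumption about the boundary handling.

```python
def weighted_squared_derivative(x, n_width):
    """Equation 4.1.3: weighted sum of squared first-order differences."""
    g1 = []
    for n in range(len(x)):
        total = 0.0
        for i in range(1, n_width + 1):
            if n - i < 0:
                break  # x(n - i) lies before the start of the record
            diff = x[n - i + 1] - x[n - i]
            total += diff * diff * (n_width - i + 1)
        g1.append(total)
    return g1


def moving_average(g1, m_width):
    """Equation 4.1.4: sum of the current and previous M - 1 values, divided by M."""
    smoothed = []
    for n in range(len(g1)):
        window = [g1[n - j] for j in range(m_width) if n - j >= 0]
        smoothed.append(sum(window) / m_width)
    return smoothed
```

With the window widths used in the text, the detector output would be `moving_average(weighted_squared_derivative(ecg, 1), 5)`, where `ecg` is the resampled ECG signal.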
After this procedure, the starting-points and end-points of the peaks were identified.
All values below 15% of the maximum value in the signal were first set to zero in order to
remove any artifacts in the signal. This algorithm is shown in Figure 4.11, where $x(i)$ is the
function value and $x_{\max}$ is the maximum function value.
Figure 4.8: Original recorded ECG (normalised amplitude against time in seconds)
Figure 4.9: ECG signal after first-derivative operator applied (amplitude against time in seconds)
Figure 4.10: ECG signal after first-derivative operator and smoothing MA filter (amplitude against time in seconds)
Figure 4.11: Algorithm to set all values below 15% of maximum value to zero (if x(i) ≥ 0.15 xmax, x(i) is kept; otherwise x(i) = 0)
The algorithm to detect the starting-points and end-points of the QRS-peaks is explained
in Figure 4.12. Here the vector $x$ contains the indices of all the values of $g$ (after the
bottom 15% have been removed, refer to equation 4.1.4) that are greater than zero. The vector
$Q$ contains the indices of the starting-points of the QRS-complex and the vector $S$ contains
the indices of the end-points of the QRS-complex. $N$ is the length of the vector $x$.
Figure 4.12: Algorithm to detect QRS starting-points and end-points (k = 1; x(i) = i | g(i) > 0; Q(1) = x(1); for j = 1 to N − 1: if x(j + 1) − x(j) > 0, then k = k + 1, S(k) = x(j), Q(k + 1) = x(j + 1))
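The two steps above, suppressing small values and then locating the boundaries of each remaining peak, can be sketched together. This is an interpretation rather than a transcription of Figures 4.11 and 4.12: a boundary is assumed to occur wherever two consecutive non-zero indices are more than one sample apart, and the function name and return layout are hypothetical.

```python
def qrs_boundaries(g, fraction=0.15):
    """Return (starts, ends): index lists of the first and last sample of each peak."""
    limit = fraction * max(g)
    # Set every value below 15% of the maximum to zero.
    cleaned = [value if value >= limit else 0.0 for value in g]
    # Indices of all remaining non-zero samples.
    nonzero = [i for i, value in enumerate(cleaned) if value > 0]
    starts, ends = [], []
    if not nonzero:
        return starts, ends
    starts.append(nonzero[0])
    for j in range(len(nonzero) - 1):
        # Assumption: a gap in the index sequence separates two peaks.
        if nonzero[j + 1] - nonzero[j] > 1:
            ends.append(nonzero[j])
            starts.append(nonzero[j + 1])
    ends.append(nonzero[-1])
    return starts, ends
```

Each pair `(starts[k], ends[k])` then brackets one QRS-peak in the smoothed detector output.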
The indices of the starting-points and end-points of the QRS-peaks were thus identified at
$f_s = 100$ Hz. To extract four cycles of the recorded heart sounds, the indices had to be
converted to the sampling frequency of the heart sounds, $f_s = 2000$ Hz. This was done by
equation 4.1.5:
$$S = \frac{2000 \times s}{100} \tag{4.1.5}$$
where $s$ is the sample index at $f_s = 100$ Hz and $S$ is the sample index at $f_s = 2000$
Hz. In cases where a patient suffers from atrial fibrillation or any other disease that results
in an unstable heartbeat, the extracted heart cycle might not contain all the necessary
information. In order to extract four cycles that contain information that is comparable to
one another, the interval duration of a cycle was calculated and compared to the interval
prior to that and the interval following it. If these ratios fell within the range 0.85 ≤
ratio ≤ 1.15, the centre cycle was extracted. If none of the heart cycles fell within that
range, 4 cycles were simply extracted from the first QRS-peak to the fourth QRS-peak.
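The index conversion and the regularity check can be sketched as follows. The exact form of the ratio is not given in the text, so dividing the duration of a cycle by that of each neighbour is an assumption, as are the function names and the choice to return the first cycle that passes.

```python
def rescale_indices(indices, fs_from=100, fs_to=2000):
    """Equation 4.1.5: convert sample indices from fs_from to fs_to."""
    return [index * fs_to // fs_from for index in indices]


def find_stable_cycle(qrs_starts, low=0.85, high=1.15):
    """Return the position of the first cycle whose duration is within
    [low, high] times that of both neighbouring cycles, or None."""
    intervals = [b - a for a, b in zip(qrs_starts, qrs_starts[1:])]
    for k in range(1, len(intervals) - 1):
        ratio_previous = intervals[k] / intervals[k - 1]
        ratio_next = intervals[k] / intervals[k + 1]
        if low <= ratio_previous <= high and low <= ratio_next <= high:
            return k
    return None
```

A return value of `None` corresponds to the fallback case in the text, where the cycles are taken directly from the first QRS-peaks.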
4.2 Feature extraction
It was decided to follow a more physiological process during the feature extraction process.
This implies that the features extracted are features that a physician would listen to when
auscultating the heart. It was then attempted to describe these features in a mathematical
manner so that they might be used in the classification process. The feature extraction
process is described schematically in Figure 4.13.
4.2.1 Ratio of power between S1 and S2
The power ratio of S1 to S2 as a feature was decided upon, since this indicates where S1 or
S2 is the loudest. The second heart sound should be loudest at the 2nd intercostal space
near the base of the heart and the first heart sound at the fifth intercostal space near the
apex of the heart. This is a result of the sound of the closing valves radiating through the
thorax. Refer to Figure 2.4 and Section 2.4 for an explanation. If S1 is increased in intensity
(louder than S2 at the base), it might indicate the following [55]:
• Anaemia, pregnancy, anxiety, fever and hyperthyroidism, which in turn result in
increased contractility and myocardial tension development.
• Mitral stenosis: S1 may be louder due to mitral valve leaflet thickening and scarring.
• The opening of the mitral valve at the onset of ventricular contraction. This occurs when
there is a short P-R interval in the ECG (between 0.11 and 0.13 seconds).
If S1 is decreased in intensity (S2 louder than S1 at the apex) it may indicate the
following:
• Impaired ventricular contractility, which means a decrease in myocardial tension
development, that could indicate congestive heart failure.
• Severe mitral stenosis, which results in complete mitral valve immobility.
• The mitral valve is nearly closed at the onset of ventricular contraction. This results
in a prolonged P-R interval (> 0.2 seconds).
Figure 4.13: Feature extraction process
To calculate the power ratio, the first and second heart sounds had to be extracted first.
The extraction of S1 and S2 was based on the timing relationships of the heart sounds to
the ECG. The extraction process is shown schematically in Figure 4.14.
The start of the first heart sound can be taken as the start of the QRS-complex in the
ECG [11]. Refer to Appendix A for an explanation of the constituent parts of the ECG. The
QRS-complexes have been identified as explained in Section 4.1.2. It was decided to use
the power ratios of three consecutive cycles and calculate the average power ratio between
Figure 4.14: Procedure followed to extract S1 and S2
S1 and S2 at the apex and the base of the heart. This was done to ensure that the ratio
calculated is representative of the actual beating of the heart. The start of S1 can be related
to the QRS-complex by keeping in mind that ventricular contraction forces blood upwards,
which, in turn, closes the mitral and tricuspid valves (resulting in the main component
of the S1 sound). The end of S1 cannot be attributed to any specific event in the ECG.
We know that the start of the T-wave corresponds to ventricular repolarisation [10], which
means that the ventricles relax and cause the aortic and pulmonary valves to shut, thereby
causing S2. The end of S1 thus has to occur prior to that. It was decided to calculate the
end of S1 as:
$$S1_{\text{end}} = QRS_{\text{start}} + 1.5 \times T_{\text{S-T segment}} \tag{4.2.1}$$
This ensures that enough of the information contained in S1 is extracted without containing
a portion of S2.
Burke and Nasor [56] developed second-order equations to calculate the time relation-
ships of the different components of the human electrocardiogram. The equations developed
for the different sections of the ECG (refer Appendix A) are (all times are in seconds):
$$T_{\text{P-wave}} = 0.57\,T_{\text{R-R}}^{1/2} - 0.33\,T_{\text{R-R}} - 0.14 \tag{4.2.2}$$
$$T_{\text{P-Q segment}} = 0.56\,T_{\text{R-R}}^{1/2} - 0.33\,T_{\text{R-R}} - 0.17 \tag{4.2.3}$$
$$T_{\text{P-Q interval}} = 1.12\,T_{\text{R-R}}^{1/2} - 0.65\,T_{\text{R-R}} - 0.31 \tag{4.2.4}$$
$$T_{\text{QRS}} = -0.02\,T_{\text{R-R}}^{1/2} + 0.02\,T_{\text{R-R}} + 0.08 \tag{4.2.5}$$
$$T_{\text{Q-T interval}} = 1.65\,T_{\text{R-R}}^{1/2} - 0.84\,T_{\text{R-R}} - 0.46 \tag{4.2.6}$$
$$T_{\text{T-wave}} = 1.29\,T_{\text{R-R}}^{1/2} - 0.66\,T_{\text{R-R}} - 0.42 \tag{4.2.7}$$
$$T_{\text{S-T segment}} = 0.34\,T_{\text{R-R}}^{1/2} - 0.17\,T_{\text{R-R}} - 0.10 \tag{4.2.8}$$
Figure 4.15: Identified S1’s for a patient with a normal heart (axes: time [sec], amplitude)
Figure 4.16: Identified S1’s for a patient with an abnormal heart (axes: time [sec], amplitude)
This approach proved to be simple and efficient in extracting the first heart sound. The
start and end of S1 are shown in Figure 4.15 for a normal heart sound and in Figure 4.16 for
an abnormal heart sound, where S denotes the start of S1 and E denotes the end of S1. The
corresponding extracted first heart sounds are shown in Figures 4.17 and 4.18 respectively.
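As a worked illustration of equations 4.2.1 and 4.2.8, the S1 window can be computed from the start of the QRS-complex and the R-R interval. The sketch below is illustrative only; the function names are hypothetical and all times are assumed to be in seconds.

```python
import math


def st_segment(t_rr):
    """Duration of the S-T segment in seconds (equation 4.2.8)."""
    return 0.34 * math.sqrt(t_rr) - 0.17 * t_rr - 0.10


def s1_window(qrs_start, t_rr):
    """Return the (start, end) times of S1 in seconds (equation 4.2.1)."""
    return qrs_start, qrs_start + 1.5 * st_segment(t_rr)


# For an R-R interval of 1 s the S-T segment evaluates to 0.07 s,
# so S1 is taken to end 0.105 s after the start of the QRS-complex.
start, end = s1_window(qrs_start=0.0, t_rr=1.0)
print(start, end)
```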
For the extraction of the second heart sound, the procedure was not as straightforward.
The start of S2 was taken as the end of the T-wave in the ECG, since this is when the
ventricles start to relax and the pressure in the ventricles drops [11]. This action causes
the aortic and pulmonary valves to shut, since the pressure in the aorta and the pulmonary
artery is higher than the pressure in the ventricles. In calculating the end of the T-wave,
the start of the QRS-complex was identified and the duration of the Q-T interval was added.
The corrected Q-T interval, Q-Tc, was used in the calculation. This is equal to the Q-T
interval divided by the square root of the R-R interval, according to Bazett’s formula [57]:
$$T_{\text{Q-Tc}} = \frac{T_{\text{Q-T interval}}}{\sqrt{T_{\text{R-R}}}} \tag{4.2.9}$$
Figure 4.17: Extracted S1 for a patient with a normal heart (axes: time [sec], amplitude)
Figure 4.18: Extracted S1 for a patient with an abnormal heart (axes: time [sec], amplitude)
After the start of S2 had been identified, 300 ms of the recorded heart signal were
extracted, starting 40 ms prior to the calculated start of S2. This was deemed sufficient to
capture the second heart sound as well as a short portion of diastole, since the average
duration of S2 is about 150-200 ms. The reason for starting 40 ms prior to the identified
start was to compensate for the error in the calculation of the end of the T-wave, as well
as to accommodate fluctuations in the heartbeat of patients. The starting-points of S2 for
an abnormal patient are indicated in Figure 4.19, and Figure 4.20 shows the 300 ms of signal
extracted from 40 ms prior to the identified start of S2. As can be seen in Figure 4.20, S2
as well as a diastolic murmur is present in the extracted portion of the signal. Care must
therefore be taken not to identify a murmur falsely as S2.
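The calculation of the S2 search window can be sketched as follows. This is a hedged illustration rather than the code used in this study: the function names and the sampling frequency `fs` are assumptions, and the Q-T interval is taken from equation 4.2.6 before applying equation 4.2.9.

```python
import math


def qt_interval(t_rr):
    """Q-T interval in seconds (equation 4.2.6)."""
    return 1.65 * math.sqrt(t_rr) - 0.84 * t_rr - 0.46


def qt_corrected(t_rr):
    """Corrected Q-T interval according to Bazett's formula (equation 4.2.9)."""
    return qt_interval(t_rr) / math.sqrt(t_rr)


def s2_search_window(qrs_start, t_rr, fs, lead=0.040, length=0.300):
    """Return (first, last) sample indices of the 300 ms S2 search window.

    The window starts `lead` seconds before the estimated start of S2,
    which is the QRS start plus the corrected Q-T interval.
    """
    s2_start = qrs_start + qt_corrected(t_rr)
    first = max(0, round((s2_start - lead) * fs))
    return first, first + round(length * fs)


# Hypothetical values: QRS start at 0.2 s, R-R interval of 1 s, 1000 Hz sampling.
print(s2_search_window(qrs_start=0.2, t_rr=1.0, fs=1000))
```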
The end of the second heart sound cannot be attributed to any specific event in the
ECG. Thus a different approach was needed. To do this, it was decided to calculate an
energy envelope of the extracted signal to determine where the majority of the energy is
situated, since this would most likely correspond with the second heart sound. The Shannon
energy was used to calculate the envelope, since it intensifies the medium intensity signals
and attenuates the effect of low intensity signals much more than that of high intensity
signals [58]. This aspect makes it much easier to extract medium intensity signals (such as
heart sounds embedded in noise) from the recordings. Liang et al. [58] showed that the
Shannon energy performed the best in comparison to other techniques such as the Shannon
entropy, absolute value and normal energy (taking the square of the function values) in
obtaining a decent envelope of the recorded heart sound. The Shannon energy envelopes of
the extracted signals for an abnormal patient are shown in Figure 4.21. The bottom 25%
of the envelope values were discarded to eliminate small noise that might interfere in the
extraction process.
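A minimal sketch of a Shannon energy envelope is given below. It follows the general form of the average Shannon energy, −x² log(x²), computed over short frames of the normalised signal. The frame length, the hop size and the interpretation of the 25% rule as a threshold at 25% of the envelope maximum are assumptions made for illustration, since these details are not specified here.

```python
import math


def shannon_envelope(signal, frame=20, hop=10):
    """Average Shannon energy per frame of the amplitude-normalised signal."""
    peak = max(abs(v) for v in signal) or 1.0
    norm = [v / peak for v in signal]
    envelope = []
    for start in range(0, len(norm) - frame + 1, hop):
        total = 0.0
        for v in norm[start:start + frame]:
            sq = v * v
            if sq > 0.0:  # x^2 * log(x^2) tends to 0 as x tends to 0
                total -= sq * math.log(sq)
        envelope.append(total / frame)
    return envelope


def discard_low(envelope, fraction=0.25):
    """Zero all envelope values below `fraction` of the maximum (assumed rule)."""
    threshold = fraction * max(envelope)
    return [e if e >= threshold else 0.0 for e in envelope]


# Silence, a medium-intensity segment and a full-scale segment.
demo = [0.0] * 20 + [0.5] * 20 + [1.0] * 20
print(shannon_envelope(demo, frame=20, hop=20))
```

In the demonstration signal the medium-intensity frame receives the largest envelope value, which illustrates why the Shannon energy emphasises medium intensity components over both low and high intensity ones.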
Due to the possibility of noise, it was decided to identify the peaks in the Shannon
energy envelope and group the peaks together. If two peaks were less than 40 ms apart, it
was assumed that they formed part of the same group. If not, they were put into separate
groups. The energy of each group was then calculated and the group with the highest energy
was extracted as S2. This assumption proved to be sufficient in extracting the correct group
as S2. Figure 4.22 shows the identified peaks for a patient with an abnormal heart and the
corresponding identified groups. It can be seen that each component identified in
Figure 4.20 forms its own separate group (with the murmur forming two groups).
The energy of each group was calculated by:
$$E_{\text{group}} = \sum_{i=1}^{N} x(i)^2 \tag{4.2.10}$$
where x(i) is the amplitude of a specific peak i, and N is the number of peaks in a specific
group. The energy values for S2, the murmur (M1 and M2) and S1 are shown in Table 4.1.
Figure 4.23 shows the extracted second heart sound.
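The grouping rule and equation 4.2.10 can be illustrated with the short sketch below. It assumes that the envelope peaks have already been detected and are supplied as (time, amplitude) pairs sorted by time; the function names and the example values are hypothetical.

```python
def group_peaks(peaks, max_gap=0.040):
    """Group peaks that lie less than `max_gap` seconds from the previous peak."""
    groups = []
    for time, amplitude in peaks:
        if groups and time - groups[-1][-1][0] < max_gap:
            groups[-1].append((time, amplitude))
        else:
            groups.append([(time, amplitude)])
    return groups


def group_energy(group):
    """Sum of squared peak amplitudes in a group (equation 4.2.10)."""
    return sum(amplitude ** 2 for _, amplitude in group)


def pick_s2(peaks):
    """Return the group with the highest energy, taken to be S2."""
    return max(group_peaks(peaks), key=group_energy)


peaks = [(0.50, 0.3), (0.52, 0.4), (0.60, 0.1), (0.70, 0.2), (0.73, 0.15)]
print(len(group_peaks(peaks)), pick_s2(peaks))
```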
Now that both S1 and S2 had been extracted for three heart cycles at each location and
in each recording position, the power ratio between S1 and S2 could be calculated. The
power of each of the extracted heart sounds was calculated by dividing its energy by the
duration of the extracted sound, as in equation 4.2.11:
$$P_{\text{heart sound}} = \frac{\sum_{t=0}^{N} x(t)^2}{t_{\text{total}}} \tag{4.2.11}$$
where x(t) is the heart sound amplitude at a specific time instant t, t_total is the
duration of the heart sound and N is the length of the signal. This calculation was
performed for each extracted sound at a specific location. The average power of the three
extracted sounds was calculated and used as the feature. The heart sounds recorded at the
2nd left and right intercostal spaces and 5th and 6th left intercostal spaces were used in
this calculation, since S2 should be the loudest at the base of the heart (2nd left and
right intercostal spaces) and S1 should be the loudest at the apex of the heart (5th and 6th
left intercostal spaces). The power ratio thus contributed a total of four features used in
the classification process.
Figure 4.19: Start of S2 identified for a patient with an abnormal heart (axes: time [sec], amplitude)
Figure 4.20: Extracted portion of signal for S2 extraction (axes: time [sec], amplitude; S2 and the murmur are labelled)
Figure 4.21: Shannon energy envelope of a patient with an abnormal heart (axes: time [sec], amplitude; S2, M1 and M2 are labelled)
Figure 4.22: Identified peaks in the Shannon energy envelope of a patient with an abnormal heart (axes: time [sec], amplitude; S2, M1 and M2 are labelled)
Table 4.1: Energy values of different components in extracted second heart sound signal
  Component   Energy value
  S2          0.76
  M1          0.02
  M2          0.12
Figure 4.23: Extracted second heart sound for an abnormal patient (axes: time [sec], amplitude)
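Equation 4.2.11 and the resulting S1 to S2 power ratio can be sketched as follows. The sampling frequency `fs` and the function names are assumptions, and the choice to average the per-cycle ratios (rather than to take the ratio of the averaged powers) is one possible reading of the description above.

```python
def sound_power(samples, fs):
    """Energy of the samples divided by their duration in seconds (equation 4.2.11)."""
    duration = len(samples) / fs
    return sum(v * v for v in samples) / duration


def mean_power_ratio(s1_cycles, s2_cycles, fs):
    """Average S1/S2 power ratio over the extracted cycles (assumed averaging order)."""
    ratios = [sound_power(s1, fs) / sound_power(s2, fs)
              for s1, s2 in zip(s1_cycles, s2_cycles)]
    return sum(ratios) / len(ratios)


# Three identical toy cycles: S1 has four times the power of S2.
s1_cycles = [[1.0, -1.0]] * 3
s2_cycles = [[0.5, -0.5, 0.5, -0.5]] * 3
print(mean_power_ratio(s1_cycles, s2_cycles, fs=1000))
```

A value above one indicates that S1 is the more powerful sound at that recording location, and a value below one that S2 is.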
Other studies that have successfully identified the different heart sounds in the
phonocardiogram have been published. Liang et al. [58] obtained a 93% correct identification
ratio in splitting the heart cycle into S1, systole, S2 and diastole. The algorithm was based
on the normalised average Shannon energy of the phonocardiogram signal. The correct S1’s
and S2’s were identified by a cardiologist and then compared to the heart sounds extracted
by the algorithm. Haghighi-Mood et al. [59] segmented the heart sound by using an
autoregressive (AR) model to estimate the power spectral density of the signal and calculate
the energy in specific frames of the signal. The authors did not report statistical results
quantifying the validity of their algorithm. Huiying et al. [8] developed an algorithm
for detecting S1, systole, S2 and diastole from the phonocardiogram signal. The discrete
wavelet transform was used to calculate intensity envelopes of the signals and identify
components within the envelope. A correct identification ratio of 93% was obtained. The
correct locations of S1 and S2 were identified by a cardiologist and the result was then
compared to the S1’s and S2’s extracted by the algorithm. The algorithm developed in
this study has not yet been assessed by a cardiologist, but preliminary results show that
the algorithm extracts S1 and S2 correctly in approximately 90% of the cases. This was
determined by visual inspection.
4.2.2 Frequency band ratio of S1 and S2
It was deemed necessary to extract the frequency information from S1 and S2 as well, since
S2 is normally higher pitched (higher frequency) than S1 [55] and a noticeable deviation
from this could indicate pathology. Johnson et al. [60] extracted frequency bands in their
study of the systolic murmur of aortic stenosis. It was decided to follow the same approach
in this study, as this would identify heart sounds that have higher frequency content than
normal.
The FFT of each extracted S1 and S2 was calculated. The magnitudes of the Fourier
coefficients in the frequency range between 0-100 Hz as well as between 100-800 Hz were
summed. These two values were then divided to yield a ratio that describes the frequency
content of the extracted signal as in equation 4.2.12.
$$F_{\text{ratio}} = \frac{\sum_{f=0}^{100} C}{\sum_{f=100}^{800} C} \tag{4.2.12}$$
where C indicates the Fourier coefficient magnitude at a specific frequency f. The higher
this value, the more normal the extracted heart sound should be, since normal heart sounds
should not contain frequencies higher than about 100 Hz. A low ratio indicates that higher
than normal frequencies are present, which could be an indication of a murmur. Figure 4.24
shows the FFT of S1 for a normal and an abnormal patient and Figure 4.25 shows the FFT of S2
for a normal and an abnormal patient.
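The frequency band ratio of equation 4.2.12 can be illustrated with the sketch below. To keep the example free of external libraries, a direct discrete Fourier transform is used in place of an FFT routine. The sampling frequency, the function names and the decision to count the 100 Hz bin in the lower band only are assumptions.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the one-sided DFT (direct evaluation, for illustration only)."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]


def band_ratio(samples, fs, split=100.0, top=800.0):
    """Sum of magnitudes in 0..split Hz divided by the sum in split..top Hz."""
    mags = dft_magnitudes(samples)
    df = fs / len(samples)  # frequency resolution in Hz
    low = sum(m for k, m in enumerate(mags) if k * df <= split)
    high = sum(m for k, m in enumerate(mags) if split < k * df <= top)
    return low / high


def tone_mix(a_low, a_high, fs=2000, n=200):
    """Test signal: a 50 Hz tone plus a 300 Hz tone with the given amplitudes."""
    return [a_low * math.sin(2 * math.pi * 50 * i / fs)
            + a_high * math.sin(2 * math.pi * 300 * i / fs) for i in range(n)]


print(band_ratio(tone_mix(1.0, 0.1), fs=2000))  # mostly low-frequency content
print(band_ratio(tone_mix(0.1, 1.0), fs=2000))  # mostly high-frequency content
```

The sampling frequency must exceed 1600 Hz for the upper band to reach 800 Hz, and the upper band must contain some energy for the ratio to be defined.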
Figure 4.24: FFT of S1 for a normal and an abnormal patient (axes: frequency [Hz], FFT magnitude)
Figure 4.25: FFT of S2 for a normal and an abnormal patient (axes: frequency [Hz], FFT magnitude)
4.2.3 Power comparison between first heart sounds of different cycles
Variation in the intensity of S1 from beat to beat is also an indication of abnormality [55].
S1 varies from beat to beat when the position of the mitral and tricuspid valves is variable
at the onset of ventricular contraction. This occurs in patients with atrial fibrillation¹,
third degree heart block² and ventricular pacemakers [55]. In these conditions no fixed
relationship exists between atrial excitation and ventricular contraction. The position of
the mitral and tricuspid valves at the beginning of ventricular systole varies, sometimes
being partially shut and at other times being completely open, resulting in a variation in
the intensity of S1.
The power of S1 was calculated for the phonocardiograms, as explained in Section 4.2.1.
The ratios between the first heart sounds of the different cycles were calculated and the
average of the three ratios was taken to yield the feature used in the classification process.
The recordings at the 5th and 6th intercostal spaces were used in the calculation, since this
is where S1 should be heard most clearly (refer Section 2.4).
Figure 4.26 shows 4 cycles of a patient with the first heart sounds indicated. It can be
seen that the amplitudes of the different S1’s differ from beat to beat. Figure 4.27 shows
the recorded ECG of the patient. Lead V1 shows the characteristic “ripples” that are present
during atrial fibrillation. This patient suffers from severe mitral stenosis, as was confirmed
by an echocardiogram, which causes atrial fibrillation [12]. The ratios were calculated as
$\frac{\text{Cycle2}}{\text{Cycle1}}$, $\frac{\text{Cycle3}}{\text{Cycle2}}$ and
$\frac{\text{Cycle3}}{\text{Cycle1}}$. The average of these three values was calculated for
the recordings at the 5th and 6th intercostal spaces as features.
¹ Atrial fibrillation occurs when the atria are not depolarised in a rhythmic manner. Multiple electrical
impulses spread across the atria causing the atria to contract at random rates and in effect “flutter” [61].
² Third degree heart block, also known as complete heart block, is when the electrical impulse that
activates atrial and ventricular contraction does not pass through the AV node. This leads to the ventricles
not necessarily contracting after the atria but at their own rhythm. The QRS-complex of the ECG thus
does not necessarily follow the P-wave [62].
Figure 4.26: Three heart cycles of abnormal patient to illustrate S1 beat-to-beat variation (axes: time [sec], amplitude; each S1 is labelled)
Figure 4.27: ECG of patient that suffers from atrial fibrillation (see lead V1)
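The beat-to-beat feature described above reduces to a few lines of arithmetic, sketched here with hypothetical names. The inputs are assumed to be the S1 powers of three consecutive cycles, each calculated with equation 4.2.11.

```python
def s1_variation(power_1, power_2, power_3):
    """Average of the ratios Cycle2/Cycle1, Cycle3/Cycle2 and Cycle3/Cycle1."""
    ratios = (power_2 / power_1, power_3 / power_2, power_3 / power_1)
    return sum(ratios) / 3.0


print(s1_variation(2.0, 2.0, 2.0))  # constant S1 intensity gives 1.0
print(s1_variation(1.0, 2.0, 4.0))  # varying S1 intensity moves the value away from 1.0
```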
4.2.4 Duration of P-R interval of electrocardiogram
Another way of determining whether S1 is increased or decreased in intensity is by
calculating the P-R interval of the electrocardiogram. The P-R interval is defined as the
period from the start of the P-wave to the start of the QRS-complex in the human
electrocardiogram, that is, from the onset of atrial depolarization to the onset of
ventricular depolarization [63]. The impulse originates from the SA node, spreads across the
atria, reaches the AV node and moves down the interventricular septum into the Purkinje
fibres, resulting in contraction of the ventricles. Refer to Appendix A for a detailed
explanation of the electrical conduction system of the human heart and the human
electrocardiogram.
Table 4.2: Calculated P-R intervals for normal and abnormal patients
  Abnormal patient   P-R interval [sec]   Normal patient   P-R interval [sec]
  1                  0.14                 1                0.16
  2                  0.15                 2                0.15
  3                  0.15                 3                0.15
  4                  0.15                 4                0.17
  5                  0.14                 5                0.16
  6                  0.15                 6                0.17
  7                  0.16                 7                0.16
  8                  0.14                 8                0.16
  9                  0.15                 9                0.17
  10                 0.15                 10               0.15
The normal duration for the P-R interval is between 0.12 and 0.20 seconds. According
to Werner et al. [55] a short P-R interval (0.11-0.13 seconds) is indicative of an increase
in intensity of S1. A loud S1 is produced when the mitral valve is wide open at the onset
of ventricular contraction, resulting in a “louder” sound when the increase in ventricular
pressure closes the mitral valve. This is a logical consequence of the mitral valve having less
time to close due to the quicker onset of ventricular contraction. An increase in the P-R
interval (> 0.2 seconds) results in a decrease in the intensity of S1, as this implies that the
mitral valve is almost closed at the onset of ventricular contraction, resulting in a “softer”
sound [55]. The mitral valve has more time to close due to the delayed onset of ventricular
contraction.
The P-R interval (in seconds) was calculated by the equation presented in [56]:
$$T_{\text{P-R interval}} = 0.30\,T_{\text{R-R}}^{1/2} - 0.12\,T_{\text{R-R}} - 0.02 \tag{4.2.13}$$
The values were calculated for recording position 1 (patient supine breathing normally)
and the values for the normal and abnormal patients are shown in Table 4.2. The values
are only shown for 10 patients of the normal and abnormal group respectively.
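As a worked example of equation 4.2.13, the sketch below evaluates the P-R interval for a given R-R interval and relates it to the ranges quoted from [55]. The function names and the returned labels are hypothetical and serve only to illustrate the thresholds.

```python
import math


def pr_interval(t_rr):
    """P-R interval in seconds for an R-R interval in seconds (equation 4.2.13)."""
    return 0.30 * math.sqrt(t_rr) - 0.12 * t_rr - 0.02


def s1_intensity_hint(pr):
    """Relate a P-R interval to the expected intensity of S1 (ranges from [55])."""
    if 0.11 <= pr <= 0.13:
        return "increased"
    if pr > 0.20:
        return "decreased"
    return "no indication"


pr = pr_interval(1.0)  # an R-R interval of 1 s gives a P-R interval of 0.16 s
print(round(pr, 3), s1_intensity_hint(pr))
```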
4.2.5 Duration of S1 and S2
The next feature that was extracted was the duration of S1 and S2. S1 is normally of longer
duration than S2 [55] and any deviation from this could indicate pathology.
4.2.6 Duration of S2 split
It was decided to extract the duration of the split of S2 as a feature to be used in the
classification process. The duration of the split of S2 is indicative of some pathologies, as
explained in Section 3.3. The problem was how to identify the two components of S2 and
measure the time difference between them. Debbal et al. [25] investigated the differences
between the components of the second heart sound of a normal heart, a patient that suffers
from aortic coarctation and a patient that suffers from mitral stenosis. The aortic and
pulmonary components were identified by using the continuous wavelet transform (CWT),
but no automatic identification algorithm was presented in the study.
It was decided to identify the aortic and pulmonary components by also using the CWT,
but to add a degree of automation to the algorithm. When calculating the CWT of a signal,
a set of coefficients is generated that indicates the degree of comparison of the signal to
the analysing wavelet at a specific scale and at a specific instant in time. By taking the
absolute values of these coefficients, a “comparison envelope” is obtained. This indicates
where in frequency and time the major components of the analysed signal are situated. The
higher the value of the coefficient, the higher the degree of comparison at that specific
scale and time instant.
The CWT of the extracted second heart sounds (refer Section 4.2.1) was calculated
and the absolute values taken to obtain the envelope. The Daubechies wavelet of order
7 (db7) was used and the CWT was calculated at scales from 5 to 100. This corresponded to
pseudo-frequencies between 14 and 277 Hz. It was assumed that the two highest peaks
corresponded to A2 and P2, as was done in [25]. To obtain the time difference between the
two components, the highest peak was first identified. The entire data set was then stepped
through to identify the second highest peak: successive maxima were identified and set to
zero until a maximum was found that differed in time from the highest peak by 10 msec or
more. This maximum was then identified as the second peak. The absolute value of the time
difference between the two peaks was then taken as the time difference between A2
and P2. According to Werner et al. [55], A2 and P2 differ by 10-20 msec during expiration
and 40-50 msec during inspiration under normal conditions. The 10 msec minimum time
difference ensured that the two peaks were identified during inspiration as well as
expiration. Figure 4.28 shows the “comparison envelope” as viewed from the top (a contour
plot), with the two peaks identified (the X’s) for an abnormal patient. The timing
difference between these peaks was calculated as 12.5 msec.
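The search for the second peak can be sketched as follows. The sketch assumes that the CWT coefficients are already available as a list of rows (one row per scale, one column per sample), so no wavelet library is called. Visiting the coefficients in descending order of magnitude is used here as an equivalent of repeatedly zeroing the current maximum; the function name and the sampling frequency `fs` are assumptions.

```python
def s2_split(coeffs, fs, min_gap=0.010):
    """Time difference in seconds between the two main peaks of |coeffs|.

    The second peak is the largest coefficient that lies at least
    `min_gap` seconds away from the overall maximum.
    Returns None if no such coefficient exists.
    """
    ranked = sorted(((abs(value), column)
                     for row in coeffs
                     for column, value in enumerate(row)),
                    reverse=True)
    first_column = ranked[0][1]
    for _, column in ranked[1:]:
        gap = abs(column - first_column) / fs
        if gap >= min_gap:
            return gap
    return None


# Toy coefficient matrix with two scales and 40 samples, sampled at 1000 Hz.
row_a = [0.0] * 40
row_b = [0.0] * 40
row_a[10], row_a[11], row_a[25] = 5.0, 4.5, 3.0
row_b[10], row_b[30] = -4.0, 2.0
print(s2_split([row_a, row_b], fs=1000))
```

In the toy matrix the large coefficients next to the main peak are skipped because they lie within 10 msec of it, and the peak 15 samples away is selected as the second component.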
4.2.7 Determining the shape of systolic and diastolic murmurs
Murmurs can either be classified as systolic murmurs or diastolic murmurs, depending on
where in the cardiac cycle the murmur occurs. Depending on which pathology causes the
murmur, the shape of the murmur may vary. For instance, the intensity of the murmur
might increase from its origin (crescendo murmur) or decrease from its origin (decrescendo
murmur) or it may be a combination of the two. The intensity of the murmur may also stay
constant for the duration of the murmur.
Figure 4.28: CWT coefficients with peaks indicated that correspond to A2 and P2 of the second
heart sound (axes: samples, scales)
Figure 4.29: Ejection systolic murmur of a patient suffering from aortic stenosis, showing the
crescendo-decrescendo nature of the murmur (axes: time [sec], amplitude)
Systolic murmurs can be divided into ejection systolic murmurs, pansystolic murmurs
and late systolic murmurs. Ejection systolic murmurs occur due to turbulent blood flow
through stenotic aortic and pulmonary valves. The murmur increases to a crescendo at more
or less the middle of systole and then decreases (decrescendo) and ends just before the start
of S2. Pathologies that exhibit such murmurs are aortic stenosis, pulmonary stenosis and
atrial septal defect. Figure 4.29 shows the recording of a patient that suffers from moderate
aortic stenosis. The crescendo-decrescendo shape of the murmur can clearly be seen as
indicated by the dashed lines.
Pansystolic murmurs are murmurs that extend throughout systole. These murmurs have a
slight accentuation in mid-systole, but the accentuation of the ejection systolic murmur is
more pronounced. This type of murmur is caused by blood leaking through a valve that is not
closed properly during ventricular contraction, such as in mitral regurgitation or tricuspid
regurgitation. It may sometimes occur in patients that suffer from a ventricular septal
defect in which the hole between the left and right ventricles is relatively small. The
murmur starts simultaneously with S1, extends throughout systole and may even obscure a bit
of S2 [12]. Figure 4.30 shows the recording of a patient that suffers from mitral
regurgitation. The pansystolic nature of the murmur can clearly be seen. It obscures S1,
extends throughout systole and ends just before S2.
Figure 4.30: Pansystolic murmur of a patient suffering from mitral regurgitation (axes: time [sec], amplitude; S1, the murmur and S2 are labelled)
To calculate the shape of the murmur it was decided to extract the systole and diastole
portions from the heart cycle, break them up into three sections and calculate the root-
mean-square value (rms-value) of each. The rms-value of a function is defined as [64]:
$$\mathrm{RMS}(f) = \sqrt{\frac{\int_a^b f^2(x)\,dx}{b - a}} \tag{4.2.14}$$
By calculating three values, one can draw a line which shows whether the murmur is
either increasing (crescendo), decreasing (decrescendo), a combination of the two (crescendo-
decrescendo), or if it stays constant throughout systole or diastole. Figure 4.31 shows the
extracted systole component from the cardiac cycle of a patient. Three sections are indicated
and it can easily be seen that the murmur is of a decrescendo nature. The calculated rms-
values are presented in Table 4.3. From the values in Table 4.3 it can be seen that the
murmur is of a decrescendo nature, since the line that is plotted through the three values
has a negative gradient.
Figure 4.31: Systole extracted from the cardiac cycle showing three sections for which rms-values
are calculated to determine shape of murmur (axes: time [sec], amplitude)
Table 4.3: RMS-values of different sections of systole of an abnormal patient
  Section     rms-value
  Section 1   0.46
  Section 2   0.33
  Section 3   0.07
This process was repeated for each of the three extracted cardiac cycles of both systole
and diastole. The recordings used in the calculations were the recordings at the 2nd left
and right intercostal spaces and the recordings at the 5th and 6th left intercostal spaces. It
was thought necessary to use all of these recordings, since some murmurs are heard at one
location, but not at another. Since three values for each section of either systole or
diastole of a recording were obtained (due to the three cycles that were extracted), it was
decided to take the average of the three calculated values to use as a feature. This was
done in order to reduce the number of features used, as well as to obtain an overview of the
nature of the murmur under consideration. The average was simply calculated by taking the
sum of the three values and dividing the total by three. This process resulted in a total of
24 features being generated: 12 of them from the systole part of the cardiac cycle and 12
from the diastole part of the cardiac cycle.
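The three-section RMS calculation and the interpretation of the resulting values can be sketched as follows. The discrete RMS below approximates equation 4.2.14 for sampled data. The shape labels and the tolerance `tol` used to call a murmur constant are assumptions made for illustration, since only the rms-values themselves were used as features.

```python
import math


def section_rms(samples, n_sections=3):
    """RMS value of each of `n_sections` equal-length sections of the samples.

    Any samples left over after the equal split are ignored.
    """
    size = len(samples) // n_sections
    return [math.sqrt(sum(v * v for v in samples[i * size:(i + 1) * size]) / size)
            for i in range(n_sections)]


def murmur_shape(rms, tol=0.1):
    """Describe the trend through three rms-values (illustrative labels only)."""
    first, middle, last = rms
    reference = max(rms) or 1.0
    step_1 = (middle - first) / reference
    step_2 = (last - middle) / reference
    if abs(step_1) <= tol and abs(step_2) <= tol:
        return "constant"
    if step_1 > tol and step_2 < -tol:
        return "crescendo-decrescendo"
    return "crescendo" if last > first else "decrescendo"


print(section_rms([1, -1, 2, -2, 3, -3]))
print(murmur_shape([0.46, 0.33, 0.07]))  # values from Table 4.3
```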
4.2.8 Calculating maximum frequency in different sections of systole and diastole
The maximum frequency in each section of systole and diastole was calculated. The FFT
of each extracted section was calculated and the frequency with the maximum amplitude
was identified and extracted. This was done for all three extracted cycles of the recorded
phonocardiogram.
Figure 4.32: Systole extracted from normal patient with subsections indicated (axes: time [sec], amplitude)
Figure 4.33: FFT of each subsection in systolic region of cardiac cycle for a normal patient
(panels: FFT of section 1, FFT of section 2, FFT of section 3; axes: frequency [Hz], magnitude)
Figure 4.34: Diastole extracted from abnormal patient with subsections indicated (axes: time [sec], amplitude)
Figure 4.35: FFT of each subsection in diastolic region of cardiac cycle for an abnormal patient
(panels: FFT of section 1, FFT of section 2, FFT of section 3; axes: frequency [Hz], magnitude)
The recordings used for the calculations were the recordings at the 2nd left and right
intercostal spaces, the 4th left and right intercostal spaces and the 5th left intercostal space.
Figure 4.32 shows the extracted systolic portion of the cardiac cycle of a patient with no
abnormalities, with the sections into which it was subdivided indicated. Figure
4.33 shows the FFT of each subsection. Figure 4.34 shows the diastolic portion of the
cardiac cycle of a patient that suffers from aortic regurgitation, with the different sections
indicated. Figure 4.35 shows the FFT of each different subsection.
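Finding the frequency with the maximum amplitude in one section can be sketched as below. A direct discrete Fourier transform is used so that the example needs no external library. The sampling frequency and the function name are assumptions, and the zero-frequency (DC) bin is skipped on the assumption that a constant offset should not be reported as the dominant frequency.

```python
import cmath
import math


def dominant_frequency(samples, fs):
    """Frequency in Hz of the largest DFT magnitude, excluding the DC bin."""
    n = len(samples)
    best_bin, best_magnitude = 1, -1.0
    for k in range(1, n // 2 + 1):
        coefficient = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                          for i, x in enumerate(samples))
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * fs / n


# A 120 Hz tone sampled at 1000 Hz for 100 samples.
tone = [math.sin(2 * math.pi * 120 * i / 1000) for i in range(100)]
print(dominant_frequency(tone, fs=1000))
```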
4.2.9 Identifying extra sounds: ejection sound, midsystolic click and opening snap
When auscultating the heart, extra sounds that should not be present during normal
functioning of the human heart may occur. These sounds include the ejection sound,
midsystolic click and the opening snap.
Extra heart sounds can be attributed to specific events in the cardiac cycle and therefore
occur at specific times during the cardiac cycle. Ejection sounds are high-pitched sounds
that follow S1 by 0.04-0.06 seconds [55] and can be attributed to abnormalities of the aortic
and pulmonary valves [12]. Ejection sounds associated with the aortic valve are primarily
due to congenitally bicuspid aortic valves or congenital aortic stenosis, and are principally
due to the opening of the abnormal valves [12]. Ejection sounds due to abnormal pulmonary
valves are most commonly due to pulmonary stenosis, but can also be heard in patients with
idiopathic dilatation³ of the pulmonary artery, or with pulmonary artery dilatation caused
by pulmonary hypertension [12].
Midsystolic clicks are extra heart sounds that occur in midsystole. The most common
cause of a midsystolic click is mitral valve prolapse, where one of the mitral valve
leaflets moves into the left atrium during systole causing an extra sound. This is caused by
elongation or rupture of the chordae tendineae (the tendinous cords that hold the valve
leaflets in place) [12].
Opening snaps are high-pitched sounds which occur in patients who suffer from mitral
stenosis. The hardened (stenosed) valve moves forward towards the left ventricle at the
beginning of diastole as the pressure decreases, resulting in an extra sound in early
diastole, shortly after S2 [12].
To check whether any of these extra sounds were present, it was argued that the power within
the interval in which a sound would occur would be higher if the sound were present than if
it were not. The systolic and diastolic regions of each cardiac cycle were identified and
broken up into different sections, as shown in Figure 4.36, to search for these extra
sounds. ES refers to the ejection sound, MC to the midsystolic click and OS to the opening
snap. S1s refers to the start of S1 and S1e to the end of S1. The same notation applies to
S2.
For the ejection sound, 0.06 seconds was extracted from the end of S1. The power was
calculated, as was done in Section 4.2.1, by using equation 4.2.11. For the midsystolic click,
the portion of systole from Ls4 to 3Ls4 , with Ls being the length of systole, was extracted, and
the power calculated by equation 4.2.11. The opening snap follows A2 by 0.03-0.15 seconds
[12]. To search for any opening snap sounds, 0.15 seconds were extracted from the end
of S2 and the power of the extracted section was calculated by equation 4.2.11. This was
done for each of the three extracted cardiac cycles and the average of the three cycles was
3Idiopathic dilatation of the pulmonary artery is an uncommon cause of a large main pulmonary artery.
The reason for the enlargement of the artery is unknown. [65]
Figure 4.36: Splitting of heart cycle (search intervals ES, MC and OS marked relative to S1s, S1e, S2s and S2e)
Abnormal
Patient   ES power [1/sec]   MC power [1/sec]   OS power [1/sec]
1         8.96               67.26              21.59
2         60.41              38.30              0.30
3         215.16             32.96              0.16
4         18.25              1.78               0.11
5         48.88              263.01             2.94
Table 4.4: Average power of different sections to search for extra heart sounds (Abnormal)
calculated. For the ejection sound calculations, the recording at the 2nd right intercostal
space was used, for the midsystolic click, the recording at the 5th left intercostal space was
used and for the opening snap, the recording at the 4th right intercostal space was used.
Tables 4.4 and 4.5 show the calculated power of the different sections for abnormal and
normal patients respectively. The values are shown only for five patients of each group.
Figure 4.37 shows an extracted heart cycle with the sections indicated, where MC refers to
the midsystolic click search area, ES refers to the ejection sound search area and OS refers
to the opening snap search area.
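The windowing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equation 4.2.11 is not reproduced in this chapter, so the mean of the squared samples is used as a stand-in power measure, systole is taken to run from S1e to S2s, and the names `mean_square`, `extra_sound_powers` and `average_over_cycles` are hypothetical.

```python
def mean_square(segment):
    """Stand-in power measure (assumption): mean of the squared samples."""
    return sum(s * s for s in segment) / len(segment) if segment else 0.0


def extra_sound_powers(signal, fs, s1_end, s2_start, s2_end):
    """Return the power in the ES, MC and OS search windows of one cycle.

    signal : list of samples of one recording
    fs     : sampling frequency in Hz
    s1_end, s2_start, s2_end : sample indices of S1e, S2s and S2e
    """
    # Ejection sound: 0.06 s following the end of S1.
    es = signal[s1_end:s1_end + int(round(0.06 * fs))]
    # Midsystolic click: from Ls/4 to 3Ls/4 of systole (assumed S1e to S2s).
    ls = s2_start - s1_end
    mc = signal[s1_end + ls // 4:s1_end + (3 * ls) // 4]
    # Opening snap: 0.15 s following the end of S2.
    os_ = signal[s2_end:s2_end + int(round(0.15 * fs))]
    return mean_square(es), mean_square(mc), mean_square(os_)


def average_over_cycles(per_cycle_powers):
    """Average the (ES, MC, OS) powers over the extracted cycles."""
    n = len(per_cycle_powers)
    return tuple(sum(p[i] for p in per_cycle_powers) / n for i in range(3))
```

In this sketch each of the three extracted cycles would be passed through `extra_sound_powers` and the three results averaged with `average_over_cycles`.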
Normal
Patient   ES power [1/sec]   MC power [1/sec]   OS power [1/sec]
1         23.37              184.88             0.11
2         21.10              77.37              0.01
3         75.68              202.18             23.61
4         62.45              175.56             0.01
5         42.26              257.70             0.22
Table 4.5: Average power of different sections to search for extra heart sounds (Normal)
Figure 4.37: Cardiac cycle shown with extra sounds search areas (ES, MC and OS marked within systole and diastole)
Chapter 5
Feature selection and classification
This chapter describes the method used in reducing the dimension of the input vector to the
classification scheme. Neural Networks are introduced as the classification scheme used, the
theory is discussed and some preliminary results are given. Statistical Overlap Factor (SOF)
is discussed as the feature reduction technique. The process is described schematically in
Figure 5.1.
Figure 5.1: Feature reduction and ANN training and testing methodology
5.1 Feature selection
The extracted features were reduced to a smaller set before being used as input to the
classification scheme, since a large input set was deemed unnecessary and computationally
intensive. A variety of feature reduction methods exist, including Principal Component
Analysis (PCA), Independent Component Analysis (ICA) and the Statistical Overlap Factor
(SOF). SOF was used to reduce the features in this project, since it was simple and efficient
to implement and gave satisfactory results.
5.1.1 Statistical overlap factor
The Statistical Overlap Factor (SOF) is used to determine the variability or degree of
separation between two distributions, and is defined as [66]:
\[ \mathrm{SOF} = \left| \frac{\bar{x}_1 - \bar{x}_2}{(\sigma_1 + \sigma_2)/2} \right| \tag{5.1.1} \]
where $\bar{x}_1$ and $\bar{x}_2$ are the means of distributions $x_1$ and $x_2$, and $\sigma_1$ and $\sigma_2$ are the respective
standard deviations. The higher the SOF, the better the degree of separation between the
two distributions [66].
As an example, the SOF of two of the features extracted in Section 4.2 will be shown.
This will illustrate the degree of separation between the features from the different groups
(normal and abnormal). The reason for implementing the SOF to reduce the features is to
extract those features that differ most from one another in the respective groups. This will
ensure a better performance in classification.
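As an illustration of equation 5.1.1, a minimal Python sketch of the SOF calculation and of ranking features by it is given here. The function names and the dictionary layout are hypothetical, and the sample standard deviation is assumed, since the text does not state whether the sample or population form was used.

```python
from statistics import mean, stdev


def statistical_overlap_factor(group_1, group_2):
    """Equation 5.1.1: |difference of means| / mean of standard deviations."""
    m1, m2 = mean(group_1), mean(group_2)
    s1, s2 = stdev(group_1), stdev(group_2)  # sample form (assumption)
    return abs((m1 - m2) / ((s1 + s2) / 2))


def rank_features(features, n_best):
    """Return the n_best (name, SOF) pairs, highest SOF first.

    features maps a feature name to a pair of lists:
    (values for the normal group, values for the abnormal group).
    """
    scored = {name: statistical_overlap_factor(normal, abnormal)
              for name, (normal, abnormal) in features.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:n_best]
```

Calling `rank_features(features, 3)` would correspond to keeping the three features with the highest SOF.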
The number of features extracted in Section 4.2 amounted to a total of 70.
The number of features used in the eventual classification scheme was reduced to 3. This
number was determined experimentally and proved to produce the best results. To determine the
number of features, the network was trained and tested with different numbers of hidden
nodes and features to establish which combination provided the best results. The network
was trained and tested with 2, 3, 4 and 5 input features and 10, 15 and 20 hidden neurons.
Figure 5.2: Values of feature that exhibited the greatest degree of separation between the
normal and abnormal groups (feature value against patient number, for the abnormal and normal groups)
Figure 5.3: Values of feature that exhibited the smallest degree of separation between the
normal and abnormal groups (feature value against patient number, for the abnormal and normal groups)
Number of features   Run 1    Run 2    Run 3    Run 4    Run 5    Desired value
2                    0.1369   0.1525   0.1430   0.1506   0.1764   0
2                    0.3437   0.2698   0.3137   0.2857   0.2347   1
3                    0.2644   0.2568   0.2987   0.2445   0.2584   0
3                    0.8097   1.0350   0.8700   0.8711   0.9298   1
4                    1.0121   1.1424   0.9547   0.8631   0.9969   0
4                    0.5409   0.6969   0.7510   1.0214   1.0392   1
5                    1.0853   1.0280   0.9760   0.9397   0.9478   0
5                    0.5270   0.2885   0.1480   0.2345   0.2271   1
Table 5.1: Network outputs for network with 15 hidden neurons and 2, 3, 4 and 5 input features
Feature                                                    SOF
RMS-value of 3rd section of diastole - 2nd IC right        1.6153
RMS-value of 3rd section of diastole - 2nd IC left         1.6153
Max frequency of 1st section of diastole - 4th IC right    1.3869
Table 5.2: Selected features and their respective SOF
Each combination was tested five times to determine whether repeatable results were achieved.
The network was trained to give an output of 0 for a normal heart sound and an output
of 1 for an abnormal heart sound. The number of features to be used was selected on the
basis of the number that gave the most correct and repeatable results, i.e. if the network
outputs lay between 0 and 1 and the results for each training and testing run
were in approximately the same range, that number of features was selected. Table 5.1 shows
the results for the testing run for a network with 15 hidden nodes and 2, 3, 4 and 5 input
features respectively. It can be seen that 3 input features provided the best results: the
outputs for the two classes were clearly separated in all five runs, and those for the
abnormal case were close to 1.
The three features that were used in the final classification scheme are shown in Table
5.2. Figure 5.2 shows the different values for the RMS-value of the third section of diastole
of the recording at the 2nd right intercostal space. Figure 5.3 shows the different values for
the maximum frequency of the first section of diastole of the recording at the 4th right intercostal
space. The SOF calculated for the first feature was 1.6153 and for the last feature was
1.3869. Tables B.1 and B.2 in Appendix B show the full set of extracted features and their
respective SOF.
5.2 Artificial Neural Network classification
Artificial Neural Networks (ANNs) are mathematical models inspired by biological nervous
systems. ANNs attempt to simulate the learning process of biological systems, and can
learn to recognise certain inputs to produce particular outputs [67]. Therefore, ANNs are
commonly used for pattern detection and classification of signal features.
ANNs consist of multiple interconnected processing units, known as neurons, arranged
in two or more layers. The simplest network consists of an input layer and an output
layer, as shown in Figure 5.4. The circle in the output layer represents a single neuron.
The input layer has one input and the output layer has one output. The output neuron
normally contains a transfer function that changes the input to a certain output. The
transfer function is denoted by the symbol f . Normally ANNs contain one or more hidden
layers. A hidden layer is another layer of neurons inserted between the input and output
layers.
Feed-forward networks (FFNs) are the simplest type of multiple-layer ANNs. An FFN
consists of an input layer, one or more hidden layers and an output layer. As the name
implies, a specific layer is only connected to the layer in front of it; no feedback or
layer-skipping is present. A simple FFN with one hidden layer is shown in Figure 5.5.
Associated with each neuron is a specific transfer function. In Figure 5.5 the transfer
function is denoted by $f$. A variety of transfer functions can be used in the neurons, each
with its own benefits. The selected transfer functions have to be differentiable, though. The
most common transfer functions used are the family of sigmoid transfer functions [68]. A
typical representation is
\[ f(x) = \frac{2}{1 + e^{-ax}} - 1 \tag{5.2.1} \]
which belongs to the family of hyperbolic tangent functions. These functions are known as
squashing functions, since their output is limited to a finite range of values [68]. The output
of the function in equation 5.2.1 varies between $-1$ and $1$. Figure 5.6 shows the function
for different values of $a$.
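The statement that equation 5.2.1 belongs to the hyperbolic tangent family can be checked numerically, since $2/(1 + e^{-ax}) - 1 = \tanh(ax/2)$. The short Python sketch below does this and also gives the derivative that backpropagation requires; the function names are hypothetical and chosen only for this illustration.

```python
import math


def bipolar_sigmoid(x, a=1.0):
    """Equation 5.2.1: f(x) = 2 / (1 + exp(-a x)) - 1."""
    return 2.0 / (1.0 + math.exp(-a * x)) - 1.0


def bipolar_sigmoid_derivative(x, a=1.0):
    """First derivative of equation 5.2.1, written in terms of f itself."""
    y = bipolar_sigmoid(x, a)
    return 0.5 * a * (1.0 - y * y)


# Compare with tanh(a x / 2) for the three slopes shown in Figure 5.6.
for a in (1.0, 2.5, 5.0):
    for x in (-2.0, -0.5, 0.0, 0.7, 3.0):
        assert abs(bipolar_sigmoid(x, a) - math.tanh(a * x / 2.0)) < 1e-12
```

Larger values of $a$ give a steeper transition around $x = 0$, while the output range stays between $-1$ and $1$.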
The learning process of the ANN can either be supervised or unsupervised. During
supervised learning, a target output value is set for each input given to the ANN. The
ANN then tries to minimise the error between the output it calculates and the desired
response by minimising a certain cost function. The cost function can be any function that
is dependent on the calculated output of the network and the target values. This is done
by iteratively adjusting the weights and biases of each neuron until a specified tolerance has
Figure 5.4: ANN with an input layer and an output layer
Figure 5.5: ANN with one hidden layer (input $\mathbf{x}(i)$, layers $r-1$ and $r$, biases $b_1$ to $b_6$, weights $w_{jk}^r$, summations $v_k^{r-1}$ and $v_j^r$, outputs $y_k^{r-1}$ and $y_j^r$)
Figure 5.6: $f(x) = \frac{2}{1 + e^{-ax}} - 1$ for $a = 1, 2.5, 5$
been met. During unsupervised learning the weights and biases are adapted only according
to the inputs, since no target values are available for training purposes. The biases for each
neuron are indicated as b1, b2 etc.
The most common way in which the weights are calculated and adapted, is probably the
backpropagation algorithm [68]. Standard backpropagation is a gradient descent algorithm¹
in which the network weights are adjusted in accordance with the negative of the gradient of
the cost function. The term “backpropagation” refers to the manner in which the gradient is
computed for nonlinear multilayer networks. The gradient is first calculated for the last layer
of the network and subsequently moved to the first layer, hence the term backpropagation.
During training, when an input vector $\mathbf{x}(i)$ is applied to the input, the output of the
network will be $\hat{\mathbf{y}}(i)$, which generally differs from the desired value $\mathbf{y}(i)$. The weights of the
connections are computed such that an appropriate cost function, $J$, which is dependent on
the values $\mathbf{y}(i)$ and $\hat{\mathbf{y}}(i)$, $i = 1, 2, \ldots, N$, is minimised [68]. In Figure 5.5 the weight vector
of the $j$th neuron in layer $r$ is denoted by $\mathbf{w}_j^r$, which includes the thresholds. The
weight between neuron $k$ in layer $r-1$ and neuron $j$ in layer $r$ is denoted by $w_{jk}^r$, which states
that the weight is applicable from neuron $k$ to neuron $j$ in layer $r$. The weight vector for
neuron $j$ in layer $r$ is defined as $\mathbf{w}_j^r = [w_{j0}^r, w_{j1}^r, \ldots, w_{jk_{r-1}}^r]^T$, where $k_{r-1}$ is the number of
neurons in layer $r-1$.
At each iteration the weight vector is updated by
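\[ \mathbf{w}_j^r(\text{new}) = \mathbf{w}_j^r(\text{old}) + \Delta\mathbf{w}_j^r \tag{5.2.2} \]
where $\mathbf{w}_j^r(\text{old})$ is the current estimate of the unknown weights and $\Delta\mathbf{w}_j^r$ is the correction
term to obtain the new estimate of the weights $\mathbf{w}_j^r(\text{new})$.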
5.2.1 The backpropagation algorithm
The backpropagation algorithm works as follows (refer to Figure 5.5):
• Initialisation: Initialise all the weights with small random values.
• Forward computations: For each of the training feature vectors $\mathbf{x}(i)$, $i = 1, 2, \ldots, N$,
compute all the $v_j^r(i)$, $y_j^r(i) = f(v_j^r(i))$, $j = 1, 2, \ldots, k_r$, $r = 1, 2, \ldots, L$, from
\[ v_j^r(i) = \sum_{k=1}^{k_{r-1}} w_{jk}^r y_k^{r-1}(i) + w_{j0}^r \equiv \sum_{k=0}^{k_{r-1}} w_{jk}^r y_k^{r-1}(i) \tag{5.2.3} \]
where $k_r$ is the number of neurons in layer $r$ and $L$ is the number of layers. By
definition $y_0^r(i) \equiv 1$, $\forall\, r, i$, so as to include the thresholds in the weights [68]. For the
¹ Please refer to Appendix C for an explanation of the gradient descent algorithm.
output layer, $r = L$ and $y_k^r(i) = \hat{y}_k(i)$, $k = 1, 2, \ldots, k_L$, i.e. the outputs of the neural
network, and for $r = 1$, $y_k^{r-1}(i) = x_k(i)$, $k = 1, 2, \ldots, k_0$, i.e. the network inputs. $k_0$ is
the number of nodes in the input layer [or the length of $\mathbf{x}(i)$].
Compute the cost function for the current estimate of weights from
\[ J = \sum_{i=1}^{N} \varepsilon(i) \tag{5.2.4} \]
where
\[ \varepsilon(i) = \frac{1}{N} \sum_{m=1}^{k_L} e_m^2(i) = \frac{1}{N} \sum_{m=1}^{k_L} \left( f(v_m^L(i)) - y_m(i) \right)^2 \tag{5.2.5} \]
and $y_m(i)$ is the target value of output neuron $m$, and $k_L$ is the number of neurons
in the last layer of the network, layer $L$. In this case, the function $\varepsilon(i)$ is the
mean-squared error function. Equation 5.2.4 thus states that the cost function is defined
as the sum of the $N$ values that the function $\varepsilon$ takes on for each training pair, ($\mathbf{x}(i)$,
$\mathbf{y}(i)$). Other functions can be used for $\varepsilon(i)$ as well, such as the sum of squared errors
defined as:
\[ \varepsilon(i) = \frac{1}{2} \sum_{m=1}^{k_L} \left( f(v_m^L(i)) - y_m(i) \right)^2 \tag{5.2.6} \]
or the cross-entropy function defined by:
\[ \varepsilon(i) = -\sum_{m=1}^{k_L} \left( y_m(i) \ln \hat{y}_m(i) + (1 - y_m(i)) \ln(1 - \hat{y}_m(i)) \right) \tag{5.2.7} \]
• Backward computations: For each $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, k_L$ compute $\delta_j^L(i)$
from:
\[ \delta_j^L(i) = e_j(i) f'(v_j^L(i)) \tag{5.2.8} \]
where $f'$ is the first derivative of the transfer function $f$,
and subsequently compute $\delta_j^{r-1}(i)$ for $r = L, L-1, \ldots, 2$, ($r = 1$ is the input layer)
and $j = 1, 2, \ldots, k_r$ from:
\[ \delta_j^{r-1}(i) = e_j^{r-1}(i) f'(v_j^{r-1}(i)) \tag{5.2.9} \]
where
\[ e_j^{r-1}(i) = \sum_{k=1}^{k_r} \delta_k^r(i) w_{kj}^r \tag{5.2.10} \]
where $w_{kj}^r$ is the weight between neuron $j$ in layer $r-1$ and neuron $k$ in layer $r$, following the
convention used for $w_{jk}^r$. The subscript $kj$ simply states that the calculation is moving from
neuron $j$ to neuron $k$ (i.e. backwards, as would be expected).
• Update the weights: For $r = 1, 2, \ldots, L$ and $j = 1, 2, \ldots, k_r$ calculate the new estimate
of the weights from equation 5.2.2 where
\[ \Delta\mathbf{w}_j^r = -\mu \sum_{i=1}^{N} \delta_j^r(i) \mathbf{y}^{r-1}(i) \tag{5.2.11} \]
where $\mu$ is defined as the learning rate of the training scheme.
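The four steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in this study: it assumes the transfer function of equation 5.2.1 with $a = 1$ in every layer (including the output layer), the sum-of-squared-errors function of equation 5.2.6, and a nested-list layout for the weights. All function names are hypothetical.

```python
import math
import random


def f(v):
    """Transfer function of equation 5.2.1 with a = 1."""
    return 2.0 / (1.0 + math.exp(-v)) - 1.0


def f_prime(v):
    """First derivative of f."""
    y = f(v)
    return 0.5 * (1.0 - y * y)


def init_weights(sizes, rng):
    """Initialisation: small random weights.

    weights[r][j][k] links neuron k of one layer to neuron j of the next;
    index k = 0 holds the threshold, matching y_0 = 1 in equation 5.2.3.
    """
    return [[[rng.uniform(-0.5, 0.5) for _ in range(sizes[r] + 1)]
             for _ in range(sizes[r + 1])]
            for r in range(len(sizes) - 1)]


def forward(weights, x):
    """Forward computations: return all v and y values for one input."""
    ys, vs = [[1.0] + list(x)], []
    for layer in weights:
        v = [sum(w * y for w, y in zip(row, ys[-1])) for row in layer]
        vs.append(v)
        ys.append([1.0] + [f(vj) for vj in v])
    return vs, ys


def cost(weights, data):
    """Equations 5.2.4 and 5.2.6: sum of half the squared output errors."""
    total = 0.0
    for x, target in data:
        _, ys = forward(weights, x)
        total += 0.5 * sum((o - t) ** 2 for o, t in zip(ys[-1][1:], target))
    return total


def gradients(weights, data):
    """Backward computations: accumulate delta * y over all training pairs."""
    grads = [[[0.0] * len(row) for row in layer] for layer in weights]
    for x, target in data:
        vs, ys = forward(weights, x)
        # Output layer (5.2.8): delta = e * f'(v), e = output - target.
        delta = [(o - t) * f_prime(v)
                 for o, t, v in zip(ys[-1][1:], target, vs[-1])]
        for r in range(len(weights) - 1, -1, -1):
            for j, d in enumerate(delta):
                for k, y in enumerate(ys[r]):
                    grads[r][j][k] += d * y
            if r > 0:
                # Previous layer (5.2.9, 5.2.10); j + 1 skips the threshold.
                delta = [sum(d * weights[r][k][j + 1]
                             for k, d in enumerate(delta)) * f_prime(vs[r - 1][j])
                         for j in range(len(vs[r - 1]))]
    return grads


def train_epoch(weights, data, mu):
    """Update the weights in place (5.2.2 and 5.2.11)."""
    grads = gradients(weights, data)
    for layer, g_layer in zip(weights, grads):
        for row, g_row in zip(layer, g_layer):
            for k in range(len(row)):
                row[k] -= mu * g_row[k]
```

A network would be created with, for example, `init_weights([2, 3, 1], random.Random(0))` and trained by calling `train_epoch` repeatedly until the value returned by `cost` falls below a chosen tolerance.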
5.2.2 Backpropagation variations
The convergence speed of the backpropagation scheme can sometimes be very slow [68] and
therefore variations to this scheme have been developed. Some of these variations include
the use of a momentum term, the use of an adaptive learning rate, the delta-delta rule and
the delta-bar-delta rule. Only the momentum term and adaptive learning rate strategies
will be discussed here.
5.2.2.1 Backpropagation with a momentum term
When the convergence of the cost function is slow with the backpropagation algorithm, it
is usually due to the fact that the change of the cost function gradient is highly oscillatory
between successive iterations [68]. To overcome this, a momentum term can be added to
the algorithm. This updates the weights and smooths the oscillatory behaviour and speeds
up convergence. The weights are then updated by:
\[ \Delta\mathbf{w}_j^r = \alpha \Delta\mathbf{w}_j^r(\text{old}) - \mu \sum_{i=1}^{N} \delta_j^r(i) \mathbf{y}^{r-1}(i) \tag{5.2.12} \]
The constant α is the momentum factor and usually takes on a value between 0.1 and
0.8 [68]. This approach was attempted with the training of the network in this study, but
did not give satisfactory results.
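A minimal sketch of the update in equation 5.2.12 is shown below for a single weight vector stored as a flat list. The function name and argument layout are hypothetical; `gradient` stands for the summed term $\sum_i \delta_j^r(i)\mathbf{y}^{r-1}(i)$.

```python
def momentum_step(weights, gradient, previous_delta, mu, alpha):
    """Return the updated weights and the correction term that was applied.

    delta = alpha * previous_delta - mu * gradient   (equation 5.2.12)
    new   = old + delta                              (equation 5.2.2)
    """
    delta = [alpha * d_old - mu * g
             for d_old, g in zip(previous_delta, gradient)]
    new_weights = [w + d for w, d in zip(weights, delta)]
    return new_weights, delta
```

The returned `delta` is passed back in as `previous_delta` on the next iteration, so that part of the previous correction is carried forward and sign changes of the gradient between iterations are damped.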
5.2.2.2 Backpropagation with an adaptive learning rate
The adaptive learning rate was also attempted with the training of the network in this study
and had very good results and was thus adopted as the training algorithm for the network.
This variation works on the principle that the learning rate, µ, is adapted, depending on
Parameters
µ             ri      rd            c
4 × 10^-7     1.15    5 × 10^-6     1
Table 5.3: Neural network training algorithm parameters
the value of the cost function at successive iteration steps. The process can be described as
\[
\begin{aligned}
\frac{J(t)}{J(t-1)} < 1 &: \quad \mu(t) = r_i \, \mu(t-1) \\
\frac{J(t)}{J(t-1)} > c &: \quad \mu(t) = r_d \, \mu(t-1) \\
1 \le \frac{J(t)}{J(t-1)} \le c &: \quad \mu(t) = \mu(t-1)
\end{aligned}
\]
where $J(t)$ is the cost function at iteration $t$, $r_i$ is the factor by which the learning rate, $\mu$,
is increased, $r_d$ is the factor by which the learning rate is decreased and $c$ is a limiting
factor by which the current and previous cost function ratio is allowed to differ. Typical
values are $r_i = 1.05$, $r_d = 0.7$ and $c = 1.04$ [68]. The values used in this study are presented
in Table 5.3 and were determined experimentally.
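The three cases above translate directly into a small function. The sketch below is illustrative only; the function name is hypothetical and the default arguments are the typical values quoted from [68], not the values of Table 5.3.

```python
def adapt_learning_rate(mu_prev, cost_now, cost_prev,
                        r_i=1.05, r_d=0.7, c=1.04):
    """Return mu(t) given mu(t-1), J(t) and J(t-1).

    cost_prev is assumed to be non-zero.
    """
    ratio = cost_now / cost_prev
    if ratio < 1.0:       # cost decreased: increase the learning rate
        return r_i * mu_prev
    if ratio > c:         # cost grew by more than the allowed factor
        return r_d * mu_prev
    return mu_prev        # 1 <= ratio <= c: leave the rate unchanged
```

The function would be called once per iteration, after the cost for the new weights has been evaluated.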
5.3 Construction and training of the neural network
For this study, an FFN with two hidden layers was implemented. The input layer consisted of
3 nodes (the 3 input features) and the hidden layers consisted of 10 and 5 neurons respectively,
with the logistic sigmoid (log-sigmoid) function as activation function. The formula of this function
is:
\[ f(x) = \frac{1}{1 + \exp(-x)} \tag{5.3.1} \]
The output layer consisted of one neuron with a linear function as activation function
(this function simply gives the same value as output and input). As stated previously, the
training function used was the backpropagation algorithm with the adaptive learning rate.
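The forward pass of the structure described above (3 inputs, hidden layers of 10 and 5 log-sigmoid neurons, one linear output neuron) can be sketched as follows. The parameter dictionary, its key names and the function names are hypothetical and serve only to make the layer sizes concrete.

```python
import math


def logsig(x):
    """Equation 5.3.1: f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases, activation):
    """Output of one layer: activation(weights . inputs + bias) per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def network_output(features, params):
    """Forward pass through the 3-10-5-1 network.

    params["W1"] is 10 x 3, params["W2"] is 5 x 10 and params["W3"] is 1 x 5;
    params["b1"], params["b2"] and params["b3"] are the matching bias lists.
    """
    h1 = layer(features, params["W1"], params["b1"], logsig)
    h2 = layer(h1, params["W2"], params["b2"], logsig)
    # Linear output neuron: the output equals its weighted input sum.
    return layer(h2, params["W3"], params["b3"], lambda v: v)[0]
```

Because the output neuron is linear, the network output is not confined to the interval between 0 and 1, which is consistent with values slightly above 1 appearing in Table 5.1.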
In certain circumstances, networks might be overtrained. This implies that the network
memorises the training data and produces the correct outputs, but does not have adequate
generalisation capabilities, i.e. it does not produce the correct results for new data. Several
methods to improve the generalisation capabilities of networks do exist, and one of these
methods is called regularisation. Regularisation tries to shrink the size of the weights, since
large weights lead to irregular error surfaces when sigmoid functions are used as activation
functions [69]. During regularisation the cost function is calculated by:
\[ J = \alpha \sum_{i=1}^{N} \varepsilon(i) + (1 - \alpha)\,\varepsilon_p(\mathbf{w}) \tag{5.3.2} \]
where $\varepsilon_p(\mathbf{w})$ is a penalty function that depends on the $K$ weights of the network and is equal to:
\[ \varepsilon_p(\mathbf{w}) = \frac{1}{K} \sum_{k=1}^{K} w_k^2 \tag{5.3.3} \]
and $\alpha$ is the regularisation parameter. This method improved the results and hence was
implemented in the final network structure. In the final network structure $\alpha$ was set to 0.5,
giving equal importance to the weights and the network errors.
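Equations 5.3.2 and 5.3.3 can be illustrated with a short Python sketch. The function name is hypothetical; `sample_errors` stands for the values $\varepsilon(i)$ and `weights` for all $K$ network weights collected in one flat list.

```python
def regularised_cost(sample_errors, weights, alpha=0.5):
    """J = alpha * sum(eps(i)) + (1 - alpha) * mean of squared weights."""
    penalty = sum(w * w for w in weights) / len(weights)
    return alpha * sum(sample_errors) + (1.0 - alpha) * penalty
```

With `alpha = 0.5`, as used in the final network, the summed errors and the mean squared weight contribute equally, so a set of weights that fits the training data slightly worse but is much smaller in magnitude can yield a lower cost.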
The number of features to be used in the network was determined as explained in
Section 5.1.1. The number of hidden layers and nodes in the hidden layers was determined
by trial and error. The configuration that provided the best results was two hidden layers
with 10 and 5 nodes respectively. The network was trained by implementing a variation of
the leave-one-out algorithm, as was done in [70] and [39]. Thirteen abnormal data sets and
16 normal data sets were randomly selected as training data and one normal data set and
one abnormal data set were selected as test data. This process was repeated 50 times in
order to test a wide variety of combinations of normal and abnormal datasets.
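The resampling scheme described above can be sketched as a generator of training and test splits. This is an assumption-laden illustration: the text does not state how the random selection was implemented, so the function name, the label coding (0 for normal, 1 for abnormal, following Section 5.1.1) and the use of a seeded random generator are all hypothetical.

```python
import random


def leave_pair_out_splits(normal_ids, abnormal_ids, repetitions=50, seed=0):
    """Yield (train, test) lists of (patient id, label) pairs.

    In each repetition one normal and one abnormal data set are held out
    for testing and all remaining data sets are used for training.
    """
    rng = random.Random(seed)
    for _ in range(repetitions):
        test_normal = rng.choice(normal_ids)
        test_abnormal = rng.choice(abnormal_ids)
        train = ([(p, 0) for p in normal_ids if p != test_normal]
                 + [(p, 1) for p in abnormal_ids if p != test_abnormal])
        test = [(test_normal, 0), (test_abnormal, 1)]
        yield train, test
```

With 17 normal and 14 abnormal data sets, every split contains 16 normal and 13 abnormal training sets and two test sets, matching the numbers given above.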
It had to be decided which threshold would be used to differentiate between normal and
abnormal data, e.g. for a threshold of value t, all values below t would be set to 0 and
all values equal to or above t would be set to 1, where t is a value between 0 and 1. One
way of determining the optimal threshold value is by constructing a Receiver Operating
Characteristic (ROC) curve. An ROC curve is a measure of how well a specific
decision-making system discriminates between different classes. It is based on detecting the optimal
threshold to distinguish between two probability density functions. The aim is to maximise
the true-positive fraction (TPF) while at the same time minimising the false-positive fraction
(FPF). The TPF is the fraction of patients with the disease who are classified correctly as
having the disease, i.e. the sensitivity. The FPF is the fraction of people who do not have
the disease, but are classified as having the disease. The FPF is related to the true-negative
fraction (TNF) by the following relation:
FPF + TNF = 1 (5.3.4)
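The thresholding rule and the two fractions can be sketched as follows. The function names are hypothetical; `outputs` are the raw network outputs and `labels` are the desired values (0 for normal, 1 for abnormal).

```python
def tpf_fpf(outputs, labels, threshold):
    """Return (TPF, FPF) for one threshold.

    Outputs equal to or above the threshold are classified as abnormal (1),
    outputs below it as normal (0).
    """
    predictions = [1 if o >= threshold else 0 for o in outputs]
    tp = sum(1 for p, l in zip(predictions, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(predictions, labels) if p == 1 and l == 0)
    positives = sum(1 for l in labels if l == 1)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def roc_points(outputs, labels, thresholds):
    """Return one (FPF, TPF) point of the ROC curve per threshold."""
    points = []
    for t in thresholds:
        tpf, fpf = tpf_fpf(outputs, labels, t)
        points.append((fpf, tpf))
    return points
```

Sweeping the threshold from 0 to 1 and plotting the resulting points gives a curve of TPF against FPF; the specificity at any threshold follows from equation 5.3.4 as 1 - FPF.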
The optimal threshold was calculated as 0.35, producing a TPF of 0.8571 and a FPF
of 0.1765. However, it was noted that the FPF was 0 and TPF was 0.7857 at a threshold
of 0.4 and, therefore, it was decided to examine the threshold in this range more closely to
Figure 5.7: ROC curve for classification scheme used (TPF against FPF, with the ideal ROC shown for comparison)
ascertain whether better results could be achieved. The threshold was varied in the range from
0.25 to 0.4 in steps of 0.01 and, indeed, better results were achieved. Thresholds equal to
0.39 and 0.38 resulted in a FPF of 0.0588 and a TPF of 0.8571 and were thus deemed
optimal.
A measure of the effectiveness of a specific test is given by the area under the ROC curve
[11]. This value can only be between 0 and 1. The closer this value is to 1, the better the
test. The area under the curve in Figure 5.7 was calculated as 0.9076, which indicates that
the test is of a very good standard. The ideal curve is also shown in comparison with the
ROC curve of this test.
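The area under an ROC curve can be estimated from a set of (FPF, TPF) points with the trapezoidal rule. The sketch below is one possible way of doing this and is not necessarily the method used to obtain the value quoted above; the function name is hypothetical, and the end points (0, 0) and (1, 1) are added if they are missing.

```python
def area_under_roc(points):
    """Trapezoidal estimate of the area under a set of (FPF, TPF) points."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

The ideal curve, which passes through the point (0, 1), gives an area of 1, whereas a classifier that performs no better than chance follows the diagonal and gives an area of 0.5.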
The optimal threshold value was calculated to be 0.38, resulting in a TPF of 0.8571 and
FPF of 0.0588. This in turn resulted in a sensitivity of:
Sensitivity = TPF = 85.7%
and a specificity of:
Specificity = (1 − FPF) × 100% = (1 − 0.0588) × 100% = 94.1%
This means that 85.7% of the abnormal patients were classified correctly as abnormal and
94.1% of the normal patients were classified correctly as normal.
Chapter 6
Conclusions and Recommendations
Problems regarding the cycle and feature extraction processes, as well as the ANN, are
discussed and possible solutions are presented. An overview of the positive and negative
aspects of the auscultation jacket and its application to telemedicine is also given.
6.1 Data analysis and classification system
6.1.1 Cycle extraction
In some instances cycle extraction was not possible, since the ECG recorded simultaneously
with the heart sounds contained artifacts, as shown in Figure 6.1. When this signal was passed
through the first-derivative operator and the MA filter (refer to Section 4.1.2), the
artifacts were incorrectly labelled as QRS-peaks, which resulted in the extraction of
incorrect cycles. Recordings in which the ECG did not record properly had to be discarded
and could not be used in the training of the Neural Network. Figure 6.1 shows the originally
recorded ECG signal after it had been low-pass filtered (refer to Section 4.1). Figure 6.2 shows
the signal after it had been passed through the first-derivative operator and MA filter (refer to
Section 4.1.2). It can clearly be seen that peaks are present that cannot be attributed to
QRS-peaks.
One might ask why the threshold was not simply increased. In some instances the
amplitude of consecutive QRS-peaks differed from the maximum peak amplitude to
such an extent that, when the threshold was increased, genuine QRS-peaks were also removed and the
cycle extraction process missed some cycles. Because successive ECG
cycle intervals were compared with one another, these intervals then fell outside the allowed
range and no cycles were extracted. All such erroneous cycles had to
be discarded.
Two ECG recordings were taken, because it was only realised after the ECG had been
built into the jacket that the ECG recording would be needed to identify the start of S1.
Figure 6.1: Recorded GeoAxon ECG showing artifacts that prohibited cycle extraction
(normalised amplitude against time [sec])
Figure 6.2: QRS-peaks with artifacts that resulted in wrongly extracted cycles
(normalised amplitude against time [sec])
Because this was only a prototype of the auscultation jacket, the easiest and quickest
solution was to record an extra ECG together with the stethoscope data instead
of attempting to access the recording process of the built-in ECG and synchronise
it with the recording of the stethoscopes. Synchronising the built-in ECG with the stethoscope
recordings would be the more sophisticated solution
and would nullify the effects of artifacts. This procedure should definitely be attempted in
the following version of the auscultation jacket.
6.1.2 Denoising
The denoising procedure of the recorded heart sounds presented a problem in that if the
wavelet threshold was set too high, some of the information was discarded. The use of high-
or low-pass filters did not solve this problem either, since the noise frequencies were in the
same range as the frequencies of interest. Averaging was not used and might provide better
results as indicated in [24].
Some of the stethoscopes (especially the stethoscopes at the 2nd left and right intercostal
spaces) did not make sufficient contact with the patient’s skin. This resulted in “noisy
scratches” in the data, as the patient’s chest moved up and down during breathing. This
situation was exacerbated if the patient had a significant amount of chest hair. Figure
6.3 shows the recording of a normal patient at the 2nd right intercostal space, where the
stethoscope did not make sufficient contact with the skin. Figure 6.4 shows the recording
at the 4th right intercostal space for the same patient, where the stethoscope did make
sufficient contact with the patient’s skin. Figures 6.5 and 6.6 show the denoised signals. It
can be seen that no information could be extracted from the recording at the 2nd intercostal
space in comparison to the recording at the 4th intercostal space. The recording at the 2nd
intercostal space (and thus the whole feature set) could thus not be used to train the Neural
Figure 6.3: Recording of normal patient at 2nd right intercostal space showing noise generated
by insufficient contact between stethoscope and skin (amplitude against time [sec])
Figure 6.4: Recording of normal patient at 4th right intercostal space showing that less noise is
generated with sufficient contact between stethoscope and skin (amplitude against time [sec])
Network.
Figure 6.5: Denoised recording showing that no information could be extracted due to poor
original recording (amplitude against time [sec])
Figure 6.6: Denoised recording showing sufficient information to be extracted
(amplitude against time [sec])
6.1.3 Feature extraction
6.1.3.1 Duration of S2 split
The calculation of the S2 split proved to be more difficult than initially anticipated. The
procedure implemented in Section 4.2.6 gave reasonable results, but in some instances the
correct peaks in the CWT were not identified. For example, if a value that
Figure 6.7: CWT of S2 showing multiple peaks
differed from the maximum value by 10 msec was situated in the range of the first peak and
was greater in value than the second peak, this value would have been incorrectly labelled
as the second component of the S2 split.
No easy solution to this problem seems to exist, since no automated algorithm that is
capable of detecting two peaks in a graph of the kind shown in Figure 6.7 currently exists.
To identify the peaks in 95% of these situations correctly and automatically will need further
research and possibly the construction and training of another neural network to determine
the relationship between these two peaks.
Other techniques that have been implemented by other researchers to identify A2 and
P2 include the use of the carotid pulse [11]. The carotid pulse is a pulse signal recorded over
the carotid artery. The algorithm proposed by Rangayyan [11] uses the dicrotic notch in
the carotid pulse as an indicator of where S2 should start. Synchronised averaging is then
used to enhance the appearance in time of A2, since it will most likely occur at the same
time relative to the start of S2. During inspiration the relative timing of P2 changes, and
because of this, P2 should be minimised by the averaging process.
6.1.4 Classification system
The biggest problem with the classification system is the lack of training data. Unfortunately,
time limits prohibited the collection of more data, and the structure of the neural
network can only be truly evaluated once there is enough training data (probably 100 sets or
more). The fact that the system is capable of distinguishing between normal and abnormal
heart sounds despite the small amount of training data suggests that the approach
followed in the construction and training of the classification system is a promising one.
Other classification techniques should also be investigated, as they may lead to better
results than ANNs. Other techniques that have been implemented in other studies, include
decision tree classifiers [41], linear discriminant functions [42] and Hidden Markov Models
[37]. Unsupervised pattern classification techniques such as the k-means algorithm and
the maximin-distance algorithm could also be researched. Feature reduction techniques
such as principal component analysis and independent component analysis should also be
considered.
The features that were eventually used as inputs to the ANN only evaluated information
from the diastolic region of the heart cycle. It was decided to train and test the ANN with
systolic information added to the inputs, to determine whether this would increase the
sensitivity and specificity of the ANN. Different combinations of systolic data, together
with the diastolic information already present, were presented to the ANN as input data.
Satisfactory results were not achieved. This can be attributed to the fact that the systolic
data of the normal and abnormal heart sounds do not differ sufficiently
to enable the ANN to discriminate properly between the two classes. Better recording
procedures and denoising procedures could improve this result.
6.2 Recommendations concerning the auscultation
jacket
Since this is only the first attempt at a prototype of the auscultation jacket, numerous
improvements could be made. The greatest weakness of the jacket is that the stethoscopes,
and thus the electrodes embedded in the stethoscopes, move relative to the body surface
as the patient breathes. This, in turn, causes the ECG recorded with the jacket to be
unreliable, which is why no more of the ECG information was
used as features in the classification scheme. Examples of such features are the height of the P-wave,
the position of the P-wave with respect to the QRS-complex, whether the QRS-complex is
inverted or not, etc. All of these features could be used to extend the screening capabilities of
the jacket beyond simple auscultation abnormalities. The screening of other cardiovascular
diseases such as myocardial infarction (heart attack), which may be diagnosed by a significant
decrease in the R-wave amplitude due to the loss of tissue, would also be possible.
The side pieces of the jacket also present a problem. They move around and do not
necessarily correspond to the correct position for V6. To correct this problem it is proposed
that the stethoscopes should be fitted with double-sided tape that is capable of fixing the
position of the stethoscope (and thus the electrode) to the skin surface of the patient. This
material should be of such a nature that it could be easily removed from the stethoscope
between recordings.
The two stethoscopes, which correspond to the auscultation positions at the 2nd left
and right intercostal spaces (pulmonary and aortic areas respectively), present a two-fold
problem. Firstly, these two stethoscopes are the main reason why the jacket cannot be used
on women. The current size of the stethoscopes prevents them from making
sufficient contact with the body between the breasts. The second problem is that these
two stethoscopes do not make sufficient contact with some male patients when they are in
the supine position. This leads to noisy data that cannot be used. It is proposed that
the contour surface of the jacket be changed in such a way that it follows the contour of
the body better. This could be done by installing an inflatable bladder in the jacket that can
press the stethoscopes down until sufficient contact is made. Each stethoscope
could also have its own inflatable bladder and could be inflated individually. However, this
concept brings about a number of problems, such as the number of pipes needed to inflate
the jacket, whether it would be possible to seal off the bladder to avoid leakage, what size
pump would be needed to inflate the jacket, etc. All these factors will have to be taken into
account and assessed individually before making use of this idea. Another idea is to embed
the stethoscopes in sponge that is thicker at the positions that correspond to the 2nd left
and right intercostal spaces. This would also ensure that better contact is made with the
skin. Another strap could also be inserted on the side of the jacket at the height of the
2nd intercostal space. This strap could then be tightened around the body to ensure that
sufficient contact is made with the skin at these locations.
The number of cables is also a concern. For a first attempt at a prototype this was not
a major problem, but it would be more practical to reduce the number of cables.
Implementing wireless technology could offer a solution. The hub to which the stethoscopes
and ECG are connected could also be mounted on the jacket instead of hanging loose, as it
does in the current prototype.
As the jacket was fitted on healthy as well as unhealthy patients, it was noted that if
the patient is significantly weakened, as in the case of someone with severe valvular heart
disease, it becomes difficult to fit the person with the jacket. Since the jacket is intended
to be used on patients who do suffer from valvular or other heart disease, this
problem has to be considered and addressed.
The stethoscopes themselves also need attention. During the recording
process the diaphragms of some of the stethoscopes became dislodged through repeated
use. The ECG electrode gel reacted with the glue that held the diaphragms in place, which
aggravated the problem. To counter this, the diaphragm could be built into the
stethoscope by creating a small slot on the inside of the stethoscope casing and inserting the
flexible diaphragm. The diaphragm could also be discarded, since it only acts as a low-pass
filter and any noise that is recorded can be filtered out after the recording.
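As an illustration of filtering after the recording, a minimal sketch of a first-order digital low-pass filter is given below. It is not the filter used in this study; the function name, the sampling rate and the cut-off frequency are assumptions chosen only for the example.

```python
import math


def low_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order (single-pole) IIR low-pass filter.

    sample_rate_hz and cutoff_hz are illustrative parameters, not values
    taken from the recordings described in this thesis.
    """
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # equivalent RC time constant
    alpha = dt / (rc + dt)                   # smoothing factor, 0 < alpha < 1
    filtered = [float(samples[0])]
    for x in samples[1:]:
        # Move the output a fraction alpha towards the new input sample.
        filtered.append(filtered[-1] + alpha * (x - filtered[-1]))
    return filtered


# Hypothetical usage: a signal alternating at the Nyquist frequency is attenuated.
noisy = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
smoothed = low_pass(noisy, sample_rate_hz=4000, cutoff_hz=200)
```

A slowly varying signal passes through almost unchanged, whereas rapid sample-to-sample fluctuations are strongly reduced. A higher-order filter would give a sharper cut-off if that were required.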
The stethoscopes are also very bulky and add a lot of weight to the jacket, although the
casings are manufactured of aluminium. It might be beneficial to consider using
accelerometers instead of microphones to record the heart sounds, as they are much smaller, but this
still has to be researched. Accelerometers have been used in a previous study to record
lung sounds, as described in Pourazad et al. [43]. The authors used two Siemens EMT 25
C piezoelectric contact accelerometers to record the lung sounds.
6.3 Application to telemedicine?
Telemedicine is defined as “the use of telecommunication technology (involving audio, video,
and graphic data) to deliver healthcare services, health education, and administrative ser-
vices to sites that are physically distant from the host or educator” [71]. According to the
Medical Research Council of South Africa, the South African government is “committed
to providing basic health care to all South African citizens” and “to achieve this goal, the
government has identified Telemedicine as a strategic tool for facilitating the delivery of
equitable healthcare and educational services”.
Telemedicine has proved useful and necessary in developing countries. In India, for
example, the Online Telemedicine Research Institute (OTRI) has made a great impact on
the lives of people. In January of 2001, an earthquake hit the city of Bhuj in Western India
and left thousands dead and homeless. Within a day, the OTRI in Ahmedabad established
satellite telephone links and set up all equipment necessary to provide emergency medical
care through telemedicine. Ahmedabad is 300 km from Bhuj. The satellite phones were soon
replaced by VSAT with phone lines and ISDN. A fully-fledged telemedicine system was used
for teleconsultation in pathology, radiology, and cardiology over ISDN lines between
district hospitals near Bhuj and others in Ahmedabad. Seven hundred and fifty sessions,
consisting primarily of X-rays and ECGs of patients, were transmitted in one month to
specialists in Ahmedabad [71].
Although telemedicine has proved helpful in some cases, it also has its limitations. In
some areas, the infrastructure is extremely poor and it is very difficult to implement a
telemedicine centre in an area that has little or no infrastructure. A further problem is that
physicians tend to leave the rural areas for bigger cities and improperly trained technicians
are left in charge of health facilities. These technicians rarely have any experience working on
computers [72]. In the Alto Amazonas province of Peru, 64.9% of the healthcare personnel
have never used a computer and 89% of the healthcare personnel have never used e-mail. It
may be argued that since Peru is also a developing country, more or less the same conditions
exist in South Africa. This poses a big problem in setting up a telemedicine centre in the
rural areas of South Africa.
A lack of roads, administrative problems (information being sent has to be paid for by the
health staff themselves) and no feedback to the rural health centres are just some of the
challenges in such areas. A delay of 13 months in the arrival of information is common
in these areas. Many healthcare personnel have to pay their travel costs themselves.
On average this amounts to 17% of their salary [72]. The use of electronic mail could
reduce the amount of time spent travelling by allowing trips to be coordinated with others.
Training of healthcare personnel could also be improved by supplying information better
and faster to rural areas. Of the healthcare staff in Alto Amazonas, 59% said they do not
attend courses because the information arrives too late or not at all.
On the practical side, many of these areas have no electricity and no public
telecommunication infrastructure, purchasing power is limited, maintenance costs are high due to
the poor infrastructure and few well-trained people are available for managerial positions.
To combat this, any telemedicine equipment installed has to fulfil the following conditions
[72]:
• It must be highly robust
• The technological platform must demand low infrastructure, maintenance and operation
costs
• It must have low energy consumption
• Technical personnel will have to be trained in system management, maintenance, and
repair
However, in sub-Saharan Africa telemedicine has been implemented in several countries
to address the extremely poor medical infrastructure. Sub-Saharan Africa is home to 33 of
the 48 least developed countries in the world and telemedicine would thus have far-reaching
effects in these countries by making proper healthcare accessible to everyone [71].
The final aim of the auscultation jacket is to distribute the jacket to rural areas that do
not have sufficient healthcare facilities. Patients will then be recorded with the jacket and the
data will be sent via a communication link to physicians who are able to interpret the data
and provide feedback on each patient. Many factors have to be taken into consideration
before such a system can be implemented, but the possibility is real and will have far-
reaching positive effects if the process is managed and implemented correctly.
6.4 Other applications
The application of the auscultation jacket can be expanded to include educational
applications as well. For instance, if patient contact is prohibited or limited due to a specific
disease, as in [73] where patient contact was prohibited due to severe acute respiratory
syndrome (SARS), the monitoring of patients could be done with the help of the auscultation
jacket. In the mentioned study, cardiac sounds were recorded with Littmann model 4000
stethoscopes and played back to students to assist in teaching auscultation skills. The
auscultation jacket could have been used to record all the heart and lung sounds, ECG
and ICG for educational purposes at a later stage.
The auscultation jacket, together with the classification system, can be used as a training
tool for students to determine whether a patient has cardiovascular (or pulmonary)
pathology or not. According to Tuchinda et al., teaching cardiac auscultation skills has
been difficult “due to time constraints and the impracticability of examining large numbers
of patients with cardiac pathology” [74]. A database of recordings made with the
auscultation jacket can be compiled so that students can access recordings made at different
locations on the body in their own time. This would enable them to make their own
diagnosis and check it against the results of the classification system. This eliminates the need
for examination of a large number of patients. The doctor-patient relationship is a very
necessary and important one and it is not proposed to do away with this relationship.
According to March et al., “the ability of many of today’s health care professionals
to correctly identify normal and abnormal heart sounds continues to diminish” [4]. The
auscultation jacket and classification system can help overcome this obstacle by providing
diagnostic information to physicians who are uncertain of a specific pathology.
Appendices
Appendix A
Relevant technologies
Many techniques exist for diagnosing heart disease. These techniques include
electrocardiography (ECG), echo-cardiography, impedance cardiography (ICG), nuclear
stress testing, coronary angiogram, computed tomography (CT) scan, PET (positron
emission tomography)/CT scan and magnetic resonance imaging (MRI). Only the ECG,
echo-cardiography and ICG will be discussed here.
A.1 Electrocardiogram (ECG)
The Electrocardiogram (ECG) measures the electrical activity of the human heart. The
electrical impulses that make the heart contract spread across the heart in a specific manner
and any deviation from this could indicate pathology. To explain and understand the ECG
waveform, the electrical system of the heart first has to be explained. Refer to Figure A.1
for a diagram of the heart and its electrical system. A normal ECG graph is shown in Figure
A.2. This ECG was recorded with the auscultation jacket. The heart’s pacemaker is the
Sinoatrial (SA) node. Action potentials¹ are generated here and travel from here through
the rest of the electrical system. The action potential first travels down the anterior, middle
and posterior internodal tracts as well as Bachmann’s bundle. In doing so, the muscle cells of
the atria depolarise (the membrane potential is raised from -90 mV towards 0 mV) and the atria
begin to contract. This corresponds to the P-wave in the ECG waveform and is shown as
section C in Figure A.2. The P-wave has a duration of 60–80 ms. When the action potential
arrives at the AV node, there is a delay of approximately 60–80 ms [11]. This is known as
the P-Q segment and is shown as section D in Figure A.2. The action potential now travels
down the left and right bundle branches in the interventricular septum, into the conduction
pathways (also known as Purkinje fibres). As the action potential moves upward through
the heart from the apex, the ventricles contract, resulting in the QRS-complex in the ECG
waveform (section E in Figure A.2), which lasts about 80 ms. The QRS-complex is relatively
large in comparison to the other waveforms in the ECG, since the mass of the ventricles is
much larger than the mass of the atria. The action potential duration of ventricular muscle
cells is relatively long (300–350 ms) [11]. This results in a section of little activity after
the QRS-complex, known as the S-T segment (section F), which lasts for about 100–120
ms. Repolarisation of the ventricles (the membrane potential of the cardiac muscle cells
returns to -90 mV) lasts for about 120–160 ms and can be seen in the ECG as the T-wave (section
G). Section A is known as the P-Q interval and shows the electrical potential of the atria.
Section B is known as the Q-T interval and shows the electrical potential of the ventricles
during a single cardiac cycle.
Figure A.1: Schematic of the electrical system of the human heart [75]
¹ A change in the membrane potential of cells (from the normal -90 mV), initiated by a change in the
membrane permeability to sodium ions [10].
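The nominal durations quoted above can be added up as a quick consistency check. The sketch below assumes that the P-Q interval (section A) spans the P-wave and the P-Q segment, and that the Q-T interval (section B) spans the QRS-complex, the S-T segment and the T-wave; the names used are illustrative only.

```python
# Nominal durations in milliseconds (minimum, maximum), as quoted in the text.
DURATIONS_MS = {
    "P-wave": (60, 80),
    "P-Q segment": (60, 80),
    "QRS-complex": (80, 80),
    "S-T segment": (100, 120),
    "T-wave": (120, 160),
}


def interval_range(parts):
    """Sum the minimum and maximum durations of the listed ECG sections."""
    shortest = sum(DURATIONS_MS[p][0] for p in parts)
    longest = sum(DURATIONS_MS[p][1] for p in parts)
    return shortest, longest


# Assumed composition of sections A and B in Figure A.2.
pq_interval = interval_range(["P-wave", "P-Q segment"])
qt_interval = interval_range(["QRS-complex", "S-T segment", "T-wave"])
```

Under these assumptions the P-Q interval is 120 to 160 ms and the Q-T interval is 300 to 360 ms, which is of the same order as the 300–350 ms action potential duration of the ventricular muscle cells.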
Sometimes a U-wave is present after the T-wave (not present in Figure A.2), but the
origin of the U-wave is still a topic of debate [76]. Three hypotheses exist on the genesis of
the U-wave: late repolarisation of Purkinje fibres, late repolarisation of other portions of the
left ventricle and alteration in the normal action potential shape by after-potentials, which
are most likely generated by mechano-electric feedback [76]. The U-wave has the same
polarity as the T-wave in normal subjects; when the polarity of the U-wave is reversed, it
is, therefore, of great clinical importance [76].
The electrical potentials are measured by electrodes placed on the surface of the skin.
The electrodes are placed at different positions on the body, depending on which ECG
configuration is being used, e.g. 12-lead ECG, 6-lead ECG, 3-lead ECG or 1-lead ECG. For
a 12-lead ECG, six electrodes are placed on the thorax in the V1-V6 positions (refer to Figure
A.3). For the remaining four electrodes there are two possible combinations: one electrode
can be placed on each wrist and foot, or one electrode can be placed on each shoulder and
hip (refer to Figure A.4 and Figure A.5).
Figure A.2: Normal ECG wave (amplitude against time in seconds, with the P, Q, R, S and T
waves and sections A to G marked)
The graphs that are displayed on the standard ECG recording correspond to how the
action potentials spread through different axes of the heart. In effect, one looks at how
the heart contracts from different angles around the heart. Specific patterns are associated
with each view and any deviation from this could indicate pathology. The different axes are
shown in Figure A.6.
To understand how the deflections for specific axes are formed, the volume conductor
principle has to be explained. Consider a mass of ventricular muscle placed in a bath of salt
water. In the resting state of the ventricular muscle, the outside of the cells is positively
charged with respect to the inside of the cell. For heart muscle cells, the resting membrane
potential (RMP) is approximately -90 mV [10]. If two electrodes are placed on either side
of the ventricular mass, no potential difference will be measured between the electrodes, as
in Figure A.7. When an action potential spreads across the heart (from the negative towards
the positive electrode in this case), however, the outside of some of the muscle cells on the
left side of the ventricular mass becomes negatively charged with respect to the inside (due
to the in- and outflow of sodium and potassium ions) and a positive deflection is measured
on the positive electrode.
Figure A.3: V1-V6 positions [77]
Figure A.4: Configuration for ECG electrodes on wrists and feet [77]
Figure A.5: Configuration for ECG electrodes on shoulders and hips [77]
A.1.1 Atrial depolarisation
Atrial depolarisation starts with the activation of the SA node. As the wave of depolarisation
spreads across the atria, some of the muscle cells are negatively charged with respect to the
inside (depolarised) and some of the muscle cells are still at their RMP (polarised) and,
therefore, positive with respect to the inside. A positive deflection is thus seen on the ECG
tracing. Once all of the atrial muscle cells are depolarised, the net potential difference is
again zero and no deflection is seen. Figure A.8 explains this, as well as the spread of the
repolarisation wave, schematically. When repolarisation occurs, the opposite happens. The
muscle cells that were first depolarised are the first to repolarise and their net charge with
respect to the inside of the cell is once again positive. Because the cells that are still
negative are now closer to the positive electrode, a net negative potential difference exists
between the two electrodes and a negative deflection is seen in the ECG tracing. Once all the
cells are positive with respect to the inside (repolarisation is complete), the net potential
difference is zero and no deflection is seen on the ECG tracing.
Figure A.6: Heart axes as viewed by different leads
Figure A.7: Potential difference between two sides of ventricular muscle mass is zero when there
is no depolarisation wave, and positive when depolarisation moves towards the positive electrode
[78]
Figure A.8: Spread of atrial depolarisation and repolarisation waves and resulting deflections in
ECG tracing [78]
Figure A.9: Spread of ventricular depolarisation wave showing resulting deflections in ECG
tracing [78]
A.1.2 Ventricular depolarisation
The spread of the depolarisation wave across the ventricles is shown in Figure A.9. This
explanation is based on lead II of the ECG axis. The ventricular depolarisation waves
originate at the AV node and first spread down the interventricular septum through the
left and right bundle branches (refer to Figure A.1). The septum thus depolarises from left to
right, as shown in sketch A of Figure A.9. When viewing Figure A.9, it should be kept in
mind that the left side of the heart is to the right of the sketch. The plus and minus in each
sketch show the position of the positive and negative electrodes for the lead II orientation.
The mean electrical vector is orientated in such a way that it moves away from the positive
electrode and the result is a negative deflection in the ECG tracing. This corresponds to
the Q-wave in the QRS-complex.
The depolarisation wave now moves further down the septum, reaches the apex of
the heart and begins to move through the Purkinje fibres (sketch B in Figure A.9). The
mean electrical vector is almost parallel to the orientation of lead II and moves towards the
positive electrode, thus resulting in a considerable positive deflection in the ECG tracing.
This corresponds to the R-wave in the QRS-complex of the ECG tracing.
Next the wave moves up the ventricles, almost totally depolarising the right ventricle
and partially depolarising the left ventricle. This is because the left ventricle is much larger
than the right ventricle. The mean electrical vector is orientated as shown in sketch C of
Figure A.9 and results in a small positive deflection in the ECG tracing. The last regions to
depolarise are the topmost areas of the ventricles and the resulting mean electrical vector
points upwards (towards the negative electrode) and to the left, resulting in a minor negative
deflection in the QRS-complex (the S-wave).
During ventricular repolarisation, the cells that were depolarised last are the first ones
to repolarise. The repolarisation waves therefore move in the opposite direction to the
depolarisation waves, resulting in a positive deflection in the ECG tracing, whereas
atrial repolarisation results in a negative deflection in the ECG tracing.
Figure A.10: Standard bipolar limb leads for a 12-lead ECG configuration [78]
As the heart contracts (depolarises), a multitude of depolarisation waves spread across
the heart. The mean electrical vector is the sum of all these vectors (waves) of depolarisation
at a specific instant in time [78]. Physicians often refer to the mean electrical axis of a
patient. This refers to the average of all the mean electrical vectors and is normally in
the range of 0° to +90°. If the mean electrical axis is less than 0° it is termed left axis
deviation and may be indicative of diseases such as inferior myocardial infarction or left
anterior hemiblock [57]. If the mean electrical axis is greater than +90° it is termed right
axis deviation. In order to determine the mean electrical axis, one should first find the
electrical axis whose tracing is biphasic (equal positive and negative deflections). Next, the
electrical axis that is perpendicular to this biphasic axis, and that has a net positive deflection,
should be established. The latter axis is then the mean electrical axis.
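The procedure described above can be sketched in a few lines of code. The sketch assumes the conventional frontal-plane angles of the six limb leads (I at 0°, II at +60°, III at +120°, aVF at +90°, aVL at -30° and aVR at -150°), which are not stated in the text, and it only resolves the axis to the nearest lead direction. All function and variable names are hypothetical.

```python
# Conventional frontal-plane lead angles in degrees (assumed, not from the text).
LEAD_ANGLE_DEG = {"I": 0, "II": 60, "III": 120, "aVR": -150, "aVL": -30, "aVF": 90}

# Each lead paired with the lead that lies perpendicular to it.
PERPENDICULAR = {"I": "aVF", "aVF": "I", "II": "aVL", "aVL": "II",
                 "III": "aVR", "aVR": "III"}


def normalise(angle_deg):
    """Wrap an angle into the range (-180, 180] degrees."""
    wrapped = (angle_deg + 180) % 360 - 180
    return 180 if wrapped == -180 else wrapped


def mean_electrical_axis(net_deflection):
    """Estimate the mean electrical axis from net QRS deflections per lead.

    net_deflection maps a lead name to (positive minus negative) deflection.
    """
    # Step 1: the most biphasic lead has the smallest net deflection.
    biphasic = min(net_deflection, key=lambda lead: abs(net_deflection[lead]))
    # Step 2: take the perpendicular lead and point along its positive side.
    perpendicular = PERPENDICULAR[biphasic]
    angle = LEAD_ANGLE_DEG[perpendicular]
    if net_deflection[perpendicular] < 0:
        angle += 180   # the axis points away from the positive electrode
    return normalise(angle)


def classify_axis(axis_deg):
    """Apply the 0 to +90 degree normal range quoted in the text."""
    if axis_deg < 0:
        return "left axis deviation"
    if axis_deg > 90:
        return "right axis deviation"
    return "normal"


# Hypothetical net deflections (arbitrary units) for one patient.
example = {"I": 5.0, "II": 8.0, "III": 3.0, "aVR": -6.0, "aVL": 0.2, "aVF": 6.0}
example_axis = mean_electrical_axis(example)
```

In this hypothetical example lead aVL is the biphasic lead, lead II is perpendicular to it and has a net positive deflection, and the estimated axis therefore lies along lead II.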
A.1.3 The lead system
The volume conductor principle also applies when viewing the deflections of a specific ECG
lead (or axis). For the different ECG axes (I, II, V1, aVF etc.), the positive and negative
electrodes are placed at different locations on the body and, thus, different deflections will
be seen by each one. For example, the standard bipolar limb leads are known as Leads I, II
and III and their electrodes are placed on the body as shown in Figure A.10.
As can be seen in Figure A.10, for lead I the positive electrode is situated on the left arm
(LA) and the negative electrode is situated on the right arm (RA). For leads II and III the
positive electrode is situated on the left leg (LL) and the negative electrodes are placed on
the right arm and left arm respectively. Together these three leads form what is known as
Einthoven’s Triangle. They examine the depolarisation of the heart from different angles and
together form the Axial Reference System, as shown in Figure A.11. According to the
volume conductor principle, a wave of depolarisation that is moving towards the positive
electrode of lead I will produce a positive deflection in lead I. The same applies to all the
other leads. If the wave of depolarisation moves towards a positive electrode, a positive
deflection will be seen and if the wave of depolarisation moves away from the positive
electrode, a negative deflection will be seen. Similarly, if a wave of repolarisation
moves towards the positive electrode, a negative deflection will be seen, whereas if the
repolarisation wave moves towards the negative electrode, a positive deflection will be seen.
These rules are universally accepted and apply to all ECG measurements [78].
Figure A.11: Einthoven’s Triangle and the Axial Reference System [78]
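The four rules above can be summarised by projecting the direction of the wave onto the axis of the lead. The following sketch assumes that the lead axis is defined as pointing from the negative to the positive electrode and that directions are given as angles in the frontal plane; the function name is illustrative.

```python
import math


def deflection(wave_direction_deg, lead_axis_deg, repolarisation=False):
    """Relative deflection seen by a lead for a wave moving in a given direction.

    lead_axis_deg is assumed to point from the negative to the positive
    electrode. The result is positive for an upward deflection, negative for
    a downward deflection and close to zero for a perpendicular wave.
    """
    projection = math.cos(math.radians(wave_direction_deg - lead_axis_deg))
    # A repolarisation wave produces the opposite deflection to a
    # depolarisation wave travelling in the same direction.
    return -projection if repolarisation else projection
```

For a lead axis at 60°, for example, `deflection(60, 60)` is positive (depolarisation towards the positive electrode), whereas `deflection(60, 60, repolarisation=True)` is negative.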
The leads aVR, aVL and aVF are known as the unipolar augmented limb leads. They
are termed “unipolar” because a single positive electrode is referenced against a
combination of all the other limb electrodes [78]. The positive electrodes are situated on the
left arm (aVL), right arm (aVR) and left leg (aVF). The position of these electrodes and
their positions on the Axial Reference System are shown in Figure A.12.
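The electrode arrangement described for the limb leads can be written out as simple differences of electrode potentials. In the sketch below the bipolar leads follow the electrode positions given above, while the augmented leads are assumed to reference the positive electrode against the average of the other two limb electrodes; the function name and the example potentials are hypothetical.

```python
def limb_leads(ra, la, ll):
    """Compute the six frontal-plane leads from limb electrode potentials.

    ra, la and ll are the potentials at the right arm, left arm and left leg.
    """
    return {
        # Bipolar limb leads: positive electrode minus negative electrode.
        "I": la - ra,
        "II": ll - ra,
        "III": ll - la,
        # Augmented leads (assumed reference: mean of the other two electrodes).
        "aVR": ra - (la + ll) / 2,
        "aVL": la - (ra + ll) / 2,
        "aVF": ll - (ra + la) / 2,
    }


# Hypothetical instantaneous electrode potentials in mV.
leads = limb_leads(ra=-0.2, la=0.3, ll=0.6)
```

Two properties follow directly from these definitions: lead II equals the sum of leads I and III, and the three augmented leads sum to zero at every instant.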
The three limb leads and the three augmented limb leads view the electrical activity of
the heart from the frontal plane. In addition to this, there are six precordial unipolar chest
leads (V1 - V6) that view the electrical activity of the heart in a plane perpendicular to the
frontal plane. Their position on the chest and in the perpendicular plane is shown in Figure
A.13.
The deflections in these tracings are produced in the same way as for all the other leads.
Leads V1 and V2 view the anterior septal region of the heart, leads V3 and V4 view the
anterior apical (apex) region of the heart and leads V5 and V6 view the anterior lateral
region of the heart.
A.2 Echo-cardiography
The echo-cardiogram uses ultrasound waves to examine the heart of a patient. An
echo-cardiogram is also used in evaluating the fetus of a pregnant female. Ultrasonic gel is
applied to the thorax of the individual on the area of interest to aid in the transmitting of
the ultrasonic waves. A transducer then sends ultrasonic waves through the body and these
are reflected back by the heart, collected by the transducer and transformed to an image of
the heart.
Figure A.12: Unipolar augmented limb leads position [78]
Figure A.13: Precordial unipolar chest leads positions [78]
The single dimension (1-D) echo-cardiogram is known as M-mode echo-cardiography. In
M-mode echo-cardiography a narrow beam is directed towards the region of interest and the
output is the movement of the structures through which the beam passes as a function of
time. Characteristic patterns are associated with certain pathologies such as mitral stenosis
and pericardial effusions, which are easily recognised [12].
A more advanced type of echo-cardiography, 2-D echo, gives a two-dimensional sectional
view of the heart. This is probably the most well-known type of echo-cardiogram. In
this type of echo-cardiogram the transducer beam moves across the chest wall in a sweeping
manner, updating the picture with each sweep. The transducer can either be of the sweeping
or rotating type. In the rotating transducer the head has a multitude of transducers inside
a liquid-filled dome. One transducer emits a beam and receives the echo back and the next
transducer in line takes over and does the same and so on and so forth. In this manner
the picture is continuously updated so that the movement of the structures is displayed
in real-time [79]. Figure A.14 shows how the transducer beam sweeps across the heart and
the resulting image formed. Figure A.15 shows an echo-cardiogram of the four chambers of
the heart.
Figure A.14: Sweeping of echo-cardiography transducer beam and how resulting image is formed
[79]
Figure A.15: Echo-cardiogram of normal heart showing different chambers [80]
A.3 Impedance cardiography (ICG)
Impedance cardiography is a technique by means of which the resistance of the thorax to
a small current is measured. From this measurement, parameters such as cardiac output,
stroke volume, etc. are calculated. Eight electrodes are placed on the thorax and neck for
use in the measurements (see Figure A.16). The inner electrodes (white & red) are the
sensing electrodes, while the outer electrodes (black & green) are the source and sink of the
measurement current. The white electrodes must be placed along the line of the root of
the neck and the red electrodes are placed on either side (midaxillary line) of the patient
at the xiphoid process level (diaphragm level). The black electrodes are placed 5 cm above
the white electrodes and the green electrodes are similarly placed below the
red electrodes.
Figure A.16: Electrode positions for ICG measurements
A high-frequency current (65 kHz, 7 µA RMS) is introduced between the black and
the green electrode pairs. The current flows through the thorax, parallel to the spine,
primarily through the aorta and superior and inferior vena cavae, since this is the path of
least resistance. Due to this high-frequency current, a high-frequency voltage is developed
across the thorax and sensed by the white and red electrode pairs. This voltage is directly
proportional to the impedance of the thorax [81]. As the impedance, known as Thoracic
Electrical Bioimpedance (TEB), in the thorax changes, the changes are measured and certain
parameters are calculated.
Before one starts an ICG test, the height (H) in cm and weight (W) in kg of the patient
have to be entered into the program. The ideal weight of a male of the specified height is
calculated by:
$$W_{\text{ideal male}} = 0.524 \times H - 16.58 \tag{A.3.1}$$
The volume of electrically participating tissue (VEPT) is then calculated as:
$$\mathit{VEPT}_{\text{male}} = \frac{(0.17 \times H)^{3}}{4.25} \times \left(1 + 0.65 \times \left(\frac{W}{W_{\text{ideal male}}} - 1\right)\right) \tag{A.3.2}$$
The body surface area (BSA) is calculated from the DuBois & DuBois [81] formula as:
$$\mathit{BSA} = W^{0.425} \times H^{0.725} \times 0.007184 \tag{A.3.3}$$
The stroke index (SI)² and cardiac index (CI)³ can then be calculated by:
$$SI_{\text{actual}} = \frac{2}{\mathit{BSA}} \times \frac{\mathit{VEPT}_{\text{male}}}{6800} \times SI_{\text{TEBCO}} \tag{A.3.4}$$
and
$$CI_{\text{actual}} = \frac{2}{\mathit{BSA}} \times \frac{\mathit{VEPT}_{\text{male}}}{6800} \times CI_{\text{TEBCO}} \tag{A.3.5}$$
where $SI_{\text{TEBCO}}$ and $CI_{\text{TEBCO}}$ are standardised values for a male of height (H) = 180 cm and
weight (W) = 80 kg, transmitted by TEBCO and are equal to 49 ml/m² and 4.9 l/min/m²
respectively.
² The amount of blood pumped by the left ventricle in one heart beat interval, indexed by the BSA
[ml/m²].
³ The amount of blood delivered by the heart to the body in one minute, indexed by the BSA [l/min/m²].
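Equations A.3.1 to A.3.5 can be evaluated with a short script. The sketch below is only an illustration of the arithmetic and assumes that the weight-correction factor in Equation A.3.2 multiplies the fraction; the function names are hypothetical and do not correspond to the ICG software.

```python
def ideal_weight_male(height_cm):
    """Ideal weight in kg for a male of the given height (Equation A.3.1)."""
    return 0.524 * height_cm - 16.58


def vept_male(height_cm, weight_kg):
    """Volume of electrically participating tissue (Equation A.3.2)."""
    weight_ratio = weight_kg / ideal_weight_male(height_cm)
    return (0.17 * height_cm) ** 3 / 4.25 * (1 + 0.65 * (weight_ratio - 1))


def body_surface_area(height_cm, weight_kg):
    """Body surface area in m^2 from the DuBois & DuBois formula (A.3.3)."""
    return weight_kg ** 0.425 * height_cm ** 0.725 * 0.007184


def actual_index(index_tebco, height_cm, weight_kg):
    """Scale a standardised TEBCO index to the patient (A.3.4 and A.3.5)."""
    bsa = body_surface_area(height_cm, weight_kg)
    vept = vept_male(height_cm, weight_kg)
    return 2 / bsa * vept / 6800 * index_tebco


# Reference male from the text: H = 180 cm, W = 80 kg.
si_actual = actual_index(49.0, 180, 80)   # stroke index, ml/m^2
ci_actual = actual_index(4.9, 180, 80)    # cardiac index, l/min/m^2
```

For the reference male the BSA evaluates to approximately 2 m² and the VEPT to a value close to 6800, so both scaling factors are close to one and the calculated indices remain close to the standardised values, as would be expected.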
The parameters that were measured are (definitions were obtained from [53; 82; 81]):
• Heart rate: The heart rate is the number of times the heart beats in one minute and
is measured in beats per minute.
• Ventricular ejection time (VET): VET is the time from aortic valve opening to
closure during the systolic portion of the cardiac cycle. The duration of VET is
shortened by heart failure.
• Pre-ejection period (PEP): PEP represents the time from onset of electrical
activity of the heart (measured by the start of the QRS-complex) to the opening of the
aortic valve (the onset of left ventricular ejection). PEP is shortened by
hyperadrenergic⁴ states and prolonged by heart failure.
• Thoracic fluid conductivity (TFC): This is the total conductivity of the thorax,
measured at 65 kHz between the root of the neck and the diaphragm. TFC represents
the total contribution of all the conductive fluids in the thorax.
• Ejection phase contractility index (EPCI): This represents a combination signal
and takes into account the maximum rate of volumetric change of blood within the
aorta and the maximum rate of alignment of red blood cells. The measurement is
normalised by TFC to produce a per second rate.
• Inotropic State Index (ISI): ISI represents a normalised image of maximum
acceleration of aortic blood flow and is measured in 1/sec².
• Ejection Fraction (EF): This is the percentage of the blood held within the ventricle
at the end of diastole that is ejected into the vasculature.
⁴ Adrenergic refers to a synaptic terminal that releases norepinephrine upon stimulation [10].
Hyperadrenergic then refers to a state in which abnormally large amounts of norepinephrine are
released upon stimulation.
• Stroke Index (SI): The SI is the volume of blood pumped by the left ventricle over
one heart beat interval, indexed by the body surface area (BSA). It is measured in
ml/m².
• Cardiac Index (CI): This is the amount of blood pumped by the left ventricle in
one minute, indexed by the body surface area (BSA). It is measured in l/min/m².
• Respiratory Rate (RR): This is the number of breaths per minute.
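The definitions of SI, CI and heart rate above imply a simple relation between the three: a
volume per beat multiplied by the number of beats per minute gives a volume per minute. A
minimal sketch of this unit conversion follows. The function name is hypothetical, and the
relation is derived from the definitions rather than taken from the TEBCO documentation, so
it should not be read as the way the module itself computes CI.

```python
def cardiac_index_from_stroke_index(si_ml_per_m2, heart_rate_bpm):
    """Return CI in l/min/m^2 from SI in ml/m^2 and heart rate in beats/min."""
    ml_per_litre = 1000.0
    return si_ml_per_m2 * heart_rate_bpm / ml_per_litre


# Illustrative values only, not measured data.
ci = cardiac_index_from_stroke_index(si_ml_per_m2=40.0, heart_rate_bpm=70.0)
print(f"CI = {ci:.2f} l/min/m^2")
```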
Appendix B
Data sheets and data tables
This appendix gives the data sheet for the condenser microphones used, together with tables of
the extracted features and their respective SOF. The selected features are indicated.
Feature                                                  SOF     Selected (Y/N)
RMS-value of 3rd section of diastole - 2nd IC right      1.6153  Y
RMS-value of 3rd section of diastole - 2nd IC left       1.6153  Y
Max frequency of 1st section of diastole - 4th IC right  1.3869  Y
S1 duration                                              1.3213  N
P-R interval duration                                    1.2296  N
RMS-value of 2nd section of diastole - 2nd IC right      1.1271  N
RMS-value of 2nd section of diastole - 2nd IC left       1.1271  N
Frequency band ratio of S2                               0.9828  N
Max frequency of 2nd section of diastole - 2nd IC right  0.9289  N
Max frequency of 2nd section of diastole - 2nd IC left   0.9289  N
RMS-value of 2nd section of systole - 5th IC left        0.9142  N
Max frequency of 3rd section of systole - 2nd IC right   0.9033  N
Max frequency of 3rd section of systole - 2nd IC left    0.9033  N
Average power of S1 to S2 ratio - 2nd IC right           0.8986  N
Average power of S1 to S2 ratio - 2nd IC left            0.8986  N
RMS-value of 2nd section of diastole - 6th IC left       0.7827  N
Max frequency of 2nd section of systole - 5th IC left    0.7801  N
Max frequency of 3rd section of diastole - 2nd IC right  0.7434  N
Max frequency of 3rd section of diastole - 2nd IC left   0.7434  N
Average power of S1 to S2 ratio - 5th IC left            0.7245  N
RMS-value of 3rd section of diastole - 6th IC left       0.6951  N
Average power of S1 to S2 ratio - 6th IC left            0.6941  N
Max frequency of 1st section of systole - 4th IC right   0.6859  N
Max frequency of 2nd section of systole - 4th IC left    0.6754  N
Ejection sound power                                     0.6511  N
Max frequency of 3rd section of diastole - 4th IC left   0.6496  N
RMS-value of 2nd section of systole - 6th IC left        0.6415  N
RMS-value of 2nd section of systole - 2nd IC right       0.6389  N
RMS-value of 2nd section of systole - 2nd IC left        0.6389  N
Max frequency of 1st section of systole - 4th IC left    0.6207  N
Max frequency of 3rd section of diastole - 4th IC right  0.5942  N
RMS-value of 3rd section of systole - 2nd IC right       0.5619  N
RMS-value of 3rd section of systole - 2nd IC left        0.5619  N
S1 beat-to-beat power comparison - 5th IC left           0.5345  N
Max frequency of 1st section of systole - 2nd IC right   0.5314  N
Table B.1: Extracted features and their respective SOF
Feature                                                  SOF     Selected (Y/N)
Max frequency of 1st section of systole - 2nd IC left    0.5314  N
Max frequency of 3rd section of systole - 4th IC right   0.5255  N
Midsystolic click power                                  0.4822  N
Max frequency of 2nd section of diastole - 4th IC left   0.4655  N
A2-P2 split timing - 2nd IC right                        0.4416  N
Max frequency of 1st section of diastole - 4th IC left   0.4345  N
RMS-value of 1st section of systole - 5th IC left        0.4244  N
Max frequency of 2nd section of diastole - 4th IC left   0.4147  N
RMS-value of 3rd section of diastole - 5th IC left       0.4126  N
Max frequency of 3rd section of systole - 4th IC left    0.4121  N
Frequency band ratio of S1                               0.4115  N
RMS-value of 1st section of diastole - 2nd IC right      0.4101  N
RMS-value of 1st section of diastole - 2nd IC left       0.4101  N
Max frequency of 3rd section of diastole - 5th IC left   0.3972  N
Max frequency of 2nd section of systole - 2nd IC right   0.3932  N
Max frequency of 2nd section of systole - 2nd IC left    0.3932  N
Opening snap power                                       0.3931  N
RMS-value of 1st section of systole - 6th IC left        0.3800  N
RMS-value of 1st section of diastole - 5th IC left       0.3534  N
Max frequency of 2nd section of systole - 4th IC right   0.3253  N
Max frequency of 1st section of systole - 5th IC left    0.2810  N
S1 beat-to-beat power comparison - 6th IC left           0.2790  N
Max frequency of 1st section of diastole - 5th IC left   0.2719  N
RMS-value of 2nd section of diastole - 5th IC left       0.2324  N
Max frequency of 3rd section of systole - 5th IC left    0.2311  N
RMS-value of 3rd section of systole - 6th IC left        0.2183  N
S2 duration                                              0.1979  N
Max frequency of 2nd section of diastole - 5th IC left   0.1820  N
Max frequency of 1st section of diastole - 2nd IC right  0.1360  N
Max frequency of 1st section of diastole - 2nd IC left   0.1360  N
A2-P2 split timing - 2nd IC left                         0.1104  N
RMS-value of 3rd section of systole - 4th IC right       0.1014  N
RMS-value of 1st section of diastole - 6th IC left       0.0595  N
RMS-value of 1st section of systole - 2nd IC right       0.0218  N
RMS-value of 1st section of systole - 2nd IC left        0.0218  N
Table B.2: Extracted features and their respective SOF (continued)
Appendix C
Gradient descent algorithm
The gradient descent algorithm is an optimisation technique by which the minimum of a
function is found. If the maximum of the function is sought, the method is known as the
gradient ascent algorithm.
The gradient descent algorithm starts from an initial estimate of the minimum point of the
function J(ζ1, ζ2) = J(ζ). The new ζ is calculated by
\[ \zeta_{new} = \zeta_{old} + \Delta\zeta \qquad (C.1) \]
where
\[ \Delta\zeta = -\mu \frac{\partial J(\zeta)}{\partial \zeta} \qquad (C.2) \]
and µ is positive [68]. Figure C.1 shows the contour plot of a function with the minimum
of the function indicated by X. If the initial point is chosen at x1, the gradient descent
algorithm searches in the direction of the negative of the gradient. The straight line drawn
at the point x1 is tangent to the contour through that point, and the negative of the gradient
is perpendicular to this tangent. The algorithm then calculates the new values of ζ and moves
to the following point, say x2. The process is repeated until the minimum value of the function
is reached at point X.
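The iteration of equations C.1 and C.2 can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the implementation used in this thesis: the
quadratic test function, the stopping tolerance and the names gradient_descent and grad_J are
all chosen for the example.

```python
import math


def gradient_descent(grad, zeta0, mu=0.1, tol=1e-8, max_iter=10_000):
    """Repeat zeta_new = zeta_old - mu * dJ/dzeta until the step is negligible."""
    zeta = list(zeta0)
    path = [tuple(zeta)]
    for _ in range(max_iter):
        delta = [-mu * g for g in grad(zeta)]        # equation C.2
        zeta = [z + d for z, d in zip(zeta, delta)]  # equation C.1
        path.append(tuple(zeta))
        if math.sqrt(sum(d * d for d in delta)) < tol:
            break
    return zeta, path


# Assumed test function: J = (z1 - 3)^2 + 2 * (z2 + 1)^2, with its minimum at (3, -1).
def grad_J(zeta):
    return [2.0 * (zeta[0] - 3.0), 4.0 * (zeta[1] + 1.0)]


zeta_min, path = gradient_descent(grad_J, zeta0=[0.0, 0.0], mu=0.1)
print(f"estimate after {len(path) - 1} steps: {zeta_min}")
```

The list path holds the successive points visited by the algorithm, which correspond to the
sequence x1, x2, ... described above.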
The amount by which the algorithm steps towards the minimum at each iteration depends
on the learning rate µ, as shown in equation C.2. If the learning rate is too large,
the algorithm might miss the minimum by overstepping, whereas convergence may take a
long time if the learning rate is too small [68]. If the learning rate is chosen correctly, the
algorithm converges to a point where the gradient is zero. This point is not necessarily
the global minimum of the function; it may instead be a local minimum or a saddle point.
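The effect of the learning rate can be seen on a one-dimensional example. The sketch below
assumes the function J(ζ) = ζ², whose gradient is 2ζ, and three learning rates picked purely
for illustration; none of these values is taken from the thesis.

```python
def run_gradient_descent(mu, zeta0=1.0, n_iter=50):
    """Apply zeta <- zeta - mu * dJ/dzeta for J = zeta**2 and return the final zeta."""
    zeta = zeta0
    for _ in range(n_iter):
        zeta = zeta - mu * 2.0 * zeta
    return zeta


# Distance from the minimum (zeta = 0) after 50 iterations for each learning rate.
results = {mu: abs(run_gradient_descent(mu)) for mu in (0.01, 0.4, 1.1)}
for mu, distance in results.items():
    print(f"mu = {mu}: |zeta| = {distance:.3e}")
```

For this function each update multiplies ζ by (1 − 2µ). A small µ therefore shrinks ζ only
slowly, a moderate µ converges quickly, and any µ greater than 1 makes the magnitude of ζ
grow at every step, which is the overstepping described above.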
[Figure: contour plot with axes ζ1 and ζ2, each running from 10 to 100; the points x1 and x2
and the minimum X are marked.]
Figure C.1: Contour plot of a function, showing how the gradient descent algorithm steps towards
the minimum function value
List of References
[1] Kearney, M.: Medical research council statistics of 2003. Heart Foundation South Africa,
April 25 2006. E-mail.
[2] American Heart Association: International Cardiovascular Disease Statistics. April 2006.
Available at: http://www.americanheart.org/downloadable/heart/1140811583642-
InternationalCVD.pdf
[3] Leeder, S., Raymond, S. and Greenberg, H.: A race against time: The challenge of
cardiovascular disease in developing economies. Available at:
http://www.ahpi.health.usyd.edu.au/pdfs/colloquia2004/leederracepaper.pdf [2006, April 25], 2004.
[4] March, S., Bedynek, J. and Chizner, M.: Teaching cardiac auscultation: effectiveness of a
patient-centered teaching conference on improving cardiac auscultatory skills. Mayo Clinic
Proceedings, vol. 80, pp. 1443–1448, 2005.
[5] Saha, G. and Kumar, P.: An efficient heart sound segmentation algorithm for cardiac diseases.
In: Proceedings of the IEEE India Annual Conference. 2004.
[6] Obaidat, M.: Phonocardiogram signal analysis: techniques and performance comparison.
Journal of Medical Engineering & Technology, vol. 17, pp. 221–227, 1993.
[7] Tavel, M.E.: Cardiac auscultation: A glorious past - and it does have a future! Circulation,
vol. 113, pp. 1255–1259, 2006.
[8] Huiying, L., Sakari, L. and Iiro, H.: A heart sound segmentation algorithm using wavelet
decomposition and reconstruction. In: Computers in Cardiology, pp. 1630–1633. 1997.
[9] Abbruscato, C.: The history of auscultation. October 2006.
Available at: http://www.telemedtoday.com/articles/telesteth.html
[10] Martini, F.H. and Bartholomew, E.F.: Essentials of Anatomy and Physiology. 3rd edn.
Prentice Hall, New Jersey, 2003.
[11] Rangayyan, R.M.: Biomedical Signal Analysis: A Case-Study Approach. Wiley-Interscience,
New York, 2002.
[12] Munro, J. and Edwards, C.: Macleod’s Clinical Examination. 8th edn. Churchill Livingstone,
New York, 1993.
[13] University of Chicago: Sinus Rhythm. April 2006.
Available at: http://pediatriccardiology.uchicago.edu/PP/abnl%20rhythm%20for%20pa-
rents%20body.htm
[14] Medterms: Definition of Hypertrophy. May 2006.
Available at: http://www.medterms.com/script/main/art.asparticlekey=25464
[15] Human: October 2006.
Available at: http://www.humans.be/images/sites.jpg
[16] Tavel, M.E.: Cardiac auscultation: A glorious past - but does it have a future? Circulation,
vol. 93, pp. 1250–1253, 1996.
[17] Reed, T.R., Reed, N.E. and Fritzon, P.: Heart sound analysis for symptom detection and
computer-aided diagnosis. Simulation Modelling Practice and Theory, vol. 12, pp. 129–146,
2004.
[18] de Vos, J.P.: Automated Pediatric Cardiac Auscultation. Master’s thesis, Electric & Electronic
Engineering, University of Stellenbosch, Stellenbosch, South Africa, 2005.
[19] Lyons, R.G.: Understanding Digital Signal Processing. 1st edn. Prentice Hall PTR, New
Jersey, 2001.
[20] Smith, S.W.: The Scientist and Engineer’s Guide to Digital Signal Processing. 2nd edn.
California Technical Publishing, San Diego, 1999.
[21] Bhatikar, S.R., DeGroff, C. and Mahajan, R.L.: A classifier based on the artificial neural
network approach for cardiologic auscultation in pediatrics. Artificial Intelligence in Medicine,
vol. 33, pp. 251–260, 2005.
[22] Hall, L.T., Maple, J.L., Agzarian, J. and Abbott, D.: Sensor system for heart sound biomonitor.
Microelectronics Journal, vol. 31, pp. 583–592, 2000.
[23] Kutz, M.: Standard Handbook of Biomedical Engineering & Design. 1st edn. McGraw-Hill,
New York, 2003.
[24] Messer, S.R., Agzarian, J. and Abbott, D.: Optimal wavelet denoising for phonocardiograms.
Microelectronics Journal, vol. 32, pp. 931–941, 2001.
[25] Debbal, S.M. and Bereksi-Reguig, F.: Heartbeat sound analysis with the wavelet transform.
Journal of Mechanics in Medicine and Biology, vol. 4, pp. 133–141, 2004.
[26] Turkoglu, I., Arslan, A. and Ilkay, E.: An expert system for diagnosis of the heart valve
diseases. Expert Systems with Applications, vol. 23, pp. 229–236, 2002.
[27] Bentley, P., Grant, P. and McDonnell, J.: Time-frequency and time-scale techniques for the
classification of native and bioprosthetic heart valve sounds. IEEE Transactions on Biomedical
Engineering, vol. 45, no. 1, pp. 125–128, 1998.
[28] Auger, F., Flandrin, P., Goncalves, P. and Lemoine, O.: Time-Frequency Toolbox for use with
Matlab. CNRS, 1996.
[29] Wavelet Toolbox User’s Guide, Chapter 1. Wavelets: A New Tool for Signal Analysis. Math-
works Inc., 1996.
[30] Gonzalez, R.C. and Woods, R.E.: Digital Image Processing. 2nd edn. Prentice Hall, New
Jersey, 2002.
[31] Gupta, C.N., Palaniappan, R., Swaminathan, S. and Krishnan, S.M.: Neural network
classification of homomorphic segmented heart sounds. Applied Soft Computing, article in
press, 2005.
[32] Cathers, I.: Neural network assisted cardiac auscultation. Artificial Intelligence in Medicine,
vol. 7, pp. 53–66, 1995.
[33] Ölmez, T. and Dokur, Z.: Classification of heart sounds using an artificial neural network.
Pattern Recognition Letters, vol. 24, pp. 617–629, 2003.
[34] Andrisevic, N., Ejaz, K., Rios-Gutierrez, F., Alba-Flores, R., Nordehn, G. and Burns, S.:
Detection of heart murmurs using wavelet analysis and artificial neural networks. Journal of
Biomechanical Engineering, vol. 127, pp. 899–904, 2005.
[35] Akay, Y.M., Akay, M., Welkowitz, W. and Kostis, J.: Noninvasive detection of coronary artery
disease. IEEE Engineering in Medicine and Biology, vol. November/December, pp. 761–764,
1994.
[36] Leung, T.S., White, P., Collis, W., Brown, E. and Salmon, A.P.: Classification of heart
sounds using time-frequency method and artificial neural networks. In: Proceedings of the
22nd Annual EMBS International Conference, July 23-28, pp. 988–991. Chicago IL, 2000.
ISBN 0-7803-6465-1.
[37] El-Hanjouri, M., Alkhaldi, W., Hamdy, N. and Alim, O.A.: Heart diseases diagnosis using
HMM. In: IEEE Melecon, May 7-9, pp. 489–492. Cairo, Egypt, 2002. ISBN 0-7803-7527-0.
[38] Dorf, R.C. (ed.): The Electrical Engineering Handbook, chap. 20. CRC Press LLC, New York,
2000.
[39] Tripathy, S.S.: System for diagnosing valvular heart disease using heart sounds. Master’s thesis,
Department of Computer Science & Engineering, Indian Institute of Technology, Kanpur, India,
2005.
[40] Ricke, A.D., Povinelli, R.J. and Johnson, M.T.: Automatic segmentation of heart sound
signals using hidden Markov models. Available at:
http://povinelli.eece.mu.edu/publications/papers/cinc2005b.pdf, [2006, May 08], 2005.
[41] Pavlopoulos, S.A., Stasis, A.C.H. and Loukis, E.N.: A decision tree-based method for the
differential diagnosis of aortic stenosis from mitral regurgitation using heart sounds. BioMedical
Engineering Online, vol. 3, no. 21, 2004.
[42] Voss, A., Mix, A. and Hübner, T.: Diagnosing aortic valve stenosis by parameter extraction
of heart sound signals. Annals of Biomedical Engineering, vol. 33, no. 9, pp. 1167–1174, 2005.
[43] Pourazad, M., Moussavi, Z., Farahmand, F. and Ward, R.: Heart sounds separation from lung
sounds using independent component analysis. In: Proceedings of the 2005 IEEE Engineering
in Medicine and Biology 27th Annual Conference. 2005.
[44] Carr, J. and Brown, J.: Introduction to Biomedical Equipment Technology. 4th edn. Prentice
Hall, Upper Saddle River, New Jersey, 2001.
[45] Digitize Data.com Inc.: October 2006.
Available at: http://www.digitizedata.com/images/stethoscope.jpg
[46] How condenser microphones work.
[47] Kingstate Electronic Corp.: Electret condenser Microphone structure and theory introduction.
October 2006.
Available at: http://www.kingstate.com.tw/9-5.htm
[48] Stethographics: Multichannel STG system overview. October 2006.
Available at: http://www.stethographics.com/main/productsmultioverview.html
[49] Tapuz Medical Technology Ltd.: ECG electrodes belt. October 2006.
Available at: http://www.tapuz.com/ecg.htm
[50] Medes: VTAMN PROJECT (RNTS 2000) : "Medical Teleassistance Suit". October 2006.
Available at: http://www.medes.fr/home_en.html
[51] Stärz, S.: Development of an Auscultation Jacket for Medical Examination. Master’s thesis,
Mechanical Engineering, University of Stellenbosch, Stellenbosch, South Africa, 2005.
[52] Klabunde, R.: Cardiovascular Physiology Concepts: Factors Promoting Venous Return.
November 2006.
Available at: http://www.cvphysiology.com/Cardiac%20Function/CF018.htm
[53] Buell, J.: A guide to interpreting computerized impedance cardiographic data. Available
at: http://www.cardiobeat.com/Impedance.htm, [2006, September 13], 2002.
[54] Walker, J.S.: A Primer on Wavelets and their Scientific Applications. Chapman & Hall/CRC,
New York, 1999.
[55] Werner, L., Pitts, B. and Gilsdorf, D.: Heartsounds: An interactive auscultation program.
CD-ROM, 2004.
[56] Burke, M.J. and Nasor, M.: The time relationships of the constituent components of the
human electrocardiogram. Journal of Medical Engineering & Technology, vol. 26, no. 1, pp.
1–6, 2002.
[57] van Schalkwyk, J.: The whole ECG - a really basic ECG primer. October 2006.
Available at: http://www.anaesthetist.com/icu/organs/heart/ecg
[58] Liang, H., Lukkarinen, S. and Hartimo, I.: Heart sound segmentation algorithm based on
heart sound envelogram. In: Computers in Cardiology, vol. 24, pp. 105–108. 1997.
[59] Haghighi-Mood, A. and Torry, J.: A sub-band energy tracking algorithm for heart sound
segmentation. In: Computers in Cardiology, pp. 501–504. 1995.
[60] Johnson, G., Adolph, R. and Campbell, D.: Estimation of the severity of aortic valve stenosis
by frequency analysis of the murmur. Journal of the American College of Cardiology, vol. 1,
no. 5, pp. 1315–1323, 1983.
[61] Lazar, J.: Atrial fibrillation. October 2006.
Available at: http://www.emedicine.com/EMERG/topic46.htm
[62] Levine, M.: Third degree heart block. October 2006.
Available at: http://www.emedicine.com/emerg/topic235.htm
[63] Molson Medical Informatics Project: The EKG waveform. October 2006.
Available at: http://sprojects.mmi.mcgill.ca/cardiophysio/EKGPRinterval.htm
[64] Zill, D. and Cullen, M.: Advanced Engineering Mathematics. 2nd edn. Jones and Bartlett,
Sudbury, Massachusetts, 2000.
[65] Ring, N. and Marshall, A.: Idiopathic Dilatation of the Pulmonary Artery. October 2006.
Available at: http://www2.umdnj.edu/ shindler/padil.html
[66] Mdlazi, L., Marwala, T., Stander, C., Scheffer, C. and Heyns, P.: The principal
component analysis and automatic relevance determination for fault identification in structures.
In: Proceedings of the 21st International Modal Analysis Conference (IMAC 21), pp. 37–42.
Kissimmee, Florida, U.S.A, 2003.
[67] Enderle, J., Blanchard, S. and Bronzino, J.: Introduction to Biomedical Engineering. 2nd edn.
Elsevier Academic Press, San Diego, 2005.
[68] Theodoridis, S. and Koutroumbas, K.: Pattern Recognition. 1st edn. Academic Press, San Diego,
1999.
[69] Ingrassia, S. and Morlini, I.: Neural network modeling for small datasets. Technometrics,
vol. 47, no. 3, pp. 297–311, 2005.
[70] Shen, L., Rangayyan, R. and Leo Desautels, J.: Detection and classification of mammographic
calcifications. International Journal of Pattern Recognition and Artificial Intelligence, vol. 7,
no. 6, pp. 1403–1416, 1993.
[71] Pal, A., Mbarika, W., Cobb-Payton, F. and McCoy, S.: Telemedicine diffusion in a developing
country: The case of India (March 2004). IEEE Transactions on Information Technology in
Biomedicine, vol. 9, pp. 59–65, 2005.
[72] Martinez, A., Villarroel, V., Seoane, J. and del Pozo, F.: Analysis of information and
communication needs in rural primary health care in developing countries. IEEE Transactions on
Information Technology in Biomedicine, vol. 9, pp. 66–72, 2005.
[73] Lam, C., Cheong, P., Ong, B. and Ho, K.: Teaching cardiac auscultation without patient
contact. Medical Education, vol. 38, pp. 1184–1185, 2004.
[74] Tuchinda, C. and Reid Thompson, W.: Cardiac auscultatory recording database: Delivering
heart sounds through the internet. In: Proceedings of the 2001 AMIA Annual Symposium.
2001.
[75] Ohio State University Medical Center: Arrhythmias. November 2006.
Available at: http://medicalcenter.osu.edu/images/greystone/ei_0018.jpg
[76] di Bernardo, D. and Murray, A.: Origin on the electrocardiogram of u-waves and abnormal
u-wave inversion. Cardiovascular Research, vol. 53, pp. 202–208, 2002.
[77] Numed: ECG electrode placement. October 2006.
Available at: http://www.numed.co.uk/electrodepl.html
[78] Klabunde, R.: Cardiovascular physiology concepts. October 2006.
Available at: http://www.cvphysiology.com/Arrhythmias/A013a.htm
[79] Kisslo, J., Adams, D. and Leech, G.: Essentials of echocardiography. October 2006.
Available at: http://www.echoincontext.com/beginner.pdf
[80] University of Kansas Medical Center: Ebstein’s anomaly. October 2006.
Available at: http://www.kumc.edu/kumcpeds/cardiology/pedcardioecho/normalheart-
4c.gif
[81] TEBCO OEM Module. Hemo Sapiens Inc., 1995.
[82] Cardiobeat: Glossary of Terms. October 2005.
Available at: http://www.cardiobeat.com/definitionsofterms.htm