Development of Non-Invasive Technique for Heart Rate Detection Using Facial Videos
Kokila Bharti Jaiswal1
1Department of Electronics and Communication, National Institute of Technology, Raipur
*Corresponding Author: kbjaiswal.phd2018.etc@nitrr.ac.in
Abstract
The mortality rate due to ischemic heart disease in Chhattisgarh state is 43.6%
and is growing exponentially every year. Early assessment of cardiac health
plays a major role in decreasing this rate. Because hospitals are insufficient
and dedicated equipment is not readily accessible, remote health monitoring has
become almost inevitable after the SARS-CoV-2 pandemic. Remote
photoplethysmography (rPPG) is a technique for measuring cardiac activity in a
contactless manner using digital cameras, and its capability makes it a likely
heart rate (HR) measurement method of the future. However, HR measurement is
easily affected by noise because the amplitude of the physiological signal is
very weak. In particular, HR estimation suffers from two major artifacts:
motion artifacts and illumination artifacts. Denoising the rPPG signal is
therefore a fundamental problem that must be addressed carefully. In this
article we propose a novel HR estimation network that combines wavelet
decomposition with a Convolutional Neural Network (CNN). This approach provides
distinct features at different frequency levels, which facilitates the removal
of noise. The proposed method is evaluated on a self-collected dataset. Low
values of root mean square error (RMSE) and mean absolute error (MAE)
demonstrate the efficacy of the proposed method.
Keywords: remote Photoplethysmography (rPPG), Heart rate, Telehealth monitoring, CNN
1. Introduction
Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide,
accounting for 17.3 million deaths per year, a number that is expected to grow
to more than 23.6 million by 2030 [1]. In Chhattisgarh state specifically,
ischemic heart disease, diabetes and chronic obstructive pulmonary disease
(COPD) are the major causes of morbidity, contributing 53.82% of total
Disability-Adjusted Life Years (DALYs) [2]. Only 26 district hospitals are
available in the state to provide medical attention for these serious diseases.
The total population of the state is 2.55 crore, of which 76.76% lives in rural
areas. The availability of public healthcare providers in the district
healthcare system is 5 per 10,000. Private hospitals are too expensive for the
common man. All these statistics clearly show that an alternative measure is
urgently required. Early detection of abnormal heart rate can be a boon in
preventing cardiovascular disease.
Dependence on ECGs for cardiac health check-ups can be reduced by using various
contact-based devices such as wrist watches and waist bands. These devices work
on the principle of photoplethysmography (PPG). However, contact-based methods
cannot be used on a person suffering from skin disease or on neonates, and they
may cause skin allergies and discomfort to the wearer. Over the last few years,
camera-based assessment of cardiovascular status has gained immense popularity.
This technique works on the principle of remote photoplethysmography (rPPG):
when a person's face is illuminated by a light source, some of the light is
reflected from the surface of the skin, which is called specular reflection,
whereas some of the light penetrates deeper into the skin and is reflected
after interacting with hemoglobin and melanin, which is called diffuse
reflection, as shown in Fig. 1. The reflection due to the blood volume is
synchronized with the heartbeat. Therefore, by capturing the reflected light
with a camera and applying various signal processing methods, we can measure
the heart rate of a person from the facial video alone.
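As a minimal sketch of the last step described above, the snippet below
estimates HR from a one-dimensional skin-colour trace by locating the strongest
spectral peak in a plausible heart-rate band. It is an illustration only, not
the method proposed in this article: the function name `estimate_hr_bpm`, the
0.7-4.0 Hz band and the synthetic trace are all assumptions introduced here.

```python
import numpy as np

def estimate_hr_bpm(trace, fps, low_hz=0.7, high_hz=4.0):
    """Estimate heart rate (beats per minute) from a 1-D skin-colour trace.

    The band limits are illustrative assumptions (0.7-4.0 Hz = 42-240 bpm).
    """
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()                               # remove the constant (DC) level
    power = np.abs(np.fft.rfft(x)) ** 2            # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)   # frequency of each bin in Hz
    band = (freqs >= low_hz) & (freqs <= high_hz)  # keep plausible HR frequencies
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * float(peak_hz)

# Synthetic trace (not real data): a 1.2 Hz pulse, i.e. 72 bpm, plus noise,
# sampled at 30 fps for 20 s.
fps = 30
t = np.arange(20 * fps) / fps
rng = np.random.default_rng(0)
trace = 100.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.2 * rng.standard_normal(t.size)
hr = estimate_hr_bpm(trace, fps)
print(round(hr, 1))  # 72.0
```

With a 20 s clip the frequency resolution is 0.05 Hz, i.e. 3 bpm, which is one
reason practical systems often use longer windows or learned estimators.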
Such measurement falls under the category of telemedicine, in which a person
located in a remote area, without medical representatives or hospitals equipped
with expensive machines, is able to track cardiac function. Such devices are
very helpful for early detection of abnormal heart function; if any anomaly is
seen, the person can rush to a hospital immediately for proper medical
attention. The technique was first demonstrated by Verkruysse et al. in 2008
[3], and many advancements have been made since then. Many blind source
separation (BSS) methods [4,5] have been developed on the assumption that the
noise and the rPPG signal are linearly separable. Later, color intensity-based
methods such as CHROM [7] and POS [6] were developed. EVM-CNN [8] exploits the
power of CNNs to build a robust system for heart rate detection. In this
article, we try to overcome the limitations of the previously proposed methods
in motion scenarios.
Fig. 1 Working principle of rPPG [6]
2. Experimental Details
Video is acquired using a simple mobile phone camera (Samsung
Galaxy-SM-F127G/DS) at a frame rate of 30 fps and a resolution of 1080 x 1920
in RGB format. Video is recorded under two different conditions. The first
recording is done indoors in the closed Computer Vision Lab at NIT, Raipur,
illuminated by a fluorescent tube producing visible light. The second recording
is done outdoors on the campus itself, where the subject is illuminated by
natural sunlight that varies with cloud shade. In both recordings the person is
subjected to three conditions:
(i) Resting position with deep breath
(ii) Showing change in facial emotions (smile, laugh and anger)
(iii) Elevated heart rate condition due to strenuous physical activity
All the data are recorded as 20 s videos. The distance between the subject and
the camera is set to 1 metre. For ground truth collection we used a USM001
fingertip pulse oximeter attached to the index finger of the subject. Sample
frames of the self-collected dataset are shown in Fig. 2.
Fig. 2 Sample frames of the self-collected dataset
3. Method
The basic flow of rPPG signal extraction from video is depicted in Fig. 3.
Frames are first extracted from the video of a subject. Since only the face
region is of interest, the frames are cropped to the face, which forms the
region of interest (ROI). The ROI is tracked using the Kanade-Lucas-Tomasi
(KLT) algorithm. Only the green channel is considered for HR estimation,
because the absorption of light is maximum in the green wavelength range, i.e.,
550 nm-590 nm. Once the ROI is detected, a wavelet transform is applied to each
ROI to obtain the subbands. This multiresolution decomposition of the G channel
enables identification of the noise frequencies, which can be removed using the
SureShrink threshold. The thresholded bands are then concatenated to form a
spatiotemporal map. The benefit of constructing this feature map is that it
focuses only on information related to HR and discards noise caused by natural
movements of the face. The spatiotemporal map is then applied as input to a
Convolutional Neural Network (CNN) for HR estimation. The success of CNNs in
various tasks such as age prediction and disease progression compelled us to
use a CNN such as ResNet-18 for the HR estimation problem. HR estimation is
treated as a regression problem. ResNet-18 is trained using the Adam optimizer,
and a linear activation function is used at the output to obtain a single HR
value. The learning rate is set to 0.005, and the L1 loss function is used to
minimize the error. The network is trained on the publicly available UBFC-rPPG
[9] dataset and tested on the self-collected dataset.
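The wavelet thresholding step can be illustrated with a minimal sketch on a
one-dimensional temporal trace. Several choices here are assumptions made for
illustration and are not taken from the proposed method: the Haar wavelet, the
three decomposition levels, the median-based noise estimate, and all function
names. The threshold is chosen by minimising Stein's unbiased risk estimate,
which is the core of SureShrink; the hybrid fallback for sparse subbands is
omitted.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_idwt(approx, detail):
    """Invert haar_dwt by interleaving the reconstructed even/odd samples."""
    out = np.empty(2 * approx.size)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def sure_threshold(coeffs):
    """Soft threshold minimising Stein's unbiased risk estimate.

    Expects coefficients already scaled to unit noise variance.
    """
    a = np.sort(np.abs(coeffs))
    n = a.size
    k = np.arange(n)
    # Risk at t = a[k]: n - 2 * #{|c| <= t} + sum(min(|c|, t)^2)
    risk = n - 2.0 * (k + 1) + np.cumsum(a ** 2) + (n - 1 - k) * a ** 2
    return a[np.argmin(risk)]

def soft_threshold(coeffs, t):
    """Shrink coefficients towards zero by t, zeroing those below t."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def wavelet_denoise(signal, levels=3):
    """Decompose, threshold each detail subband, and reconstruct."""
    x = np.asarray(signal, dtype=float)
    if x.size % (2 ** levels):
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = x, []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)          # details[0] is the finest subband
    # Robust noise level from the finest subband (median absolute deviation)
    sigma = np.median(np.abs(details[0])) / 0.6745
    if sigma == 0.0:
        return x.copy()                 # nothing to remove
    for detail in reversed(details):    # rebuild from coarsest to finest
        t = sigma * sure_threshold(detail / sigma)
        approx = haar_idwt(approx, soft_threshold(detail, t))
    return approx

# Synthetic example (not real data): 1.2 Hz pulse at 30 fps for 20 s plus noise.
fps = 30
t = np.arange(20 * fps) / fps
clean = np.sin(2 * np.pi * 1.2 * t)
noisy = clean + 0.5 * np.random.default_rng(0).standard_normal(t.size)
denoised = wavelet_denoise(noisy, levels=3)
```

In this sketch only the detail subbands are thresholded; the coarsest
approximation, which carries most of the slow pulse component, is left
untouched.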
Fig. 3 Illustration of the flow diagram of the proposed method.
4. Observation and Results
The performance of the proposed network is evaluated using the Root Mean Square
Error (RMSE) and Mean Absolute Error (MAE), given in Eqn. 1 and 2. It can be
clearly seen from Fig. 4 that the predicted value closely follows the ground
truth value of HR. Also, the low values of RMSE and MAE in Table-1 indicate
good accuracy of the proposed network for the detection of HR.
Here, HRe is the estimated heart rate and HRgt is the ground truth measured
with the pulse oximeter.
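Assuming Eqn. 1 and 2 follow the standard definitions of these metrics, and
writing N for the number of HR estimates and i for their index (both symbols
introduced here for illustration), they can be written as:

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(HR_{e}(i) - HR_{gt}(i)\right)^{2}} \tag{1}\\
\mathrm{MAE} &= \frac{1}{N}\sum_{i=1}^{N}\left|HR_{e}(i) - HR_{gt}(i)\right| \tag{2}
\end{align}
```

Table-1 reports both metrics in percent; any normalization used to obtain
percentages is not specified and is not part of this sketch.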
Table-1 Performance evaluation of subjects from self-collected dataset
Subject | RMSE (%) | MAE (%)
1 | 4.45 | 0.67
2 | 3.5 | 0.75
3 | 5.12 | 0.89
Fig. 4 Ground truth and predicted HR values of subject 1
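As a worked illustration of how RMSE and MAE behave, the sketch below computes
both in their standard form (an assumption about the exact formulas used) on
three hypothetical HR readings. The numbers are invented for the example and
are not results from this study.

```python
import math

def rmse(estimated, ground_truth):
    """Root mean square error between two equal-length sequences."""
    errors = [e - g for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(err ** 2 for err in errors) / len(errors))

def mae(estimated, ground_truth):
    """Mean absolute error between two equal-length sequences."""
    errors = [abs(e - g) for e, g in zip(estimated, ground_truth)]
    return sum(errors) / len(errors)

# Hypothetical values in beats per minute; the errors are 2, 0 and -3.
hr_est = [72.0, 75.0, 80.0]
hr_gt = [70.0, 75.0, 83.0]
print(round(rmse(hr_est, hr_gt), 2))  # 2.08
print(round(mae(hr_est, hr_gt), 2))   # 1.67
```

RMSE is never smaller than MAE and penalises large individual errors more
heavily, so a wide gap between the two suggests occasional large deviations.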
5. Discussion
rPPG offers the convenience of measuring a person's heart rate from a simple
RGB facial video. Although rPPG was introduced more than a decade ago, the need
for it surged during the COVID-19 era. Owing to its cost effectiveness and
accuracy, it continues to be one of the best technologies for monitoring
cardiac health. The main challenge, however, is the removal of unwanted or
noisy signal components caused by head movement or natural variations in facial
expression. In this article a network is proposed for the estimation of heart
rate under motion scenarios. The rPPG signal is effectively denoised using the
wavelet transform. In addition, the use of a Convolutional Neural Network
allows the method to work on large datasets and improves the accuracy of HR
estimation. In future, telehealth monitoring may become part of the daily
routine of every individual. Such technology can play an important role in
reducing the mortality rate due to cardiac diseases to a large extent.
Acknowledgement
This research has been supported by the National Institute of Technology,
Raipur. The research was performed at the Computer Vision Lab of the
Electronics and Telecommunication Department. Special thanks to my research
advisor Dr. T. Meenpal for his constant support, and to the Computer Vision Lab
research scholars Nitish Kumar, Madhu Oruganti and Deeksha Sahu for consenting
to the collection of data samples.
References
[1] Bansal, A., & Joshi, R. (2018). Portable out-of-hospital electrocardiography: A review of current technologies. Journal of Arrhythmia, 34(2), 129-138.
[2] National Health Systems Resource Centre, https://nhsrcindia.org/sites/default/files/practice_image/HealthDossier2021/Chhattisgarh.pdf
[3] Verkruysse, W., Svaasand, L. O., & Nelson, J. S. (2008). Remote plethysmographic imaging using ambient light. Optics Express, 16(26), 21434-21445.
[4] Poh, M. Z., McDuff, D. J., & Picard, R. W. (2010). Non-contact, automated cardiac pulse measurements using video imaging and blind source separation. Optics Express, 18(10), 10762-10774.
[5] Kranjec, J., Beguš, S., Geršak, G., & Drnovšek, J. (2014). Non-contact heart rate and heart rate variability measurements: A review. Biomedical Signal Processing and Control, 13, 102-112.
[6] Wang, W., Den Brinker, A. C., Stuijk, S., & De Haan, G. (2016). Algorithmic principles of remote PPG. IEEE Transactions on Biomedical Engineering, 64(7), 1479-1491.
[7] De Haan, G., & Jeanne, V. (2013). Robust pulse rate from chrominance-based rPPG. IEEE Transactions on Biomedical Engineering, 60(10), 2878-2886.
[8] Qiu, Y., Liu, Y., Arteaga-Falconi, J., Dong, H., & El Saddik, A. (2018). EVM-CNN: Real-time contactless heart rate estimation from facial video. IEEE Transactions on Multimedia, 21(7), 1778-1787.
[9] Bobbia, S., Macwan, R., Benezeth, Y., Mansouri, A., & Dubois, J. (2017). Unsupervised skin tissue segmentation for remote photoplethysmography. Pattern Recognition Letters.