An AI-Based Visual Aid With Integrated Reading Assistant For The Completely Blind
Abstract—Blindness prevents a person from gaining knowledge of the surrounding environment and makes unassisted navigation, object recognition, obstacle avoidance, and reading tasks a major challenge. In this work, we propose a novel visual aid system for the completely blind. Because of its low cost, compact size, and ease of integration, a Raspberry Pi 3 Model B+ has been used to demonstrate the functionality of the proposed prototype. The design incorporates a camera and sensors for obstacle avoidance and advanced image processing algorithms for object detection. The distance between the user and the obstacle is measured by the camera as well as ultrasonic sensors. The system includes an integrated reading assistant, in the form of an image-to-text converter, followed by auditory feedback. The entire setup is lightweight and portable and can be mounted onto a regular pair of eyeglasses, without any additional cost and complexity. Experiments are carried out with 60 completely blind individuals to evaluate the performance of the proposed device with respect to the traditional white cane. The evaluations are performed in controlled environments that mimic real-world scenarios encountered by a blind person. Results show that the proposed device, as compared with the white cane, enables greater accessibility, comfort, and ease of navigation for the visually impaired.

Index Terms—Blind people, completely blind, electronic navigation aid, Raspberry Pi, visual aid, visually impaired people, wearable system.

Manuscript received March 31, 2020; revised July 23, 2020; accepted September 6, 2020. Date of publication October 20, 2020; date of current version November 12, 2020. This article was recommended by Associate Editor Z. Yu. (Corresponding author: Mainul Hossain.)
Muiz Ahmed Khan, Pias Paul, and Mahmudur Rashid are with the Department of Electrical and Computer Engineering, North South University, Dhaka 1229, Bangladesh (e-mail: [email protected]; [email protected]; [email protected]).
Mainul Hossain is with the Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka 1000, Bangladesh (e-mail: [email protected]).
Md Atiqur Rahman Ahad is with the Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka 1000, Bangladesh, and also with the Department of Intelligent Media, Osaka University, Suita 565-0871, Japan (e-mail: [email protected]).
Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/THMS.2020.3027534

I. INTRODUCTION

BLINDNESS or loss of vision is one of the most common disabilities worldwide. Blindness, caused either by natural means or by some form of accident, has grown over the past decades. Partially blind people experience cloudy vision, see only shadows, and suffer from poor night vision or tunnel vision. A completely blind person, on the other hand, has no vision at all.

Recent statistics from the World Health Organization estimate the number of visually impaired or blind people to be about 2.2 billion [1]. A white cane is traditionally used by blind people to help them navigate their surroundings, although the white cane does not provide information about moving obstacles that are approaching from a distance. Moreover, white canes are unable to detect raised obstacles that are above knee level. Trained guide dogs are another option that can assist the blind. However, trained dogs are expensive and not readily available. Recent studies have proposed several types [2]–[9] of wearable or hand-held electronic travel aids (ETAs). Most of these devices integrate various sensors to map the surroundings and provide voice or sound alarms through headphones. The quality of the auditory signal, delivered in real time, affects the reliability of these gadgets. Many ETAs currently available in the market do not include a real-time reading assistant and suffer from a poor user interface, high cost, limited portability, and lack of hands-free access. These devices are, therefore, not widely popular among the blind and require further improvement in design, performance, and reliability for use in both indoor and outdoor settings.

In this article, we propose a novel visual aid system for completely blind individuals. The unique features, which define the novelty of the proposed design, include the following.
1) Hands-free, wearable, low-power, and compact design, mountable on a pair of eyeglasses, for indoor and outdoor navigation with an integrated reading assistant.
2) Complex algorithm processing with a low-end configuration.
3) Real-time, camera-based, accurate distance measurement, which simplifies the design and lowers the cost by reducing the number of required sensors.

The proposed setup, in its current form, can detect both stationary and moving objects in real time and provide auditory feedback to the blind. In addition, the device comes with an in-built reading assistant that is capable of reading text from any document. This article discusses the design, construction, and performance evaluation of the proposed visual aid system and is organized as follows. Section II summarizes the existing literature on blind navigation aids, highlighting their benefits and challenges. Section III presents the design and the working principle of the prototype, while Section IV discusses the experimental setup for performance evaluation. Section V summarizes the results using appropriate statistical analysis. Finally, Section VI concludes the article.
Fig. 1. Hardware configuration of the proposed system. The visual assistant takes the image as input, processes it through the Raspberry Pi processor, and gives audio feedback through a headphone.

Fig. 3. Basic hardware setup: Raspberry Pi 3 Model B+ and associated module with the camera and ultrasonic sensors.
Fig. 5. SSDLite-MobileNet architecture.

With the SSDLite on top of MobileNet [35], we were able to get around 30 frames per second (fps), which is enough to evaluate the system in real-time test cases. In places where online access is either limited or absent, the proposed device can operate offline as well. In SSDLite-MobileNet, the "classifier head" of MobileNet, which made the predictions for the whole network, is replaced with the SSD network. As shown in Fig. 5, the output of the base network is typically a 7 × 7 pixel image, which is fed into the replaced SSD network for further feature extraction. The replaced SSD network takes not only the output of the base network but also the outputs of several previous layers. The MobileNet layers convert the pixels of the input image into features that describe the contents of the image and pass these along to the other layers.

A new family of object detectors, such as Poly-YOLO [38], DETR [39], YOLACT [40], and YOLACT++ [41], introduced instance segmentation along with object detection. Despite these efforts, many object detection methods still struggle with medium and large-sized objects. Researchers have, therefore, focused on proposing better anchor boxes to scale up the performance of an object detector with regard to the perception, size, and shape of the object. Recent detectors offer a smaller parameter size while significantly improving mean average precision. However, large input frame sizes limit their use in systems with low processing power.

For object detection, MobileNetV2 is used as the base network, along with SSD, since it is desirable to know both high-level and low-level features by reading the previous layers. Since object detection is more complicated than classification, SSD adds many additional convolution layers on top of the base network.
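The model-construction code is not included in the article; purely as an illustration of how an SSD-style head can read feature maps from more than one MobileNetV2 layer, the toy Keras sketch below attaches small convolutional class and box heads to two feature maps. The chosen layer names, anchor count, and class count are assumptions rather than the authors' configuration.

# Toy sketch (not the authors' code): an SSD-style head reading two MobileNetV2
# feature maps. Layer names, anchor count, and class count are assumptions.
import tensorflow as tf

NUM_CLASSES = 91          # COCO-style label set (assumption)
ANCHORS_PER_CELL = 6      # typical SSD setting (assumption)

base = tf.keras.applications.MobileNetV2(
    input_shape=(300, 300, 3), include_top=False, weights=None)

# Multi-scale feature maps: one mid-level layer plus the final activation.
feature_maps = [
    base.get_layer("block_13_expand_relu").output,   # 19 x 19 grid (finer scale)
    base.get_layer("out_relu").output,               # 10 x 10 grid (coarser scale)
]

cls_outputs, box_outputs = [], []
for fmap in feature_maps:
    # Small convolutional heads predict class scores and box offsets per cell.
    cls = tf.keras.layers.Conv2D(ANCHORS_PER_CELL * NUM_CLASSES, 3, padding="same")(fmap)
    box = tf.keras.layers.Conv2D(ANCHORS_PER_CELL * 4, 3, padding="same")(fmap)
    cls_outputs.append(tf.keras.layers.Reshape((-1, NUM_CLASSES))(cls))
    box_outputs.append(tf.keras.layers.Reshape((-1, 4))(box))

detector = tf.keras.Model(
    inputs=base.input,
    outputs=[tf.keras.layers.Concatenate(axis=1)(cls_outputs),
             tf.keras.layers.Concatenate(axis=1)(box_outputs)])
detector.summary()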
To detect objects in live feeds, we used a Pi camera. Basically, our script sets paths to the model and label maps, loads the model into memory, initializes the Pi camera, and then begins performing object detection on each video frame from the Pi camera. Once the script initializes, which can take up to a maximum of 30 s, a live video stream begins and common objects inside the view of the user are identified. Next, a rectangle is drawn around each object. With the SSDLite model and the Raspberry Pi 3 Model B+, a frame rate higher than 1 fps can be achieved, which is fast enough for most real-time object detection applications.
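The detection script itself is not listed in this article; the following minimal sketch illustrates such a loop, assuming a TensorFlow Lite export of SSDLite-MobileNetV2 with uint8 input and the usual boxes/classes/scores output ordering, OpenCV for frame capture, and placeholder file names.

# Minimal sketch (not the authors' code) of the detection loop described above.
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter  # tf.lite.Interpreter also works

MODEL_PATH = "ssdlite_mobilenet_v2.tflite"   # placeholder path
LABEL_PATH = "coco_labels.txt"               # placeholder path

labels = [line.strip() for line in open(LABEL_PATH)]
interpreter = Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()
in_h, in_w = inp["shape"][1], inp["shape"][2]

cap = cv2.VideoCapture(0)                    # Pi camera exposed through V4L2
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(cv2.resize(frame, (in_w, in_h)), cv2.COLOR_BGR2RGB)
    interpreter.set_tensor(inp["index"], np.expand_dims(rgb, axis=0))
    interpreter.invoke()
    # Assumed output ordering of the post-processed SSD model:
    boxes = interpreter.get_tensor(out[0]["index"])[0]   # normalized [ymin, xmin, ymax, xmax]
    classes = interpreter.get_tensor(out[1]["index"])[0]
    scores = interpreter.get_tensor(out[2]["index"])[0]
    h, w = frame.shape[:2]
    for box, cls, score in zip(boxes, classes, scores):
        if score < 0.5:                      # confidence threshold
            continue
        y0, x0, y1, x1 = (box * [h, w, h, w]).astype(int)
        cv2.rectangle(frame, (x0, y0), (x1, y1), (0, 255, 0), 2)
        cv2.putText(frame, labels[int(cls)], (x0, max(y0 - 5, 10)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

In the complete system, the detected labels would also be handed to the speech engine so that the user receives auditory feedback rather than a drawn rectangle.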
E. Reading Assistant

The proposed system integrates an intelligent reader that allows the user to read text from any document. An open-source library, Tesseract version 4, which includes a highly accurate deep learning-based model for text recognition, is used for the reader. Tesseract has Unicode (UTF-8) support and can recognize many languages along with various output formats: plain text, hOCR (HTML), PDF, TSV, and invisible-text-only PDF. The underlying engine uses a long short-term memory (LSTM) network. An LSTM is a type of recurrent neural network, a combination of unfolded layers that use cell states at each time step to predict letters from an image. The captured image is divided into horizontal boxes, and at each time step the horizontal boxes are analyzed with the ground truth value to predict the output letter. The LSTM uses gate layers to update the cell state, at each time step, through several activation functions. Therefore, the time required to recognize text can be optimized.

Fig. 6. Workflow for the reading assistant. Raspberry Pi gets a single frame from the camera module and runs it through the Tesseract OCR engine. The text output is then converted to audio.

Fig. 6 shows the working principle of the reading assistant. An image is captured from the live video feed without interrupting the object detection process. In the background, the Tesseract API extracts the text from the image and saves it in a temporary text file. It then reads out the text from the text file using the text-to-speech engine eSpeak. The accuracy of the Tesseract OCR engine depends on ambient lighting and background, and it usually works best with a white background in brightly illuminated places.
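The article describes this pipeline but does not reproduce the code; a minimal sketch is given below, assuming the pytesseract wrapper for Tesseract 4 and the espeak command-line tool are installed, with placeholder settings.

# Minimal sketch (not the authors' code) of the reading-assistant pipeline:
# grab one frame, extract text with Tesseract (LSTM engine), store it in a
# temporary file, and speak it with the eSpeak text-to-speech engine.
import subprocess
import tempfile
import cv2
import pytesseract

def read_document_aloud(frame):
    # OEM 1 selects the LSTM-based recognizer in Tesseract 4.
    text = pytesseract.image_to_string(frame, config="--oem 1 --psm 6")
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(text)                      # temporary text file, as in Fig. 6
    if text.strip():
        subprocess.run(["espeak", text])   # audio feedback through the headphone
    return text

cap = cv2.VideoCapture(0)                  # Pi camera
ok, frame = cap.read()
if ok:
    read_document_aloud(frame)
cap.release()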
IV. SYSTEM EVALUATION AND EXPERIMENTS

A. Evaluation of Object Detection

Our model (SSDLite) is pretrained on the ImageNet dataset for image classification. It draws a bounding box on an object once the object is detected.
TABLE I
PERFORMANCE OF SINGLE AND MULTIPLE OBJECT DETECTION
Fig. 7. Single object detection. The object detection algorithm can detect the cell phone with 97% confidence.
Fig. 9. Measuring the distance of a mouse from the prototype device using the ultrasonic sensor.

Fig. 10. Face detection and distance measurement from a single video frame.

Fig. 11. Demonstration of the distance measurement using the camera and the ultrasonic sensor.

TABLE II
DISTANCE MEASUREMENT BETWEEN OBJECT AND USER

The camera can detect a person's face and determine how far the person is from the blind user. The integration of the camera with the ultrasonic sensor, therefore, allows simultaneous object detection and distance measurement, which adds novelty to our proposed design. We have used the Haar cascade algorithm [42] to detect faces from a single video frame. It can also be modified and used for other objects. The bounding boxes, which appear while recognizing an object, consist of a rectangle. The width w, height h, and the coordinates of the rectangular box (x0, y0) can be adjusted as required.
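The exact distance equation used with the bounding box is not reproduced here; as an illustration only, the following sketch combines OpenCV's Haar cascade face detector with the common pinhole-camera (similar-triangles) approximation, in which the estimated distance scales inversely with the bounding-box width. The assumed face width and focal length are placeholder calibration values, not figures from the paper.

# Minimal sketch (not the authors' code): Haar-cascade face detection and a
# pinhole-model distance estimate from the bounding-box width. The constants
# below are illustrative assumptions, not values from the paper.
import cv2

KNOWN_FACE_WIDTH_CM = 15.0    # average face width (assumption)
FOCAL_LENGTH_PX = 600.0       # obtained by calibrating at a known distance (assumption)

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces_and_distance(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    results = []
    for (x0, y0, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        # Similar triangles: distance = real_width * focal_length / pixel_width.
        distance_cm = KNOWN_FACE_WIDTH_CM * FOCAL_LENGTH_PX / w
        cv2.rectangle(frame, (x0, y0), (x0 + w, y0 + h), (255, 0, 0), 2)
        results.append(((x0, y0, w, h), distance_cm))
    return results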
Fig. 11 demonstrates how the distance between the object and the blind user can be simultaneously measured by both the camera and the ultrasonic sensor. The dotted line (6 m) represents the distance measured by the camera, and the solid line (5.6 m) represents the distance calculated from the ultrasonic sensor. The width w and height h of the bounding box are defined in the .xml file with feature vectors, and they vary depending on the distance between the camera and the object. In addition to the camera, the use of the ultrasonic sensor makes object detection more reliable. Such a relation between the bounding-box size and the distance can be derived by considering the formation of the image as light passes through the camera lens.

The actual distance between the object and the user is measured by a measuring tape and compared with that measured by the camera and the ultrasonic sensor. Since the camera can detect a person's face, the object used in this case is a human face, as shown in Fig. 10. Table II summarizes the results. The distance measured by the ultrasonic sensor is more accurate than that measured by the camera. Also, the ultrasonic sensor can respond in real time, so it can be used to measure the distance between the blind user and a moving object. A camera with higher processing power and a higher frame rate would have a shorter response time. Although the camera takes slightly more time to process, both the camera and the ultrasonic sensor can generate feedback at the same time.
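The article does not name the specific ultrasonic module; assuming a trigger/echo sensor of the HC-SR04 type wired to the Raspberry Pi GPIO, a minimal measurement sketch looks as follows, converting the echo round-trip time to distance with the speed of sound (about 343 m/s). The pin numbers are placeholders.

# Minimal sketch (not the authors' code) of ultrasonic distance measurement on a
# Raspberry Pi, assuming an HC-SR04-style sensor; TRIG/ECHO pins are placeholders.
import time
import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24            # BCM pin numbers (placeholders)
GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def measure_distance_cm():
    # A 10-microsecond trigger pulse starts the ultrasonic burst.
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)
    start = stop = time.time()
    while GPIO.input(ECHO) == 0:   # wait for the echo pulse to start
        start = time.time()
    while GPIO.input(ECHO) == 1:   # wait for the echo pulse to end
        stop = time.time()
    # Sound travels ~34300 cm/s; halve the round-trip time.
    return (stop - start) * 34300 / 2

print(f"Distance: {measure_distance_cm():.1f} cm")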
C. Evaluation of Reading Assistant

The integrated reading assistant in our prototype is tested under different ambient lighting conditions for various combinations of text size, font, color, and background. The OCR engine performs better in an environment with more light, as it can more easily extract the text from the captured image. While comparing text with different colored backgrounds, it has been shown that a well-illuminated background yields better performance for the reading assistant. As given in Table III, the performance of the reading assistant is tested under three different illuminations (bright, slightly dark, and dark) using green and black-colored text.
TABLE III
PERFORMANCE OF THE READING ASSISTANT
Fig. 14. Velocity of blind participants walking from point A to B in Fig. 13.

Fig. 15. User rating for the proposed device tested in the indoor setup of Fig. 13.
TABLE VI
COST OF PROPOSED DEVICE VERSUS EXISTING VISUAL AIDS

Table VI compares the cost of the proposed device and its reading assistant with some of the existing platforms. The total cost of making the proposed device is roughly US $68, whereas some existing devices with a similar performance appear more expensive. Service dogs, another viable alternative, can cost up to US $4000 and require high maintenance. Although white canes are cheaper, they are unable to detect moving objects and do not include a reading assistant.

VI. CONCLUSION

This research article introduces a novel visual aid system, in the form of a pair of eyeglasses, for the completely blind. The key features of the proposed device include the following.
1) The hands-free, wearable, low-power, low-cost, and compact design for indoor and outdoor navigation.
2) The complex algorithm processing using the low-end processing power of Raspberry Pi 3 Model B+.
3) Dual capabilities for object detection and distance measurement using a combination of camera and ultrasound sensors.
4) Integrated reading assistant, offering image-to-text conversion capabilities, enabling the blind to read texts from any document.

A detailed discussion of the software and hardware aspects of the proposed blind assistant has been given. A total of 60 completely blind users have rated the performance of the device in well-controlled indoor settings that represent real-world situations. Although the current setup lacks advanced functions, such as wet-floor and staircase detection or the use of GPS and a mobile communication module, the flexibility of the design leaves room for future improvements and enhancements. In addition, with more advanced machine learning algorithms and an improved user interface, the system can be further developed and tested in more complex outdoor environments.

REFERENCES

[1] Blindness and vision impairment, World Health Organization, Geneva, Switzerland, Oct. 2019. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment
[2] J. Bai, S. Lian, Z. Liu, K. Wang, and D. Liu, “Virtual-blind-road following-based wearable navigation device for blind people,” IEEE Trans. Consum. Electron., vol. 64, no. 1, pp. 136–143, Feb. 2018.
[3] B. Li et al., “Vision-based mobile indoor assistive navigation aid for blind people,” IEEE Trans. Mobile Comput., vol. 18, no. 3, pp. 702–714, Mar. 2019.
[4] J. Xiao, S. L. Joseph, X. Zhang, B. Li, X. Li, and J. Zhang, “An assistive navigation framework for the visually impaired,” IEEE Trans. Human-Mach. Syst., vol. 45, no. 5, pp. 635–640, Oct. 2015.
[5] A. Karmel, A. Sharma, M. Pandya, and D. Garg, “IoT based assistive device for deaf, dumb and blind people,” Procedia Comput. Sci., vol. 165, pp. 259–269, Nov. 2019.
[6] C. Ye and X. Qian, “3-D object recognition of a robotic navigation aid for the visually impaired,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 2, pp. 441–450, Feb. 2018.
[7] Y. Liu, N. R. B. Stiles, and M. Meister, “Augmented reality powers a cognitive assistant for the blind,” eLife, vol. 7, Nov. 2018, Art. no. e37841.
[8] A. Adebiyi et al., “Assessment of feedback modalities for wearable visual aids in blind mobility,” PLoS One, vol. 12, no. 2, Feb. 2017, Art. no. e0170531.
[9] J. Bai, S. Lian, Z. Liu, K. Wang, and D. Liu, “Smart guiding glasses for visually impaired people in indoor environment,” IEEE Trans. Consum. Electron., vol. 63, no. 3, pp. 258–266, Aug. 2017.
[10] D. Dakopoulos and N. G. Bourbakis, “Wearable obstacle avoidance electronic travel aids for blind: A survey,” IEEE Trans. Syst. Man, Cybern. Part C Appl. Rev., vol. 40, no. 1, pp. 25–35, Jan. 2010.
[11] E. E. Pissaloux, R. Velazquez, and F. Maingreaud, “A new framework for cognitive mobility of visually impaired users in using tactile device,” IEEE Trans. Human-Mach. Syst., vol. 47, no. 6, pp. 1040–1051, Dec. 2017.
[12] K. Patil, Q. Jawadwala, and F. C. Shu, “Design and construction of electronic aid for visually impaired people,” IEEE Trans. Human-Mach. Syst., vol. 48, no. 2, pp. 172–182, Apr. 2018.
[13] R. K. Katzschmann, B. Araki, and D. Rus, “Safe local navigation for visually impaired users with a time-of-flight and haptic feedback device,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 3, pp. 583–593, Mar. 2018.
[14] J. Villanueva and R. Farcy, “Optical device indicating a safe free path to blind people,” IEEE Trans. Instrum. Meas., vol. 61, no. 1, pp. 170–177, Jan. 2012.
[15] X. Yang, S. Yuan, and Y. Tian, “Assistive clothing pattern recognition for visually impaired people,” IEEE Trans. Human-Mach. Syst., vol. 44, no. 2, pp. 234–243, Apr. 2014.
[16] S. L. Joseph et al., “Being aware of the world: Toward using social media to support the blind with navigation,” IEEE Trans. Human-Mach. Syst., vol. 45, no. 3, pp. 399–405, Jun. 2015.
[17] B. Jiang, J. Yang, Z. Lv, and H. Song, “Wearable vision assistance system based on binocular sensors for visually impaired users,” IEEE Internet Things J., vol. 6, no. 2, pp. 1375–1383, Apr. 2019.
[18] L. Tepelea, I. Buciu, C. Grava, I. Gavrilut, and A. Gacsadi, “A vision module for visually impaired people by using raspberry PI platform,” in Proc. 15th Int. Conf. Eng. Modern Electr. Syst. (EMES), Oradea, Romania, 2019, pp. 209–212.
[19] L. Dunai, G. Peris-Fajarnés, E. Lluna, and B. Defez, “Sensory navigation device for blind people,” J. Navig., vol. 66, no. 3, pp. 349–362, May 2013.
[20] V.-N. Hoang, T.-H. Nguyen, T.-L. Le, T.-H. Tran, T.-P. Vuong, and N. Vuillerme, “Obstacle detection and warning system for visually impaired people based on electrode matrix and mobile kinect,” Vietnam J. Comput. Sci., vol. 4, no. 2, pp. 71–83, Jul. 2016.
[21] C. I. Patel, A. Patel, and D. Patel, “Optical character recognition by open source OCR tool Tesseract: A case study,” Int. J. Comput. Appl., vol. 55, no. 10, pp. 50–56, Oct. 2012.
[22] A. Chalamandaris, S. Karabetsos, P. Tsiakoulis, and S. Raptis, “A unit selection text-to-speech synthesis system optimized for use with screen readers,” IEEE Trans. Consum. Electron., vol. 56, no. 3, pp. 1890–1897, Aug. 2010.
[23] R. Keefer, Y. Liu, and N. Bourbakis, “The development and evaluation of an eyes-free interaction model for mobile reading devices,” IEEE Trans. Human-Mach. Syst., vol. 43, no. 1, pp. 76–91, Jan. 2013.
[24] B. Andò, S. Baglio, V. Marletta, and A. Valastro, “A haptic solution to assist visually impaired in mobility tasks,” IEEE Trans. Human-Mach. Syst., vol. 45, no. 5, pp. 641–646, Oct. 2015.
[25] V. V. Meshram, K. Patil, V. A. Meshram, and F. C. Shu, “An astute assistive device for mobility and object recognition for visually impaired people,” IEEE Trans. Human-Mach. Syst., vol. 49, no. 5, pp. 449–460, Oct. 2019.
[26] F. Lan, G. Zhai, and W. Lin, “Lightweight smart glass system with audio aid for visually impaired people,” in Proc. IEEE Region 10 Conf., Macao, China, 2015, pp. 1–4.
[27] M. M. Islam, M. S. Sadi, K. Z. Zamli, and M. M. Ahmed, “Developing walking assistants for visually impaired people: A review,” IEEE Sens. J., vol. 19, no. 8, pp. 2814–2828, Apr. 2019.
[28] T.-Y. Lin et al., “Microsoft COCO: Common objects in context,” Feb. 2015. [Online]. Available: https://arxiv.org/abs/1405.0312
[29] J. Han et al., “Representing and retrieving video shots in human-centric brain imaging space,” IEEE Trans. Image Process., vol. 22, no. 7, pp. 2723–2736, Jul. 2013.
[30] J. Han, K. N. Ngan, M. Li, and H. J. Zhang, “Unsupervised extraction of visual attention objects in color images,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 1, pp. 141–145, Jan. 2006.
[31] D. Zhang, D. Meng, and J. Han, “Co-saliency detection via a self-paced multiple-instance learning framework,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 5, pp. 865–878, May 2017.
[32] G. Cheng, P. Zhou, and J. Han, “Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 12, pp. 7405–7415, Dec. 2016.
[33] Y. Yang, Q. Zhang, P. Wang, X. Hu, and N. Wu, “Moving object detection for dynamic background scenes based on spatiotemporal model,” Adv. Multimedia, vol. 2017, Jun. 2017, Art. no. 5179013.
[34] Q. Xie, O. Remil, Y. Guo, M. Wang, M. Wei, and J. Wang, “Object detection and tracking under occlusion for object-level RGB-D video segmentation,” IEEE Trans. Multimedia, vol. 20, no. 3, pp. 580–592, Mar. 2018.
[35] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
[36] W. Liu et al., “SSD: Single shot multibox detector,” in Proc. Eur. Conf. Comput. Vision, vol. 9905, Sep. 2016, pp. 21–37.
[37] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA, 2018, pp. 4510–4520.
[38] P. Hurtik, V. Molek, J. Hula, M. Vajgl, P. Vlasanek, and T. Nejezchleba, “Poly-YOLO: Higher speed, more precise detection and instance segmentation for YOLOv3,” May 2020. [Online]. Available: http://arxiv.org/abs/2005.13243
[39] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-end object detection with transformers,” May 2020. [Online]. Available: http://arxiv.org/abs/2005.12872
[40] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT: Real-time instance segmentation,” in Proc. IEEE/CVF Conf. Comput. Vision, Seoul, South Korea, 2019, pp. 4510–4520.
[41] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT++: Better real-time instance segmentation,” Dec. 2019. [Online]. Available: https://arxiv.org/abs/1912.06218
[42] R. Padilla, C. C. Filho, and M. Costa, “Evaluation of Haar cascade classifiers designed for face detection,” Int. J. Comput., Elect., Autom., Control Inf. Eng., vol. 6, no. 4, pp. 466–469, Apr. 2012.
[43] L. Xiaoming, Q. Tian, C. Wanchun, and Y. Xingliang, “Real-time distance measurement using a modified camera,” in Proc. IEEE Sensors Appl. Symp., Limerick, Ireland, 2010, pp. 54–58.
[44] L. Doyal and R. G. Das-Bhaumik, “Sex, gender and blindness: A new framework for equity,” BMJ Open Ophthalmol., vol. 3, no. 1, Sep. 2018, Art. no. e000135.
[45] M. Prasad, S. Malhotra, M. Kalaivani, P. Vashist, and S. K. Gupta, “Gender differences in blindness, cataract blindness and cataract surgical coverage in India: A systematic review and meta-analysis,” Brit. J. Ophthalmol., vol. 104, no. 2, pp. 220–224, Jan. 2020.
[46] M. Rajesh et al., “Text recognition and face detection aid for visually impaired person using raspberry PI,” in Proc. Int. Conf. Circuit, Power Comput. Technol., Kollam, India, 2017, pp. 1–5.