Hand Safety Using Convolution Neural Networks
https://doi.org/10.22214/ijraset.2022.47711
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue XI Nov 2022- Available at www.ijraset.com
Abstract: This work detects a worker's hand when it crosses a safety boundary and raises an alert in the form of an alarm, so that the
threat can be identified and proper measures taken. The method has four stages: taking input from a camera, detecting the hand with
image processing, projecting a safety line using computer vision, and raising an alarm when a hand crosses the projected line.
Real-time frames from the camera are passed to the image processing algorithm, which locates any hand in the frame and checks
whether it is crossing the safety line; if it is, the alarm is raised. The motivating application is the industrial shredder machine.
Employees who push material into a shredder sometimes push their hands in along with it, and their hands are cut as a result. We
therefore project an imaginary safety line at a fixed distance from the shredder inlet and raise an alarm whenever a hand crosses it.
The approach extends to other applications: by replacing the hand with a bike as the target object, we can detect bikes that cross a
staggered stop line so that the riders can be fined. For object detection we use the Single Shot MultiBox Detector (SSD).
Keywords: SSD, Image Processing, Object detection.
I. INTRODUCTION
The goal of this work is to detect a hand when it crosses a safety boundary and to raise an alert in the form of an alarm, so that the
threat can be identified and proper measures taken. Consider a shredder machine into which employees push material. In the flow of
work, employees sometimes push their hands in along with the material, and their hands are cut as a result. We therefore project an
imaginary safety line at a fixed distance from the shredder inlet using computer vision, and raise an alarm whenever a hand crosses
that line.
First, however, the hand must be detected. An ultrasonic sensor cannot be used for this, because to an ultrasonic sensor the hand and
the material entering the machine look the same; it would raise an alarm for every object that enters the machine.
To overcome this, we use an object detector that detects only hands and raises the alarm only when a hand crosses the imaginary safety
line. It is also possible to keep a record of how many times a hand has crossed the line, so that workers who see the statistics become
more aware of the danger and take care to protect themselves. The same idea applies to other problems: by replacing the hand with a
bike as the target object, we can detect bikes that cross a staggered stop line so that the riders can be fined. The project therefore
combines computer vision techniques with object detection, and it addresses both industrial and everyday real-time problems.
The core task is to detect hands and nothing else. We therefore use a convolutional neural network to extract feature maps and pass
these feature maps to the detector, which returns the locations of the hands. If a hand in the image crosses the safety line, we raise
an alert in the form of an alarm [3].
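The flow described above (frame in, hand boxes out, line check, alarm) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `detect_hands` stands in for the SSD model and `fake_detector` is a hypothetical stub, the box format `(x_min, y_min, x_max, y_max)` in pixels is assumed, and the safety line is taken to be horizontal at `line_y` with the machine inlet below it.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels; assumed format


def hand_crosses_line(box: Box, line_y: int) -> bool:
    """True when the lower edge of the box lies below a horizontal safety line.

    Image y coordinates grow downward, so "below" means a larger y value.
    The machine inlet is assumed to lie below the line.
    """
    _, _, _, y_max = box
    return y_max > line_y


def process_frame(frame, detect_hands: Callable[[object], List[Box]], line_y: int) -> bool:
    """Run the (hypothetical) detector on one frame and decide whether to alarm."""
    boxes = detect_hands(frame)
    return any(hand_crosses_line(box, line_y) for box in boxes)


def fake_detector(frame) -> List[Box]:
    """Stub used only for illustration; a real system would call the SSD model here."""
    return [(100, 50, 160, 120), (300, 200, 360, 290)]


alarm = process_frame(frame=None, detect_hands=fake_detector, line_y=250)
print("ALARM" if alarm else "ok")
```

Because `any` is used, one hand over the line is enough to trigger the alarm, regardless of how many hands are in the frame.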
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 1739
A. Existing system
In the automation sector, object detection is a key activity. Ultrasonic sensors detect objects using sound waves: most of them detect
an object and estimate its distance from the return echo of an emitted sound wave that bounces off a target or the background. To such
a sensor, however, every object is the same. In our case it would treat both the hand and the material as objects, which is not what we
need: we want to detect only the hand, and to raise an alarm only when that hand crosses the line. The existing system therefore
cannot solve our problem, and we moved to deep learning.
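To make this limitation concrete, the sketch below computes range from an echo delay (distance = speed of sound × round-trip time / 2) and applies a threshold. The speed of sound (about 343 m/s in air at roughly 20 °C), the threshold and the echo times are illustrative assumptions, not values from this work. The reading carries no information about what reflected the sound, so a hand and a piece of material at the same range give the same result.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees C (assumption)


def echo_distance_m(round_trip_s: float) -> float:
    """Distance to the reflecting surface, computed from the round-trip echo time."""
    return SPEED_OF_SOUND_M_S * round_trip_s / 2.0


def ultrasonic_alarm(round_trip_s: float, threshold_m: float = 0.30) -> bool:
    """Alarm when anything is closer than the threshold; the sensor cannot tell what it is."""
    return echo_distance_m(round_trip_s) < threshold_m


# Hypothetical echoes: a hand and a piece of scrap, both about 0.2 m from the sensor.
hand_echo_s = 0.00117
scrap_echo_s = 0.00117

print(ultrasonic_alarm(hand_echo_s), ultrasonic_alarm(scrap_echo_s))  # True True
```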
B. Proposed system
Object detection is a computer technology, related to computer vision and image processing, that detects instances of semantic objects
of a certain class (such as people, buildings, or cars) in digital images and videos. Face detection and pedestrian detection are two
well-studied object detection areas. Object detection has a wide range of applications in computer vision, including image retrieval
and video surveillance. Many object detection algorithms exist. We chose SSD because it offers a good balance of accuracy and speed
for real-time object detection, which is exactly what this application needs.
A. Feature Extraction
Convolution is the deep learning operation used to extract features from an image.
Feature extraction in the convolutional layers involves several steps. Fig 1 is a block diagram of the process of extracting features
from a given image. The steps are:
1) Convolution layer
2) Pooling layer
3) ReLU layer
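The three steps listed above can be illustrated on a toy image in plain Python. This is a minimal sketch, not the VGG-16 network: the 5×5 image, the 2×2 edge kernel and the 2×2 pooling window are illustrative assumptions. The steps are applied in the order listed; since ReLU is monotone, applying it before or after max pooling gives the same result.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """'Valid' 2-D cross-correlation (what deep-learning libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool_2x2(fm: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [
            max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
            for j in range(0, len(fm[0]) - 1, 2)
        ]
        for i in range(0, len(fm) - 1, 2)
    ]


def relu(fm: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fm]


# Toy image with a vertical edge between columns 1 and 2, and a vertical-edge kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]

features = relu(max_pool_2x2(conv2d_valid(image, kernel)))
print(features)  # [[2.0, 0.0], [2.0, 0.0]]
```

The non-zero entries of the resulting feature map mark where the kernel's pattern (a dark-to-bright vertical edge) occurs in the image.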
The convolutional network used for feature extraction is VGG-16, which acts as the base network. The output of VGG-16 is a feature
map. A single feature map is not sufficient, however, because SSD makes detections at several scales. SSD therefore adds further
convolutional layers on top of the VGG-16 base network, as shown in Fig 2. These layers produce a stack of feature maps with
different spatial sizes and channel counts. The feature maps are then passed to the detection heads, which are small neural networks
that predict the class and bounding box of each object [2].
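To give a sense of what "detection at several scales" means in numbers, the sketch below counts the default boxes produced by the standard SSD300 configuration reported in Liu et al. [6]. That this work used exactly this configuration (300×300 input, these six feature maps, these boxes per location) is an assumption; the input resolution is not stated here.

```python
# (feature-map side length, default boxes per location) for the SSD300 layers
# conv4_3, fc7, conv8_2, conv9_2, conv10_2 and conv11_2, as reported in Liu et al. [6].
# Whether the model in this work uses exactly this configuration is an assumption.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def boxes_per_layer(layers):
    """Number of default boxes contributed by each square feature map."""
    return [side * side * k for side, k in layers]


per_layer = boxes_per_layer(SSD300_LAYERS)
total = sum(per_layer)
print(per_layer, total)  # [5776, 2166, 600, 150, 36, 4] 8732
```

The large early feature maps contribute most of the boxes and cover small objects, while the small late feature maps cover objects that fill much of the frame.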
Fig 4 describes the condition for raising the alarm, which depends on the signed distance between the hand and the safety line. If the
distance is less than zero, the hand has crossed the safety line and the alarm is raised. In all other cases the alarm is not raised.
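The condition described above can be written as a signed-distance test. The sketch below is one possible formulation and not necessarily the one used in this work: it assumes the line is given by two points, that the order of those points defines which side is "safe" (positive), and that a hand's distance is the smallest signed distance over the four corners of its bounding box.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max); assumed format


def signed_distance(p: Point, a: Point, b: Point) -> float:
    """Signed perpendicular distance from p to the line through a and b.

    The sign comes from the 2-D cross product, so which side is positive
    depends on the order of a and b; that order encodes the "safe" side.
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return cross / math.hypot(bx - ax, by - ay)


def box_distance(box: Box, a: Point, b: Point) -> float:
    """Smallest signed distance over the four corners of the box."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]
    return min(signed_distance(c, a, b) for c in corners)


def should_alarm(box: Box, a: Point, b: Point) -> bool:
    """Alarm when any part of the box is on the unsafe side (distance < 0)."""
    return box_distance(box, a, b) < 0


# Horizontal line at y = 250, given right-to-left so that points above it
# (smaller y in image coordinates) have positive distance.
LINE_A, LINE_B = (640.0, 250.0), (0.0, 250.0)

print(box_distance((100, 50, 160, 120), LINE_A, LINE_B))   # 130.0
print(should_alarm((300, 200, 360, 290), LINE_A, LINE_B))  # True
```

Using a two-point line rather than a fixed row lets the safety line be drawn at any angle to match the camera's view of the machine inlet.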
In Fig 5, Fig 6, Fig 7 and Fig 8, the alarm is raised whenever a hand crosses the red line (the safety border line), whether one hand
or both hands are in the frame, and the alarm is not raised while the hands remain some distance from the red line.
The object detection model used, SSD (Single Shot MultiBox Detector), gives good results in detecting hands in the image and works
well for this real-time application. The model typically detects a hand with a confidence of about 95%, and sometimes as high as 98%.
In line with our requirement (detect only hands, so that the alarm can be raised), the model detects hands and no other objects, as
shown in Fig 9.
In Fig 9, although the person is holding a phone, the model detects only the hand and not the phone. The phone crosses the red line,
but no alert is raised because it is not a hand. The model detects only hands because the aim is to keep hands from entering the
machine.
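The behaviour described above amounts to keeping only confident hand detections before the line check. The sketch below assumes the detector returns a list of dictionaries with `label`, `score` and `box` fields, and uses a 0.5 score threshold; both are illustrative assumptions. A "phone" entry is included only to show what a multi-class detector might return; with a hand-only model, the score threshold is the part that matters.

```python
from typing import Dict, List


def keep_hands(detections: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep detections labelled "hand" whose confidence is at least min_score."""
    return [d for d in detections if d["label"] == "hand" and d["score"] >= min_score]


# Hypothetical detector output for one frame (values are illustrative).
detections = [
    {"label": "hand", "score": 0.95, "box": (300, 180, 360, 240)},
    {"label": "phone", "score": 0.91, "box": (320, 230, 350, 300)},
    {"label": "hand", "score": 0.30, "box": (10, 10, 40, 40)},
]

hands = keep_hands(detections)
print([d["box"] for d in hands])  # [(300, 180, 360, 240)]
```

Only the boxes that survive this filter would be tested against the safety line, so the phone crossing the line does not raise the alarm.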
V. CONCLUSION
Detecting a hand and raising an alert whenever it crosses the safety line addresses a real safety problem in real time. Many object
detection algorithms exist, but SSD works well for real-time applications because, as we have seen, it is very fast and it detects at
several scales, so that objects far from the camera (which appear small in the image) can still be found. Detection algorithms such as
R-CNN, Fast R-CNN and Faster R-CNN first generate region proposals and then classify each proposed region, which takes
considerably longer. SSD, the Single Shot MultiBox Detector, instead predicts classes and bounding boxes for the entire image in a
single pass, rather than first picking out regions of interest. It is often used for real-time object detection because it gives up a little
accuracy in exchange for a large gain in speed. The SSD method uses a feed-forward convolutional network to generate a fixed-size
collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum
suppression step that produces the final detections (among heavily overlapping bounding boxes, only the one with the highest score
is kept).
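The non-maximum suppression step mentioned above can be sketched in a few lines of plain Python. This is a generic greedy NMS, not code from this work; the box format `(x_min, y_min, x_max, y_max)`, the example boxes and the IoU threshold of 0.45 are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max); assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.45) -> List[int]:
    """Greedy NMS: visit boxes by descending score, keep a box only if it does not
    overlap an already kept box by more than iou_threshold. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes on one hand and a separate box on another hand.
boxes = [(100, 100, 200, 200), (105, 105, 205, 205), (300, 300, 380, 380)]
scores = [0.95, 0.90, 0.80]

kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap either and is kept.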
REFERENCES
[1] Redmon, J., & Farhadi, A. (2017). YOLO9000: better, faster, stronger. arXiv preprint.
[2] Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural
Information Processing Systems (pp. 91-99).
[3] Dalal, N., & Triggs, B. (2005, June). Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005.
IEEE Computer Society Conference on (Vol. 1, pp. 886-893). IEEE.
[4] Lee, K., Choi, J., Jeong, J., & Kwak, N. (2017). Residual features and unified prediction network for single stage detection. arXiv preprint arXiv:1707.05031.
[5] Wang, R. J., Li, X., Ao, S., & Ling, C. X. (2018). Pelee: A Real-Time Object Detection System on Mobile Devices. arXiv preprint arXiv:1804.06882.
[6] Liu, W., Anguelov, D., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European Conference on Computer Vision (pp.
21-37). Springer, Cham.
[7] https://arxiv.org/abs/1412.1441 (website).