1.1 INTRODUCTION
Face recognition is a complex image processing problem in real-world applications, with complex effects of illumination, occlusion, and imaging conditions on live images. It is a combination of face detection and recognition techniques in image analysis. A detection application is used to find the positions of faces in a given image, while a recognition algorithm is used to classify given images with known structured properties, as is common in most computer vision applications. These images have some known properties, such as the same resolution, the same facial feature components, and similar eye alignment. Such images will be referred to as "standard images" in the following sections. Recognition applications use standard images, and detection algorithms detect the faces and extract face images that include the eyes, eyebrows, nose, and mouth. That makes the combined system more complicated than a single detection or recognition algorithm. The first step in a face recognition system is to acquire an image from a camera; the second is face detection in the acquired image; the third is face recognition, which takes the face images from the output of the detection part; and the final step is the person's identity, the result of the recognition part. Acquiring images from the camera into the computer and computational medium (environment) via a frame grabber is the first step in face recognition system applications.
The input image, in the form of digital data, is sent to the face detection part of the software to extract each face in the image. Many methods for detecting faces in images are available in the literature, and they can be classified into two main groups: knowledge-based and appearance-based methods. Briefly, knowledge-based methods are derived from human knowledge of the features that make up a face, while appearance-based methods are derived from training and/or learning methods to find faces. The details of these methods will be summarized in the next chapter.
1.2 INTRODUCTION TO PYTHON
Python is a widely used general-purpose, high-level programming language. It was created by Guido van Rossum and first released in 1991, and it has been further developed by the Python Software Foundation. It was designed with an emphasis on code readability, and its syntax allows programmers to express their concepts in fewer lines of code.
Python is a programming language that lets you work quickly and integrate systems more efficiently. Because Python is dynamically typed, the code stays flexible, but you lose the compile-time type checking of the source code.
1.2.1 BACKGROUND
The project is developed based on Python, where a smart device is used to manage systems. The Internet of Things (IoT) is generally about numerous devices being interconnected uniquely within the existing internet infrastructure, where information is shared among them. It can be viewed as a nervous system that links anything or everything together. It is usually achieved using sophisticated sensors and chips embedded in physical things for real-time information retrieval. The data collected is then analyzed, and intelligent decisions are taken by machines without human intervention, either to solve an existing problem or to improve the current situation. In short, IoT technology makes many existing systems more efficient and smarter. The application area of this project falls within the smart cities sub-field. A smart city is a development vision that uses Information & Communication Technology (ICT) in urban advancement, where a city's assets are managed by smart devices to improve efficiency and reduce human resource consumption. By integrating these concepts, a smart attendance monitoring system will be developed.
Face recognition is crucial in daily life in order to identify family, friends, or someone we are familiar with. We might not perceive that several steps actually take place in order to identify human faces.
Human intelligence allows us to receive information and interpret it in the recognition process. We receive information through the image projected into our eyes, specifically by the retina, in the form of light. One study (2008) mentioned that after visual processing done by the human visual system, we actually classify the shape, size, contour, and texture of the object in order to analyze the information. The analyzed information is then compared to other representations of objects or faces that exist in our memory for recognition. However, we need a large memory to recognize different faces; for example, in universities there are many students of different races and genders, and it is impossible to remember every individual face without making mistakes. In order to overcome human limitations, computers with almost limitless memory, high processing speed, and power are used in face recognition systems.
The traditional student attendance marking technique often faces a lot of trouble. The face recognition student attendance system emphasizes simplicity by eliminating classical attendance marking techniques such as calling student names or checking respective identification cards. These methods not only disturb the teaching process but also cause distraction for students during exam sessions. Apart from calling names, an attendance sheet is often passed around the classroom during lecture sessions. A lecture class, especially one with a large number of students, might find it difficult to have the attendance sheet passed around. Thus, a face recognition attendance system is proposed to replace the manual signing of student presence, which is burdensome and causes students to get distracted in order to sign for their attendance.
Furthermore, the face recognition based automated student attendance system is able to overcome the problem of fraudulent approaches, and lecturers do not have to count the number of students several times to ensure their presence. The paper proposed by Zhao, W. et al. (2003) has listed the difficulties of facial identification.
One of the difficulties of facial identification is the distinction between known and unknown images. In addition, the paper proposed by Pooja G.R. et al. (2010) found that the training process for a face recognition student attendance system is slow and time-consuming. The paper proposed by Priyanka Wagh et al. (2015) mentioned that different lighting and head poses are often problems that can degrade the performance of a face recognition-based student attendance system. Hence, there is a need to develop a real-time student attendance system, which means the identification process must be completed within defined time constraints to prevent omission. The features extracted from facial images, which represent the identity of the students, have to be consistent against changes in background, illumination, pose, and expression. High accuracy and fast computation time will be the evaluation points of the performance. In the previous attendance management system, the accuracy of the collected data is the biggest issue. This is because the attendance might not be recorded personally by the original person; in other words, the attendance of a particular person can be taken by a third party without the knowledge of the institution, which violates the accuracy of the data.
For example, student A is reluctant to attend a particular class, so student B signs the attendance on his/her behalf even though student A did not attend the class, and the system overlooks this because no enforcement is practiced. Supposing the institution establishes enforcement, it would need to spend a lot of human resources and time, which in turn would not be practical at all. Thus, all the attendance recorded in the previous system is not reliable for analysis. The second problem with the previous system is that it is too time-consuming. Assuming the time taken for a student to sign his/her attendance on a 3-4 page name list is approximately 1 minute, in 1 hour only approximately 60 students can sign their attendance, which is obviously inefficient and time-consuming. The third issue is the accessibility of this information to the legitimate concerned parties. For example, most parents are very concerned to track their child's actual whereabouts to ensure their child really attends classes in college/school. However, in the previous system, there was no way for parents to access such information. Therefore, the previous system needs to evolve to improve efficiency, data accuracy, and the accessibility of the information for legitimate parties.
The objective of this project is to develop a face recognition attendance system. The expected achievements that fulfill this objective are outlined below. In order to solve the drawbacks of the previous system stated in 1.1, the existing system needs to evolve. The proposed system will reduce paperwork, as attendance will no longer involve any manual recording.
The new system will also reduce the total time needed for attendance recording, and it will acquire individual attendance by means of facial recognition to secure the data accuracy of the attendance.
The main intention of this project is to solve the issues encountered in the old attendance system while producing a brand-new, innovative smart system that can provide convenience to the institution. In this project, a smart device will be developed which is capable of recognizing the identity of each individual and eventually records the data into a database system. Apart from that, a website will be developed to provide visual access to the information. The following are the project scopes:
● The targeted groups of the attendance monitoring system are the students and staff of an educational institution.
● The database of the attendance management system can hold up to 2,000 individuals' information.
● The facial recognition process can only be done for one person at a time.
● There will be two types of webpage interface after the login procedure, for admins and non-admins respectively.
● The project has to work under a Wi-Fi coverage area, as the system needs to update the database of the attendance system constantly.
● The smart device is powered by a power bank to improve the portability of the device.
We set out to design a system comprising two modules. The first module (the face detector) is a mobile component: essentially a camera application that captures student faces and stores them in a file using computer vision face detection algorithms and face extraction techniques. The second module is a desktop application that performs face recognition on the captured images (faces) in the file, marks the student register, and then stores the results in a database for future analysis.
Using this system, we will be able to accomplish the task of marking attendance in the classroom automatically, with the output obtained in an Excel sheet, as desired, in real time. However, in order to develop a dedicated system which can be implemented in an educational institution, a very efficient algorithm which is insensitive to the lighting conditions of the classroom has to be developed. Another important aspect to work towards is creating an online database of the attendance and automatic updating of the attendance.
1.5 METHODOLOGY
In order to obtain the attendance, positions, and face images of students in a lecture, we propose an attendance management system based on face detection in the classroom. The system estimates the attendance and position of each student by continuous observation and recording. The current work is based on a method that obtains different weights for each focused seat. Deep learning is a domain that enables a machine to train itself on datasets provided as input and to produce an appropriate output during testing by applying different learning algorithms. Nowadays, attendance is considered an important factor for both the students and the teachers of an educational organization. With the advancement of deep learning technology, the machine automatically detects the attendance performance of the students and maintains a record of the collected data.
A similar separation of pattern recognition algorithms into four groups was proposed by Jain and colleagues. We can group face recognition methods into three main groups; the following approaches are proposed:
Template matching: Patterns are represented by samples, models, pixels, curves, or textures. The recognition function is usually a correlation or distance measure.
Neural networks: The representation may vary; at some point there is a network function.
Note that many algorithms, especially current complex algorithms, may fall into more than one of these categories. The most relevant face recognition algorithms will be discussed later under this classification.
1.5.1 Manual Attendance System
A manual student attendance management system is a process in which a teacher records the attendance of a particular class manually. Manual attendance may be considered a time-consuming process; sometimes the teacher may miss someone, or students may answer multiple times on behalf of their absent friends. So, a problem arises when we think about the traditional process of taking attendance in the classroom.
1.5.2 Automated Attendance System (AAS)
An Automated Attendance System (AAS) is a process for taking the attendance of students in the classroom using face recognition technology. It is also possible to recognize whether a student is sleeping or awake during the lecture, and the system can also be used in exam sessions to ensure the presence of students. Attendance is taken by a real-time camera positioned at the door, which senses anyone entering or exiting the classroom. The camera is trained in such a way that it differentiates real faces from shadows and photos.
Face recognition approaches fall into two categories:
1. Feature-based approach: The feature-based approach, also known as local face recognition, is used to point out the key features of the face.
2. Brightness-based approach: The brightness-based approach, also termed global face recognition, is used to recognize all the parts of the image.
CHAPTER 2: SYSTEM DESIGN
The primary purpose of this review is to examine the solutions provided by other authors, consider the imperfections of the systems they proposed, and identify the best solutions. Kawaguchi introduced a lecture attendance system with a new method called continuous monitoring, in which students' attendance is marked automatically by a camera that captures photos of the students in the class. The architecture of the system is simple: two cameras are mounted on the wall of the class. The first is a capturing camera used to capture images of students in the class, and the second is a sensor camera used to determine the seat of a student inside the class, toward which the capturing camera is directed to snap the student's image. The system compares the pictures taken by the capturing camera with the faces in the database several times to perfect the attendance. Another paper introduced a real-time computer vision algorithm for an automatic attendance management system.
That system installed the camera non-intrusively, so it can snap images in the classroom and compare the faces extracted from the captured images with the faces stored in the system. The system also used machine learning algorithms of the kind usually used in computer vision; Haar classifiers were used to train on the images from the capturing camera. The face snapped by the capturing camera is converted to grayscale and image subtraction is performed; the image is then transferred to the server for storage and later processing. In 2012, N. Kar introduced an automated attendance management system using a face recognition technique based on Principal Component Analysis. To implement the system, two libraries were used: OpenCV, a computer vision library, and FLTK (Fast Light Toolkit). Both of these libraries helped the development: OpenCV supplied the algorithms, and FLTK was used to design the interface. The system provides two functions: Request Matching and Adding a New Face to the Database. In Request Matching, the first step is to open the camera and snap a photo, after which the frontal face is extracted. The next step is to recognize the face against the training data by projecting the extracted face onto the Principal Component Analysis subspace. The final step displays the nearest face among the acquired images.
Apart from that, adding a new face into the database involves snapping a photo, extracting the frontal face image, and then applying the Haar cascade method before performing the Principal Component Analysis algorithm. The final step is storing the information inside a face XML file. The system focuses on the algorithm to improve face detection from acquired images or videos. Another author also proposed a system which implements automatic attendance using face recognition and which can extract facial objects such as the nose and mouth by using MATLAB with Principal Component Analysis (PCA). The system was designed to resolve the issues of attendance marking systems, such as being time-consuming. The experimental results in that paper show that the system can recognize faces against a dark background or from different views of the face in the classroom. Jyotsana Kanti proposed a smart attendance marking system which combines two different algorithms: Principal Component Analysis and an Artificial Neural Network.
The author's purpose is to replace the traditional attendance marking system and to resolve its time consumption. In the part of the system implemented with Principal Component Analysis, extraction is performed and similarities are identified between the face database and the acquired images. The Artificial Neural Network is used to learn from the input data and the expected values. The author's implementation uses the back-propagation algorithm combined with mathematical functions. The camera needs to be installed at the front, where it can capture the entire face of every student inside the class. In the first phase, after the camera has captured it, the image is transferred into the system as input. The image captured from the camera sometimes comes with darkness or brightness problems, which require enhancement, such as conversion to a gray image. In the next step, Histogram Normalization is used to adjust the contrast of the image, making it easier to recognize students sitting in the back row. A median filter is used to remove noise from the image; even when the camera is a high-definition camera, the image sometimes still contains noise. As a result, the author's research shows that the system can be used for recognition in different environments.
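As a rough sketch, the enhancement steps described above (grayscale conversion, histogram normalization, and median filtering) can be reproduced with OpenCV; the input file name here is hypothetical:

import cv2

# Read a (possibly dark or over-bright) classroom capture; file name is illustrative.
img = cv2.imread("classroom_capture.jpg")

# Convert to a gray image to handle darkness/brightness issues.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Histogram normalization (equalization) to adjust the contrast.
norm = cv2.equalizeHist(gray)

# Median filter to remove residual noise, even from a high-definition camera.
clean = cv2.medianBlur(norm, 3)

cv2.imwrite("enhanced.jpg", clean)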
Student Attendance System using Face Recognition: Samridhi Dev and Tushar Patnaik (2020). In this paper, the system was tested on three different algorithms, of which the KNN algorithm proved to be the best, with an accuracy of 99.27%. The system was tested under various conditions, including illumination, head movements, expressions, and the distance of students from the camera.
2.1.1 FACE RECOGNITION-BASED ATTENDANCE SYSTEM
The system consists of two cameras: one for determining the seating positions (fixed at the ceiling) and the other for capturing the students' faces (fixed in front of the seats). To determine the target seat, the Active Student Detection (ASD) method is used to estimate the presence of a student on a seat. One seat is targeted, and the camera is directed to capture the image. The captured face image is enhanced, recognized, and recorded into the database. Every seat has a vector of values that represents the relationship between the student and the seat. Attendance is estimated by interpreting the face recognition data obtained by continuous observation. The position and attendance of the student are recorded in the database.
1. Detection
2. Recognition
3. Detection and Recognition
Face detection involves identifying a person’s face in an image or video. This is done by
analyzing the visual input to determine whether a person’s facial features are present. Since
human faces are so diverse, face detection models typically need to be trained on large amounts
of input data for them to be accurate. The training dataset must contain a sufficient representation
of people who come from different backgrounds, genders, and cultures. These algorithms also
need to be fed many training samples comprising different lighting, angles, and orientations to
make correct predictions in real-world scenarios.
These nuances make face detection a non-trivial, time-consuming task that requires hours of
model training and millions of data samples. The OpenCV package comes with pre-trained
models for face detection, which means that we don’t have to train an algorithm from scratch.
More specifically, the library employs a machine learning approach called Haar cascade to
identify objects in visual data. Face detection can be regarded as a specific case of object-class
detection. In object-class detection, the task is to find the locations and sizes of all objects in an
image that belong to a given class.
Face-detection algorithms focus on the detection of frontal human faces. The process is analogous to image matching, in which the image of a person is matched bit by bit against the images stored in the database. Any facial feature changes in the database will invalidate the matching process.
A reliable face-detection approach based on the genetic algorithm and the eigen-face technique:
Firstly, the possible human eye regions are detected by testing all the valley regions in the gray-
level image. Then the genetic algorithm is used to generate all the possible face regions which
include the eyebrows, the iris, the nostril and the mouth corners.
Each possible face candidate is normalized to reduce both the lighting effect, which is caused by
uneven illumination; and the shirring effect, which is due to head movement. The fitness value of
each candidate is measured based on its projection on the eigen-faces. After a number of
iterations, all the face candidates with a high fitness value are selected for further verification. At
this stage, the face symmetry is measured and the existence of the different facial features is
verified for each face candidate.
Face detection is used, for example, to unlock devices, while face recognition can also be used to analyze and track faces. It is certain that without face detection, it is impossible to use face recognition; this is why face detection makes face recognition technology possible. On the other hand, face detection is used by all facial recognition systems, but not all face detection systems are used for facial recognition.
Some popular applications like Snapchat or Instagram allow their users to modify their faces with fun filters in real time. This is made feasible by face detection algorithms, which inform the apps that there is a face on the screen that can be tracked and modified. Using face detection technology, "facial motion capture" is used to create computer graphics (CG), 3D animations, and real-time avatars for movies, video games, and other media channels. It makes facial emotion tracking possible.
2.1.3 RECOGNITION
There are various processes involved in facial recognition. Face detection begins by locating and extracting facial characteristics from an image or video frame. The placements of the eyes, nose, mouth, and other recognizable facial hallmarks are examples of these traits. The program then transforms the detected facial traits into a mathematical representation called a face template or faceprint. This face template is then compared to a database of previously known face templates to see whether there is a match or likeness.
Face recognition systems may be used for various purposes, such as identity verification, access control, surveillance, and personalization. They are employed in many industries, including customer service, mobile technology, social media, and law enforcement. The ethical and privacy issues raised by facial recognition technology must be noted. Some of the problems with its use include the possibility of misusing personal data, mass surveillance, and the danger of false positives and negatives.
Rules and procedures are being devised to address these worries and to guarantee the appropriate use of the technology. Face recognition is a method of identifying or verifying the identity of an individual using their face. There are various algorithms that can perform face recognition, but their accuracy may vary. Here, we describe how face recognition is done using deep learning.
So now let us understand how we recognize faces using deep learning: we make use of face embeddings, in which each face is converted into a vector, a technique called deep metric learning.
Feature extraction: Now that we have cropped the face out of the image, we extract features from it. Here we use face embeddings to extract the features of the face. A neural network takes an image of the person's face as input and outputs a vector representing the most important features of the face. In machine learning, this vector is called an embedding, and thus we call this vector a face embedding. Now how does this help in recognizing the faces of different persons?
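As a rough illustration, the open-source face_recognition library (a wrapper around dlib) computes exactly this kind of 128-dimensional embedding. A minimal sketch, with hypothetical file names:

import face_recognition

# Load a known student portrait and a new probe image (names are illustrative).
known_img = face_recognition.load_image_file("student_alice.jpg")
probe_img = face_recognition.load_image_file("classroom_snapshot.jpg")

# Each encoding is a 128-dimensional vector describing the face.
known_enc = face_recognition.face_encodings(known_img)[0]

for enc in face_recognition.face_encodings(probe_img):
    # A Euclidean distance below roughly 0.6 is commonly treated as a match.
    dist = face_recognition.face_distance([known_enc], enc)[0]
    print("distance:", dist, "match:", dist < 0.6)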
2.2 STUDENT ATTENDANCE SYSTEM
a) Face Detection
b) Face Recognition
Face detection is a computer technology being used in a variety of applications that identifies
human faces in digital images. Face detection also refers to the psychological process by which
humans locate and attend to faces in a visual scene. Face detection can be regarded as a specific
case of object-class detection. In object-class detection, the task is to find the locations and sizes
of all objects in an image that belong to a given class.
Examples include upper torsos, pedestrians, and cars. Face-detection algorithms focus on the detection of frontal human faces. The process is analogous to image matching, in which the image of a person is matched bit by bit.
Any facial feature changes in the database will invalidate the matching process. A reliable face-detection approach based on the genetic algorithm and the eigenface technique works as follows: first, the possible human eye regions are detected by testing all the valley regions in the gray-level image. Then the genetic algorithm is used to generate all the possible face regions, which include the eyebrows, the iris, the nostrils, and the mouth corners.
Each possible face candidate is normalized to reduce the lighting effect caused by uneven illumination and the shirring effect due to head movement. The fitness value of each candidate is measured based on its projection onto the eigenfaces. After several iterations, all the face candidates with a high fitness value are selected for further verification. At this stage, the face symmetry is measured, and the existence of the different facial features is verified for each face candidate.
2.2.2 FACE RECOGNITION:
Some algorithms analyze facial features; for example, an algorithm may analyze the relative position, size, and/or shape of the eyes, nose, cheekbones, and jaw. These features are then used to search for other images with matching features. Other algorithms normalize a gallery of face images and then compress the face data, saving only the data in the image that is useful for face recognition. A probe image is then compared with the face data. One of the earliest successful systems was based on template matching techniques applied to a set of salient facial features, providing a sort of compressed face representation.
Recognition algorithms can be divided into two main approaches: geometric, which looks at distinguishing features, and photometric, which is a statistical approach that distills an image into values and compares those values with templates to eliminate variances. Popular recognition algorithms include Principal Component Analysis using eigenfaces, Linear Discriminant Analysis, Elastic Bunch Graph Matching using the Fisherface algorithm, the Hidden Markov model, Multilinear Subspace Learning using tensor representation, and neuronally motivated dynamic link matching.
Digital image processing is the processing of digital images by a digital computer. Digital image processing techniques are motivated by three major application areas.
The digital image processing system consists of six stages: image acquisition, pre-processing, feature extraction, associative storage, knowledge base, and recognition. The first step in the process is image acquisition, or the capturing of a digital image. A digitized image is an image f(x, y) in which both the spatial coordinates and the brightness are digitized. The elements of a digitized array are called picture elements, or pixels. The image acquisition stage is concerned with the sensors that capture images. The sensor can be a camera or a scanner, and the nature of the sensor and the image it produces are determined by the application. The pre-processing stage deals with brightness perception as well as image restoration and reconstruction. Image restoration deals with estimating an original image.
2.3.1 PRE-PROCESSING
Pre-processing can refer to the manipulation or dropping of data before it is used, in order to ensure or enhance performance, and it is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to data mining and machine learning projects. Data collection methods are often loosely controlled, resulting in out-of-range values, impossible data combinations, and missing values, amongst other issues.
Analyzing data that has not been carefully screened for such problems can produce misleading results. Thus, the representation and quality of the data must be ensured before running any analysis. Often, data preprocessing is the most important phase of a machine learning project, especially in computational biology. If there is a high proportion of irrelevant and redundant information, or noisy and unreliable data, then knowledge discovery during the training phase will be more difficult. Data preparation and filtering steps can take a considerable amount of processing time.
Examples of methods used in data preprocessing include cleaning, instance selection, normalization, one-hot encoding, data transformation, feature extraction, and feature selection.
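A generic sketch of a few of these methods (cleaning, normalization, and one-hot encoding), using pandas and scikit-learn on dummy data:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Dummy records: a missing value to clean and a categorical column to encode.
df = pd.DataFrame({
    "face_width_px": [120.0, 130.0, None, 125.0],
    "department": ["CS", "EE", "CS", "ME"],
})

df["face_width_px"] = df["face_width_px"].fillna(df["face_width_px"].mean())  # cleaning
df[["face_width_px"]] = MinMaxScaler().fit_transform(df[["face_width_px"]])   # normalization
df = pd.get_dummies(df, columns=["department"])                               # one-hot encoding
print(df)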
2.3.2 FEATURE EXTRACTION
Feature extraction aims to reduce the number of features in a dataset by creating new features from the existing ones (and then discarding the original features). This new, reduced set of features should be able to summarize most of the information contained in the original set. In this way, a summarized version of the original features can be created from a combination of the original set.
Another commonly used technique to reduce the number of features in a dataset is feature selection. The difference between the two is that feature selection aims to rank the importance of the existing features in the dataset and discard the less important ones (no new features are created).
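A brief sketch contrasting the two techniques on dummy face-image vectors, using scikit-learn (the shapes and counts are illustrative):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X = np.random.rand(100, 1024)      # 100 flattened 32x32 face images (dummy data)
y = np.random.randint(0, 5, 100)   # 5 identity labels (dummy data)

# Feature extraction: PCA creates 50 new features (eigenface projections).
X_extracted = PCA(n_components=50).fit_transform(X)

# Feature selection: keep the 50 most discriminative original features.
X_selected = SelectKBest(f_classif, k=50).fit_transform(X, y)

print(X_extracted.shape, X_selected.shape)  # (100, 50) (100, 50)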
DATABASE LAYER
The database layer is a centralized database system which consists of the student database and their attendance. The student database is formed by an initial feeding of frames from which the system detects faces, crops them, and stores them in the database; these stored images are henceforth used for the recognition part.
The results of the face recognition module are compared with the images from the student database, and after a successful comparison, the attendance is updated in the database. The sheet is then generated and uploaded to the web app.
1.1.4 Formulation of the Problem Using Technology
It really is tedious to use paper sheets to mark attendance, as there are many classes in an institute and each class has many students.
These many sheets and entries have to be entered manually to keep the record, which is quite hectic and time-consuming. What if automation were brought to such systems and the paperwork eliminated? The most common way to identify any human being is through his/her face, so why not mark the attendance of students or employees using face detection and recognition technologies? A widely used algorithm is the Viola-Jones algorithm, which uses Haar-like features and is openly available in OpenCV (which also provides LBP-based cascades).
Any employee or teacher with a mobile phone or a laptop with a camera can use it to click a picture and upload it to the system to mark attendance. Web-based applications are widely used in any company or institute due to their security and user-friendly options. Hence, the idea is to build a web-based application which provides user authentication, a dashboard with various charts, image upload, and generation of an attendance data sheet.
In the face detection and recognition system, the process flow is initiated by detecting the facial features from a camera or from a picture stored in memory. The algorithm processes the captured image and identifies the number of faces in it by analyzing learned patterns and filtering out the rest. This image processing uses multiple algorithms that take facial features and compare them with a known database.
APPLICATION LAYER
Web applications and web servers are critical to our online presence, and the attacks observed against them constitute more than 70% of the total attacks attempted on the Internet. These attacks attempt to convert trusted websites into malicious ones. For this reason, web server and web application pen testing plays an important role.
SYSTEM LAYER
The operating system can be implemented with the help of various structures. The structure of
the OS depends mainly on how the various common components of the operating system are
interconnected and melded into the kernel. Depending on this, we have to follow the structures
of the operating system. The layered structure approach breaks up the operating system into
different layers and retains much more control on the system. The bottom layer (layer 0) is the
hardware, and the topmost layer (layer N) is the user interface. These layers are so designed that
each layer uses the functions of the lower-level layers only. It simplifies the debugging process
as if lower-level layers are debugged, and an error occurs during debugging. The error must be
on that layer only as the lower-level layers have already been debugged.
DATABASE LAYER
The database layer of the data model consists of the classes and routines needed by a particular
Access Module to transport data using a native database API. The details, which vary widely
from one Access Module to another, are completely invisible to applications, but there are
common themes.
One common theme is the use of early binding of data in each of the Access Modules. In
designing the database layer of the data model, we examined two strategies: late binding and
early binding. With late binding, data is retained in its database-dependent form until it is
requested by a higher layer. With early binding, data is converted to a normalized form at the
earliest possible moment.
Late binding offers potential efficiency advantages. In some scenarios it might be possible to
avoid some type conversions; in a few cases, we might not need to touch the data at all. On the
other hand, early binding allows us to localize all decisions about type conversion to a narrow
layer, and to provide a simpler, more reusable interface to the rest of the library. By normalizing
immediately, we are able to guarantee that data, once retrieved, is always available wherever it is
required. In particular, no additional logic is required to allow data from one database to be
combined with data from another. This was the deciding factor in our decision to use early
binding in the database layer.
● Image Acquisition - An imaging sensor and the capability to digitize the signal produced by the
sensor.
● Description/feature Selection – extracts the description of image objects suitable for further
computer processing.
● Recognition and Interpretation – Assigning a label to the object based on the information
provided by its descriptor. Interpretation assigns meaning to a set of labelled objects.
● Knowledge Base – This helps for efficient processing as well as inter module cooperation.
Face recognition is a technology that identifies or verifies a person's identity by analyzing and
comparing patterns in facial features. It involves detecting and extracting facial landmarks,
encoding them into a feature vector, and matching those vectors against a database of known
faces. The advancements in deep learning algorithms, specifically convolutional neural networks
(CNNs), have significantly improved the accuracy and efficiency of face recognition systems.
2.4.1 DETECTION
The MediaPipe Face Detector task lets you detect faces in an image or video. You can use this task to locate faces and facial features within a frame. The task uses a machine learning (ML) model that works with single images or a continuous stream of images, and it outputs face locations along with the following facial key points: left eye, right eye, nose tip, mouth, left eye tragion, and right eye tragion.
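A minimal sketch of this kind of detection using the MediaPipe Solutions API for Python (the input file name is hypothetical):

import cv2
import mediapipe as mp

mp_fd = mp.solutions.face_detection
img = cv2.imread("classroom.jpg")  # hypothetical input image

with mp_fd.FaceDetection(model_selection=0, min_detection_confidence=0.5) as fd:
    # MediaPipe expects RGB input; OpenCV loads BGR.
    results = fd.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

if results.detections:
    for det in results.detections:
        box = det.location_data.relative_bounding_box  # normalized coordinates
        print(box.xmin, box.ymin, box.width, box.height)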
Face detection systems are one of the most common applications of artificial intelligence. While security and robotics implement it in an inconspicuous way, we use face detection every time we take a photo or upload content to social media. It has become part of our lives, and most people don't even notice what's behind it. Face detection can seem simple, but it's not: it is a technology capable of identifying and verifying people from images or video frames, somewhat similar to fingerprint or iris recognition systems.
The automatic analysis of facial expressions is motivated by the essential role the face plays in our emotional and social life. Facial expression is one of the most convincing and natural means human beings have to communicate emotions and intentions and to clarify and emphasize what we say. Furthermore, unlike other non-verbal channels, facial expressions are cross-cultural and universal, not depending on the age or gender of the individual. In the context of an interview or interrogation, the analysis of facial expressions can provide invaluable support to the observer.
For example, it can reveal at what moments expressions occur in relation to the question posed: when listening to it, while processing the information, when answering, or after having given the answer. It is also interesting for the detection of emotional incongruities, that is, situations in which the subject verbally expresses an emotion while showing a very different one on the face. Likewise, the direction of the gaze and the orientation of the head over time convey the degree of attention of the interviewee, giving clues about their interest, abilities, and certain personality traits.
IDENTIFICATION
The face identification procedure simply requires any device with digital photographic technology to generate and obtain the images and data necessary to create and record the biometric facial pattern of the person to be identified.
Unlike other identification solutions such as passwords, verification by email, selfies or images, or fingerprint identification, biometric facial recognition uses unique mathematical and dynamic patterns, working as a face scanner, which makes it one of the safest and most effective systems. The objective of face recognition is, from an incoming image, to find the data of the same face in a set of training images in a database. The great difficulty is ensuring that this process is carried out in real time, something that is not available to all biometric face recognition software providers. Enrollment is the phase in which, for the first time, a facial recognition system captures a face to register it and associate it with an identity, in such a way that it is recorded in the system. This process is also known as digital onboarding with facial recognition.
Face detection and extraction: Face detection is important because when the image taken through the camera is given to the system, a face detection algorithm is applied to identify the human faces in that image. A number of image processing algorithms have been introduced to detect faces in an image and also to locate the detected faces. We have used the HOG method to detect human faces in the given image.
Face positioning: There are 68 specific points in a human face; in other words, 68 face landmarks. The main function of this step is to detect the landmarks of faces and to position the image. A Python script is used to automatically detect the face landmarks and to position the face as much as possible without distorting the image.
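A sketch of 68-point landmark detection with dlib's shape predictor; the predictor file shape_predictor_68_face_landmarks.dat must be downloaded separately, and the input file name is hypothetical:

import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("student.jpg")
for face in detector(img):
    shape = predictor(img, face)
    # Collect the 68 (x, y) landmark points for alignment/positioning.
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    print("first landmark:", points[0], "total points:", len(points))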
This is the subject folder: subjects are filled in according to the timetable, and once the time arrives for the corresponding subject, the system starts capturing images, detects the faces, compares them with the existing database, marks attendance, and generates an Excel sheet for the recognized students.
Face encoding: Once the faces are detected in the given image, the next step is to extract the unique identifying facial features for each one. Basically, whenever we get the localization of a face, 128 key facial points are extracted for each input image; these are highly accurate, and the resulting 128-d facial points are stored in a data file for face recognition.
Face matching: This is the last step of the face recognition process. We have used one of the best learning techniques, deep metric learning, which is highly accurate and capable of outputting a real-valued feature vector. Our system quantifies the faces, constructing the 128-d embedding (quantification) for each. Internally, a compare-faces function is used to compute the Euclidean distance between the face in the image and all faces in the dataset. If the current image matches the existing dataset within a 60% threshold, the system moves on to attendance marking.
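A sketch of this matching step with the face_recognition library's compare_faces helper; the encodings here are random placeholders standing in for the stored student data file:

import numpy as np
import face_recognition

# Placeholder 128-d encodings standing in for the stored student data file.
known_encodings = [np.random.rand(128), np.random.rand(128)]
probe_encoding = np.random.rand(128)  # encoding of the face seen by the camera

# compare_faces thresholds the Euclidean distance (default tolerance 0.6).
matches = face_recognition.compare_faces(known_encodings, probe_encoding, tolerance=0.6)
distances = face_recognition.face_distance(known_encodings, probe_encoding)
print(matches, distances)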
Image acquisition: The image is acquired using a high-definition camera placed in the classroom. This image is given as input to the system.
Dataset creation: A dataset of students is created before the recognition process; it was created only to train this system. We have created a dataset of 5 students which includes their name, roll number, department, and images of each student in different poses and variations. For better accuracy, a minimum of 15 images of each student should be captured. Whenever we register a student's data and images in our system to create the dataset, deep learning is applied to each face to compute the 128-d facial features, which are stored in the student face data file so the face can be recalled in the recognition process. This is applied to each image taken during registration.
The attendance system proved able to recognize images under different angles and lighting conditions. Faces which are not in our training dataset are marked as unknown. The attendance of recognized students is marked in real time, exported to an Excel sheet, and saved by the system automatically. The system calculates attendance subject-wise: the data of students and subjects is added manually by an administrator, and whenever the time for the corresponding subject arrives, the system automatically starts taking snapshots and determines whether human faces appear in the given image or not.
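A small sketch of the export step using pandas (column names and the file name are illustrative, not the project's actual schema):

from datetime import datetime
import pandas as pd

records = [
    {"Name": "Alice", "Roll No": 1, "Time": datetime.now(), "Status": "Present"},
    {"Name": "Bob", "Roll No": 2, "Time": datetime.now(), "Status": "Absent"},
]

# Writing .xlsx files with pandas requires the openpyxl package.
pd.DataFrame(records).to_excel("attendance_sheet.xlsx", index=False)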
TRAINING THE ALGORITHM
First, we need to train the algorithm. To do so, we need a dataset with facial images of the people we want to recognize. We also need to set an ID (it may be a number or the name of the person) for each image, so the algorithm can use this information to recognize an input image and give an output. Images of the same person must have the same ID. With the training set already constructed, the algorithm can be trained.
The development of the face database is an important phase before any facial recognition process can be carried out. It acts as a library to compare against whenever the system wants to identify a person. In the image retrieval process, the system first prompts the user to enter their ID number.
The system then validates the entered input and checks for duplication in the system. To proceed, the entered input must contain exactly 12 digits. Apart from that, the ID entered must be a non-registered ID to ensure no duplication.
After that, a directory is created for each individual, where their portraits will be stored. It is compulsory to store 10-30 portraits per person in the file.
After the acquisition of the images is done, the images undergo pre-processing before being stored in the respective folder. The flowchart above shows only the program flow for the image acquisition process, i.e., the script create_database.py. There are two more Python scripts responsible for the remaining execution, which will be explained in the next sub-section.
Face detection answers the question: where is the face? It identifies an object as a "face" and locates it in the input image. Face recognition, on the other hand, answers the question: who is this? Or: whose face is it? It decides whether the detected face is someone known. It can therefore be seen that face detection's output (the detected face) is the input to the face recognizer, and face recognition's output is the final decision, i.e., face known or face unknown.
CHAPTER 3
The Python language is used by almost all tech-giant companies, such as Google, Amazon, Facebook, Instagram, Dropbox, and Uber. The biggest strength of Python is its huge collection of standard and third-party libraries, which can be used for the following:
Machine Learning
GUI Applications (like Kivy, Tkinter, PyQt etc.)
Web frameworks like Django (used by YouTube, Instagram, Dropbox)
Image processing (like OpenCV, Pillow)
Web scraping (like Scrapy, Beautiful Soup, Selenium)
Test frameworks
Multimedia
Scientific computing
Text processing and many more...
Deep learning: Deep learning is a method in artificial intelligence (AI) that teaches computers to
process data in a way that is inspired by the human brain. Deep learning models can recognize
complex patterns in pictures, text, sounds, and other data to produce accurate insights and
predictions. You can use deep learning methods to automate tasks that typically require human
intelligence, such as describing images or transcribing a sound file into text.
Artificial intelligence (AI) attempts to train computers to think and learn as humans do. Deep
learning technology drives many AI applications used in everyday products, such as the
following:
Digital assistants
Voice-activated television remotes
Fraud detection
Automatic facial recognition
It is also a critical component of emerging technologies such as self-driving cars, virtual reality,
and more.
Deep learning models are computer files that data scientists have trained to perform tasks using
an algorithm or a predefined set of steps. Businesses use deep learning models to analyze data
and make predictions in various applications.
Neural network: A neural network is a method in artificial intelligence that teaches computers
to process data in a way that is inspired by the human brain. It is a type of machine learning
process, called deep learning, that uses interconnected nodes or neurons in a layered structure
that resembles the human brain. It creates an adaptive system that computers use to learn from
their mistakes and improve continuously. Thus, artificial neural networks attempt to solve
complicated problems, like summarizing documents or recognizing faces, with greater accuracy.
Computer vision: Computer vision is the ability of computers to extract information and
insights from images and videos. With neural networks, computers can distinguish and recognize
images similar to humans. Computer vision has several applications, such as the following:
Visual recognition in self-driving cars so they can recognize road signs and other road users
Content moderation to automatically remove unsafe or inappropriate content from image and
video archives
Facial recognition to identify faces and recognize attributes like open eyes, glasses, and facial
hair
Image labeling to identify brand logos, clothing, safety gear, and other image details.
3.2 SOFTWARE USED
OpenCV Python software: OpenCV (Open Source Computer Vision Library) is a library that allows you to perform image processing and computer vision tasks in Python. It provides a wide range of features, including object detection, face recognition, and tracking, so it is very important to install OpenCV. To install it, open your command prompt and type python, which will show your Python version; then type "pip install opencv-python".
VS Code: Visual Studio Code, also commonly referred to as VS Code, is a source-code editor made by Microsoft for Windows, Linux, and macOS. Features include support for debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git. Users can change the theme, keyboard shortcuts, and preferences, and install extensions that add functionality.
Tkinter: Tkinter is the standard GUI library for Python. Python combined with Tkinter provides a fast and easy way to create GUI applications. Tkinter provides a powerful object-oriented interface to the Tk GUI toolkit, and creating a GUI application using Tkinter is an easy task.
3.3 MODULE AND TECHNOLOGY
To build our face recognition attendance system, we will need the following Python libraries:
3.3.1 FACE DETECTION USING OPENCV
A well-known open-source computer vision and image processing library is OpenCV (Open Source Computer Vision Library). It provides a complete collection of functions and methods for several computer vision applications, such as face detection. OpenCV supports several programming languages, including Python, C++, and Java.
The notion of Haar cascades is the foundation of OpenCV's face detection method. Haar cascades are classifiers trained to recognize certain patterns or characteristics in pictures. In the context of face detection, these patterns or features correspond to facial features like the eyes, nose, and mouth. Pre-trained Haar cascades with a focus on face detection are available in OpenCV. These cascades are XML files that contain the learned patterns and parameters required for face detection. The Haar cascades are based on basic rectangular patches known as Haar-like features, which stand out from their surroundings.
The process of face detection using OpenCV typically involves the following steps:
1. Load the Haar cascade classifier: OpenCV offers pre-trained Haar cascade XML files
for face detection. You must load these files into your program to use the classifier.
2. Read and preprocess the input picture: Read and preprocess the image in which faces
are to be detected. The image could be made grayscale, resized, and given any required
improvements during preprocessing.
3. Apply the face detection algorithm: Apply the face detection method to the
preprocessed picture using the loaded Haar cascade classifier. The classifier searches the
image at various sizes and locations for areas corresponding to the recognized face
patterns.
4. Find faces: A region of the picture is thought to include a face if it complies with the
patterns specified by the Haar cascade classifier. A single face might be the subject of
several possible detections.
5. Results post-processing and display: The prospective face detections are post-
processed to remove false positives and improve the final face detection results. To
increase accuracy, this can include using extra filters or methods. The final faces might be
highlighted or annotated in the picture output.
Installing the OpenCV library is required before we can begin face detection in Python.
Once the library is installed, we can start writing our code. As a first step, the relevant modules must be imported and an image read in:
import cv2
image = cv2.imread("image.jpg")
Next, we will use the CascadeClassifier class to detect faces in the image. This class takes in a pre-trained cascade classifier that can be used to detect faces in an image. The classifier is trained using a dataset of face images, and it uses a combination of features such as edges, shapes, and textures to detect faces.
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
faces = face_cascade.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)
The detectMultiScale method takes in the image and a few parameters, such as the scale factor and the minimum number of neighbors. The scale factor controls the size of the detection window between passes, and the minimum number of neighbors controls the number of false positives.
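Putting the snippets above together, a minimal end-to-end sketch that detects faces and draws bounding boxes might look like this (the input file name is hypothetical; the cascade file ships with OpenCV under cv2.data.haarcascades):

import cv2

image = cv2.imread("image.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # detection runs on grayscale

# Load the bundled pre-trained frontal face cascade.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detected.jpg", image)
print(f"Detected {len(faces)} face(s)")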
3.3.2 FACE RECOGNITION USING DLIB
Python's Dlib package for machine learning and computer vision is rather effective. There are a few key ideas and pieces of information needed for accurate and trustworthy facial recognition results with Dlib. Here are several crucial components:
o Pre-trained face detection model: Dlib offers a face detection model that has already been trained on a sizable dataset of labeled faces. This model's purpose is to find faces in video and picture streams. It combines Histogram of Oriented Gradients (HOG) features with a machine learning method, notably a Support Vector Machine (SVM), to recognize faces.
o Bounding box coordinates: Dlib's face detection model gives the bounding box coordinates for each face that is recognized. The rectangular area within which the recognized face lies is known as the bounding box. The coordinates normally include the bounding box's width and height and the (x, y) location of its top-left corner.
o Facial landmarks: Facial landmarks are distinct points on the face, such as the eyes, nose, mouth, and chin. Dlib also has functionality for recognizing these features, which aid in correctly recognizing and aligning faces. Dlib's facial landmark identification model is trained with the shape predictor approach, which employs an ensemble of regression trees.
o Face encodings: Dlib uses face encodings or embeddings to recognize faces. Face
encoding is a concise numerical representation of the face that captures the distinctive
traits and qualities of a particular person's face. These encodings are produced by Dlib
using the ResNet-34 deep learning model. Then, the similarity between the encodings
may be assessed, or faces can be matched against a database of recognized faces.
Steps you would generally take to conduct facial recognition using Dlib are as follows:
1. The pre-trained facial landmark and face detection models should be loaded.
2. Find faces in an image or video frame using the face detection model, then get the
bounding box coordinates for each face that was found.
3. Use the facial landmark detection model to pinpoint certain facial landmarks inside each
recognized face, including the eyes, nose, and mouth.
4. Use the ResNet-34 model to create face encodings for the faces that were detected.
5. To see if there is a match or resemblance, compare the created face encodings with a
database of known face encodings.
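The explanation further below walks through a short detection snippet; a cleaned-up sketch of
it (the image file name is an assumption) is:
import dlib

# Create the pre-trained HOG + SVM frontal face detector
detector = dlib.get_frontal_face_detector()

# Load the image as an array of RGB values
img = dlib.load_rgb_image("image.jpg")

# Run the detector; it returns one rectangle per detected face
faces = detector(img)
print("Number of faces detected:", len(faces))

# Print the bounding box coordinates of each detected face
for face in faces:
    print("Left:", face.left())
    print("Top:", face.top())
    print("Right:", face.right())
    print("Bottom:", face.bottom())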
Explanation of the Dlib method:
1. "import dlib" - This line imports the Dlib library, which provides functionality for
machine learning and computer vision in Python.
2. "detector = dlib.get_frontal_face_detector()" - This line creates a face detector object
using the "get_frontal_face_detector()" function provided by the Dlib library. This
function returns a pre-trained object detector specifically designed for detecting faces in
images.
3. "img = dlib.load_rgb_image("image.jpg")" - This line loads an image named
"image.jpg" using the Dlib library's "load_rgb_image" function, which loads an image
from the specified file path and returns it as an array of RGB values.
4. "faces = detector(img)" - This line uses the face detector object to detect faces in the
image by calling the detector on the image. The detector returns a list of
"rect" objects, each representing a bounding box for a detected face in the image.
5. "print("Number of faces detected:", len(faces))" - This line prints the number of faces
detected in the image using the "len(faces)" function, which returns the number of
elements in the list of "rect" objects.
6. "for face in faces:" - This line starts a loop that iterates through all the "rect" objects in
the list of faces returned by the detector.
7. "print("Left:", face.left())" - This line, inside the loop, prints the left coordinate of the
bounding box of the current face using the "left()" method of the "rect" object.
8. "print("Top:", face.top())" - This line, inside the loop, prints the top coordinate of the
bounding box of the current face using the "top()" method of the "rect" object.
9. "print("Right:", face.right())" - This line, inside the loop, prints the right coordinate of
the bounding box of the current face using the "right()" method of the "rect" object.
10. "print("Bottom:", face.bottom())" - This line, inside the loop, prints the bottom
coordinate of the bounding box of the current face using the "bottom()" method of the
"rect" object.
3.2.3 FACE RECOGNITION USING PANDAS
Pandas is a Python library used for working with data sets. It has functions for analyzing,
cleaning, exploring, and manipulating data. The name "Pandas" refers to both "Panel
Data" and "Python Data Analysis"; the library was created by Wes McKinney in 2008. Pandas is
an open-source library in Python that is made mainly for working with relational or labeled
data both easily and intuitively.
It provides various data structures and operations for manipulating numerical data and time
series. This library is built on top of the NumPy library of Python. Pandas is fast and offers
high performance and productivity for users.
Why use Pandas:
● Fast and efficient for manipulating and analyzing data.
● Data from different file objects can be easily loaded.
● Flexible reshaping and pivoting of data sets.
● Provides time-series functionality.
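In this project Pandas mainly serves to hold and export attendance records; a minimal sketch
(the column names mirror the attendance code later in this report, while the file name and
sample values are assumptions):
import datetime
import pandas as pd

# An attendance table with one row per recognized student
attendance = pd.DataFrame(columns=['Enrollment', 'Name', 'Date', 'Time'])

# Append a record whenever a face is recognized
now = datetime.datetime.now()
attendance.loc[len(attendance)] = [
    '101', 'Alice', now.strftime('%Y-%m-%d'), now.strftime('%H:%M:%S')]

# Keep only the first sighting of each student and export to CSV
attendance = attendance.drop_duplicates(['Enrollment'], keep='first')
attendance.to_csv('Attendance.csv', index=False)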
3.2.4 FACE RECOGNITION USING PILLOW
In today's digital world, we come across lots of digital images. When working with the
Python programming language, many libraries are available to add image processing
capabilities to an application. Some of the most common image processing libraries are
OpenCV, the Python Imaging Library (PIL), scikit-image, and Pillow. However, in this tutorial,
we are only focusing on the Pillow module and will try to explore various capabilities of this
module. The Pillow module provides rich functionality, runs on all major operating systems, and
supports Python 3. It supports a wide variety of image formats such as "jpeg", "png", "bmp",
"gif", "ppm", and "tiff".
Face detection: The face recognition library provides a face_locations() method that locates all
faces in an image or video frame and gives their bounding box coordinates. These bounding box
coordinates specify the location and dimensions of each identified face.
Face landmarks: face_landmarks(), a function in the library, finds and returns the positions of
various facial landmarks, including the eyes, nose, mouth, and chin. These markers can be
helpful for tasks like face alignment, emotion identification, and facial expression analysis.
Face encodings: A function named face_encodings() is offered by the face recognition library;
it computes a 128-dimensional numerical representation, or encoding, for each identified face.
These encodings, which may be used for face comparison and identification, capture the
distinctive traits of every face. The encodings can be kept in a database and compared with fresh
face encodings for recognition.
Face matching: The compare_faces() method provided by the library compares two sets of face
encodings and returns a Boolean result indicating whether or not they match. This feature can
be used to compare a detected face to a database of recognized faces for identification or
verification purposes.
User database: A database of recognized faces and their related face encodings is necessary for
face recognition. This database is a resource for locating and validating people. By saving each
person's face encodings and distinct labels, the face recognition library enables you to build and
maintain such a database.
The face recognition library's face-recognition processes may be implemented using these
components. You can recognize and validate people in photos or video streams by recognizing
faces, extracting facial landmarks, computing face encodings, and comparing them to a known
database.
It's crucial to remember that the effectiveness and efficiency of face recognition systems depend
on the quality of training data, the size of the database, and other factors such as illumination,
pose changes, and occlusions. Considering these factors, a facial recognition system should
be designed and implemented carefully. A face detector has to tell whether an image of arbitrary
size contains a human face and, if so, where it is. Face detection can be performed based on
several cues: skin color (for faces in color images and videos), motion (for faces in videos),
facial/head shape, facial appearance, or a combination of these parameters. Most face detection
algorithms are appearance based, without using other cues. An input image is scanned at all
possible locations and scales by a sub-window. Face detection is posed as classifying the pattern
in the sub-window as either a face or a non-face. The face/non-face classifier is learned from face
and non-face training examples using statistical learning methods. Most modern algorithms are
based on the Viola-Jones object detection framework, which is based on Haar cascades.
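The explanation below refers to a short snippet that combines the face recognition library with
OpenCV; a plausible reconstruction of it (the file name is an assumption) is:
import cv2
import face_recognition

# Load the image and locate all faces in it
image = face_recognition.load_image_file("image.jpg")
face_locations = face_recognition.face_locations(image)

# Draw a red rectangle (BGR color) around each detected face
for (top, right, bottom, left) in face_locations:
    cv2.rectangle(image, (left, top), (right, bottom), (0, 0, 255), 2)

# Show the annotated image until a key is pressed
cv2.imshow("Faces", image)
cv2.waitKey(0)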
Explanation for code:
1. The first line imports the face_recognition library, which detects and recognizes faces in
images and videos.
2. The next line loads the image from the file "image.jpg" using the
face_recognition.load_image_file() method.
3. The line after that detects faces in the image using the face_recognition.face_locations()
method, which returns a list of face locations represented by four coordinates (top, right,
bottom, left) for each face.
4. The following line starts a for loop that iterates through each face location in the list. Each
iteration of the loop assigns the four coordinates of a face location to the variables top,
right, bottom, and left.
5. The next line inside the for loop uses the cv2.rectangle() method to draw a rectangle over
the face. The function accepts the image, the top-left corner of the rectangle, the
bottom-right corner of the rectangle, the color of the rectangle, and the thickness of the
rectangle. In this case, the rectangle is red (0, 0, 255) and the thickness is 2 pixels.
6. The line after that uses the cv2.imshow() method to display the image with the rectangles
drawn around the faces.
7. The last line uses the cv2.waitKey() method to wait for the user to press a key before
closing the window. The argument to this method is the amount of time in
milliseconds to wait before the window closes. In this case, it is set to 0, which means the
window will wait indefinitely for the user to press a key.
MTCNN: The face detection method MTCNN (Multi-task Cascaded Convolutional Networks)
is well known for its reliability and accuracy. It comprises several neural networks that cooperate
to find facial landmarks and faces. MTCNN can recognize faces of various shapes, orientations,
and resolutions and offers facial landmark locations and bounding box coordinates for each face
found. Real-time facial detection programs frequently employ it.
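As a sketch of how MTCNN is typically called from Python (this assumes the mtcnn pip
package; other implementations expose different APIs):
import cv2
from mtcnn import MTCNN

detector = MTCNN()
# The detector expects an RGB image
image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)

# Each detection carries a bounding box, a confidence score, and five keypoints
for detection in detector.detect_faces(image):
    x, y, w, h = detection["box"]
    print("Face at", (x, y, w, h), "confidence", detection["confidence"])
    print("Landmarks:", detection["keypoints"])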
Before training the face recognition model, we need a dataset of labeled images representing
different individuals. Start by gathering images of each person to be recognized. It's
recommended to capture a variety of images under different lighting conditions and angles to
improve model robustness.
Once you have the images, create a directory structure with subdirectories named after each
person, and place their respective images in their respective directories.
To train the face recognition model, we will use a popular pre-trained model called
"dlib_face_recognition_resnet_model_v1." This model provides a 128-dimensional face
embedding for each face detected in an image.
Load the labeled face images, detect faces, and extract facial landmarks using the Dlib library.
Then, utilize the pre-trained model to compute the face embeddings for each image. Store the
computed embeddings along with the corresponding person's label in a dictionary or a data
frame.
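A minimal sketch of this training step, assuming the standard Dlib model files are available on
disk under the names shown and the dataset directory layout described above:
import os
import dlib

# Pre-trained models (standard Dlib downloads; the paths are assumptions)
detector = dlib.get_frontal_face_detector()
shape_predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
face_encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

embeddings = {}  # person label -> list of 128-dimensional face descriptors
for person in os.listdir("dataset"):
    person_dir = os.path.join("dataset", person)
    for file_name in os.listdir(person_dir):
        img = dlib.load_rgb_image(os.path.join(person_dir, file_name))
        for face in detector(img):
            # Landmarks align the face before the ResNet computes its embedding
            shape = shape_predictor(img, face)
            descriptor = face_encoder.compute_face_descriptor(img, shape)
            embeddings.setdefault(person, []).append(list(descriptor))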
Now that we have trained our face recognition model, we can proceed to implement the
attendance system. The system will follow these steps:
The main component used in the implementation approach is the open-source computer vision
library (OpenCV). One of OpenCV's goals is to provide a simple-to-use computer vision
infrastructure that helps people build fairly sophisticated vision applications quickly. OpenCV
library contains over 500 functions that span many areas in vision. The primary technology
behind Face recognition is OpenCV.
The user stands in front of the camera, keeping a minimum distance of 50 cm, and his image is
taken as an input. The frontal face is extracted from the image, converted to grayscale, and
stored. The Principal Component Analysis (PCA) algorithm is performed on the images and the
eigenvalues are stored in an XML file. When a user requests recognition, the frontal face is
extracted from the video frame captured through the camera. The eigenvalue is re-calculated for
the test face and matched with the stored data for the closest neighbor.
4.1 DESIGN OF A FACE RECOGNITION SYSTEM
Research papers on face recognition systems were studied and the state of current technology
was reviewed and summarized in the previous chapter, the results of which guide us in designing
a face recognition system for a future humanoid and/or guide/guard robot. A thorough survey
has revealed that various methods, and combinations of those methods, can be applied in the
development of a face recognition system.
Among the many possible approaches, we have decided to use a combination of knowledge-
based methods for the face detection part and a neural network approach for the face recognition
part. The main reason for this selection is their smooth applicability and reliability.
This produces a set of pose landmarks to be used for training. We are not interested in the pose
detection itself, since we will be training our own model in the next step.
The k-NN algorithm we've chosen for custom pose classification requires a feature vector
representation for each sample and a metric to compute the distance between two vectors, in
order to find the target nearest to the pose sample. This means we must convert the pose
landmarks we just obtained.
To convert pose landmarks to a feature vector, we use the pairwise distances between predefined
lists of pose joints, such as the distance between wrist and shoulder, ankle and hip, and left and
right wrists. Since the scale of images can vary, we normalize the poses to have the same torso
size and vertical torso orientation before converting the landmarks.
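A sketch of this conversion under simple assumptions (landmarks given as (x, y) pairs; the joint
list here is illustrative, not the exact list used):
import numpy as np

def pose_to_feature_vector(landmarks):
    """Convert a dict of (x, y) landmarks into pairwise-distance features."""
    pts = {name: np.array(xy, dtype=float) for name, xy in landmarks.items()}

    # Normalize by torso size so image scale does not matter
    torso = np.linalg.norm(pts["left_shoulder"] - pts["left_hip"])

    # Illustrative joint pairs; a real system uses a predefined list
    pairs = [("left_wrist", "left_shoulder"),
             ("left_ankle", "left_hip"),
             ("left_wrist", "right_wrist")]
    return np.array([np.linalg.norm(pts[a] - pts[b]) / torso for a, b in pairs])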
TRAIN THE MODEL AND COUNT REPETITIONS
We used the MediaPipe Colab to access the code for the classifier and train the model. To count
repetitions, we used another Colab algorithm that monitors the probability threshold of a target
pose class.
For example: When the probability of the "down" pose class passes a given threshold for the first
time, the algorithm marks that the "down" pose class is entered. When the probability drops
below the threshold, the algorithm marks that the "down" pose class has been exited and
increases the counter.
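A minimal sketch of this enter/exit counter, assuming the classifier yields a probability for the
"down" class on each frame:
class RepetitionCounter:
    """Counts repetitions by watching when a pose class is entered and exited."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.in_pose = False
        self.count = 0

    def update(self, down_probability):
        # Mark the pose as entered the first time the probability passes the threshold
        if not self.in_pose and down_probability >= self.threshold:
            self.in_pose = True
        # When the probability drops back below it, the pose has been exited
        elif self.in_pose and down_probability < self.threshold:
            self.in_pose = False
            self.count += 1
        return self.count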
INTEGRATE WITH THE ML KIT
The Colab above produces a CSV file that you can populate with all your pose samples. In this
section, you will learn how to integrate your CSV file with the ML Kit Android QuickStart app
to see custom pose classification in real time. Try pose classification with samples bundled in the
QuickStart app.
Get the ML Kit Android QuickStart app project from GitHub and make sure it builds and runs
well. Go to the Live Preview activity and enable Run classification from the settings page.
Now you should be able to classify pushups and squats.
THE SYSTEM FLOW OF THE CREATION OF FACE DATABASE
Save the trained data: after the training process is done, the trained sets of data are stored
in a (.yml) file which is retrieved during the recognition process, so that the training
process is only run the minimum number of times.
There are two major system flows in the software development section: the creation of the face
database, and the process of attendance taking. Both processes are essential
because they make up the backbone of the attendance management system. In this section, the
process of both flows will be briefly described; their full functionality, specific
requirements, and the methods/approach used to accomplish them are covered later.
Creating the face database is an important step to be done before any further process can be
initiated. This is because the face database acts as a comparison factor during the recognition
process, which will be discussed in a later section. In the process above, a CSV file is created to
aid the process of image labelling: because there will be more than one portrait stored for each
student, labels are used to group the portraits belonging to the same person.
After that, those images are inserted into a recognizer for training. Since the training
process becomes very time consuming as the face database grows larger, training is only done
right after a batch of new student portraits has been added, to ensure it runs as infrequently
as possible.
Other than the creation of the face database, the rest of the remaining processes can all be done
through a web server. Thus, the attendance taking procedure will also be done through a web
server. This provides a friendly user interface to the user (lecturer) while being able to execute
code on the Raspberry Pi to take attendance, without needing to control the Raspberry Pi from a
terminal, which would be confusing for most users. Therefore, with just a click of a button on
the webpage, a Python script is executed which launches a series of initializations, such as
loading the trained data into the recognizer. The attendance taking process then proceeds in a
loop to acquire, identify, and mark the attendance for each of the students captured by the Pi
camera. Every step in both of the process flows above will be explained in detail.
IMPLEMENTING ATTENDANCE SYSTEM
Now that we have trained our face recognition model, we can proceed to implement the
attendance system. The system will follow these steps:
If a match is found, mark the attendance for that person. To maintain attendance records, create
an SQLite database and define a table structure to store relevant information such as the person's
name, date, and time of attendance.
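A minimal sketch of such a table using Python's built-in sqlite3 module (the database file,
table, and column names are assumptions):
import datetime
import sqlite3

conn = sqlite3.connect("attendance.db")
conn.execute("""CREATE TABLE IF NOT EXISTS attendance (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    name TEXT NOT NULL,
                    date TEXT NOT NULL,
                    time TEXT NOT NULL)""")

def mark_attendance(name):
    # Record the person together with the current date and time
    now = datetime.datetime.now()
    conn.execute("INSERT INTO attendance (name, date, time) VALUES (?, ?, ?)",
                 (name, now.strftime("%Y-%m-%d"), now.strftime("%H:%M:%S")))
    conn.commit()

mark_attendance("Alice")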
Radius: The radius is used to build the circular local binary pattern and represents the radius
around the central pixel. It is usually set to 1.
Neighbors: The number of sample points to build the circular local binary pattern. Keep in mind:
the more sample points you include, the higher the computational cost. It is usually set to 8.
Grid X: The number of cells in the horizontal direction. The more cells, the finer the grid, the
higher the dimensionality of the resulting feature vector. It is usually set to 8.
Grid Y: The number of cells in the vertical direction. The more cells, the finer the grid, the
higher the dimensionality of the resulting feature vector. It is usually set to 8.
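In OpenCV (the opencv-contrib-python package), these four parameters map directly onto the
LBPH recognizer's constructor; a short sketch:
import cv2

# Create an LBPH face recognizer with the usual parameter values
recognizer = cv2.face.LBPHFaceRecognizer_create(
    radius=1,      # radius of the circular local binary pattern
    neighbors=8,   # number of sample points around the central pixel
    grid_x=8,      # number of histogram cells in the horizontal direction
    grid_y=8)      # number of histogram cells in the vertical direction

# recognizer.train(face_images, labels) would then fit it on labeled grayscale faces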
Based on the image above, let's break the procedure into several small steps so we can
understand it easily:
A window of 3x3 pixels of the facial image can be represented as a 3x3 matrix containing the
intensity of each pixel (0~255).
Then, we take the central value of the matrix to be used as the threshold.
This value will be used to define the new values of the 8 neighbors.
For each neighbor of the central value (threshold), we set a new binary value. We set 1 for values
equal or higher than the threshold and 0 for values lower than the threshold.
The value of the integral image at a specific location is the sum of the pixels to the left of and
above that location. To illustrate, the value of the integral image at location 1 is the
sum of the pixels in rectangle A.
The values of the integral image at the remaining locations are cumulative. For instance, the
value at location 2 is the sum of A and B, (A + B); at location 3 it is the sum of A and C,
(A + C); and at location 4 it is the sum of all the regions, (A + B + C + D). Therefore, the sum
within the D region can be computed with only addition and subtraction of the diagonal corners,
4 + 1 − (2 + 3), which eliminates rectangles A, B, and C.
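A small sketch of this idea with NumPy: the integral image is a cumulative sum, and any
rectangle's pixel sum then needs only four lookups:
import numpy as np

img = np.random.randint(0, 256, size=(100, 100))

# Integral image: entry (r, c) holds the sum of all pixels above and to the left
integral = img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(integral, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four corner lookups."""
    total = integral[bottom, right]                      # location 4
    if top > 0:
        total -= integral[top - 1, right]                # location 2
    if left > 0:
        total -= integral[bottom, left - 1]              # location 3
    if top > 0 and left > 0:
        total += integral[top - 1, left - 1]             # location 1
    return total

# Matches a direct sum over the same region
assert rect_sum(integral, 10, 20, 30, 40) == img[10:31, 20:41].sum()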
TRAINING THE ALGORITHM
First, we need to train the algorithm. To do so, we need to use a dataset with the facial images of
the people we want to recognize. We need to also set an ID (it may be a number or the name of
the person) for each image, so the algorithm will use this information to recognize an input
image and give you an output. Images of the same person must have the same ID. With the
training set already constructed, let’s see the LBPH computational steps.
The first computational step of LBPH is to create an intermediate image that describes the
original image in a better way, by highlighting the facial characteristics. To do so, the algorithm
uses the concept of a sliding window, based on the parameters radius and neighbors.
Now, the matrix will contain only binary values (ignoring the central value). We need to
concatenate each binary value from each position from the matrix line by line into a new binary
value (e.g., 10001101). Note: some authors use other approaches to concatenate the binary values
(e.g., clockwise direction), but the final result will be the same.
Then, we convert this binary value to a decimal value and set it to the central value of the
matrix, which is actually a pixel from the original image.
At the end of this procedure (LBP procedure), we have a new image which represents
better the characteristics of the original image.
EXTRACTING THE HISTOGRAMS
Now, using the image generated in the last step, we can use the Grid X and Grid Y parameters to
divide the image into multiple grids, as can be seen in the following image:
Based on the image above, we can extract the histogram of each region as follows:
● As we have an image in grayscale, each histogram (from each grid) will contain only 256
positions (0~255) representing the occurrences of each pixel intensity.
● Then, we need to concatenate each histogram to create a new and bigger histogram.
Supposing we have 8x8 grids, we will have 8x8x256 = 16,384 positions in the final histogram.
The final histogram represents the characteristics of the original image.
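A compact sketch of this step with NumPy, assuming lbp_image is the LBP image computed in
the previous section:
import numpy as np

def extract_histogram(lbp_image, grid_x=8, grid_y=8):
    """Concatenate per-cell histograms of an LBP image into one feature vector."""
    h, w = lbp_image.shape
    cell_h, cell_w = h // grid_y, w // grid_x
    histograms = []
    for gy in range(grid_y):
        for gx in range(grid_x):
            cell = lbp_image[gy * cell_h:(gy + 1) * cell_h,
                             gx * cell_w:(gx + 1) * cell_w]
            # 256 bins, one for each possible LBP value
            hist, _ = np.histogram(cell, bins=256, range=(0, 256))
            histograms.append(hist)
    return np.concatenate(histograms)  # 8 x 8 x 256 = 16,384 positions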
4.5 PERFORMING THE FACE RECOGNITION
In this step, the algorithm is already trained: each histogram created is used to represent one
image from the training dataset. So, given an input image, we perform the steps again for this
new image and create a histogram which represents the image.
To find the image that matches the input image, we just need to compare the histograms and
return the image with the closest histogram.
We can use various approaches to compare the histograms (i.e., calculate the distance
between two histograms), for example: Euclidean distance, chi-square, absolute value,
etc. In this example, we can use the well-known Euclidean distance,
D = sqrt(Σᵢ (hist1ᵢ − hist2ᵢ)²), where the sum runs over all positions of the two histograms.
So, the algorithm output is the ID from the image with the closest histogram. The
algorithm should also return the calculated distance, which can be used as a ‘confidence’
measurement.
We can then use a threshold and the ‘confidence’ to automatically estimate if the
algorithm has correctly recognized the image. We can assume that the algorithm has
successfully recognized if the confidence is lower than the threshold defined.
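A sketch of this comparison step, reusing histograms of the form produced in the previous
section (the names and the random demo data are illustrative):
import numpy as np

def predict(input_hist, training_histograms):
    """Return (best_id, distance) for the training histogram closest to the input."""
    best_id, best_dist = None, float("inf")
    for person_id, hist in training_histograms.items():
        dist = np.sqrt(np.sum((hist.astype(float) - input_hist.astype(float)) ** 2))
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id, best_dist

# Demo with random data; real histograms would come from extract_histogram()
training_histograms = {"alice": np.random.rand(16384), "bob": np.random.rand(16384)}
input_hist = np.random.rand(16384)
person, confidence = predict(input_hist, training_histograms)
# Accept the match only when confidence (the distance) is below a chosen threshold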
CHAPTER 4 - MODEL IMPLEMENTATION AND ANALYSIS
Face detection involves separating image windows into two classes: one containing faces, and
one containing the background (clutter). It is difficult because, although commonalities exist
between faces, they can vary considerably in terms of age, skin color, and facial expression. The
problem is further complicated by differing lighting conditions, image qualities, and geometries,
as well as the possibility of partial occlusion and disguise. An ideal face detector would therefore
be able to detect the presence of any face under any set of lighting conditions, upon any
background. The face detection task can be broken down into two steps. The first step is a
classification task that takes some arbitrary image as input and outputs a binary value of yes or
no, indicating whether there are any faces present in the image. The second step is the face
localization task, which takes an image as input and outputs the location of any face or faces
within that image as a bounding box (x, y, width, height). After taking the picture, the system
will compare it against the pictures in its database and give the most related result. We will use
the NVIDIA Jetson Nano Developer Kit, a Logitech C270 HD webcam, and the OpenCV
platform, and will do the coding in the Python language.
Our system contains two databases: one is the storage database and the other is the attendance
database. The storage database contains the trained set and the attendance database contains the
record of attendance marked by our system. A camera is fixed in the classroom and takes an
image as input. After taking the image, the system converts it to a grayscale image. With the
help of the Haar cascade classifier it detects faces, extracts the features from the images, and
stores them in the storage database. For face recognition, the extracted features are compared
with the trained set database. If they match, attendance is updated in the attendance database;
otherwise, it is not updated.
Digital camera: The camera is the only hardware component required to capture a live video
feed of the class.
Vision Acquisition: This module permits images to be captured by the camera into
LabVIEW for programming. It consists of IMAQ submodules such as IMAQ Create,
IMAQ Open, and IMAQ Grab, which combine to provide continuous acquisition of the video
feed from the camera module.
Picture to grayscale: This process is carried out using the IMAQ Extract Single Color Plane
VI to convert a 32/16-bit picture to an 8-bit picture. This is a requirement for our pattern
matching algorithm to work properly.
Sample extraction: This is included in the Vision Assistant VI, which implements our face
recognition algorithm. Sample extraction is the function in which the features of the input
picture are compared using the pattern matching algorithm.
Characteristic extraction: This function is used to extract crucial features from the picture. It
compares them with templates saved in the database and provides a comparison score.
Find match in database: Our database preserves templates or images of the students we
aim to recognize and mark attendance for. This database can be updated or appended according
to requirements. It is used for comparison with the extracted features of a photograph to
confirm a successful hit.
Update Attendance Sheet.xlsx: If a match is found, our algorithm updates the attendance of the
person against his/her name in an Excel file of format .xlsx. If not, the system marks absent in
front of his/her name within the same Excel file.
VISION ASSISTANT
Vision Assistant helps us perform the machine vision pattern matching algorithm on our image.
This allows us to detect the faces of students in a class group. First, one must add student faces
as templates in this program to create a database of images using reference images. The result
that is output includes a score from 0-1000 telling how successful a match was, the position in
the image where the match occurred, and the angle at which the match occurred. This
information, together with the number of matches for each user, will be used to mark the
attendance of the user in future stages.
4.1.2 DATA SAMPLING
We do not take all frames retrieved from the videos uploaded by the data collector application to
build a training dataset. Instead, on each video, we perform head pose detection
using HopeNet to detect exactly three face turning angles. In our experiment, the best values of
the three face turning angles are −0.3 rad, 0 rad, and 0.3 rad. After that, we perform face
detection to extract three faces at each angle. We thus have nine faces evenly spread over the
input video, which is the number of training samples for each class. The facial embedding
methods used are all models pre-trained on millions of face images, so images in normal
conditions can be encoded as feature vectors that are separated from the faces of other classes.
Feature vectors are then extracted from the cropped faces and stored in the facial database.
The system only considers faces that appear in the ROI. Ngo et al. reviewed the face detection
libraries; according to their results, we use MTCNN as the face and landmark
detector. The faces are aligned based on five returned landmark points (left eye, right eye, nose,
left mouth corner, and right mouth corner). We considered several of the face embedding
libraries mentioned in Wang and Deng's survey, including ArcFace, SphereFace, FaceNet, and
CosFace. The main goal of these methods is to maximize face class separability by introducing a
new loss function that is highly discriminative for face recognition features. According to the
survey, ArcFace showed the best results compared to other loss functions that work well
for face recognition, such as triplet loss, intra-loss, and inter-loss. However, through initial
empirical results, they all show adverse effects in the real environment (because of lighting
conditions, motion blur, etc.) without our post-processing. We tested several different FR
libraries and their output feature vectors with several machine learning algorithms, both
parametric and non-parametric, generative and discriminative.
At FPT Polytechnic, if a student is late for the first 15 min, or leaves early during the last 15
min, he/she is counted as absent. Attendance usually takes place at these times. To increase
flexibility in attendance and avoid affecting the class, we use check-in and check-out for this
process. The check-in period starts 15 min before the lesson and ends after the first 15 min;
likewise for check-out at the end of class. Students present in front of the attendance taking
area at the right time are considered present. The AT may not strictly require runtime
responding; however, it should get a response as fast as possible. In this situation, we set 2 min
as the size of a job. We tried many different sizes; 2 min is the best-observed value in terms of
waiting time, processing time, and memory consumption.
Like any AI/ML model, data collection is one of the most important steps in training a facial
recognition system, since it determines the end performance of the system. The developer needs
to ensure that the right dataset is selected for the training process and that the model is
not over/underfitting and is unbiased.
Facial recognition systems usually require large datasets to be trained to avoid false positives.
The more data is fed into the algorithm, the more accurate it will become. Crowdsourcing can be
an effective method of collecting large and diverse datasets for facial recognition systems.
After gathering the dataset, annotation is required so that the algorithm knows what to look for
in the image. When annotating data for a facial recognition system, different points and features
of the face are tagged/labeled with accuracy and consistency.
OUTPUT DESIGN
A quality output is one which meets the requirements of the end user and presents the
information clearly. In any system, the results of processing are communicated to the users and
to other systems through outputs. In output design, it is determined how the information is to be
displayed for immediate need, as well as the hard copy output. It is the most important and
direct source of information for the user. Efficient and intelligent output design improves the
system's ability to support user decision making.
Designing computer output should proceed in an organized, well thought out manner; the right
output must be developed while ensuring that each output element is designed so that people
will find the system easy and effective to use. When analysts design computer output, they
should identify the specific output that is needed to meet the requirements. A crucial step in face
recognition is feature extraction. By extracting discriminative facial features from preprocessed
images, accurate identification and verification can be achieved. Various techniques can be
employed, including traditional methods like eigenfaces and Fisherfaces, as well as deep
learning-based approaches. Traditional techniques utilize linear algebra and statistical methods
to extract facial features, while deep learning methods leverage convolutional neural networks
(CNNs).
INPUT DESIGN
The input design is the link between the information system and the user. It comprises the
specifications and procedures for data preparation, and the steps necessary to put
transaction data into a usable form for processing. This can be achieved by having the computer
read data from a written or printed document, or by having people key the data
directly into the system. The design of input focuses on controlling the amount of input
required, controlling errors, avoiding delay, avoiding extra steps, and keeping the process simple.
The input is designed in such a way that it provides security and ease of use while retaining
privacy. Input design considered the following things:
1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process and to
show the correct direction to the management for getting correct information from the
computerized system.
2. It is achieved by creating user-friendly screens for data entry that can handle large volumes of
data. The goal of designing input is to make data entry easier and free from errors. The data
entry screen is designed in such a way that all data manipulations can be performed. It also
provides record viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered with the help of
screens. Appropriate messages are provided as needed so that the user is not left in a maze of
instructions. Thus, the objective of input design is to create an input layout that is easy to follow.
4.1.4 TEST OBJECTIVES
TYPES OF TESTING
Unit testing: Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly, and that program inputs produce valid outputs. All decision
branches and internal code flow should be validated. It is the testing of individual software units
of the application. It is done after the completion of an individual unit, before integration. This
is structural testing, which relies on knowledge of the unit's construction and is invasive. Unit
tests perform basic tests at the component level and test a specific business process, application,
and/or system configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly defined inputs and
expected results.
Integration testing: Integration tests are designed to test integrated software components to
determine whether they actually run as one program. Testing is event driven and is more
concerned with the basic outcome of screens or fields. Integration tests demonstrate that,
although the components were individually satisfactory, as shown by successful unit testing, the
combination of components is correct and consistent. Integration testing is specifically aimed at
exposing the problems that arise from the combination of components.
Functional test: Functional tests provide systematic demonstrations that the functions tested are
available as specified by the business and technical requirements, system documentation, and
user manuals. Functional testing is centered on the following items: valid input (identified
classes of valid input must be accepted); invalid input (identified classes of invalid input must
be rejected); functions (identified functions must be exercised); output (identified classes of
application outputs must be exercised); systems/procedures (interfacing systems or procedures
must be invoked). Organization and preparation of functional tests are focused on requirements
and key functions.
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub-assemblies, assemblies, and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests; each
test type addresses a specific testing requirement.
4.2 VIOLA AND JONES ALGORITHM
In this project we are using the Viola and Jones algorithm for object detection. The Viola-Jones
object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first
object detection framework to provide competitive object detection rates in real time. Although
it can be trained to detect a variety of object classes, it was motivated primarily by the problem
of face detection, and Viola-Jones remains one of the best-known algorithms for detecting
human faces. This algorithm mainly has the following functionality.
Face Detection
Integral image or summed area table is a data structure and algorithm for quickly and
efficiently generating the sum of values in a rectangular subset of a grid. In the image
processing domain, it is also known as an integral image.
73
Haar-like features are digital image features used in object recognition. They owe their
name to their intuitive similarity with Haar wavelets and were used in the first real-time
face detector.
AdaBoost (adaptive boosting) is a meta-algorithm formulated by Yoav Freund and Robert
Schapire which is used to improve the performance of other algorithms. Viola and Jones
extract millions of features for comparison, so AdaBoost is used to enhance the
overall performance and calculation speed of the algorithm.
A cascade classifier is a particular case of ensemble learning based on the concatenation of
several classifiers, using all information collected from the output of a given classifier
as additional information for the next classifier in the cascade.
Unlike voting or stacking ensembles, which are multi-expert systems, cascading is a
multistage one.
Face Recognition
Initially the ROI is extracted from the source face image; the ROI is a sub-image and is
smaller than the original image.
Normalized cross-correlation is performed on the ROI and the target image to find the peak
coordinates.
The total offset or translation is determined from the position of the peak in the cross-
correlation matrix.
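A short sketch of this template matching step using OpenCV's normalized cross-correlation
(the file names are assumptions):
import cv2

# Target image and the ROI extracted from the source face image
target = cv2.imread("target.jpg", cv2.IMREAD_GRAYSCALE)
roi = cv2.imread("roi.jpg", cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation between the ROI and the target image
result = cv2.matchTemplate(target, roi, cv2.TM_CCORR_NORMED)

# The location of the peak gives the translation of the best match
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print("Peak correlation", max_val, "at", max_loc)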
4.3 ANALYSIS
User interfaces: The user interface for the software shall be compatible with any Android
version through which the user can access the system. The user interface shall be implemented
using a tool or software package such as Android Studio, MySQL, etc.
Hardware interfaces: To run our project, the hardware part is completed first to provide a
platform for the software to work. Before the software part, we need to install some libraries for
effective working of the application. We require a hardware system which is feasible for our
project, such as an Intel i3 processor, 4 GB minimum RAM, and 5 GB of hard disk space. We
also need a standard camera module with good megapixels and a keyboard. Since the
application must run over the internet, the hardware shall require an internet connection to the
Android device used by the system.
Software interfaces: This system is a single-user, multi-tasking environment. It enables the user
to interact with the server, view the requested information, and leave a record in the inbuilt
database. It uses Java and Android as the front-end programming tools and MySQL as the
back-end application tool. To run the admin module, the system requirement is a web browser
where we can sign in to the admin panel.
The project is the development of a face recognition based automatic student attendance system
using convolutional neural networks, which includes data entry, dataset training, face
recognition, and attendance entry.
Communication interfaces: The system shall use the HTTP protocol for communication over
the internet, and intranet communication will be through the TCP/IP protocol suite.
Artificial intelligence enabled face detection based software is becoming famous
internationally; this was noted by Santana Fell.
PARAMETER COMPARISON
4.3.2 FLOW CHART
4.3.3 ENROLMENT
For enrolment we define a smaller ResNet neural network; training was also done using this
network. The images of the persons we are going to enroll are structured in the following way:
there are subfolders, and each subfolder holds the images of one person. We store this mapping
of images and their corresponding labels to use later in testing. Then we process enrolment
images one by one and convert each image from BGR to RGB format, because Dlib uses RGB
as its default format. Then we convert the OpenCV BGR image to Dlib's cv_image, and then the
cv_image to Dlib's matrix format, since the cv_image format is not recognized by the neural
network module. We detect faces in the image; for each face we detect facial landmarks and get
a normalized and warped patch of the detected face, then compute a face descriptor using the
facial landmarks.
This is a 128-dimensional vector which represents a face. Based on these conditions, face
candidates are extracted from the input image with a bounding box modified from the original
bounding box. The height of the bounding box is made 1.28 times bigger than its width, so that
chest and neck parts are eliminated if the candidate includes them. This modification value has
been determined experimentally.
These face candidates are then sent to the facial feature extraction part to validate the
candidates. For the final verification of a candidate and extraction of the face image, facial
feature extraction is applied. Facial features are among the most significant characteristics of a
face: eyebrows, eyes, mouth, nose, nose tip, cheeks, etc. The property used to extract the eyes
and mouth is that the two eyes and the mouth form an isosceles triangle: the eye-to-eye distance
and the distance from the midpoint of the eyes to the mouth are equal.
A Laplacian of Gaussian (LoG) filter and some other filtering operations are performed to
extract the facial features of a face candidate. Our system splits into two parts: first, the
front-end side, a GUI based on Electron JS (a JavaScript stack) serving as the client; second, the
back-end side, containing the logic, based on Python and serving as the server. Since the two
languages cannot communicate with each other directly, we used IPC (Inter-Process
Communication) with the ZeroRPC library as a bridge between them: Electron JS calls the
Python functions and exchanges data over TCP with the help of ZeroRPC.
Learning: Any image can be vectorized by simply storing all the pixel values in a tall vector.
This vector represents a point in a higher dimensional space. However, this space is not very
good for measuring similarity.
In a face recognition application, the points representing two different images of the same
person may be very far apart, and the points representing images of two different people may be
close by. Deep metric learning is a class of techniques that uses deep learning to learn a
lower-dimensional, effective metric space where images are represented by points such that
images of the same class are clustered together, and images of different classes are far apart.
Instead of directly reducing the dimension of the pixel space, the convolution layers first
calculate the meaningful features, which are then implicitly used to create the metric space. It
turns out we can use the same CNN architecture we use for image classification for deep metric
learning.
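A tiny sketch of how such a metric space is used in practice, assuming two 128-dimensional
embeddings produced by the same encoder (the 0.6 threshold is the conventional value for
Dlib-style embeddings, stated here as an assumption):
import numpy as np

def same_person(embedding_a, embedding_b, threshold=0.6):
    """Two faces match when their embeddings are close in the learned metric space."""
    distance = np.linalg.norm(np.asarray(embedding_a) - np.asarray(embedding_b))
    return distance < threshold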
Face detection locates human faces in visual media such as digital images or video. When a face
is detected, it has an associated position, size, and orientation; and it can be searched for
landmarks such as the eyes and nose.
Here are some of the terms that we use regarding the face detection feature of ML Kit.
Face tracking extends face detection to video sequences. Any face that appears in a video for
any length of time can be tracked from frame to frame. This means a face detected in consecutive
video frames can be identified as being the same.
Note that this isn't a form of face recognition; face tracking only makes inferences based on the
position and motion of the faces in a video sequence.
A landmark is a point of interest within a face. The left eye, right eye, and base of the nose are
all examples of landmarks. ML Kit provides the ability to find landmarks on a detected face.
A contour is a set of points that follow the shape of a facial feature. ML Kit provides the ability
to find the contours of a face.
Classification determines whether a certain facial characteristic is present. For example, a face
can be classified by whether its eyes are open or closed, or if the face is smiling or not.
4.4.1 LANDMARKS
A landmark is a point of interest within a face. The left eye, right eye, and nose base are all
examples of landmarks.
ML Kit detects faces without looking for landmarks. Landmark detection is an optional step that
is disabled by default.
The following table summarizes all the landmarks that can be detected given the Euler Y angle of
an associated face:
Euler Y angle -> detectable landmarks
< -36 degrees: left eye, left mouth, left ear, nose base, left cheek
-36 to -12 degrees: left mouth, nose base, bottom mouth, right eye, left eye, left cheek, left ear tip
-12 to 12 degrees: right eye, left eye, nose base, left cheek, right cheek, left mouth, right mouth, bottom mouth
12 to 36 degrees: right mouth, nose base, bottom mouth, left eye, right eye, right cheek, right ear tip
> 36 degrees: right eye, right mouth, right ear, nose base, right cheek
A dataset of students is created before the recognition process. The dataset was created only to
train this system. We have created a dataset of 5 students which contains their name, roll
number, department, and images of each student in different poses and variations. For better
accuracy, a minimum of 15 images of each student should be captured. Whenever we register a
student's data and images in our system to create the dataset, deep learning is applied to each
face to compute 128-d facial features, which are stored in the student face data file so that the
face can be recalled during the recognition process. This is applied to each image taken during
registration.
4.5 CLASSIFICATION
Also note that the classifications "eyes open" and "smiling" only work for frontal faces, i.e.,
faces with a small Euler Y angle (between -18 and 18 degrees).
Minimum Face Size: The minimum face size is the desired face size, expressed as the ratio of
the width of the head to the width of the image. For example, the value of 0.1 means that the
smallest face to search for is roughly 10% of the width of the image being searched.
The minimum face size is a performance vs. accuracy trade-off: setting the minimum size
smaller lets the detector find smaller faces but detection will take longer; setting it larger might
exclude smaller faces but will run faster. The minimum face size is not a hard limit; the detector
may find faces slightly smaller than specified. A similar separation of pattern recognition
algorithms into four groups is proposed by Jain and colleagues; we can group face recognition
methods into three main groups. Note that many algorithms, mostly current complex
algorithms, may fall into more than one of these categories. The most relevant face recognition
algorithms will be discussed later under this classification. Once a face is matched against an
image stored in the JSON file, Python generates the roll numbers of the present students and
returns them; when the data is returned, the system generates an attendance table which includes
the name, roll number, date, day, and time with the corresponding subject ID, and then passes
the data to Python to store the table into an Excel sheet automatically.
The following approaches are proposed:
Template matching: Patterns are represented by samples, models, pixels, curves, or textures.
The recognition function is usually a correlation or distance measure.
85
CHAPTER 5-CODE IMPLEMENTATION
CODING
AMS.PY
import tkinter as tk
from tkinter import *
import cv2
import csv
import os
import numpy as np
from PIL import Image, ImageTk
import pandas as pd
import datetime
import time
window = tk.Tk()  # main application window (assumed; its definition is not shown in this listing)
window.geometry('1280x720')
window.configure(background='grey80')
def manually_fill():
global sb
sb = tk.Tk()
# sb.iconbitmap('AMS.ico')
sb.title("Enter subject name...")
sb.geometry('580x320')
sb.configure(background='grey80')
def err_screen_for_subject():
def ec_delete():
ec.destroy()
global ec
ec = tk.Tk()
ec.geometry('300x100')
# ec.iconbitmap('AMS.ico')
ec.title('Warning!!')
ec.configure(background='snow')
Label(ec, text='Please enter your subject name!!!', fg='red',
bg='white', font=('times', 16, ' bold ')).pack()
Button(ec, text='OK', command=ec_delete, fg="black", bg="lawn green", width=9,
height=1, activebackground="Red",
font=('times', 15, ' bold ')).place(x=90, y=50)
def fill_attendance():
ts = time.time()
Date = datetime.datetime.fromtimestamp(ts).strftime('%Y_%m_%d')
timeStamp = datetime.datetime.fromtimestamp(ts).strftime('%H:%M:%S')
Time = datetime.datetime.fromtimestamp(ts).strftime('%H:%M:%S')
Hour, Minute, Second = timeStamp.split(":")
# Creating csv of attendance
import pymysql.connections
PRIMARY KEY (ID)
);
"""
try:
cursor.execute(sql) # for create a table
except Exception as ex:
print(ex) #
if subb == '':
err_screen_for_subject()
else:
sb.destroy()
MFW = tk.Tk()
# MFW.iconbitmap('AMS.ico')
MFW.title("Manually attendance of " + str(subb))
MFW.geometry('880x470')
MFW.configure(background='grey80')
def del_errsc2():
errsc2.destroy()
def err_screen1():
global errsc2
errsc2 = tk.Tk()
errsc2.geometry('330x100')
# errsc2.iconbitmap('AMS.ico')
errsc2.title('Warning!!')
errsc2.configure(background='grey80')
Label(errsc2, text='Please enter Student & Enrollment!!!', fg='black', bg='white',
font=('times', 16, ' bold ')).pack()
Button(errsc2, text='OK', command=del_errsc2, fg="black", bg="lawn green",
width=9, height=1,
activebackground="Red", font=('times', 15, ' bold ')).place(x=90, y=50)
global ENR_ENTRY
ENR_ENTRY = tk.Entry(MFW, width=20, validate='key',
bg="white", fg="black", font=('times', 23))
ENR_ENTRY['validatecommand'] = (
ENR_ENTRY.register(testVal), '%P', '%d')
ENR_ENTRY.place(x=290, y=105)
def remove_enr():
ENR_ENTRY.delete(first=0, last=22)
STUDENT_ENTRY = tk.Entry(
MFW, width=20, bg="white", fg="black", font=('times', 23))
STUDENT_ENTRY.place(x=290, y=205)
def remove_student():
STUDENT_ENTRY.delete(first=0, last=22)
cursor.execute(Insert_data, VALUES)
except Exception as e:
print(e)
ENR_ENTRY.delete(first=0, last=22)
STUDENT_ENTRY.delete(first=0, last=22)
def create_csv():
import csv
cursor.execute("select * from " + DB_table_name + ";")
csv_name = ('C:/Users/Pragya singh/PycharmProjects/Attendace_management_system/'
            'Attendance/Manually Attendance/' + DB_table_name + '.csv')
with open(csv_name, "w") as csv_file:
csv_writer = csv.writer(csv_file)
csv_writer.writerow(
[i[0] for i in cursor.description]) # write headers
csv_writer.writerows(cursor)
O = "CSV created Successfully"
Notifi.configure(text=O, bg="Green", fg="white",
width=33, font=('times', 19, 'bold'))
Notifi.place(x=180, y=380)
import csv
import tkinter
root = tkinter.Tk()
root.title("Attendance of " + subb)
root.configure(background='grey80')
with open(csv_name, newline="") as file:
reader = csv.reader(file)
r=0
for col in reader:
c=0
for row in col:
# i've added some styling
label = tkinter.Label(root, width=18, height=1, fg="black", font=('times',
13, ' bold '),
bg="white", text=row, relief=tkinter.RIDGE)
label.grid(row=r, column=c)
c += 1
r += 1
root.mainloop()
DATA_SUB = tk.Button(MFW, text="Enter Data", command=enter_data_DB,
fg="black", bg="SkyBlue1", width=20,
height=2,
activebackground="white", font=('times', 15, ' bold '))
DATA_SUB.place(x=170, y=300)
def attf():
import subprocess
subprocess.Popen(
    r'explorer /select,"C:\Users\Pragya Singh\PycharmProjects\Attendace_management_system\Attendance\Manually Attendance\-------Check atttendance-------"')
MFW.mainloop()
global SUB_ENTRY
def clear():
txt.delete(first=0, last=22)
def clear1():
txt2.delete(first=0, last=22)
def del_sc1():
sc1.destroy()
def err_screen():
global sc1
sc1 = tk.Tk()
sc1.geometry('300x100')
# sc1.iconbitmap('AMS.ico')
sc1.title('Warning!!')
sc1.configure(background='grey80')
Label(sc1, text='Enrollment & Name required!!!', fg='black',
bg='white', font=('times', 16)).pack()
Button(sc1, text='OK', command=del_sc1, fg="black", bg="lawn green", width=9,
height=1, activebackground="Red", font=('times', 15, ' bold ')).place(x=90, y=50)
# Error screen2
def del_sc2():
sc2.destroy()
def err_screen1():
global sc2
sc2 = tk.Tk()
sc2.geometry('300x100')
# sc2.iconbitmap('AMS.ico')
sc2.title('Warning!!')
sc2.configure(background='grey80')
Label(sc2, text='Please enter your subject name!!!', fg='black',
bg='white', font=('times', 16)).pack()
Button(sc2, text='OK', command=del_sc2, fg="black", bg="lawn green", width=9,
height=1, activebackground="Red", font=('times', 15, ' bold ')).place(x=90, y=50)
def take_img():
l1 = txt.get()
l2 = txt2.get()
if l1 == '':
err_screen()
elif l2 == '':
err_screen()
else:
try:
cam = cv2.VideoCapture(0)
detector = cv2.CascadeClassifier(
'haarcascade_frontalface_default.xml')
Enrollment = txt.get()
Name = txt2.get()
sampleNum = 0
while (True):
ret, img = cam.read()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = detector.detectMultiScale(gray, 1.3, 5)
for (x, y, w, h) in faces:
cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
# incrementing sample number
sampleNum = sampleNum + 1
# saving the captured face in the dataset folder
cv2.imwrite("TrainingImage/ " + Name + "." + Enrollment + '.' +
str(sampleNum) + ".jpg",
gray)
print("Images Saved for Enrollment :")
cv2.imshow('Frame', img)
# wait 1 millisecond between frames; quit early if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord('q'):
break
# break if the sample number is more than 70
elif sampleNum > 70:
break
cam.release()
cv2.destroyAllWindows()
ts = time.time()
Date = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d')
Time = datetime.datetime.fromtimestamp(ts).strftime('%H:%M:%S')
row = [Enrollment, Name, Date, Time]
with open('StudentDetails\StudentDetails.csv', 'a+') as csvFile:
writer = csv.writer(csvFile, delimiter=',')
writer.writerow(row)
csvFile.close()
res = "Images Saved for Enrollment : " + Enrollment + " Name : " + Name
Notification.configure(
text=res, bg="SpringGreen3", width=50, font=('times', 18, 'bold'))
Notification.place(x=250, y=400)
except FileExistsError as F:
f = 'Student Data already exists'
Notification.configure(text=f, bg="Red", width=21)
Notification.place(x=450, y=400)
harcascadePath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(harcascadePath)
df = pd.read_csv("StudentDetails\StudentDetails.csv")
cam = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX
col_names = ['Enrollment', 'Name', 'Date', 'Time']
attendance = pd.DataFrame(columns=col_names)
while True:
ret, im = cam.read()
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(gray, 1.2, 5)
for (x, y, w, h) in faces:
global Id
attendance.loc[len(attendance)] = [
Id, aa, date, timeStamp]
cv2.rectangle(
im, (x, y), (x + w, y + h), (0, 255, 0), 7)
cv2.putText(im, str(tt), (x + h, y),
font, 1, (255, 255, 0,), 4)
else:
Id = 'Unknown'
tt = str(Id)
cv2.rectangle(
im, (x, y), (x + w, y + h), (0, 25, 255), 7)
cv2.putText(im, str(tt), (x + h, y),
font, 1, (0, 25, 255), 4)
if time.time() > future:
break
attendance = attendance.drop_duplicates(
['Enrollment'], keep='first')
cv2.imshow('Filling attedance..', im)
key = cv2.waitKey(30) & 0xff
if key == 27:
break
ts = time.time()
date = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d')
timeStamp = datetime.datetime.fromtimestamp(
ts).strftime('%H:%M:%S')
101
Hour, Minute, Second = timeStamp.split(":")
fileName = "Attendance/" + Subject + "_" + date + \
"_" + Hour + "-" + Minute + "-" + Second + ".csv"
attendance = attendance.drop_duplicates(
['Enrollment'], keep='first')
print(attendance)
attendance.to_csv(fileName, index=False)
cam.release()
cv2.destroyAllWindows()
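The loop above ends once time.time() passes future, a timestamp set before the loop in a part of the program not shown here. A minimal sketch of the assumed setup (the 20-second window is illustrative only):

import time

# open the attendance window for a fixed period, then let the loop exit
now = time.time()
future = now + 20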
import csv
import tkinter
root = tkinter.Tk()
root.title("Attendance of " + Subject)
root.configure(background='grey80')
cs = 'C:/Users/Pragya Singh/PycharmProjects/Attendace_management_system/' + fileName
with open(cs, newline="") as file:
    reader = csv.reader(file)
    r = 0
    # loop reconstructed from the analogous student-details viewer below
    for row in reader:
        c = 0
        label = tkinter.Label(root, width=10, height=1, fg="black",
                              font=('times', 15, ' bold '), bg="white",
                              text=row, relief=tkinter.RIDGE)
        label.grid(row=r, column=c)
        c += 1
        r += 1
root.mainloop()
def Attf():
    import subprocess
    subprocess.Popen(
        r'explorer /select,"C:\Users\Pragya Singh\PycharmProjects\Attendace_management_system\Attendance\-------Check atttendance-------"')
def admin_panel():
    win = tk.Tk()
    # win.iconbitmap('AMS.ico')
    win.title("LogIn")
    win.geometry('880x420')
    win.configure(background='grey80')

    def log_in():
        username = un_entr.get()
        password = pw_entr.get()
        if username == 'pragya':
            if password == 'pragya123':
                win.destroy()
                import csv
                import tkinter
                root = tkinter.Tk()
                root.title("Student Details")
                root.configure(background='grey80')
                cs = 'C:/Users/Pragya Singh/PycharmProjects/Attendace_management_system/StudentDetails/StudentDetails.csv'
                with open(cs, newline="") as file:
                    reader = csv.reader(file)
                    r = 0
                    # walk the CSV rows and lay each one out in a grid of
                    # labels (loop header reconstructed; the listing resumes
                    # at the label definition)
                    for row in reader:
                        c = 0
                        label = tkinter.Label(root, width=10, height=1, fg="black",
                                              font=('times', 15, ' bold '),
                                              bg="white", text=row, relief=tkinter.RIDGE)
                        label.grid(row=r, column=c)
                        c += 1
                        r += 1
                root.mainloop()
            else:
                valid = 'Incorrect ID or Password'
                Nt.configure(text=valid, bg="red", fg="white",
                             width=38, font=('times', 19, 'bold'))
                Nt.place(x=120, y=350)
        else:
            valid = 'Incorrect ID or Password'
            Nt.configure(text=valid, bg="red", fg="white",
                         width=38, font=('times', 19, 'bold'))
            Nt.place(x=120, y=350)

    pw = tk.Label(win, text="Enter password : ", width=15, height=2, fg="black",
                  bg="grey", font=('times', 15, ' bold '))
    pw.place(x=30, y=150)

    def c00():
        un_entr.delete(first=0, last=22)

    def c11():
        pw_entr.delete(first=0, last=22)

    Login = tk.Button(win, text="LogIn", fg="black", bg="SkyBlue1", width=20,
                      height=2, activebackground="Red", command=log_in,
                      font=('times', 15, ' bold '))
    Login.place(x=290, y=250)
    win.mainloop()
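log_in compares the typed credentials against hard-coded plaintext, which is acceptable for a demo. A slightly safer sketch stores only a salted hash instead (the salt value and helper name here are illustrative, not part of the original code):

import hashlib

SALT = 'change-this-salt'
# precomputed hash of the admin password; the plaintext itself is never stored
STORED_HASH = hashlib.sha256((SALT + 'pragya123').encode()).hexdigest()

def credentials_ok(username, password):
    # hash the attempt with the same salt and compare digests
    digest = hashlib.sha256((SALT + password).encode()).hexdigest()
    return username == 'pragya' and digest == STORED_HASH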
recognizer.train(faces, np.array(Id))
try:
    recognizer.save("TrainingImageLabel/Trainner.yml")
except Exception as e:
    q = 'Please make "TrainingImageLabel" folder'
    Notification.configure(text=q, bg="SpringGreen3",
                           width=50, font=('times', 18, 'bold'))
    Notification.place(x=350, y=400)
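Instead of asking the user to create the folder by hand when the save fails, the script could create it on demand. A small alternative sketch using only the standard library:

import os

# make sure the model folder exists before saving the trained recognizer
os.makedirs("TrainingImageLabel", exist_ok=True)
recognizer.save("TrainingImageLabel/Trainner.yml")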
def getImagesAndLabels(path):
    imagePaths = [os.path.join(path, f) for f in os.listdir(path)]
    # create empty face list
    faceSamples = []
    # create empty ID list
    Ids = []
    # loop through all the image paths, loading the Ids and the images
    for imagePath in imagePaths:
        # load the image and convert it to grayscale
        pilImage = Image.open(imagePath).convert('L')
        # convert the PIL image into a numpy array
        imageNp = np.array(pilImage, 'uint8')
        # get the Id from the file name (format: Name.Enrollment.sample.jpg)
        Id = int(os.path.split(imagePath)[-1].split(".")[1])
        # extract the face from the training image sample
        faces = detector.detectMultiScale(imageNp)
        # if a face is there, append it to the list along with its Id
        for (x, y, w, h) in faces:
            faceSamples.append(imageNp[y:y + h, x:x + w])
            Ids.append(Id)
    return faceSamples, Ids
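A minimal sketch of how getImagesAndLabels feeds the recognizer, assuming the LBPH recognizer from the opencv-contrib package and the TrainingImage folder that take_img writes to:

import cv2
import numpy as np

# cv2.face requires the opencv-contrib-python package
recognizer = cv2.face.LBPHFaceRecognizer_create()
detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# collect the cropped faces and their ids, then train and save the model
faces, Ids = getImagesAndLabels('TrainingImage')
recognizer.train(faces, np.array(Ids))
recognizer.save('TrainingImageLabel/Trainner.yml')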
window.grid_rowconfigure(0, weight=1)
window.grid_columnconfigure(0, weight=1)
# window.iconbitmap('AMS.ico')

def on_closing():
    from tkinter import messagebox
    if messagebox.askokcancel("Quit", "Do you want to quit?"):
        window.destroy()

window.protocol("WM_DELETE_WINDOW", on_closing)
message.place(x=80, y=20)
lbl.place(x=200, y=200)
window.mainloop()
OUTPUT
[Screenshots of the running application appeared here in the original report.]
CONCLUSION
Face recognition systems are part of facial image processing applications, and their significance
as a research area has been growing in recent years. Typical implementations include crime
prevention, video surveillance, person verification, and similar security activities, and a face
recognition system can equally be deployed within universities. The Face Recognition Based
Attendance System has been envisioned to reduce the errors that occur in traditional (manual)
attendance taking. The aim is to automate attendance and build a system that is useful to an
organization such as an institute: an efficient and accurate method of recording attendance that
can replace the old manual methods while remaining secure, reliable, and readily available for
use. The proposed algorithm is capable of detecting multiple faces, and the performance of the
system shows acceptably good results. To reduce faculty effort and manage time effectively, the
authors propose this automated attendance system based on face recognition for schools and
colleges. The system takes attendance for a set amount of time and closes automatically once
that time expires. The results of the experiment show improved performance in the estimation of
attendance compared to the traditional pen-and-paper system. The current work is mainly
focused on face detection and extraction by the PCA algorithm in video frames or images. In
further work, the authors intend to improve face recognition by comparing 3D face images with
2D face images in real time, to improve multiple-face recognition at the same time so that time
can still be managed effectively, and to improve the portability of the system. After analyzing
various methods, this work evaluates each of them in terms of overall system capacity,
throughput, and accuracy. The analysis indicates that PCA is highly effective on an extensive
database and is well suited to an attendance management system based on face recognition, a
technique for securing attendance that replaces the time-consuming manual system;
convolutional neural networks also contribute to attendance management using face recognition
because they are strong classifiers. However, the system still has issues with performance and
with the accuracy of recognizing a human face. In future work, the accuracy of the system
should be improved by combining principal component analysis with a convolutional neural
network, and the next stage of research will apply fast PCA with Back-Propagation to resolve
these problems.
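As a rough illustration of the eigenface (PCA) idea discussed above, the sketch below computes principal components of flattened face images with plain NumPy; training_faces is a hypothetical (n_samples, h*w) array of equally sized grayscale face crops, and the component count is arbitrary:

import numpy as np

def fit_eigenfaces(training_faces, n_components=20):
    # center the data, then take the top right-singular vectors as eigenfaces
    mean_face = training_faces.mean(axis=0)
    centered = training_faces - mean_face
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:n_components]
    # each face is represented by its weights in the eigenface basis
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights

def project(face, mean_face, eigenfaces):
    # recognition then reduces to nearest-neighbor search in weight space
    return (face - mean_face) @ eigenfaces.T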
FUTURE SCOPE
A possible future application for facial recognition systems lies in retailing. A retail store (for
example, a grocery store) may have cash registers equipped with cameras; the cameras would be
aimed at the faces of customers, so pictures of customers could be obtained. The camera would
be the primary means of identifying the customer, and if visual identification failed, the customer
could complete the purchase by using a PIN (personal identification number). After the cash
register had calculated the total sale, the face recognition system would verify the identity of the
customer and the total amount of the sale would be deducted from the customer's bank account.
Hence, face-based retailing would provide convenience for retail customers, since they could go
shopping simply by showing their faces, and there would be no need to bring debit cards, or
other financial media. Facial recognition also has the potential to transform many other
industries, from security and surveillance to AI tools and more personalized advertising. The
global facial recognition market is valued at $4.4 billion and is expected to reach $10.9 billion,
registering a CAGR of 17.6% over the forecast period. Facial recognition has recently gained
popularity because of its advantages over traditional surveillance and identification methods.
At its core, the system captures an image of the individual's face and compares it to a database
of pre-registered faces; if a match is found, the system records the individual's attendance. In a
world where technology is advancing at an unprecedented rate, it is no surprise that traditional
attendance tracking methods are being replaced. Outdated time clocks and cumbersome swipe
cards are giving way to face recognition technology, which is reshaping how attendance is
managed. With its ability to accurately identify individuals from unique facial features, this
solution is more secure and more convenient for both employers and employees. From
eliminating buddy punching to streamlining payroll processing, the Face Recognition
Attendance System paves the way for a more efficient and accurate time-tracking system and
points toward the future of attendance management.