ABSTRACT

METHODOLOGY
Deep Learning (DL) and Machine Learning (ML) approaches have emerged as an
innovative paradigm in facial analysis for age and gender identification. This
methodology uses a classical machine learning model, the Support Vector Machine
(SVM), and a deep learning model, a Convolutional Neural Network (CNN) built from
scratch, to make accurate predictions by decoding facial patterns.
Figure 1 Methodology Framework
4.1 DATA ANALYSIS

The UTKFace dataset is a large set of RGB facial images of people ranging in age
from newborns to 116 years old. It has more than 20k photos with accompanying
information on age, gender, and ethnicity.
Figure 2 UTK Face Dataset
These photos cover a range of lighting conditions, partial occlusions, poses,
facial expressions, and resolutions. The dataset can be used for a variety of
applications, including face detection, age estimation, age progression
prediction, and facial landmark localization.
We used this dataset for age and gender prediction with deep learning and
machine learning techniques. The dataset originally contains 23,706 face images
annotated with age, gender, and ethnicity. Because the proposed models need only
the age and gender annotations, the ethnicity column was removed. The age column
was grouped into distinct ranges so that age can be predicted as a category.
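As an illustration of how the age and gender labels can be kept while the ethnicity
field is dropped, the following is a minimal sketch. It assumes the labels are read
from the UTKFace file names, which follow the pattern
[age]_[gender]_[race]_[date&time].jpg; the helper name parse_utkface_name and the
FaceLabel type are hypothetical and are not taken from the project code.

```python
from typing import NamedTuple


class FaceLabel(NamedTuple):
    age: int
    gender: int  # 0 = male, 1 = female


def parse_utkface_name(filename: str) -> FaceLabel:
    """Read age and gender from a name such as '25_1_3_20170116174525125.jpg'."""
    parts = filename.split("_")
    if len(parts) < 4:
        # Names with a missing field cannot be labeled reliably, so reject them.
        raise ValueError(f"unexpected file name: {filename!r}")
    age, gender = int(parts[0]), int(parts[1])
    # parts[2] is the ethnicity code; it is ignored on purpose.
    return FaceLabel(age=age, gender=gender)


label = parse_utkface_name("25_1_3_20170116174525125.jpg")
print(label.age, label.gender)
```

Rejecting malformed names, rather than guessing the missing field, keeps wrongly
labeled samples out of the training data.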
4.2 AGE IDENTIFICATION USING CNN

Predicting an individual's exact age by regression is difficult because facial
characteristics differ from person to person, even among people of the same age.
Predicting an age range is therefore more suitable. We treated the problem as a
classification task over age categories derived from the UTKFace dataset, with
ages grouped as follows: (0-3), (3-18), (18-35), (35-60), (60-116).
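The grouping above can be sketched as a simple lookup. Because the stated ranges
share their end points, this sketch assumes that a boundary age belongs to the older
group (for example, age 3 falls in 3-18); that choice, and the names AGE_EDGES and
age_to_group, are assumptions made for illustration.

```python
from bisect import bisect_right

# Lower bounds of groups 1..4; group 0 starts at age 0.
AGE_EDGES = [3, 18, 35, 60]
AGE_GROUPS = ["0-3", "3-18", "18-35", "35-60", "60-116"]


def age_to_group(age: int) -> int:
    """Return the index of the age category for an age in years."""
    if not 0 <= age <= 116:
        raise ValueError(f"age out of range: {age}")
    # bisect_right counts how many lower bounds are <= age.
    return bisect_right(AGE_EDGES, age)


for age in (2, 3, 17, 18, 60, 116):
    print(age, AGE_GROUPS[age_to_group(age)])
```

The resulting index can be one-hot encoded and used as the target of the SoftMax
output layer.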
Figure 3 Visualization of age in UTK Face dataset
Figure 4 Age Categories
The proposed CNN model contains 15 layers, 4 of which are convolutional, and the
final layer uses the SoftMax activation function to classify the age groups. After
training, the model is tested on unseen data to check whether its predictions are
correct.
4.3 GENDER IDENTIFICATION USING CNN

Gender identification was treated as a classification task and tackled with a CNN
built from scratch. The network has 15 layers in total, 4 of which are
convolutional, and the final output layer uses the SoftMax activation function with
2 neurons to represent the male and female classes. After training on the training
set, the model predicts the gender of the images in the test set.
Figure 5 Visualization of gender in UTK Face dataset
4.4 GENDER IDENTIFICATION USING SVM

The proposed method aims to determine the most effective machine learning
technique for classifying gender from facial images in the UTKFace dataset. The
process has several key stages. Pre-processing normalizes the images against
brightness variations, scales them to the same size, and detects faces using the
Viola-Jones technique. The Histogram of Oriented Gradients (HOG) feature extraction
technique is then applied to extract descriptive features from the preprocessed
images. Principal Component Analysis (PCA) is applied to these features to reduce
their dimensionality, improving computational efficiency and classification
performance. Finally, a Support Vector Machine (SVM) classifier is trained and
tested (80% training, 20% test split) to classify gender (male or female). This
method combines established feature extraction and machine learning techniques to
classify gender from facial features.
Chapter 5
DETAILED DESIGN AND ARCHITECTURE
5.1 SYSTEM ARCHITECTURE

The system architecture combines deep learning and machine learning models
trained on the UTKFace dataset to predict the age and gender of individuals
from their facial attributes. These models include Convolutional Neural
Networks (CNNs) built from scratch for age and gender identification and a
Support Vector Machine (SVM) for gender identification. The deep learning
models are then integrated into an application for real-time age and gender
estimation from facial photos, within a framework designed for efficiency,
scalability, and flexibility.
At the center of our approach is an adaptive framework designed to integrate
the different components and modules of the software. Each component has a
distinct responsibility, so the prediction procedure runs smoothly from start
to finish.
The cornerstone of our software is the use of CNNs for age and gender
prediction. These neural networks are developed from scratch, which allows the
architecture to be customized for extracting distinctive features from facial
images. CNNs allow us to capture the intricate details and variations present
in facial images, with the aim of high accuracy even in real-time scenarios.
Furthermore, real-time processing is achieved through robust data processing
pipelines, optimized model prediction strategies, and an integrated interface
between system components. The architecture embeds the CNNs built from scratch
within a flexible software design tailored for prediction on real-time facial
images. The aim is to deliver precise, reliable, and effective predictions
suited to a wide range of real-world applications.
5.1.1 Architecture Design
Following is the system architecture design of the software application:
Figure 6 System Architecture
5.1.2 Subsystem Architecture
The subsystem architecture of our software application consists of dataset
preparation, model training and evaluation, prediction processing, and
integration of the model with the software interface. These distinct modules
together produce predictions for real-time facial images.
 Dataset Preparation Module
This module is the core of the subsystem architecture and contains the dataset
used for model training and validation. It covers data acquisition,
annotation, and splitting to ensure the quality and integrity of the dataset.
The UTKFace dataset is a large set of facial images of people of both genders
(female and male) ranging in age from newborns to 116 years old. It has 23,706
photos with accompanying information on age and gender. This module works
closely with data preprocessing, as the images are resized before the models
are trained. Our system ensures that the training data is properly formatted
and annotated for reliable age and gender predictions, which streamlines the
data processing pipeline and supports the overall effectiveness of the model.
 Model Training Module
This module is responsible for model training and validation using
Convolutional Neural Networks (CNNs). The software uses a customized CNN model
to extract the facial attributes that are essential for estimating age and
gender. Training involves iteratively refining the model's design choices and
hyperparameters, such as the number of convolutional layers, kernel filters,
activation functions, batch normalization, regularization, and dropout. These
are tuned to maximize the performance and generalizability of the model.
 Prediction Module
After the models are trained, they are used to predict age and gender from
unseen facial images in the testing dataset. This module runs on the backend
server of the application and returns results for the real-time images
submitted by users. When a new image is received from the user, either
captured or imported, the module first applies a pretrained face detection
model (MediaPipe or YOLO) to locate the facial region, and then uses the CNN
model and its weights to predict the age and gender of the individual. The
predicted age and gender labels are displayed to the user, and the information
is saved to the backend database for further analysis.
 Software Interface Module
It is an integration module that handles communication between the prediction
module and the user interface, allowing users to interact with the software
and get predictions in real time. It consists of the different pages of the
application, such as the sign-up, login, welcome, home, and results pages.
When a user starts the application and uploads an image after signing in, this
component captures the image data and passes it to the prediction module. After
prediction, the results are displayed on the user interface and saved in the
database. By integrating the CNN model with the software interface, the
subsystem architecture lets users interact easily with the prediction module
to get results, which improves the application's usability and efficiency. The
software interface is user friendly: it makes it easy to input facial images,
receive predictions, and save the findings.
Figure 7 Modules of Subsystem Architecture
5.2 DETAILED SYSTEM DESIGN

The functional requirements, non-functional requirements, system
architecture, user interface design, and technical requirements are all covered
in the detailed system design, which is based on the software specifications
that have been provided. Here's a thorough explanation:
5.1.3 Sequence diagram
Figure 8 Sequence diagram

5.1.4 Activity diagram
Figure 9 Activity diagram

5.1.5 Functional requirements
The following functional criteria apply to an app that uses facial analysis to
identify age and gender:
Use-Case 1: Sign Up
Table 1 Use-Case 1
Identifier: UC1
Purpose: Signup of users depending on Username and Password.
Priority: High
Pre-conditions: User must have email to sign up
Post-conditions: User must access his email and get 6 chars verification code to verify his email.
Typical Course of Action
S# | Actor Action | System Response
1 | Select the button of the registration. | Open the webpage specialized for completing sign up process.
2 | Fill in the registration form and press sign up button. | Register user entered data into database. Send message via user’s registered email with email verification code.
3 | Log in to the system with username and password after verification. | Display Home page with welcome message to the registered user.
Use-Case 2: Sign In
Table 2 Use-Case 2
Identifier: UC2
Purpose: Signing users depending on Username and Password.
Priority: High
Pre-conditions: User must have been registered earlier on our system.
Post-conditions: User had verified his email.
S# | Actor Action | System Response
1 | Sign In users using Username and Password by pressing on Log in Button. | The system successfully redirects the user to the Home Page with a welcome message.
2 | Press Sign In |
Use-Case3: Forgot Password and reset password
Table 3 Use-Case 3
Identifier: UC3
Purpose: To configure old password and reset a new password
Priority: High
Pre-conditions: User registered in the application
Post-conditions: fill the input requirements
S# | Actor Action | System Response
1 | Click on Forgot password button | Sends email verification and pin code
2 | Click on reset password | Navigate to reset password configuration
Use-Case4: Import Image
Table 4 Use-Case 4
Identifier: UC4
Purpose: Allow user to import image for age and gender prediction
Priority: High
Pre-conditions: User on the image upload face
Post-conditions: Face detection of imported image
S# | Actor Action | System Response
1 | Select the option to import image | Opens gallery for image selection
2 | Select a single face image | Displays the chosen image for analysis
3 | System processes the image | Detects the face and proceeds with age and gender prediction
Use-Case5: Capture Image
Table 5 Use-Case 5
Identifier: UC5
Purpose: Allow user to capture image for age and gender prediction
Priority: High
Pre-conditions: User on the image captured face
Post-conditions: Face detection of captured image
S# | Actor Action | System Response
1 | Select the option to capture image | Opens camera for image capture
2 | Capture a single face image | Displays the capture image for analysis
3 | System processes the image | Detects the face and proceeds with age and gender prediction
Use-Case6: Face Detection
Table 6 Use-Case 6
Identifier: UC6
Purpose: Face detection of imported/captured images for age and gender prediction
Priority: High
Pre-conditions: Selection of imported/captured images
Post-conditions: Face detection of uploaded image for age and gender prediction
S# | Actor Action | System Response
1 | Image selected or captured | Analyze face detection of images
2 | Face detected | Proceed with age and gender prediction
Use-Case7: Age and gender prediction
Table 7 Use-Case 7
Identifier: UC7
Purpose: Age and gender prediction after face detection of images
Priority: High
Pre-conditions: Face detection of imported/captured images
Post-conditions: Results saved in a database
S# | Actor Action | System Response
1 | Prediction of age and gender completed | Saves result in database
2 | Results saved in database | Display confirmation message
Use-Case8: Share analysis results
Table 8 Use-Case 8
Identifier: UC8
Purpose: Allow user to share and export analysis results
Priority: Medium
Pre-conditions: Analysis completed and results displayed and saved
Post-conditions: Analysis results shared or exported
S# | Actor Action | System Response
1 | Click on share/export button | Provides options for sharing (social media, email) or exporting (PDF, CSV)
2 | Select desired sharing/exporting method | Executes the chosen action
3 | Confirmation message or shareable link generated | Indicates successful sharing/exporting of analysis results
5.1.6 Use-case diagram
Figure 10 Use-case diagram
5.1.7 Constraints
1) The application accepts only one image at a time; users cannot upload
multiple facial images of different individuals at once.
2) Batch processing of several images in a single request to get age and
gender predictions is not supported; only single-image analysis is allowed.
3) The granularity of age prediction is limited, since the prediction module
predicts the age category of an individual rather than the exact age.
4) The model may be less accurate for individuals from Asian countries such as
China and Korea, because the dataset used to train the model has limited
diversity and the model may not distinguish similar facial features well.
5.1.8 Composition
The application is composed of the following modules:
 Software interface module
Users interact with the application through its user interface, developed in
React Native. Users sign up to register an account and then upload or capture
an image for further processing.
 Prediction Module
After face detection of the uploaded/captured image, the image will be passed
to prediction module where predictions are made using the trained CNN
model and its weights.
 Model Training
The model that will be used at the backend is trained on the popular UTK
Face Dataset and evaluated using the testing dataset. After the evaluation, it is
used for the estimation of age and gender of uploaded images by the user.
 Dataset module
The dataset module contains the dataset that is used for the training, validation
and testing of the model and to improve the model’s performance.
5.1.9 Non-functional requirements
Non-functional requirements go beyond the software's immediate functionality
and concentrate on qualities such as performance, security, usability, and
maintainability. The non-functional requirements for the application, which
uses facial features to identify age and gender, are as follows:
 Performance
When a picture is uploaded, the software will respond promptly, completing the
analysis within a matter of seconds. It will handle several simultaneous
requests, accommodating higher user loads without sacrificing performance.
Memory and computational resources will be optimized for smooth operation
across a range of device configurations.
 Security
We will put in place strong safeguards to protect user information and ensure
that data protection laws (GDPR, CCPA, etc.) are followed. Authentication and
authorization methods will be used to limit access to important functions and
data.
 Usability
We will create an intuitive user interface that ensures a smooth experience for
users with different levels of technical expertise. During picture uploads and
result presentation, users will receive concise, helpful feedback, along with
error messages for unsuccessful analyses.
 Maintainability
We will create the application with a modular framework that makes upgrades
and future improvements simpler. To keep track of modifications and oversee
the development of the program, we will use version control systems.
 Compliance
We will make sure that all legal and industry standards for image processing,
facial recognition, and data protection are followed. User consent and privacy
will be respected by incorporating ethical principles and considerations into
the app's design and usage.
5.1.10 User interface design

5.1.11 Technical requirements
 Software Requirements
1) Operating System: The system should be compatible with major operating
systems such as Windows, Linux, or macOS.
2) Programming Languages:
a) Python: Required for implementing machine learning and deep learning
algorithms.
b) React Native: Required for developing the frontend of application.
c) Django: Required for integrating the models with frontend of application
using APIs.
3) Libraries: TensorFlow, Keras, scikit-learn, skimage, OpenCV for machine
learning and computer vision tasks, MediaPipe for face detection.
 App Development Framework

If a user interface is required, frameworks like React or Next.js can be used.
1) React Native for front-end development
2) Django for backend development
3) Data Preprocessing Tools:
a) Pandas and NumPy for data cleaning, manipulation, and transformation.
b) scikit-image for image processing tasks.
c) MediaPipe for face detection tasks.
 Machine Learning and Deep Learning Frameworks
1) TensorFlow and Keras: Used for building, training, and evaluating CNN
models for age and gender prediction.
2) scikit-learn: Utilized for implementing SVM classifiers and additional
machine learning functionalities.
 Image Processing Libraries
1) OpenCV: Essential for image preprocessing, facial detection, and feature
extraction.
2) scikit-image: Used for additional image processing tasks.

 Development tools
1) Popular IDEs such as Visual Studio Code, PyCharm, or Jupyter Notebooks for
code development.
5.1.12 Interface/Exports
1) The software interface has been developed with React Native Expo.
2) Django REST APIs are used to integrate the model with the frontend.
3) Necessary libraries such as TensorFlow, Keras, MediaPipe, and react-native
have been imported.
5.3 CLASS DIAGRAM
Figure 11 Class Diagram

Chapter 6
IMPLEMENTATION AND TESTING
6.1 DEEP LEARNING MODELS
6.1.1 GENDER IDENTIFICATION
The customized Convolutional Neural Network (CNN) is designed for gender
classification using RGB facial images with a resolution of 100 x 100 pixels. It
starts with a 2D convolutional layer with 16 kernel filters and the hyperbolic
tangent (tanh) activation function, combined with L2 regularization to reduce
overfitting. Batch normalization is then applied to stabilize training, followed by a
max-pooling layer that downsizes the spatial dimensions. The subsequent 2D
convolutional layers further refine feature extraction: the second layer uses 32 filters
with Rectified Linear Unit (ReLU) activation, followed by batch normalization and max
pooling, and this block is repeated in a third convolutional layer. A fourth
convolutional layer introduces 64 filters, followed by max pooling to reduce
dimensionality. These are followed by a fully connected layer with 128 units and ReLU
activation and by dropout regularization at a rate of 20% to mitigate overfitting. The
final layer uses sigmoid activation for binary gender classification ('male' as 0,
'female' as 1).
During model training, the Adam optimizer is employed with binary cross-entropy as
the loss function and accuracy as the evaluation metric. This architecture proved
effective for gender classification on RGB facial images of 100 x 100 pixels.
In the training phase, different batch sizes (256, 128, 64) were explored to find the
best configuration. Experimentation identified a batch size of 256 as giving the best
results, balancing convergence speed and reliable gradient estimation. The model's
performance was assessed with both the Adam and stochastic gradient descent (SGD)
optimizers across various learning rates. Adam consistently gave better results,
highlighting the importance of optimizer and learning rate selection.
Iterative hyperparameter adjustments played a vital role in optimizing the model.
Parameters such as the learning rate, early stopping patience, and minimum delta
were fine-tuned to increase accuracy over successive training runs. An independent
test set was used to assess the model's generalization capability, with test loss and
accuracy serving as the key metrics for performance on unseen data.
Finally, the model's practical use was shown by predicting the gender category
('male' or 'female') of selected images from the test set. These results illustrate the
model's ability to classify gender from RGB facial images.
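The roles of the early stopping patience and minimum delta mentioned above can be
illustrated with a simplified stopping rule. This is a sketch of the idea only, not
the Keras callback itself; the function name early_stopping_epoch and the loss values
are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the epoch index at which training would stop, or None.

    An epoch counts as an improvement only if the validation loss drops
    below the best value seen so far by more than min_delta. Training stops
    after `patience` consecutive epochs without such an improvement.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical validation losses for six epochs.
losses = [1.0, 0.8, 0.79, 0.795, 0.79, 0.6]
print(early_stopping_epoch(losses, patience=2, min_delta=0.05))
print(early_stopping_epoch(losses, patience=3, min_delta=0.0))
```

A larger minimum delta treats small fluctuations as no improvement and stops sooner,
while a larger patience lets training continue through short plateaus.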
Figure 12 Block diagram of CNN for gender identification

A single neuron with a 'sigmoid' activation function acts as the output layer, which
suits binary classification tasks such as gender prediction. The architecture is designed
to learn hierarchical features from input images and predict gender from the learned
representations. The Adam optimizer is used for training, and additional methods such
as early stopping and learning rate reduction are employed to improve model
performance and avoid overfitting.
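How a single sigmoid output is turned into a gender label, and how binary
cross-entropy scores it, can be sketched in plain Python. The 0.5 decision threshold
and the function names are assumptions for illustration; the label encoding
('male' as 0, 'female' as 1) follows the text.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true: int, p: float, eps: float = 1e-7) -> float:
    """Loss for one sample; p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def predict_gender(z: float, threshold: float = 0.5) -> str:
    """Map the raw output of the final neuron to a class label."""
    return "female" if sigmoid(z) >= threshold else "male"


print(predict_gender(2.0), predict_gender(-2.0))
print(binary_cross_entropy(1, 0.9) < binary_cross_entropy(1, 0.1))
```

The loss is small when the predicted probability agrees with the true label and grows
quickly as the prediction becomes confidently wrong.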
Table 9 Parameters used in CNN for gender identification
Parameter | Value
Batch size | 256
Convolutional layers | 4
Activation functions | Tanh, ReLU, Sigmoid
Loss function | Binary cross-entropy
Optimizer | Adam
Total number of parameters | 160289
Number of trainable parameters | 160129
Learning rate | 0.001
Dropout | 0.1
6.1.2 AGE IDENTIFICATION
The Convolutional Neural Network (CNN) developed for age prediction categorizes
facial photos into distinct age groups. The model contains 4 2D convolutional layers
whose number of kernel filters gradually increases from 16 to 128. Each convolutional
layer is followed by batch normalization to stabilize and speed up training, and uses
an activation function, either rectified linear unit (ReLU) or hyperbolic tangent
(tanh), to learn complex patterns hierarchically from the input images. Max-pooling
layers with a pool size of (2,2) are placed after each convolutional layer to scale
down the feature maps and improve computational performance.
The extracted facial features are then flattened and passed to the fully connected
(dense) layers that follow the convolutional layers. The model has 2 dense layers
with 256 and 128 neurons respectively, each followed by dropout at a rate of 0.1.
Dropout helps avoid overfitting by randomly deactivating a subset of neurons during
training. The output layer is composed of 5 neurons with the SoftMax activation
function, which yields a probability for each age group.
The Adam optimizer is used with categorical cross-entropy as the loss function
during training, which is appropriate for multi-class classification tasks such as
age group prediction. Accuracy, the percentage of correctly predicted age groups, is
used as the performance metric.
Moreover, L2 regularization with a coefficient of 0.01 is applied to the 2D
convolutional layers to penalize large weights and avoid overfitting.
This CNN is trained to pick up visual attributes and distinguish between the
different age categories from input facial images. The combination of convolutional,
batch normalization, dropout, and dense layers, together with suitable activation
functions and optimization techniques, underpins the model's potential for age
prediction using facial analysis.
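The SoftMax output, the categorical cross-entropy loss, and the accuracy metric
described above can be sketched in plain Python. The logits in the example are made
up, and the function names are hypothetical; only the five age groups come from the
text.

```python
import math

AGE_GROUPS = ["0-3", "3-18", "18-35", "35-60", "60-116"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-7):
    """Loss for one sample given a one-hot target vector."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def predict_age_group(logits):
    """Return the age group with the highest predicted probability."""
    probs = softmax(logits)
    return AGE_GROUPS[probs.index(max(probs))]


def accuracy(predicted, actual):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


probs = softmax([0.1, 0.2, 3.0, 0.5, -1.0])
print(predict_age_group([0.1, 0.2, 3.0, 0.5, -1.0]))
print(categorical_cross_entropy([0, 0, 1, 0, 0], probs))
```

Only the probability assigned to the true group contributes to the loss, so the loss
falls as the network concentrates probability on the correct age category.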
Figure 13 Block diagram of CNN for age identification
The output layer consists of five neurons with a SoftMax activation function and
provides age predictions that are optimized using the categorical cross-entropy loss
function. The model is trained with the Adam optimizer, which adapts the learning rate
for each parameter as training proceeds. Performance metrics are used to determine
how well the model categorizes age groups from facial images. Regularization
techniques are applied to prevent overfitting and improve the model's performance.
Table 10 Parameters used in CNN for age identification
Parameter | Value
Batch size | 256
Convolutional layers | 4
Activation functions | Tanh, ReLU
Loss function | Categorical cross-entropy
Optimizer | Adam
Total number of parameters | 688485
Number of trainable parameters | 688261
Learning rate | 0.001
Dropout | 0.1
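The difference between the total and trainable parameter counts in Table 9 and
Table 10 can be checked with a short calculation. In Keras, a batch normalization
layer holds two trainable values (scale and offset) and two non-trainable values
(moving mean and moving variance) per channel. The sketch below assumes that batch
normalization is the only source of non-trainable parameters. For the age model the
intermediate filter counts 32 and 64 are an assumption, since the text states only
the range 16 to 128, and the 3 x 3 kernel in the last line is likewise hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter of a 2D convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def batchnorm_params(channels):
    """Return (trainable, non_trainable) parameter counts for batch norm."""
    return 2 * channels, 2 * channels


def non_trainable_from_batchnorm(channel_counts):
    """Total non-trainable parameters over several batch norm layers."""
    return sum(batchnorm_params(c)[1] for c in channel_counts)


gender_gap = 160289 - 160129  # Table 9: total minus trainable
age_gap = 688485 - 688261     # Table 10: total minus trainable

# Gender model: batch normalization after the 16-, 32- and 32-filter layers.
print(gender_gap, non_trainable_from_batchnorm([16, 32, 32]))
# Age model: matches batch normalization over 112 channels, e.g. 16 + 32 + 64.
print(age_gap, non_trainable_from_batchnorm([16, 32, 64]))
# Example: a 3 x 3 convolution over an RGB input with 16 filters.
print(conv2d_params(3, 3, 3, 16))
```

Under these assumptions the gender figures agree with batch normalization on the
first three convolutional blocks, and the age figures are consistent with batch
normalization on 112 channels in total.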
6.1.3 TESTING, TRAINING AND VALIDATION
The dataset is divided into three subsets. 70% is used to train the model for a set
number of epochs with a given batch size. Callbacks are applied during training, such
as EarlyStopping to reduce overfitting and ReduceLROnPlateau to adjust the learning
rate dynamically. 20% of the dataset is used for validation to guide fine-tuning of the
model. The remaining 10% forms the testing subset, on which the trained model is
evaluated to check its performance and robustness on unseen data.
A separate testing subset, constituting 10% of the dataset, is dedicated to evaluating the
model's performance on previously unseen data. The model is applied to this subset,
and predictions are compared with the actual gender labels. This testing process
provides insights into the generalization capabilities of the model and its effectiveness
in real-world scenarios.
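The 70/20/10 division described in this section can be sketched as a shuffled index
split. The random seed, the rounding rule, and the function name split_indices are
assumptions for illustration; the image count of 23,706 is the figure given for the
dataset.

```python
import random


def split_indices(n, train=0.7, val=0.2, seed=42):
    """Shuffle sample indices and split them into train/validation/test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_train = round(n * train)
    n_val = round(n * val)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]  # the remainder, roughly 10%
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(23706)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting indices rather than images keeps the three subsets disjoint, so that no
test image is seen during training or validation.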
6.2 MACHINE LEARNING MODEL

The goal is to discover the most effective and ideal technique for identifying a person's
gender from facial images using machine learning techniques. The standard procedure
for identifying a person's gender from photos is reading the images from the image
database and then extracting the facial image in accordance with the type of dataset.
6.2.1 PREPROCESSING
We used the UTKFace dataset, reading each image from it. Pre-processing is
necessary for all the photos in the dataset, including normalization against
brightness variations, scaling, and noise removal. The facial area is identified by
applying the Viola-Jones technique to each image, and the detected faces are then
scaled to a fixed size of 48x48 pixels.
6.2.2 FEATURE EXTRACTION
Extracting usable features from face photos is a crucial step in successful gender
classification. Histogram of Oriented Gradients (HOG) features and Gabor filter
features are combined as the input for gender classification.
 HISTOGRAM OF ORIENTED GRADIENTS (HOG)
The Histogram of Oriented Gradients divides an image into small cells and computes a
histogram of gradient orientations for each cell. The histograms are then combined
into a single feature vector that describes the overall structure of the image. The
HOG features are extracted using “skimage”.
 GABOR FILTERS
Gabor filters, named after physicist Dennis Gabor, are linear filters commonly used in
image processing and computer vision. They are particularly effective for texture
analysis and edge detection. These features are extracted using “OpenCV”.
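To make the HOG idea described above concrete, the following is a simplified NumPy
sketch of per-cell orientation histograms. It is not the skimage implementation: it
omits block normalization and bin interpolation, and the cell size of 8 pixels and
9 orientation bins are assumed values chosen for illustration.

```python
import numpy as np


def hog_cell_histograms(image, cell=8, bins=9):
    """Magnitude-weighted histograms of gradient orientation, one per cell."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)  # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_idx[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    # Concatenate all cell histograms into one feature vector.
    return hist.reshape(-1)


# A 48 x 48 horizontal intensity ramp: every gradient points along the x axis.
ramp = np.tile(np.arange(48.0), (48, 1))
features = hog_cell_histograms(ramp)
print(features.shape)
```

For a 48x48 face and 8-pixel cells this gives 6 x 6 cells of 9 bins each, that is,
a 324-dimensional vector before any block normalization.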
Combining features means merging different attributes of a data set into one feature
representation. We combined the features by concatenation, that is, by joining the
individual feature vectors into a single vector, which results in a higher-dimensional
feature representation.
We then split the dataset into training and testing sets to evaluate the model's
performance: 80% of the data is used for training the model and 20% for evaluating
it.
6.2.3 PRINCIPAL COMPONENT ANALYSIS (PCA)
We then reduced the dimensionality of the feature vector using PCA. PCA is a
statistical technique that decreases the dimensionality of data while retaining most
of its information. Its primary goal is to identify a new set of orthogonal axes,
known as principal components, that capture the most significant variation in the
data.
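The projection onto principal components can be sketched with NumPy using the
singular value decomposition of the centered data. The synthetic data, the number of
retained components, and the function names pca_fit and pca_transform are assumptions
made for illustration.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the mean, principal axes and explained variance ratios."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / (X.shape[0] - 1)
    ratio = explained[:n_components] / explained.sum()
    return mean, vt[:n_components], ratio


def pca_transform(X, mean, components):
    """Project samples onto the retained principal components."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
# Synthetic features whose columns have very different variances.
X = rng.normal(size=(200, 10)) * np.array([10, 5, 2, 1, 1, 1, 1, 1, 1, 1])
mean, components, ratio = pca_fit(X, n_components=3)
Z = pca_transform(X, mean, components)
print(Z.shape, ratio.round(3))
```

The explained variance ratios indicate how much information the retained components
keep, which helps when choosing how many components to pass to the classifier.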
6.2.4 SUPPORT VECTOR MACHINE (SVM)
Gender recognition of humans has been achieved in this study by using SVM
Classifiers. This step involves training the SVM classifier so that it will be able to
differentiate between the classes i.e., female and male.
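The steps of Section 6.2 (feature concatenation, 80/20 split, PCA, and SVM training)
can be sketched with scikit-learn as follows. This is a minimal illustration on
synthetic stand-in vectors rather than real HOG and Gabor features. The feature
sizes, the standardization step, the number of PCA components, and the RBF kernel
are all assumptions, since the text does not state them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic stand-ins for the per-image HOG and Gabor feature vectors.
hog_features = rng.normal(size=(n_samples, 48))
gabor_features = rng.normal(size=(n_samples, 16))
y = rng.integers(0, 2, size=n_samples)  # 0 = male, 1 = female

# Combine the two feature sets by concatenation along the feature axis.
X = np.concatenate([hog_features, gabor_features], axis=1)
X[y == 1] += 0.8  # shift one class so the toy problem is learnable

# 80% of the samples for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scaling and PCA are fitted on the training data only, inside the pipeline.
model = make_pipeline(StandardScaler(), PCA(n_components=20), SVC(kernel="rbf"))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(X_train.shape, X_test.shape, accuracy)
```

Placing PCA inside the pipeline means the principal components are learned from the
training split alone, so no information from the test split leaks into the model.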
Figure 14 Flowchart of gender identification using SVM
Chapter 7
RESULTS AND DISCUSSION
7.1 PREPROCESSING AND PARAMETERS SETTING
7.1.1 GENDER IDENTIFICATION
Images are scaled to 100 x 100 x 3 (RGB). The output dense layer consists of one
unit with a ‘sigmoid’ activation function, whose output indicates the ‘male’ or
‘female’ class. As this is a binary classification problem, the loss function used
is binary cross entropy. Optimization is done using the Adam optimizer with a learning
rate of 0.0001. The final model consists of 688261 trainable parameters.
7.1.2 AGE IDENTIFICATION
Similarly, the images are first scaled to 100 x 100 x 3 (RGB channels) and then
passed to the model, which classifies the age category. As this is a multi-class
classification model, the loss function used is categorical cross entropy, and the
optimizer and learning rate are the same as for gender identification. This model
also contains 688261 trainable parameters.
7.2 EXPERIMENTAL RESULTS
7.2.1 GENDER IDENTIFICATION USING CNN
The CNN model described here is specifically designed for gender identification
using RGB images of size 100 x 100 pixels. The decision to convert images to
RGB is strategic, focusing on essential image features for gender classification.
To optimize the model's performance, a comprehensive experimental approach was
employed. The model's performance was systematically evaluated across a range
of batch sizes (512, 256, 128, 64) to identify the batch size that best balances
training efficiency and computational resources. Through experimentation, it was
determined that a batch size of 256 achieved the best balance of performance and
resource efficiency for the CNN model.
Besides the batch size, many combinations of other hyperparameters were explored.
These included different optimizers such as Adam and SGD with a range of learning
rates, as well as different patience values for early stopping and minimum delta
thresholds for stopping the training process. These hyperparameters play vital
roles in managing model convergence speed, stability, and generalization.
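The interaction between the patience value and the minimum delta can be sketched in
a few lines of Python. This is a simplified illustration of the early stopping rule,
not the callback used in this study; the class name EarlyStopper, the helper
epochs_run, and the default values are assumptions introduced here.

```python
class EarlyStopper:
    """Stop once the validation loss has failed to improve by at least
    min_delta for `patience` consecutive epochs."""

    def __init__(self, patience=5, min_delta=1e-3):  # assumed default values
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement is large enough: remember it and reset the counter.
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


def epochs_run(val_losses, patience, min_delta):
    """Return the number of epochs executed before training stops."""
    stopper = EarlyStopper(patience, min_delta)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)
```

A larger patience tolerates longer plateaus before stopping, while a larger minimum
delta ignores small fluctuations in the validation loss that would otherwise count as
progress.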
The parameter values were adjusted iteratively to improve the model's performance
and convergence. This process gave crucial insights into the model's behavior under
different training conditions for gender prediction.
Additionally, the generalizability of the trained model was assessed by applying it
to data that was not included in the training and validation sets. To assess how
well the model generalized to the independent test set, evaluation metrics such as
test loss and accuracy were computed, giving insight into its real-world
performance and usability.
As an example, a specific image from the test set was used to show the model's
gender prediction ('male' represented as 0 and 'female' represented as 1). Image
preprocessing was applied to prepare the test image for model input, and the
predicted gender was the class with the highest output probability.
Overall, this comprehensive experimental approach, combined with iterative
hyperparameter tuning and thorough evaluation, validates the CNN model's
efficacy in gender identification tasks. The study's results offer key insights into
optimizing CNN architectures for similar image classification tasks and highlight
the model's effectiveness in real-world applications requiring precise gender
prediction from RGB images.
Table 11 Comparison of the effect of different learning rates on the training dataset
Learning rate    Gender accuracy
0.0001           89.8%
0.001            91.46%
0.01             91.8%
0.1              93.19%
1.0              92.72%
Table 12 Comparison of the effect of different batch sizes on the testing dataset
Batch size    Gender accuracy
512           88.16%
256           88.2%
128           87.19%
64            86.64%
Figure 15 Test Accuracy of gender identification CNN model
Figure 16 Test Loss of gender identification CNN model
7.2.2 AGE IDENTIFICATION USING CNN
This study takes a systematic approach to optimizing the hyperparameters of the age
identification model. The model's robustness was evaluated across different batch
sizes (see Table 14) while exploring many combinations of learning rates, patience
values for early stopping, and minimum delta thresholds for stopping the training.
Each configuration was trained on a designated training set and validated on a
separate validation set to gauge its effect on the performance metrics.
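One way to enumerate such configurations is a plain grid, sketched below in Python.
The batch sizes and learning rates are the values listed in Tables 14 and 13; the
patience and minimum delta values are hypothetical placeholders, since the study does
not list them, and the helper names build_grid and select_best are assumptions
introduced here.

```python
from itertools import product

BATCH_SIZES = [512, 256, 128, 64]                # values listed in Table 14
LEARNING_RATES = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]  # values listed in Table 13
PATIENCES = [3, 5]          # hypothetical values, not reported in the study
MIN_DELTAS = [1e-3, 1e-4]   # hypothetical values, not reported in the study


def build_grid(batch_sizes, learning_rates, patiences, min_deltas):
    """Return one configuration dict per combination of hyperparameter values."""
    return [
        {"batch_size": b, "learning_rate": lr, "patience": p, "min_delta": d}
        for b, lr, p, d in product(batch_sizes, learning_rates, patiences, min_deltas)
    ]


def select_best(results):
    """Pick the (config, validation_accuracy) pair with the highest accuracy."""
    return max(results, key=lambda item: item[1])


grid = build_grid(BATCH_SIZES, LEARNING_RATES, PATIENCES, MIN_DELTAS)
```

Each entry of grid would be trained on the training set and scored on the validation
set, and select_best would then return the configuration to carry forward to the test
set.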
Accuracy and loss metrics were observed throughout this iterative process to
understand how changes in hyperparameters affected the model's learning dynamics
and convergence. Systematically changing these parameters provided valuable
insights into the model's behavior under different training circumstances.
Additionally, the generalizability of the trained model was assessed by applying it
to unseen data that was not included in the training and validation sets. Test loss
and accuracy were used to assess the model's performance on this unseen data. The
model's predictive capability under real-world conditions was demonstrated by
generating predictions on unseen photos.
Image preprocessing was applied to format the test data so that it matched the
model's input requirements. After preprocessing, the model predicts the age
category with the highest probability among its output categories.
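The final step, choosing the age group with the highest output probability, can be
sketched as follows. The labels are the age ranges defined in Section 4.2; the order
in which they map to output indices and the function name predict_age_group are
assumptions introduced here.

```python
# Age ranges from Section 4.2; the index order is an assumption.
AGE_GROUPS = ["0-3", "3-18", "18-35", "35-60", "60-116"]


def predict_age_group(probabilities, labels=AGE_GROUPS):
    """Return the label with the highest predicted probability and that probability."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per age group")
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]
```

For example, an output vector of [0.05, 0.10, 0.60, 0.20, 0.05] would be decoded as
the "18-35" group under this assumed ordering.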
Collectively, these comprehensive experiments provide a robust foundation for
understanding the model's behavior across different hyperparameter settings,
evaluating its generalization capability, and validating its effectiveness in
age identification.
Table 13 Comparison of the effect of different learning rates on the training dataset
Learning rate    Age accuracy
0.00001          92.75%
0.0001           94.11%
0.001            94.51%
0.01             94.7%
0.1              95.13%
Figure 17 Accuracy vs Learning Rate
Table 14 Comparison of the effect of different batch sizes on the testing dataset
Batch size    Age accuracy
512           69.77%
256           81.81%
128           72.5%
64            73.57%
Figure 18 Test Accuracy of age identification CNN model
Figure 19 Test loss of age identification CNN model
7.2.3 PERFORMANCE METRICS
• ACCURACY
The accuracy of the Convolutional Neural Network is 89.49% for gender prediction
and 81.81% for age prediction. Accuracy is the proportion of correctly classified
gender labels among all predictions made and, likewise, the proportion of correctly
categorized age instances in the test set. The high accuracy shows that the model
discriminates well between male and female subjects and across different age
categories based on their facial features.
• F1 SCORE
The F1 score for gender prediction is 0.886, which gives a balanced view of the
model's precision and recall. It is a robust metric for binary classification tasks
like gender prediction because it accounts for both false positives and false
negatives. A high F1 score indicates a good balance between minimizing false
positives and false negatives, suggesting that the model captures the features
relevant to gender prediction in the dataset. For age prediction, the macro F1
score across all age categories is approximately 0.7607, which gives a balanced
view of the model's precision and recall over all age categories.
• ROC
The Receiver Operating Characteristic (ROC) curve, along with its corresponding
Area Under the Curve (AUC), measures the model's ability to distinguish between
male and female subjects across different threshold values. The ROC AUC is
approximately 0.9658, which indicates that the model distinguishes between male
and female faces well, with a comparatively high true positive rate and a low
false positive rate.
• SPECIFICITY
Specificity (the true negative rate) is calculated as 0.907 and measures the
percentage of correctly identified male subjects. It gives insight into the model's
performance in identifying male faces specifically. A high specificity value means
that the model has a low rate of false positives among male subjects, which further
supports its capability to identify gender from facial features. For age,
specificity measures the true negative cases within each age category. It is
calculated as 0.7392, and the model's performance varies across age categories.
Classes with higher specificity, such as class 0, are better at recognizing
individuals who fall outside that age category.
• SENSITIVITY
Sensitivity (the true positive rate) is calculated as 0.8797 and measures the
percentage of correctly identified female subjects. It gives insight into the
model's performance in identifying female faces specifically. A high sensitivity
value means that the model has a low rate of false negatives among female subjects,
which further supports its capability to identify gender from facial features. For
age, sensitivity measures the true positive cases within each age category. It is
calculated as 0.7878, and the model's performance varies across age categories.
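The relationship between these metrics can be illustrated with a short Python sketch
that derives them from confusion-matrix counts. It is not the evaluation code used in
this study. It assumes that 'female' (1) is the positive class, and it assumes that
the age values are macro averages of one-vs-rest scores; the text states this only for
the F1 score. The function names are introduced here for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F1 from confusion counts.

    Assumes every denominator is non-zero (each class occurs and is predicted).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def macro_metrics(confusion):
    """Macro-average one-vs-rest metrics for a multi-class confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        per_class.append(binary_metrics(tp, fp, tn, fn))
    names = ("sensitivity", "specificity", "f1")
    return {name: sum(m[name] for m in per_class) / n for name in names}
```

With this convention, specificity describes how reliably the negative class (male) is
recognized and sensitivity describes the positive class (female), which matches the
way the two values are interpreted above.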
Table 15 Performance Metrics for age and gender prediction through CNN
Model         Accuracy    Specificity    ROC/AUC    Sensitivity    F1 Score
Age CNN       81.69%      73.92%         -          78.78%         76.07%
Gender CNN    89.49%      90.77%         96.58%     87.97%         88.69%
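The ROC AUC reported for the gender model can be read as the probability that a
randomly chosen positive sample receives a higher score than a randomly chosen
negative sample. The Python sketch below computes the AUC directly from that
definition. It is an illustration under the assumption that 'female' (1) is the
positive class, not the evaluation code used in this study, and the function name
roc_auc is introduced here.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))
```

This pairwise form is quadratic in the number of samples, so it suits small
illustrations; a sorted, rank-based computation gives the same value more efficiently
on a full test set.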
[Bar chart: CNN model performance for age prediction, showing Accuracy, F1 Score,
Sensitivity, and Specificity]
Figure 20 Graph of performance metrics for age identification
[Bar chart: CNN model performance for gender prediction, showing Accuracy, F1 Score,
Sensitivity, Specificity, and ROC]
Figure 21 Graph of performance metrics for gender identification
7.2.4 GENDER IDENTIFICATION USING SVM
Table 16 Comparison of effects of different kernels on the testing dataset
Kernel    Accuracy
Linear    82%
Poly      85%
RBF       86%
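The three kernels compared above differ only in how they measure the similarity of
two feature vectors. The Python sketch below writes out the standard definitions. The
parameter values (degree, gamma, coef0) are assumptions chosen for illustration,
since the study does not report the kernel settings it used.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of the two vectors."""
    return sum(a * b for a, b in zip(x, z))


def poly_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, z> + coef0) ** degree (assumed parameters)."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - z||^2) (assumed gamma)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)
```

The RBF kernel depends only on the distance between the two vectors, which lets the
SVM draw non-linear decision boundaries in the original feature space.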
7.2.5 PERFORMANCE MEASURES FOR GENDER PREDICTION USING SVM
• Accuracy
The overall accuracy of the model is 86%, meaning that it predicts gender correctly
86% of the time. This accuracy was achieved using the 'rbf' kernel of the support
vector machine.
• Precision
Precision measures how many of the model's positive predictions are correct. The
model achieves a precision of 85% for males and 86% for females: when the model
predicts male or female, it is right 85% and 86% of the time, respectively.
• Recall
Recall, or sensitivity, measures the model's ability to find every relevant instance.
For this model, the recall is 84% for males and 87% for females, meaning that the
model correctly identifies 84% of all actual males and 87% of all actual females.
• F1 score
The F1 score, the harmonic mean of precision and recall, is 85% for males and 86%
for females. These F1 scores indicate a good balance between precision and recall
and give a more complete picture of the model's effectiveness in predicting both
genders.
Class         Accuracy    Precision    Recall    F1 Score
Female (0)    90%         90%          91%       90%
Male (1)      90%         89%          89%       89%