Autism Detecting Model Using Image
Md Tahmid Zoayed1, Sayma Haque Arshe2, and Plabon Banik3*
1,2,3 Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
[email protected], [email protected], [email protected]
*Corresponding Author
Abstract. This paper presents a model that can detect autism-affected patients from their images. Images of autistic and non-autistic subjects were collected from the Kaggle website; the dataset contains approximately 2,940 images in total. We implemented a CNN (Convolutional Neural Network) to build this model, using the TensorFlow and Keras libraries along with several image-processing functions to smooth the pipeline. The model could be deployed in hospitals and clinics as a GUI or software tool to screen for autism without any kind of medical procedure.
Keywords: Autism detection, autism detection from images, autism detection using CNN, TensorFlow, Keras, deep learning, disease detection using CNN.
1 Introduction
Early identification of autism spectrum disorder (ASD), a neurological condition, can help keep the subject's mental and physical health in good shape. With the increased use of machine-learning-based models to forecast numerous human diseases, early diagnosis based on multiple health and physiological parameters appears viable. Machine learning
is now being used to diagnose disorders such as depression and ASD. The key goals of applying machine-learning techniques are to enhance diagnostic accuracy and minimize diagnosis time, enabling faster access to health-care services. In this work, we aim to detect the disease from an image of a person and examine how much such a model can contribute to the medical industry.
Recently, many efforts have been made to identify ASD with deep learning on fMRI data (Koyamada et al., 2015; Anirudh and Thiagarajan, 2017; Subbaraju et al., 2017). In Koyamada et al. (2015), a deep neural network (DNN) was implemented to build a subject-transfer decoder. The authors used principal sensitivity analysis (PSA) to construct a decoder for visualizing different features of all individuals in the dataset.
To implement the model we use a CNN (Convolutional Neural Network), a deep-learning algorithm. Speech recognition, image classification, automotive software engineering, and neuroscience are areas in which CNNs have performed remarkably well. This progress is mainly due to a combination of algorithmic improvements, greater computational resources, and access to large amounts of data, which is why we chose a CNN for this work. We try to maximize accuracy by adding hidden layers and using a strong dataset.
2 Related Work
Previously, several researchers have built models to identify autism from pictures, each taking a different approach.
As Rad and Furlanello note, "most of the investigations were chiefly centered around the social and communicational issues of the youngsters with ASD, while the stereotypical motor movements of the patients stood out enough to be noticed." Because stereotypical motor movements (SMMs) are such an important part of the abnormal, repetitive behaviors of children with autism, it is critical to develop effective and precise techniques for detecting them [14]. Another study uses AI to model sequences of repetitive walking patterns based on the motor and kinematic characteristics of gait; a linear classification analysis, with its positives and negatives, was conducted to identify the problem, examining "existing ASD screening tools and the consistency of such tools using the DSM-IV instead of the DSM-5 manual". Thomas et al. study the diagnosis of autism using magnetic tomography of the brain [12]. Another artificial neural network was built on the dataset from the ASD Tests program as a data source, gathering information through 10 questions, including the participant's age and gender.
The images were collected without bias from every ethnicity across the world; the collection totals approximately 2,940 images. We work on this dataset and implement our model using its different folders.
The images in the dataset folders are of different sizes and contain some noise. To reduce the noise and make the dataset smoother, we applied several preprocessing techniques. We preprocessed the train and valid folders using the ImageDataGenerator function. In it, we rescaled each image by 1/255, since pixel values lie in [0, 255]. We also used 20% shear and 20% zoom-in to obtain clearer views of the images; the shear lets the model see faces from varied angles and detect them better. We enabled horizontal flipping, as we want the model trained on horizontally flipped images as well. Finally, we resized all images to 200x200 for the smooth operation of the algorithm.
Convolutional Neural Networks are among the most widely used deep neural networks and are most typically applied to visual imagery. They are feed-forward networks inspired by the human brain: simpler patterns (lines, curves, etc.) are detected first, then progressively more complex patterns (faces, objects, etc.). Typically, a CNN module includes three layers:
• Convolutional layer
• Pooling layer
• Fully connected layer
Input: An image
1. Begin
2. for j = 1:J        % loop over output feature maps
3.   for i = 1:I      % loop over input feature maps
4.     for m = 1:M    % loop over rows of the output feature map
5.       for n = 1:N  % loop over columns of the output feature map
6.         for p = 1:P      % loop over rows of the filter kernel
7.           for q = 1:Q    % loop over columns of the filter kernel
8.             OF_j(m, n) = OF_j(m, n) + F_{i,j}(p, q) × IF_i(m + p − 1, n + q − 1)
9. End
Here I and J denote the numbers of input and output feature maps, M and N the numbers of rows and columns in the output feature maps, and P and Q the numbers of rows and columns in the filter kernels, respectively. Additional quantities, such as the number of filters, can be derived from these values (for example, from J).
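As a runnable sketch, the six-loop pseudocode above translates directly into Python with 0-based indices (this is a valid cross-correlation, which deep-learning libraries conventionally call convolution):

```python
import numpy as np

def conv2d_loops(IF, F):
    """Direct translation of the six-loop pseudocode above.

    IF: input feature maps, shape (I, H, W)
    F:  filter kernels,     shape (I, J, P, Q)
    Returns OF: output feature maps, shape (J, M, N)
    """
    I, H, W = IF.shape
    I2, J, P, Q = F.shape
    assert I == I2, "filter bank must match the number of input maps"
    M, N = H - P + 1, W - Q + 1              # output spatial size
    OF = np.zeros((J, M, N))
    for j in range(J):                       # loop over output feature maps
        for i in range(I):                   # loop over input feature maps
            for m in range(M):               # rows of the output map
                for n in range(N):           # columns of the output map
                    for p in range(P):       # rows of the filter kernel
                        for q in range(Q):   # columns of the filter kernel
                            OF[j, m, n] += F[i, j, p, q] * IF[i, m + p, n + q]
    return OF
```

With a single 3x3 input map and a single 2x2 all-ones kernel, each output entry is the sum of the corresponding 2x2 window.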
3.3.2 Pooling:
1. Start
2. Input: input features x_1 … x_N, k_top, total layers L
3. Output: updated k_1
4. Extract features: y_i ← f(x; b)
5. Convolve network: y_j = Σ_{i=0}^{k} (k_{ij} * x_i)
6. Compute activation layer: y_i(l+1) = f((k_{ij}) y_i) + b_i
7. Calculate k_map
8. Update k_map = k_map × Gaussian R
9. Choose pooling: k_1 = max(k_top, k_map)
10. End
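The core pooling operation itself can be sketched in a few lines: each window of the feature map is reduced to its maximum value (a minimal max-pooling example, not the full procedure above):

```python
import numpy as np

def max_pool2d(fmap, size=2, stride=2):
    """Reduce each (size x size) window of a 2-D feature map to its max."""
    H, W = fmap.shape
    M, N = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            window = fmap[m * stride : m * stride + size,
                          n * stride : n * stride + size]
            out[m, n] = window.max()     # keep only the strongest activation
    return out
```

On a 4x4 map with 2x2 windows and stride 2, this halves each spatial dimension while keeping the strongest response in each region.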
As in a typical FCNN, this layer has full connections to all preceding and succeeding layers. The FC layer maps representations between input and output. We must flatten the final pooled output k_1 and produce an output of the desired size in this layer, giving us the final result. It can be computed as a matrix multiplication followed by a bias offset.
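The flatten-then-multiply step described above is just this (a minimal sketch; the weight matrix W and bias b would be learned in practice):

```python
import numpy as np

def fully_connected(pooled, W, b):
    """Flatten the final pooled maps, then apply a matrix
    multiplication followed by a bias offset, as described above."""
    x = pooled.reshape(-1)    # flatten k_1 into a vector
    return W @ x + b          # matrix multiply plus bias
```

For example, with a 2x2 pooled map the flattened vector has 4 entries, and W must have 4 columns.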
3.4 Xception:
Xception is a deep CNN that replaces the regular Inception modules with depthwise separable convolutions: layers of depthwise convolution followed by a layer of pointwise convolution. For the picture identification and classification job, the Xception model was trained on the ImageNet dataset [2, 3]. There are two types of transfer learning: feature extraction and fine-tuning. In this study, the pretrained model was employed for feature extraction: trained to extract features from a standard dataset, it was applied to the new dataset after removing the model's top layers, and custom top layers were added to the model. The number of classes determines the categorization. Fine-tuning has been used to customize generic characteristics for a specific class.
3.5 VGG16:
VGG16 is one of the most excellent vision models built on the convolutional neural network (CNN) architecture. VGG16 consistently uses 3x3 convolution filters with stride 1 and max-pool layers of 2x2 filters with stride 2, maintaining this convolution and max-pool layout throughout the architecture. These are followed by two fully connected layers and a softmax output. The name VGG16 refers to its 16 weighted layers.
3.6 VGG19:
The VGG19 network was built primarily to win the ILSVRC, but it has been utilized in a variety of other ways as well. It serves as a suitable classification architecture for many additional datasets, and because the authors made the models public, they may be used as is or with modifications for other comparable jobs, for example in transfer learning for facial recognition applications. Weights are readily available in frameworks such as Keras and may be tinkered with and utilized as desired. VGG19 is essentially an upgraded version of VGG16, with 19 layers instead of 16.
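The feature-extraction style of transfer learning described in the two sections above can be sketched with Keras' built-in VGG16 (the same pattern applies to VGG19 and Xception). Here weights=None keeps the sketch self-contained; in practice weights='imagenet' would load the pretrained filters. The custom top-layer size is an assumption:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Backbone without its fully connected top layers.
base = VGG16(weights=None, include_top=False, input_shape=(200, 200, 3))
base.trainable = False                      # feature extraction: freeze the backbone

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(128, activation='relu'),   # custom top layer (size is an assumption)
    layers.Dense(1, activation='sigmoid'),  # binary: autistic vs. non-autistic
])
```

Five 2x2 max-pools reduce the 200x200 input to 6x6x512 feature maps before the custom top.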
The proposed system was implemented on a machine running Windows 10 with a Core i3 at 2.4 GHz and 16 GB of RAM. The work was done in Jupyter Notebook using the Keras and TensorFlow packages, with Python 3.6.7.
4.2 Implementation
Images of the datasets are transformed and scaled by our preprocessing techniques. Now it is time to apply the CNN to them and obtain a result. We used 2 convolution and 2 max-pool layers in the model. After flattening, we used 3 dense layers, of which 2 are hidden and one is the output. The output uses the 'sigmoid' activation function, while all other layers use the 'ReLU' activation function. In compilation, we used 'binary_crossentropy' as the loss function and set the learning rate to 0.00001. Since the dataset poses a binary classification problem, the binary cross-entropy loss is
Loss = −[y·log(ŷ) + (1 − y)·log(1 − ŷ)].
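The architecture just described can be sketched in Keras as follows. The filter counts and hidden-layer sizes are assumptions, since the text does not specify them:

```python
from tensorflow.keras import layers, models, optimizers

# Two convolution + max-pool blocks, then three dense layers
# (two hidden with ReLU, one sigmoid output), as described above.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(200, 200, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(
    loss='binary_crossentropy',                     # binary class classification
    optimizer=optimizers.Adam(learning_rate=1e-5),  # learning rate 0.00001
    metrics=['accuracy'],
)
```

The choice of Adam as optimizer is also an assumption; the paper only states the learning rate.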
After setting all the parameters, we moved on to fitting the model. We trained for 21 epochs, using the valid folder for validation. We also restored the best weights to obtain the best possible result and set the early-stopping patience to 3. After that, we ran every model for 5 epochs to get an idea of which algorithm works best.
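The patience and best-weight restoration described above map onto Keras' EarlyStopping callback. The generator names in the commented fit call are assumptions:

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training if validation performance does not improve for 3 epochs,
# and restore the best weights seen so far, as described above.
early_stop = EarlyStopping(patience=3, restore_best_weights=True)

# model.fit(train_gen, validation_data=valid_gen, epochs=21,
#           callbacks=[early_stop])   # train_gen/valid_gen are assumptions
```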
The best accuracy obtained over 21 epochs of this model is 77%. With 5 epochs, Xception, VGG19, VGG16, and our proposed model achieve 66%, 73%, 71%, and 73%, respectively, giving a clear picture of which model works best.
We plotted the accuracy to visualize it more effectively; the plot helps in understanding the model and its accuracy. We then use the model to check the test images from the test folder: if the predicted value is '0', the person is affected by autism; otherwise the person is non-autistic. This lets us check the accuracy of our model further.
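The decision rule just stated — class 0 means autistic — amounts to thresholding the sigmoid output. The 0.5 threshold is an assumption (it is Keras' conventional default for binary classification):

```python
def label_prediction(p, threshold=0.5):
    """Map a sigmoid output p to the paper's labels: outputs that round
    to class 0 indicate an autistic subject, otherwise non-autistic."""
    return 'autistic' if p < threshold else 'non-autistic'
```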
The performance of each model differs from the others; based on their performance, some work very well as models. From the above discussion, we find that VGG19 and our proposed model work best on this dataset, while the others do an average job. Increasing the number of epochs from 5 to 21 would yield higher accuracy. Their accuracies with 5 epochs are listed in the table.
Model            Accuracy (5 epochs)
Xception         66%
VGG16            71%
VGG19            73%
Proposed model   73%
5 Conclusion
References
18. Rajaram, M. Concerns with 'Detect Autism' Dataset. Kaggle. Available online: www.kaggle.com/melissarajaram/concerns-with-detect-autism-dataset (accessed on 6 August 2021).
19. Musser, M. Detecting Autism Spectrum Disorder in Children with Computer Vision. Medium. 24 August 2020. Available online: https://towardsdatascience.com/detecting-autism-spectrum-disorder-in-children-with-computer-vision-8abd7fc9b40a (accessed on 1 August 2021).
20. Vo, T.; Nguyen, T.; Le, T. Race Recognition Using Deep Convolutional Neural Networks. Symmetry 2018, 10, 564. https://doi.org/10.3390/sym10110564.
21. Chaudhuri, A. Deep Learning Models for Face Recognition: A Comparative Analysis. In Deep Biometrics; Springer: Singapore, 2020; pp. 99–140.
22. Gwyn, T.; Roy, K.; Atay, M. Face Recognition Using Popular Deep Net Architectures: A Brief Comparative Study. Future Internet 2021, 13, 164. https://doi.org/10.3390/fi13070164.