
Artificial Neural Network and Its Applications

Alexandros Vasileiadis, Eirini Alexandrou, Lydia Paschalidou, Maria Chrysanthou, Maria Hadjichristoforou


Abstract—This paper focuses on Artificial Neural Networks (ANNs) and their applications. It first explores the core concepts of a neural network (NN), including its inspiration, basic structure, and training process, along with an overview of the most commonly used models. The paper then examines three fields in which ANNs play an important role: (1) Computer Science, (2) Security, and (3) Health Care. These fields are singled out because of their broad impact on various aspects of society. For each field, the paper discusses how NNs have been used to solve problems, the architectures employed, notable applications of NNs within the domain, and the challenges that arise from implementing NNs. Lastly, it discusses the future directions of ANNs, exploring potential advancements in architecture, models, and applications across diverse domains.

Index Terms—Artificial Neural Networks, Neural Networks, Core Concepts, Training, Models, Applications, Computer Science, Security, Health Care, Challenges, Architecture, Future Direction

I. INTRODUCTION

An Artificial Neural Network (ANN) is a machine learning model designed to emulate human decision-making processes by simulating how biological neurons work. It consists of interconnected layers of units through which data flows in an orderly sequence. Specifically, the units are organised into three kinds of layers: (1) an input layer, (2) a hidden layer, and (3) an output layer. Even though ANNs are a simplified version of how our brain works, they are adept at learning to solve difficult problems through training, using experiments and observations. In other words, they are proficient at comprehending intricate patterns and connections. [1]

In particular, ANNs function by manipulating inputs and adjusting connections between neurons. They perform a range of pattern recognition and mapping tasks. They can rebuild stored patterns from partial or noisy inputs, associate a given pattern with another pattern in a temporal sequence, create new patterns for complex problems, and group similar patterns into clusters by creating new pattern representatives for them. [2]

There are various types of neural networks, including Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). ANNs find applications across various domains, most of which use feedforward architectures and the backpropagation (BP) training algorithm. Despite the numerous training techniques, establishing an optimal ANN for a particular application remains a notable challenge. This challenge is underlined by compelling evidence from both biological and technical perspectives, suggesting that the effectiveness of an ANN in manipulating knowledge is affected by its design. [1]

This research focuses on neural network applications in computer science, security, and healthcare. It explores how ANNs can be used in these fields, examines their impact and challenges, and discusses their potential future.

Neural networks are used in computer science for problem-solving across various disciplines. Through algorithms, they can perform various tasks, including image recognition, natural language processing (NLP), machine translation, and speech recognition, and they help in developing language translation systems. Building on their success in computer science, ANNs have extended their applications into the security sector. Various types of ANNs, such as Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), and Recurrent Neural Networks (RNNs), play an important role in addressing security matters such as fraud detection, cybersecurity threats, and facial recognition. Moreover, ANN architectures have expanded into the realm of healthcare. They can help analyse medical images such as MRIs, CT scans, X-rays and ultrasounds, which supports early clinical diagnosis. ANNs can also predict epidemic outbreaks, organise patients' health records and personalise their medicine.

II. CORE CONCEPTS

The concept of the ANN takes its inspiration from the human brain, particularly its building blocks, the neurons. The human brain is a powerful tool that can perform tasks such as thinking, recognising, and solving hard and complex problems, and in all of these the neuron, an electrically excitable cell, plays a big part. Our brain consists of an estimated 90 to 100 billion neurons, each connected to between 1 and 10 thousand others, which makes up to 10^15 interconnections. Neurons communicate by sending electrical and chemical signals to their neighbours. Signals are sent through the neuron's branch, the axon, which further extends into smaller segments called collaterals. At the end of these collaterals, junctions known as synapses form connections with neighbouring neurons, allowing the neuron to transfer a signal. Meanwhile, on the receiving end, the neuron's

dendrites receive those signals via the synapses and merge them within the soma, the neuron body. Neurons with stronger synaptic connections have a greater impact on each other. This massive web of billions of interconnected neurons working together allows the brain to achieve its amazing abilities. [3]

A. Structure of a neural network

In 1943, McCulloch and Pitts, often counted among the founding fathers of AI, developed the first mathematical model of a neuron. The model rests on an analogy between a nerve cell and an artificial neuron, in which the dendrites and the axon correspond to the inputs and the output of the neuron, the synapses correspond to its weights, and the activity in the soma corresponds to the threshold.

Building on the work of McCulloch and Pitts, Rosenblatt introduced the concept of the perceptron in 1958. It was a breakthrough among the early artificial neural network structures, which can monitor, learn, and operate by imitating human-like learning through example. The perceptron is an algorithm that gives neurons the ability to learn independently while processing information efficiently.

The Perceptron is built around numerical inputs together with weights and a bias. It multiplies each input by its weight, sums the products, and adds the bias. An activation function is then applied to this weighted sum to compute and deliver the final value.

The output y of the perceptron indicates whether the weighted sum of the inputs plus the bias exceeds a certain value. If the output is y = 1, the model predicts that the input belongs to class 1, while if the output is y = 0, it predicts that the input belongs to class 0.

Although the Perceptron represented real progress in the development of artificial neural networks, it had its drawbacks. Perceptrons could learn only linearly separable data, where one class of objects lies on one side of a plane and the other class on the opposite side, as shown in Figure 1, while data in the real world is usually not linearly separable. As a consequence, perceptrons struggled with many problems of practical importance to society.

For non-linearly separable problems, additional layers of neurons are placed after the input layer and before the output neuron, two of which also appear in the multilayer perceptron (MLP) architecture. These intermediary layers, known as hidden layers, contain the hidden nodes, in which information is transformed without direct access to the external environment. Following the same pattern of neuron dynamics, the hidden neurons process the information transmitted by the input nodes and send it to the output layer. The learning behaviour of an MLP is considerably more complex than that of a single perceptron. Despite the increased complexity, the learning process builds on the simple perceptron algorithm, and the MLP is therefore able to handle non-linearities more effectively. [3]

Figure 2. MLP showing input, hidden and output layers and nodes with feedforward links.

B. Training process of neural networks

To achieve a network that produces accurate outputs, it must first go through a training process. Like a human, an ANN learns from examples, so it is important to provide a large amount of data. There are three main approaches to training an ANN: supervised, unsupervised, and reinforcement.

Supervised training requires well-defined data with corresponding labels, and it is used to build networks capable of making predictions, image classification, market forecasting, and more. Linear regression and k-nearest neighbours are two algorithms that represent supervised learning. Linear regression models the relationship between one or more independent variables and a dependent variable by fitting a straight line to observed data.
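As a rough illustration of fitting a straight line to observed data, the sketch below computes an ordinary least-squares fit for a single independent variable in plain Python. The function name `fit_line` and the sample observations are assumptions made for this example; they do not come from the cited sources.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the products of x and y deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

With noisy data the same function returns the line that minimises the sum of squared vertical distances to the points, which is one common meaning of "fitting a straight line".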
k-Nearest Neighbours (kNN), by contrast, is a non-parametric method that is useful for both classification and regression tasks. It predicts the output of an instance based on the majority class of its k nearest neighbours in the feature space. [4] [5]
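The majority-vote idea behind kNN classification can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance and a small hand-made data set; the function name `knn_predict`, the points, and the labels are hypothetical and not taken from the cited sources.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair the Euclidean distance from each training point to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The predicted class is the most frequent label among the k nearest neighbours.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-class data: class "A" clusters near (1, 1), class "B" near (5, 5).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]
prediction_near_a = knn_predict(points, labels, (1.1, 1.0), k=3)
prediction_near_b = knn_predict(points, labels, (5.1, 5.0), k=3)
```

Because the method stores the training data and defers all work to prediction time, no parameters are fitted, which is what "non-parametric" refers to here.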

Figure 1. Linear vs. nonlinear separation.

Unsupervised training involves unlabelled data, from which the network learns by itself to find the class structure within the data. This approach is especially handy for tasks such as targeted marketing and anomaly detection. The most common algorithms for this type of learning are Hierarchical Clustering and t-distributed Stochastic Neighbour Embedding
(t-SNE). Hierarchical Clustering considers the level of similarity between the data points and creates a nested hierarchy of clusters that can be drawn as a dendrogram. t-SNE, unlike Principal Component Analysis (PCA), is a nonlinear dimensionality reduction technique that is well suited to visualising high-dimensional data while preserving the neighbourhood structure of the data in the lower-dimensional space. [5] [6]

Reinforcement learning is a process in which the network learns from its interactions with an environment and modifies its behaviour according to the rewards or punishments it receives. This allows the network to adapt to complicated tasks such as guiding robots, playing games, and making real-time decisions. The algorithms used in this area include Q-Learning and Policy Gradient methods. Q-learning is a model-free reinforcement learning algorithm: it needs no model of the environment and relies on estimating the value of carrying out a given action in a given state. Policy Gradient methods, by contrast, directly optimise the policy function, which determines how the agent will act, by following the gradient of the reward with respect to the policy parameters. [7]

C. Basic types of neural networks

In this section, we provide an overview of four commonly used types of neural network architectures: Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Radial Basis Function Networks (RBFN).

1. Feedforward Neural Networks (FNN):
Feedforward Neural Networks, or multilayer perceptrons, are the simplest type of neural network. An FNN consists of interconnected layers of neurons in which information flows in one direction only, starting from the input layer, through one or more hidden layers, and on to the output layer. FNNs perform well in regression and classification across many fields, owing to their capacity to capture complex relationships between input and output data. [8]

2. Convolutional Neural Networks (CNN):
Convolutional Neural Networks are well suited to image, audio, and video processing. A CNN uses convolutional layers that apply filters to the input data to learn different patterns such as edges, textures, or more complex structures. Owing to this ability to recognise patterns, CNNs revolutionised computer vision tasks, including image classification, object detection, and image segmentation. CNNs are widely used in healthcare, robotics, and more recently in self-driving cars. [9]

3. Recurrent Neural Networks (RNN):
Recurrent Neural Networks, unlike FNNs, contain feedback connections, which makes them well suited to sequential data, where the order of the information is important. RNNs have circular links that allow them to store a state in an internal memory, so they can deal with temporal dependencies and context. Variants of RNNs, such as LSTM (Long Short-Term Memory), have helped with challenges such as improving speech recognition and language translation. [10]

4. Radial Basis Function Networks (RBFN):
Radial Basis Function Networks are used mostly in numerical analysis. The hidden layer of an RBF network employs radial basis functions as activation functions. These networks are good at representing functional forms in multidimensional spaces and are very often used as the basis for regression and interpolation tasks. RBFNs have applications in different areas, including function approximation, time series prediction, and financial forecasting. [11]

III. APPLICATIONS IN VARIOUS SECTORS

A. In Computer Science

This section covers the use of artificial neural networks (ANNs) in the field of computer science, highlighting their roles in key areas such as image recognition, natural language processing (NLP), machine translation, automatic speech recognition, and language translation systems. We also go through the algorithms that power them, including the Backpropagation (BP) algorithm, the Recurrent Neural Network-Based Optimization Algorithm (RNN-OA), and advanced algorithms designed for optimising computer network routing. These discussions highlight the remarkable capacity of ANNs to convert complicated data into useful insights and choices, demonstrating their significance for a variety of computational challenges.

A1. Image Recognition
ANNs have recently taken on a crucial role in computer science, especially in the creation of image recognition systems. Owing to their capacity to capture and process large amounts of visual data with high precision, visual inspection can now be applied in multiple fields. The use of ANNs for image recognition within computer science is primarily based on the ability of the algorithms to detect, classify, and interpret images very quickly.

The creation of sophisticated image recognition systems is one of the main advances made by ANNs in computer science. These systems employ layered neural networks, such as Convolutional Neural Networks (CNN), that independently extract the image's features at various levels of abstraction. For instance, the first layers of the network may identify edges and colours, and as the network goes deeper it can start identifying more complex shapes and objects within the image.

The strength of ANNs is their capability to generalise from examples. This illustrates their advantage, since they can quantify characteristics that are not ordinarily perceived by a
mere human eye in applications such as imaging, where they can improve the quality of captured images, identify objects, and even detect patterns. For example, speed and accuracy are key in manufacturing, where automated visual inspection systems require high-speed operation and precision.

ANNs are one of the most important reasons why the field of computer graphics and vision has developed so fast. For example, they are used to enhance advanced video games and virtual reality environments. ANNs can interpret the surrounding scene at every instant and then render the graphical elements accordingly. Users experience the graphics as highly vivid, dynamic, and responsive to their actions.

Moreover, ANNs not only solve these technical problems but also help optimise existing computer systems to improve their performance. For example, they can speed up search over large image databases by displaying only the images that satisfy certain conditions, leaving a result set that is easy to handle manually.

In software development, ANNs can be used to build intelligent user interfaces in which visual inputs are processed and turned into usable information. This is especially significant where gesture recognition is required, in which the system interprets physical movements as commands.

As research progresses, ANNs in computer science can become stronger and more innovative by taking advantage of the fact that artificial systems can be designed to gather information from, and interact optimally with, the visual world. Such progress will not only raise computational power to the next level but also generate new ways of making machines intelligent and responsive, with a human-like perception of situations. [12] [13]

A2. Natural Language Processing
The journey of NLP started with the early theoretical models of pioneers such as Warren McCulloch, Walter Pitts, and Frank Rosenblatt. These early researchers laid the theoretical foundation for the computational recognition and classification of patterns in text data. Rosenblatt's invention of the 'Perceptron' in 1958 was revolutionary because it allowed neural networks to be trained to classify text patterns. It became indispensable for tasks such as text classification and sentiment analysis.

In 1961, a further development in the field was Frank Rosenblatt's proposal of the 'back-propagating error correction algorithm'. It underscored that, if accuracy in pattern recognition was to increase, the training of the neural network had to be done in a more sophisticated way. This early work opened the door to the use of Artificial Neural Networks for complex tasks in NLP, such as named entity recognition or sentiment analysis.

i. Applications of NLP
The field of NLP has evolved tremendously in the past decade, driven by advances in the ability of ANNs to process linguistic data.

The combination of ANNs with NLP has transformed how machines understand and generate human language, providing immense practical applicability across a whole range of domains. It has led to the design of advanced language understanding systems that can perform a wide range of natural language tasks. These systems use highly advanced algorithms and neural network architectures to process, scrutinise, and synthesise human language precisely and efficiently.

A prominent example is the Boltzmann machine, created by Terrence Sejnowski and Geoffrey Hinton, which introduced new learning mechanisms that can handle the complex linguistic patterns of the domain.

With symmetric connections inspired by physical phenomena such as spin glasses, the Boltzmann machine offered a mechanism for unsupervised learning that could be applied to recognising and replicating relationships between linguistic elements. This development was followed by the Back-Propagation algorithm of David Rumelhart, Geoffrey Hinton, and R. J. Williams, which made it possible for multi-layer perceptrons to solve complex linguistic problems. Through iteration over stimulus and response pairs, these networks could also handle the subtle distinctions posed by problems such as exclusive OR and the T/C problem.

The practical impact of these capabilities can be seen in several real-world applications of NLP. For instance, Sejnowski and Charles Rosenberg trained a network to pronounce English words correctly, showing some of the potential that neural networks have for speech recognition. This opens the promising prospect of using ANNs to develop automatic speech recognition systems for virtual assistants, home voice-controlled devices, and other applications.

Moreover, Teuvo Kohonen offered another insight with topographical networks for map reading, which provided other ways of analysing and interpreting linguistic data. These improvements have enabled intelligent chatbots, language translation systems, sentiment analysis tools, and named entity recognition systems, amongst others. In essence, practical applications of artificial neural networks in NLP have revolutionised how machines interact with human language, allowing them to make sense of and produce text in ways that would otherwise have been considered impossible. ANNs continue to grow and evolve, creating the potential to bring advances in language understanding and interaction to just about any industry.

ii. NN Architectures in NLP
The advancement of Natural Language Processing (NLP) systems, which allows machines to comprehend and interpret
human language more efficiently, is greatly dependent on the development of neural network architectures. These architectures differ from conventional computers in three categories (time, manner, and place) in order to meet the unusual computational needs of neural networks.

In terms of time, neural network architectures move from conventional serial processing to parallel processing. Neural net computers are capable of processing many pieces of information at once, whereas traditional computers can only process one piece at a time. This capability enables interactions and corrections where "future affects the past." It is necessary for handling the simultaneous information changes that come with language processing activities.

In terms of manner, computation in neural network architectures distinguishes analogue from digital (or binary) processing. The most significant difference is that, while conventional computers produce binary answers (true or false), neural net machines take in inputs and give outputs along a continuous range, allowing for the full range of variation in linguistic stimuli. Analogue processing thus captures the subtleties and nuances of human language, making learning natural and flexible.

In terms of place, neural network architectures rely on distributed connectivity for information processing. Unlike traditional computers, where information is isolated at unique addresses, neural net machines distribute information across many addresses, wholly or partly. This distributed architecture allows complex patterns in linguistic data to be represented, improving the machine's ability to learn to understand and interpret language.

These architectural considerations follow three important characteristics set out by Hopfield for neural network computers: large connectivity, analogue response, and reciprocal or re-entrant connections. These characteristics give rise to computations qualitatively different from those performed by Boolean logic.

In practical applications, several neural network architectures have addressed a wide range of natural language problems, from low-level phonology to high-level syntax. For example, Rumelhart conducted experiments on predicting English verb morphology, and Sejnowski and Rosenberg developed a model of phonology. Similarly, companies such as Nestor Corporation have manufactured tablets that recognise handwritten input, while Neural Tech has introduced products that can recognise teaching and learning input in more than one natural language.

While these results indicate what neural network architectures may promise for NLP, it has to be understood that they remain experimental and years away from wide usage. However, this ongoing experimentation and innovation within neural network research signals a significant advancement in the understanding of natural language and interaction that may revolutionise teaching, translation, and communication. [14]

A3. Potential of RNN-OA

The introduction of recurrent neural networks in recent years has had a transformative impact on computer science, improving computational efficiency and problem-solving within this dynamic field. RNNs are particularly effective in processing sequential data, providing substantial improvements in areas such as optimization, recommendation systems, image processing, and natural language processing (NLP).

The success of RNNs in enhancing algorithmic efficiency and model performance on sequential data requires careful consideration of several factors. In particular, the architecture of the model, the tuning of hyperparameters, and the quality of training data are extremely important for achieving optimal results.

A significant advancement in RNN-based optimization is the development of the Recurrent Neural Network-Based Optimization Algorithm (RNN-OA). This method enhances the capabilities of processing algorithms through the application of attention mechanisms, regularisation techniques, and improvements in interpretability. By processing input selectively and maintaining model stability, RNN-OA boosts the efficiency of algorithms and their adaptability to various problem-solving scenarios.

This algorithmic approach also benefits the overall computational process by incorporating fine-tuning, transfer learning, and other techniques that reduce the computational load and expedite algorithm development. The efficiency, scalability, and robustness of RNN-OA have been tested, showing that it offers significant benefits and has potential for further improvements.

In practical terms, RNN-OA is applicable to a broad range of computer science functions, including voice recognition, machine translation, and time series forecasting. The evaluation of its efficiency and scalability involves the use of specially developed frameworks and mathematical models. These models take into account dynamic learning, model stability, adaptability to various data sources, and sensitivity to input fluctuations.

The integration of RNNs into computer science marks a significant leap forward, and innovative approaches like RNN-OA have considerable potential to expand their application further. Continuous improvements in RNN-based methods are setting high expectations for computational advancements that promise extensive benefits for academia and industry. [15]

A4. The Back-Propagation Algorithm in Computer Science and Applications
Back-Propagation (BP) is an algorithm used to train artificial neural networks through a method of error correction based on previously computed errors. From a computer science perspective, BP modifies the weights of a multilayer network at each iteration according to the error computed in the previous iteration. The role of BP in computer science applications is mainly to minimise the error rate when predicting outputs, and it is widely used in solving complex problems such as large-scale image recognition, autonomous vehicles, and natural language processing.

i. How BP works
The BP algorithm consists of two main phases: the forward pass and the backward pass. During the forward pass, the input data passes through the network one layer after another, from the input layer to the output layer, using initialised weights held in matrix and vector form. The resulting values of each layer are passed to the next layer, and the output layer produces a predicted output. The next step is the backward pass, which follows directly from the forward pass. The error is computed by comparing the predicted output against the actual output, and this error is propagated backward through the network. The weights are then adjusted so as to minimise the error value. This requires the rate of change of the error with respect to each weight, which is calculated using partial derivatives.

Figure 3. Back-Propagation algorithm workflow for neural network training.

ii. Key Formulas in BP
The main computation in BP involves adjusting the weights, which is done using the gradient descent optimization algorithm. The error term for each neuron is calculated during the backward pass, starting from the output layer and moving backward through the network. This calculation for each neuron depends on its role (output layer vs. hidden layer):

Output Layer Neurons: For each output neuron, the error term (δ) is calculated by:

\[ \delta_j = (y_j - \hat{y}_j)\, f'(z_j) \]

where \(y_j\) is the actual output for neuron \(j\), \(\hat{y}_j\) is the predicted output for neuron \(j\), and \(f'(z_j)\) is the derivative of the activation function applied at the output of neuron \(j\), evaluated at \(z_j\).

Hidden Layer Neurons: For neurons in the hidden layers, the error is propagated back from the output layer, and the error term is calculated using:

\[ \delta_j = \Big( \sum_k w_{jk}\, \delta_k \Big)\, f'(z_j) \]

where \(w_{jk}\) are the weights connecting neuron \(j\) in a hidden layer to neuron \(k\) in the subsequent layer, \(\delta_k\) is the error term for neuron \(k\) in the layer above, and \(f'(z_j)\) is the derivative of the activation function at the output of neuron \(j\), evaluated at \(z_j\).

Weight Update Rule: The weights are updated by moving against the gradient of the error function, which is computed for each weight as follows:

\[ w_{ij} \leftarrow w_{ij} + \eta\, \delta_j\, o_i \]

where \(\eta\) is the learning rate, \(\delta_j\) is the error term for neuron \(j\), computed as shown above, and \(o_i\) is the output of the previous layer's neuron \(i\), which is connected to neuron \(j\) by the weight \(w_{ij}\).

These formulas are essential because they direct the iterative weight modifications that enable neural networks to learn from their mistakes and gradually increase their accuracy.

iii. Applications of Back-Propagation Neural Networks in Computer Science

a. Image and Speech Recognition:
The most promising application of BP networks in computer science is image and speech recognition. Since BP networks can handle large amounts of data and patterns, they can be used for image recognition and interpretation technologies and for spoken word recognition. BP networks are often used for classifying images into categories, recognising people's faces, or interpreting scenes. In spoken language recognition, the network helps develop service assistants or real-time translators that learn from a large number of phonemes and intonations with high accuracy.

b. Natural Language Processing
BP neural networks are essential for processing natural language. They assist computers in comprehending and interpreting human language so that when the computers produce human language, it is meaningful and appropriate for the context. Sentiment analysis, in which networks examine
text from social media or reviews to assess the sentiment expressed, is one application. Furthermore, BP networks facilitate machine translation by enabling translation without the need for rule-based programming, as they leverage extensive datasets from pre-existing translations.

c. Game Development and Strategy Planning
BP networks are used to build stronger, more adaptable AIs in the gaming industry. Trained on large databases of recorded play, a network can predict human behaviour and offer a suitable challenge or counter-move without the AI “cheating”. BP networks also assist in real-time decisions within the game, helping optimise engine performance via predictive modelling.

d. Autonomous Vehicles
Autonomous vehicles are one of the most critical areas where BP neural networks have led to significant technological advancement. BP neural networks take input from sensors and cameras fitted on a vehicle and make real-time decisions about how to navigate and, at a higher level, recognize road signs and avoid obstacles such as other vehicles, facilitating safe driving. The ability of BP networks to learn from different circumstances and scenarios underpins the innovation and improvement of autonomous driving.

e. Robotics
Another application in which BP neural networks have expanded the frontier of computational science is robotics, particularly for robots performing complex tasks in continuously changing environments. Assembly, hazard sensing, space exploration and even surgery are examples of such tasks: the robots interact with their surroundings in real time and use a neural network model to gain experience and execute tasks with improved precision and efficiency.

The above scenarios are just a few examples that indicate how BP algorithms can be used to model intricate patterns and make informed decisions. Given their history of innovation and the efficiency of their models, BP models are set to remain a cornerstone of modern computational sciences. [16]

A5. Computer Network Routing Optimization Algorithm
The development of Internet technologies has not only changed the way we live, but also led to the emergence of a particularly urgent problem: the need for high-quality network infrastructure. Due to the steady increase in network requirements, one of the most pressing concerns is the optimization of the network routing process. Traditional technologies rarely cope with network complexities, which forces researchers to seek new approaches such as Artificial Neural Networks that optimise routing.

i. The Challenge of Network Routing
Present-day computer networks are complex systems in which packets pass through several nodes before reaching their final destination. As network traffic grows, high-quality routing becomes a necessity, because otherwise congestion and data loss can occur. At the same time, the difficulty of the routing problem, which depends on factors such as network topology and load, suggests that a dynamic solution will be optimal.

ii. ANN-Based Optimization Approach
This section describes a method that uses Artificial Neural Networks to optimise network routing. It relies on the ANN learning algorithm, which in turn operates on the principle of artificially reproducing the brain’s structural units. Such networks can simulate the joint work of many neurons and choose between several options based on historical data.

iii. Methodology
The designed ANN model has a layered network structure in which each node represents a decision point during the routing of data packets along the node paths. The network is trained on samples of network conditions and the corresponding routing selections, which enables it to discover the most effective routing patterns. Figure 4 portrays the neural network’s architecture, which consists of input, hidden, and output layers. The neural network implements the decision-making process based on these layers.

Figure 4. Architecture of the Neural Network for Routing Optimization

The ANN employs a probabilistic model to dynamically form network connections, as described by the following equation:

where Π(i) is the probability that a new node i will connect to an existing node, k_i is the degree of node i, and k_j represents the degree of node j. This formula helps the ANN
in predicting the most efficient pathways by optimising the network topology based on the likelihood of node connections. Additionally, the system delay model used to minimise latency and optimise routing is given by:

where T_Ci is the total system delay, T_ti represents the transmission delay, and t_bi represents the delay experienced due to inadequate bandwidth availability, signifying the waiting time for data transmission. P_ci,j symbolises the delay induced by queuing at the Mobile Edge Computing (MEC) infrastructure, and t_i signifies the delay attributed to task execution by the MEC server. These components are crucial for evaluating the efficiency of different routing paths and are integral to the ANN’s decision-making process.

iv. Simulation Results
A comparison with traditional network routing demonstrates the much higher efficacy of the proposed model. According to the results, the ANN model reduces packet loss and delay to nearly zero and does not require human intervention, which can significantly increase routing effectiveness.

v. Applications and Future Work
The ANN-based routing optimization model has broad prospects: it can be applied to both small corporate networks and international Internet backbones. Its scalability allows the algorithm to be used in modern dynamic telecommunications. Future work will involve reducing the dependency on human-provided data and speeding up the network's response to changing conditions. [17]

B. In Security

The adaptability and efficacy of ANNs have rendered them indispensable in safeguarding critical systems, combating fraudulent activities, and enhancing security measures across diverse domains.
This section delves into the multifaceted applications of neural networks in security, focusing on three pivotal areas: fraud detection, anomaly detection in cybersecurity, and facial recognition for security purposes. The integration of neural networks in these realms not only augments traditional security measures but also empowers organisations to proactively mitigate risks and fortify their defences against evolving threats.

Neural networks have much to offer in the security field. Focusing on the input-output relationship and on a deep learning process inspired by the human brain, they acquire knowledge by learning and store it in the connection strengths between neurons, known as synaptic weights. Unlike traditional fit-for-purpose linear models, neural networks can capture both non-linear and linear correlations without using intermediate variables to model reality. This capability has proven vital in cybersecurity, where risks develop at great speed and display nonlinear characteristics. By using networks that simulate biological systems, security infrastructures can adapt to new threats, recognising patterns and anomalies in real time to strengthen defences and hedge against risks that might occur. [18]

B1. Fraud Detection
Financial fraud remains an enduring threat in the financial sector, with serious effects on individuals, institutions, and economies. Deep neural networks, known for autonomously learning complex patterns and representations from raw data, can be very effective in addressing this issue. Their performance in fraud detection reflects their aptitude for detecting sophisticated patterns in large data sets. Neural networks built on transactional data, user behaviour, and historical patterns are capable of spotting anomalous activities that could indicate fraudulent behaviour. They can also adapt to fresh cases of fraud and learn from these new instances. [18] As a result, the anti-fraud mechanism constantly improves its detection algorithms, and the efficacy of fraud prevention technologies increases. This also eliminates the need for labour-intensive manual feature engineering, which can be very time-consuming and domain specific. In addition, deep learning approaches are good at handling multidimensional data and finding complex, hidden relationships, which lets the system identify the subtle and covert signs that characterise fraudulent behaviour. Different deep learning architectures, such as convolutional neural networks (CNN) and graph neural networks (GNN), have recently been used to detect financial fraud. These models are applied in different areas of finance, such as detecting credit card fraud, insurance fraud, and money laundering. Most notably, deep learning models have consistently outperformed classic approaches, with a reported success rate of around 99%. [19] [20]

i. CNN in Fraud detection
The Convolutional Neural Network (CNN) is a popular deep learning algorithm that shows good results in finding hidden features of dubious transactions and helps to avoid overfitting of the model. A CNN has three main types of layer: the convolution layer, the pooling layer, and the fully connected layer. Normally, the role of the convolution and pooling layers is to perform feature extraction. The fully connected layer maps the extracted features into the final output, such as a classification. [19] [21]
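
As a rough illustration of the three layer types named above (convolution, pooling, fully connected), the following Python sketch passes a short sequence of transaction features through one layer of each kind. It is a minimal sketch under stated assumptions, not the architecture used in [19] or [21]: the function names, the feature values, the kernel, and the weights are all hypothetical and untrained, and a 1-D convolution over a feature sequence is assumed.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as in most CNN libraries)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense_sigmoid(features, weights, bias):
    """Single fully connected unit with a sigmoid output in (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fraud_score(transaction_features, kernel, weights, bias):
    """Convolution and pooling extract features; the dense unit classifies."""
    feature_map = relu(conv1d(transaction_features, kernel))
    pooled = max_pool1d(feature_map)
    return dense_sigmoid(pooled, weights, bias)


# Hypothetical, untrained parameters chosen only to exercise the pipeline.
features = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
kernel = [1.0, -1.0]      # responds to drops between neighbouring features
weights = [0.5, 0.5]      # one weight per pooled feature
bias = -0.2

score = fraud_score(features, kernel, weights, bias)
print(f"fraud score: {score:.3f}")
```

In a trained model the kernel, weights, and bias would be learned by backpropagation from labelled transactions; here they are fixed by hand so that the data flow through the three stages is easy to follow.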
Figure 5. Overall Network Structure

The network structure is designed so that the analytical tool can be applied to network transaction data and criminal financial activities can be identified in a short time. In essence, it has an input feature sequencing layer, a group of four convolutional layers interlaced with pooling layers, and a fully connected layer (Fig. 5). The first is the feature sequencing layer, which processes the input features according to their order. Different orderings of the input features produce different effects on the model when they are convolved. The filtering function of the convolutional layer is to detect local features of the input data; in this context, it computes new features derived from the input features. These new features have no physical definition but are certainly useful for modelling the data. Pooling combines the features from adjacent areas into a single higher-level feature, which is more efficient and uses less data. The final layer, which is fully connected, is responsible for classification of stocks. The number of nodes in each layer of a neural network varies from one input to another. The trained network model obtains its optimised parameters from the training data. The optimised parameters can then be applied directly to the detection of real trading data in real time. [22]

ii. GNNs applied for financial fraud detection
Graph neural networks (GNN) are attracting a growing pool of users as their utility for learning on graphs is discovered. The structure of a graph naturally supports problem-solving and the modelling of complex relationships between nodes through message passing and aggregation. [23]
The graph used in a financial fraud detection scenario is usually made up of nodes that represent accounts and edges that represent transactions. Every node is a financial account, for example a bank account, a credit card account, or any financial institution involved in a transaction. Nodes can carry attributes such as account type, transaction history, current balance, account owner information, and other data relevant to fraud detection. Each edge corresponds to a financial exchange between two accounts. The edge label gives the amount transferred from account A to account B. Edges may also carry weighted attributes representing the transferred amounts or transaction annotations (e.g. the transaction mechanism used on a given occasion or the transferred sums).

Graph neural networks achieve this by using message-passing procedures that propagate information across the network edges, processing information in a way that captures the graph topology and the relationships between nodes. A GNN assigns fraud scores to nodes or transactions by performing graph embedding operations on the financial transaction graph and learning its features. These suspicion scores indicate how exposed the corresponding accounts and transactions are to fraud. The GNN is used to estimate fraud scores, and a threshold separating ordinary from suspicious transactions is applied. Transactions with fraud scores above the threshold are flagged and investigated more deeply. The threshold can be computed from the data distribution to balance false positives against false negatives, while taking domain knowledge into account. Cooperation between automated detection by the GNN and the expertise of professional human analysts gives a financial institution the means to tackle financial fraud efficiently, effectively, and proactively.

B2. Anomaly Detection in Cybersecurity
The cybersecurity domain is now challenged by non-trivial attacks of increasing sophistication. This is why research into defence mechanisms is booming. Traditional detection systems that work only with known attack templates are not effective enough against new threats or changing attack strategies, which has prompted a search for more dynamic and smarter solutions. Machine learning techniques, including neural networks, are a promising option for strengthening intrusion detection systems, since they can learn and react to new threats in real time. By applying data science and analytics, cybersecurity experts can extract more useful information from vast data sets, making defence mechanisms more effective as cybersecurity threats continue to evolve. Neural networks provide an alternative solution through their ability to detect the smallest deviations from well-established norms. Shifting from reactive to proactive detection, neural networks automatically process historical information and datasets containing malicious behaviour patterns, and are thus more capable of identifying and mitigating cyber threats in real time.

i. RNN in cybersecurity
A recurrent neural network (RNN) is a type of neural network whose nodes contain loops, forming a directed graph. This structure allows it to recognise dynamic behaviour that unfolds over a sequence. Its internal memory holds the sequence of activations as they are processed, so information can flow both backward and forward through feedback loops in the network. Gradients are more complicated to deal with when training RNNs, however. Nevertheless, the progress attained in architecture and training to date has yielded different
RNNs. These models are somewhat easier to train. LSTM (long short-term memory), an improved form of RNN, was proposed in 1997 by Hochreiter and Schmidhuber. LSTM marked the first step of a revolution in speech recognition and outperformed some traditional models in niche applications. It overcomes the main drawback of RNNs, their short-term memory. An LSTM has several neurons connected to the previous time step. The configuration of units responsible for accumulating information is called a memory cell [24] [25]. In "Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security" [26], the authors propose the algorithms shown below. The training procedure is given as Algorithm 1 and the detection procedure as Algorithm 2.

Algorithm 1: Training Neural Network
-----------------------------------------------------------
Input: Features X extracted from the training
dataset with labelled information

Initialization:
1. for channel = 1 to N do
2.     Train LSTM-RNN model
3.     Save the LSTM-RNN model as a classifier c
4. end for

Return: c

Algorithm 2: Attack Detection
-----------------------------------------------------------
Input: Feature X extracted from test dataset with
labelled information

Initialization:
1. for channel = 1 to N do
2.     Load LSTM-RNN model as a classifier
3.     Get the result vector R of the classifier
4. end for

Vote to get the majority element v:
1. for r in R do
2.     Vote to get the majority element v
3. end for

Return: v

Algorithm 1 presents the process for training a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) model. It takes as input the features X extracted from the labelled training dataset. The algorithm loops over all the channels in the dataset, trains an LSTM-RNN model for each one, and saves the trained model as a classifier. It then returns the classifier c, which can make predictions. Algorithm 2 describes the detection scheme, which uses the LSTM-RNN classifier trained by Algorithm 1. It takes as input the features X extracted from the test dataset, including the labelled data. The algorithm begins by loading the LSTM-RNN model as the classifier for each channel. It then obtains the vector R of results produced by the classifier on the test dataset. Next, it goes through all elements of R and applies a voting method to determine the majority element v. It finishes by returning v as the result of the attack detection process. [26]

B3. Facial Recognition for Security Purposes
Facial recognition is a critical function of video surveillance systems: it makes it possible to determine whether an image shows a particular person in a scene, which is mostly monitored through a network of cameras. It is widely used in border security, access control systems, monitoring and law enforcement. This helps in addressing security issues, while privacy and accuracy must remain a top priority. The growing scientific interest in recognising faces in photos is due both to its applications and to the challenge it presents to artificial vision algorithms. Specialists have to deal with the extremely high diversity of facial features, as well as the many different parameters of the image (angle, lighting, hairstyle, facial expression, background, etc.). Currently, the most widely recognized face recognition methods utilise Convolutional Neural Networks. The following describes the architecture of a Deep Learning model that improves on the existing best programs in terms of accuracy and processing time.

i. CNN in Facial Recognition
The network is composed of two convolutional layers, followed by a fully connected layer and finally a classification layer. Every convolution layer is followed by an activation layer and a pooling operation. Two regularisation techniques are also added after each convolution layer: batch normalisation and dropout. The fully connected layer is likewise followed by dropout, which reduces overfitting and improves the performance of the proposed neural network model. [27]

For image processing or any prediction task involving images, a convolutional neural network is usually the first choice. A standard convolutional neural network consists of a number of simple layers, which may be repeated n times depending on the task [28] [29]. The first layer is a convolutional layer with filters that are applied to the pixels of the image.

Usually, the image is larger than the filter applied to it. The filter moves across the image horizontally and vertically, one step at a time, and the values of the convolutional layer are calculated as dot products. The results of the convolutional layer are then passed to the next layer, called the pooling layer. Through this process, the values taken from the previous layer are the features we have extracted to
better describe the image. The pooling layer works in the same way, using a pooling filter that scans the output of the previous layer. Depending on the task, a convolutional layer followed by pooling layers is applied repeatedly to produce the desired output. After the last pooling stage, the output is flattened into a single dimension. This flattened output goes to the next layer, which is fully connected, where the prediction is made; the predicted output appears at the last layer. In the present study, an exhaustive search of the image data produces around 68 key points, which are the main asset of the study. The structure of the overall CNN model is shown in Fig. 6. The image is first pre-processed for the proposed CNN architecture, which was not done in the previous stage [30] [31]. The RGB input image, with colour values in [0, 255], is converted to grayscale and scaled to [0, 1]. To keep the input consistent, the grayscale data is resampled to the standard resolution of 224 × 224 pixels [32] [33] [34]. After these formatting steps, the convolution model accepts the image. Key point extraction is then performed with the CNN architecture shown in Fig. 6.

Figure 6. CNN architecture for Facial Key point Prediction

B4. Challenges
Despite its effectiveness, applying neural networks to security is fraught with challenges. One prominent difficulty is the design of the network architecture. Researchers have observed that the number of layers in the model can negatively affect accuracy. [20] This shows how strongly the model architecture affects accuracy; hence, an appropriate architecture and tuning are required. It is also critical for organisations that are prone to financial abuse to keep up with the latest algorithms and solutions for neural networks. [35] The constantly changing nature of fraud schemes will continue to pose a challenge for financial institutions, since criminals are always devising new means to carry out their scams. In other words, although neural networks offer very attractive tools for fraud detection, anomaly detection, and similar tasks, adopting them requires an in-depth understanding of their capabilities, limitations, and latest developments to make them an effective weapon against crime.

B5. Conclusion
As we navigate an increasingly interconnected and digitised world, the integration of neural networks in security systems promises to fortify defences, thwart malicious activities, and safeguard critical assets. Through an exploration of their applications in fraud detection, anomaly detection in cybersecurity, and facial recognition for security purposes, this section illuminates the transformative potential of neural networks in shaping the future of security paradigms.

C. In Health Care
In recent years, technological advancements in health systems, and especially the integration of neural networks in healthcare, have revolutionized the world of medicine.
In this section, we focus on the influence neural networks have had in healthcare, emphasizing the neural network architectures commonly used in medicine, discussing the diverse range of their applications across medical fields, and analysing the challenges of applying deep learning in healthcare.

C1. Architectures
This section describes the various neural network architectures adapted for healthcare applications. While Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) are extensively used in healthcare, this section will focus on Autoencoders (AE), Restricted Boltzmann Machines (RBM) and Long Short-Term Memory (LSTM).

i. Autoencoders (AE)
Autoencoders are deep learning models that illustrate the idea of unsupervised representation learning. They were initially introduced as a tool for pre-training supervised deep learning models when labeled data was scarce. They have since remained useful for unsupervised procedures such as phenotype discovery [36]. An autoencoder is divided into two main parts, the encoder and the decoder. The encoder includes the input layer, while the decoder includes the output layer [37]. The input and output layers have the same number of nodes, and the number of hidden units is smaller than that of the input or output layers, which is the whole purpose of an AE. Autoencoders are designed to encode the input data into a lower dimensional space [38]. Training an AE on a dataset transforms the input data into a format that stores only the most important derived dimensions. In this way, autoencoders resemble standard dimensionality reduction techniques such as singular value decomposition (SVD) and principal component analysis (PCA). However, autoencoders have an important advantage for complicated problems because each hidden layer’s activation function applies a nonlinear transformation. A single hidden layer could nevertheless be insufficient to represent all the data if the input is of high dimensionality.

Additionally, autoencoders stacked on top of each other form a Deep Autoencoder (DAE) architecture.
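
The encoder and decoder structure of an autoencoder can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions rather than any model from the cited works: it assumes a single hidden layer with a tanh activation, a linear decoder, no bias terms, and plain stochastic gradient descent on the squared reconstruction error. The function names, the toy dataset, and the hyperparameters are all hypothetical.

```python
import math
import random


def encode(w_enc, x):
    """Compress input x into a lower-dimensional code (tanh hidden layer)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_enc]


def decode(w_dec, h):
    """Reconstruct the input from the code with a linear output layer."""
    return [sum(w * hj for w, hj in zip(row, h)) for row in w_dec]


def train_autoencoder(data, hidden_size=2, epochs=200, lr=0.1, seed=0):
    """Train by stochastic gradient descent on the reconstruction error."""
    rng = random.Random(seed)
    n = len(data[0])
    # Encoder: hidden_size x n weights. Decoder: n x hidden_size weights.
    w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n)]
             for _ in range(hidden_size)]
    w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
             for _ in range(n)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            h = encode(w_enc, x)
            y = decode(w_dec, h)
            err = [yi - xi for yi, xi in zip(y, x)]
            total += sum(e * e for e in err) / n
            # Error signal at the hidden layer, using tanh'(z) = 1 - tanh(z)^2.
            # Computed before the decoder weights are updated.
            grad_h = [sum(err[i] * w_dec[i][j] for i in range(n))
                      * (1.0 - h[j] ** 2) for j in range(hidden_size)]
            for i in range(n):
                for j in range(hidden_size):
                    w_dec[i][j] -= lr * err[i] * h[j]
            for j in range(hidden_size):
                for k in range(n):
                    w_enc[j][k] -= lr * grad_h[j] * x[k]
        losses.append(total / len(data))
    return w_enc, w_dec, losses


# Toy 4-dimensional inputs that vary along only two underlying directions,
# so a 2-unit code is enough to capture most of the structure.
data = [[1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0],
        [0.0, 0.0, 0.0, 0.0]]

w_enc, w_dec, losses = train_autoencoder(data)
code = encode(w_enc, data[0])
print("first-epoch loss:", losses[0])
print("last-epoch loss:", losses[-1])
print("2-D code for the first sample:", code)
```

The hidden layer has fewer units than the input and output layers, which is what forces the network to keep only the most informative dimensions; replacing tanh with the identity would reduce the model to a linear projection comparable to PCA.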
Numerous variants of AE have been proposed to make the learned representations more robust and stable against small changes in the input pattern. One of these variants is the Sparse Autoencoder (SAE), which specializes in learning sparse representations of the input data. Sparse Autoencoders achieve sparsity by activating only a small subset of neurons during encoding, making the classes more separable. Vincent et al. [39] proposed another variant known as the denoising autoencoder. This method reconstructs the input from a version corrupted with noise, forcing the model to capture the structure of the input. A similar concept was introduced by Rifai et al. [40] in their proposal of contractive autoencoders. However, instead of corrupting the training set with noise, this variant adds an analytical contractive penalty to the error function. Lastly, in Convolutional Autoencoders (CAE) [41], the weights are shared among all locations in the input to maintain spatial locality and accurately process two-dimensional (2-D) patterns.

ii. Restricted Boltzmann Machine (RBM)
The Restricted Boltzmann Machine is another unsupervised deep learning architecture for learning input data representations. Its aim is similar to that of autoencoders, but RBMs take a stochastic perspective by estimating the probability distribution of the input data. Because of this, they are frequently considered generative models, aiming to model the underlying process responsible for generating the data. Training an RBM usually involves stochastic optimization methods, such as Gibbs sampling, which gradually adjust the weights to minimize the reconstruction error. In an RBM, the visible and hidden units form a bipartite graph, allowing for the implementation of more effective and thorough training algorithms. Restricted Boltzmann Machines serve as learning modules in two main deep learning configurations that have been proposed in the literature: the Deep Belief Network (DBN) and the Deep Boltzmann Machine (DBM).

a. Deep Belief Network (DBN)
A DBN can be viewed as a stack of RBMs. In this structure, each subnetwork’s hidden layer is connected to the visible layer of the succeeding RBM. In DBNs, the top two layers have undirected links, while the lower layers have directed links. Initially, a DBN goes through an efficient layer-by-layer greedy learning procedure. The result is later fine-tuned based on the expected outputs. [44] [51]

b. Deep Boltzmann Machines (DBM)
A DBM is a variant of Deep Neural Network (DNN) within the Boltzmann family. The main distinction from Deep Belief Networks (DBN) lies in the presence of undirected, conditionally independent links between all layers of the network. In a DBM, the posterior distribution given the visible units cannot be computed by directly maximising the likelihood, because it involves interactions among the hidden units.

Consequently, training a Deep Boltzmann Machine typically requires an algorithm based on stochastic maximum likelihood to improve the lower bound of the likelihood. Similarly to DBNs, DBMs utilize a greedy layer-wise training method during pretraining. The primary challenge they face is their inference time complexity, which is significantly higher than that of DBNs, making parameter optimization impractical for large training sets [44].

iii. Long Short-Term Memory (LSTM)
LSTM is a specialized recurrent neural network (RNN) architecture designed to model long-range dependencies and temporal sequences more accurately than conventional RNNs [41]. A typical LSTM network has an input layer, a recurrent LSTM layer, and an output layer, with the input layer directly connected to the LSTM layer. The recurrent connections within the LSTM layer extend directly from the cell output units to the cell input units, input gates, output gates, and forget gates [42]. These gates regulate the flow of information within the network. They control how much information is stored in or discarded from the memory cell at each time step, enabling the model to learn long-term dependencies more effectively. One of the main motivations behind LSTM’s design is to address the vanishing gradient problem encountered in traditional RNNs. By introducing the memory cell and gating mechanism, LSTM mitigates vanishing gradients, allowing it to carry errors forward over extended sequences without the gradients diminishing to zero.

C2. Applications
This section explores the applications of neural networks in healthcare, focusing on three important areas: Medical Imaging, Medical Informatics, and Disease Diagnosis Prediction.

i. Medical Imaging
In modern medicine, automatic medical imaging analysis holds significant importance, since diagnosis based on the interpretation of images can be extremely subjective. Computer-aided diagnosis (CAD) offers an objective assessment of the underlying disease processes. Modelling disease progression is common in various neurological conditions like Alzheimer's and multiple sclerosis. It requires a detailed examination of brain scans based on multimodal data and precise mapping of brain regions.

Recently, CNNs have been rapidly gaining traction within the medical imaging research community due to their outstanding performance in computer vision and their ability to be parallelized with Graphics Processing Units (GPUs). [52]

One of the biggest challenges in Computer-Aided Diagnosis is the inconsistency in the intensity and shape of tumors, as well as the differences in imaging protocols even within the same imaging modality. In many cases, the intensity of pathological tissue may overlap with that of healthy samples. Additionally, non-isotropic resolution, Rician noise, and bias field effects in magnetic resonance images (MRI) cannot be handled automatically using simpler machine learning
approaches. To tackle this complexity in the data, hand-designed features are extracted, and conventional machine learning methods are trained to classify them in an entirely separate step. [48]

Deep learning makes it possible to merge the extraction of relevant features with the classification procedure and to optimize them together. CNNs can learn a hierarchy of increasingly complex features, allowing them to operate directly on image patches centred on the abnormal tissue. Their versatility is displayed in various medical imaging applications, including the classification of interstitial lung diseases based on CT images, the detection of tuberculosis manifestations in X-ray images, and the identification of neural progenitor cells. These models can also be tailored for specific tasks, such as body-part recognition. Additionally, CNNs have been proposed for the segmentation of isointense brain tissues and brain extraction from multimodality MR images. [48]

While CNNs have dominated medical image analysis, other deep-learning techniques have also been implemented successfully. In a recent study, researchers proposed a stacked denoising autoencoder to diagnose malignant breast lesions in ultrasound images and pulmonary nodules in CT scans [53]. This approach surpassed traditional CAD methods, largely due to its automatic feature extraction and noise resilience. Moreover, it eliminated the need for image segmentation to acquire lesion boundaries. In another study, Shan et al. [54] introduced a stacked sparse autoencoder that detects microaneurysms in fundus images as part of a diabetic retinopathy strategy. This method learns distinctive features from pixel intensities alone, demonstrating how flexible autoencoder-based approaches are in medical image analysis.

In general, deep learning in medical imaging provides the automatic discovery of object features and the automatic exploration of the feature hierarchy. Along these lines, a simple training process and systematic performance tuning can be applied, allowing deep learning approaches to improve on the state of the art.

Figure 7. MRI scans, CT scans and X-rays

ii. Medical Informatics
Medical Informatics focuses on analysing large-scale data within the healthcare context, aiming to improve clinical decision support systems and simplify the assessment of medical data. Both aims serve quality assurance and better access to healthcare services. Electronic health records (EHRs) are a rich source of patient information, including medical history, allergies, test results, laboratory and diagnostic exams, images from radiology, medications, treatment plans, and diagnoses. Thorough extraction of this vast array of data could provide valuable insights into disease management [48].

Deep learning methods have been tailored to handle large and distributed datasets properly. The huge success of Deep Neural Networks (DNNs) lies in their ability to learn features and understand data representation in both supervised and unsupervised hierarchical modes. DNNs are also effective in processing multimodal information by simply integrating several components of their architecture. Consequently, it is not surprising that deep learning has rapidly been adopted in the area of medical informatics research.

Various applications demonstrate the adaptability of deep learning in medical informatics. For example, the authors of one system highlighted its capability to predict the probability of patients developing certain conditions such as schizophrenia, cancer, and diabetes. Additionally, Futoma et al. [55] compared the performance of different models in forecasting hospital readmissions based on an extensive EHR database. Despite the complexity involved in training DNN models, they have consistently outperformed conventional methods in terms of prediction precision.

To handle time dependency in EHR data, especially with multivariate time series obtained from intensive care monitoring systems, Lipton et al. [56] implemented a Long Short-Term Memory (LSTM) Recurrent Neural Network. RNNs are preferred for their ability to capture sequential events, thus improving the modelling of time delays between the onset of emergency clinical events and the manifestation of symptoms.

Deep learning offers extraordinary power and efficiency in gathering valuable insights from large-scale datasets, laying the foundations for personalized healthcare. However, appropriate initialization and tuning are important in preventing overfitting, especially because of the challenges caused by noisy and sparse datasets. Addressing these challenges remains a priority in advancing deep learning algorithms in medical informatics. [48]

iii. Disease Diagnosis Prediction
Despite the increasing integration of machine learning in healthcare, research focuses primarily on the nervous system, cancer, and heart diseases, given their significant impact on mortality and quality of life. However, there is a noteworthy increase in research concerning chronic and infectious diseases, such as type 2 diabetes and inflammatory bowel diseases. Advances in understanding clinical data and diseases through data-driven models have enabled the early diagnosis of several conditions, turning these models into diagnostic systems [43].

• Chronic Kidney Disease (CKD) is one of the most significant health challenges globally. Recent statistics indicate that over 10% of individuals in the general population worldwide are afflicted with CKD [46]. Research on detecting CKD with machine learning algorithms has improved both the procedure and the accuracy of its results. A hybrid model has
demonstrated an impressive 99% accuracy in predicting CKD [47].

• In eye diseases, neural networks are invaluable in diagnosing conditions like diabetic retinopathy, as seen in the IDx-DR system from IDx Technologies. These models use medical imaging data, particularly retinal images, for accurate diagnosis. Additionally, supervised algorithms such as the random forest algorithm are used for predicting myopia by drawing insights from electronic health records. This algorithm predicted the development of adult myopia in children up to eight years in advance, with an accuracy rate ranging from 85% to 99% [43] [48].

• For cardiac irregularities, cloud-based artificial neural network algorithms like Cardio DL are used for diagnosis. These algorithms use medical image data, in this case Magnetic Resonance Imaging (MRI) scans of heart ventricles. They have proved effective in studying the functioning of heart ventricles and blood flow, contributing insights comparable to those of radiologists [43].

• In fractures, a machine learning model known as OsteoDetect is used for detecting radius fractures located away from the joint. This model uses wrist image data, specifically X-rays, for detection. It has enhanced the efficiency of orthopedic clinicians in fracture diagnosis and management [43].

C3. Challenges
Applying deep learning in healthcare shows promising results. However, these applications also face many challenges, which are summarised in the following subsections.

i. Volume of Data
Deep learning models are often considered computationally intensive because of the large number of parameters they require. To train these models effectively, access to a wide range of clinical data is essential. However, due to confidentiality and ethical concerns, many researchers face challenges in obtaining medical records. Moreover, in underdeveloped countries, the lack of healthcare records and the insufficient training of healthcare workers further complicate the understanding of the relationships between diseases and symptoms. [38] [44]

ii. Temporality
Diseases evolve continuously and in a non-deterministic manner. However, many existing deep learning models depend on static vector-based sources of information, which cannot handle temporal aspects. Developing deep learning approaches capable of handling temporal healthcare data is imperative and will require the creation of innovative solutions [38] [44].

iii. Data Quality
Healthcare data differ in quality from the structured datasets found in computing and information security. Electronic healthcare data often suffer from issues related to dataset quality, and these issues frequently lead to a paucity of data and to inconsistencies in disease condition assessments. [38] [44]

iv. Privacy
One of the most crucial challenges in applying deep learning in healthcare is to understand whether neural network models are vulnerable to privacy or security threats. Artificial intelligence models and privacy-preserving data mining are subjects under extensive research. False positive classifications for patients could lead to unnecessary concern. Moreover, if poisoning attacks are detected, dataset clients may take appropriate actions, such as dismissing the results of the machine learning algorithm or attempting to identify and eliminate any malicious data from the dataset. [43] [44]

C4. Conclusion
Over the last decade, machine learning and pattern recognition have grown significantly. In this section, we explored the wide variety of neural network architectures that are commonly utilized in healthcare and their applications. These architectures have demonstrated great efficacy, but despite all the advancements, they encounter many challenges, such as managing large volumes of data and ensuring privacy, that require ongoing research efforts to overcome. However, the potential of neural networks to transform healthcare continues to be very promising, due to the constant innovation in these technologies.

IV. FUTURE DIRECTIONS

ANNs hold great promise for the future. They have the potential to evolve in numerous fields, improving our lives and achieving remarkable feats beyond our current imagination.

Pulsed or Spiking Neural Networks (SNNs) are considered to be the next generation of neural networks. SNN research started after data from neurobiological experiments made clear that biological neural networks communicate through pulses, using their timing to send information and perform calculations. SNNs model the spiking behaviour of neurons and how the electrical state of their membrane changes when influenced by external factors. They are a prime factor in the evolution of Computer Vision, where they are used for image classification, object detection, object tracking, object segmentation, and optical-flow estimation. SNNs are also employed in Robotic Control. They are used as a ‘brain’ for robots, which allows them to observe their surroundings and mimic the actions noted in their environment. For a robot to perform a task such as biologically inspired movement, the network can be customised and adjusted by hand. [57] [58]

Multi/Infinite Dimensional Neural Networks (MDNNs) are a new model of ANNs that generalises One-Dimensional Neural Networks (RNNs, CNNs, etc.). Their theory is still under development but is based on the generalisation of the gates from the one-dimensional logic to
the multidimensional logic. The MDNN architecture is described by a Tensor State Space Representation, which is used to compute the output of each neuron. MDNNs use a BP algorithm tailored to neural networks with complex values, which depends on complex signum and sigmoid functions. MDNNs have applications in the foundations of the three core concepts of cybernetics: (1) Development of the unified theory of control, (2) Communication, and (3) Coding. They also have applications in the field of binary filters and the complex hypercube, which is a foundation of complex-valued neural associative systems. [57]

Forecasting methods based on NNs have limitations but also room for future innovation. There is a growing interest in the research community in exploring the application of probabilistic forecasting to reduce the uncertainty of NN predictions. Furthermore, multivariate forecasting will be necessary for complex scenarios since products are becoming more diverse, emphasising the need to explore multiple seasonality models, especially for high-frequency big data contexts. Since RNNs were inefficient in modelling seasonality, researchers explored alternative approaches, such as combining CNN filters with customised attention algorithms. Temporal convolution networks (TCNs), an advanced type of CNN architecture, provide an efficient training process by combining convolutions with residual connections, resulting in improved efficiency for forecasting tasks. [59]

Future projects that employ ANNs aim to address challenges and advance capabilities in programmable network devices, such as hardware offloading, data plane virtualization, NN orchestration, incremental and online learning, as well as distributed and federated learning. [60]

To enhance the accuracy and efficiency of ANNs in the future, we can increase the number of hidden layers and vary the training and learning rules applied within them. ANN technology will advance over time, with most applications that use it becoming more advanced, while researchers invent new training methods and network architectures. [57]

V. CONCLUSION

ANNs are one of the greatest inventions to emerge from the combination of the Computer Science and Neuroscience fields. Thanks to their contributions, numerous fields, including Computer Science, Security, and Health Care, have benefited, resolving many challenges in the process. Their ability to learn from data and adapt to new information makes them capable of solving complex problems, which is beneficial for most fields. Although they offer solutions to many problems, challenges that still await resolution remain, due to the complexities inherent in their implementation. As technology and science advance, we acquire a new understanding of the human brain, which originally inspired ANNs. This leads to the creation of architectures and training methods for ANNs that are more efficient and that potentially solve problems encountered by previous models. [61]

REFERENCES

[1] R. Qamar, B. A. Zardari, “Artificial Neural Networks: An Overview”, ResearchGate, Mesopotamian Journal of Computer Science, Aug. 2023.
[2] P. J. Denning, “The Science of Computing: Neural Networks”, American Scientist, Sigma Xi, The Scientific Research Honor Society, Sep-Oct 1992.
[3] I. A. Basheer, M. Hajmeer, “Artificial neural networks: fundamentals, computing, design, and application”, sciencedirect.com, vol. 43 no. 1, Dec. 2000.
[4] Q. Liu, Y. Wu, “Supervised Learning”, researchgate.net, Jan. 2012.
[5] Y. Tishan, “Understanding the Difference Between Supervised and Unsupervised Learning Techniques”, Sep. 2023.
[6] B. M. Devassy, S. George, P. Nussbanm, “Unsupervised Clustering of Hyperspectral Paper Data Using t-SNE”, researchgate.net, Journal of Imaging 6(5):29, May 2020.
[7] K. Sivamayil, E. Rajaseker, B. Aljafari, S. Nikolovski, S. Vairavasundaram, I. Vairavasundaram, “A systematic study on Reinforcement Learning Based Applications”, mdpi.com, Feb. 2023.
[8] M. H. Sazli, “A brief review of feed-forward neural networks”, researchgate.net, May 2015.
[9] V. Srilakshmi, G. U. Kiran, M. Mounika, A. Sravanthi, N. V. K. Sravya, V. N. S. Akhil, M. Manasa, “Evolving Convolutional Neural Network with Meta-Heuristics for Transfer Learning in Computer Vision”, sciencedirect.com, 2023.
[10] Z. C. Lipton, “A Critical Review of Recurrent Neural Network for sequence Learning”, researchgate.net, Jun. 2015.
[11] C. S. K. Dash, A. K. Behera, S. Dehuri, S-B. Cho, “Radial basis function neural networks: a topical state-of-the-art survey”, 2016.
[12] C. Wang and L. Wang, “Artificial Neural Network and Its Application in Image Recognition”, Journal of Engineering Research and Reports, Volume 24, Issue 2, Feb. 2023.
[13] X. Li and X. Lv, “Research on Image Recognition Method of Convolutional Neural Network with Improved Computer Technology”, Journal of Physics: Conference Series 1744, 2021.
[14] F. L. Borchardt, “Neural Network Computing and Natural Language Processing”, CALICO Journal, Jun. 1988.
[15] R. G. Franklin, A. R. Doni, D. Poornima, S. I. S. Prabu, “The Use of Recurrent Neural Networks in the Optimization of Computer Science Algorithms”, IEEE International Conference on Emerging Research in Computational Science, 2023.
[16] Z. Yan, “Research and Application on BP Neural Network Algorithm”, IEEE International Industrial Informatics and Computer Engineering Conference, 2015.
[17] L. Liu, “Computer Network Routing Optimization Algorithm Based on Neural Network Mode”, IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Apr. 2023.
[18] A. K. Swain, S. K. Jayasingh, “Neural Network in Fraud Detection”, Conference Paper, Aug. 2011.
[19] M. L. Gambo, A. Zainal, M. N. Kassim, “A Convolutional Neural Network Model for Credit Card Fraud Detection”, Inter. Confer. on Data Science and Its Applications (ICoDSA), 2022.
[20] B. F. Murorunkwere, O. Tuyishimire, D. Haughton, J. Nzabanita, “Fraud Detection Using Neural Networks: A Case Study of Income Tax”, MDPI, May 2022.
[21] S. Yuan, X. Wu, J. Li, A. Lu, “Spectrum-based deep neural networks for fraud detection”, Jun. 2017.
[22] Z. Zhang, X. Zhou, X. Zhang, L. Wang, P. Wang, “A Model Based on Convolutional Neural Network for Online Transaction Fraud Detection”, Aug. 2018.
[23] M. Lu, Z. Han, Z. Zhang, Y. Zhao, Y. Shan, “Graph Neural Networks in Real-Time Fraud Detection with Lambda Architecture”, Oct. 2021.
[24] P. Podder, S. Bharati, M. Rubaiyat Hossain Mondal, P. Kumar Paul, U. Kose, “Artificial Neural Network for Cybersecurity: A Comprehensive Review”, 2020.
[25] T. A Tang, L. Mhamdi, D. McLernon, S. Ali Raza Zaidi, M. Ghogho, “Deep Recurrent Neural Network for Intrusion Detection in SDN-based Networks”, IEEE International Conference on Network Softwarization (NetSoft 2018) - Technical Sessions, 2018.
[26] F. Jiang, Y. Fu, B. B. Gupta, Y. Liang, S. Rho, F. Lou, F. Meng, Z. Tian, “Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security”, IEEE Transactions On Sustainable Computing, April-Jun. 2020.
[27] Y. Said, M. Barr, H. Eddine Ahmed, “Design of a Face Recognition System based on Convolutional Neural Network (CNN)”, Engineering, Technology & Applied Science Research, 2020.
[28] S. Kanithan, N. A. Vignesh, E. Karthikeyan, N. Kumareshan, “An intelligent energy efficient cooperative MIMO-AF multi-hop and relay based communications for Unmanned Aerial Vehicular networks”, Comput. Commun., 2020.
[29] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “On the effectiveness of deep transfer learning for Bangladeshi meat based curry image classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022.
[30] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “Fine Tuned convolutional neural networks for Bangladeshi vehicle classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022.
[31] S. T. Suganthi, M. U. A. Ayoobkhan, N. B., K. Venkatachalam, H. Štěpán, and T. Pavel, “Deep learning model for deep fake face recognition and detection”, PeerJ Computer Science, 2022.
[32] T. Guo, J. Dong, H. Li, Y. Gao, “Simple convolutional neural network on image classification”, IEEE 2nd International Conference on Big Data Analysis (ICBDA), IEEE, Mar. 2017.
[33] F. Sultana, A. Sufian, P. Dutta, “Advancements in image classification using convolutional neural network”, Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), IEEE, Nov. 2018.
[34] Y. Pei, Y. Huang, Q. Zou, X. Zhang, S. Wang, “Effects of image degradation and degradation removal to CNN-based image classification”, IEEE Trans. Pattern Anal. Mach. Intel., 2019.
[35] A. C. Navarrete, A. C. Gallegos, “Neural Network Algorithms for Fraud Detection: A Comparison of the Complementary Techniques in the Last Five Years”, 2021.
[36] M. Cabrera-Bean, V. J. Santos, A. R. Llorach, S F. Bertolin, J. Vidal, and C. Violan, “Autoencoders for health improvement by compressing the set of patient features”, IEEE Engineering in Medicine & Biology Society, Sep. 2018.
[37] B. Shickel, P. J. Tighe, A. Bihorac, and P. Rashidi, “Deep ehr: A survey of recent advances in deep learning techniques for electronic health record (ehr) analysis”, IEEE Journal of Biomedical and Health Informatics, Sep. 2018.
[38] I. Zion, S. Ozuomba, P. Asuquo, “An Overview of Neural Network Architectures for Healthcare”, IEEE International Conference in Mathematics, Computer Engineering and Computer Science, Apr. 2020.
[39] P. Vincent, H. Larochelle, Y. Bengio, P. A. Manzagol, “Extracting and Composing Robust Features with Denoising Autoencoders”, researchgate.net, University of Montreal, Jan. 2008.
[40] S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio, “Contractive Auto-Encoders: Explicit Invariance During Feature Extraction”, scholar.google.com, ICML, Jan. 2008.
[41] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modelling”, Cornell University, Sep. 2014.
[42] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition”, Cornell University, Sep. 2014.
[43] A. Pandit, A. Garg, “Artificial Neural Network in Healthcare: A Systematic Review”, IEEE International Conference on Cloud Computing, Data Science & Engineering, Mar. 2021.
[44] S. K. Pandey, R. R. Janghel, “Recent Deep Learning Techniques, Challenges and Its Applications for Medical Healthcare System: A Review”, Department of Information Technology, India, Jan. 2019.
[45] C. P. Kovesdy, “Epidemiology of Chronic Kidney Disease: an update 2022”, University of Tennessee Health Science Center, Memphis, USA, Apr. 2022.
[46] H. Khalid, A. Khan, M. Z. Khan, G. Mehmood, M. S. Qureshi, “Machine Learning Hybrid Model for the Prediction of Chronic Kidney Disease”, National Library of Medicine, Mar. 2023.
[47] S. M. Li, M. Y. Ren, J. Gan, S. G. Zhang, M. T. Kang, H. Li, D. A. Atchison, J. Rozema, A. Grzybowski, N. Wang, “Machine Learning to Determine Risk Factors for Myopia Progression in Primary School Children: The Anyang Childhood Eye Study”, National Library of Medicine, Apr. 2022.
[48] D. Ravi, C. Wong, F. Deligianni, M. Berthelot, J. A. Perez, B. Lo, G. Z. Yang, “Deep Learning for Health Informatics”, IEEE Journal of Biomedical and Health Informatics, Dec. 2016.
[49] F. Li, L. Tran, K. H. Thung, S. Ji, D. Shen, and J. Li, “A robust deep model for improved classification of ad/mci patients”, IEEE J. Biomed. Health Inform, Sep. 2015.
[50] D. Kuang and L. He, “Classification on ADHD with deep learning”, International Conference on Cloud Computing and Big Data, Nov. 2014.
[51] G. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets”, Neural computation, Aug. 2006.
[52] M. Havaei, N. Guizard, H. Larochelle, and P.-M. Jodoin, “Deep Learning Trends for Focal Brain Pathology Segmentation in MRI”, July 2016.
[53] J. Z. Cheng et al., “Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans”, https://www.nature.com/srep/, Apr. 2016.
[54] J. Shan and L. Li, “A Deep Learning Method for microaneurysm detection in fundus images”, IEEE International Conference on Connected Health, Jun. 2016.
[55] J. Futoma, J. Morris, and J. Lucas, “A comparison of models for predicting early hospital readmissions”, Journal of Biomedical Informatics, Aug. 2015.
[56] Z. C. Lipton, D. C. Kale, C. Elkan, and R. C. Wetzel, “Learning to diagnose with LSTM recurrent neural networks”, Cornell University, Nov. 2015.
[57] S. Lakra, T. V. Prasad, G. Ramakrishna, “The Future of Neural Networks”, ResearchGate, 6th National Conference - Computing For Nation Development, INDIA, Feb. 2012.
[58] K. Yamazaki, V. Vo-Ho, D. Bulsara, and N. Le, “Spiking Neural Networks and their Applications: A Review”, MDPI, Brain Sciences, Jun. 2022.
[59] H. Hewamalage, C. Bergmeir, and K. Bandara, “Recurrent Neural Networks for Time series Forecasting: Current status and future directions”, ScienceDirect, Faculty of Information Technology, Monash University, Melbourne, Australia, Jan-Mar 2021.
[60] F. D. Rossi, M. C. Luizelli, A. Lorenzon, and M. Caicedo, “In-Network Neural Networks: Challenges and Opportunities for Innovation”, ResearchGate, IEEE Network, Nov-Dec 2021.
[61] S. Agrawal, J. Agrawal, “Neural Network Techniques for Cancer Prediction: A Survey”, sciencedirect.com, 19th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, Dec. 2015.
