AI Unit 5


Introduction to Expert Systems, Architecture of Expert Systems; Expert System Shells; Knowledge
Acquisition; Case Studies: MYCIN, Learning, Rote Learning; Learning by Induction; Explanation based
learning.

INTRODUCTION TO EXPERT SYSTEMS


Expert systems are computer applications developed to solve complex problems in a particular domain,
at the level of extraordinary human intelligence and expertise. Expert systems (ES) are one of the prominent
research domains of AI. They were introduced by researchers at the Computer Science Department of
Stanford University.
CHARACTERISTICS OF EXPERT SYSTEMS
• High performance
• Understandable
• Reliable
• Highly responsive
EXPERT SYSTEMS ARCHITECTURE

COMPONENTS OF EXPERT SYSTEMS


The components of ES include −
• Knowledge Base
• Inference Engine
• User Interface

DMISJBU- Lilongwe Campus. Ms. Tawonga Mkandawire (BE(CS)) Artificial Intelligence &Expert Systems
KNOWLEDGE BASE
It contains domain-specific and high-quality knowledge. Knowledge is required to exhibit intelligence. The
success of any ES depends largely on the collection of highly accurate and precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data and facts organized around the task domain. Data,
information, and past experience combined together are termed knowledge.
Components of Knowledge Base
The knowledge base of an ES stores both factual and heuristic knowledge.
• Factual Knowledge − It is the information widely accepted by the Knowledge Engineers and
scholars in the task domain.
• Heuristic Knowledge − It is about practice, accurate judgement, ability of evaluation, and
guessing.
KNOWLEDGE REPRESENTATION
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the form of
IF-THEN-ELSE rules.
KNOWLEDGE ACQUISITION
The success of any expert system depends largely on the quality, completeness, and accuracy of the
information stored in the knowledge base.
The knowledge base is formed from the input of various experts, scholars, and the Knowledge Engineers.
The knowledge engineer is a person with the qualities of empathy, quick learning, and case analysing skills.
The knowledge engineer acquires information from the subject expert by recording, interviewing, and
observing the expert at work, and then categorizes and organizes the information in a meaningful way, in the
form of IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the
development of the ES.
INFERENCE ENGINE
Use of efficient procedures and rules by the Inference Engine is essential in deducing a correct, flawless
solution. In case of knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from
the knowledge base to arrive at a particular solution. In case of rule-based ES, it −
• Applies rules repeatedly to the facts, including facts obtained from earlier rule applications.
• Adds new knowledge into the knowledge base if required.
• Resolves rule conflicts when multiple rules are applicable to a particular case.
To recommend a solution, the Inference Engine uses the following strategies −
• Forward Chaining
• Backward Chaining

Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome.
It considers all the facts and rules, and sorts them before arriving at a solution.
This strategy is followed for working on conclusion, result, or effect. For example, prediction of share
market status as an effect of changes in interest rates.
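The forward-chaining loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a production inference engine: the rule format (a set of conditions plus one conclusion) and the fact names about interest rates are assumptions invented for the example.

```python
# Each rule is (set_of_conditions, conclusion); all names are illustrative only.
RULES = [
    ({"interest_rates_rise"}, "borrowing_costs_rise"),
    ({"borrowing_costs_rise"}, "company_profits_fall"),
    ({"company_profits_fall", "investors_cautious"}, "share_prices_fall"),
]

def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"interest_rates_rise", "investors_cautious"}, RULES)
print(sorted(derived))
```

Starting from the known facts, the engine keeps deriving consequences until nothing new can be added, which is the data-driven direction of reasoning.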
Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why did this happen?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions could
have held in the past to produce this result. This strategy is followed for finding out cause or reason. For
example, diagnosis of blood cancer in humans.
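Backward chaining can be sketched in the same style. The sketch below starts from a goal and works back to the facts that would support it. The rule table and the toy medical fact names are hypothetical and carry no clinical meaning.

```python
# Maps a goal to the alternative condition sets that would establish it.
# All names are invented for illustration.
RULES = {
    "infection": [{"fever", "high_white_cell_count"}],
    "high_white_cell_count": [{"lab_report_abnormal"}],
}

def backward_chain(goal, facts, rules, seen=None):
    """Return True if goal is a known fact or can be proved from the rules."""
    seen = set() if seen is None else seen
    if goal in facts:
        return True
    if goal in seen:            # guard against circular rules
        return False
    seen = seen | {goal}
    # The goal holds if every condition of at least one of its rules holds.
    return any(
        all(backward_chain(sub, facts, rules, seen) for sub in conditions)
        for conditions in rules.get(goal, [])
    )

print(backward_chain("infection", {"fever", "lab_report_abnormal"}, RULES))
```

The engine only examines rules that could lead to the goal, which is why this goal-driven strategy suits diagnosis.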
USER INTERFACE
The user interface provides interaction between the user of the ES and the ES itself. It generally uses natural
language processing so that it can be used by someone who is well-versed in the task domain. The user of the
ES need not be an expert in Artificial Intelligence.
It explains how the ES has arrived at a particular recommendation. The explanation may appear in the
following forms −

• Natural language displayed on screen.
• Verbal narrations in natural language.
• Listing of rule numbers displayed on the screen.
The user interface makes it easy to trace the credibility of the deductions.

Requirements of Efficient ES User Interface

• It should help users to accomplish their goals in the shortest possible way.
• It should be designed to work for the user’s existing or desired work practices.
• Its technology should be adaptable to the user’s requirements, not the other way around.
• It should make efficient use of user input.

APPLICATIONS OF EXPERT SYSTEM


Application − Description
• Design Domain − Camera lens design, automobile design.
• Medical Domain − Diagnosis systems to deduce cause of disease from observed data, conducting medical
operations on humans.
• Monitoring Systems − Comparing data continuously with observed system or with prescribed behaviour,
such as leakage monitoring in a long petroleum pipeline.
• Process Control Systems − Controlling a physical process based on monitoring.
• Knowledge Domain − Finding out faults in vehicles, computers.
• Finance/Commerce − Detection of possible fraud, suspicious transactions, stock market trading, airline
scheduling, cargo scheduling.

EXPERT SYSTEM TECHNOLOGY


There are several levels of ES technologies available. Expert systems technologies include −
1. Expert System Development Environment − The ES development environment includes
hardware and tools. They are −
• Workstations, minicomputers, mainframes.
• High level Symbolic Programming Languages such as LISt Processing (LISP) and
PROgrammation en LOGique (PROLOG).
• Large databases.
2. Tools − They reduce the effort and cost involved in developing an expert system to a large extent.
• Powerful editors and debugging tools with multi-windows.
• They provide rapid prototyping.
• They have inbuilt definitions of model, knowledge representation, and inference design.
3. Shells − A shell is an expert system without a knowledge base. A shell provides the
developers with knowledge acquisition, inference engine, user interface, and explanation facility.
For example, a few shells are given below −
• Java Expert System Shell (JESS), which provides a fully developed Java API for creating an
expert system.
• Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in
1993. It enables knowledge encoding in the form of IF-THEN rules.
DEVELOPMENT OF EXPERT SYSTEMS: GENERAL STEPS
The process of ES development is iterative. Steps in developing the ES include −
IDENTIFY PROBLEM DOMAIN
• The problem must be suitable for an expert system to solve it.

• Find the experts in task domain for the ES project.
• Establish cost-effectiveness of the system.
DESIGN THE SYSTEM
• Identify the ES Technology
• Know and establish the degree of integration with the other systems and databases.
• Realize how the concepts can represent the domain knowledge best.
DEVELOP THE PROTOTYPE
Form Knowledge Base: The knowledge engineer works to −
• Acquire domain knowledge from the expert.
• Represent it in the form of IF-THEN-ELSE rules.
TEST AND REFINE THE PROTOTYPE
• The knowledge engineer uses sample cases to test the prototype for any deficiencies in
performance.
• End users test the prototypes of the ES.
DEVELOP AND COMPLETE THE ES
• Test and ensure the interaction of the ES with all elements of its environment, including end
users, databases, and other information systems.
• Document the ES project well.
• Train the user to use ES.
MAINTAIN THE ES
• Keep the knowledge base up-to-date by regular review and update.
• Cater for new interfaces with other information systems, as those systems evolve.
BENEFITS OF EXPERT SYSTEMS
• Availability − They are easily available due to mass production of software.
• Less Production Cost − Production cost is reasonable. This makes them affordable.
• Speed − They offer great speed. They reduce the amount of work an individual puts in.
• Less Error Rate − Error rate is low as compared to human errors.
• Reducing Risk − They can work in environments that are dangerous to humans.
• Steady response − They work steadily without getting emotional, tense or fatigued.
EXAMPLES OF EXPERT SYSTEMS BASED ON CASE STUDIES: MYCIN
Historically, the MYCIN system played a major role in stimulating research interest in rule-based
expert systems.

MYCIN is an expert system for diagnosing and recommending treatment of serious bacterial
infections (such as bacteremia and meningitis). It was developed at Stanford University in
California, USA, in the 1970s, and has become a template for many similar rule-based systems.
It is intended to support clinicians in the early diagnosis and treatment of meningitis, which can
be fatal if not treated in time.
However, the laboratory tests for these conditions take several days to complete, so doctors
(and therefore MYCIN) have to make decisions with incomplete information. MYCIN
incorporated a calculus of uncertainty called certainty factors, which seemed (at that time) to
fit well with how doctors assessed the impact of evidence on the diagnosis.
This system was able to perform as well as some experts and considerably better than junior
doctors. A consultation with MYCIN begins with requests for routine information such as age,
medical history and so on, progressing to more specific questions as required.
MYCIN’s expertise lies in the domain of bacterial infections. MYCIN’s pool of knowledge
consists of approximately 500 antecedent-consequent rules, implemented in LISP, which give
MYCIN the ability to recognize about 100 causes of bacterial infections. Physicians usually
begin antibiotic treatment for patients who have bacterial infections without knowing exactly
which organism is the culprit.
There is no time to wait for definitive laboratory culture evidence, which accumulates too slowly.
For the desperately sick, therapy must begin at once, not 2 days later.
This requirement leaves two choices:
i. The physician can prescribe a broad-spectrum drug which covers all possibilities, or
ii. The physician can prescribe a better, disease-specific drug.
MYCIN helps the physician to prescribe disease-specific drugs. MYCIN informs itself about
particular cases by requesting information from the physician about a patient’s symptoms,
general condition, history, and laboratory-test results. At each point, the question MYCIN asks
is determined by MYCIN’s current hypothesis and the answers to all previous questions. Thus,
the questions start as though taken from a checklist, but they then vary as evidence builds.
• Here is an excerpt from a sample session:
1. What is the patient’s name?
Arun Sehgal
2. Male or Female?

Male
3. Age?
55
4. Have you obtained positive cultures indicating general type?
Yes.
5. What type of infection is it?
Primary bacteremia.
6. When did symptoms first appear?
May 5, 2006
7. Let’s call the most recent positive culture C1. From what site was C1 taken?
From the blood.
8. When?
May 9, 2006
9. Let’s call the first significant organism from this culture U1. Do you know the identity of U1?
No.
10. Is U1 a rod or a coccus or something else?
Rod.
11. What is the gram stain of U1?
Gram-negative.
12. Have there been positive cultures from sites which are not associated with the infections
about which you are seeking advice?
No.
13. Did U1 grow aerobically?
Yes.
When MYCIN is satisfied that it has a reasonably good grasp of the situation, MYCIN
announces its diagnosis:
My recommendation will be based on the opinion that the identity of U1 may be:
1. Pseudomonas-aeruginosa
2. Klebsiella-pneumoniae.
3. E. coli
4. Bacteroides-fragilis.
5. Enterobacter.
6. Proteus-nonmirabilis.

After asking a few more questions about the patient, MYCIN announces its
recommendation:
1. On a scale of 0 to 4, where the higher number indicates greater severity, how would you rate
the patient’s degree of sickness?
2. Does the patient have a clinically significant allergic reaction to any antimicrobial agent?
3. How much does the patient weigh?
4. My first recommendation is as follows: to cover for items 1, 2, 3, 5 and 6, give gentamicin
using a dose of 119 mg (1.7 mg/kg) q8h IV [or IM] for 10 days. Modify dose in renal failure.
Also, to cover for item 4, give clindamycin using a dose of 595 mg (8.5 mg/kg) q6h IV [or IM]
for 14 days. If diarrhea or other gastrointestinal symptoms develop, check for
pseudomembranous colitis.
ORGANIZATIONAL FEATURES:
i. Knowledge Representation:
In the form of production rules implemented in LISP. Each rule in the knowledge base has the
form: IF condition and … and condition hold, THEN draw conclusion and … and conclusion.
The rules are encoded in LISP data structures.
ii. Reasoning:
MYCIN combines backward and forward chaining and uses certainty factors to reason with
uncertain information. It uses backward chaining to discover what organisms were present. Then
it uses forward chaining to reason from the organism to a treatment regime.
iii. Heuristics:
When the general category of infection has been established, MYCIN examines each candidate
diagnosis in a depth-first manner. Heuristics are used to limit the search, including checking all
premises of a possible rule to see if any one of these is known to be false.
iv. Dialogue/Explanation:
The dialogue is computer controlled, with MYCIN driving the consultation through asking
questions. Explanations are generated by tracing back through the rules which have been fired.
Both “how?” and “why?” explanations are supported.
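To make the idea of certainty factors more concrete, the sketch below shows the textbook form of the combination rule usually associated with MYCIN-style systems. The source text does not give the formula, so treat the three-case rule and the function names `combine_cf` and `rule_cf` as assumptions drawn from standard presentations rather than as MYCIN's exact code.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] that bear on the same hypothesis.

    Textbook three-case rule (an assumption; not stated in the notes above).
    Undefined when one value is exactly 1 and the other exactly -1.
    """
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)        # two pieces of supporting evidence
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)        # two pieces of opposing evidence
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # conflicting evidence

def rule_cf(premise_cfs, rule_strength):
    """CF contributed by one rule: its weakest premise scaled by the rule's own strength."""
    return min(premise_cfs) * rule_strength

print(combine_cf(0.6, 0.5))
print(rule_cf([0.9, 0.7], 0.6))
```

Two independent rules that each support a conclusion reinforce one another, but the combined value never exceeds 1.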
LEARNING
Learning is a goal-directed process by which a system improves its knowledge, or its knowledge
representation, by exploring experience and prior knowledge. Each person will interpret a piece of information
according to their level of understanding and their own way of interpreting things.
Factors to consider when designing a learning system

• Learning element
• Performance element
• Critic
• Problem generator
GENERAL LEARNING MODEL
Learning can be accomplished using a number of different methods, such as by memorizing facts, by being
told, or by studying examples such as problem solutions. Learning requires that new knowledge structures be
created from some form of input stimulus. This new knowledge must then be assimilated into a knowledge
base and be tested in some way for its utility. The tested knowledge should be used in the performance of
some task from which meaningful feedback can be obtained, where the feedback provides some measure of
the accuracy and usefulness of the newly acquired knowledge.
TYPES OF LEARNING

1. REINFORCEMENT LEARNING
Reinforcement learning refers to a class of problems in machine learning which postulate an agent
exploring an environment in which the agent perceives its current state and takes actions. The
environment, in return, provides a reward (which can be positive or negative). Reinforcement learning
algorithms attempt to find a policy for maximizing cumulative reward for the agent over the course of
the problem.
2. SUPERVISED LEARNING
• Any situation in which both inputs and outputs of a component can be perceived.
• Correct answers for each example or instance are available.
3. UNSUPERVISED LEARNING
Learning when there is no hint at all about the correct outputs.
It is mainly used in probabilistic learning systems.
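The reinforcement learning setting described above (an agent, states, actions, rewards and a policy) can be illustrated with tabular Q-learning on a tiny corridor world. Everything specific here is an assumption made for the sketch: the four-state corridor, the reward of 1 at the right-hand end, the learning rate and discount factor, and the purely random exploration.

```python
import random

N_STATES = 4              # states 0..3; state 3 is the goal (terminal)
ACTIONS = (-1, +1)        # move left or move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (illustrative values)

def step(state, action):
    """Environment: returns (next_state, reward); walls clamp the position."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)              # explore at random
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted value of the next state.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = nxt
    return q

q = train()
# Greedy policy: in each non-terminal state pick the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the greedy policy moves right in every state, since that is the shortest route to the reward.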
ROTE LEARNING
Rote learning is the most basic learning activity and focuses on memorization, avoiding the inner
complexities of the material, so that the learner can later recall the stored knowledge. It is also called
memorization because the knowledge is simply copied into the knowledge base without any modification.
As computed values are stored, this technique can save a significant amount of time. Rote learning can
also be used in complex learning systems, provided sophisticated techniques are employed to retrieve the
stored values quickly and there is some generalization to keep the amount of stored information down to a
manageable level.
Examples:

• A checkers-playing program uses this technique to learn the board positions it evaluates in its
look-ahead search.
• When a learner learns a poem or song by reciting or repeating it, without knowing the actual meaning
of the poem or song.
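In programming terms, rote learning of computed values corresponds to caching (memoization). The sketch below stores the value of each board position the first time it is evaluated and recalls it afterwards. The scoring function is a stand-in invented for the example, not a real checkers evaluator.

```python
evaluation_cache = {}   # stored knowledge: position -> computed value
calls = 0               # counts how often the expensive evaluation really runs

def expensive_evaluate(position):
    """Stand-in for a costly look-ahead search (hypothetical scoring)."""
    global calls
    calls += 1
    return sum(position)

def evaluate(position):
    """Return the stored value if this position has been seen before."""
    if position not in evaluation_cache:
        evaluation_cache[position] = expensive_evaluate(position)
    return evaluation_cache[position]

board = (1, 0, -1, 1)      # a toy encoding of a board position
first = evaluate(board)    # computed and stored
second = evaluate(board)   # recalled from memory, no recomputation
print(first, second, calls)
```

The second lookup costs only a dictionary access, which is the time saving the text refers to.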
INDUCTIVE LEARNING.
Learning a function from examples of its inputs and outputs is called inductive learning. It finds inductive
hypotheses that explain a set of observations with the help of background knowledge. The performance of a
learning algorithm is measured by its learning curve, which shows the prediction accuracy as a function of
the number of observed examples.
A hypothesis is represented by a set of logical sentences, as are the prior knowledge, the example
descriptions and the classifications. Example: the restaurant learning problem.
• Learning a rule for deciding whether to wait for a table.
• Each example object is described by a logical sentence; the attributes become unary predicates.
Consider a pair (x, f(x)), where x is the input and f(x) is the output of the function applied to x. The task of
pure inductive inference is this:
• Given a collection of examples of ‘f’, return a function ‘h’ that approximates ‘f’, where the function ‘h’
is called a hypothesis.
Note: The reason that learning is difficult, from a conceptual point of view, is that it is not easy to tell
whether any particular ‘h’ is a good approximation of ‘f’. A good hypothesis will generalize well, that is,
it will predict unseen examples correctly.
Inductive learning method: Construct / adjust h to agree with f on the training set (h is consistent if it agrees
with f on all examples). E.g. curve fitting: given a set of data points, polynomial hypotheses of various
degrees can be fitted to the same data. Ockham’s razor prefers the simplest hypothesis consistent with the
data.
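As a small illustration of preferring the simplest consistent hypothesis, the sketch below finds the lowest polynomial degree that agrees with every example. It relies on the fact that a polynomial of degree d sampled at equally spaced points has constant d-th differences. The function name and the restriction to equally spaced, exact integer samples are assumptions made to keep the example short; real curve fitting would normally use least squares.

```python
def lowest_consistent_degree(ys):
    """Degree of the simplest polynomial passing through equally spaced samples ys."""
    degree = 0
    diffs = list(ys)
    # Keep taking differences until they are all equal (or only one value is left).
    while len(set(diffs)) > 1:
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        degree += 1
    return degree

print(lowest_consistent_degree([1, 4, 9, 16, 25]))   # samples of x^2 at x = 1..5
print(lowest_consistent_degree([1, 3, 5, 7]))        # samples of a straight line
```

A degree-4 polynomial would also pass through the five points of the first data set, but Ockham's razor selects the quadratic because it is the simplest hypothesis that is still consistent.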

EXPLANATION-BASED LEARNING
When an agent can utilize a worked example of a problem as a problem-solving method, the agent is said to
have the capability of explanation-based learning (EBL). In EBL, the background knowledge is sufficient to
explain the hypothesis, which in turn explains the observations. The agent does not learn anything factually
new from the instance. It extracts general rules from single examples by explaining the examples and
generalizing the explanation.
The EBL algorithm accepts 4 kinds of input:
• A training example
-What the learning program sees in the world.
• A goal concept
-A high-level description of what the program is to learn.
• An operational criterion
-A description of which concepts are usable.
• A domain theory
-A set of rules that describe relationships between objects and actions in a domain.
The entailment constraints satisfied by EBL are:
▪ Hypothesis ˄ Descriptions ⊨ Classifications
▪ Background ⊨ Hypothesis
EXTRACTING RULES FROM EXAMPLE: -
It is a method for extracting general rules from individual observations. The idea is to construct an
explanation of the observation using prior knowledge.
Consider the problem of differentiating and simplifying algebraic expressions. If we differentiate the
expression $x^2$ with respect to x, we obtain 2x. The proof tree for Derivative($x^2$, x) = 2x is too large to
use, so we use a simpler problem to illustrate the generalization method.
Suppose we are to simplify 1 × (0 + x). The knowledge base includes the following rules:
▪ Rewrite(u, v) ˄ Simplify(v, w) => Simplify(u, w)
▪ Primitive(u) => Simplify(u, u)
▪ ArithmeticUnknown(u) => Primitive(u)
▪ Number(u) => Primitive(u)
▪ Rewrite(1 × u, u)
▪ Rewrite(0 + u, u)
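The rules listed above can be exercised on the example 1 × (0 + x) with a few lines of Python. This sketch covers only the first stage of EBL, proving that the goal holds for the specific example. It does not build the generalized proof tree. The tuple encoding of expressions and the function names are assumptions chosen for the illustration.

```python
def rewrite(expr):
    """Apply one rewrite rule if it matches; return None otherwise."""
    if isinstance(expr, tuple):
        op, left, right = expr
        if op == "*" and left == 1:      # Rewrite(1 * u, u)
            return right
        if op == "+" and left == 0:      # Rewrite(0 + u, u)
            return right
    return None

def is_primitive(expr):
    """Number(u) or ArithmeticUnknown(u) implies Primitive(u)."""
    return isinstance(expr, (int, float, str))

def simplify(expr):
    """Simplify(u, w): rewrite until a primitive expression is reached."""
    if is_primitive(expr):               # Primitive(u) => Simplify(u, u)
        return expr
    rewritten = rewrite(expr)
    if rewritten is None:
        raise ValueError(f"no rule applies to {expr!r}")
    return simplify(rewritten)           # Rewrite(u, v) and Simplify(v, w)

# 1 * (0 + x), with the unknown x written as a string.
result = simplify(("*", 1, ("+", 0, "x")))
print(result)
```

The chain of rule applications used here (two rewrites followed by the primitive check) is the proof that EBL would go on to generalize by replacing the constants with variables.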
EBL PROCESS WORKING
1. Construct a proof that the goal predicate applies to the example using the available background knowledge.
2. In parallel, construct a generalized proof tree for the variabilized goal using the same inference steps as in
the original proof.
3. Construct a new rule whose left-hand side consists of the leaves of the proof tree and whose right-hand side
is the variabilized goal.
4. Drop any conditions that are true regardless of the values of the variables in the goal.
IMPROVING EFFICIENCY
1. Avoid adding a large number of rules. Each extra rule increases the branching factor in the search space.
2. Derived rules must offer a significant increase in speed.
3. Derived rules should be as general as possible, so that they apply to the largest possible set of cases.
LEARNING NEURAL NETWORKS
Artificial neural nets can be defined as massively parallel computing systems that store information and
later apply it, simulating the way the human brain collects data during the learning process and stores it in
the connections between neurons. Artificial neural nets are one of the options in situations where there are
no strict rules by which the result of a situation can be simulated, or where these rules are too complex or
incomplete. Statistical methods, multi-agent systems or adaptive computing systems are further alternatives.
Basic elements of neural nets
The perceptron is a neural model which receives input signals X = (x1, x2, ..., xn+1) through synaptic
weights forming the weight vector W = (w1, w2, ..., wn+1). (In neurobiology a synapse is the connection
between two neurons, and the strength acting at the synapse is the synaptic weight.) The input vector is
called a sample or pattern. Components of the input vector can take real or binary values. The perceptron
output is defined as:
$$o = f(net) = f(W \cdot X) = f\left(\sum_{j=1}^{n+1} w_j x_j\right) = f\left(\sum_{j=1}^{n} w_j x_j - \Theta\right) \qquad (1.1)$$
where the variable net denotes the weighted sum of the inputs, that is, the dot product of the weight vector
and the input vector. The function f is called the activation function of the perceptron, Θ is the excitation
threshold of the perceptron, and o is the perceptron output.

The perceptron has n+1 inputs. The (n+1)-th input value is always -1 and wn+1 = Θ, which is the excitation
threshold of the perceptron. If there are only feedforward connections between neurons, the net is called a
feedforward neural net. Each neuron of each layer sends signals to each neuron of the following layer.
Backward connections do not exist. It is not necessary to know a model of the problem being solved when
using artificial neural nets. A suitable training set and a suitable net architecture offer enough information
to train the designed neural net and, together with backpropagation of error, to set the parameters (weights
and thresholds) of the net so that it gives an acceptable result. A solution can thus be obtained by
simulations or experiments instead of rigorous and formal problem solving.
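The perceptron output defined above can be computed directly. The sketch below folds the threshold Θ into the weight vector by appending the constant input -1, as described in the text. The step activation and the hand-picked weights that realise logical AND are assumptions chosen for the illustration.

```python
def step(net):
    """Threshold activation: output 1 when the weighted sum is non-negative."""
    return 1 if net >= 0 else 0

def perceptron_output(inputs, weights, theta, f=step):
    """o = f(sum_j w_j * x_j - theta), computed with the extra input x_{n+1} = -1."""
    x = list(inputs) + [-1]          # x_{n+1} = -1
    w = list(weights) + [theta]      # w_{n+1} = theta
    net = sum(wj * xj for wj, xj in zip(w, x))
    return f(net)

# Hand-picked weights (1, 1) and threshold 1.5 make the unit behave like logical AND.
outputs = [perceptron_output((a, b), (1, 1), 1.5) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

Only the input (1, 1) pushes the weighted sum above the threshold, so the unit fires for that pattern alone.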
Examples of tasks solvable by neural nets
The applicability of neural networks (NN) comes from some basic features of NN. The most important one
is that NN are universal function approximators. Because many problems cannot be described with known
functions, NN usage is expected to grow quickly. The main limiting factor is the very high demand on
computing power, although this is easing rapidly with the development of high-performance computing
systems.
• Networks that learn and are capable of performing tasks that are difficult with conventional computers
• Used for poorly structured problems
• Use patterns instead of the if-then-else rules used by the expert systems
• Create a model based on input and output

HOW ARTIFICIAL NEURAL NETWORKS WORK
An ANN usually involves a large number of processors operating in parallel and arranged in tiers. The first
tier receives the raw input information -- analogous to optic nerves in human visual processing. Each
successive tier receives the output from the tier preceding it, rather than from the raw input -- in the same way
neurons further from the optic nerve receive signals from those closer to it. The last tier produces the output
of the system.
Each processing node has its own small sphere of knowledge, including what it has seen and any rules it was
originally programmed with or developed for itself. The tiers are highly interconnected, which means each
node in tier n will be connected to many nodes in tier n-1, which supply its inputs, and to many nodes in
tier n+1, for which it provides input data. There may be one or multiple nodes in the output layer, from
which the answer the network produces can be read.
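The flow of signals from tier to tier can be sketched as a forward pass through a very small network. The weights below are set by hand (an assumption for illustration, not learned values) so that a hidden tier of two nodes and a single output node together compute exclusive OR, something a single perceptron cannot do.

```python
def threshold(net):
    """Step activation: the node fires when its weighted sum is positive."""
    return 1 if net > 0 else 0

def layer(inputs, weight_rows, biases):
    """One tier: every node sums its weighted inputs, adds a bias and applies the activation."""
    return [threshold(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_rows, biases)]

def xor_network(a, b):
    # Hidden tier: the first node acts as OR, the second as AND.
    hidden = layer([a, b], [[1, 1], [1, 1]], [-0.5, -1.5])
    # Output tier: fires when OR is on and AND is off.
    output = layer(hidden, [[1, -1]], [-0.5])
    return output[0]

print([xor_network(a, b) for a in (0, 1) for b in (0, 1)])
```

Each tier only sees the output of the tier before it, never the raw input, which mirrors the description above.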
Artificial neural networks are notable for being adaptive, which means they modify themselves as they learn
from initial training, and as subsequent runs provide more information about the world. The most basic
learning model is centred on weighting the input streams, which is how each node weights the importance of
input data from each of its predecessors. Inputs that contribute to getting right answers are weighted higher.

How neural networks learn


Typically, an ANN is initially trained or fed large amounts of data. Training consists of providing input and
telling the network what the output should be. For example, to build a network that identifies the faces of
actors, the initial training might be a series of pictures, including actors, non-actors, masks, statuary and animal
faces. Each input is accompanied by the matching identification, such as actors' names, "not actor" or "not
human" information. Providing the answers allows the model to adjust its internal weightings to learn how to
do its job better.

For example, if nodes A, B and C tell node D the current input image is a picture of E, but node F says it is G,
and the training program confirms it is E, D will decrease the weight it assigns to F's input and increase the
weight it gives to that of A, B and C.
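One simple, concrete form of this kind of weight adjustment is the perceptron learning rule: when the output is wrong, each weight is nudged in proportion to the error and to the input it carried. The sketch below trains a single unit on logical AND. The data set, the learning rate and the epoch limit are assumptions made for the example, and real networks typically use gradient-based variants of this idea.

```python
def predict(weights, bias, x):
    """Fire (1) when the weighted sum plus bias is positive."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net > 0 else 0

def train_perceptron(samples, epochs=50, rate=1):
    """Perceptron learning rule: change each weight by rate * error * input."""
    weights, bias = [0] * len(samples[0][0]), 0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error:
                mistakes += 1
                # Inputs that were active when the unit was wrong get their weights adjusted.
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if mistakes == 0:      # every training example is now classified correctly
            break
    return weights, bias

AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print(weights, bias)
```

Because AND is linearly separable, the rule settles on weights that classify all four examples correctly after a handful of passes.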
In defining the rules and making determinations -- that is, the decision of each node on what to send to the
next tier based on inputs from the previous tier -- neural networks use several principles. These include
gradient-based training, fuzzy logic, genetic algorithms and Bayesian methods. They may be given some basic
rules about object relationships in the space being modelled.
For example, a facial recognition system might be instructed, "Eyebrows are found above eyes," or,
"Moustaches are below a nose. Moustaches are above and/or beside a mouth." Preloading rules can make
training faster and make the model more powerful sooner. But it also builds in assumptions about the nature
of the problem space, which may prove to be either irrelevant and unhelpful or incorrect and
counterproductive, making the decision about what, if any, rules to build in very important.
Further, the assumptions people make when training algorithms cause neural networks to amplify cultural
biases. Biased data sets are an ongoing challenge in training systems that find answers on their own by
recognizing patterns in data. If the data feeding the algorithm isn't neutral, and almost no data is, the
machine propagates bias.
TYPES OF NEURAL NETWORKS
Neural networks are sometimes described in terms of their depth, including how many layers they have
between input and output, or the model's so-called hidden layers. This is why the term neural network is used
almost synonymously with deep learning. They can also be described by the number of hidden nodes the model
has or in terms of how many inputs and outputs each node has. Variations on the classic neural network design
allow various forms of forward and backward propagation of information among tiers. Specific types of
artificial neural networks include:

• Feed-forward neural networks
• Recurrent neural networks
• Convolutional neural networks
• Deconvolutional neural networks
• Modular neural networks

Feed-forward neural networks are one of the simplest variants of neural networks. They pass information
in one direction, from the input nodes until it reaches the output nodes. The network may or may not have
hidden node layers; with few or no hidden layers, its functioning is more interpretable. It can cope with
large amounts of noise. This type of ANN computational model is used in technologies such as facial
recognition and computer vision.
Recurrent neural networks (RNN) are more complex. They save the output of processing nodes and feed
the result back into the model. This is how the model is said to learn to predict the outcome of a layer. Each
node in the RNN model acts as a memory cell, continuing the computation and implementation of operations.
This neural network starts with the same forward propagation as a feed-forward network, but then goes on to
remember all processed information in order to reuse it in the future. If the network's prediction is incorrect,
then the system self-learns and continues working towards the correct prediction during backpropagation. This
type of ANN is frequently used in text-to-speech conversions.
Convolutional neural networks (CNN) are one of the most popular models used today. This neural network
computational model uses a variation of multilayer perceptrons and contains one or more convolutional layers
that can be either fully connected or pooled. These convolutional layers create feature maps that record a
region of image which is ultimately broken into rectangles and sent out for nonlinear processing. The CNN
model is particularly popular in the realm of image recognition; it has been used in many of the most advanced
applications of AI, including facial recognition, text digitization and natural language processing. Other uses
include paraphrase detection, signal processing and image classification.
Deconvolutional neural networks utilize a reversed CNN model process. They aim to find lost features or
signals that may have originally been considered unimportant to the CNN system's task. This network model
can be used in image synthesis and analysis.
Modular neural networks contain multiple neural networks working separately from one another. The
networks do not communicate or interfere with each other's activities during the computation process.
Consequently, complex or big computational processes can be performed more efficiently.
Advantages of artificial neural networks

• Parallel processing abilities mean the network can perform more than one job at a time.

• Information is stored on an entire network, not just a database.

• The ability to learn and model nonlinear, complex relationships helps model the real-life relationships
between input and output.

• Fault tolerance means the corruption of one or more cells of the ANN will not stop the generation of
output.

• Gradual corruption means the network will slowly degrade over time, instead of a problem destroying
the network instantly.

• The ability to produce output with incomplete knowledge with the loss of performance being based on
how important the missing information is.

• No restrictions are placed on the input variables, such as how they should be distributed.

• Machine learning means the ANN can learn from events and make decisions based on the observations.

• The ability to learn hidden relationships in the data without commanding any fixed relationship means an
ANN can better model highly volatile data and non-constant variance.

• The ability to generalize and infer unseen relationships on unseen data means ANNs can predict the
output of unseen data.
Disadvantages of artificial neural networks

• The lack of rules for determining the proper network structure means the appropriate artificial neural
network architecture can only be found through trial and error and experience.

• The requirement of processors with parallel processing abilities makes neural networks hardware
dependent.

• The network works with numerical information, therefore all problems must be translated into numerical
values before they can be presented to the ANN.

• The lack of explanation behind probing solutions is one of the biggest disadvantages in ANNs. The
inability to explain the why or how behind the solution generates a lack of trust in the network.
Applications of artificial neural networks
Image recognition was one of the first areas to which neural networks were successfully applied, but the
technology's uses have expanded to many more areas, including:
• Chatbots

• Natural language processing, translation and language generation

• Stock market prediction

• Delivery driver route planning and optimization

• Drug discovery and development


These are just a few specific areas to which neural networks are being applied today. Prime uses involve any
process that operates according to strict rules or patterns and has large amounts of data. If the data involved
is too large for a human to make sense of in a reasonable amount of time, the process is likely a prime
candidate for automation through artificial neural networks.
Other terminologies
▪ Define Q-Learning.
The agent learns an action-value function giving the expected utility of taking a given action in a given
state. This is called Q-Learning.
▪ Define Bayesian learning.
Bayesian learning simply calculates the probability of each hypothesis, given the data, and makes
predictions on that basis. That is, the predictions are made by using all the hypotheses, weighted by
their probabilities, rather than by using just a single “best” hypothesis.
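The definition of Bayesian learning above can be made concrete with a short calculation. In the sketch below, three hypothetical coin hypotheses (each claiming a different probability of heads) are weighted by their posterior probability after two heads are observed, and the prediction for the next toss uses all of them. The hypotheses, the equal priors and the data are assumptions made up for the example.

```python
def posterior(priors, likelihoods):
    """P(h | data) for each hypothesis, by Bayes' rule."""
    unnormalised = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def predict(posteriors, predictions):
    """Weighted prediction: sum over hypotheses of P(outcome | h) * P(h | data)."""
    return sum(p * q for p, q in zip(posteriors, predictions))

# Hypothetical coin hypotheses: probability of heads under each, with equal priors.
p_heads = [0.25, 0.5, 0.75]
priors = [1 / 3, 1 / 3, 1 / 3]
# Data: two heads in a row, so each likelihood is P(heads) squared.
likelihoods = [p ** 2 for p in p_heads]

post = posterior(priors, likelihoods)
next_heads = predict(post, p_heads)
print(post, next_heads)
```

The prediction (9/14, about 0.64) lies between the values the individual hypotheses would give, because every hypothesis contributes in proportion to its posterior probability rather than a single best one being chosen.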
