Unit-5 (Notes AI)
Applications
• Artificial Intelligence Applications
Here is the list of the top 14 applications of AI (Artificial Intelligence):
AI Application in E-Commerce
Personalized Shopping:-Artificial Intelligence technology is used to create recommendation
engines through which you can engage better with your customers. These recommendations
are made in accordance with their browsing history, preference, and interests. It helps in
improving your relationship with your customers and their loyalty towards your brand.
AI-powered Assistants:-Virtual shopping assistants and chatbots help improve the user
experience while shopping online. Natural Language Processing is used to make the
conversation sound as human and personal as possible. Moreover, these assistants can have
real-time engagement with your customers. Did you know that on amazon.com, soon, customer
service could be handled by chatbots?
Fraud Prevention:-Credit card frauds and fake reviews are two of the most significant issues
that E-Commerce companies deal with. By considering the usage patterns, AI can help reduce
the possibility of credit card frauds taking place. Many customers prefer to buy a product or
service based on customer reviews. AI can help identify and handle fake reviews.
Voice Assistants:-Without even the direct involvement of the lecturer or the teacher, a
student can access extra learning material or assistance through Voice Assistants. This reduces
the printing costs of temporary handbooks and also provides answers to very common
questions easily.
Spam Filters:-The email that we use in our day-to-day lives has AI that filters out spam emails
sending them to spam or trash folders, letting us see the filtered content only. The popular email
provider, Gmail, has managed to reach a filtration capacity of approximately 99.9%.
Facial Recognition:-Our favorite devices like our phones, laptops, and PCs use facial
recognition techniques, applying face filters to detect and identify faces in order to provide
secure access. Apart from personal usage, facial recognition is a widely used Artificial
Intelligence application even in high-security areas in several industries.
Recommendation System:-Various platforms that we use in our daily lives like e-commerce,
entertainment websites, social media, and video sharing platforms like YouTube all use the
recommendation system to get user data and provide customized recommendations to users to
increase engagement. This is a very widely used Artificial Intelligence application in almost all
industries.
AI Applications in Healthcare
Artificial Intelligence finds diverse applications in the healthcare sector. AI applications are used
in healthcare to build sophisticated machines that can detect diseases and identify cancer cells.
Artificial Intelligence can help analyze chronic conditions with lab and other medical data to
ensure early diagnosis. AI uses the combination of historical data and medical intelligence for
the discovery of new drugs.
AI Applications in Agriculture
Artificial Intelligence is used to identify defects and nutrient deficiencies in the soil. Using
computer vision, robotics, and machine learning applications, AI can analyze where weeds are
growing. AI bots can help to harvest crops at a higher volume and faster pace than human
laborers.
AI Applications in Automobiles
Artificial Intelligence is used to build self-driving vehicles. AI can be used along with the
vehicle’s camera, radar, cloud services, GPS, and control signals to operate the vehicle. AI can
improve the in-vehicle experience and provide additional systems like emergency braking, blind-
spot monitoring, and driver-assist steering.
AI Applications in Social Media
Instagram:-On Instagram, AI considers your likes and the accounts you follow to determine
what posts you are shown on your explore tab.
Facebook:-Artificial Intelligence is also used along with a tool called DeepText. With this tool,
Facebook can understand conversations better. It can be used to translate posts from different
languages automatically.
Twitter:-AI is used by Twitter for fraud detection, removing propaganda, and hateful content.
Twitter also uses AI to recommend tweets that users might enjoy, based on what type of tweets
they engage with.
AI Applications in Marketing
1. Using AI, marketers can deliver highly targeted and personalized ads with the help of
behavioral analysis, pattern recognition in ML, etc. It also helps with retargeting audiences
at the right time to ensure better results and reduced feelings of distrust and annoyance.
2. AI can help with content marketing in a way that matches the brand's style and voice. It
can be used to handle routine tasks like performance and campaign reports, and much
more.
3. AI can provide users with real-time personalizations based on their behavior and can be
used to edit and optimize marketing campaigns to fit a local market's needs.
Applications of Artificial Intelligence in Chatbots
AI chatbots (https://www.concurrency.com/blog/august-2019/role-of-artificial-intelligence-in-
chatbot-development) can comprehend natural language and respond to people online who use
the "live chat" feature that many organizations
provide for customer service. AI chatbots are effective with the use of machine learning, and
can be integrated in an array of websites and applications. AI chatbots can eventually build a
database of answers, in addition to pulling information from an established selection of
integrated answers. As AI continues to improve, these chatbots can effectively resolve customer
issues, respond to simple inquiries, improve customer service, and provide 24/7 support. All in
all, these AI chatbots can help to improve customer satisfaction.
Applications of Artificial Intelligence in Finance
It has been reported that 80% of banks recognize the benefits that AI can provide. Whether it’s
personal finance, corporate finance, or consumer finance, the highly evolved technology that is
offered through AI can help to significantly improve a wide range of financial services. For
example, customers looking for help regarding wealth management solutions can easily get the
information they need through SMS text messaging or online chat, all AI-powered. Artificial
intelligence can also detect changes in transaction patterns and other potential red flags that
can signify fraud, which humans can easily miss, and thus saving businesses and individuals
from significant loss. Aside from fraud detection and task automation, AI can also better predict
and assess loan risks.
Conclusion
Artificial Intelligence is revolutionizing industries with its applications and helping to solve
complex problems.
• Language Models
Language modeling (LM) is the use of various statistical and probabilistic techniques to
determine the probability of a given sequence of words occurring in a sentence. Language
models analyze bodies of text data to provide a basis for their word predictions. They are used
in natural language processing (NLP) applications, particularly ones that generate text as an
output. Some of these applications include machine translation and question answering.
How language modeling works
Language models determine word probability by analyzing text data. They interpret this data by
feeding it through an algorithm that establishes rules for context in natural language. Then, the
model applies these rules in language tasks to accurately predict or produce new sentences.
The model essentially learns the features and characteristics of basic language and uses those
features to understand new phrases.
There are several different probabilistic approaches to modeling language, which vary
depending on the purpose of the language model. From a technical perspective, the various
types differ by the amount of text data they analyze and the math they use to analyze it. For
example, a language model designed to generate sentences for an automated Twitter bot may
use different math and analyze text data in a different way than a language model designed for
determining the likelihood of a search query.
N-gram. N-grams are a relatively simple approach to language models. They create a
probability distribution for a sequence of n words. The n can be any number, and defines the
size of the "gram", or sequence of words being assigned a probability. For example, if n = 5, a
gram might look like this: "can you please call me." The model then assigns probabilities using
sequences of n size. Basically, n can be thought of as the amount of context the model is told
to consider.
Some types of n-grams are unigrams, bigrams, trigrams and so on.
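As a minimal sketch of the idea (the toy corpus below is an assumption made up only for illustration),
the following Python snippet estimates bigram probabilities by counting adjacent word pairs and then
scores a new sentence:

from collections import Counter

# Toy corpus, made up for illustration only
corpus = [
    "can you please call me",
    "can you please help me",
    "you can call me later",
]

# Count single words and adjacent word pairs across the corpus
unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def bigram_prob(w1, w2):
    # P(w2 | w1) estimated from the counts above
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

def sentence_prob(sentence):
    # Probability of a sentence as a product of its bigram probabilities
    words = sentence.split()
    prob = 1.0
    for w1, w2 in zip(words, words[1:]):
        prob *= bigram_prob(w1, w2)
    return prob

print(sentence_prob("can you please call me"))

A real n-gram model would also apply smoothing so that unseen word pairs do not force the whole
sentence probability to zero.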
Unigram. The unigram is the simplest type of language model. It doesn't look at any
conditioning context in its calculations. It evaluates each word or term independently. Unigram
models commonly handle language processing tasks such as information retrieval. The unigram
is the foundation of a more specific model variant called the query likelihood model, which uses
information retrieval to examine a pool of documents and match the most relevant one to a
specific query.
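A minimal sketch of the query likelihood idea (the two documents and the query below are
assumptions made up for illustration): each document is scored by multiplying the unigram
probabilities of the query terms in that document, and the highest-scoring document is returned.

from collections import Counter

docs = {
    "doc1": "language models predict the next word in a sentence",
    "doc2": "information retrieval matches documents to a user query",
}
query = "query retrieval"

def query_likelihood(query, text):
    words = text.split()
    counts = Counter(words)
    score = 1.0
    for term in query.split():
        # P(term | document), with a tiny constant to avoid zero probabilities
        score *= (counts[term] + 1e-6) / len(words)
    return score

best = max(docs, key=lambda name: query_likelihood(query, docs[name]))
print(best)  # doc2, since it contains the query terms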
Bidirectional. Unlike n-gram models, which analyze text in one direction (backwards),
bidirectional models analyze text in both directions, backwards and forwards. These models can
predict any word in a sentence or body of text by using every other word in the text. Examining
text bidirectionally increases result accuracy. This type is often utilized in machine learning and
speech generation applications. For example, Google uses a bidirectional model to process
search queries.
Exponential. Also known as maximum entropy models, this type is more complex than n-
grams. Simply put, the model evaluates text using an equation that combines feature functions
and n-grams. Basically, this type specifies features and parameters of the desired results, and
unlike n-grams, leaves analysis parameters more ambiguous -- it doesn't specify individual gram
sizes, for example. The model is based on the principle of entropy, which states that the
probability distribution with the most entropy is the best choice. In other words, the model with
the most chaos, and least room for assumptions, is the most accurate. Exponential models are
designed to maximize cross-entropy, which minimizes the number of statistical assumptions that
can be made. This enables users to better trust the results they get from these models.
Continuous space. This type of model represents words as a non-linear combination of
weights in a neural network. The process of assigning a weight to a word is also known as word
embedding. This type becomes especially useful as data sets get increasingly large, because
larger datasets often include more unique words. The presence of a lot of unique or rarely used
words can cause problems for a linear model like an n-gram. This is because the number of
possible word sequences increases, and the patterns that inform results become weaker. By
weighting words in a non-linear, distributed way, this model can "learn" to approximate words
and therefore not be misled by any unknown values. Its "understanding" of a given word is not
as tightly tethered to the immediate surrounding words as it is in n-gram models.
The models listed above are more general statistical approaches from which more specific variant language
models are derived. For example, as mentioned in the n-gram description, the query likelihood
model is a more specific or specialized model that uses the n-gram approach. Model types may
be used in conjunction with one another.
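A minimal sketch of the word-embedding idea, assuming the gensim library is available (gensim and
the toy corpus are illustrative assumptions, not part of these notes): each word is mapped to a dense
weight vector learned by a small neural network.

from gensim.models import Word2Vec

# Tiny tokenized corpus, made up for illustration only
sentences = [
    ["language", "models", "predict", "words"],
    ["neural", "networks", "learn", "word", "embeddings"],
    ["word", "embeddings", "represent", "words", "as", "vectors"],
]

# Train a small continuous-space model; every word gets a 20-dimensional vector
model = Word2Vec(sentences, vector_size=20, window=2, min_count=1, epochs=50)

print(model.wv["word"])                      # the learned embedding (weight vector)
print(model.wv.most_similar("word", topn=2)) # nearest words in the embedding space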
• Information Retrieval
What is an IR Model?
An Information Retrieval (IR) model selects and ranks the documents that the user has
asked for in the form of a query. The documents and the
queries are represented in a similar manner, so that document selection and ranking
can be formalized by a matching function that returns a retrieval status value (RSV) for
each document in the collection. Many of the Information Retrieval systems represent
document contents by a set of descriptors, called terms, belonging to a vocabulary V.
An IR model determines the query-document matching function according to four main
approaches:
Types of IR Models
Components of Information Retrieval/ IR Model
• Acquisition: In this step, the selection of documents and other objects from various web
resources that consist of text-based documents takes place. The required data is collected
by web crawlers and stored in the database.
• Representation: It consists of indexing that contains free-text terms, controlled
vocabulary, and manual & automatic techniques. For example, abstracting (summarizing)
and bibliographic description (author, title, sources, date, and metadata).
• File Organization: There are two main file organization methods: Sequential, which stores
records document by document, and Inverted, which stores records term by term with a list
of records under each term; a combination of both can also be used.
• Query: An IR process starts when a user enters a query into the system. Queries are
formal statements of information needs, for example, search strings in web search
engines. In information retrieval, a query does not uniquely identify a single object in the
collection. Instead, several objects may match the query, perhaps with different degrees of
relevancy, as the sketch below illustrates.
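A minimal sketch of a matching function that returns a retrieval status value (RSV) for each
document in a collection, assuming the scikit-learn library is available (the library, the sample
documents, and the query are illustrative assumptions):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "machine translation converts text between languages",
    "information retrieval ranks documents for a user query",
    "speech recognition converts audio to text",
]
query = "rank documents for a query"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)   # index the document collection
query_vector = vectorizer.transform([query])        # represent the query the same way

# Cosine similarity acts as the retrieval status value for each document
rsv = cosine_similarity(query_vector, doc_vectors).ravel()
for doc, score in sorted(zip(documents, rsv), key=lambda pair: -pair[1]):
    print(round(score, 3), doc)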
• Information Extraction
Information Extraction is the process of parsing through unstructured data and extracting
essential information into more editable and structured data formats. For example, consider
we're going through a company’s financial information from a few documents. Usually, we
search for some required information when the data is digital or manually check the same. But
with information extraction NLP algorithms, we can automate the data extraction of all required
information such as tables, company growth metrics, and other financial details from various
kinds of documents (PDFs, Docs, Images etc.).
Information Extraction from text data can be achieved by leveraging Deep Learning and NLP
techniques like Named Entity Recognition. However, if we build one from scratch, we should
decide the algorithm considering the type of data we're working on, such as invoices, medical
reports, etc. This is to make sure the model is specific to a particular use case. We’ll be learning
more about this in the following sections.
Tokenization
Computers usually won't understand the language we speak or communicate with. Hence, we
break the language, basically the words and sentences, into tokens and then load it into a
program. The process of breaking down language into tokens is called tokenization.
For example, consider a simple sentence: "NLP information extraction is fun''. This could be
tokenized into:
1. One-word (sometimes called unigram token): NLP, information, extraction, is, fun
2. Two-word phrase (bigram tokens): NLP information, information extraction, extraction is,
is fun
3. Three-word sentence (trigram tokens): NLP information extraction, information extraction
is, extraction is fun
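As a minimal sketch (pure Python, with no external library), the unigram, bigram, and trigram tokens
above can be produced with simple list slicing:

sentence = "NLP information extraction is fun"
tokens = sentence.split()

unigrams = tokens
bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
trigrams = [" ".join(tri) for tri in zip(tokens, tokens[1:], tokens[2:])]

print(unigrams)   # ['NLP', 'information', 'extraction', 'is', 'fun']
print(bigrams)    # ['NLP information', 'information extraction', 'extraction is', 'is fun']
print(trigrams)   # ['NLP information extraction', 'information extraction is', 'extraction is fun']

Libraries such as spacy perform similar word-level tokenization, as the snippet below (already in these
notes) shows.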
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
for token in doc:
    print(token.text)

Output:
Apple
is
looking
at
buying
U.K.
startup
for
$
1
billion
Parts of Speech Tagging
Tagging parts of speech is very crucial for information extraction from text. It'll help us
understand the context of the text data. We usually refer to text from documents as
''unstructured data'' – data with no defined structure or pattern. Hence, with POS tagging we can
use techniques that will provide the context of words or tokens used to categorise them in
specific ways.
In parts of speech tagging, all the tokens in the text data get categorised into different word
categories, such as nouns, verbs, adjectives, prepositions, determiners, etc. This additional
information connected to words enables further processing and analysis, such as sentiment
analytics, lemmatization, or any reports where we can look closer at a specific class of words.
Here’s a simple python code snippet using spacy, that’ll return parts of speech of a given
sentence.
import spacy

NLP = spacy.load("en_core_web_sm")
doc = NLP("Apple is looking at buying U.K. startup for $1 billion")
for token in doc:
    # token.pos_ holds the part-of-speech tag assigned to each token
    print(token.text, token.pos_)
Dependency Graphs
Dependency graphs help us find relationships between neighbouring words using directed
graphs. This relation will provide details about the dependency type (e.g. Subject, Object etc.).
For example, in the dependency graph of a short sentence, an arrow directed from the word
faster to the word moving indicates that faster modifies moving, and the label `advmod` assigned
to the arrow describes the exact nature of the dependency.
Similarly, we can build our own dependency graphs using frameworks like nltk and spacy.
Below is an example:
import spacy
from spacy import displacy
NLP = spacy.load("en_core_web_sm")
doc = NLP("This is a sentence.")
displacy.serve(doc, style="dep")
NER with Spacy
Spacy is an open-source NLP library for advanced Natural Language Processing in Python and
Cython. It's well maintained and has over 20K stars on Github. To extract information with
spacy, NER models are widely leveraged.
Make sure to install the latest version of python3, pip and spacy. Additionally, we'll have to
download spacy core pre-trained models to use them in our programs directly. Use Terminal or
Command prompt and type in the following command after installing spacy:
python -m spacy download en_core_web_sm
Code:
# import spacy
import spacy
# load the model and process a sample sentence
NLP = spacy.load("en_core_web_sm")
doc = NLP("Apple is looking at buying U.K. startup for $1 billion")
# print entities
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)
Output:
Apple 0 5 ORG
U.K. 27 31 GPE
$1 billion 44 54 MONEY
We've loaded a simple sentence here and applied NER with Spacy, and it works like magic.
Let's decode the program now.
Firstly, we've imported the spacy module into the program. Next, we load the spacy model into a
variable named NLP. Next, we load the data into the model with the defined model and store it
in a doc variable. Now we iterate over the doc variable to find the entities and then print the
word, its starting, ending characters, and the entity it belongs to.
This is a simple example: if we want to try this on real large datasets, we can use the medium
and large models in spacy.
NLP = spacy.load('en_core_web_md')
NLP = spacy.load('en_core_web_lg')
These work with high accuracy in identifying some common entities like names, location,
organisation etc. In the next section, let us look at some of the business applications
where NER is of utmost need!
Several industries deal with lots of documents every day and rely on manual work. Those
include finance, medical chains, transportation, and construction. Using NLP information
extraction techniques on documents will allow everyone on the teams to search, edit, and
analyse important transactions and details across business processes.
Now we’ll look at an example in detail on how information extraction from text can be done
generically for documents of any kind.
Information Collection
Firstly, we’ll need to collect the data from different sources to build an information extraction
model. Usually, we see documents on emails, cloud drives, scanned copies, computer software,
and many other sources for business. Hence, we’ll have to write different scripts to collect and
store information in one place. This is usually done by either using APIs on the web or building
RPA (Robotic Process Automation) pipelines.
Process Data
After we collect the data, the next step is to process it. Usually, documents are of two types:
electronically generated (editable) and non-electronically generated (scanned documents). The
electronically generated documents can be sent directly into the preprocessing pipelines. For the
scanned copies, however, we’ll first need OCR to read all the data from the images before
sending it into the preprocessing pipelines. We can either use open-source tools like Tesseract
or online services like Nanonets or Textract. Once all the data is in an editable or electronic
format, we can apply preprocessing steps like Tokenization and POS tagging and then use data
loaders to load the data into the NLP information extraction models.
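A minimal OCR sketch, assuming the open-source Tesseract engine and its pytesseract Python
wrapper are installed (the file name is a placeholder, not a real file from these notes):

from PIL import Image
import pytesseract

# Read all the text from a scanned page so it can enter the preprocessing pipeline
scanned_page = Image.open("scanned_invoice.png")   # placeholder file name
text = pytesseract.image_to_string(scanned_page)
print(text)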
As discussed in the above sections, choosing a suitable model mostly depends on the type of
data we’re working with. Today, there are several state-of-the-art models we could rely on.
Below are some of the frequently used open-source models:
• Named Entity Recognition on CoNLL 2003 (English)
(https://paperswithcode.com/sota/named-entity-recognition-ner-on-conll-2003)
• Key Information Extraction From Documents: Evaluation And Generator
• Deep Reader: Information extraction from Document images via relation extraction and
Natural Language
These are some of the information extraction models. However, these are trained on a particular
dataset. If we are utilising them on our own data, we’ll need to experiment with the
hyperparameters and fine-tune the model accordingly.
The other way is to utilize pre-trained models and fine-tune them based on our data. For
Information Extraction from text, in particular, BERT models are widely used.
Evaluating the trained model is crucial before we use it in production. This is usually done by
creating a testing dataset and computing some key metrics, as sketched after the list below:
• Accuracy: the ratio of correct predictions made against the size of the test data.
• Precision: the ratio of true positives to total predicted positives.
• Recall: the ratio of true positives to total actual positives.
• F1-Score: the harmonic mean of precision and recall.
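A minimal sketch of computing these metrics on a labelled test set, assuming the scikit-learn library
is available (the labels below are made up purely for illustration):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# 1 = field extracted correctly, 0 = field missed (illustrative labels only)
y_true = [1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-Score :", f1_score(y_true, y_pred))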
Different metrics take precedence when considering different use cases. In invoice processing,
we know that an increase in the numbers or missing an item can lead to losses for the
company. This means that besides needing good accuracy, we also need to make sure the
false positives for money-related fields are at a minimum, so aiming for a high precision value
might be ideal. We also need to ensure that details like invoice numbers and dates are
always extracted since they are needed for legal and compliance purposes. Maintaining
a high recall value for these fields might take precedence.
The full potential of NLP models is only realised when they are deployed in production. Today,
as the world is increasingly digital, these models are deployed on cloud servers with a suitable
backend. In most cases, Python is utilised, as it is a handy programming language when it
comes to text data and machine learning. The model is exported either as an API or an SDK
(software development kit) for integrating with business tools. However, we need not build
everything from scratch, as there are several tools and online services for these kinds of use
cases. For example, Nanonets has a highly accurate, fully trained invoice information extraction
NLP model, and you can directly integrate it into your applications using APIs or supported
SDKs.
Ideally, these are the steps that are required for information extraction from text data.
There are several applications of Information Extraction, especially with large capital companies
and businesses. However, we can still implement IE tasks when working with significant textual
sources like emails, datasets, invoices, reports and many more. Following are some of the
applications:
Invoice Automation: Automate the process of invoice information extraction.
Healthcare Systems: Manage medical records by identifying patient information and their
prescriptions.
KYC Automation: Automate the process of KYC by extracting ethical information from
customer's identity documents.
Financial Investigation: Extract important information from financial documents (tax, growth,
quarterly revenue, profit/losses).
Conclusion
In this tutorial, we've learned about information extraction techniques from text data with various
NLP based methods. Next, we've seen how NER is crucial for information extraction, especially
when working with a wide range of documents. Next, we've learned about how companies can
create workflows to automate the process of information extraction using a real-time example.
Components of NLP
There are two components of NLP as given −
• Natural Language Understanding (NLU) − understanding and analyzing the given natural
language input.
• Natural Language Generation (NLG) − producing meaningful phrases and sentences in
natural language from an internal representation.
Difficulties in NLU
NL has an extremely rich form and structure.
It is very ambiguous. There can be different levels of ambiguity −
• Lexical ambiguity − It is at very primitive level such as word-level.
• For example, treating the word “board” as noun or verb?
• Syntax Level ambiguity − A sentence can be parsed in different ways.
• For example, “He lifted the beetle with red cap.” − Did he use a cap to lift the beetle, or
did he lift a beetle that had a red cap?
• Referential ambiguity − Referring to something using pronouns. For example, Rima
went to Gauri. She said, “I am tired.” − Exactly who is tired?
• One input can mean different meanings.
• Many inputs can mean the same thing.
NLP Terminology
• Phonology − It is the study of organizing sound systematically.
• Morphology − It is the study of the construction of words from primitive meaningful units.
• Morpheme − It is a primitive unit of meaning in a language.
• Syntax − It refers to arranging words to make a sentence. It also involves determining
the structural role of words in the sentence and in phrases.
• Semantics − It is concerned with the meaning of words and how to combine words into
meaningful phrases and sentences.
• Pragmatics − It deals with using and understanding sentences in different situations and
how the interpretation of the sentence is affected.
• Discourse − It deals with how the immediately preceding sentence can affect the
interpretation of the next sentence.
• World Knowledge − It includes the general knowledge about the world.
Steps in NLP
There are five general steps −
• Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon
of a language means the collection of words and phrases in that language. Lexical analysis
divides the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing) − It involves analysis of the words in the sentence for
grammar and arranging the words in a manner that shows the relationships among them.
A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
• Semantic Analysis − It draws the exact meaning or the dictionary meaning from the
text. The text is checked for meaningfulness. This is done by mapping syntactic structures
and objects in the task domain. The semantic analyzer disregards sentences such as “hot
ice-cream”.
• Discourse Integration − The meaning of any sentence depends upon the meaning of
the sentence just before it. In addition, it also influences the meaning of the immediately
succeeding sentence.
• Pragmatic Analysis − During this step, what was said is re-interpreted on the basis of
what it actually meant. It involves deriving those aspects of language which require real-world
knowledge.
• Machine Translation
Cost-effective translation
Machine translation increases productivity and the ability to deliver translations faster, reducing
the time to market. There is less human involvement in the process as machine translation
provides basic but valuable translations, reducing both the cost and time of delivery. For
example, in high-volume projects, you can integrate machine translation with your content
management systems to automatically tag and organize the content before translating it to
different languages.
In the early 2000s, computer software, data, and hardware became capable of doing basic
machine translation. Early developers used statistical databases of languages to train
computers to translate text. This involved a lot of manual labor and time. Each added language
required them to start over with the development for that language. Since then, machine
translation has developed in speed and accuracy, and several different machine translation
strategies have emerged.
Here are several use cases of machine translation, such as those given below:
Internal communication
For a company operating in different countries across the world, communication can be difficult
to manage. Language skills can vary from employee to employee, and some may not
understand the company’s official language well enough. Machine translation helps to lower or
eliminate the language barrier in communication. Individuals quickly obtain a translation of the
text and understand the content's core message. You can use it to translate presentations,
company bulletins, and other common communication.
External communication
Companies use machine translation to communicate more efficiently with external stakeholders
and customers. For instance, you can translate important documents into different languages for
global partners and customers. If an online store operates in many different countries, machine
translation can translate product reviews so customers can read them in their own language.
Data analysis
Some types of machine translation can process millions of user-generated comments and
deliver highly accurate results in a short timeframe. Companies translate the large amount of
content posted on social media and websites every day for analytics. For example, they can
automatically analyze customer opinions written in various languages.
Legal research
Legal departments use machine translation for preparing legal documents in different
countries. With machine translation, a large amount of content that would have been difficult to
process in different languages becomes available for analysis.
• Speech Recognition
Speech recognition is fast overcoming the challenges of poor recording equipment and noise
cancellation, variations in people’s voices, accents, dialects, semantics, contexts, etc., using
artificial intelligence and machine learning. This also includes challenges of understanding
human disposition, and the varying human language elements like colloquialisms, acronyms,
etc. The technology can provide a 95% accuracy now as compared to traditional models of
speech recognition, which is at par with regular human communication.
Furthermore, it is now an acceptable format of communication given the large companies that
endorse it and regularly employ speech recognition in their operations. It is estimated that a
majority of search engines will adopt voice technology as an integral aspect of their search
mechanism.
This has been made possible because of improved AI and machine learning (ML) algorithms
which can process significantly large datasets and provide greater accuracy by self-learning and
adapting to evolving changes. Machines are programmed to “listen” to accents, dialects,
contexts, emotions and process sophisticated and arbitrary data that is readily accessible for
mining and machine learning purposes.
Speech recognition has by far been one of the most powerful products of technological
advancement. As the likes of Siri, Alexa, Echo Dot, Google Assistant, and Google Dictate
continue to make our daily lives easier, the demand for such automated technologies is only
bound to increase.
Businesses worldwide are investing in automating their services to improve operational
efficiency, increase productivity and accuracy, and make data-driven decisions by studying
customer behaviours and purchasing habits.
As speech recognition and AI impact both professional and personal lives at workplaces and
homes respectively, the demand for skilled AI engineers and developers, Data Scientists, and
Machine Learning Engineers, is expected to be at an all-time high.
• Robotics
What are Robots?
Robots are artificial agents that act in a real-world environment.
Objective
Robots are aimed at manipulating objects by perceiving, picking, moving, and modifying the
physical properties of an object, or destroying it, thereby freeing manpower from doing repetitive
functions without getting bored, distracted, or exhausted.
What is Robotics?
Robotics is a branch of AI, which is composed of Electrical Engineering, Mechanical
Engineering, and Computer Science for designing, construction, and application of robots.
Aspects of Robotics
The input to an AI program is in symbols and rules, whereas the input to a robot is an analog
signal in the form of a speech waveform or images.
AI programs need general-purpose computers to operate on, whereas robots need special
hardware with sensors and effectors.
Components of a Robot
Robots are constructed with the following −
• Power Supply − The robots are powered by batteries, solar power, hydraulic, or
pneumatic power sources.
• Actuators − They convert energy into movement.
• Electric motors (AC/DC) − They are required for rotational movement.
• Pneumatic Air Muscles − They contract almost 40% when air is sucked in them.
• Muscle Wires − They contract by 5% when electric current is passed through them.
• Piezo Motors and Ultrasonic Motors − Best for industrial robots.
• Sensors − They provide knowledge of real-time information about the task environment.
Robots are equipped with vision sensors to be able to compute the depth in the environment.
A tactile sensor imitates the mechanical properties of touch receptors of human
fingertips.
Applications of Robotics
Robotics has been instrumental in various domains such as −
• Industries − Robots are used for handling material, cutting, welding, color coating,
drilling, polishing, etc.
• Military − Autonomous robots can reach inaccessible and hazardous zones during war.
A robot named Daksh, developed by Defense Research and Development Organization
(DRDO), is in function to destroy life-threatening objects safely.
• Medicine − The robots are capable of carrying out hundreds of clinical tests
simultaneously, rehabilitating permanently disabled people, and performing complex
surgeries such as brain tumor removal.
• Exploration − Robot rock climbers used for space exploration and underwater drones
used for ocean exploration are a few examples.
• Entertainment − Disney’s engineers have created hundreds of robots for movie making.
• Perception
Perception is a process to interpret, acquire, select and then organize the sensory
information that is captured from the real world.
For example: Human beings have sensory receptors such as touch, taste, smell,
sight and hearing. So, the information received from these receptors is transmitted
to human brain to organize the received information.
According to the received information, action is taken by interacting with the
environment to manipulate and navigate the objects.
Perception and action are very important concepts in the field of Robotics. Together,
perception and action make up a complete autonomous robot.
There is one important difference between an artificial intelligence program and a
robot. The AI program performs in a computer-simulated environment, while the
robot performs in the physical world.
For example:
In chess, an AI program can be able to make a move by searching different nodes
and has no facility to touch or sense the physical world.
However, the chess playing robot can make a move and grasp the pieces by
interacting with the physical world.
Image formation in digital camera
Image formation is a physical process that captures object in the scene through
lens and creates a 2-D image.
Let's understand the geometry of a pinhole camera.
The optical axis is perpendicular to the image plane, and the image plane is generally
placed in front of the optical center.
So, let P be a point in the scene with coordinates (X, Y, Z) and P' be its projection on the
image plane with coordinates (x, y).
If the focal length from the optical center is f, then by using the properties of similar
triangles, the projection equations are derived as:
x = f * X / Z and y = f * Y / Z
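A small worked example of this projection in Python (the point coordinates and focal length are
made-up values for illustration):

# Pinhole projection: x = f * X / Z, y = f * Y / Z
def project(X, Y, Z, f):
    return (f * X / Z, f * Y / Z)

# A scene point 4 m in front of the camera, with a 0.05 m focal length
print(project(X=2.0, Y=1.0, Z=4.0, f=0.05))   # (0.025, 0.0125)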
• Planning
For any planning system, we need the domain description, action specification, and goal
description. A plan is assumed to be a sequence of actions and each action has its own set
of preconditions to be satisfied before performing the action and also some effects which can
be positive or negative.
So, we have Forward State Space Planning (FSSP) and Backward State Space Planning
(BSSP) at the basic level.
Forward State Space Planning (FSSP)
FSSP behaves in a similar fashion to forward state space search. It says that given a start
state S in any domain, we perform the actions required and acquire a new state S’ (which
includes some new conditions as well) which is called progress and this proceeds until we
reach the goal state. The actions have to be applicable in this case.
• Disadvantage: Large branching factor
• Advantage: Algorithm is Sound
Backward State Space Planning (BSSP)
BSSP behaves in a similar fashion to backward state space search. In this, we move from
the goal state g towards sub-goal g’ that is finding the previous action to be done to achieve
that respective goal. This process is called regression (moving back to the previous goal or
sub-goal). These sub-goals have to be checked for consistency as well. The actions have to
be relevant in this case.
• Disadvantage: Not a sound algorithm (sometimes inconsistency can be found)
• Advantage: Small branching factor (very small compared to FSSP)
Hence, for an efficient planning system, we need to combine the features of FSSP and BSSP,
which gives rise to Goal Stack planning.
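A minimal sketch of forward state space planning with breadth-first search (the block-style actions,
their names, and the state literals below are made up for illustration; real planners use richer action
representations):

from collections import deque

# Each action: preconditions that must hold, effects added, effects deleted
actions = {
    "pick_up":  {"pre": {"hand_empty", "on_table"}, "add": {"holding"}, "del": {"hand_empty", "on_table"}},
    "put_down": {"pre": {"holding"}, "add": {"hand_empty", "on_table"}, "del": {"holding"}},
    "stack":    {"pre": {"holding"}, "add": {"on_block", "hand_empty"}, "del": {"holding"}},
}

def fssp(start, goal):
    # Forward search: keep applying applicable actions until all goal conditions hold
    frontier = deque([(frozenset(start), [])])
    visited = {frozenset(start)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # every goal condition is satisfied
            return plan
        for name, action in actions.items():
            if action["pre"] <= state:         # action applicable in this state (progression)
                new_state = frozenset((state - action["del"]) | action["add"])
                if new_state not in visited:
                    visited.add(new_state)
                    frontier.append((new_state, plan + [name]))
    return None

print(fssp({"hand_empty", "on_table"}, {"on_block"}))   # ['pick_up', 'stack']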
• Hardware
AI is a program written by programmers. Every program requires hardware to run. The better
the hardware, the smoother and more precise the program will perform. Artificial Intelligence
Hardware is hardware that is used to accelerate Artificial Intelligence.
There are three parts to the hardware infrastructure. These parts are compute, store, and
networking. While the computation has developed during the past few years, the other two
parts are lagging. Many companies are putting a lot of effort into developing the storage and
networking part of the hardware infrastructure.
In the meantime, companies are promoting the use of deep learning. Also, the distributed
computing infrastructure, which allows AI to work on multiple devices, is still developing.
• Moving
AI is important because it forms the very foundation of computer learning. Through AI,
computers have the ability to harness massive amounts of data and use their learned
intelligence to make optimal decisions and discoveries in fractions of the time that it would
take humans.