A Comparative Study On Chatbot Based On Machine Learning and Lexicon Based Technique


Volume 5, Issue 5, May – 2020 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

A Comparative Study on Chatbot Based on Machine Learning and Lexicon Based Technique
Karthik Konar
MCA Student, Dept. of Computer Engineering,
NMIMS Mukesh Patel School of Technology Management & Engineering, Vile Parle (West), Mumbai.

Abstract:- Sentiment analysis is the domain in which software is used to understand human emotions. The emotions are expressed in written form, and the sentiments can be classified as positive, negative, or neutral. Sentiment analysis is also referred to as opinion mining, because it analyzes the thoughts of a customer with respect to a particular thing.

Natural Language Processing and Machine Learning are both considered children of Artificial Intelligence, since they work in conjunction and help to solve a large number of data problems. Natural Language Processing provides an understanding of how computers and human (natural) language interact with each other.

This paper aims to identify which of the two approaches (lexicon or machine learning) provides more accurate results when it is implemented in a chatbot.

Python is used to develop the chatbots. One chatbot is developed to classify movie reviews as positive, negative, or neutral from the user's input, and another chatbot (DocBot) is developed to provide information related to kidney disease to the user.

Keyword:- Chatbot, Lexicon, Machine Learning, polarity, subjectivity, tokenization.

I. INTRODUCTION

Nowadays customers play a very big role in making a business or any entity successful. A customer can make or break a business, so it is very important for an organization to understand the sentiments of its customers and clients in order to grow. Sentiment analysis is therefore essential. Sentiment analysis extracts useful information, which can be used to understand current market strategy and improve the business. It has various applications, such as review classification and product review mining. [1] states that sentiment analysis is a system or model that takes a document as input, analyzes it, and generates a detailed document summarizing the opinions of the given input document.

This paper provides a detailed comparison between the lexicon based approach and the machine learning based approach. [4] Chatbot refers to a chatting robot. [4] It is a computer program that simulates communication. [4] It is all about the conversation with the user. [4] The conversation with a chatbot is very simple. [4] It answers the questions asked by the user. [6] A chatbot, also known as a conversational agent, is computer software capable of taking a natural language input and providing a conversational output in real time. [7] A chatbot is a tool that provides a quick way to interact with users. [7] It is very helpful to users because it allows them to enter questions in natural language and obtain the desired information easily. Two chatbots (CHATBOT1, DOCBOT) were developed using the above-mentioned approaches. This paper discusses which chatbot gives the most accurate results, along with the advantages and disadvantages of each approach. CHATBOT1, which we have developed, comes under the lexicon based approach: it takes a list of words as input from the user and then identifies the polarity of the text. The main work of CHATBOT1 is to take movie reviews from the user and classify them as positive, negative, or neutral. This chatbot uses the TextBlob library for processing textual data. The concepts of polarity and subjectivity are used in developing CHATBOT1.

DOCBOT is another chatbot, developed using the machine learning based approach. DOCBOT provides information related to kidney disease to the user. Tokenization (with lemmatization) and TfidfVectorizer are used in developing DOCBOT.

This study helps us to compare the two approaches and identify which one provides more suitable results.

II. MATERIALS AND METHODS

➢ Literature Survey
Two approaches are extensively used to detect sentiment in text: symbolic techniques and machine learning techniques.

[1] concluded in their research work that the machine learning technique is easier and more efficient than symbolic techniques (the lexicon approach).

[2] developed a Watson chatbot that shows and performs tasks such as “on headlamps” or “Turn on wipers”. A user may input commands through voice assist while driving, without any distraction from the road, and the bot will perform those tasks for the user.

[3] introduced a new method called B-Point Tree to speed up the search process by adding an additional data structure that contains shortcut pointers to the traditional search BST. The experiments were conducted on FloristBot, a chatbot that behaves as human personnel in a flower shop. FloristBot is used to entertain customers and take orders.

[4] states that a chatbot is one of the simple ways to transport data from a computer without having to think of proper keywords to look up in a search or browse several web pages to collect information. In her review paper she concluded that the development and improvement of chatbot design grow at an unpredictable rate due to the variety of methods and approaches used to design a chatbot.

[5] stated in their paper that larger lexicons may yield a decrease in performance due to ambiguity of word polarity and increased model complexity.

[7] developed a chatbot that provides various information related to a university or college, as well as student-related information. The chatbot can be used by anyone who can access the university’s website. The project uses the concepts of Artificial Intelligence and Machine Learning.

[8] introduced the concept of cyberbullying in two-way chat using machine learning algorithms; their main aim was to detect cyberbullying in a chatbot using a cyberbully algorithm.

[9] explains a medical chatbot that can be used to replace the conventional method of disease diagnosis and treatment recommendation using a machine learning approach.

[10] In their survey, the results showed that the greatest advantage of using chatbots in marketing is the provision of simple, fast information, but they also showed respondents' fear of getting the wrong information from chatbots, which is something that needs to be resolved in the future.

III. RESULTS AND DISCUSSION

➢ System Architecture of ChatBot1 Developed Using Lexicon Based Approach

Fig 1:- System Architecture of ChatBot for Movie Review Classification Using Lexicon Approach.

The algorithm for the above chatbot is as follows:
Step 1: A greeting message is displayed to the user, and the user is asked to greet the chatbot back. The user's greeting sentence then undergoes a polarity and subjectivity check. If the polarity of the sentence is less than 0 and the subjectivity of the sentence is greater than or equal to 0.5, the chatbot assumes that the user is angry, displays an appropriate message, and terminates.
Step 2: If the polarity of the sentence is not less than 0 and the subjectivity of the sentence is less than 0.5, the chatbot assumes that the user is fine and happy. It asks the user to enter the name of a movie that he/she recently watched, and then asks the user to write a review of that movie.
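The greeting check in Steps 1 and 2 can be sketched as follows. This is a minimal illustration rather than the chatbot's actual source: the function names and the "undecided" label for score combinations that the two steps do not cover are assumptions. TextBlob's sentiment property returns a polarity in [-1, 1] and a subjectivity in [0, 1].

```python
def greeting_mood(polarity, subjectivity):
    """Map the greeting's sentiment scores to the mood assumed in Steps 1 and 2."""
    if polarity < 0 and subjectivity >= 0.5:
        return "angry"      # Step 1: show a message and terminate
    if polarity >= 0 and subjectivity < 0.5:
        return "fine"       # Step 2: go on to ask for a movie review
    return "undecided"      # assumption: this case is not described in the steps


def score_text(text):
    """Return (polarity, subjectivity) for a sentence using TextBlob."""
    # Imported here so greeting_mood can be used without TextBlob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity
```

A call such as greeting_mood(*score_text(user_greeting)) would then decide whether the conversation continues.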

The polarity of the review text is then measured.

If the polarity of the sentence:

➢ Is greater than or equal to 0.7, the chatbot assumes that the movie was fantastic and then it terminates.
➢ Is greater than or equal to 0.5 and less than 0.7, the chatbot assumes that the movie was above average and then it terminates.
➢ Is greater than or equal to 0 and less than 0.4, the chatbot assumes that the movie was average and then it terminates.
➢ Is less than -0.5, the chatbot assumes that the movie was worse.

This approach is known as the lexicon based approach.
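The polarity thresholds listed above can be written as a small classification function. This is a sketch under the thresholds exactly as stated; the function name is hypothetical, and the "unclassified" label is an assumption for polarity values that fall outside the listed ranges.

```python
def review_label(polarity):
    """Classify a movie review from its polarity score (expected in [-1, 1])."""
    if polarity >= 0.7:
        return "fantastic"
    if 0.5 <= polarity < 0.7:
        return "above average"
    if 0 <= polarity < 0.4:
        return "average"
    if polarity < -0.5:
        return "worse"
    # Assumption: values not covered by the listed ranges are left unlabelled.
    return "unclassified"
```

For example, a review whose polarity is 0.8 is labelled "fantastic", while a polarity of -0.6 is labelled "worse".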

➢ System Architecture of DocBot Developed Using Machine Learning Technique

Fig 2:- System Architecture of DocBot Using Machine Learning Approach.

The algorithm for the above chatbot is shown as follows:

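Step 1: Download the article from the internet.

Pseudocode:
from newspaper import Article  # assumes the newspaper3k package
article1 = Article('SPECIFY URL')
article1.download()
article1.parse()  # extracts the article text into article1.text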


This downloads the entire article from the specified URL.

Step 2: Convert the text into a list of sentences, i.e. perform tokenization.

Step 3: Print the tokens.

Step 4: Create a dictionary of (key: value) pairs that is used to remove punctuation.

Step 5: Print the punctuation characters.

Step 6: Print the dictionary.

Step 7: Create a function that returns a list of lemmatized, lower-case words after removing punctuation.

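Pseudocode:
import nltk
def LemNormalize1(text1):
    # remove_punc_dict is the punctuation-removal dictionary from Step 4
    return nltk.word_tokenize(text1.lower().translate(remove_punc_dict))
The function above removes all punctuation from the article text and returns the lower-case word tokens.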
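Steps 2, 4 and 7 can be illustrated with the standard library alone. The sketch below is a simplified stand-in, not DocBot's code: it splits sentences with a regular expression and words on whitespace, where DocBot uses NLTK's tokenizers, and the function names are hypothetical. A lemmatizer such as NLTK's WordNetLemmatizer could be applied to each returned word; that step is omitted here.

```python
import re
import string

# Step 4: map every punctuation code point to None so str.translate drops it.
remove_punc_dict = {ord(ch): None for ch in string.punctuation}


def sentence_tokens(text):
    """Step 2 stand-in: split text into sentences after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def normalize(text):
    """Step 7 stand-in: lower-case the text, strip punctuation, split into words."""
    return text.lower().translate(remove_punc_dict).split()
```

Building the dictionary once and passing it to str.translate removes all punctuation in a single pass over the text.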

Step 8: Print the tokenized text.

Step 9: Create an array named GREETING_INPUTS1 containing the words that can be received as a greeting message from the user.

Pseudocode: GREETING_INPUTS1 = ["hi", "hello", "hola", "wassup", "hey"]

Step 10: Create an array named GREETING_RESPONSES1 containing the replies that should be sent back to the user.
Pseudocode: GREETING_RESPONSES1 = ["howdy", "hi", "hey", "what's good", "hello", "hey there"]

Step 11: Create a function that returns a random greeting response to a user's greeting.
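Steps 9 to 11 can be sketched as follows. The two lists are taken from the pseudocode above; the function name and the choice to return None when no greeting word is found are assumptions made for illustration.

```python
import random

GREETING_INPUTS1 = ["hi", "hello", "hola", "wassup", "hey"]
GREETING_RESPONSES1 = ["howdy", "hi", "hey", "what's good", "hello", "hey there"]


def greeting_response(sentence):
    """Return a random greeting if any word of the sentence is a known greeting."""
    for word in sentence.lower().split():
        if word in GREETING_INPUTS1:
            return random.choice(GREETING_RESPONSES1)
    return None  # assumption: no greeting found, so the caller handles the query
```

Matching is done word by word, so a word such as "this" does not trigger the greeting "hi".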

Step 12: Create a function to generate the response to the user’s query.

Step 13: Convert the user’s query to lower case.

Step 14: Print the user's query.

Step 15: Set the chatbot response to an empty string.

Step 16: Append the user’s query to the sentence list.

Step 17: Create a TF-IDF vectorizer and print its features.

Step 18: Convert the text to a matrix of TF-IDF features, then measure the similarity scores against the user's query using the cosine_similarity module.


Step 19: Get the index of the sentence most similar to the user's query.

Step 20: Sort the list in ascending order.

Step 21: Get the highest similarity score for the user's query.

Step 22: Print the similarity score.

Step 23: If the similarity score is 0, there is no text similar to the user's query.

Step 24: If the similarity score is non-zero, print the chatbot's response, taken from the sentence token list, together with the user’s query.
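The retrieval logic of Steps 12 to 24 can be sketched in plain Python so that the arithmetic is visible. This is a from-scratch stand-in for the TF-IDF vectorizer and cosine_similarity module that DocBot uses, not a reproduction of it: the whitespace tokenizer, the smoothed IDF formula, the fallback message, and all function names are assumptions. Unlike the steps above, it compares the query only against the article sentences instead of sorting a list that also contains the query itself.

```python
import math
from collections import Counter

FALLBACK = "I apologize, I don't understand."  # hypothetical fallback message


def tokenize(sentence):
    # Simplified whitespace tokenizer (assumption); DocBot uses NLTK.
    return sentence.lower().split()


def tfidf_vectors(sentences):
    """Return one {term: weight} dict per sentence, weight = count * smoothed IDF."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def respond(query, sentences):
    """Return the article sentence most similar to the query, or FALLBACK."""
    vectors = tfidf_vectors(sentences + [query.lower()])  # Steps 13, 16, 17
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]  # Step 18
    if not scores:
        return FALLBACK
    best = max(range(len(scores)), key=scores.__getitem__)  # Steps 19 to 21
    if scores[best] == 0:  # Step 23: nothing in the article matches
        return FALLBACK
    return sentences[best]  # Step 24
```

Because the query is appended to the sentence list before the vectors are built, its words share the same vocabulary and IDF weights as the article sentences, which is what makes the cosine scores comparable.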

➢ Results
The following results were obtained for the chatbot implemented using the lexicon technique.

Fig 3:- The polarity measured from the greeting sentence is less than 0, so the chatbot infers that the user is angry, displays an appropriate message, and then terminates.

Fig 4:- The polarity measured from the sentence is less than -0.5, so the chatbot infers that the movie is worse.

Fig 5:- The polarity measured from the sentence is greater than 0.7, so the chatbot infers that the movie is wonderful.


Fig 6:- The polarity measured from the sentence is greater than 0 and less than or equal to 0.4, so the chatbot infers that the movie is average.

Fig 7:- The polarity measured from the sentence is greater than or equal to 0.5 and less than or equal to 0.7, so the chatbot infers that the movie is above average.

The following results were obtained for the chatbot implemented using the machine learning technique.

Fig 8:- DocBot responding to a user's query.

Fig 9:- If no similarity is found for the user’s query, the chatbot prints an appropriate message.

➢ Findings:
Based on the experimentation performed, the following findings were obtained:

Sr. No | Technique | Advantages | Disadvantages | Remarks
1 | Lexicon | Easy to implement; easy to understand; less complex than the machine learning approach. | Accuracy rate is low compared with the machine learning approach; based on the WordNet database. | Implemented in a chatbot that classifies movies as wonderful, above average, average, or worst based on user reviews.
2 | Machine Learning | Accuracy rate is much higher than with the lexicon approach; very good performance. | Complex to implement compared with the lexicon approach. | Implemented in a chatbot that answers queries related to kidney disease.
Table 1:- Summary of Comparison.

IV. CONCLUSION

➢ Two chatbots were developed
The first chatbot classifies movie reviews as positive, negative, or neutral from the user's input, and the other chatbot (DocBot) provides information related to kidney disease to the user. When the performance and efficiency of both chatbots were compared, it was observed that the chatbot developed using the machine learning approach produced more promising and faster results than the chatbot developed using the lexicon approach.

Thus I conclude that machine learning techniques are more efficient than lexicon based approaches when implemented in a chatbot.

V. FUTURE SCOPE

This paper provides a detailed study of why the machine learning approach is better than the lexicon approach when implementing a chatbot. Hence this paper will prove useful for upcoming authors who wish to make a further detailed analysis of the machine learning approach and the lexicon based approach.

ACKNOWLEDGMENT

I acknowledge the contribution of NMIMS University in providing me with this amazing opportunity and good facilities to carry out this review work.

REFERENCES

[1]. Akshay Amolik, Niketan Jivane, Mahavir Bhandari, Dr. M Venkatesan. ‘Twitter Sentiment Analysis of Movie Reviews Using Machine Learning Techniques’. International Journal of Engineering and Technology.
[2]. Praveen Kumar, Mayank Sharma, Seema Rawat, Tanupriya Choudhury. ‘Designing and Developing a ChatBot using Machine Learning’. Proceedings of the SMART–2018, IEEE Conference ID: 44078, 2018 International Conference on System Modeling & Advancement in Research Trends, 23rd–24th November, 2018, College of Computing Sciences & Information Technology, Teerthanker Mahaveer University, Moradabad, India.
[3]. Ayah Atiyah, Shaidah Jusoh, Sufyan Almajali. ‘An Efficient Search for Context-Based Chatbots’. 2018 8th International Conference on Computer Science and Information Technology (CSIT).
[4]. M. Dahiya. ‘A Tool of Conversation: Chatbot’. International Journal of Computer Sciences and Engineering. Volume-5, Issue-5. E-ISSN: 2347-2693.
[5]. Olga Kolchyna, Tharsis T. P. Souza, Philip C. Treleaven and Tomaso Aste. ‘Twitter Sentiment Analysis: Lexicon Method, Machine Learning Method and Their Combination’.
[6]. Prissadang Suta, Xi Lan, Biting Wu, Pornchai Mongkolnam and Jonathan H. Chan. ‘An Overview of Machine Learning in Chatbots’. International Journal of Mechanical Engineering and Robotics Research, Vol. 9, No. 4, April 2020.
[7]. Neelkumar P. Patel, Devangi R. Parikh, Prof. Darshan A. Patel, Prof. Ronak R. Patel. ‘AI and Web-Based Human-Like Interactive University Chatbot (UNIBOT)’. Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology [ICECA 2019], IEEE Conference Record # 45616; IEEE Xplore ISBN: 978-1-7281-0167-5.
[8]. Mrs. V. Selvi, Ms. Saranya, Ms. Chidida, Ms. Abarna. ‘Chatbot and bullyfree Chat’. Proceedings of International Conference on Systems Computation Automation and Networking 2019. IEEE 978-1-7281-1524-5.
[9]. Rohit Binu Mathew, Sandra Varghese, Sera Elsa Joy, Swanthana Susan Alex. ‘Chatbot for Disease Prediction and Treatment Recommendation using Machine Learning’. Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019), IEEE Xplore Part Number: CFP19J32-ART; ISBN: 978-1-5386-9439-8.
[10]. Uros Arsenijevic, Marija Jovic. ‘Artificial intelligence marketing: Chatbots’. 2019 International Conference on Artificial Intelligence: Applications and Innovations (IC-AIAI).

IJISRT20MAY943 www.ijisrt.com 1541
