An Approach For Automatic Analysis of Online Store Product and Services Reviews
Vol. 60
Journal of Varna University of Economics 4
Snezhana SULOVA1
Introduction
In recent years the Internet has become established as one of the richest and most
easily accessible sources of information. The global network holds a large number of
documents, data, audio and video files, and many recorded customer reviews. All these
resources carry knowledge about business, and after appropriate computer processing
they can contribute to more detailed analyses and help to identify and explore new
relationships.
In the sphere of e-commerce, core business activities are carried out through
dynamic online systems. One of the main challenges for this type of business is making
fast and accurate decisions in accordance with changes in the market environment.
E-commerce systems generate detailed and varied reports, which are based mostly
on statistical processing of the data stored in the database. Lately, intelligent business
analysis based on both structured and unstructured data has been used for more
detailed and in-depth analysis in this area.
Practice has shown that new customers of online stores now rely largely
on the opinions posted by existing customers. In addition, manufacturers and service
providers are also interested in analyzing customers' opinions to improve the quality
1 Department of Informatics, University of Economics Varna, Bulgaria. e-mail: [email protected]
Izvestiya
2016 Volume 60 4
and standards of products and services. All this requires a search for new and
effective ways to transform unstructured data, such as customer opinions, into detailed
reports and analyses.
The purpose of this article is to propose an approach for automated analysis of
online store product reviews, based on a study of existing technologies for natural
language processing.
Theoretical foundations of computer technology for natural language processing
Natural language processing (NLP) is a broad term that can be viewed as a synthesis
of artificial intelligence and computational linguistics. It goes beyond simple machine
translation: it aims at a full understanding of the text, checking the syntactic and
semantic validity of linguistic input and using real-world knowledge to understand the
participants' goals and beliefs, as well as speech acts, conversations and discourse
structure (Kumar, 2011, p. 4). Many researchers currently explore different aspects of
intelligent text processing. In the literature, knowledge discovery in unstructured data
is generally known as text mining (TM) (Fayyad, Piatetsky-Shapiro and Smyth, 1996;
Feldman, Sanger, 2007). It is accomplished by applying data mining (DM) techniques to
unstructured text data. Typical text mining tasks include text categorization, text
clustering, concept/entity extraction, production of granular taxonomies, sentiment
analysis, document summarization, and entity relation modeling (Pena-Ayala, 2014, p. 37).
The accumulation of more and more information on the Web has made it necessary
to extract knowledge from Internet sources such as web pages. This gave rise to a new
concept, web mining (WM): the extraction of knowledge from web resources.
Etzioni first used the term and defined it as the use of data mining techniques to
automatically discover and extract information from World Wide Web documents and
services (e.g., on-line travel agents, job listings, electronic malls, etc.) (Etzioni,
1996, p. 65). Later the concept of WM expanded, and it now also includes techniques for
examining and analyzing data on the use of web resources (Cooley, Mobasher,
Srivastava, 1997; Markov, Larose, 2007). Web mining is commonly divided into the
following three sub-areas (Cooley, Mobasher, Srivastava, 1997):
- web content mining (WCM): extracting useful knowledge from the contents
  of web documents;
- web structure mining (WSM): extracting useful knowledge based on the
  structure of web sites;
- web usage mining (WUM): extracting useful knowledge from data on the
  use of Internet resources.
Many researchers deal with WCM problems (Kosala, Blockeel, 2000; Navadiya,
Patel, 2012; Markov, Larose, 2007). The studies differ according to the specific
research task and the type of resources used. Topics covered in scientific
publications include automatic classification of web pages (Materna, 2008),
grouping of documents (Markov, Larose, 2007), detection of similarity between text
documents (Huang, 2008; Lakshmi, 2013), and opinion extraction and sentiment
analysis (Liu, 2012; Medhat, Hassan, Korashy, 2014; D'Avanzo, Pilato,
2015; Patel, Prabhu and Bhowmick, 2015).
In recent years, mainly thanks to the development of web applications and social
networks, the Internet has accumulated a large number of customer reviews and shared
impressions, feelings and emotions. For this reason many researchers focus on two
interrelated areas: opinion mining (OM) and sentiment analysis (SA) (fig. 1).
products and services meet the descriptions and presentations, what else customers
want to find in the online store, and what the clients' overall assessments are.
Opinion mining and sentiment analysis are important for traders because they create
the prerequisites for marketing tailored to each customer and for better
service.
In this article, to study and analyze customer reviews, we suggest using
classification methods: first to separate the opinions about the various
characteristics of the goods, and then to evaluate the polarity of the customer reviews
about them. The model that we use for the analysis is shown in fig. 3.
Fig. 3. Model for analysis of online store product and services reviews
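The two-stage idea described above (first assign an opinion to a product characteristic, then evaluate its polarity) can be illustrated with a minimal Python sketch. It is not the model from fig. 3: the keyword sets, the word lists and the names classify_aspect, classify_polarity and analyze are hypothetical, and simple word matching stands in for the trained classifiers that the approach assumes.

```python
import re

# Hypothetical keyword lists for illustration only; a real system would
# learn the aspect and polarity classifiers from labelled reviews.
ASPECT_KEYWORDS = {
    "delivery": {"delivery", "shipping", "courier", "arrived"},
    "price": {"price", "cheap", "expensive", "cost"},
    "quality": {"quality", "broken", "durable", "material"},
}
POSITIVE = {"fast", "good", "great", "cheap", "durable", "excellent"}
NEGATIVE = {"slow", "bad", "poor", "expensive", "broken", "late"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify_aspect(tokens):
    """Stage 1: pick the aspect whose keywords overlap most with the tokens."""
    token_set = set(tokens)
    scores = {a: len(kw & token_set) for a, kw in ASPECT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def classify_polarity(tokens):
    """Stage 2: compare the counts of positive and negative words."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(review):
    """Return one (aspect, polarity) pair per non-empty sentence of a review."""
    results = []
    for sentence in re.split(r"[.!?]+", review):
        tokens = tokenize(sentence)
        if tokens:
            results.append((classify_aspect(tokens), classify_polarity(tokens)))
    return results


print(analyze("The delivery was slow. Great quality and durable material!"))
```

Working sentence by sentence keeps the two decisions separate, so a single review can yield a negative opinion about one characteristic and a positive opinion about another.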
Opinion mining from text data is usually a non-trivial task because the data is
unstructured. It is based on WCM, and it is appropriate to analyze
online customer reviews following these steps:
1. Collecting and recording product reviews.
2. Text preprocessing of product reviews.
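As an illustration of the text preprocessing step, the following Python sketch cleans one collected review. The concrete operations (removing leftover HTML tags, lowercasing, tokenization, stop-word removal) are common choices assumed here and are not taken from the source; the stop-word list and the name preprocess are hypothetical.

```python
import re

# Small illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "it", "this", "to", "of", "i"}


def strip_html(text):
    """Replace HTML tags that often remain in collected review text with spaces."""
    return re.sub(r"<[^>]+>", " ", text)


def preprocess(review):
    """Return the lowercase tokens of one review with stop words removed."""
    text = strip_html(review).lower()
    # A token is a run of letters, optionally with one inner apostrophe (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("<p>The delivery was FAST, and I love it!</p>"))
```

The resulting token lists can then be passed to whichever classification method is used in the later steps.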
In practice, this is the last stage of the analysis of reviews. It presents the
results, so the software tools used and the human interpretation of the results both
play an important role.
A model based on the NB algorithm could be built in the same way. We tested this
and found that the models produce similar results.
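To make the NB variant more concrete, here is a minimal sketch of a text classifier, assuming that NB denotes the Naive Bayes algorithm in its multinomial form with add-one smoothing. The class name NaiveBayesTextClassifier and the tiny training set are hypothetical and serve only as an illustration; they are not the data or the software used in the study.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        """Count documents per class and word occurrences per class."""
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        """Return the class with the highest log posterior for the tokens."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior of the class
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already preprocessed training reviews.
train_docs = [
    ["fast", "delivery", "great"],
    ["excellent", "quality"],
    ["slow", "delivery", "bad"],
    ["poor", "quality", "broken"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(model.predict(["great", "quality"]))
print(model.predict(["slow", "broken"]))
```

The same class can be trained on aspect labels instead of polarity labels, so one implementation can serve both classification stages.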
Conclusion
The rapid development of social networking and the sharing capabilities provided
by many Internet applications are a prerequisite for the generation of large
collections of consumer reviews, impressions, shared feelings and emotions.
Intelligent business analysis of these customer reviews is important to business and
has therefore attracted research interest in recent years. Since no specific algorithm
is available for reliably discovering knowledge in text, in this paper we build on
the results of existing studies and propose an approach for analyzing the
reviews of online store customers, through which the expressed opinions can be
classified and conclusions about the quality of goods can be drawn.
The resulting new knowledge could help to improve the product range and
customer satisfaction, which is essential for e-commerce companies because their sales
revenues largely depend on it. Furthermore, managers can use this kind of analysis
to create successful business strategies based on the resulting in-depth and
precise analyses and forecasts. Extracting new knowledge from Internet resources
could be an important competitive advantage for companies involved in e-commerce,
because in general it contributes to improving their business.
References
1. Ankitkumar, D., Badre, R., Kinikar, M. (2014) A Survey on Sentiment Analysis
and Opinion Mining. International Journal of Innovative Research in Computer
and Communication Engineering. 2 (11). p. 6633-6639.
2. Cooley, R., Mobasher, B. and Srivastava, J. (1997) Web Mining: Information and
Pattern Discovery on the World Wide Web. Proceedings of the International
Conference on Tools with Artificial Intelligence. p. 558-567
3. D'Avanzo, E., Pilato, G. (2015) Mining social network users opinions to aid
buyers shopping decisions. Computers in Human Behavior. 51. p. 1284-1294.
4. Das, R., Chen, M. (2001) Yahoo! For Amazon: Sentiment Parsing from Small
Talk on the Web, EFA 2001 Barcelona Meetings. [Online] Available from:
http://ssrn.com/abstract=276189 or http://dx.doi.org/10.2139/ssrn.276189,
[Accessed: 10/9/2016].
5. Dave, K., Lawrence, S. and Pennock, D. (2003) Mining the peanut gallery:
Opinion extraction and semantic classification of product reviews. Proceedings
of WWW. p. 519-528.
6. Etzioni, . (1996) The World Wide Web: quagmire or gold mine?
Communications of the ACM. 11. p. 65-68.
7. Fayyad, M., Piatetsky-Shapiro and Smyth, P. (1996) From Data Mining to
Knowledge Discovery in Databases. AI Magazine [Online] 17(3). p. 37-54.
Available from: https://www.aaai.org/ojs/index.php/aimagazine/article/viewFile/1230/1131.
[Accessed: 10/6/2016].
8. Feldman, R., Sanger, J. (2007) The text mining handbook. Advanced Approaches
in Analyzing Unstructured Data. Cambridge: Cambridge University Press.
9. Heerschop, B. et al. (2011) Polarity Analysis of Texts using Discourse Structure.
Proceedings of the 20th ACM international conference on Information and
knowledge management. p. 1061-1070.
10. Hu, M. and Liu, B. (2004) Mining and Summarizing Customer Reviews,
Proceedings of the tenth ACM SIGKDD international conference on Knowledge
discovery and data mining, p. 168-177.
11. Huang, . (2008) Similarity Measures for Text Document Clustering,
Proceedings of the sixth New Zealand computer science research student
conference (NZCSRSC2008). [Online] p. 49-56. Available from:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.332.4480&rep=rep1&type=pdf.
[Accessed: 13/6/2016].
12. Kosala, R., Blockeel, H. (2000) Web Mining Research: a survey. ACM SIGKDD
Explorations Newsletter. [Online] 2(1). p. 1-15. Available from:
http://www.kdd.org/exploration_files/kosala.pdf. [Accessed: 11/6/2016].
13. Kumar, . (2011) Natural Language Processing. New Delhi: I. K. International
Publishing House Pvt. Ltd.
14. Lakshmi, S. et al. (2013) Analysis of Similarity Measures for Text Clustering.
International Journal of Engineering & Science Research, [Online] 3(8) pp
4627-463. Available from:
http://www.ijesr.org/admin/upload_journal/journal_Naga_lakshmi__33olaug13esr.pdf.
[Accessed: 13/6/2016].
15. Liu, B. (2012) Sentiment Analysis and Opinion Mining, Morgan & Claypool
Publishers.
16. Markov, Z. and Larose, D. (2007) Data Mining the Web: Uncovering Patterns
in Web Content, Structure, and Usage. New Jersey: John Wiley & Sons.
17. Medhat, W., Hassan, A. and Korashy, H. (2014) Sentiment analysis algorithms
and applications: A survey. Ain Shams Engineering Journal. 5. p. 1093-1113.
30. Verma, R., Kiranjyoti (2015) Opinion Mining and Analysis of the Techniques
for User Generated Content (UGC), International Journal of Advanced Research
in Computer Science and Software Engineering. 5(5). p. 438-441.
31. Zhang, C., Wang, H., Yao, L. Y., Wu, D., Lia, Y. and Wang, B. (2008)
Automatic keyword extraction from documents using conditional random fields.
Journal of Computational Information Systems. 4(3). p. 1169-1180.