Marketing Analytics Project
Analysis of Facebook Reviews on Trustpilot
NICK KACHANYUK
12/11/2020
WILLAMETTE UNIVERSITY
Research Questions
Motivation
Trustpilot is a business review site where anyone with an email or Facebook account can register and leave a
review about a company.
Trustpilot states that it combats fake reviews (spam, non-genuine experiences) and reviews that do not
follow its community guidelines (hate speech, etc.).
Research Questions
What are the different topics (technical issues, censoring, social media experience, etc.) that Facebook
reviews on Trustpilot are written about?
How can these reviews be explored via text and sentiment analysis to detect reviews that are not genuine?
Existing Theory & Evidence
Fang, Xing, and Justin Zhan. “Sentiment Analysis Using Product Review Data.” Journal of Big Data, vol.
2, no. 1, 16 June 2015, 10.1186/s40537-015-0015-2. Accessed 23 Apr. 2019.
Key Takeaways:
Sentiment analysis is hindered by fake, spam, and non-genuine reviews
Neutral sentiments make it difficult to categorize reviews into distinct categories
They found that classifying reviews by their specific star-scaled ratings is difficult, which led them
to report F1 scores below 0.5
• In my EDA, I found a similar problem occurring
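For reference, the F1 score mentioned above is the harmonic mean of precision and recall, computed per class and then usually averaged across classes. The sketch below shows one way to compute it for star ratings. The labels are made up for illustration, and macro-averaging is an assumption here; the cited paper may have averaged differently.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Made-up star ratings (not taken from the Trustpilot data).
TRUE_STARS = [5, 5, 1, 1, 3]
PRED_STARS = [5, 1, 1, 1, 5]

if __name__ == "__main__":
    print(f1_per_class(TRUE_STARS, PRED_STARS))
    print(round(macro_f1(TRUE_STARS, PRED_STARS), 3))
```

On these toy labels the 3-star class is never predicted correctly, so its F1 is 0 and the macro average falls to about 0.43. This is the kind of effect that middle (neutral) ratings can have on the overall score.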
Existing Theory & Evidence (continued)
Hiremath, Prakash & Algur, Siddu & Shivashankar, S. (2010). Cluster Analysis of Customer Reviews
Extracted from Web Pages. Journal of Applied Computer Science & Mathematics. 4.
Key Takeaways:
Features (independent variables) can be used to segment reviews into categories (most significant, more
significant, significant, and insignificant)
Found that a significant number of features belong to the insignificant review category
Existing Theory & Evidence (continued)
Ng, James. “Natural Language Processing (NLP) to Analyse Product Reviews by Online Shoppers.” Medium, 16 Apr. 2020,
towardsdatascience.com/natural-language-processing-nlp-analysis-of-product-reviews-by-online-shoppers-7a5966f5e615.
Accessed 10 Dec. 2020.
Key Takeaways:
Topic modeling has three uses (descriptive, predictive, and prescriptive analytics)
Descriptive is about understanding what the reviews are written about (my focus in the project)
Predictive is about understanding how the product will sell, attract new customers, and so on given a
certain rating
Prescriptive is about looking at reviews left by a consumer and using an algorithm to recommend other
products to them based on those previous reviews
Dataset and Variables
Data
Variables
review (all the written text including both the title and the body of the review)
Cluster Analysis
Elbow, Silhouette, and Gap Statistic methods
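As a rough illustration of how the elbow and silhouette methods pick a number of clusters, the sketch below runs a small k-means on toy 2-D points. The clustering algorithm (k-means), the farthest-point initialisation, and the points themselves are assumptions for illustration; in the project each point would be a review's feature vector. The gap statistic is not shown because it also needs reference datasets drawn at random.

```python
import math


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, iters=50):
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


def wcss(points, labels, centers):
    """Within-cluster sum of squares: the quantity plotted in an elbow chart."""
    return sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))


def silhouette(points, labels):
    """Mean silhouette width; needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # cohesion
        b = min(sum(dist(p, q) for q in other) / len(other)  # separation
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy points forming three tight groups (not derived from the review data).
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]

if __name__ == "__main__":
    for k in range(1, 5):
        labels, centers = kmeans(POINTS, k)
        sil = silhouette(POINTS, labels) if k > 1 else float("nan")
        print(k, round(wcss(POINTS, labels, centers), 2), round(sil, 2))
```

On these points the within-cluster sum of squares drops sharply up to k = 3 and then flattens (the "elbow"), while the mean silhouette width peaks at k = 3, so both methods agree.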
Topic Modeling
LDA with Gibbs sampling
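To make the topic-modeling step more concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling. It is not the project's actual code: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the tokenized toy reviews are all assumptions chosen for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA on a list of tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, z = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [[(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
             for row in doc_topic]
    return vocab, topic_word, theta


# Made-up, already-tokenized reviews (not taken from the Trustpilot data).
DOCS = [
    "account hacked login password support".split(),
    "login password reset account locked".split(),
    "ads feed spam ads political".split(),
    "feed ads censored posts political".split(),
]

if __name__ == "__main__":
    vocab, topic_word, theta = lda_gibbs(DOCS, n_topics=2)
    for k, counts in enumerate(topic_word):
        top = sorted(zip(counts, vocab), reverse=True)[:3]
        print("topic", k, [w for _, w in top])
```

Each sweep removes one token's topic assignment, then redraws it in proportion to how common that topic is in the document and how common the word is in that topic. The most frequent words per topic are then read off as the topic's description.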