International Journal of Information Management: Francesca Greco, Alessandro Polli
Keywords: Emotional Text Mining; Brand management; Twitter; Network analysis; Customer profiling

The widespread use of the Internet and the constant increase in users of social media platforms have made a large amount of textual data available. This represents a valuable source of information about the changes in people’s opinions and feelings. This paper presents the application of Emotional Text Mining (ETM) in the field of brand management. ETM is an unsupervised procedure aiming to profile social media users. It is based on a bottom-up approach to classify unstructured data for the identification of social media users’ representations and sentiments about a topic. It is a fast and simple procedure to extract meaningful information from a large collection of texts. As customer profiling is relevant for brand management, we illustrate a business application of ETM on Twitter messages concerning a well-known sportswear brand in order to show the potential of this procedure, highlighting the characteristics of Twitter user communities in terms of product preferences, representations, and sentiments.
Corresponding author at: Sapienza Università degli Studi di Roma, Via Capo d’Africa 37, 00184, Roma, Italy.
E-mail addresses: [email protected] (F. Greco), [email protected] (A. Polli).
https://doi.org/10.1016/j.ijinfomgt.2019.04.007
Received 26 December 2018; Received in revised form 25 March 2019; Accepted 12 April 2019
0268-4012/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

The Internet’s wide diffusion increases the opportunity for millions of people to surf the web daily to search and share information, ideas, interests, or any other forms of expression. Social media platforms, such as Twitter or Facebook, will increasingly play a crucial role in many areas as they enable direct, continuous and real-time communication (Chandler, Salvador, & Kim, 2018).

The most obvious outcome of such a communication process is a steady increase in user-generated content (He, Zha, & Li, 2013) about users’ activities, behaviors, attitudes, preferences and values, freely shared on such digital platforms. This circumstance opens up great opportunities for research and marketing professionals, who can draw on this data repository cheaply and effectively.

The growing ease with which a professional can access a wide range of information on the relevant markets is a strategic resource for many business functions, including brand management (Shirdastian, Laroche, & Richard, 2017), which plays a crucial role in increasing the perceived value of a product, a product line or a brand over time and, ultimately, the brand equity.

Although the spread of social media has changed the tactics of brand management, the main purpose of branding remains to attract new consumers and secure their loyalty (Weber, 2009). Consumers often use the social media channel to express their feelings and opinions about products and, consequently, their attitudes towards brands (Fan, Che, & Chen, 2017; Fronzetti Colladon, 2018). Not surprisingly, therefore, a crucial success factor for a brand management plan is to know what customers disclose when they share a text on social media (Jimenez-Marquez, Gonzalez-Carrasco, Lopez-Cuadrado, & Ruiz-Mezcua, 2019; Shiau, Dwivedi, & Lai, 2018).

The constant rise in the number of users on social media platforms makes a large amount of data available that represents a relevant source of information. The scraping of social media platforms allows for the collection of huge amounts of textual data, typically unstructured, in a relatively short amount of time. Therefore, a methodology is needed to process unstructured data and to extract information. As shown by the literature, online communication is analyzed by means of text mining procedures for different purposes, such as product planning (Jeong, Yoon, & Lee, 2017), marketing (AlAlwan, Rana, Dwivedi, & Algharabat, 2017; Kapoor et al., 2018), voting behavior forecasting (e.g., Greco, Maschietti, & Polli, 2017; Grover, Kar, Dwivedi, & Janssen, 2018), disaster management (Singh, Dwivedi, Rana, Kumar, & Kapoor, 2017), campaign surveys (Afful-Dadzie & Afful-Dadzie, 2017), and in the assessment of web sites, customer review effectiveness and customer perceptions of digital marketing (Antonacci, Fronzetti Colladon, Stefanini, & Gloor, 2017; Aswani, Kar, Ilavarasan, & Dwivedi, 2018; Gloor, Fronzetti Colladon, Giacomelli, Saran, & Grippa, 2017; Rekik, Kallel, Casillas, & Alimi, 2018; Singh, Irani et al., 2017). In addition, sentiment analysis is increasingly used in order to explore people’s opinions and feelings (e.g., Aswani et al., 2018; Ceron, Curini, & Iacus, 2016; Gloor, 2017; Hopkins & King, 2010; Liu, 2012).

This paper aims to present a methodology for the analysis of
massive textual data, namely, Emotional Text Mining (ETM), and apply it in some of the typical areas of brand management, for example, brand identity management and brand loyalty monitoring. ETM is a particular kind of sentiment analysis based on a socio-constructivist approach and a psychodynamic model, which allows for the identification of the elements setting people’s interactions, behavior, attitudes, expectations and communication. Thus, according to a semiotic approach to the analysis of textual data, ETM allows a social profiling to be performed. This has already been applied in different fields ranging from political debate, in order to profile social media users and to anticipate their political choices (Greco, Alaimo, & Celardo, 2018; Greco, Celardo, & Alaimo, 2018; Greco et al., 2017; Greco & Polli, 2019), to the professional training effectiveness at the Sapienza University of Rome (Cordella, Greco, Meoli, Palermo, & Grasso, 2018), to brain structure (Laricchiuta et al., 2018) and to the impact of the law on society (e.g., Greco, 2016; Cordella, Greco, Carlini, Greco, & Tambelli, 2018).

This paper is structured as follows: in Section 2, we present the theoretical approach; in Section 3, we present the ETM procedure; in Section 4, ETM is applied to a case study of a famous sportswear company following the launch of a new model of sports shoes, in order to extract useful information for business decision-making; in Section 5, we discuss the theoretical contribution of the main results, as well as the managerial implications; and in Section 6, we provide the conclusion.

2. A semiotic approach to sentiment analysis

Sentiment analysis is a field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes and emotions towards entities. It is also called opinion mining, since, frequently, the sentiment is considered a personal belief or judgment which is not founded on rational reasoning, but on subjective emotion.

The use of a text mining approach to classify the sentiment of a text has been largely discussed in the literature (e.g., Balbi, Misuraca, & Scepi, 2018; Bollen, Mao, & Zeng, 2011; Ceron, Curini, Iacus, & Porro, 2014; Fronzetti Colladon, 2018; Gloor, 2017; Jeong et al., 2017; Liu, 2012; Salvatore, Gennaro, Auletta, Tonti, & Nitti, 2012). Nevertheless, a text mining procedure has to refer, implicitly or explicitly, to a sociological or a psychological theoretical approach which explains the language production and the social interaction that sets the communication exchange. In order to be rigorous, a study should match the theoretical approach to the methodological one but, surprisingly, this aspect is apparently neglected by scholars (AlAlwan et al., 2017). Most of the literature on the text mining procedure draws particular attention to the methodology (word tagging, lexical structure of the sentence, statistical procedure, etc.), focusing on the manifest content of the text.

Most methods are based on a top-down approach where an a-priori coding procedure of terms, or text, is performed focusing on the manifest content of the word. Following a top-down approach, these methods use predefined content categories to semantically classify the text (e.g., Balbi et al., 2018; Liu, 2012). Each of these categories corresponds to a thematic dictionary containing all the words indicative of the content represented by that category.

Nevertheless, as highlighted by Saussure in the Course in General Linguistics, language is a system of signs that expresses a system of meaning. Even though the top-down approaches of text mining allow for a reliable and valid investigation, they present a major limitation: they disregard the contextual nature of linguistic meaning (Carli & Paniccia, 2002; Salvatore & Freda, 2011). A term can in fact assume a specific meaning according to its association with the other terms in the text.

As stated by Liu (2012), it is not sufficient to classify the sentiment lexicon in order to perform a sentiment analysis because a term, classified as a positive or negative sentiment word, may have an opposite orientation depending on the context. In fact, the meaning of a word is polysemic and is subject to the way it combines with other words in a communicative interaction, i.e., it depends on its association with other words. For example, “bomb” usually indicates a negative sentiment, e.g., “There was another truck bomb explosion this morning at the market in Sadr City”, but it can also imply a positive sentiment of admiration, e.g., “she’s a sex bomb!”. Therefore, the presence, or absence, of sentiment words in a sentence does not necessarily imply the possibility of classifying a sentiment. That is to say, a sentence containing sentiment terms may be neutral, which happens frequently in questions or conditional sentences, and a sentence without a sentiment word may express an opinion. Moreover, sarcastic sentences, with or without sentiment words, are difficult to classify (Liu, 2012).

Based on these considerations, Emotional Text Mining (ETM) (Greco, 2016) is a text mining procedure that, by means of its bottom-up logic, allows for a context-sensitive text mining approach on unstructured data, which constitutes 95% of big data (Gandomi & Haider, 2015). ETM is an unsupervised text mining procedure, based on a socio-constructivist approach and a psychodynamic model. According to this approach, sentiment is not only the expression of a mood, but also the evidence of a latent and social thinking process that sets people’s interactions, behavior, attitudes, expectations and communication.

We know that a person’s behavior depends not only on their rational thinking but also, and sometimes most of all, on their emotional and social way of mental functioning (Carli, 1990; Moscovici, 2005; Salvatore & Freda, 2011). In other words, people consciously categorize reality and, at the same time, unconsciously symbolize it emotionally, in order to adapt to their social environment (Fornari, 1976). The conscious categorization and unconscious symbolization are two parallel mental processes that follow two different functioning rules, i.e. two logics (Matte Blanco, 1975). The unconscious symbolization is social, as people generate it interactively and share the same emotional meanings through this interaction (Greco, 2016). Since communication and behavior are the outcome of this social mental functioning, it is possible to analyze the communication (text) to infer the social mental functioning (symbolic matrix) and explain, or forecast, people’s behavior in different contexts. Moreover, explaining or forecasting people’s behavior by means of their social media communication is relevant for business management (Gloor, 2017; He et al., 2013; Lipizzi, Iandoli, & Ramirez Marquez, 2015; Liu, 2012).

Since the conscious process sets the manifest content of the communication, i.e. what is communicated, the unconscious process can be inferred through how it is communicated, namely, the words chosen to communicate and their association within the text. We consider that people emotionally symbolize an event, or an object, and socially share this symbolization. The words they choose to discuss an event, or object, are the product of the socially shared, unconscious mental functioning (Greco, 2016).

3. The Emotional Text Mining procedure

ETM is an unsupervised text mining procedure allowing for the detection of the symbolic matrix and the representation and the sentiment of an entity, e.g. a specific brand. These three elements are interconnected, as the symbolic matrix generates the representation (Carli, 1990) and the representation sets the sentiment as well as behavior (Moscovici, 2005). Moreover, they imply different levels of generalization and awareness. While a person is aware of his/her sentiment, she/he is not directly aware of the representation (Moscovici, 2005), nor is she/he aware of the symbolic matrix, which is unconscious and socially shared. For this reason, the ETM procedure allows for the detection of both the semantic and the semiotic aspects conveyed by the communication.

While the mental functioning proceeds from the semiotic level to the semantic one in generating the text, the statistical procedure simulates the inverse process of the mental functioning, from the semantic level to the semiotic one. For this reason, ETM performs a sequence of synthesis procedures, from the reduction of the type to
lemma and the selection of the keywords to the clustering and the factorial analysis, in order to identify the semiotic level (the symbolic matrix), starting from the semantic one (the word co-occurrence) (Cordella, Greco, & Raso, 2014; Greco et al., 2017).

In order to detect the associative links between the words and to infer the symbolic matrix determining their coexistence in the text, first we perform a bisecting k-means algorithm (Savaresi & Boley, 2004; Steinbach, Karypis, & Kumar, 2000), limited in the number of partitions, excluding from the classification all the texts that do not contain at least two co-occurring keywords. We have selected this clustering procedure as it is the most commonly used one in the semiotic approach (Greco, 2016). As the identification of a reliable methodology for results evaluation is still controversial in the literature (e.g., Misuraca, Spano, & Balbi, 2018), three clustering validation measures are taken into account in order to identify the optimal solution: the Calinski-Harabasz, the Davies-Bouldin and the intraclass correlation coefficient (ICC) indices.

Next, we perform a correspondence analysis (Lebart & Salem, 1994) on the cluster per keywords matrix. While the cluster analysis allows for the detection of the representations, the correspondence analysis detects the symbolic matrix.

The interpretation process proceeds from the highest level of synthesis to the lowest one, simulating once again the mental functioning. Therefore, first we interpret the factorial space according to word polarization (Greco, 2016), in order to identify the symbolic matrix setting the communication. Then, we interpret the clusters according to their location in the factorial space and to the words characterizing the context units classified in each cluster, in order to identify the representation. Finally, the sentiment is defined in relation to the elements characterizing the representations (positive, neutral, or negative), and it is calculated according to the number of messages classified in the cluster.

4. A case study

resulted in a large size corpus. In order to check whether it was possible to statistically process data, two lexical indicators were calculated: the type-token ratio and the percentage of hapax (Giuliano & La Rocca, 2010).

First, the data were cleaned and pre-processed with the software T-Lab (Lancia, version T-Lab Plus 2018) and keywords were selected. In particular, we used lemmas as keywords instead of types, filtering out the lemma of the sportswear brand and those of low rank of frequency (Bolasco, 1999; Greco, 2016). Then, on the tweets per keyword matrix, we performed a cluster analysis with a bisecting k-means algorithm based on cosine similarity (Savaresi & Boley, 2004) limited to 20 partitions, excluding all the tweets that did not contain at least two co-occurring keywords. In order to choose the optimal solution, we calculated the Calinski-Harabasz, the Davies-Bouldin and the intraclass correlation coefficient (ρ) indices.

Then, we performed a correspondence analysis (Lebart & Salem, 1994) on the cluster per keywords matrix, and the sentiment was calculated according to the number of messages classified in the cluster and its interpretation. Finally, we performed a network analysis with a community detection model, the Louvain algorithm (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008). We chose this method as it is suitable for a large network of textual data.

4.3. Findings

After the release of a new model of shoes on November 16th, 2018, the number of messages produced from November 29th to December 3rd was, on average, more than 20,000 tweets per day. The corpus preprocessing determined a loss of 10% of the messages (n = 96,361), resulting in a large size corpus of 1,313,025 tokens. On the basis of the large size of the corpus, both lexical indicators highlight its richness (TTR = 0.02; Hapax percentage = 45.0) and indicate the possibility of proceeding with the statistical analysis, which was performed with the 758 keywords selected.
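As an illustration of the two lexical indicators mentioned in this section, the following minimal Python sketch computes the type-token ratio and the hapax percentage for a tokenized corpus. It is not the T-Lab implementation: the function name `lexical_indicators`, the toy tokens, and the choice to express hapax as a share of types (rather than of tokens) are assumptions made for the example.

```python
from collections import Counter

def lexical_indicators(tokens):
    """Return (type-token ratio, hapax percentage) for a list of tokens.

    Assumption: hapax percentage = words occurring exactly once / number of types * 100.
    """
    if not tokens:
        raise ValueError("the corpus must contain at least one token")
    counts = Counter(tokens)              # frequency of each distinct word (type)
    n_tokens = len(tokens)                # corpus size in running words
    n_types = len(counts)                 # vocabulary size
    n_hapax = sum(1 for c in counts.values() if c == 1)
    return n_types / n_tokens, 100.0 * n_hapax / n_types

# Hypothetical toy corpus: 8 tokens, 6 types, 4 of which occur only once.
tokens = "new shoes drop today love the new shoes".split()
ttr, hapax_pct = lexical_indicators(tokens)
print(f"TTR = {ttr:.2f}; hapax percentage = {hapax_pct:.1f}")
# prints: TTR = 0.75; hapax percentage = 66.7
```

Under the same assumption that TTR is the number of types divided by the number of tokens, the reported TTR of 0.02 over 1,313,025 tokens would correspond to roughly 26,000 distinct types; since the reported ratio is rounded, this is only an order of magnitude.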
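The clustering step of the ETM procedure (a bisecting k-means based on cosine similarity, limited in the number of partitions, which discards texts with fewer than two keywords) can be sketched in plain Python as follows. This is a simplified illustration and not the T-Lab algorithm: the rule of always splitting the largest cluster, the deterministic choice of seeds, and all function names are assumptions introduced for the example.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def centroid(rows):
    """Unit-length mean of a non-empty list of vectors."""
    dim = len(rows[0])
    return normalize([sum(r[j] for r in rows) / len(rows) for j in range(dim)])

def two_means(rows, n_iter=20):
    """Split unit vectors into two groups (labels 0/1) by cosine similarity."""
    seed_a = rows[0]
    seed_b = min(rows, key=lambda r: cosine(seed_a, r))  # least similar row (assumed seeding)
    centers = [seed_a, seed_b]
    labels = None
    for _ in range(n_iter):
        new_labels = [0 if cosine(r, centers[0]) >= cosine(r, centers[1]) else 1
                      for r in rows]
        if new_labels == labels:          # assignments are stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels

def bisecting_kmeans(matrix, max_clusters):
    """Cluster the rows of a texts-by-keywords count matrix; returns lists of row indices."""
    # Keep only texts containing at least two keywords.
    kept = [i for i, row in enumerate(matrix) if sum(1 for v in row if v > 0) >= 2]
    unit = {i: normalize(matrix[i]) for i in kept}
    clusters = [kept]
    while len(clusters) < max_clusters:
        clusters.sort(key=len, reverse=True)   # assumed rule: bisect the largest cluster
        target = clusters[0]
        if len(target) < 2:
            break
        labels = two_means([unit[i] for i in target])
        left = [i for i, lab in zip(target, labels) if lab == 0]
        right = [i for i, lab in zip(target, labels) if lab == 1]
        if not left or not right:              # no further split is possible
            break
        clusters = clusters[1:] + [left, right]
    return clusters

# Hypothetical tweets-by-keywords matrix; columns: run, fit, gym, fashion, icon, style.
tweets = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],   # only one keyword: excluded from the classification
    [0, 1, 1, 0, 0, 0],
]
print(bisecting_kmeans(tweets, max_clusters=2))   # [[0, 1, 5], [2, 3]]
```

In practice the partition would be recomputed for several values of `max_clusters` (up to 20 in the case study) and the solutions compared with validation indices such as Calinski-Harabasz and Davies-Bouldin.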
Table 3
Brand representations and sentiment.

Cluster  Tot Tweet classified  Size  Label           Keyword  N Tweet  Sentiment
5        8,702                 9.8   Latest release  free     2,002    Love Sportswear
                                                     ship     1,982
                                                     gt       1,366
                                                     react    1,305
                                                     low      1,125
                                                     drop     1,062
                                                     Kyrie    1,017
                                                     dunk     974

brand, and support the cluster interpretation according to their location in the symbolic space (Table 2).

The five clusters are of different sizes (Table 3) and reflect different brand representations. In the first cluster, the brand is perceived as a company able to produce good-quality sportswear, used by famous sports champions, that customers appreciate over time; in the second cluster, the brand is considered as a valuable object that can be collected or exchanged by bargain hunters; in the third cluster, the brand is represented as sportswear useful for leisure activities by sport lovers, i.e. people who like to be fit; in the fourth cluster, the brand is represented as the producer of fashion sportswear. The customers seem to be more interested in the design than in the technology, as there are words like icon, good, fashion and lovely. Finally, in cluster five, the brand is perceived as a trustworthy sportswear company, as in cluster one, but customers seem to be more interested in the most recent model.

It is interesting to note that each cluster is frequently associated with a specific color and sportswear model. For example, the first cluster is associated with the model airmax and the colors black, white (Table 3), red (f = 1355) and many others (blue, grey, gold, silver, orange, green, yellow and brown) appearing in a small number of messages, ranging from 741 to 218 tweets. Moreover, it seems to be connected to gender, as the term man appears in 1625 tweets. Only in the bargain hunters’ cluster is there neither a model nor a color, while words probably connected to an evaluation appear (fuck, hate).

From the interpretation of the clusters, we detected five different representations of the brand, each connected to a specific community of Twitter users who seem to share a similar approach to the brand. As all the representations seem to be mainly positive, we grouped the representations in two sentiments: sport lovers and fashion lovers (Fig. 2).

Fig. 2. Sentiment on the sportswear brand.

We classified as sport lovers people who love the milestone model, those who love to be fit and those who like new releases, as they seem to be mostly focused on the technological innovation and its use. On the other hand, we considered the bargain hunters and the fashion customers as fashion lovers because they are more focused on the brand’s
image.

4.3.2. Community detection model with the Louvain algorithm

The community detection algorithm identified 31 communities, the size of each of which is shown in Fig. 3. There are a large number of small communities and a few larger ones. Among the largest five (community n. 01, 02, 10, 12, 22), the first community “Pair” and twelfth community “Air” are similar to cluster 1 (Milestones) and cluster 2 (Bargain Hunters) of the ETM in word composition, while the other three communities do not share a common lexical profile with the ETM clusters.

The relationship within the 758 keywords is shown in Fig. 4. Due to the large number of terms, the interpretation of the graph is relatively challenging. Comparing the two text mining procedures, the ETM and the network analysis (NA), it seems that the first one identifies a smaller number of partitions, thus being more effective in the identification of the Twitter users’ lexical profiles. Quite possibly, a top-down approach might be more effective while using the NA, as it could reduce the sparseness of the matrix.

5. Discussion

This paper presents the application of Emotional Text Mining in the field of brand management (e.g., Fronzetti Colladon, 2018), with particular emphasis on the themes of brand identity management and brand loyalty monitoring. The semiotic approach of the ETM allows for the profiling of the customers of a well-known sportswear brand by profiling Twitter users. In other words, we were able to identify Twitter users’ symbolic categories and representations of the sportswear brand, and to measure their sentiments. The case study was used as an example to illustrate the potential of ETM in the field of brand management, but its application can easily be extended depending on the analyst’s interests.

5.1. Theoretical contributions

Although our case study was limited to Twitter, which may not allow for the results to be generalized with regard to other platforms (Kapoor et al., 2018), ETM can be applied to a variety of languages and documents, from social media and media documents (e.g., Greco, 2016; Greco et al., 2017) to interviews or focus groups (e.g., Cordella, Greco, Meoli et al., 2018). Moreover, ETM applies a bottom-up approach to unstructured data, which constitutes 95% of big data (Gandomi & Haider, 2015), and could be usefully applied to this volume of data for real-time analytics, owing to the fact that the analyst’s intervention is only required for the interpretation of the output. For this reason, we think that ETM is likely to become a useful research tool due to the growth in the use of social media, and the usefulness of data analytics aimed at supporting businesses in converting large volumes of messages into meaningful information, thereby supporting decision-making (Gandomi & Haider, 2015).

Unlike the sentiment analysis based on a supervised procedure, e.g. machine learning (Ceron et al., 2016; Hopkins & King, 2010), in which the researcher’s interpretation is performed at the beginning of the analysis in order to build the training set, in ETM the interpretation is performed at the end of the statistical analysis. The advantage of the ETM approach is to identify the elements connected with a specific sentiment, as the representations are a system of values, ideas, and practices setting people’s interaction and behavior.

Due to the limited number of characters in a tweet and to its lexical peculiarity, ETM could be less accurate in the classification of messages, as it is based on a word co-occurrence logic. Nevertheless, we have addressed this problem by adopting specific keyword selection criteria (Greco et al., 2017) that allow for classifying practically all the messages.

The application of ETM is interesting, as it complements the results of market research (e.g., Dwivedi, Kapoor, & Chen, 2015; Gloor, 2017; He et al., 2013; Liu, 2012) and focuses on groups through virtually continuous monitoring of brand perception by potential customers. The advantage of applying this methodology is the extraction of structured information, which is therefore highly significant, from an unstructured collection of texts from a potentially huge quantity of data.

The application of ETM allowed us to identify four symbolic categories that set the communication about the brand: the value of the brand, the type of customer, the customer’s use of the product and the preferences revealed by the consumer. Within these symbolic categories, ETM detects five brand representations and the characteristics pertaining to each community of customers, regarding their product preferences (model, color, purchase choices, etc.) and their brand sentiment (fashion lovers or sport lovers). With regard to the use of network analysis, it is interesting to note that this methodology seems to be more appropriate for the analysis of content. However, network analysis identifies a large number of communities, which is less effective in reducing the complexity of textual data. The use of a top-down approach with a multi-stage agglomeration strategy (Balbi et al., 2018) could solve this problem.
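The community detection step of Section 4.3.2 relies on the Louvain algorithm (Blondel et al., 2008), which searches for the partition of a network that maximizes modularity. The sketch below does not implement Louvain itself; it only evaluates the modularity of a given partition of a small keyword co-occurrence network, which is the quantity the algorithm optimizes. The keyword names, edge weights and the function name `modularity` are hypothetical, and self-loops are assumed to be absent.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected weighted graph.

    edges: list of (node_u, node_v, weight) with u != v, each edge listed once.
    community_of: dict mapping each node to its community label.
    Uses Q = sum over communities c of [ L_c / m - (d_c / (2 m))^2 ], where m is the
    total edge weight, L_c the weight of edges inside c and d_c the summed degree of c.
    """
    m = sum(w for _, _, w in edges)
    intra = defaultdict(float)    # L_c: edge weight falling inside each community
    degree = defaultdict(float)   # d_c: total weighted degree of each community
    for u, v, w in edges:
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            intra[community_of[u]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical co-occurrence network: two keyword triangles joined by one edge.
edges = [
    ("run", "fit", 1), ("fit", "gym", 1), ("run", "gym", 1),
    ("fashion", "icon", 1), ("icon", "style", 1), ("fashion", "style", 1),
    ("gym", "fashion", 1),
]
two_groups = {"run": 0, "fit": 0, "gym": 0, "fashion": 1, "icon": 1, "style": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(edges, two_groups), 4))   # 0.3571
print(round(modularity(edges, one_group), 4))    # 0.0
```

A partition that separates the two lexical groups scores higher than the trivial single-community solution. With 758 keywords and a sparse matrix, many small groups of keywords can each form their own community, which is consistent with the large number of communities reported above.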
5.2. Implications for practice

In addition to these considerations, which are essentially theoretical in nature, there are some practical reasons that make the methodology discussed previously particularly interesting in view of its operational implications.

Firstly, text mining has proved to be a valuable tool in business intelligence and in social media marketing (Dwivedi et al., 2015; Lin, Li, & Wang, 2017; Xu, Wang, Li, & Haghighi, 2017). ETM is a fast, cheap and simple way to extract more meaningful information from large collections of texts. Indeed, the application of ETM greatly reduces the complexity of textual data, while preserving its information content. Such reduction is performed through the classification of texts and the identification of factors that explain the diversity between the different clusters. The use of an unsupervised procedure allows for obtaining results without any prior intervention, as there is no need to train a classification algorithm. The only intervention required is the final interpretation of the results by an operator.

The second practical reason is that reading and interpreting the results becomes relatively straightforward. As we have clarified above, considering that the scraping of Twitter or any other social media can lead to the collection of hundreds of thousands of texts, it appears important to extract from this large amount of textual data only the essential information, which in the case of ETM is easy to achieve even for the average user. After becoming familiar with these tools, the social media specialist could drastically reduce the time spent in reading and classifying the textual data (often done manually, an activity which could lead to misclassification) and focus on the more creative stages of his/her work. For the same reason, ETM can easily be implemented by small firms (Braojos-Gomez, Benitez-Amado, & Llorens-Montes, 2015) that are willing to develop a social media competence.

Finally, the radical simplification of the textual data preprocessing step opens up the possibility of repeating the surveys frequently and with little effort, making the monitoring of target markets on social media a virtually continuous activity, carried out in real-time. This aspect is very important, as it makes the social media specialist more responsive in detecting the rise of new trends, aspirations, and needs.

To summarize, the introduction of ETM in the social media manager’s task list can provide clear advantages in terms of cost reduction, limiting the most time-consuming activities and, ultimately, increasing productivity and effectiveness in identifying customers’ profiles and social media communities.

6. Conclusion

The widespread use of the Internet and the constant increase in users of social media platforms have made a large amount of textual data available. This represents a valuable source of information regarding
changes in the opinions and feelings of people, with reference to the d’Analyse Statistque des Données Textuelles (pp. 173–184). Paris, FR: JADT.org.
most disparate topics. The extraction of textual data from a social media Cordella, B., Greco, F., Carlini, K., Greco, A., & Tambelli, R. (2018). Infertilità e pro-
creazione assistita: evoluzione legislativa e culturale in Italia. Rassegna di Psicologia,
platform allows for the collection of a large amount of data, typically 35(3), 45–56. https://doi.org/10.4458/1415-04.
unstructured, in a reasonably short time. It is, therefore, necessary to Cordella, B., Greco, F., Meoli, P., Palermo, V., & Grasso, M. (2018). Is the educational
apply technologies and methods of analysis to big data, aimed at ex- culture in Italian Universities effective? A case study. In D. F. Iezzi, L. Celardo, & M.
Misuraca (Eds.). JADT’ 18: Proceedings of the 14th International Conference on
trapolating from this mass of textual data information and, ultimately, knowledge useful for businesses and their brand managers.
After a brief survey of the literature, we presented the results of ETM applied to a typical problem of brand management, related to the management of brand identity and the monitoring of brand loyalty. The results obtained make it possible to identify the area in which this method provides the best results and to highlight its main limitations. More specifically, ETM seems to be more effective on large collections of textual data, when the aim is to identify communities of potential customers, with reference to both their perception of brand value and their brand loyalty.
Hence, ETM seems to be applicable in the field of brand management, as it allows for the identification of groups of customers who, according to their lexical profile, share the same brand representation. Moreover, this method allows for the identification of the general categories which can be used to organize communication about the brand on social media. Although this is a case study, our research could easily be enriched by combining the structured information obtained by applying ETM with the profiling data of Twitter users. This would then allow for the definition of profiles that correspond to specific customer segments, with significant advantages in terms of cost and timeliness in obtaining results.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Grover, P., Kar, A. K., Dwivedi, Y. K., & Janssen, M. (2018). Polarization and acculturation
References in US Election 2016 outcomes–can twitter analytics predict changes in voting preferences.
Technological Forecasting and Social Changehttps://doi.org/10.1016/j.techfore.
2018.09.009.
Afful-Dadzie, E., & Afful-Dadzie, A. (2017). Liberation of public data: Exploring central He, W., Zha, S., & Li, L. (2013). Social media competitive analysis and text mining: A case
themes in open government data and freedom of information research. International study in the pizza industry. International Journal of Information Management, 33(3),
Journal of Information Management, 37(6), 664–672. 464–472.
AlAlwan, A., Rana, N. P., Dwivedi, Y. K., & Algharabat, R. (2017). Social media in Hopkins, D. J., & King, G. (2010). A method of automated nonparametric content analysis
marketing: A review and analysis of the existing literature. Telematics and Informatics, for social science. American Journal of Political Science, 54(1), 229–247.
34(7), 1177–1190. Iezzi, F. D. (2012). Centrality measures for text clustering. Communications in Statistics –
Antonacci, G., Fronzetti Colladon, A., Stefanini, A., & Gloor, P. (2017). It is rotating Theory and Methods, 41(16–17), 3179–3197.
leaders who build the swarm: Social network determinants of growth for healthcare Jeong, B., Yoon, J., & Lee, J. M. (2017). Social media mining for product planning: A
virtual communities of practice. Journal of Knowledge Management, 21(5), product opportunity mining approach based on topic modeling and sentiment ana-
1218–1239. lysis. International Journal of Information Management. https://doi.org/10.1016/j.
Aswani, R., Kar, A. K., Ilavarasan, P. V., & Dwivedi, Y. K. (2018). Search engine marketing ijinfomgt.2017.09.009.
is not all gold: Insights from twitter and SEOClerks. International Journal of Jimenez-Marquez, J. L., Gonzalez-Carrasco, I., Lopez-Cuadrado, J. L., & Ruiz-Mezcua, B.
Information Management, 38(1), 107–116. (2019). Towards a big data framework for analyzing social media content.
Balbi, S., Misuraca, M., & Scepi, G. (2018). Combining different evaluation systems on International Journal of Information Management, 44, 1–12.
social media for measuring user satisfaction. Information Processing & Management, Kapoor, K. K., Tamilmani, K., Rana, N. P., Patil, P., Dwivedi, Y. K., & Nerur, S. (2018).
54(4), 674–685. Advances in social media research: Past, present and future. Information Systems
Blondel, V. D., Guillaume, J. G., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of Frontiers, 20(3), 531–558.
communities in large networks. Journal of Statistical Mechanics Theory and Experiment, Lancia, F. (2018). User’s manual: Tools for text analysis. T-Lab version Plus 2018.
10, 1–12. Laricchiuta, D., Greco, F., Piras, F., Cordella, B., Cutuli, D., Picerni, E., Assogna, F., Lai, C.,
Bolasco, S. (1999). Analisi multidimensionale dei dati: metodi, strategie e criteri d’interpre- Spalletta, G., & Petrosini, L. (2018). “The grief that doesn’t speak”: Text mining and
tazione. Roma, IT: Carocci. brain structure. In D. F. Iezzi, L. Celardo, & M. Misuraca (Eds.). JADT’ 18: Proceedings
Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of of the 14th International Conference on Statistical Analysis of Textual Data (pp. 419–
Computational Science, 2(1), 1–8. 427). Rome, IT: Universitalia.
Braojos-Gomez, J., Benitez-Amado, J., & Llorens-Montes, F. J. (2015). How do small firms Lebart, L., & Salem, A. (1994). Statistique textuelle. Paris, FR: Dunod.
learn to develop a social media competence? International Journal of Information Lin, X., Li, Y., & Wang, X. (2017). Social commerce research: Definition, research themes
Management, 35(4), 443–458. and the trends. International Journal of Information Management, 37, 190–201.
Carli, R. (1990). Il processo di collusione nelle rappresentazioni sociali. Rivista di Lipizzi, C., Iandoli, L., & Ramirez Marquez, J. E. (2015). Extracting and evaluating con-
Psicologia Clinica, 3, 282–296. versational patterns in social media: A socio-semantic analysis of customers’ reactions
Carli, R., & Paniccia, R. M. (2002). L’Analisi Emozionale del Testo: Uno strumento psicologico to the launch of new products using twitter streams. International Journal of
per leggere testi e discorsi. Milano, IT: Franco Angeli. Information Management, 35(4), 490–503.
Ceron, A., Curini, L., & Iacus, S. M. (2016). ISA: A fast, scalable and accurate algorithm for Liu, B. (2012). Sentiment analysis: Mining opinions, sentiments, and emotions. Sentiment
sentiment analysis of social media content. Information Sciences, 367–368, 105–124. analysis: Mining opinions, sentiments, and emotions. Morgan & Claypool1–367.
Ceron, A., Curini, L., Iacus, S. M., & Porro, G. (2014). Every tweet counts? How sentiment Matte Blanco, I. (1975). The unconscious as infinite sets: An essay in bi-logic. London, UK:
analysis of social media can improve our knowledge of citizens’ political preferences Duckworth.
with an application to Italy and France. New Media & Society, 16(2), 340–358. Misuraca, M., Spano, M., & Balbi, S. (2018). BMS: An improved Dunn index for document
Chandler, J. D., Salvador, R., & Kim, Y. (2018). Language, brand and speech acts on clustering validation. Communications in Statistics: Theory and Methods, 1–14.
Twitter. Journal of Product and Brand Management, 27(4), 375–384. Moscovici, S. (2005). Le rappresentazioni sociali. Bologna, IT: Il Mulino.
Cordella, B., Greco, F., & Raso, A. (2014). Lavorare con Corpus di Piccole Dimensioni in Rekik, R., Kallel, I., Casillas, J., & Alimi, A. M. (2018). Assessing web sites quality: A
Psicologia Clinica: Una Proposta per la Preparazione e l’Analisi dei Dati. In E. Nee, M. systematic literature review by text and association rules mining. International Journal
Daube, M. Valette, & S. Fleury (Eds.). Actes JADT 2014, 12es Journées internationales of Information Management, 38, 201–216.
Francesca Greco received her PhD in Sociology at the Sapienza University of Rome and her PhD in Psychology at the University of Paris Descartes. She is currently the Research Manager of Prisma S.r.l., and she is qualified as Associate Professor in General Sociology. She is assistant professor in “Quantitative models for socio-economic analysis” at the Sapienza University of Rome and a member of the Italian Sociological Association and of the Italian Statistical Society. She is an expert in textual analysis and has developed a text mining procedure to perform social profiling. Her areas of interest are focused on psychosocial processes in the fields of health care, disability, organizational management, political debate and deviance.

Alessandro Polli received his PhD in economic analysis of social phenomena at the Sapienza University of Rome, where he currently teaches economic statistics and quantitative methods for economics. He has been scientific advisor to several public institutions, such as the Bank of Italy and the Italian Presidency of the Council of Ministers. He is a member of the Italian Society of Economics, Demography, and Statistics. His research fields are statistical methods for market research, sustainable development and quality of life, and the assessment of the economic impacts of migration and of new technologies on the job market.