
Discourse, Context and Media 7 (2015) 28–36


journal homepage: www.elsevier.com/locate/dcm

The form and function of quoting in digital media


Cornelius Puschmann *
Zeppelin University Friedrichshafen, Faculty of Social Sciences, Am Seemooser Horn 20, 88045 Friedrichshafen, Germany

Article info

Article history:
Received 29 September 2014
Received in revised form 19 January 2015
Accepted 23 January 2015
Available online 11 February 2015

Keywords:
Quoting
Content sharing
Computer-mediated communication
Social media
Twitter
Retweeting

Abstract

In this article, we discuss the function of quoting and information sharing in social media services and argue that certain aspects of quoting point to similarities with oral culture, where the social functions of sharing complement the aim to inform or disseminate information. We approach the issue by first providing a brief historical account of content sharing practices from the early days of the Internet to the contemporary social media environment, in which content sharing is both prevalent and facilitated by platform architecture. We then conduct an exploratory quantitative content analysis of three Twitter hashtags relating to different topics, and link their structural variation to the different content sharing practices prevalent in them. We conclude by arguing that the social use of quotation in social media discourse can be a predictor of community structure, but that the degree to which this is the case differs locally.

© 2015 Elsevier Ltd. All rights reserved.

* Tel.: +49 7541 6009 1321.
E-mail address: [email protected]

http://dx.doi.org/10.1016/j.dcm.2015.01.001
2211-6958/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction: technology and the reproduction of discourse

Strategies for relating another speaker's words have considerable socio-communicative relevance and accordingly belong to the linguistic repertoire of many languages. Both oral quoting (Tannen, 1989) and textual quoting (Moore, 2011) are longstanding areas of interdisciplinary inquiry and raise interesting theoretical and conceptual issues, both for linguistic pragmatics and for discourse analysis (Buchstaller and van Alphen, 2012).

The pragmatic dynamics of textual quoting are at once shaped by situational factors (the relation between the writer, the reader, and the related discourse) and by the technology of reproduction, and they react to change, both formally and functionally, as an increasing number of instruments for quoting are at the disposal of writers. Bakhtin highlighted the pragmatic volatility of speech reproduction and its potential for creative expression when he argued that “the relationship to another's words was equally complex and ambiguous in the Middle Ages… the boundary lines between someone else's speech and one's own speech were flexible, ambiguous, often deliberately distorted and confused” (Bakhtin, 1981, p. 69). Oral reproduction places significant cognitive demands on both speakers and listeners, as both must be able to assign different discourse roles correctly in the absence of a physical speech situation, and different conventions, both of production and of interpretation, exist to deal with the discourse of others (Harnad, 1995). Scholars from a variety of disciplines have investigated the formal, functional, and cultural dimensions of speech reproductions in settings such as scholarship, journalism, political discourse, and everyday life. In this article we examine the role that technology plays in shaping the form and function of quoting, and provide evidence for the discursive affordances of quoting in digitally mediated discourse, using Twitter as our example. Our main argument will be that the function of quoting is locally configured and that its meaning differs not just between different channels of communication, but from one community to the next. A range of strategies are used to represent quotation in print, such as different kinds of quotation marks, indentation, font style and color, and yet more are common in computer-mediated communication (see Kirshenblatt-Gimblett (1996) and Herring (1999), for two early accounts). In addition to dramatically increasing the means by which text can be marked up in digital documents, technology has also changed the way in which a piece of writing can be copied, from mechanical reproduction (i.e. photocopying) and digitization (scanning and optical character recognition to digitize printed text) to digital reproduction (i.e. use of an operating system's copy and paste function), and, finally, content sharing functions such as liking, retweeting and reblogging. Arguably, the techniques available for quoting have become both easier to use and more powerful over time, and as a result their popularity has increased.

2. Interdisciplinary perspective on quoting across media

Scholars have conceptualized quotation in several distinct ways, based on their theoretical orientation and preferred analytical approach. Abbott (2003) provides a concise overview of research from (predominantly) linguistic semantics and pragmatics, and discusses some of the
recurring themes, such as the formal distinction between open vs. closed quotes (Recanati, 2001) and the conceptual difference between quotation as description vs. quotation as demonstration (Clark and Gerrig, 1990). Most of the approaches discussed by Abbott seek in one way or another to distinguish quotation from non-quotation, and to establish formal differences between distinct types of quotation. Sociolinguistic and discourse analytical studies form another direction of linguistic research, which has tended to focus more strongly on the social and interactional aspects of reproduction, particularly in spoken discourse. A central object of interest has been quotatives – devices that signal the reproduction of spoken discourse in spoken language – and their historical development and proliferation (Buchstaller and van Alphen, 2012; Macaulay, 2001; Romaine and Lange, 1991; Tagliamonte and D'Arcy, 2004).

Accounts that focus on the role of technology for the production and interpretation of textual quotations are somewhat rarer. Moore (2011) provides such an account, focusing on the historical development of media technology in tandem with the formal and functional development of quoting. The evolution of typographical standards for quoting took place alongside general standardization: printers eventually settled on specific markers and discarded others as the meaning of quotations became conventionalized. The written reproduction of writing also gave rise to a set of norms different from the more generous conventions of reproducing spoken discourse orally. Reporting speech in writing, e.g. in the news media, is very tightly bound to conventions of precision which are an integral part of journalistic ethics, just as citing sources correctly is paramount to scholarly practices (Zelizer, 1989). In journalism “quotes should be faithful to the words and meaning of the speaker” (Clark, 1995, para 1), a norm that also applies to scholarship, and to many formal written genres. While the importance of faithful reproduction holds both for writing and for speech, truly verbatim reproduction is unrealistic in many contexts of spoken language use (cf. Clark and Gerrig, 1990, p. 795). While technically available to anyone, the proliferation of written quotation beyond specific communities of practice seems to be a recent development, especially when examining the evolution of quoting in computer-mediated communication (CMC). The technological means of reproduction – ‘copy and paste’ in older forms of CMC, and buttons that allow easy sharing, retweeting and reblogging in contemporary social media platforms – arguably impact the role of quotation in CMC more broadly, especially when taking into account the large user communities that engage in the production and reproduction of information in CMC contexts (Kwak et al., 2010). Assuming a view of quotation that emphasizes its characteristics as shaped by technology (and technology in turn being appropriated by a range of actors in a variety of sociocultural settings) therefore introduces a new dynamic into the study of quoting. Many of the relevant influences are issues that apply more broadly to discourse analysis in computer-mediated settings and relate to specificities of CMC, such as the combination of spontaneous production with the permanency of data storage, the influence of technology on the shape of the discourse, or the relation of the discourse produced to the community that produces it. Bolander and Locher (2014) and Giles et al. (2014) provide valuable overviews of central issues in sociolinguistic and discourse analytic perspectives on CMC that are important in this context. In what follows, we will trace some of the technological determinants of quoting on the Web.

3. From quoting in early CMC to sharing in social media

Computer-mediated communication has changed considerably with the rise and proliferation of the Internet since the 1980s and the emergence of the World Wide Web in the 1990s, with important implications for quoting and content sharing. While CMC was initially tied to the closely controlled environment of desktop computing, it has since then become mobile and ubiquitous through laptops, smartphones and tablets, all of which support a broad range of applications which are effectively synchronized through wireless networking or mobile Internet services. The line between synchronous and asynchronous interpersonal communications, and closed 1-to-N messaging systems where content is principally open to anyone is increasingly blurred, as web sites converge with apps on mobile devices (Herring, 2007). The broad usage of mobile devices makes the interaction with existing content increasingly attractive, as platforms and services that enable co-creation and blur the boundary between producers and consumers proliferate (Bruns, 2008). Rather than just providing content that can be passively used, with a relatively high barrier for content creation, social media environments place a strong emphasis on interaction without the need to invest much time, for which information sharing is an ideal instrument. A second noticeable change in CMC is the shift from an open Web to platforms. Services such as Twitter and Facebook depend on measurable user interaction in order to generate data that makes user engagement visible (Gerlitz and Helmond, 2013). Original content creation is just one proxy for engagement; another is the sharing and retweeting of content produced by others. Quoting in CMC blends a technique well-established in print culture with the affordances of a new technology, by means of countless functions that allow the redistribution of content through the push of a button.

While initially information-sharing was a key development aim of both the Internet as a decentralized network and the World Wide Web as a service based on an open hypertext standard, technical intricacies and high costs made it largely impossible for most early users to contribute content. Users of the 1990s Internet were mostly confined to the role of readers, downloaders and consumers, rather than content producers. The facilities for redistributing information were limited and the content itself was largely textual. Many users were introduced to digital textual quoting through email, while some were already familiar with the conventions of inline text production through newsgroups and message board systems (Herring, 1999). Both email and newsgroups offered means of replying to others that incorporated quoting, though compared to social media, the means were still relatively cumbersome. Herring (1999, p. 8) characterized quoting in early CMC as a means of “creating the illusion of adjacency” in a sequence of email messages. Increasingly, such an illusion can be at once discursive and social. Content sharing in social media platforms generally creates a visible link between the quoter and the quotee, intuitively making it a means of establishing affiliation between two users. This is possible although the person being quoted may not be consciously aware of the fact that they are being quoted, or may not agree to it. It is in this vein that Boyd et al. (2010) argue that retweets in Twitter are not just a form of content diffusion, but allow users “to validate and engage with others” (p. 1), and that Page (2012) notes their potential to “display connection with others or to signal influence” (p. 183).

Sharing content is also a vastly popular activity online. It ranks among the most popular activities on a wide array of social Web platforms, such as social networking sites, blogging and microblogging services. Not only have functions related to content-sharing become central in services such as Twitter by supporting specific platforms from which the content is taken and by increasingly offering facilities to embed and preview the material (for example videos posted on YouTube), but new services built specifically around content sharing have also emerged, such as Tumblr and Pinterest. Tumblr is a hybrid social networking site and microblogging platform designed to share content by posting it to the user's tumblelog. Different formats such as photo, (textual) quote, link, chat, audio, and video are supported. Objects shared by users are visible in their tumblelog, the equivalent of the Facebook timeline. They can be reblogged to one's own tumblr, but it is also possible to allow other
users access in order to create a collaboratively curated site. The results are colorful collections of Internet content from a variety of sources, photos posted in blogs and on a variety of news outlets, YouTube videos and other sources. These examples point to the convergence of quoting and sharing in social media environments. Sharing can be regarded as part of what Scott (2009) refers to as the “gift economy of the Web”, in which social relationships are negotiated and stabilized through a set of symbolic interactions. While quoting in formal print genres such as journalism and scholarship serves primarily informational and argumentative purposes, content sharing in locally configured online communities is a component of this gift economy.

4. Retweeting as a form of quoting and information sharing

In the following, we will take a closer look at one particular example of quoting in computer-mediated communication that highlights the relationship of traditional quoting with content sharing: the practice of retweeting on Twitter. Twitter is increasingly studied by researchers from a variety of fields for its role in self-presentation and promotion, the proliferation of news, political debate, online activism, scholarship, and popular culture (Page, 2012; Bastos et al., 2013; Burgess and Bruns, 2012; Potts et al., 2014; Mahrt et al., 2013; Highfield et al., 2013). Specific communicative strategies such as sending messages through use of the @-sign (Honeycutt and Herring, 2009) and retweeting (Boyd et al., 2010) have also been investigated, as have linguistic and cultural differences in the adoption and usage of the service (Bamman et al., 2014).

Twitter allows users to post messages (tweets), point to Web pages and communicate both publicly and privately with others. From a reader's perspective, Twitter creates a composite stream of the posts of users that the reader is following (the all friends view). Users communicate with each other by sending direct messages (which are private) or by using the @ character followed by the user name of the addressee. Two additional strategies make it easier to become aware of tweets from other users: the use of hashtags and passing on the tweets of others to one's own followers (retweeting). While retweeting disseminates information from another source only to one's own followers, the use of a hashtag makes a tweet visible to all users actively searching for that hashtag without the need to follow the tweet's creator, thus creating an ad-hoc discursive space for an event or topic (e.g. royalwedding, fifa, acta). While Twitter opens new possibilities of study for discourse analysis, it also poses challenges related to the quantity, quality and granularity of data, as well as ethical issues that arise when working with publicly accessible social media data (see Giles et al. (2014), for an in-depth discussion).

Page (2012) considers Twitter as a linguistic marketplace on which users discursively act out an identity, and sees retweeting as one resource that is enlisted to achieve that goal (p. 183). Boyd et al. (2010) argue that beyond simply passing on messages for the sake of their informational value, retweeting plays a crucial role in mediating relationships between the retweeter, his/her followers and the retweetee. Users who are frequently retweeted can be assumed to receive attention beyond the circle of their immediate followers, which can either be interpreted as a sign of popularity, or as an indicator of the value of their contributions to the community. While this is not in itself a new aspect, the ease of passing on information in an environment such as Twitter in contrast to quoting in scholarly or journalistic contexts is likely to strengthen the phatic function of quotation. Retweeting takes on an important role in specific usage contexts of Twitter among certain communities of practice, for example to disseminate information and as a face-enhancing instrument in academic communities (Mahrt et al., 2013). Retweets can be interpreted as intra-Twitter citations in some discourses, taking on a variety of functions depending on their users and context of usage. Retweeting also serves as a model for how discourse conventions emerge in the environment of a digital social network. Kooti et al. (2012) studied the emergence of different markers to denote retweets. They were able to trace the origin of the convention to a small network of densely connected and influential users through whom it spread through the entire network to become a fixed convention. By being integrated into the interface, the convention of using “RT” to denote retweets was gradually standardized and several competing conventions were replaced by a single norm. Potts et al. (2014) study retweeting among political activists in the UK and note the lack of original content circulating among the selection of activist accounts they analyze. In line with these findings, Bastos et al. (2013) find a group of serial activists that tweet across many different hashtags to promote a variety of causes, and accordingly diverge strongly from users in other communities. The most comprehensive inquiry into the form and function of retweeting comes from Boyd et al. (2010, p. 6) who approached the topic when Twitter was still in its infancy. Based on an informal survey conducted by Danah Boyd among her own followers, users have the following motives for retweeting other users' messages:

• to amplify or spread messages to new audiences;
• to entertain or inform a specific audience;
• to comment on someone's tweet;
• to make one's presence as a listener visible;
• to publicly agree with someone;
• to validate others' thoughts;
• to recognize or refer to less popular people or less visible content;
• to gain followers;
• to save tweets for future access.

Boyd's collection suggests that users have a range of complex motives for retweeting messages, ranging from personal gain to informing others. Our analysis will aim to connect these motives with the diverging structural properties of retweeting across three different Twitter communities.

5. A case study: retweeting in three hashtag publics

How do retweeting practices differ and vary across user communities? A helpful concept in this context is that of hashtag publics (Bruns and Burgess, 2011), groups of users that form in an ad-hoc fashion around a particular hashtag and have a relatively loose social structure. While social media services like Twitter provide a platform for communication, the respective practices of each user community are different, and generalizations across the entire population of Twitter users are difficult to make. In addition to practices varying from one user to another, they also vary from one discourse context to the next, depending on who the target audience is and what communicative conventions are in place. An example of this type of variation is the use of hashtags on Twitter. A hashtag can have a variety of different functions to its users, not just individually, but also depending on whether it is used to describe a topic, event, or issue. While some hashtags are used playfully to denote specific concepts, others serve as a discursive space in which communities emerge. In the following, three hashtags will be longitudinally described in terms of how the retweeting in them correlates with the sophistication of their community structure.

5.1. Data and methods

The three hashtags examined for this study, phdchat, bigdata and GoT represent three very different communicative spaces. While phdchat and bigdata are thematic, relating to graduate student life and information technology, respectively, GoT is used by fans of the television series Game of Thrones. While the three hashtags cover markedly different topics, they are what Highfield et al. (2013, p. 321) refer to as “topical”, in contrast to emotive hashtags such as fail or facepalm. Zappavigna (2011) sees such emotive hashtags as a source of ambient affiliation, rather than just being topical rallying points for particular communities. Page (2012) distinguishes between topical and evaluative hashtags in her analysis, finding the first type to be much more common than the second (p. 188).

Tweets with the three hashtags were collected for a period of three months each in 2012 and 2013 using the server-based software yourTwapperKeeper, which then relied on a combination of the Twitter Search and Twitter Streaming API (for a detailed discussion of this approach, see Bruns et al. (2012)). While there are known consistency issues associated with the completeness of samples drawn from the Twitter APIs, we assume that our three samples are internally consistent as a result of having used a fixed period of collection, and that the characteristics under examination are unlikely to be affected by sampling issues. The three-month sampling window in the present study places an emphasis on long-term contribution to the hashtag, resulting in a much higher mean number of tweets per user for phdchat and bigdata than for GoT, since backchannel discussion of broadcast media tends to be highly ephemeral. In their overview of studies examining the use of Twitter at academic conferences, Mahrt et al. (2013, p. 403) found a mean tweet-user ratio of 8:1 across 12 scholarly conferences in different fields, echoing the ratio on phdchat, though high levels of participation are easier to uphold across the short time span of a conference than in a virtual discourse space such as phdchat over several months. Highfield et al. (2013, p. 324) report a mean tweet-user ratio of 3:1 in their discussion of the eurovision hashtag, consistent with the low mean in GoT. The three hashtags also differ with regard to the percentage of tweets that are retweets, with 20% in phdchat, 39% in bigdata, and 15% in GoT. Larsson et al. (2011, p. 736) report a share of 33% retweets in their analysis of the Swedish national election hashtag val2010 and Bruns and Highfield (2013, p. 685) report approximately 30% retweets in the election-related hashtag qldvotes. Bastos et al. (2013) examine a dataset of 455 hashtags containing 8.4 million tweets by 3.8 million users, finding a retweet percentage of 34% and a tweet-user ratio of 2.2:1. By contrast, Page et al. (2012, p. 186) reports much lower percentages (<10%) of retweets and a high percentage of @-messages (up to 30%) in a corpus drawn from individual accounts, suggesting that retweeting is less common outside of hashtags than it is within, and that political debates attract more retweeting than entertainment. Table 1 provides an overview of the data.

5.2. Activity over time

The fact that the first two hashtags are associated with open themes to which users can contribute discussion and information has an impact on the volume of contribution over time. Fig. 1 shows tweets over time in the three hashtags over the three-month period of data collection, with different scales on the vertical axis to accommodate the variation in volume. While the phdchat hashtag is also used for ongoing discussion, its kernel is a moderated weekly discussion that takes place each Wednesday evening. Accordingly, there are predictable weekly spikes, the strongest of which in the sampled period occurred on April 4th, 2012, when the topic of chat was “Blogging about your research” (Phdchat, 2013). Similarly, bigdata also shows a temporal pattern of activity, with high activity characterizing the middle of the work week, and low activity on the weekend. Third, GoT shows a significant spike of over 12,000 tweets on a single day on June 3rd, 2012, when the final episode of the show's second season was aired in the U.S., capturing an audience of over four million viewers. All three hashtags have median activity levels of several hundred (phdchat, GoT) to several thousand (bigdata) tweets per day.

5.3. Sporadic users vs. regulars

How much of the activity was produced by users who continually contributed to the conversation, rather than just posting a single tweet? We assessed this by counting the number of days that each user was active under the respective hashtag (defining ‘being active’ as having tweeted once or more), rather than simply counting the number of total tweets contributed by the user. In theory, a user could be active every single day, and some come

Table 1
Description of three hashtag datasets, with mean number of tweets/user (x̄), standard deviation (σ) and percentage of retweets (% RT).

Hashtag  Topic                                                        Period                        Tweets   Users   x̄    σ     % RT
phdchat  Pursuing a PhD, higher learning                              Apr 1st 2012 – Jun 30th 2012  21,735   3045    7.1  36.4  20
bigdata  News and discussion on (mostly) technical issues of data/IT  Nov 1st 2012 – Jan 31st 2013  215,263  45,074  4.8  67.3  39
GoT      HBO television series Game of Thrones                        Jun 1st 2012 – Aug 31st 2012  100,053  60,989  1.6  3.3   15
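
The mean number of tweets per user in Table 1 follows directly from the Tweets and Users columns. As a minimal sketch (the dictionary layout and the function name are illustrative assumptions, not part of the original study), the arithmetic can be reproduced as follows:

```python
# Tweet and user counts copied from Table 1.
# The dictionary layout and function name are illustrative assumptions.
datasets = {
    "phdchat": {"tweets": 21_735, "users": 3_045},
    "bigdata": {"tweets": 215_263, "users": 45_074},
    "GoT": {"tweets": 100_053, "users": 60_989},
}


def mean_tweets_per_user(tweets: int, users: int) -> float:
    """Return the tweet-user ratio (mean number of tweets per user)."""
    return tweets / users


# Round to one decimal place, as in the table.
ratios = {
    name: round(mean_tweets_per_user(d["tweets"], d["users"]), 1)
    for name, d in datasets.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio} tweets per user")
```

The standard deviation and the retweet percentage cannot be recomputed this way, since both require the per-user and per-tweet data rather than the totals.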

Fig. 1. Tweets over time for phdchat (a), bigdata (b), and GoT (c).

Fig. 2. Regular users ranked by days of activity for phdchat (a), bigdata (b), and GoT (c).
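
The regulars plotted in Fig. 2 are defined in Section 5.3 as the users whose number of active days lies above the ninth decile. The sketch below illustrates one way such a selection could be computed. The input format (pairs of user name and day), the function names, and the choice of decile estimator are assumptions made for illustration; the article does not specify how ties at the cutoff were handled.

```python
import statistics
from collections import defaultdict


def days_active(tweets):
    """Map each user to the number of distinct days on which they tweeted.

    `tweets` is an iterable of (user, day) pairs; this input format is an
    assumption made for the sketch. Several tweets on one day count once.
    """
    days = defaultdict(set)
    for user, day in tweets:
        days[user].add(day)
    return {user: len(user_days) for user, user_days in days.items()}


def regulars(active_days):
    """Return the users whose active-day count lies above the ninth decile.

    statistics.quantiles with n=10 returns the nine decile cut points;
    the last one is the ninth decile. With many tied counts, the selected
    group can be smaller or larger than exactly 10% of users.
    """
    cutoff = statistics.quantiles(list(active_days.values()), n=10)[-1]
    return {user for user, n_days in active_days.items() if n_days > cutoff}


# Hypothetical toy data: two tweets by "ann" on the same day count as one day.
toy_tweets = [
    ("ann", "2012-04-04"), ("ann", "2012-04-04"), ("ann", "2012-04-11"),
    ("bob", "2012-04-04"),
    ("cem", "2012-04-04"), ("cem", "2012-04-05"), ("cem", "2012-04-11"),
]
print(days_active(toy_tweets))
```

Counting distinct days rather than tweets means that a single burst of many tweets on one evening does not by itself make a user a regular.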

Table 2
User feature correlations in phdchat.

              ats_sent  ats_received  rts_given  rts_received  urls_tweeted  days_active
ats_sent      1         0.68          0.18       0.22          0.09          0.33
ats_received  0.68      1             0.19       0.19          0.04          0.33
rts_given     0.18      0.19          1          0.21          0.45          0.23
rts_received  0.22      0.19          0.21       1             0.43          0.33
urls_tweeted  0.09      0.04          0.45       0.43          1             0.42
days_active   0.33      0.33          0.23       0.33          0.42          1

Table 3
User feature correlations in bigdata.

              ats_sent  ats_received  rts_given  rts_received  urls_tweeted  days_active
ats_sent      1         0.29          0.07       0.23          0.15          0.20
ats_received  0.29      1             0.11       0.29          0.16          0.21
rts_given     0.07      0.11          1          −0.05         0.22          0.21
rts_received  0.23      0.29          −0.05      1             0.38          0.40
urls_tweeted  0.15      0.16          0.22       0.38          1             0.79
days_active   0.20      0.21          0.21       0.40          0.79          1

close to this mark, with 90 days (bigdata), 85 days (phdchat), and 71 days (GoT) as the maximum values in the three hashtags. However, only very few accounts show such a high level of participation, and both in bigdata and GoT some of the activity comes from organizational or automated accounts, whereas in phdchat tweets are contributed almost exclusively by individuals. For our further analysis, we designated those users with the values above the ninth decile of activity (the upper 10% of the sample) as regulars. Fig. 2 shows a plot of user activity for the regulars in each of the three hashtags. While the rank curve is least steep in phdchat, bigdata also has a large number of regular contributors. The skew is considerably more pronounced in GoT, where a large share of users contributes only quite sporadically. The user bases of phdchat and bigdata are thus broader, with more users contributing on a relatively regular basis in the period under study.

5.4. Correlations of communicative features

What are structural differences in the communicative activities that regular users engage in under the three hashtags, and how are these activities internally correlated? To answer this question, we calculated Kendall's tau rank correlation coefficient (τ) for six Twitter features for each user: the number of @-messages sent (ats_sent), the number of @-messages received (ats_received), the number of retweets given (rts_given), the number of retweets received (rts_received), the number of tweets containing a URL (urls_tweeted), and the number of active days (days_active). Kendall's tau was chosen in favor of Pearson's r, as feature scores were not normally distributed and there was a strong presence of outliers capable of causing coefficient overestimation. Features are also partially confounded, as ats_sent, rts_given and urls_tweeted are dependent on the total number of tweets, and urls_tweeted is dependent on rts_given. We also restricted the correlational analysis to regulars, translating into 332 (phdchat), 4888 (bigdata), and 6887 (GoT) users overall. This was done both to keep correlation matrices at a computationally manageable size, and because an excess of very low feature values tends to inflate correlation scores. Tables 2–4 give an overview of the results.

In phdchat, a positive correlation between sharing hyperlinks and both giving and receiving retweets exists, indicating that users place strong emphasis both on disseminating information and on acknowledging others who do. While strictly speaking the two variables are confounded, the lack of such a correlation in GoT shows that long-term engagement does not automatically mean that a community structure emerges. There is also a strong positive correlation between sending and receiving @-messages in phdchat, suggesting not only that users engage in this activity, but also that sending @-messages is reciprocated in the community. By contrast, this is not the case with retweets, which are not generally reciprocated. Finally, posting tweets that contain hyperlinks is positively correlated with the duration of activity, suggesting that those users who contribute over longer periods also post more informative content than occasional contributors. In bigdata, long-term engagement is even more strongly correlated with the number of hyperlinks shared than in phdchat, emphasizing the hashtag's focus on information sharing, rather than engaging in discussion. There is also a weak positive correlation between the period of activity and receiving retweets, suggesting that more active users receive somewhat more attention than less active ones. Sending and receiving @-messages and giving and receiving retweets are by comparison both very weakly correlated.
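
The rank correlation reported in Tables 2–4 can be illustrated with a small, self-contained sketch. The article does not state which variant of Kendall's tau was computed or which software was used, so the tau-b variant (which corrects for ties, common in count data such as these features) is an assumption here, as are the function name and the toy feature values.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of numbers.

    tau_b = (C - D) / sqrt((n0 - n1) * (n0 - n2)), where C and D are the
    numbers of concordant and discordant pairs, n0 is the total number of
    pairs, and n1 / n2 are the numbers of pairs tied in x / in y.
    This O(n^2) version is meant for illustration, not for large samples.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    n0 = len(x) * (len(x) - 1) // 2
    denominator = sqrt((n0 - tied_x) * (n0 - tied_y))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical per-user feature values; the last user is an extreme outlier.
urls_tweeted = [0, 1, 1, 3, 40]
days_active = [1, 2, 3, 3, 30]
print(kendall_tau_b(urls_tweeted, days_active))
```

Because the coefficient depends only on the ordering of values, the outlying user in the toy data contributes no more to the result than any other user, which is the property that motivates preferring a rank correlation over Pearson's r for skewed activity counts.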

Table 4
User feature correlations in GoT.

              ats_sent  ats_received  rts_given  rts_received  urls_tweeted  days_active
ats_sent      1         0.26          −0.04      0.06          −0.05         0.17
ats_received  0.26      1             0.12       0.15          0.04          0.18
rts_given     −0.04     0.12          1          0.06          0.17          0.12
rts_received  0.06      0.15          0.06       1             0.14          0.23
urls_tweeted  −0.05     0.04          0.17       0.14          1             0.21
days_active   0.17      0.18          0.12       0.23          0.21          1

Fig. 3. Feature correlation heatmaps of phdchat (a), bigdata (b), and GoT (c).

Finally, GoT is notable because correlations of ≥0.3 are entirely absent from the data. Even the
correlation between activity and the other features is much weaker than in the other two data
sets, though activity is arguably confounded with all other actions. Fig. 3 plots the
correlations in the three data sets and orders the features by the strength of their
association. While granular feature association clusters are visible in phdchat and, to a lesser
degree, in bigdata, systematic associations are absent in GoT.

5.5. Relation of retweeting and information sharing

While the above characterizes differences between users in the three hashtags, we have not yet
examined the relation between retweeting, information sharing, and other activities. Across
hashtags, what is shared often contains a URL, but both the percentage of retweets and the
percentage of tweets with URLs in them vary strongly across all three hashtags. Fig. 4 shows
these differences in three two-by-two matrices. In phdchat, the majority of content neither
contains a URL nor is retweeted, an indicator of the discursive nature of the hashtag.
Unsurprisingly, much of the information that is shared is not retweeted, though this percentage
is considerably higher in bigdata. Finally, the bulk of what is retweeted contains a URL. In
bigdata a much higher percentage than in the two other hashtags has a URL, but there seems to be
hardly any difference between tweets containing a URL and those that do not in terms of how
often they are retweeted. Finally, in GoT patterns initially seem similar to phdchat, but with
the significant difference that there is also little difference between tweets that contain a
hyperlink and those that do not. As in bigdata, retweeting seems to be more indiscriminate,
perhaps indicating that users in these two hashtags pay less attention to the content itself
than they do in phdchat.

5.6. Reciprocation in retweet networks

What role does retweeting play in the community structure of phdchat, and what is its
significance beyond information sharing? We approach these two questions by applying network
analysis to the retweet data for the regular users in our three hashtags (see Larsson and Moe
(2011), for a similar approach to retweets as a network). We differentiate between retweets
containing URLs and those that do not, in order to assess how much of the retweeting activity is
at least in part related to information sharing, in contrast to retweeting that is a form of
support, endorsement or social signaling. Only 35% of all tweets in phdchat contain a URL,
whereas 65% do not, while 73% of all tweets under the hashtag that are retweeted at least once
contain a URL, while 23% do not. In other words, there is a strong preference for retweeting
content that contains URLs over tweets that do not. Interestingly, however, tweets that do not
contain a URL predict the communal structure of the network quite well, which becomes evident
when comparing reciprocal and non-reciprocal network structures. Fig. 5 shows retweets (edges)
between users (nodes). The color of edges indicates whether a retweet contains a URL (light
blue) or not (red). Node size indicates in-degree, i.e. the number of received retweets among
the regular users. The first graph shows all retweets (a), while the second graph shows only
those retweets which are reciprocated (b). Many nodes disappear from the first graph to the
second, as peripheral users only retweet one of the more central users without that user
reciprocating the retweet. Most of the retweets without a URL are preserved, however, providing
tentative evidence for their social function. Most of the non-reciprocated retweets on the
periphery of the network that disappear from the first graph to the second are also tweets that
contain a URL, while some of the isolated retweet dyads without a URL are retained (Figs. 6 and
7).

In bigdata the proportion of retweets without a URL is even lower than it is in phdchat, as is
the proportion of reciprocal edges. The graph is essentially trimmed down when non-reciprocal
edges are removed, preserving a much smaller core. The graph's relatively compact appearance is
the result of it being initially much larger than the two others, but considerably more
clustered than GoT. Visual inspection shows that GoT contrasts markedly with both other hashtags
by being much sparser, meaning that a large

Fig. 4. Relation of overall tweets to retweets/tweets containing URLs to those without for phdchat (a), bigdata (b), and GoT (c).
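The two-by-two breakdown summarized in Fig. 4 can be sketched as a simple cross-tabulation of two boolean properties per tweet. This is an illustrative sketch only: the field names `has_url` and `retweeted`, the helper names, and the ten toy tweets are assumptions for the example and do not reproduce the study's figures.

```python
from collections import Counter

# Cells of the two-by-two matrix as (has_url, retweeted) pairs.
CELLS = [(True, True), (True, False), (False, True), (False, False)]


def url_retweet_crosstab(tweets):
    """Share of all tweets falling into each (has_url, retweeted) cell."""
    if not tweets:
        raise ValueError("need at least one tweet")
    counts = Counter((t["has_url"], t["retweeted"]) for t in tweets)
    total = len(tweets)
    return {cell: counts.get(cell, 0) / total for cell in CELLS}


def url_share(tweets, retweeted_only=False):
    """Fraction of tweets with a URL, optionally among retweeted tweets only."""
    pool = [t for t in tweets if t["retweeted"]] if retweeted_only else list(tweets)
    if not pool:
        raise ValueError("no tweets in the selected pool")
    return sum(1 for t in pool if t["has_url"]) / len(pool)


# Hypothetical toy hashtag: 10 tweets, 4 with a URL, 4 retweeted at least once.
tweets = (
    [{"has_url": True, "retweeted": True}] * 3
    + [{"has_url": True, "retweeted": False}] * 1
    + [{"has_url": False, "retweeted": True}] * 1
    + [{"has_url": False, "retweeted": False}] * 5
)
crosstab = url_retweet_crosstab(tweets)

if __name__ == "__main__":
    for (has_url, retweeted), share in crosstab.items():
        print(f"URL={has_url!s:5} retweeted={retweeted!s:5} {share:.0%}")
    print("URL share overall:        ", url_share(tweets))
    print("URL share among retweeted:", url_share(tweets, retweeted_only=True))
```

Comparing the URL share among all tweets with the URL share among retweeted tweets is one way to express the kind of preference for retweeting link-bearing content that the text reports for phdchat.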

Fig. 5. Retweet graph of phdchat with all nodes (a), and with reciprocally connected nodes only (b). Light blue edges indicate retweets containing a URL, red edges indicate retweets
without a URL. Node size indicates in-degree. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

proportion of users are not connected to their peers. In contrast to both phdchat and bigdata
the majority of retweets among the regulars apply to tweets that do not contain a URL, but can
be assumed to be entertaining, for example by quoting memorable lines from the show. The already
sparse graph is trimmed down considerably when reduced to reciprocal edges only, and in contrast
to the other two hashtags even the reciprocal components are not internally connected.

Based on the network analysis, a complex picture of the role of retweeting in different
communities emerges. In phdchat, the hashtag with the strongest community structure in terms of
regular engagement, a combination of informational and social sharing takes place, with a high
degree of reciprocation between the core regulars who both give and receive retweets. In bigdata
the structure is trimmed down at scale, preserving key actors who mostly disseminate
information. Finally, GoT represents a collection of very small groups only superficially
connected through the hashtag. Much of the retweeting applies to messages that do not contain
hyperlinks, but this does not make such social retweets a good predictor of community structure.
Retweets without URLs can indicate such a structure more reliably than those with URLs, as is
the case in phdchat, but this meaning is locally configured, rather than universal. When
examining the tweets themselves, most of those which contain URLs are guides, how-tos and other
informative content (e.g. how to get a job in academia when you finish your phd, 42 tips to
surviving and thriving during your phd, my phd viva preparation and experience phdchat unedited
and honest). By contrast, the tweets without a URL were more likely to express an opinion, voice
a common concern, or ask for solidarity (e.g. you know what i think needs discussion in phdchat?
mental health. just heard of a phd student taking their own life. no phd is worth this.). Many
also explicitly ask to be retweeted, for example when promoting an upcoming phdchat session (cf.
Potts et al., 2014). Against this background, it seems likely that retweeting is truly social
where a community structure already exists and is effectively reinforced through reciprocation.
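The reduction of a retweet graph to its reciprocated edges, as described in this section, can be sketched in a few lines. The edge representation `(retweeter, author, has_url)`, the function names, and the toy user names are assumptions made for illustration; the article does not describe the software used for its network analysis.

```python
from collections import Counter


def reciprocal_edges(retweets):
    """Keep only retweets whose reverse direction also occurs in the data."""
    directed = {(src, dst) for src, dst, _ in retweets if src != dst}
    return [(src, dst, has_url) for src, dst, has_url in retweets
            if src != dst and (dst, src) in directed]


def in_degree(retweets):
    """Number of retweets each user received (the node size in Figs. 5-7)."""
    return Counter(dst for _, dst, _ in retweets)


def share_without_url(retweets):
    """Fraction of retweet edges that carry no URL (the red edges)."""
    if not retweets:
        return float("nan")
    return sum(1 for _, _, has_url in retweets if not has_url) / len(retweets)


# Hypothetical toy network: ann and bob retweet each other without links,
# while peripheral users retweet link-bearing tweets one way only.
retweets = [
    ("ann", "bob", False),
    ("bob", "ann", False),
    ("cat", "ann", True),
    ("dan", "ann", True),
    ("bob", "cat", True),
]
core = reciprocal_edges(retweets)

if __name__ == "__main__":
    print("in-degree:", dict(in_degree(retweets)))
    print("share without URL, all edges:       ", share_without_url(retweets))
    print("share without URL, reciprocal edges:", share_without_url(core))
```

In this toy network the one-way, link-bearing retweets from peripheral users drop out of the reciprocal subgraph while the link-free exchange between two users survives, which mirrors the pattern the text reports for phdchat.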

Fig. 6. Retweet graph of bigdata with all nodes (a), and with reciprocally connected nodes only (b). Light blue edges indicate retweets containing a URL, red edges indicate retweets
without a URL. Node size indicates in-degree. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 7. Retweet graph of GoT with all nodes (a), and with reciprocally connected nodes only (b). Light blue edges indicate retweets containing a URL, red edges indicate
retweets without a URL. Node size indicates in-degree. (For interpretation of the references to color in this figure legend, the reader is referred to the web version
of this article.)

6. Discussion

We have chosen to analyze three hashtags that vary strongly not only in the topics they address,
but also in their temporal pattern of activity (regular vs. bursty), the size of their
contributor base (broad vs. narrow) and the interaction of communicative features in them
(clustered vs. uncorrelated). We have furthermore shown that users in phdchat who are highly
active overall also tend to take on a central role in the retweet network, are more likely to
have many reciprocal ties, and are finally more likely than less central users to retweet
socially, which can be interpreted as a signal of solidarity. Our central finding, however, is
that the meaning of the same action (retweeting) can be strongly locally configured, with
different communities prioritizing different forms of discourse
over others (e.g. sharing information in bigdata over discussion in phdchat). Finally, some
hashtags hardly represent communities in the narrow sense of the term, but rather collections of
disparate conversations that are only weakly connected.

7. Outlook

Quoting in formal print media serves a range of functions associated with the needs of specific
communities of practice (scholars, journalists) and has been conventionalized in ways that suit
the rhetorical needs of these users. New approaches to quoting in computer-mediated
communication, which initially echoed established conventions in terms of form, have become much
more generic in the step from the early days of the Internet to the present. Quoting has taken
on a broader role, emphasizing phatic and sociocommunicative aspects in addition to
argumentative and informational needs. Platform providers increasingly encourage users to engage
in sharing to generate such data, which has significant commercial value and fosters user
engagement. New forms of content sharing integrate established conventions of quoting with novel
technical options to distribute information and reinforce social ties, particularly by
indicating to users that something they have posted has been shared. It remains to be seen how
these novelties will evolve in the future and what interactions will take place between their
linguistic and sociotechnical characteristics. The platform services themselves are an important
driver of the trend to socially link users and media objects through content sharing. The usage
patterns produced by users provide the fuel for a data-driven economy which translates such
patterns into profiles to be used in targeted advertising. The advantages of constantly
extracting behavioral patterns from users are economically apparent, as are the legal and
ethical problems which may arise from such approaches. Digital content sharing and quoting no
longer just take place between quoter, quotee and reader, but have become an important source of
data in this preferential economy.

References

Abbott, B., 2003. Some notes on quotation. Belg. J. Linguist. 17, 13–26.
Bakhtin, M., 1981. In: Holquist, M. (Ed.), The dialogic imagination: four essays. University of Texas Press, Austin, TX, p. 444.
Bamman, D., Eisenstein, J., Schnoebelen, T., 2014. Gender identity and lexical variation in social media. J. Socioling. 18 (2), 135–160. http://dx.doi.org/10.1111/josl.12080.
Bastos, M. T., Puschmann, C., Travitzki, R., 2013. Tweeting across hashtags: overlapping users and the importance of language, topics, and politics. In: Proceedings of the 24th ACM Conference on Hypertext and Social Media (HT'13). ACM Press, New York, pp. 164–168. 10.1145/2481492.2481510.
Bolander, B., Locher, M. A., 2014. Doing sociolinguistic research on computer-mediated data: a review of four methodological issues. Discourse Context Media 3, 14–26. http://dx.doi.org/10.1016/j.dcm.2013.10.004.
Boyd, D., Golder, S., Lotan, G., 2010. Tweet, tweet, retweet: conversational aspects of retweeting on Twitter. In: Proceedings of the 43rd Hawaii International Conference on System Sciences. IEEE Computer Society, Los Alamitos, CA, pp. 1–10. 10.1109/HICSS.2010.412.
Bruns, A., 2008. Blogs, Wikipedia, Second Life, and Beyond: From Production to Produsage. Peter Lang, New York, p. 418.
Bruns, A., Burgess, J., 2011. The use of Twitter hashtags in the formation of ad hoc publics. In: Proceedings of the 6th European Consortium for Political Research General Conference (ECPR 2011). Reykjavik: University of Iceland, pp. 1–9.
Bruns, A., Highfield, T., 2013. Political networks on Twitter: Tweeting the Queensland state elections. Inf. Commun. Soc. 16 (5), 667–691. http://dx.doi.org/10.1080/1369118X.2013.782328.
Bruns, A., Liang, Y.E., 2012. Tools and methods for capturing Twitter data during natural disasters. First Monday 17 (4), http://dx.doi.org/10.5210/fm.v17i4.3937.
Buchstaller, I., van Alphen, I. (Eds.), 2012. Quotatives: Cross-linguistic and Cross-disciplinary Perspective. John Benjamins, Amsterdam, p. 296.
Burgess, J., Bruns, A., 2012. (Not) the Twitter election: the dynamics of the ausvotes conversation in relation to the Australian media ecology. Journal. Pract. 6 (3), 384–402. http://dx.doi.org/10.1080/17512786.2012.663610.
Clark, H.H., Gerrig, R.J., 1990. Quotations as demonstrations. Language 66 (4), 764–805.
Clark, R.P., 1995. Tips on handling quotes. Am. Journal. Rev.
Gerlitz, C., Helmond, A., 2013. The like economy: Social buttons and the data-intensive web. New Media Soc. 15 (8), 1348–1365. http://dx.doi.org/10.1177/1461444812472322.
Giles, D., Stommel, W., Paulus, T., Lester, J., Reed, D., 2014. Microanalysis of online data: the methodological development of “digital CA”. Discourse Context Media, 1–7. http://dx.doi.org/10.1016/j.dcm.2014.12.002.
Harnad, S., 1995. Interactive cognition: exploring the potential of electronic quote/commenting. In: Mey, J.L., Gorayska, B. (Eds.), Cognitive Technology: In Search of a Humane Interface. Elsevier, Amsterdam, pp. 397–414.
Herring, S.C., 1999. Interactional coherence in CMC. J. Comput. Mediat. Commun. 4 (4).
Herring, S.C., 2007. A faceted classification scheme for computer-mediated discourse. Language@Internet 4.
Highfield, T., Harrington, S., Bruns, A., 2013. Twitter as a technology for audiencing and fandom. Inf. Commun. Soc. 16 (3), 315–339. http://dx.doi.org/10.1080/1369118X.2012.756053.
Honeycutt, C., Herring, S. C., 2009. Beyond microblogging: conversation and collaboration via Twitter. In: Proceedings of the 42nd Hawaii International Conference on System Sciences (HICSS-42). IEEE Press, Los Alamitos, CA, pp. 1–10. 10.1109/HICSS.2009.602.
Kirshenblatt-Gimblett, B., 1996. The electronic vernacular. In: Marcus, G.E. (Ed.), Connected: Engagements with Media. University of Chicago Press, Chicago, pp. 21–68.
Kooti, F., Gummadi, K. P., Yang, H., Cha, M., Mason, W. A., 2012. The emergence of conventions in online social networks. In: Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media (ICWSM'12). The AAAI Press, Menlo Park, CA, pp. 194–201. Retrieved from 〈http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/viewPDFInterstitial/4661/4983〉.
Kwak, H., Lee, C., Park, H., Moon, S., 2010. What is Twitter, a social network or a news media? Categories and subject descriptors. In: J. Freire and S. Chakrabarti (Eds.). Proceedings of the 19th International Conference on the World Wide Web (WWW'10). ACM Press, Raleigh, NC, pp. 591–600.
Larsson, A.O., Moe, H., 2011. Studying political microblogging: Twitter users in the 2010 Swedish election campaign. New Media Soc. 14 (5), 729–747. http://dx.doi.org/10.1177/1461444811422894.
Macaulay, R., 2001. You’re like “why not?” The quotative expressions of Glasgow adolescents. J. Socioling. 5 (1), 3–21. http://dx.doi.org/10.1111/1467-9481.00135.
Mahrt, M., Weller, K., Peters, I., 2013. Twitter in scholarly communication. In: Weller, K., Bruns, A., Burgess, J., Mahrt, M., Puschmann, C. (Eds.), Twitter and Society. Peter Lang, New York, pp. 399–410.
Moore, C., 2011. Quoting speech in early English. Cambridge University Press, Cambridge, p. 230.
Page, R., 2012. The linguistics of self-branding and micro-celebrity in Twitter: the role of hashtags. Discourse Commun. 6 (2), 181–201. http://dx.doi.org/10.1177/1750481312437441.
Phdchat Wiki., 2013. Blogging about your research. Retrieved from 〈http://phdchat.pbworks.com/w/page/52525100/〉.
Potts, A., Simm, W., Whittle, J., Unger, J.W., 2014. Exploring “success” in digitally augmented activism: a triangulated approach to analyzing UK activist Twitter use. Discourse Context Media 6, 65–76. http://dx.doi.org/10.1016/j.dcm.2014.08.008.
Recanati, F., 2001. Open quotation. Mind 110 (439), 637–687. http://dx.doi.org/10.1093/mind/110.439.637.
Romaine, S., Lange, D., 1991. The use of “like” as a marker of reported speech and thought: a case of grammaticalization in progress. Am. Speech 66 (3), 227–279.
Scott, S., 2009. Repackaging fan culture: the regifting economy of ancillary content models. Transform. Works Cult., 3. http://dx.doi.org/10.3983/twc.2009.0150.
Tagliamonte, S., D’Arcy, A., 2004. He’s like, she’s like: the quotative system in Canadian youth. J. Socioling. 8 (4), 493–514. http://dx.doi.org/10.1111/j.1467-9841.2004.00271.x.
Tannen, D., 1989. Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge University Press, Cambridge, p. 244.
Zappavigna, M., 2011. Ambient affiliation: a linguistic perspective on Twitter. New Media Soc. 13 (5), 788–806. http://dx.doi.org/10.1177/1461444810385097.
Zelizer, B., 1989. “Saying” as collective practice: quoting and differential address in the news. Text - Interdiscip. J. Study Discourse 9 (4), 369–388. http://dx.doi.org/10.1515/text.1.1989.9.4.369.
