[SPIKE] What percentage of edits are reverted because of peacock behavior?
Open, Medium, Public

Description

In T368274, we are exploring the viability of using an LLM to detect the presence of peacock behavior.

This task involves the work of learning what percentage of edits are reverted because of peacock behavior so that, ultimately, we can decide how to surface feedback of this sort.

Decision(s) to be made

  1. How will we treat MOS:Peacock-related feedback and, ultimately, present it to people?
    • E.g. Do we consider feedback of this sort essential to communicate to people before they publish an edit because NOT doing so would likely result in the edit being reverted? Do we consider feedback of this sort something that could be presented as a suggestion later on? Both?
NOTE: the decision above will draw on the clarity we reach through T367897

Research question(s)

  1. What percentage of edits that newcomers [i] and Junior Contributors make to the Wikipedia main namespace at en.wiki and fr.wiki are reverted because of the presence of peacock behavior?

i. Newcomers are defined as people who have created accounts and published 0 edits, as well as unregistered accounts.


Note this ticket was an idea @MNeisler raised during an offline conversation on 25 July 2024.

Event Timeline

ppelberg renamed this task from [SPIKE] to [SPIKE] What percentage of edits because of peacock behavior?. Jul 27 2024, 1:53 AM
ppelberg moved this task from Backlog to Analytics on the Editing-team (Tracking) board.
ppelberg renamed this task from [SPIKE] What percentage of edits because of peacock behavior? to [SPIKE] What percentage of edits are reverted because of peacock behavior?. Aug 14 2024, 8:14 PM
MNeisler triaged this task as Medium priority.

What percentage of edits to the Wikipedia main namespace are reverted because of the presence of peacock behavior?

This question will help us understand how frequently users are reverted due to peacock behavior vs other policy violations.

Proposed approach:

  • Collect a sample of edits made in the last 90 days to the main namespace of a Wikipedia. This sample should include the revision id, revision content, wiki, user experience level, platform, and whether the edit was reverted.
  • Use the new content edit tag (editcheck-newcontent) to limit the sample to revisions that added new content to an article. Rationale: We want to focus on edits that add content (vs minor edits or edits that remove content) and that use VE, as those are the types of edits edit check will be presented within.
  • Use a string-matching approach, looking for common peacock keywords in the sample revision text. Consult @diego to align with the approach being used to calculate the baseline in T368274.
  • Calculate the percentage of edits that are reverted due to peacock behavior. Include breakdowns by wiki, user experience, and platform.
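
The final step of this approach could be sketched roughly as follows. This is only an illustration in plain Python: it assumes each sampled edit is a dict, and the field names (`wiki`, `platform`, `has_peacock`, `reverted`) and the toy rows are hypothetical, since the actual schema of the sample is not specified in this task.

```python
from collections import defaultdict

def revert_rates(edits, keys):
    """Group edits by the given keys and return {group: (reverted, total, rate)}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [reverted, total]
    for edit in edits:
        group = tuple(edit[k] for k in keys)
        counts[group][0] += int(edit["reverted"])
        counts[group][1] += 1
    return {g: (r, n, r / n) for g, (r, n) in counts.items()}

# Hypothetical toy sample; real rows would come from the collected edit sample.
sample = [
    {"wiki": "enwiki", "platform": "desktop", "has_peacock": True, "reverted": True},
    {"wiki": "enwiki", "platform": "desktop", "has_peacock": True, "reverted": False},
    {"wiki": "enwiki", "platform": "mobile web", "has_peacock": False, "reverted": False},
    {"wiki": "frwiki", "platform": "desktop", "has_peacock": False, "reverted": True},
]

# Breakdown by platform and presence of peacock words.
by_platform = revert_rates(sample, ["platform", "has_peacock"])
for group, (reverted, total, rate) in sorted(by_platform.items()):
    print(group, reverted, total, f"{rate:.0%}")
```

Passing a different list of keys (for example `["wiki", "has_peacock"]`) gives the other breakdowns without changing the function.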

Per what @MNeisler and I talked about offline today, we're going to move forward with the approach she described in T371158#10134639 above.

Update: I reached out to @diego, who clarified that he does not currently have a list of keywords; instead, he uses a TF-IDF approach that classifies articles based on the words they contain. While the model itself doesn't output a keyword list, additional steps could be taken to produce a list of keywords related to peacock behavior and promotional tone. Diego indicated he might be able to find time to do this during the next week or so, depending on task urgency and priorities.

Hi @MNeisler! I've just shared on Slack the list of peacock-related words and the code used to create it. Just for the record, I am putting the links to the notebook and the list here.

ppelberg updated the task description.

@ppelberg Below is a summary of the initial results for review. This is based on all new content edits published between May and June 2024 in the English Wikipedia main namespace by newcomers, Junior Contributors, or unregistered users.

Methodology Overview:

  • Identified new content edits from mediawiki_history using the edit tag (editcheck-newcontent).
  • Queried mediawiki_wikitext_history to obtain the revision text for each revision id in the sample.
  • Used a string-matching approach to look for the set of identified common peacock-related words in the revision text.
  • The list of peacock-related words is based on the list provided in T371158#10202597. Note: I manually filtered out some words that appeared due to data issues, leaving a final list of 61 peacock-related words.
  • Reviewed if edits were reverted within 48 hours.
NOTE: Limited review to English Wikipedia as that list of peacock words is based on a sample of English articles and may not be accurate for French or other Wikipedia articles.
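
A minimal sketch of the string-matching step is shown below. It is not the code used for this analysis: the three terms are hypothetical placeholders rather than the actual 61-word list, and the case-insensitive, whole-word matching is an assumption about how such a check might be written.

```python
import re

# Hypothetical placeholder terms; the real list is the 61 words referenced above.
PEACOCK_WORDS = ["legendary", "world-class", "renowned"]

def build_matcher(words):
    """Compile one case-insensitive pattern that matches any listed term as a whole word."""
    # Longest terms first so multi-part terms are preferred over shorter alternatives.
    ordered = sorted(words, key=len, reverse=True)
    alternatives = "|".join(re.escape(w) for w in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def has_peacock_word(text, matcher):
    """Return True if the text contains at least one listed term."""
    return matcher.search(text) is not None

matcher = build_matcher(PEACOCK_WORDS)
print(has_peacock_word("She is a Legendary singer.", matcher))
print(has_peacock_word("The legend continues.", matcher))
```

Whole-word matching avoids counting a term that only appears inside a longer word, but it still cannot tell whether a given use is acceptable in context.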

Results

What percentage of edits published by newcomers and Junior Contributors to the Wikipedia main namespace at en.wiki are reverted because of the presence of peacock behavior?

  • A little over half (56%) of all new content edits by newcomers and Junior contributors included one of the identified peacock-related words.
  • Overall on English Wikipedia, 22% of all new content edits by newcomers and Junior Contributors that contain at least one of the identified peacock words are reverted. This represents a 46.7% relative increase (+7 percentage points) in the likelihood of being reverted compared to an edit without a peacock-related word.

Revert rate of peacock-related words on English Wikipedia

| Presence of peacock words | Number of edits reverted | Number of edits | Revert rate |
| --- | --- | --- | --- |
| Does not include peacock words | 1885 | 12447 | 15% |
| Includes peacock words | 3434 | 15924 | 22% |
  • New content edits with peacock words are equally likely to occur on both mobile web and desktop platforms. 57% of new content edits on mobile web and 56% of new content edits on desktop include peacock-related words.
  • We observed a similar increase in the likelihood of an edit with a peacock-related word being reverted on a per-platform basis, as shown below.
| Platform | Presence of peacock words | Number of edits reverted | Number of edits | Revert rate |
| --- | --- | --- | --- | --- |
| Mobile web | Does not include peacock words | 572 | 2065 | 28% |
| Mobile web | Includes peacock words | 986 | 2728 | 36% |
| Desktop | Does not include peacock words | 1313 | 10382 | 13% |
| Desktop | Includes peacock words | 2448 | 13196 | 19% |
  • Newcomers are more likely to have their new content edits reverted than Junior Contributors. If a newcomer's new content edit includes a peacock-related word, the likelihood of the edit being reverted increases by 30% in relative terms (from 30% to 39%).
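| Editor experience | Presence of peacock words (0 = no, 1 = yes) | Number of edits reverted | Number of edits | Revert rate |
| --- | --- | --- | --- | --- |
| Junior contributor | 0 | 1411 | 10849 | 13% |
| Junior contributor | 1 | 2591 | 13754 | 19% |
| Newcomer | 0 | 474 | 1598 | 30% |
| Newcomer | 1 | 843 | 2170 | 39% |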
  • As a reference point, when establishing a baseline for reference check in T332848#9077218, we found that about 21% of all new content edits without a reference by Junior Contributors and 40% of all new content edits without a reference by newcomers were reverted on English Wikipedia. This indicates that the revert rate for edits with peacock-related words is very similar to the revert rate for edits without a reference.
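
The overall figures above can be re-derived from the reported counts. The sketch below is a plain arithmetic check, not the analysis code; the variable names are illustrative, and the counts are copied from the overall revert-rate table.

```python
# Counts copied from the overall table: (edits reverted, edits).
WITHOUT_PEACOCK = (1885, 12447)
WITH_PEACOCK = (3434, 15924)

def revert_rate(reverted, total):
    """Share of edits that were reverted."""
    return reverted / total

rate_without = revert_rate(*WITHOUT_PEACOCK)
rate_with = revert_rate(*WITH_PEACOCK)

# Share of all new content edits that include a listed peacock word.
peacock_share = WITH_PEACOCK[1] / (WITH_PEACOCK[1] + WITHOUT_PEACOCK[1])

# Percentage-point gap and relative increase, using whole-percent rates as in the summary.
pct_without = round(rate_without * 100)
pct_with = round(rate_with * 100)
point_gap = pct_with - pct_without
relative_from_rounded = point_gap / pct_without

# The same relative increase computed from the unrounded rates.
relative_from_counts = rate_with / rate_without - 1

print(f"peacock share: {peacock_share:.0%}")
print(f"{pct_without}% vs {pct_with}%, +{point_gap} points")
print(f"relative increase: {relative_from_rounded:.1%} (rounded rates), "
      f"{relative_from_counts:.1%} (raw counts)")
```

The 46.7% figure corresponds to the rounded rates (7 / 15). Computed from the raw counts, the relative increase comes out closer to 42%, so the exact value depends on where rounding is applied.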

Some caveats:

  • This is based on a sample list of peacock words and is not an exhaustive list of all possible peacock terms. It's possible some new content edits included a peacock word not identified on the list and were not counted in the analysis.
  • Also, this does not consider the context in which the word was used. There may be some cases where the use was acceptable (e.g. as part of a proper noun).

Code repo