[SPIKE] What percentage of edits are reverted because of peacock behavior?
Open, Medium, Public

Assigned To

	MNeisler

Authored By

	ppelberg
	Jul 27 2024, 1:53 AM

Description

In T368274, we are exploring the viability of using an LLM to detect the presence of peacock behavior.

This task is about learning what percentage of edits are reverted because of peacock behavior so that, ultimately, we can decide how to surface feedback of this sort.

Decision(s) to be made

How will we treat MOS:Peacock-related feedback and, ultimately, present it to people?
- E.g. Do we consider feedback of this sort essential to communicate to people before they publish an edit because NOT doing so would likely result in the edit being reverted? Do we consider feedback of this sort something that could be presented as a suggestion later on? Both?

NOTE: the decision above will draw on the clarity we reach through T367897

Research question(s)

What percentage of edits that newcomers [i] and Junior Contributors make to the Wikipedia main namespace at en.wiki and fr.wiki are reverted because of the presence of peacock behavior?

i. Newcomers are defined as people who have created accounts and published 0 edits, as well as unregistered accounts.

Note: this ticket was an idea @MNeisler raised during an offline conversation on 25 July 2024.

Related Objects

Status	Assigned	Task
Open	None	T265163 Create a system to encode best practices into editing experiences
Open	None	T365300 Introduce Edit Checks that encourage behavior aligned with Manual of Style consensus
Open	None	T365301 Peacock Check: Prompt people to revise promotional language
Open	MNeisler	T371158 [SPIKE] What percentage of edits are reverted because of peacock behavior?

Event Timeline

ppelberg created this task. Jul 27 2024, 1:53 AM

Restricted Application added a subscriber: Aklapper. Jul 27 2024, 1:53 AM

ppelberg renamed this task from [SPIKE] to [SPIKE] What percentage of edits because of peacock behavior?. Jul 27 2024, 1:53 AM

ppelberg moved this task from Backlog to Analytics on the Editing-team (Tracking) board.

ppelberg added a parent task: T365301: Peacock Check: Prompt people to revise promotional language. Aug 7 2024, 6:00 PM

ppelberg renamed this task from [SPIKE] What percentage of edits because of peacock behavior? to [SPIKE] What percentage of edits are reverted because of peacock behavior?. Aug 14 2024, 8:14 PM

MNeisler claimed this task. Aug 15 2024, 8:02 PM

MNeisler triaged this task as Medium priority.

MNeisler added a project: Product-Analytics (Kanban).

What percentage of edits to the Wikipedia main namespace are reverted because of the presence of peacock behavior?

This question will help us understand how frequently users are reverted due to peacock behavior vs other policy violations.

Proposed approach:

Collect a sample of edits made in the last 90 days to the Wikipedia main namespace. The sample should include the revision id, revision content, wiki, user experience level, platform, and whether the edit was reverted.
Use the new content edit tag (editcheck-newcontent) to limit the sample to revisions that added new content to an article. Rationale: we want to focus on edit types that add content (vs. minor edits or edits that remove content) and that use VE, as those are the types of edits Edit Check will be presented within.
Use a string-matching approach, looking for common peacock keywords in the sample revision text. Consult @diego to align with the approach being used to calculate the baseline in T368274.
Calculate the percentage of edits that are reverted due to peacock behavior. Include breakdowns by wiki, user experience, and platform.
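
As a rough illustration of the last step, the breakdowns could be computed along these lines. This is only a sketch: the rows, the field names (`wiki`, `experience`, `platform`, `has_peacock`, `reverted`), and the `revert_rates` helper are all hypothetical and are not taken from the actual analysis code or data.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are illustrative
# assumptions, not real data from the analysis.
sample_edits = [
    {"wiki": "enwiki", "experience": "newcomer", "platform": "desktop", "has_peacock": True, "reverted": True},
    {"wiki": "enwiki", "experience": "newcomer", "platform": "desktop", "has_peacock": True, "reverted": False},
    {"wiki": "enwiki", "experience": "junior", "platform": "desktop", "has_peacock": False, "reverted": False},
    {"wiki": "enwiki", "experience": "junior", "platform": "mobile web", "has_peacock": True, "reverted": True},
    {"wiki": "enwiki", "experience": "newcomer", "platform": "mobile web", "has_peacock": False, "reverted": False},
    {"wiki": "enwiki", "experience": "junior", "platform": "mobile web", "has_peacock": False, "reverted": True},
]


def revert_rates(edits, keys):
    """Group edits by `keys` plus the peacock flag.

    Returns {group: (reverted_count, total_count, revert_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for edit in edits:
        group = tuple(edit[k] for k in keys) + (edit["has_peacock"],)
        counts[group][0] += int(edit["reverted"])
        counts[group][1] += 1
    return {g: (r, n, r / n) for g, (r, n) in counts.items()}


by_platform = revert_rates(sample_edits, ["platform"])
by_experience = revert_rates(sample_edits, ["experience"])

for group, (reverted, total, rate) in sorted(by_platform.items()):
    print(group, reverted, total, f"{rate:.0%}")
```

Passing a different key list (for example `["wiki", "platform"]`) gives the other breakdowns without changing the helper.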

Per what @MNeisler and I talked about offline today, we're going to move forward with the approach she described in T371158#10134639 above.

Update: I reached out to @diego, who clarified that he does not currently have a list of keywords but rather a TF-IDF approach that uses words to classify articles. While the model itself doesn't output the list, additional steps could be completed to output a list of keywords related to peacock behavior and promotional tone. Diego indicated he might be able to find time to do this during the next week or so, depending on task urgency and priorities.

Hi @MNeisler! I've just shared on Slack the list and the code used to create the peacock-related words. Just for the record, I put the links to the notebook and list here.

MNeisler moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board. Oct 4 2024, 3:30 PM

ppelberg awarded a token. Oct 4 2024, 5:14 PM

ppelberg updated the task description. Oct 9 2024, 8:20 PM

ppelberg updated the task description.

@ppelberg Below is a summary of the initial results for review. This is based on all new content edits published between May and June 2024 in the English Wikipedia main namespace by newcomers, Junior Contributors, or unregistered users.

Methodology Overview:

Identified new content edits from mediawiki_history using the edit tag (editcheck-newcontent).
Queried mediawiki_wikitext_history to obtain the revision text for each revision id in the sample.
Used a string-matching approach to look for the set of identified common peacock-related words in the revision text.
The list of peacock-related words is based on the list provided in T371158#10202597. Note: I manually filtered out some words that appeared due to data issues, leaving a final list of 61 peacock-related words.
Checked whether edits were reverted within 48 hours.
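
The matching and revert-window steps could look roughly like the sketch below. The three terms in `PEACOCK_WORDS` are placeholders only (the actual 61-word list is not reproduced here), and whole-word, case-insensitive matching is an assumption; the task does not state whether the analysis matched whole words or substrings. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Placeholder terms only; the real 61-word list lives in the linked notebook.
PEACOCK_WORDS = ["legendary", "renowned", "world-class"]

# Assumption: whole-word, case-insensitive matching.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in PEACOCK_WORDS) + r")\b",
    re.IGNORECASE,
)


def has_peacock_word(revision_text):
    """True if the text contains at least one listed term as a whole word."""
    return PATTERN.search(revision_text) is not None


def reverted_within(edit_time, revert_time, window=timedelta(hours=48)):
    """True if a revert happened no later than `window` after the edit."""
    if revert_time is None:  # the edit was never reverted
        return False
    return timedelta(0) <= revert_time - edit_time <= window


edit_time = datetime(2024, 5, 1, 12, 0)
print(has_peacock_word("She is a Renowned artist."))
print(reverted_within(edit_time, datetime(2024, 5, 2, 12, 0)))
```

Whole-word matching avoids counting "legend" as a hit for "legendary", but, like any keyword approach, it cannot tell an acceptable use from a promotional one.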

NOTE: I limited the review to English Wikipedia because the list of peacock words is based on a sample of English articles and may not be accurate for French or other Wikipedia articles.

Results

What percentage of edits published by newcomers and Junior Contributors to the Wikipedia main namespace at en.wiki are reverted because of the presence of peacock behavior?

A little over half (56%) of all new content edits by newcomers and Junior Contributors included at least one of the identified peacock-related words.
Overall on English Wikipedia, 22% of all new content edits by newcomers and Junior Contributors that contain at least one of the identified peacock words are reverted. This represents a 46.7% increase (+7 percentage points) in the likelihood of being reverted compared to an edit without a peacock-related word.

Revert rate of new content edits with and without peacock-related words on English Wikipedia

Presence of peacock words	number of edits reverted	number of edits	revert rate
does not include peacock words	1885	12447	15%
includes peacock words	3434	15924	22%
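
For reference, the rates and the relative increase can be recomputed from the counts in the overall table above. This is a plain arithmetic check, not the code used in the analysis; the variable names are illustrative.

```python
# Counts copied from the overall table above.
reverted_without, total_without = 1885, 12447
reverted_with, total_with = 3434, 15924

rate_without = reverted_without / total_without
rate_with = reverted_with / total_with

# Absolute difference in percentage points and relative increase.
point_diff = (rate_with - rate_without) * 100
relative_increase = (rate_with - rate_without) / rate_without

# Share of all new content edits that include a peacock-related word.
share_with = total_with / (total_with + total_without)

print(f"revert rate without peacock words: {rate_without:.1%}")
print(f"revert rate with peacock words:    {rate_with:.1%}")
print(f"difference: {point_diff:.1f} percentage points")
print(f"relative increase: {relative_increase:.1%}")
print(f"share of edits with peacock words: {share_with:.1%}")
```

The 46.7% figure matches (22 - 15) / 15 using the rounded rates. With the unrounded counts, the rates are roughly 15.1% and 21.6%, so the relative increase comes out closer to 42%.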

New content edits with peacock words are about equally likely to occur on mobile web and desktop: 57% of new content edits on mobile web and 56% of new content edits on desktop include peacock-related words.
On a per-platform basis, we observed a similar increase in the likelihood of an edit with a peacock-related word being reverted, as shown below.

platform	presence of peacock words	number of edits reverted	number of edits	revert rate
Mobile web	does not include peacock words	572	2065	28%
Mobile web	includes peacock words	986	2728	36%
Desktop	does not include peacock words	1313	10382	13%
Desktop	includes peacock words	2448	13196	19%

Newcomers are more likely to have their new content edits reverted than Junior Contributors. If a newcomer's new content edit includes a peacock-related word, the likelihood of it being reverted increases by 30% in relative terms (from 30% to 39%, +9 percentage points).

Editor experience	presence of peacock words	number of edits reverted	number of edits	revert rate
Junior Contributor	does not include peacock words	1411	10849	13%
Junior Contributor	includes peacock words	2591	13754	19%
Newcomer	does not include peacock words	474	1598	30%
Newcomer	includes peacock words	843	2170	39%

As a reference point, when establishing a baseline for reference check in T332848#9077218, we found that about 21% of all new content edits without a reference by Junior Contributors and 40% of all new content edits without a reference by newcomers were reverted on English Wikipedia. This indicates that the revert rate for edits with peacock-related words is very similar to the revert rate for edits without a reference.

Some caveats:

This is based on a sample list of peacock words, not an exhaustive list of all possible peacock terms. Some new content edits may have included a peacock word that is not on the list and so were not counted in the analysis.
Also, this does not consider the context in which a word was used. There may be some cases where the use was acceptable (e.g. as part of a proper noun).

Code repo

MNeisler moved this task from Doing to Needs Review on the Product-Analytics (Kanban) board. Oct 23 2024, 3:44 PM

ppelberg edited projects, added Editing-team (Kanban Board); removed Editing-team (Tracking). Tue, Nov 19, 7:36 PM

ppelberg moved this task from Incoming to Ready for Sign Off on the Editing-team (Kanban Board) board.