Copyvio: Show copyvio info in PageTriage feed
Closed, ResolvedPublic

Description

Show a hint that a copyright violation exists for one or more revisions of the current page.

Specific UX is TBD (ping @MMiller_WMF).

A copyvio exists if the "copyvio" PageTriage tag exists with a truthy value in the page metadata. The tag should be created as part of T202041.
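
A minimal sketch of that rule, for illustration only. The function name, the dict-shaped metadata, and the exact set of values treated as falsy are assumptions and are not taken from the PageTriage code:

```python
# Hypothetical sketch: names and the metadata shape are assumptions,
# not PageTriage's actual API.

# Values treated as "not truthy". Including the string "0" is an assumption
# that mirrors how loosely typed metadata values are often stored.
FALSY_VALUES = (None, False, 0, "", "0")


def has_copyvio(page_metadata: dict) -> bool:
    """Return True when the "copyvio" tag is present with a truthy value."""
    return page_metadata.get("copyvio") not in FALSY_VALUES
```

With this definition a page whose metadata lacks the tag, or carries it with an empty or zero value, would not get the hint.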

The hint offers a link to CopyPatrol that shows the offending revisions for the page. The specific link URL should be defined as part of T201075.

Event Timeline

Change 451317 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/Copyvio@master] Helper function to find scores by page ids

https://gerrit.wikimedia.org/r/451317

Change 451318 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/PageTriage@master] Show copyvio scores in Special:NewPagesFeed

https://gerrit.wikimedia.org/r/451318

Change 451317 merged by jenkins-bot:
[mediawiki/extensions/Copyvio@master] Helper function to find scores by page ids

https://gerrit.wikimedia.org/r/451317

@MMiller_WMF What do we want to see when we don't currently have the copyvio score for a page? What about when the copyright violation probability is zero?

Change 451318 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Show copyvio scores in Special:NewPagesFeed

https://gerrit.wikimedia.org/r/451318

Change 452338 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/PageTriage@master] Fix indentation

https://gerrit.wikimedia.org/r/452338

Change 452338 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Fix indentation

https://gerrit.wikimedia.org/r/452338

SBisson renamed this task from Copyvio: PageTriage integration to Copyvio: Show copyvio info in PageTriage feed. Aug 17 2018, 10:22 AM

Change 453399 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/PageTriage@master] [WIP] Show copyvio hint and link to report in feed

https://gerrit.wikimedia.org/r/453399

One idea discussed by the team today: in CopyPatrol one can mark a page as "fixed". CopyPatrol could be modified to post to the PageTriage API so that the page's copyvio value, which records whether copyvio was detected in any of its revisions, is set to false. That way, a page that has been fixed would no longer show up in the new pages feed with a flag saying copyvio might be present.

However, we've decided to pursue that idea at a later time, not in this iteration.

@SBisson @kostajh @RHo and I discussed designs this morning, and @RHo made a design. The full design is described in T202161, but the parts that are relevant for copyvio are:

  • "Predicted issues" renamed to "Potential issues"
  • Copyvio gets listed with "Potential issues", separated with dot, and link icon to external service
  • "Potential issues" aligned right for easier scanning.

image.png (782×856 px, 365 KB)

@SBisson -- I realized that we asked this question for ORES models, so I want to bring it up here as well: what about the User space? For ORES, we decided to score User space pages, but I think for copyvio it depends on the current logic of EranBot and whether we want to change it. Do you know whether the bot currently scores User space? Do you have thoughts on whether we should?

Change 453399 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Show copyvio hint and link to report in feed

https://gerrit.wikimedia.org/r/453399

@SBisson -- I realized that we asked this question for ORES models, so I want to bring it up here as well: what about the User space? For ORES, we decided to score User space pages, but I think for copyvio it depends on the current logic of EranBot and whether we want to change it. Do you know whether the bot currently scores User space? Do you have thoughts on whether we should?

EranBot currently does NOT check the User space, and as a result CopyPatrol does not show it.

I'm not sure about the use case in userspace. I see a lot of entries in the feed that make no sense to review as "potential new pages" but some do. I guess we should ask the community.

  • "Predicted issues" renamed to "Potential issues"
  • Copyvio gets listed with "Potential issues", separated with dot, and link icon to external service

These 2 were done here.

  • "Potential issues" aligned right for easier scanning.

This one will be done in T202161: New Pages Feed: arrangement of metadata in feed

Change 454812 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/PageTriage@master] "Predicted issues" becomes "Potential issues"

https://gerrit.wikimedia.org/r/454812

Change 454812 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] "Predicted issues" becomes "Potential issues"

https://gerrit.wikimedia.org/r/454812

@MMiller_WMF
Checked in betalabs:

Screen Shot 2018-08-27 at 5.30.40 PM.png (462×1 px, 95 KB)

  • copyvio is displayed capitalized and not in red (it makes sense since it's a link to an existing page) - copypatrol/en
  • would copyvio tag be combined with ORES scores?
  • there is no OR for copyvio; all other options will have OR - e.g. Vandalism and None will display drafts that have Vandalism and drafts that have None scores.

Change 455831 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/PageTriage@master] Treat unknown ORES issue as None

https://gerrit.wikimedia.org/r/455831

  • would copyvio tag be combined with ORES scores?
  • there is no OR for copyvio; all other options will have OR - e.g. Vandalism and None will display drafts that have Vandalism and drafts that have None scores.

It is OR, but the version on betalabs does not work correctly when a page doesn't have a 'draftquality' score.

I have just submitted a patch for review that expands the definition of "None" to mean: no known ORES issue or copyvio, instead of being directly linked with ORES draftquality class "OK". This should be much less confusing since any page that does not show an explicit issue should be included by the "None" filter.
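
The expanded "None" definition combined with OR filtering can be sketched roughly as follows. This is not the code from the patch: the function names, the dict-shaped page record, and the lowercase filter labels are assumptions made for illustration.

```python
# Hypothetical sketch; field names and filter labels are illustrative
# assumptions, not the actual PageTriage implementation.

# ORES draftquality classes that count as an explicit issue (assumed list).
ORES_ISSUE_CLASSES = ("vandalism", "spam", "attack")


def page_issues(page: dict) -> set:
    """Collect the explicit issues that would be shown for a page."""
    issues = set()
    draftquality = page.get("draftquality")  # missing when ORES has no score
    if draftquality in ORES_ISSUE_CLASSES:
        issues.add(draftquality)
    if page.get("copyvio"):
        issues.add("copyvio")
    return issues


def matches_filters(page: dict, selected: set) -> bool:
    """OR semantics: a page matches if it satisfies any selected option."""
    issues = page_issues(page)
    # "none" means no known ORES issue and no copyvio, so a page without
    # a draftquality score still qualifies.
    if "none" in selected and not issues:
        return True
    return bool(issues & selected)
```

Under these assumptions, `matches_filters({}, {"none"})` is true for a page with no score at all, while a page with draftquality "OK" and a copyvio flag is excluded from "None" and only matched when the copyvio option is selected.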

Change 455831 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Treat unknown ORES issue as None

https://gerrit.wikimedia.org/r/455831

I have just submitted a patch for review that expands the definition of "None" to mean: no known ORES issue or copyvio, instead of being directly linked with ORES draftquality class "OK".

Checked in betalabs - the copyvio filter works as expected.

@Etonkovidova -- since we need to also validate this in Test Wiki, and not just in Beta Labs, I am putting this back in your column. This is now blocked on T203119, which is about making it possible to test in Test Wiki.

@SBisson -- regarding the User space: we will not need to scan User space for now, and so we can leave everything as-is. Reviewers will just see no results if they select the Copyvio and User space filters at the same time.

Re-checked in testwiki - test sample data for Copyvio works as expected.