Update Wikidata-related extensions for IP Masking
Closed, Resolved (Public)

Description

IP Masking is coming, and based on the investigation in T328454 we have a number of changes to make. They are tracked in the subtickets of this ticket.

Related Objects

Status | Assigned
Resolved | kostajh
Declined | None
In Progress | Niharika
Open | None
Open | None
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | ArthurTaylor
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Declined | None
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Lydia_Pintscher
Resolved | Arian_Bozorg
Resolved | None
Open | None
Resolved | None
Resolved | Arian_Bozorg
Resolved | Lucas_Werkmeister_WMDE
Resolved | None
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | ArthurTaylor
Resolved | ArthurTaylor
Resolved | ArthurTaylor
Resolved | ArthurTaylor
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Lucas_Werkmeister_WMDE
Resolved | Arian_Bozorg
Resolved | Lucas_Werkmeister_WMDE
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Lucas_Werkmeister_WMDE
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Arian_Bozorg
Resolved | Ifrahkhanyaree_WMDE
Resolved | Muhammad_Yasser_Jazirahly_WMDE
Resolved | Jakob_WMDE

Event Timeline

Anything major I forgot based on the investigation?

Most of the subtasks here are done, but I think there’s still a big open question about the rollout. In T343800, we found that temporary accounts, once created on a wiki with IP Masking enabled, were also “logged in” and available on other wikis via CentralAuth, which worked well for us. But this has changed in the meantime (T342475) – now, you’ll still be an IP on wikis that don’t have temporary accounts enabled. So what happens with cross-wiki edits now?

  • If IP Masking is enabled on Wikidata but not on a client wiki: Suppose an anonymous user makes an edit on Wikidata, with a temporary account. The edit affects a page on a client wiki, so it gets dispatched there, and added to the recentchanges table (type RC_EXTERNAL). Which actor do we assign this row to?
    • It can’t be the temporary account, because they’re not supposed to exist on the client wiki.
    • It can’t be the user’s IP address, because we don’t know it anymore (and shouldn’t leak it in any case).
    • I guess we could reassign all such edits to a system user or something? (But that still leaves the problem that it looks like a registered edit to the client wiki when it really isn’t, which IIUC was the main argument against just letting temporary accounts be shared via CentralAuth like they used to.)
      • Or some kind of special IP address(es), somewhere in a private network block?
  • If IP Masking is enabled on a client wiki but not on Wikidata: Suppose a user acquires a temporary account on a client wiki (e.g. with a normal edit), then edits Wikidata from the client wiki. What should happen?
    • I think there’s no actual scenario where this happens. The LinkItem widget is not available to unregistered users, including temporary accounts (see T351971); the data bridge is currently broken (T354750); some wikis have gadgets to edit Wikidata (e.g. WE-Framework), but temporary users won’t be able to enable them.
    • Even if there’s a way for anonymous users to edit Wikidata from a client wiki that I missed – it’s probably okay(ish) if the outcome is just “it doesn’t work”.

Comparing these two alternatives, I wonder if we should just enable IP Masking on Wikidata last of all wikis, so that the first set of problems never applies…
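
To make the "Wikidata last" argument concrete, here is a minimal sketch that walks through a rollout order and lists every point at which Wikidata has temporary accounts enabled while some client wiki does not (the first scenario above). The wiki names, the `problematic_states` helper, and the idea of enabling one wiki per step are all assumptions made for illustration; none of this is actual rollout tooling.

```python
def problematic_states(rollout_order, wikidata="wikidatawiki"):
    """Return (step, client) pairs where Wikidata already has temporary
    accounts enabled but the given client wiki does not yet.

    rollout_order: list of wiki names, enabled one per step (hypothetical).
    """
    enabled = set()
    problems = []
    for step, wiki in enumerate(rollout_order):
        enabled.add(wiki)
        if wikidata not in enabled:
            # Wikidata edits are still made by IPs, so the first scenario cannot occur.
            continue
        for client in rollout_order:
            if client != wikidata and client not in enabled:
                problems.append((step, client))
    return problems


# Hypothetical rollout orders, for illustration only.
wikidata_first = ["wikidatawiki", "enwiki", "dewiki"]
wikidata_last = ["enwiki", "dewiki", "wikidatawiki"]

print(problematic_states(wikidata_first))
print(problematic_states(wikidata_last))
```

With Wikidata first, every client wiki that has not yet been switched shows up as a mismatch; with Wikidata last, the list stays empty. The sketch says nothing about the opposite mismatch (client wiki masked, Wikidata not), which is the second scenario above.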

There’s one more case I just thought of:

  • If IP Masking is enabled on a client wiki but not on Wikidata: Suppose an anonymous user makes an edit on Wikidata which affects a client wiki. The Wikidata edit is made by an IP address, but it gets dispatched to a wiki where IP addresses are no longer supposed to make edits. What happens?
    • The answer, at least in local testing, seems to be that the IP address is added to the client wiki’s recentchanges without problem, even if it’s a brand new IP address that the client wiki doesn’t have an actor ID for yet. So apparently this still works just fine. It does mean that wikis can still have IP edits in their recentchanges even if they have IP Masking enabled, but I think that’s better than any alternative in this scenario. (I was worried that the change dispatching might crash with the error “Cannot create an actor for an IP user when temporary accounts are enabled”.)

FWIW, we explored assigning actions to system users in T353953 and T334623 but came across problems with assigning many actions by many users to a single system user (which I think is unprecedented, but I'm happy to be corrected), e.g. when performing CheckUser checks. Just mentioning to highlight that we've had similar problems of who to attribute an action to in other places.

Thanks for sharing this analysis. We're looking into some rollout options, but I don't think we have any opposition to Wikidata being last. Pinging @Niharika and @Madalina in case they disagree.

The only opposing argument I'd heard was from the mobile apps team, where I believe attribution becomes more complicated if the client wiki has temp accounts but Wikidata doesn't (I'm hazy on the details), so we were hearing the opposite request: for Wikidata to be among the pilot wikis. However, I've spoken with @JTannerWMF and my understanding is that they can work with Wikidata being rolled out last.

Thanks Thal! Yes, that is correct: you may get user complaints (probably from patrollers) if the same anon editor edits both the text of an article and the article description. If you're okay to assume the risk, we can move forward with that understanding, and if we get complaints, flag them to T&S Product and WMDE as the decision makers of the rollout process.

@Lucas_Werkmeister_WMDE: Some more scenarios I can think of but probably theoretical only:

  • a gadget that allows editing from the client is enabled by default and thereby also for non-logged-in users (at least I think such gadgets then also work for them...). I don't think this is a problem in practice though.
  • page moves and deletions: Do we have any wikis where non-logged-in users can rename or delete a page, which would then result in an automated sitelink edit?

In general I agree that going with Wikidata last seems best right now.

Arian_Bozorg claimed this task.
Arian_Bozorg subscribed.

Thank you all for your work on this

For the record, we currently have this situation on the Beta cluster (temp accounts are enabled on Beta Wikidata, but not on some other Beta Wikipedias). And what happens in practice is that the change is dispatched with an actor that uses the temporary account name:

image.png (269×879 px, 49 KB)

MariaDB [enwiki]> SELECT * FROM actor WHERE actor_id = 119895508\G
*************************** 1. row ***************************
  actor_id: 119895508
actor_user: NULL
actor_name: d>~2024-14860

Notice that the recent changes row, as seen on Beta English Wikipedia, is missing class=mw-tempuserlink, because English Wikipedia doesn’t know that this matches the pattern for temporary accounts.

One way to address that would be to set the known flag on the temp user configuration (T356524: Ensure temp accounts can be safely disabled after being enabled) across all wikis, although that is not the original purpose of that config flag. (It was intended as a mechanism to safely disable temp accounts on a wiki after they had already been enabled.)
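
As a rough illustration of why the client wiki does not style the name as a temporary account, and how a known flag could change that, here is a minimal sketch. It is a model under stated assumptions, not MediaWiki's implementation: the `TempUserConfig` class, the `~$1` match pattern, and stripping the `d>` prefix before matching are all hypothetical choices made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class TempUserConfig:
    """Hypothetical, simplified stand-in for a wiki's temp user configuration."""
    enabled: bool = False       # temp accounts are created on this wiki
    known: bool = False         # temp account names are recognised even if not enabled
    match_pattern: str = "~$1"  # assumed pattern; "$1" stands for the serial part


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a "$1"-style pattern into a regex that matches whole names."""
    prefix, _, suffix = pattern.partition("$1")
    return re.compile(re.escape(prefix) + r".+" + re.escape(suffix))


def is_recognized_temp_name(name: str, config: TempUserConfig) -> bool:
    """Would a wiki with this config treat `name` as a temporary account name?"""
    if not (config.enabled or config.known):
        return False
    # Assumption: the repo prefix (e.g. "d>") is ignored when matching.
    local_part = name.rsplit(">", 1)[-1]
    return pattern_to_regex(config.match_pattern).fullmatch(local_part) is not None


dispatched_name = "d>~2024-14860"  # actor_name from the query output above

print(is_recognized_temp_name(dispatched_name, TempUserConfig()))
print(is_recognized_temp_name(dispatched_name, TempUserConfig(known=True)))
```

In this model, a client wiki with neither flag set does not recognise the dispatched name, while a wiki with only the known flag set does, without creating temporary accounts itself. Whether the real code would match a prefixed name like this is not confirmed in this thread.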