Global rename of The_Photographer → Wilfredor: supervision needed
Closed, Resolved (Public)

Description

The user asked to be renamed from The_Photographer to Wilfredor. He has more than 173,757 edits, so we need a sysadmin here. Here is proof of his identity (for those who can see the ticket).

Note: most of the user's edits are on commonswiki (144,552) and eswiki (25,026) (CentralAuth)

Event Timeline

Can we not rename users with large edit counts back and forth (c.f. https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress?username=The_Photographer)? Will this be the user's last rename request?

I understand that this is difficult to do and I will try not to request this again. However, I am not totally sure that I will not ask for a future rename. I asked for this because it was really necessary, and if this is technically impossible, I understand.

Can you also post the wiki with this rename request so we can check which wikis have more edits?

Thanks

It's a global rename; Spanish Wikipedia and Commons have the most edits

Thanks

It could be done at any moment, preferably on a Friday. Thanks

@Matiia or @Trijnstel are you guys handling this, since I don't have access to that OTRS queue.

It could be done at any moment, preferably on a Friday. Thanks

I am in UTC+1 timezone, what about you?
If possible I would avoid Friday unless done early in the UTC morning (around 7AM UTC or similar).

Marostegui triaged this task as Medium priority. (Feb 9 2019, 8:56 AM)

@Marostegui Well, they're the one being renamed, so I think they just responded without realising that it has to be done by a steward or renamer.

Right :-)
Just let me know when the steward or the renamer wants to do it and we can coordinate.

Thanks Marostegui, please let me know if you need anything else and feel free to contact me. Take care.

I'll be around for around 2 hours, I think. If you can, we could do it now.

Matiia or anybody, please do it at any moment. Thanks

Sorry, I wasn't available during the weekend.
I am normally available from Monday to Friday from 7:00 UTC to 16:00 UTC

Friday is ok to me, that way I'll have the weekend to fix something that needs to be changed

It can be done anytime, I'm waiting

I am free to do it today if @Marostegui or other DBA is willing to have a look while the rename is ongoing.

According to https://wikitech.wikimedia.org/wiki/Stuck_global_renames the system should try to restart failed jobs several times. I'm not sure about jobs stuck "in progress" though.

@Tgr has helped us in the past, too, with stuck global renames.

@Marostegui (or anyone with access) could you see the wikitech page I linked above and check the CentralAuthRename and Renameuser channels to see if the job has been attempted automatically again, and its status? (looking for further db timeouts as suggested in the guide above if possible too?)

Thanks.

Just in case it is needed:

On mwmaint1002
mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'The_Photographer' 'Wilfredor'

And...

!log message
!log Ran "mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'The_Photographer' 'Wilfredor'" for T215107.

Large renames should be easy once actor migration (T167246) is finished, right? I'd just block them for now.

Job started on Commons at 2019-02-18T14:45:45, log entries are

GlobalRename: Starting rename of The Photographer to Wilfredor
GlobalRename: Updating logging table for The Photographer to Wilfredor
RenameuserSQL::rename	10.64.48.23	2062	Read timeout is reached (10.64.48.23)	UPDATE  `image` SET img_user_text = 'Wilfredor' WHERE img_user_text = 'The Photographer' AND img_user = '32521'
(same once more)
Transaction round stage must be 'cursory' (not 'within-rollback')
    #0 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(255): Wikimedia\Rdbms\LBFactory->assertTransactionRoundStage(string)
    #1 /srv/mediawiki/php-1.33.0-wmf.17/includes/MediaWiki.php(897): Wikimedia\Rdbms\LBFactory->commitMasterChanges(string)
    #2 /srv/mediawiki/rpc/RunSingleJob.php(94): MediaWiki->restInPeace()
    #3 {main}
Transaction callbacks are still pending: RenameuserSQL::rename, RenameuserSQL::rename, User::saveSettings, User::saveSettings, User::saveOptions, RenameuserSQL::rename, RenameuserSQL::rename, User::clearSharedCache, Title::invalidateCache, User::clearSharedCache, Title::invalidateCache
    #0 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/DatabaseMysqlBase.php(126): Wikimedia\Rdbms\Database->close()
    #1 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/Database.php(4149): Wikimedia\Rdbms\DatabaseMysqlBase->open(string, string, string, string, string, string)
    #2 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/Database.php(1202): Wikimedia\Rdbms\Database->replaceLostConnection(string)
    #3 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/Database.php(4023): Wikimedia\Rdbms\Database->query(string, string, boolean)
    #4 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/Database.php(3974): Wikimedia\Rdbms\Database->doRollback(string)
    #5 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1534): Wikimedia\Rdbms\Database->rollback(string, string)
    #6 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1779): Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges(Wikimedia\Rdbms\DatabaseMysqli)
    #7 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1535): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;674)
    #8 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(223): Wikimedia\Rdbms\LoadBalancer->rollbackMasterChanges(string)
    #9 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(405): Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod(Wikimedia\Rdbms\LoadBalancer, string, array)
    #10 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(226): Wikimedia\Rdbms\LBFactoryMulti->forEachLB(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;312, array)
    #11 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(293): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
    #12 /srv/mediawiki/php-1.33.0-wmf.17/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php(67): Wikimedia\Rdbms\LBFactory->rollbackMasterChanges(string)
    #13 /srv/mediawiki/php-1.33.0-wmf.17/extensions/EventBus/includes/JobExecutor.php(65): LocalRenameJob->run()
    #14 /srv/mediawiki/rpc/RunSingleJob.php(77): JobExecutor->execute(array)
    #15 {main}
DB connection was already closed.
    #0 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/database/Database.php(3972): Wikimedia\Rdbms\Database->assertOpen()
    #1 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1534): Wikimedia\Rdbms\Database->rollback(string, string)
    #2 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1779): Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges(Wikimedia\Rdbms\DatabaseMysqli)
    #3 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1535): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;674)
    #4 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(223): Wikimedia\Rdbms\LoadBalancer->rollbackMasterChanges(string)
    #5 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(405): Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod(Wikimedia\Rdbms\LoadBalancer, string, array)
    #6 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(226): Wikimedia\Rdbms\LBFactoryMulti->forEachLB(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;312, array)
    #7 /srv/mediawiki/php-1.33.0-wmf.17/includes/libs/rdbms/lbfactory/LBFactory.php(293): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
    #8 /srv/mediawiki/php-1.33.0-wmf.17/includes/exception/MWExceptionHandler.php(123): Wikimedia\Rdbms\LBFactory->rollbackMasterChanges(string)
    #9 /srv/mediawiki/php-1.33.0-wmf.17/extensions/EventBus/includes/JobExecutor.php(96): MWExceptionHandler::rollbackMasterChangesAndLog(RuntimeException)
    #10 /srv/mediawiki/rpc/RunSingleJob.php(77): JobExecutor->execute(array)
    #11 {main}
(some more log events as that last exception bubbles up to the job runner)

I can't log in with any username or any password. I tried to reset my password, but it told me that the username does not exist.

That is because the rename is not completely done yet (see previous comments). Once it is completed, you will be able to log in with the new username without problems.

So that's an impressive cascade of failures:

  • Updates for the image table are not batched so the query times out. (Why doesn't the DB error have a stack trace? I thought we are logging that.)
  • LocalRenameJob::run catches the exception and tries to roll back (and does not log; even though it would rethrow the exception later, that seems fragile). Database::doRollback calls Database::query to run a ROLLBACK query; the database handler decides it needs to reopen the connection (makes sense for a timeout error; except the error code was 2062 according to the log, and that's not one of the codes DatabaseMysqlBase::wasConnectionError would consider as connection loss), so it calls Database::open which calls Database::close which flips out because there are unprocessed transaction callbacks and throws an exception. But before throwing it does mark the connection as closed.
  • This new exception is caught by JobExecutor::execute which tries another rollback. This fails because the connection is now marked as closed, triggering another exception.
  • This newest exception is caught by the error handling within MWExceptionHandler::rollbackMasterChangesAndLog, which for once doesn't blow up but logs the error and returns. So JobExecutor proceeds to log the error, tear down the job and exit to mediawiki/operation/config's RunSingleJob.php, which then calls MediaWiki::restInPeace. That calls LBFactory::commitMasterChanges which dies because the stage is actually not 'cursory' but 'within-rollback' (the rollback call never finished).
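
As a rough illustration of the cascade described above, here is a toy Python model. It is not MediaWiki code: ToyConnection, run_job and the two exception classes are made-up names, and the only thing it tries to mimic is the ordering, where close() marks the handle as closed before complaining about pending callbacks, so the second rollback attempt fails with a different error than the first.

```python
class PendingCallbacksError(RuntimeError):
    """Hypothetical: raised when closing with unprocessed callbacks."""


class ConnectionClosedError(RuntimeError):
    """Hypothetical: raised when using a handle already marked closed."""


class ToyConnection:
    """Toy stand-in for a DB handle; not MediaWiki's Database class."""

    def __init__(self):
        self.is_open = True
        self.pending_callbacks = ["RenameuserSQL::rename"]

    def close(self):
        # Same order as in the description: mark closed first, then throw.
        self.is_open = False
        if self.pending_callbacks:
            raise PendingCallbacksError("Transaction callbacks are still pending")

    def rollback(self):
        if not self.is_open:
            raise ConnectionClosedError("DB connection was already closed.")
        # Assume the handle decides to reconnect, which starts with close().
        self.close()
        self.pending_callbacks.clear()
        self.is_open = True


def run_job(conn):
    """Return the exception names in the order they surface."""
    seen = []
    try:
        try:
            raise TimeoutError("Read timeout is reached")
        except TimeoutError as exc:
            seen.append(type(exc).__name__)
            conn.rollback()  # first rollback: close() throws
            raise
    except Exception as exc:
        seen.append(type(exc).__name__)
        try:
            conn.rollback()  # second rollback: handle is marked closed
        except Exception as exc2:
            seen.append(type(exc2).__name__)
    return seen


print(run_job(ToyConnection()))
# ['TimeoutError', 'PendingCallbacksError', 'ConnectionClosedError']
```

The original timeout is the first entry only; each later handler sees just the error raised by the previous cleanup attempt, which is why the logs end with messages unrelated to the slow query.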

Actionables:

  • LocalRenameJob should batch image-related updates if the user has many images. This task is blocked on that.
  • Database::replaceLostConnection should clean up callbacks before reopening the connection (it's done in handleSessionLoss which is called way too late). Except it also tries to run trxEndCallbacks callbacks (which should survive a rollback) and those need an open connection... so maybe split it in two?
    • Also Database::replaceLostConnection should somehow force the open/close logic it calls to ignore trxEndCallbacks and not throw exceptions because of it being set.
  • Maybe mark when a connection was closed due to errors and ignore rollback calls in such case? If the connection is lost, the transaction has effectively been rolled back, maybe log a warning but otherwise better to treat it as success than to throw an exception.

The last error does not seem worth bothering with.
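
The last actionable (treat a rollback on a connection that was lost due to an error as a success) could look roughly like the sketch below. This is a toy Python model under assumptions, not a proposal for the actual Database class: ToyConnection, mark_lost and closed_by_error are hypothetical names.

```python
import logging

logger = logging.getLogger("toy_db")


class ToyConnection:
    """Hypothetical DB handle that remembers why it was closed."""

    def __init__(self):
        self.is_open = True
        self.closed_by_error = False
        self.in_transaction = False

    def begin(self):
        self.in_transaction = True

    def mark_lost(self):
        # The connection went away mid-transaction, so the pending
        # writes are effectively rolled back already.
        self.is_open = False
        self.closed_by_error = True
        self.in_transaction = False

    def rollback(self):
        """Return True if a real rollback ran, False if it was skipped."""
        if self.closed_by_error:
            logger.warning("rollback skipped: connection was lost earlier")
            return False
        if not self.is_open:
            raise RuntimeError("DB connection was already closed.")
        self.in_transaction = False
        return True


conn = ToyConnection()
conn.begin()
conn.mark_lost()
print(conn.rollback())  # False: logged as a warning, no exception
```

A handle that was closed deliberately still raises, so genuine misuse is not hidden; only the "lost due to an error" case is downgraded to a warning.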

Change 491400 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/Renameuser@master] Batch renames in image tables

https://gerrit.wikimedia.org/r/491400
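
For readers unfamiliar with the idea, batching here means replacing the single large UPDATE seen in the log with many small ones. The sketch below shows the general technique in Python with an in-memory SQLite table. It is not the code from the Gerrit change: rename_in_batches, the batch size and the two-column table are assumptions made for illustration, with img_name used as the row key.

```python
import sqlite3


def rename_in_batches(conn, old_name, new_name, batch_size=500):
    """Update img_user_text a few rows at a time; return the batch count."""
    batches = 0
    while True:
        # Pick the keys of the next rows that still carry the old name.
        rows = conn.execute(
            "SELECT img_name FROM image WHERE img_user_text = ? LIMIT ?",
            (old_name, batch_size),
        ).fetchall()
        if not rows:
            return batches
        conn.executemany(
            "UPDATE image SET img_user_text = ? WHERE img_name = ?",
            [(new_name, name) for (name,) in rows],
        )
        conn.commit()  # keep every transaction short
        batches += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image (img_name TEXT PRIMARY KEY, img_user_text TEXT)")
conn.executemany(
    "INSERT INTO image VALUES (?, ?)",
    [(f"File{i}.jpg", "The Photographer") for i in range(1200)],
)
conn.commit()
print(rename_in_batches(conn, "The Photographer", "Wilfredor"))  # 3
```

Each statement touches at most batch_size rows, so no single query has to run long enough to hit a read timeout, and an interrupted run can simply be restarted because already-renamed rows no longer match.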

How long could it take?

Hello. I am not sure. There's a problem with the Renameuser extension, and they're working on a fix so that your rename can be restarted (and, by the way, so that future renames of users with lots of edits don't fail). I cannot provide an estimate, however. My impression is that the proposed fix is being actively reviewed though.

Well, thank you very much guys. I will wait patiently for this problem to be resolved some day

Change 491400 merged by Gergő Tisza:
[mediawiki/extensions/Renameuser@master] Batch renames in image tables

https://gerrit.wikimedia.org/r/491400

Change 492740 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/Renameuser@wmf/1.33.0-wmf.18] Batch renames in image tables

https://gerrit.wikimedia.org/r/492740

Change 492740 merged by Reedy:
[mediawiki/extensions/Renameuser@wmf/1.33.0-wmf.18] Batch renames in image tables

https://gerrit.wikimedia.org/r/492740

Mentioned in SAL (#wikimedia-operations) [2019-02-25T19:34:59Z] <reedy@deploy1001> Synchronized php-1.33.0-wmf.18/extensions/Renameuser: T215107 (duration: 00m 46s)

@Marostegui @jcrespo - green light to retry the rename after Renameuser apparently fixed?

The jobs need to be re-enqueued using the commands I left at T215107#4962195. I cannot do that. I hope there's a deployer around tomorrow to issue them, if you authorize.

Best regards.

@MarcoAurelio green light from us to get the job re-scheduled

Mentioned in SAL (#wikimedia-operations) [2019-02-26T06:10:24Z] <tgr> T215107 running mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'The_Photographer' 'Wilfredor'

Mentioned in SAL (#wikimedia-operations) [2019-02-26T06:34:07Z] <tgr> T215107 running mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki --ignorestatus 'The_Photographer' 'Wilfredor'

The job has finished running, no obvious sign of anything going wrong. Should be able to login now.

@aaron what do you think about the changes in T215107#4963154, do those make sense?

Lots of unattached accounts at https://meta.wikimedia.org/wiki/Special:CentralAuth/Wilfredor

This shouldn't be happening.

If @Wilfredor cannot merge them via Special:MergeAccount, we should attach them all.

That's not good...

Can you check if the move otherwise went OK? Especially, have images (current + old versions + deleted) been reassigned correctly, both on wikis where the account has a 500+ edit count and where it doesn't (those use a different algorithm)?

Wikimedia\Rdbms\LoadBalancer::runMasterTransactionIdleCallbacks: found writes pending (CentralAuthUser::removeLocalName, CentralAuthUser::addLocalName, Wikimedia\Rdbms\Database::onTransactionPreCommitOrIdle).

There are warnings like that on the affected wikis (sometimes the trace is CentralAuthUser::updateLocalName, Wikimedia\Rdbms\Database::onTransactionPreCommitOrIdle), all from jobs, and the timestamps match the renames. But the same seems to be true for wikis where the rename succeeded, and for other renames made since then (which all seem successful, although the affected accounts all had very few edits), so it is probably unrelated.

Oh, right, this is T188882: Attachment method should be preserved through global rename, let's follow up there. The accounts should be usable, this is more of a display bug.

@aaron what do you think about the changes in T215107#4963154, do those make sense?

They all make sense to me.

Oh, right, this is T188882: Attachment method should be preserved through global rename, let's follow up there. The accounts should be usable, this is more of a display bug.

As per this - I am going to close this as solved (if someone feels this needs to remain open, please do so!)
Thanks everyone!

So that's an impressive cascade of failures:

  • Updates for the image table are not batched so the query times out. (Why doesn't the DB error have a stack trace? I thought we are logging that.)

Note that the corresponding Exception log does have a backtrace, but the DBQuery one itself does not. This can probably be changed for convenience.