Page MenuHomePhabricator

title-invalid-characters error should be able to safely suggest a title
Closed, ResolvedPublic

Description

The title-invalid-characters message is displayed when trying to read a title that contains invalid characters like | or ]. This can happen when copying a URL incorrectly like <https://www.mediawiki.org/wiki/Download>, or using invalid link syntax like https://www.mediawiki.org/wiki/Download|page.

It should be be possible to automatically suggest a title by checking whether a valid page exists at the given title up to the first invalid character. In the above examples, we would suggest Download.

The interface message sets $2 to the invalid title, so a naively we could attempt to parse this with templates/Lua and use {{#ifexist:}}. However, the value of $2 is not escaped, so it is not safe to pass into templates, as I tried here. ({{foo|$2}} with the invalid title Download|page expands to {{foo|Download|page}}, and there is no real way to escape the pipe. An invalid title like Download}}page will break it completely.)

A possible solution (suggested by @Legoktm) is for MediaWiki to set $3 to a suggested title (for now, the given title up to the first invalid character), and an interface message could simply use {{#ifexists:$3|Did you mean: [[:$3]]?}}.

See discussion at enwiki's village pump.

Event Timeline

Throwing an extra regex check in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/title/MediaWikiTitleCodec.php#425 should do the trick.

A possible solution (suggested by @Legoktm) is for MediaWiki to set $3 to a suggested title (for now, the given title up to the first invalid character), and an interface message could simply use {{#ifexists:$3|Did you mean: [[:$3]]?}}.

Now that I've thought about it more, ideally we would do the ifexist check inside PHP, and then output a different message (e.g. title-invalid-characters-dym) if the fixed title does exist with a link using $3. Otherwise it would just use the default behavior. But this code is coming from MediaWikiTitleCodec which doesn't know whether titles exist or not. I think we could create a subclass of MalformedTitleException just for the invalid characters case that did this logic of changing the message key/params if the title exists.

Now that I've thought about it more, ideally we would do the ifexist check inside PHP, and then output a different message (e.g. title-invalid-characters-dym) if the fixed title does exist with a link using $3. Otherwise it would just use the default behavior. But this code is coming from MediaWikiTitleCodec which doesn't know whether titles exist or not.

I don't think the suggestion needs to bother about checking the title existence — just validity. If the page exists, that's fine, if it doesn't, normal flow of redlink will kick in. When you have "Download>", it's better to display redlink for "Download" than the present error message.

The parameter should simply go through wfEscapeWikiText() to replace special characters with HTML entities. You can currently display arbitrary wikitext on that error page… https://www.mediawiki.org/wiki/Download{{test}}?uselang=qqx

Change 666130 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/core@master] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666130

Change 666130 merged by jenkins-bot:
[mediawiki/core@master] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666130

matmarex claimed this task.

You should be able to use $2 as expected after this change is deployed later this week. (Note that it is escaped using HTML entities, e.g. { to &#123;, so you might need to un-escape it in Lua code – I'm not sure how that is usually handled.)

Will this be backported to 1.31 and 1.35? (It's not a huge deal for us if not; it's a simple enough change that we could just apply the patch locally, but [obviously] I would prefer to avoid doing that.)

Change 666345 had a related patch set uploaded (by Legoktm; owner: Bartosz Dziewoński):
[mediawiki/core@REL1_35] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666345

Change 666686 had a related patch set uploaded (by Legoktm; owner: Bartosz Dziewoński):
[mediawiki/core@REL1_31] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666686

Will this be backported to 1.31 and 1.35? (It's not a huge deal for us if not; it's a simple enough change that we could just apply the patch locally, but [obviously] I would prefer to avoid doing that.)

It cherry-picked cleanly, so done :)

Change 666686 merged by jenkins-bot:
[mediawiki/core@REL1_31] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666686

Change 666345 merged by jenkins-bot:
[mediawiki/core@REL1_35] Escape wikitext in the title in invalid title error messages

https://gerrit.wikimedia.org/r/666345

Thanks a lot for the quick fix and deployment! If anyone is interested, I've got my proof of concept working in Lua here: https://en.wikipedia.org/wiki/Module:Bad_title_suggestion.