@Ladsgroup intends to do a performance review to make sure the additional messages that the extension loads do not adversely affect performance.
This review blocks further deployment of the extension.
TTO | |
Jan 30 2024, 8:34 AM |
F41740410: grafik.png | |
Feb 1 2024, 4:29 PM |
@Ladsgroup intends to do a performance review to make sure the additional messages that the extension loads do not adversely affect performance.
This review blocks further deployment of the extension.
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Open | None | T151658 Add some new MediaWiki system messages inside #mw-content-text | |||
Open | None | T138652 [Story] Add a new MediaWiki system message at the footer of content pages | |||
Open | None | T151682 Add a new MediaWiki system message as a content header inside #mw-content-text | |||
In Progress | Feature | None | T6469 Provide per-namespace site notices | ||
Stalled | None | T61245 Review the PageNotice extension for deployment | |||
Open | Ladsgroup | T356159 Performance review for PageNotice extension |
Generally, looks good. I think I found one potential issue:
One note: This really should be in core. It'd be great if just that fifty lines moved to core and undeploy this extension.
Yup, the wikitext of the message gets cached by not parsing it... https://performance.wikimedia.org/excimer/profile/4cef32d5fdd53209
@Ladsgroup thanks for taking a look.
If templates or modules are called from the PageNotice message, the parsed result will vary per page, so it would make sense to cache the parsed result in the same place the page content itself is cached. I don't know what that place is, or how to do it, though.
Anything I can do to assist?
It should be in ParserCache. This should be basically and parser cache value. The same way language links are. I think it's called extension options. @daniel would know for sure
Anything I can do to assist?
Parsing is...complicated. I'd say I'll have to bother CTT on this. Give me a bit. This is more work than usual. We can unblock this extension by simply putting that into PC but for other cases, I'll need to find a solution.
You can attach data to the ParserOutput using the setExtensionData() emthod. The data can be a nested array, but must not contain objects.
But this must happen before the output gets written to the parser cache, probably in ParserAfterParseHook. It is currently not easily possible to update a ParserCache entry once it has been saved.
Yup, that was it.
But this must happen before the output gets written to the parser cache, probably in ParserAfterParseHook.
Yeah, the deployment is blocked unless this happens. Someone needs to do it but not me.
It is currently not easily possible to update a ParserCache entry once it has been saved.
That's an acceptable compromise for the sake of performance
@daniel @Ladsgroup Does this mean we should switch PageNotices to be in the page content language? Not sure if it's possible to put userlang stuff in PC. Right now they are rendered in user language but should be fine to switch to content language as it's a new feature.
Yeah it should be page language. I don't think it's reasonable to make this be parsed on the fly on every page view :(
It's possibel to split the parser cache by user language. All you have to do is call getUserLangObj() on the ParserOptions object. But we'd be duplicating the actual page content in the cache... For a small-ish number of pages, that would be ok I guess.
This extension is going to be showing a notice over every page in the wiki (mw message per ns)
The part of PageNotice extension that provides page-specific notices is not planned to be used in WMF prod, but only the namespace-level notices. They can use variables like {{PAGENAME}} so a cache strategy like that of core sitenotices (which are parsed once and reused across all pages) doesn't work. @daniel @Ladsgroup Can you confirm if parser caching is still workable? I am at a loss to understand how cache invalidation would work:
Yeah exactly, the idea is that we just don't and let it get regenerated with refreshlinks/edits/expiry. I don't think page notices are time sensitive (similar to templates). I could be wrong though.
Parsing the message in every pageview (status quo) is also quite taxing and that's why needs to be addressed.
- We could instead store the rev_id of the notice in the page's PC, and in the ArticleViewHeaderHook, check if the cached rev_id is still the latest. If not, ignore the cached notice and parse the notice now. In a deferred update, purge the page so that a cached notice would be available the next time.
- However, this only works for logged-in views. For anonymous views, the entire HTML which includes the notices are served from the CDN cache, so users would see stale notices on pages which haven't been edited/purged after the notice update.
That could work but templates used in the page notice won't trigger an update but maybe an acceptable loss.
Can we take a couple of steps back?
This is injecting a system message into the page output, depending on the namespace.
We inject dozens of system messages in the user's language into the output on every request to build the skin.
I don't see why this is a problem, or why it needs caching. I mean, if the message is slow to parse, then we'd run into problems.
I am not very familiar with MessageCache. I was assuming that it would also cache parsed messages. But i'm not seeting the relevant code. Perhaps @Nikerabbit or @abi_ can shed some light.
Well, when a namespace notice is created (or updated), one would expect to see it on all pages – not just on newly edited ones. I think it will be very confusing otherwise.
I concur with this, although Ladsgroup seems to disagree :) We already have a couple of system messages that act like page notices: Scribunto-doc-page-show (shown on module pages) and Clearyourcache (shown on user/site level CSS/JS/JSON pages), that wikis customise heavily. I understand those pages are back-areas, and other namespaces like ns0 get a lot more traffic, but most of those hits would be anonymous and served from Varnish. I would prefer to see some data showing performance impact before we make premature optimizations. It seems clear now that using the parser cache would be a hacky solution.
They are from software and thus don't have complex templates with lots of lookups and such, and also should be cached (to the degree possible) but currently aren't.
I don't see why this is a problem, or why it needs caching. I mean, if the message is slow to parse, then we'd run into problems.
This message would come from user input. The chance of users not knowing the impact of this and accidentally bringing down the site is quite high.
I am not very familiar with MessageCache. I was assuming that it would also cache parsed messages. But i'm not seeting the relevant code. Perhaps @Nikerabbit or @abi_ can shed some light.
Nope, I checked multiple times, we cache the content of message but not the parsed value. We should :D (in cases that it doesn't involve fancy stuff like user lang etc.)
They can use variables like {{PAGENAME}} so a cache strategy like that of core sitenotices (which are parsed once and reused across all pages) doesn't work.
Note: Since we can use such variables, some wikis may configure it more heavily once it is available. One example (copied from T138652) is to automatically show content from Wikidata (e.g. the table for authority control), so it will have completely different (and not cheap to get) content in different pages.
I think content language actually even makes more sense from a user point of view. Since this is some kind of content, maintained by wiki administrators, they probably won’t want to maintain multiple subpages. On monolingual wikis, admins won’t bother maintaining it in any language other than the content language. On multilingual wikis, probably an {{int:lang}}-based hack will be used instead of duplicating the whole notice in multiple languages – the extension should be prepared for the case that the message content splits the cache by language, but it doesn’t have to worry about the user language on its own.
One thing to consider is that this is page notice (header and footer) and not site notice. So it makes sense to be dependent on the page. Whether it's authority control data or the message's language.
If there is a need for a general purpose message, that should go to the sitenotice.
MediaWiki has three notions of languages:
Language category | Specific to | Accessible in wikitext with |
---|---|---|
Content | wiki | {{CONTENTLANGUAGE}} (supported) |
Interface | user (mostly) | {{int:lang}} (hack, only if set up, e.g. on Commons; splits the parser cache by interface language if used) |
Page | page (ideally) | {{PAGELANGUAGE}} (supported) |
Currently the extension uses the interface language, which is dependent on the user, not the page. I propose that it uses the content language, and if a notice needs to do things differently based on the page language – which won’t be common I think –, it can use {{PAGELANGUAGE}} (of course, the notice needs to be parsed in the context of the page).