The problem
During discussion of T155813 (Decide on storage and delivery method for TemplateStyles CSS), it became clear that, for performance, it would be desirable for parser tags and parser functions that add stylesheets to the page (i.e. by calling addModuleStyles()) to embed those styles in the body content in <style> tags rather than adding them to the stylesheets loaded in the page head. The embedded styles would be deduplicated if the same styles were included multiple times.
This could be used for the following components and extensions. This is an incomplete list, drawn from a quick grep through WMF-deployed extensions.
- MediaWiki-HTMLForm
- MediaWiki-extensions-Babel
- ShoutWiki Calendar
- MediaWiki-extensions-CategoryTree
- CharInsert
- Cite
- MediaWiki-extensions-InputBox
- Maps (Kartographer)
- Math
- SyntaxHighlight
- TemplateData
- TemplateStyles
- EasyTimeline
- MediaWiki-extensions-Translate
- WikiHiero
- WikimediaIncubator
Services such as Mobile-Content-Service may need to be aware of the deduplication so they can make sure styles are retained even if the HTML containing the <style> tag is removed.
Straw proposal
This takes advantage of the existing code in ResourceLoader for loading data from various sources (filesystem, on-wiki, etc) and minifying it.
- Add a static method ResourceLoaderClientHtml::getStyleEmbedToken( $module ).
- Extension tag hooks that wish to add embedded styles will call this and include the returned token-string in their output.
- The tokens will be general strip markers, with content being <link rel="stylesheet" href="/w/load.php?..." data-mw-embed-module="..."/>. That way even if they somehow don't get replaced the browser should still load the styles, just in a less-efficient way.
- Add a static method ResourceLoaderClientHtml::embedStyleModules( $context, $text ).
- This will replace the above <link> tags with <style data-mw-embeded-module="...">...</style>. Only the first <style> tag for each module will have content; the rest will be empty (but still present to support deduplication).
- The PHP parser will call embedStyleModules() just before the ParserBeforeTidy hook.
- So that MediaWiki-extensions-TemplateSandbox can work right, it'll use the mCurrentRevisionCallback from ParserOptions for Ib9d2ce42's setContentOverrideCallback. We'll probably have to add a flag to ParserOptions to indicate that doing so is safe, for the same reason that OutputPage::userCanPreview() exists.
- MobileFrontend's "action=mobileview" will need a patch to avoid screwing up the styles when it does its page mangling. This would be done by extracting the styles from the page HTML as a whole before it's mangled, then injecting them (with deduplication) back into each chunk of HTML after the page has been mangled and split.
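The replacement step in embedStyleModules() can be sketched roughly as follows. This is only an illustration in Python, not the proposed PHP implementation: the function name embed_style_modules, the load_css callback (standing in for ResourceLoader's loading and minification), and the regex-based matching are all assumptions made for the example. The attribute names are copied from the proposal as written.

```python
import re

# Matches the placeholder <link> tags described in the proposal.
# The href is not inspected; only the module name is captured.
LINK_RE = re.compile(
    r'<link rel="stylesheet" href="[^"]*" data-mw-embed-module="([^"]*)"\s*/>'
)


def embed_style_modules(text, load_css):
    """Replace placeholder <link> tags with <style> tags.

    load_css is a hypothetical callback returning the CSS for a module name.
    Only the first <style> for each module gets content; later ones are left
    empty but present, so consumers can still see where the module was used.
    """
    seen = set()

    def replace(match):
        module = match.group(1)
        attr = f'data-mw-embeded-module="{module}"'
        if module in seen:
            return f"<style {attr}></style>"
        seen.add(module)
        # A real implementation would also have to make sure the CSS
        # cannot contain "</style" and close the tag early.
        return f"<style {attr}>{load_css(module)}</style>"

    return LINK_RE.sub(replace, text)


if __name__ == "__main__":
    css = {"ext.example": ".a{color:red}"}  # hypothetical module and styles
    link = '<link rel="stylesheet" href="/w/load.php?..." data-mw-embed-module="ext.example"/>'
    print(embed_style_modules(f"<p>x</p>{link}<p>y</p>{link}", css.__getitem__))
```

Doing the substitution in a single pass over the whole text is what makes "first occurrence wins" well defined; an implementation working on a DOM rather than on strings would need to walk the tags in document order to get the same result.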
The end result should be:
- The output article HTML gets embedded, deduplicated styles.
- Styles might be duplicated if they're used in multiple parsed messages on a page.
- I think Parsoid will wind up with embedded, not-deduplicated styles, since it seems to parse each parser tag individually via api.php. It would be up to Parsoid's maintainers to deduplicate based on the data-mw-embeded-module attribute if they want to.
- Mobile-Content-Service looks like it uses MobileFrontend for the section stuff, so it should be ok if MobileFrontend is.
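For consumers that split or reassemble the HTML, the extract-then-reinject idea could look something like the sketch below. It is written in Python purely for illustration; collect_styles and restore_styles are hypothetical names, and matching tags with a regular expression is a simplification (a real consumer would presumably work on a parsed DOM). It assumes the <style data-mw-embeded-module="..."> markup described in the straw proposal.

```python
import re

# Matches both populated and empty embedded <style> tags.
STYLE_RE = re.compile(
    r'<style data-mw-embeded-module="([^"]*)">(.*?)</style>', re.DOTALL
)


def collect_styles(page_html):
    """Map each module name to its CSS, taken from the first populated tag."""
    styles = {}
    for module, css in STYLE_RE.findall(page_html):
        if css and module not in styles:
            styles[module] = css
    return styles


def restore_styles(chunk_html, styles):
    """Within one chunk, populate the first <style> per module, empty the rest."""
    seen = set()

    def replace(match):
        module = match.group(1)
        if module in seen:
            css = ""
        else:
            # Fall back to the tag's own content if the module is unknown.
            css = styles.get(module, match.group(2))
        seen.add(module)
        return f'<style data-mw-embeded-module="{module}">{css}</style>'

    return STYLE_RE.sub(replace, chunk_html)


if __name__ == "__main__":
    tag = '<style data-mw-embeded-module="m">{}</style>'
    page = "<h2>A</h2>" + tag.format(".a{}") + "<h2>B</h2>" + tag.format("") * 2
    styles = collect_styles(page)
    for chunk in page.split("<h2>B</h2>"):
        print(restore_styles(chunk, styles))
```

The same restore_styles pass also covers the not-deduplicated case: if every tag in a fragment arrives populated, the first one per module is kept and the later ones are emptied.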