Reduce runtime of MW shared gate Jenkins jobs to 5 min
Open, MediumPublic
Actions

Assigned To

None

Authored By

	Krinkle
	Jun 13 2019, 3:28 PM

Description

Objective

For the "typical" time it takes for a commit to be approved and landed in master to be 5 minutes or less.

Status quo

As of 11 June 2019, the gate usually takes around 20 minutes.

The two slowest jobs typically take 13-17 minutes each. The time for the gate overall is rarely under 15 minutes, because we run multiple of these jobs (increasing the chances of random slowness), and while they can run in parallel, they don't always start immediately - given limited CI execution slots.

Below a sample from a MediaWiki commit (master branch):

Gate pipeline build succeeded.

wmf-quibble-core-vendor-mysql-php72-docker SUCCESS in 12m 03s

wmf-quibble-core-vendor-mysql-hhvm-docker SUCCESS in 14m 12s

mediawiki-quibble-vendor-mysql-php72-docker SUCCESS in 7m 34s

mediawiki-quibble-vendor-mysql-php71-docker SUCCESS in 7m 12s

mediawiki-quibble-vendor-mysql-php70-docker SUCCESS in 6m 48s

mediawiki-quibble-vendor-mysql-hhvm-docker SUCCESS in 8m 32s

mediawiki-quibble-vendor-postgres-php72-docker SUCCESS in 10m 05s

mediawiki-quibble-vendor-sqlite-php72-docker SUCCESS in 7m 04s

mediawiki-quibble-composer-mysql-php70-docker SUCCESS in 8m 14s

(+ jobs that take less than 3 minutes: composer-test, npm-test, and phan.)

These can be grouped in two kinds of jobs:

wmf-quibble: These install MW with the gated extensions, and then run all PHPUnit, Selenium and QUnit tests.
mediawiki-quibble: These install MW bundled extensions only, and then run PHPUnit, Selenium and QUnit tests.

Stats from wmf-quibble-core-vendor-mysql-php72-docker

9-15 minutes (wmf-gated, extensions-only)
Sample:
- PHPUnit (dbless): 1.91 minutes / 15,782 tests.
- QUnit: 29 seconds / 1286 tests.
- Selenium: 143 seconds / 43 tests.
- PHPUnit (db): 3.85 minutes / 4377 tests.

Stats from mediawiki-quibble-vendor-mysql-php72-docker:

7-10 minutes (plain mediawiki-core)
Sample:
- PHPUnit (unit+dbless): 1.5 minutes / 23,050 tests.
- QUnit: 4 seconds / 437 tests.
- PHPUnit (db): 4 minutes / 7604 tests.

Updated status quo

As of 11 May 2021, the gate usually takes around 25 minutes.

The slowest job typically takes 20-25 minutes per run. The time for the gate overall can never be faster than the slowest job, and can be worse as though we run other jobs in parallel, they don't always start immediately, due to given limited CI execution slots.

Below is the time results from a sample MediaWiki commit (master branch):

[Snipped: Jobs faster than 5 minutes]

9m 43s: mediawiki-quibble-vendor-mysql-php74-docker/5873/console
9m 47s: mediawiki-quibble-vendor-mysql-php73-docker/8799/console
10m 03s: mediawiki-quibble-vendor-sqlite-php72-docker/10345/console
10m 13s: mediawiki-quibble-composer-mysql-php72-docker/19129/console
10m 28s: mediawiki-quibble-vendor-mysql-php72-docker/46482/console
13m 11s: mediawiki-quibble-vendor-postgres-php72-docker/10259/console
16m 44s: wmf-quibble-core-vendor-mysql-php72-docker/53990/console
22m 26s: wmf-quibble-selenium-php72-docker/94038/console

Clearly the last two jobs are dominant in the timing:

wmf-quibble: This jobs installs MW with the gated extensions, and then runs all PHPUnit and QUnit tests.
wmf-quibble-selenium: This job installs MW with the gated extensions, and then runs all the Selenium tests.

Note that the mediawiki-quibble jobs each install just the MW bundled extensions, and then run PHPUnit, Selenium and QUnit tests.

Stats from wmf-quibble-core-vendor-mysql-php72-docker:

13-18 minutes (wmf-gated, extensions-only)
Select times:
- PHPUnit (unit tests): 9 seconds / 13,170 tests.
- PHPUnit (DB-less integration tests): 3.31 minutes / 21,067 tests.
- PHPUnit (DB-heavy): 7.91 minutes / 4,257 tests.
- QUnit: 31 seconds / 1421 tests.

Stats from wmf-quibble-selenium-php72-docker:

20-25 minutes

Scope of task

This task represents the goal of reaching 5 minutes or less. The work tracked here includes researching ways to get there, trying them out, and putting one or more ideas into practice. The task can be closed once we have reached the goal or if we have concluded it isn't feasible or useful.

Feel free to add/remove subtasks as we go along and consider different things.

Stuff done

T202116: LoadBalancer opening extra connections in different connection categories doesn't work with PHPUnit & temporary tables
T225496: Improve caching in CI tests
T225901: Don't deduplicate archive table on new installs
T225719: HashRingTest::testHashRingKetamaMode takes 3+ seconds
T225184: CirrusSearch\SearcherTest::testSearchText PHPUnit tests take a while and runs for everyone
T227067: ReleaseNotesTest:testReleaseNotesFilesExistAndAreNotMalformed takes ~ 4 seconds
T196347: Quibble may need to rebuild localization cache before running tests
Speed up LocalisationCache in Quibble jobs by switching from DB to file-based (by setting $wgCacheDirectory). – https://gerrit.wikimedia.org/r/528933, https://gerrit.wikimedia.org/r/529057
Skip "npm test" step during Quibble jobs (covered by separate node10-test job). – T233143
T230701: Migrate Scribunto to stop using MediaWikiIntegrationTestCase on unit tests
T232759: Move CI selenium/qunit tests of mediawiki repository to a standalone job
Wikibase: T231862: Selenium tests for Wikibase are being ran twice
T225068: Add a PHPUnit group to skip test on gated CI runs
T225248: Consider moving browser based tests (Selenium and QUnit) to a non-voting pipeline
Use npm ci: T234738: Quibble jobs re-download npm packages every build (Castor not loading?)

Ideas to explore and related work

Look at the PHPUnit "Test Report" for a commit and sort the root by duration. Find the slowest ones and look at its test suite to look for ways to improve it. Is it repeating expensive setups? Perhaps that can be skipped or re-used. Is it running hundreds of variations for the same integration test? Perhaps reduce it to just 1 case for that story, and apply the remaining cases to a lighter unit test instead.

Details

Subject	Repo	Branch	Lines +/-
Make MapCacheLRU in LanguageFactory static	mediawiki/core	master	+14 -5
phpunit: Set much smaller defaults for RandomImageGenerator	mediawiki/core	master	+6 -6
qunit: Replace slow mw.messages reset with empty object reset	mediawiki/core	master	+41 -19
Run post-dependency install, pre-test steps in parallel	integration/quibble	master	+40 -15
DNM: Measuring impact of some improvements	mediawiki/core	master	+6 -0
DevelopmentSettings: Use MWLoggerDefaultSpi for debug logging	mediawiki/core	master	+30 -4
[DNM] Add excimer config	integration/quibble	master	+109 -0
selenium: Move notifications page test into general suite	mediawiki/extensions/Echo	master	+6 -18
selenium: Use a single login for browser tests	mediawiki/extensions/Echo	master	+16 -33
Less_Parser: Faster MatchQuoted() by using native strcspn()	mediawiki/libs/less.php	master	+3 -3
TestLocalisationCache: add static cache of JSON files	mediawiki/core	master	+14 -0
Make "pluralRules" caches static in LocalisationCache	mediawiki/core	master	+22 -25
Less_Parser: Inline and optimize heavily called MatchQuoted()	mediawiki/libs/less.php	master	+39 -43
Improve performance of LocalisationCache::mergeMagicWords()	mediawiki/core	master	+11 -8
Much faster tests by setting content language to qqx	mediawiki/extensions/Kartographer	master	+35 -27
[POC] Set content language to qqx in slow tests	mediawiki/extensions/Translate	master	+7 -4
Re-arrange execution order in LESS parser for performance	mediawiki/libs/less.php	master	+11 -6
tests: Pass Title to editPage() when already parsed	mediawiki/core	master	+30 -30
Fix TestLocalisationCache being way to small	mediawiki/core	master	+5 -4
storage tests: Call editPage() with WikiPage when used for same page	mediawiki/core	master	+23 -22
api tests: Call editPage() with WikiPage when used for same page	mediawiki/core	master	+52 -55
npm: Use cache for npm ci and prefer offline	integration/quibble	master	+24 -2
[DNM] Measure impact of GeneralizedSql on tests run time	mediawiki/core	master	+3 -3
password: Reduce time cost of password unit tests	mediawiki/core	master	+2 -28
Remove selenium entries from package.json	mediawiki/extensions/MobileFrontend	master	+0 -3
dockerfiles: Add php-excimer to quibble	integration/config	master	+14 -0
jjb: [quibble] Remove 'compress junit' postbuildscript step	integration/config	master	+0 -4
[DNM] Test if reduction in logging would reduce the tests run time	mediawiki/core	master	+0 -5
debug: Fix $wgDebugRawPage to work with PSR-3 debug logging	mediawiki/core	master	+56 -16
phpunit: Set $wgSQLMode from DevelopmentSettings instead of MediaWikiIntegrationTestCase (ii)	mediawiki/core	master	+2 -12
phpunit: Fix slow testBotPasswordThrottled by lowering limits	mediawiki/core	master	+8 -3
Logging: Use MWLoggerDefaultSpi	integration/quibble	master	+34 -0
phpunit: Set $wgSQLMode from DevelopmentSettings instead of MediaWikiIntegrationTestCase	mediawiki/core	master	+2 -12
Revert "zuul: Install MobileFrontend when testing Echo"	integration/config	master	+1 -1
jjb: update Quibble jobs from 1.3.0 to 1.4.3	integration/config	master	+32 -32
jjb: update Quibble jobs from 1.3.0 to 1.4.0	integration/config	master	+28 -28
release: Quibble 1.4.0	integration/quibble	master	+27 -3
build: Update eslint-config-wikimedia to 0.21.0	mediawiki/extensions/ProofreadPage	master	+11 K -621
build: Update eslint-config-wikimedia to 0.21.0	mediawiki/extensions/GrowthExperiments	master	+13 K -549
build: Update eslint-config-wikimedia to 0.21.0	mediawiki/extensions/Echo	master	+11 K -216
build: Update eslint-config-wikimedia to 0.21.0	mediawiki/extensions/AbuseFilter	master	+12 K -469
build: Update eslint-config-wikimedia to 0.21.0	mediawiki/extensions/FileImporter	master	+11 K -530
phpunit: Remove MediaWikiPHPUnitTestListener	mediawiki/core	master	+22 -140
mediawiki.jqueryMsg: Refactor test suite to not make any API requests	mediawiki/core	master	+1 K -679
Revision: use MWTimestamp directly instead of wfTimesamp()	mediawiki/core	master	+21 -14
Revision: Use setFakeTime() instead of sleep() in unit tests	mediawiki/core	master	+6 -4
build: Merge doc linting into 'npm test'	mediawiki/core	master	+1 -1
layout: Drop mediawiki-core-javascript-docker, rely on mwgate-node10-docker	integration/config	master	+0 -19
zuul: omit mediawiki-quibble-(composer\|vendor) from mwcore test pipeline	integration/config	master	+12 -5
phpunit: Speed up MediaWikiCoversValidator trait	mediawiki/core	master	+20 -19
resourceloader: Speed up structure/ResourcesTest	mediawiki/core	master	+17 -49
selenium: Remove "RunJobs" wait from specialrecentchanges test	mediawiki/core	master	+1 -3
localisation: Add process cache to LCStoreDB	mediawiki/core	master	+13 -1
jjb: Upgrade all quibble jobs to 0.0.35	integration/config	master	+22 -22
docker: rebuild for quibble 0.0.35	integration/config	master	+54 -0
Set cache directory	integration/quibble	master	+4 -0
resourceloader: Remove slow structure test for checking getVersionHash	mediawiki/core	master	+3 -10
phpunit: Avoid get_class() in MediaWikiCoversValidator	mediawiki/core	master	+1 -1
Avoid using MediaWikiIntegrationTestCase on unit tests	mediawiki/extensions/Wikibase	master	+70 -21
resourceloader: Speed up dependency checks in structure/ResourcesTest	mediawiki/core	master	+27 -37
resourceloader: Document which FileModule methods use a DB	mediawiki/core	master	+29 -5
AutoLoader: Skip tokenizing of irrelevant lines in ClassCollector	mediawiki/core	master	+37 -1
Make BabelTest less slow (don't create a dozen wiki pages for every test)	mediawiki/extensions/Babel	master	+47 -52
Add a report about slow PHPUnit tests	mediawiki/core	master	+16 -1

Revisions and Commits

rMW MediaWiki
	rMW047c184bfef7 tests: Use Title::makeTitle instead of Title::newFromText

Related Objects
Search...

Status	Subtype	Assigned	Task
Open		None	T225730 Reduce runtime of MW shared gate Jenkins jobs to 5 min
Resolved		Ladsgroup	T230701 Migrate Scribunto to stop using MediaWikiIntegrationTestCase on unit tests
Resolved		aaron	T202116 LoadBalancer opening extra connections in different connection categories doesn't work with PHPUnit & temporary tables
Resolved		hashar	T188375 castor rsync's taking 3-5 minutes for mwgate-npm jobs
Resolved		aborrero	T232646 Move integration-castor03.integration.eqiad.wmflabs to a newer cloudvirt machine
Resolved		hashar	T232644 Check bandwidth limitation on integration-castor03.integration.eqiad.wmflabs / cloudvirt1002
Resolved		hashar	T232759 Move CI selenium/qunit tests of mediawiki repository to a standalone job
Resolved		hashar	T249845 CI jobs for WMF deployed extensions should use vendor.git , not composer
Resolved		• Mholloway	T237596 Add MachineVision dependencies to vendor
Declined		Reedy	T237588 Security review for MachineVision libraries
Open		None	T234002 Make MediaWiki Wdio tests less slow (Sept 2019)
Resolved		Krinkle	T213268 Upgrade webdriverio to version 5 in mediawiki/core
Resolved		Jdforrester-WMF	T222406 Switch quibble-based CI jobs from node6 to node10
Resolved		Jdforrester-WMF	T224983 mediawiki-phpunit-coverage-patch-docker fails to install [email protected]
Declined		Jdforrester-WMF	T224997 Update MobileFrontend-npm-run-lint-modules-docker to run node10
Resolved		Krinkle	T214708 Usage instructions in tests/selenium/README.md are confusing
Resolved		hashar	T234028 Investigate speed of MediaWiki logging setup under Quibble
Declined		None	T225871 Selenium and PHPUnit: Stop execution on failure
Resolved		Jdforrester-WMF	T234738 Quibble jobs re-download npm packages every build (Castor not loading?)
Open		None	T235449 Quibble: Run PHPUnit databaseless and database stages in parallel
Open		None	T303046 SearchEngineTest: failures when run concurrently with non-database group PHPUnit tests
Resolved		None	T233133 DefaultPreferencesFactoryTest:testIntvalFilter takes ~8s for a pretty trivial test for a single preference value
Resolved		Daimona	T234020 Switch mediawiki code coverage from xdebug to pcov
Open		None	T243847 Add pcov PHP extension to wikimedia apt (and upgrade from 1.0.6-4+wmf1~buster1 to 1.0.11) so it can be used in Wikimedia CI
Declined		None	T248531 Abort a Zuul pipeline when one job completed with failures (change zuul scheduler's failure check from areAllJobsComplete to didAnyJobFail)
Open		None	T129357 Migrate deployment of integration/config for zuul / jjb to scap deploy
Resolved		None	T134001 scap to reload a service instead of restart
Duplicate		None	T248596 Change zuul scheduler's failure check from areAllJobsComplete to didAnyJobFail to abort CI failures earlier
Open		None	T226869 Run browser tests in parallel
Declined		None	T291476 Migrate Wikibase selenium tests to Quibble+Apache, enable concurrency
Resolved		None	T285649 Switch all Quibble Selenium and api-testing jobs to use apache
Resolved		None	T276428 Introduce non-voting jobs with quibble+apache
Resolved		awight	T225218 Consider httpd for quibble instead of php built-in server
Resolved		kostajh	T282154 Wikibase: Special:EntityData compression encoding issues in local dev env
Open		None	T227352 Set up extension tests for Parsoid repo
Resolved		cscott	T248726 Run phan on code in extension/
Resolved		kostajh	T297480 Inject environment variable to indicate that Quibble is running with Apache
Open		None	T298514 "should mark the revert as a bot edit" test flaky with Apache + php-fpm
Resolved		kostajh	T299491 Switch QUnit tests to use Apache backend
Resolved		hashar	T299492 Quibble ci-fullrun jobs should use Apache backend
Open		None	T301100 Fix flaky ContributionsCount.js test
Resolved	PRODUCTION ERROR	hoo	T255566 [LEX] [TECH] Figure out why LexemeEditEntityTest::testGivenInvalidDataInClearRequest_errorIsReported is throwing an Exception/logging an error with AbuseFilter enabled
Resolved		Lucas_Werkmeister_WMDE	T298682 Wikibase\Repo\Store\Sql\SqlIdGenerator::generateNewId deadlock on wb_id_counters when running selenium tests in parallel
Resolved		None	T337862 Try to run browser tests of WMDE-TechWish projects in parallel
Resolved		Dreamy_Jazz	T340823 Fix CheckUser selenium tests so they can run concurrently
Resolved		None	T340943 TwoColConflict selenium tests run in parallel trigger db errors
Open		None	T344754 Browser tests video capture is shared between tests
Open		None	T345149 Special:RecentChanges Selenium test has a race condition when running concurrently
Open		None	T287582 Move some Wikibase selenium tests to a standalone job
Open		None	T50217 Speed up MediaWiki PHPUnit build by running integration tests in parallel
Resolved		kostajh	T297078 PHPUnit tests should pass when run on their own
Resolved		kostajh	T297068 MediaWikiServicesTest::testDefaultServiceInstantiation fails when run on its own
Resolved		Krinkle	T297079 LanguageIntegrationTest.php fails when run on its own
Resolved		kostajh	T297082 DefaultPreferencesFactoryTest fails when run on its own
Resolved		kostajh	T297102 ArticleViewTest::testViewOfOldRevisionFromCache fails when run on its own
Resolved		kostajh	T297103 SkinMustacheTest::testGetTemplateData fails when run on its own
Resolved		Daimona	T90875 Use vendor/bin/phpunit instead of tests/phpunit/phpunit.php
Resolved		kostajh	T297292 Fix "Deprecated: Premature access to service" from data providers when using vendor/bin/phpunit
Resolved		Daimona	T227900 Phase out usage of tests/phpunit/suite.xml
Resolved		daniel	T304183 Allow LocalSettings.php to be loaded in a function scope
Resolved	BUG REPORT	kostajh	T306020 PHPUnit integration tests fail with "No such service: HookContainer"
Resolved		Daimona	T342259 Move MediaWikiIntegrationTestCase setup and teardown away from run()
Resolved		Daimona	T342301 Remove MediaWikiIntegrationTestCase::$tablesUsed in favour of automatic query tracking
Open		None	T345481 Migrate Parser and extension tests away from deprecated PHPUnit TestSuite subclassing
Open		None	T358394 Stop using PHPUnit's TestSuite in Scribunto
Resolved		ArthurTaylor	T363797 DifferenceEngineTest::testMapDiffPrevNext fails when run standalone in CI / Quibble context
Resolved		ArthurTaylor	T365976 [REPO][CLIENT][SW] Create a `composer phpunit:prepare-parallel-test-suites` command to split suite into smaller groups
Resolved		None	T368369 SkinMinervaTest.php fails when run in a suite with SpecialAvailableBadgesTest.php
Resolved		None	T368390 RevisionFormatterTest.php fails when run in a suite with PermissionsTest.php
Resolved		ArthurTaylor	T368506 PostCollectionTest.php fails when run in a suite with SpecialPageFatalTest.php
Resolved		Lucas_Werkmeister_WMDE	T369708 ConfigHelperTest.php and MinervaPagePermissionsTest.php fail when run in a suite with SpecialListDatatypesTest.php
Resolved		ArthurTaylor	T374912 SpecialCentralAuthTest fails when run in a suite with PopulateCentralCheckUserIndexTablesTest
Resolved		None	T365978 [Infra][SW] Migrate Quibble from Python test suite splitting to PHP / composer implementation
Open		ArthurTaylor	T372618 Create a daily job running core + extension PHPUnit tests serially
Resolved		ArthurTaylor	T375482 Test failure: GrowthExperiments\Tests\Integration\SpecialQuitMentorshipTest::testPost with data set "has mentees"
Resolved		ArthurTaylor	T375483 Test failure: The log type pagetranslation has the same translation as pagetranslation for qqx / The log type messagebundle has the same translation as messagebundle for qqx
Duplicate		None	T375486 Test failure: MediaWiki\Skins\Vector\Tests\Structure\PerformanceBudgetTest::testTotalModulesSize
Open		None	T375551 Test failure: WikiLambda ContentHandlerFunctionalTest::testGetParserOutput
Open		None	T375558 Test failure: Wikibase\Repo\Tests\Hooks\Helpers\OutputPageEditabilityTest::testGivenPageIsEditable_returnsTrue
Resolved		hashar	T375756 PHPUnit test failure in MediaWiki\Extension\Translate\MessageLoading\MessageIndexTest::testMessageIndexImplementation: Error 1114: The table 'unittest_translate_messageindex' is full
Resolved		None	T375851 [Infra] Implement a stable ordering for tests in PHPUnit parallel test runs
Resolved		None	T375852 Create separate debug log streams for PHPUnit parallel runs
Resolved		jsn.sherman	T376412 Test failure on parallel run in MediaWiki\Extension\PageTriage\Test\Integration\PageTriageTest::testBulkSetTagsUpdated
Resolved		None	T377176 Re-enable parallel PHPUnit for 7.4 jobs
Resolved		ArthurTaylor	T377197 SpecialCentralAuthTest fails when run in a suite with AccountCreationDetailsLookupTest
Invalid		None	T377234 AutoLoaderStructureTest::testAutoloadOrder - autoload.php does not match output of generateLocalAutoload.php script
Duplicate	BUG REPORT	None	T377644 Some patches in the CLDR extension fail in CI for an unclear reason
Resolved		Umherirrender	T377668 Newsletter CI failure under parallel test run about missing newsletter main page
Resolved		None	T378478 Fill split_groups for parallel testing in sequence rather than round robin
Resolved		None	T378481 Make failed test output for parallel runs easier to read
Open		None	T378797 Use PHPUnit test results cache timing data to distribute tests in parallel runs
Open		None	T379712 Run PHPUnit tests for mediawiki/core in parallel
Open		hoo	T379764 phpunit splitter fails with obscure error message when a data provider is invalid
Open		None	T297348 PHPUnit: Allow test runner to use main wiki database
Stalled		None	T297349 PHPunit: Don't run DeferredUpdates by default
Open		None	T297561 Run linters before starting longer running jobs
Open		None	T298735 Run api-testing tests in parallel
Resolved		Krinkle	T199393 Selenium tests sometimes fail due to deadlock in User::addToDatabase from api.php?action=createaccount
Resolved		hashar	T342088 Run Quibble Mariadb server with innodb_print_all_deadlocks
Open		None	T284568 Speed up login
Open		None	T332865 PHPUnit data providers should be simple static functions that return plain data
Open		None	T337130 Make PHPUnit dataProvider LexemeFieldTest::getTestData static (WikibaseLexemeCirrusSearch extension)
Open		None	T337135 Make PHPUnit dataProvider on UstringLibraryTest and LuaEngineUnitTestBase static
Open		None	T337144 Make PHPUnit dataProvider static in AbuseFilter tests
Open		None	T337155 Make PHPUnit dataProvider static in ReadingLists tests
Open		None	T337156 Make PHPUnit dataProvider static in IPInfo tests
Resolved		tstarling	T337157 Make PHPUnit dataProvider static in LoginNotify tests
Open		None	T337158 Make PHPUnit dataProvider static in Flow tests
Resolved		Dreamy_Jazz	T337159 Make PHPUnit dataProvider static in CheckUser tests
Resolved		Dreamy_Jazz	T346044 Remove CheckUserUnionQueryBuilder
Resolved		Physikerwelt	T337160 Make PHPUnit dataProvider static in Math tests
Resolved		Tgr	T337161 Make PHPUnit dataProvider static in OAuth tests
Open		None	T337162 Make PHPUnit dataProvider static in ProofreadPage tests
Open		None	T337163 Make PHPUnit dataProvider static in CirrusSearch tests
Resolved		None	T337164 Make PHPUnit dataProvider static in FileImporter tests
Open		None	T337165 Make PHPUnit dataProvider static in GrowthExperiments tests
Open		None	T337166 Make PHPUnit dataProvider static in CampaignEvents tests
Open		None	T371467 [GENERAL] Make PHPUnit dataProvider static in Wikibase-related extensions
Open		None	T337154 Make PHPUnit dataProvider static in Wikibase tests
Open		AudreyPenven_WMDE	T380604 Make provideLuaData static
Open		Lucas_Werkmeister_WMDE	T380605 Make EntityMetaTagsCreatorTestCase's provideTestGetMetaTags dataProvider static
Resolved		Lucas_Werkmeister_WMDE	T337153 Make PHPUnit dataProvider static in WikibaseQualityConstraints tests
Open		None	T337151 Make PHPUnit dataProvider static in WikibaseLexeme tests
Open		None	T337152 Make PHPUnit dataProvider static in WikibaseMediaInfo tests
Resolved		thiemowmde	T337150 Make PHPUnit dataProvider static in Wikidata.org tests
Open		None	T337148 Make PHPUnit dataProvider static in WikibaseCirrusSearch tests
Resolved		Daimona	T342428 Avoid creating a page and user for every PHPUnit Database test
Resolved		Daimona	T155147 Do not initialise the database in tests when not needed

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Three years later: maybe we should target a more realistic number, like 15 minutes? If we can get to 15 minutes, we could then move on to 10 minutes, and then maybe end up back at this task's current goal of 5 minutes.

T287582: Move some Wikibase selenium tests to a standalone job is a potential low hanging fruit. The Wikibase Selenium tests takes roughly 5minutes30 and only depends on MinervaNeue, MobileFrontend and UniversalLanguageSelector (and MediaWiki core obviously). If we moved those to a standalone job only triggering for those few repositories, that will save 5minutes30 for all other repositories. We would have to drop the npm selenium-test entry point from Wikibase to prevent it from being discovered when Wikibase is a dependency.

Another potentially large saver would be to mark slow tests with @group Standalone which would only trigger when a patchset target that repo and thus prevent them from running when the patchset is for another repository.

@hashar Afaik for PHPUnit we only install (and thus test) dependency extensions, based on the dependency map in CI config. Is this not the case for the Selenium job? If there's a default list applied also, perhaps we can opt-out from that for the Selenium job.

Change 853292 merged by jenkins-bot:

[mediawiki/extensions/MobileFrontend@master] Remove selenium entries from package.json

https://gerrit.wikimedia.org/r/853292

ReleaseTaggerBot added a project: MW-1.40-notes (1.40.0-wmf.10; 2022-11-14).Nov 4 2022, 6:00 PM

Change 859146 had a related patch set uploaded (by Krinkle; author: Tim Starling):

[mediawiki/core@master] Reduce time cost of password tests

https://gerrit.wikimedia.org/r/859146

Change 859146 merged by jenkins-bot:

[mediawiki/core@master] password: Reduce time cost of password unit tests

https://gerrit.wikimedia.org/r/859146

ReleaseTaggerBot edited projects, added MW-1.40-notes (1.40.0-wmf.12; 2022-11-28); removed MW-1.40-notes (1.40.0-wmf.10; 2022-11-14).Nov 23 2022, 1:01 AM

In T225730#8370890, @Krinkle wrote:

@hashar Afaik for PHPUnit we only install (and thus test) dependency extensions, based on the dependency map in CI config. Is this not the case for the Selenium job? If there's a default list applied also, perhaps we can opt-out from that for the Selenium job.

The PHPUnit and Selenium kind jobs are alike: they both use Quibble as test runner and both have extensions/skins dependencies injected from CI. The only difference is the kind of tests being run which are filtered via quibble parameters, respectively: --skip=selenium and --run=selenium.

The registration or discovery of tests varies though. A detailed description (you probably know most of that already but that can be helpful for other readers):

PHPUnit

We run the MediaWiki core extensions suite (defined at tests/phpunit/suites/ExtensionsTestSuite.php) which:

discover tests if they are put in a /tests/phpunit directory relatively to each extension/skin.
add tests registered via the UnitTestsList hook

Those tests are depending on the centrally managed configuration from mediawiki/core eg the phpunit version.

Selenium

There is no registration mechanism, only a discovery phase. Each extension/skin test suite is standalone and can use different versions of Webdriver.io. The convention is developers define how to run the browser tests by adding a selenium-test npm script. In those jobs quibble --run=selenium has all extensions/skins cloned in and checked out, Quibble then crawls through each repo looking for a package.json and if has a selenium-test it will run those tests invoking npm run selenium. They are run serially.

The alternative would be to only run the Selenium test for the triggering repository. Something such as: quibble --command 'cd $THING_NAME && npm run-script selenium-test. But we will loose the integration testing between repository :-(

For PHPUnit test we have moved some tests to @group Standalone. The use case was to avoid running the Scribunto integration tests for any extension depending on it, given those tests are only going to be affected by a change to Scribunto there was no point in running them when a patch triggers Wikibase for example. It is another build stage defined in Quibble: `quibble --run phpunit-standalone.

I guess we could do something similar for Selenium and split tests between:

integration testing (to be run by any repo)
standalone tests (to be run solely when a patch triggers for this repo)
add a selenium-standalone npm script convention which in the repo would invoke something such as wdio --spec tests/selenium/specs/standalone/**/*.js

In T225730#8423575, @hashar wrote:

In T225730#8370314, @hashar wrote:

The Wikibase Selenium tests […] only depends on MinervaNeue, MobileFrontend and UniversalLanguageSelector […]
! In T225730#8370890, @Krinkle wrote:

@hashar Afaik for PHPUnit we only install (and thus test) dependency extensions, based on the dependency map in CI config. Is this not the case for the Selenium job? […]

The PHPUnit and Selenium kind jobs are alike: they both use Quibble as test runner and both have extensions/skins dependencies injected from CI. […] The registration or discovery of tests varies though. […] Selenium: There is no registration mechanism, only a discovery phase. […]

My reason for asking is that I thought you meant that Wikibase CI is slow because it is running selenium tests for extensions that it does not depend on. The lack of a test registration system for Selenium should not be an issue since we can only discover what we install in CI, and CI only installs what Wikibase depends on according to the repo dependency map.

For PHPUnit test we have moved some tests to @group Standalone. […] I guess we could do something similar for Selenium and split tests, [and] add a selenium-standalone npm script convention.

Another option might be to filter out @standalone tests via wdio --mochaOpts.grep --invert. This is even more similar to PHPUnit and is what we already use for running the @daily selenium tests. This has the benefit of keeping a simple and consistent way to run selenium tests on any given repo as a contributor without needing to know about this detail. It is then to Quibble to invoke a variant npm run selenium-nostandalone on non-current repos to skip those. This instead of e.g. having the main selenium entrypoint no longe run all the tests, which might lead to confusion.

kostajh added a subtask: T284568: Speed up login.Dec 9 2022, 3:50 PM

zeljkofilipin changed the status of subtask T284568: Speed up login from In Progress to Open.Dec 15 2022, 11:52 AM

Krinkle mentioned this in T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel.Feb 1 2023, 5:24 PM

Krinkle added a subtask: T332865: PHPUnit data providers should be simple static functions that return plain data.Mar 24 2023, 2:52 AM

Krinkle mentioned this in T330508: Expand running of ForeignResourceStructureTest against skins and extensions that have foreign-resources.yaml files.Apr 27 2023, 4:40 PM

Change 913562 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] [DNM] Measure impact of GeneralizedSql on tests run time

https://gerrit.wikimedia.org/r/913562

Change 913562 abandoned by Ladsgroup:

[mediawiki/core@master] [DNM] Measure impact of GeneralizedSql on tests run time

Reason:

https://gerrit.wikimedia.org/r/913562

Change 702909 abandoned by Hashar:

[integration/quibble@master] npm: Use cache for npm ci and prefer offline

Reason:

https://gerrit.wikimedia.org/r/702909

TheresNoTime subscribed.Jun 6 2023, 2:38 PM

Change 932455 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] api tests: Call editPage() with WikiPage when used for same page

https://gerrit.wikimedia.org/r/932455

Change 932455 merged by jenkins-bot:

[mediawiki/core@master] api tests: Call editPage() with WikiPage when used for same page

https://gerrit.wikimedia.org/r/932455

ReleaseTaggerBot added a project: MW-1.41-notes (1.41.0-wmf.15; 2023-06-27).Jun 25 2023, 11:00 AM

Change 936317 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] storage tests: Call editPage() with WikiPage when used for same page

https://gerrit.wikimedia.org/r/936317

Change 936230 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/core@master] Fix TestLocalisationCache being way to small

https://gerrit.wikimedia.org/r/936230

Change 936317 merged by jenkins-bot:

[mediawiki/core@master] storage tests: Call editPage() with WikiPage when used for same page

https://gerrit.wikimedia.org/r/936317

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.17; 2023-07-11); removed MW-1.41-notes (1.41.0-wmf.15; 2023-06-27).Jul 7 2023, 9:00 PM

Change 936230 merged by jenkins-bot:

[mediawiki/core@master] Fix TestLocalisationCache being way to small

https://gerrit.wikimedia.org/r/936230

thiemowmde subscribed.Jul 9 2023, 6:43 PM

Change 938346 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] tests: Pass Title to editPage() when already parsed

https://gerrit.wikimedia.org/r/938346

Daimona added a subtask: T155147: Do not initialise the database in tests when not needed.Jul 15 2023, 4:54 PM

Change 938346 merged by jenkins-bot:

[mediawiki/core@master] tests: Pass Title to editPage() when already parsed

https://gerrit.wikimedia.org/r/938346

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.18; 2023-07-18); removed MW-1.41-notes (1.41.0-wmf.17; 2023-07-11).Jul 16 2023, 10:00 AM

Change 939296 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/libs/less.php@master] Re-arrange execution order in LESS parser for performance

https://gerrit.wikimedia.org/r/939296

Daimona updated the task description. (Show Details)Jul 18 2023, 1:00 PM

Change 939684 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Much faster tests by setting content language to qqx

https://gerrit.wikimedia.org/r/939684

Change 939690 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Translate@master] [POC] Set content language to qqx in slow tests

https://gerrit.wikimedia.org/r/939690

Change 939296 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/libs/less.php@master] Re-arrange execution order in LESS parser for performance

Reason:

https://gerrit.wikimedia.org/r/939296

Change 939727 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/libs/less.php@master] Inline and optimize heavily called Parser::MatchQuoted

https://gerrit.wikimedia.org/r/939727

Change 939690 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/extensions/Translate@master] [POC] Set content language to qqx in slow tests

Reason:

No, that's not it.

https://gerrit.wikimedia.org/r/939690

Change 939717 had a related patch set uploaded (by Krinkle; author: Thiemo Kreuz (WMDE)):

[mediawiki/libs/less.php@master] Less_Parser: Faster MatchQuoted() by using native strcspn()

https://gerrit.wikimedia.org/r/939717

Change 939717 merged by jenkins-bot:

[mediawiki/libs/less.php@master] Less_Parser: Faster MatchQuoted() by using native strcspn()

https://gerrit.wikimedia.org/r/939717

Change 939684 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Much faster tests by setting content language to qqx

https://gerrit.wikimedia.org/r/939684

In T225730#9030571, @gerritbot wrote:

Change 939684 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Much faster tests by setting content language to qqx

https://gerrit.wikimedia.org/r/939684

From my review on that change:

Think the root cause is the localization cache is set to use a static array from includes/DevelopmentSettings.php:

// Localisation Cache to StaticArray (T218207)
$wgLocalisationCacheConf['store'] = 'array';

For PHPUnit tests, it is probably fine since there is a single process. But for Selenium tests, that means each web request has to regenerate the localization cache which would explain the slowdown?

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.19; 2023-07-25); removed MW-1.41-notes (1.41.0-wmf.18; 2023-07-18).Jul 20 2023, 10:00 AM

for Selenium tests, that means each web request has to regenerate the localization cache which would explain the slowdown?

When I look at the strange, sometimes extreme runtimes of ResourcesTest and SpecialPageFatalTest I suspect something similar happens in other places. I mean, sure, executing special pages, parsing .less files, or initializing LocalisationCaches is sometimes just very complicated and expensive. But even the most expensive of these examples consumes only a few 100 milliseconds. That's not a problem.

But it is a problem if it's done thousands of times for tests that don't even do anything with the results.

I'm digging into this for a few days now but can't find that single bottleneck anywhere. I suspect the majority of the up to 20 minutes this ticket talks about is just because we are naively running the same processes with the same inputs and the same outputs over and over again, millions and millions of times.

In T225730#9030945, @thiemowmde wrote:

for Selenium tests, that means each web request has to regenerate the localization cache which would explain the slowdown?

When I look at the strange, sometimes extreme runtimes of ResourcesTest and SpecialPageFatalTest I suspect something similar happens in other places. I mean, sure, executing special pages, parsing .less files, or initializing LocalisationCaches is sometimes just very complicated and expensive. But even the most expensive of these examples consumes only a few 100 milliseconds. That's not a problem.

T50217#9030084 is related. I noticed that paratest was hanging, and spending most of the time (12 of 18 seconds) on a single test, LanguageConverterFactoryTest. I looked at that test yesterday, it basically loads the localisation cache for all languages, one by one. And while loading a single language isn't terribly expensive (the slowest I saw was around 80ms), it adds up when done hundreds of times. I couldn't find a solution to that, though, and just stopped trying after a while.

Change 940120 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/core@master] Make MapCacheLRU in LanguageFactory static

https://gerrit.wikimedia.org/r/940120

Change 940233 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/core@master] TestLocalisationCache: add static cache of JSON files

https://gerrit.wikimedia.org/r/940233

In order to help understand what's going on under the hood and make more informed decisions on what to improve and what don't. I run tests with excimer and built flamegraphs out of them which you can see for three types of tests (phpunit, selenium, api_tests) here: https://people.wikimedia.org/~ladsgroup/tests_flamegraphs/

The flamegraphs can be searched, zoomed in, etc. (More info: https://www.brendangregg.com/flamegraphs.html)

That provides way better insights and are a treasure trope of areas to improve. I haven't fully dug into all of them but for example: 3% of phpunit tests is being spent on resetting database between each test (MediaWikiIntegrationTestCase::resetDB) which makes some sense but almost all of it is being spent in creating pages and edits(!), why? because it internally calls MediaWikiIntegrationTestCase::addCoreDBData which in turns calls WikiPage::doUserEditContent and makes a page and edits in every integration test(!!!!!)

Or for selenium tests: The biggest bottleneck is minification and RL modules access, I guess it's not properly cached. It can easily brush off a minute from tests.

Feel free to dig into them!

(How it was made? this patch enables logging of excimer and tests create two files, one for wall time and one for cpu time with sampling rate of every .1 second. Then downloaded those logs and run this on the logs: cat ../selenium/cpu-profile.log* | ./flamegraph.pl > ../selenium/cpu-profile.svg; cat ../selenium/wall-profile.log* | ./flamegraph.pl > ../selenium/wall-profile.svg; cat ../selenium/cpu-profile.log* | ./flamegraph.pl --reverse --colors blue > ../selenium/cpu-profile.reverse.svg; cat ../selenium/wall-profile.log* | ./flamegraph.pl --reverse --colors blue > ../selenium/wall-profile.reverse.svg)

You can also take the setting of the patch and enable it locally and run them.

In T225730#9032430, @Ladsgroup wrote:

3% of phpunit tests is being spent on resetting database between each test (MediaWikiIntegrationTestCase::resetDB) which makes some sense but almost all of it is being spent in creating pages and edits(!), why? because it internally calls MediaWikiIntegrationTestCase::addCoreDBData which in turns calls WikiPage::doUserEditContent and makes a page and edits in every integration test(!!!!!)

That's something I'd like to get rid of. But first of all, we should avoid cloning and resetting the database for tests not in the 'Database' group (T155147). Once that's done, I'll look into not creating a page and a user for every test.

Zooming out a little bit, PHPUnit (and possibly selenium too, but I haven't tested it) would benefit a lot from parallelization with paratest, see T50217#9022060 and the following comments. I'm a bit skeptical about seeing a x6 speedup in CI as I observed locally; but still, even if it's just a 2x, it's much more than we can do by just improving the code. I'm also planning to look into paratest support once T155147 and T90875 are done.

The one single test for which we should really do something is LanguageConverterFactoryTest. The test instantiates something like 400 language objects, which is really slow. The whole test class takes roughly 13 seconds to run. Running the whole "includes" suite with paratest takes around 19 seconds, and because LanguageConverterFactoryTest is a single test handled by a single process, its 13 seconds mean a lot for the overall runtime. I was looking for ways to speed it up, but I doubt we'll be able to do much. Another option would be to split it into multiple test classes, each one running the same test for a subset of the languages. But that's also something I will look into more closely once we're in a better position to evaluate paratest.

As requested by @Krinkle I increased the sampling rate from every 100m to 1ms. The result are here https://people.wikimedia.org/~ladsgroup/tests_flamegraphs/high_res/

In T225730#9032655, @Daimona wrote:

The one single test for which we should really do something is LanguageConverterFactoryTest. The test instantiates something like 400 language objects, which is really slow. The whole test class takes roughly 13 seconds to run. Running the whole "includes" suite with paratest takes around 19 seconds, and because LanguageConverterFactoryTest is a single test handled by a single process, its 13 seconds mean a lot for the overall runtime. I was looking for ways to speed it up, but I doubt we'll be able to do much. Another option would be to split it into multiple test classes, each one running the same test for a subset of the languages. But that's also something I will look into more closely once we're in a better position to evaluate paratest.

13 seconds is not much compared to at 1-2 minutes of every selenium test being spent on cache misses of RL modules: https://people.wikimedia.org/~ladsgroup/tests_flamegraphs/high_res/wall-profile.selenium.svg

In T225730#9032430, @Ladsgroup wrote:

Or for selenium tests: The biggest bottleneck is minification and RL modules access, I guess it's not properly cached. It can easily brush off a minute from tests.
[…]

When I download the .log artefacts and load them up into Speedscope, I can confirm that APCUBagOStuff is selected (so php-apcu cache is enabled and discovered by MW), and we see both doGet calls without minification, and minification calls with doSet. Suggesting that functionally it is using the cache correctly.

It's had to tell from the current snapshot whether it is slow or whether it is missing the cache more than once for the same module. The settings here are more or less stock MediaWiki + DevelopmentSettings, similar to mediawiki-docker where this is generally working well.

My first hunch here is that the CI worker is under memory pressure and forced to evict fresh values from the cache before they can be used. For example, the default apc.shm_size in PHP is 32M (https://www.php.net/manual/en/apcu.configuration.php) which is quite small but may just about suffice for human use in MediaWiki-Docker, but in CI we're likely over a threshold where we engage with a lot of different areas. Likewise, opcache (https://www.php.net/manual/en/opcache.configuration.php) has a default of opcache.memory_consumption=128 (MB) and opcache.interned_strings_buffer=8 (MB).

In production (Codesearch) we use:

# appserver
profile::mediawiki::apc_shm_size: 6096M
opcache.interned_strings_buffer: 96
opcache.memory_consumption: 1024

# jobrunner
profile::mediawiki::apc_shm_size: 4096M
opcache.interned_strings_buffer: 96
opcache.memory_consumption: 1024

# default (unused?)
profile::mediawiki::apc_shm_size: 128M
opcache.interned_strings_buffer: 50
opcache.memory_consumption: 300

We don't need that much in CI given that we only interact with 1 wiki (not 1000), and in ~1 language (not 300). It currently has the following set in Quibble (source):

opcache.memory_consumption=256

I presume the others inherit upstream defaults of apc.shm_size=32M and opcache.interned_strings_buffer=8, which would be very small indeed.

I've uploaded the raw log files from @Ladsgroup's high-res exicmer capture (change 940226) to people.wikimedia.org, which has CORS enabled. This means we can load them into Speedscope (aka Excimer UI).

https://people.wikimedia.org/~krinkle/T225730_5min_jenkins-excimer-20230720/

For example (this is a 35MB large file): https://performance.wikimedia.org/excimer/speedscope/#profileURL=https://people.wikimedia.org/~krinkle/T225730_5min_jenkins-excimer-20230720/phpunit_mwquibble_cpuprofile.log.

Screenshot 2023-07-21 at 03.02.01.png (1×2 px, 403 KB)

The sample count reflects time passing in milliseconds. It says the phpunit invocation for mediawiki-quibble-vendor yielded 374,000 samples, or 374 seconds, or 6 minutes, which is right.

Using the "Sandwhich" mode, which is like a more powerful version of classic "reverse" flamegraphs, we can quickly find (for example) the LocalisationCache as having one of the highest "self" times. Clicking it, then reveals which tests it is largely spent by, with most (57%) coming from LanguageConverterFactoryTest.

Screenshot 2023-07-21 at 03.06.49.png (973×2 px, 404 KB)

In T225730#9032655, @Daimona wrote:

The one single test for which we should really do something is LanguageConverterFactoryTest. The test instantiates something like 400 language objects, which is really slow. The whole test class takes roughly 13 seconds to run.

I incidentally happened to look into language object instantiation yesterday for another reason: POC: Create Language objects without initializing l10n cache – I’ll probably look into this a bit more. (Edit: now tracked at T342418.)

Daimona added a subtask: T342428: Avoid creating a page and user for every PHPUnit Database test.Jul 21 2023, 12:14 PM

Michael subscribed.Jul 21 2023, 4:02 PM

Change 940502 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/core@master] Improve performance of LocalisationCache::mergeMagicWords()

https://gerrit.wikimedia.org/r/940502

Change 939727 merged by jenkins-bot:

[mediawiki/libs/less.php@master] Less_Parser: Inline and optimize heavily called MatchQuoted()

https://gerrit.wikimedia.org/r/939727

Change 940502 merged by jenkins-bot:

[mediawiki/core@master] Improve performance of LocalisationCache::mergeMagicWords()

https://gerrit.wikimedia.org/r/940502

Change 940929 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/core@master] Make "pluralRules" caches static in LocalisationCache

https://gerrit.wikimedia.org/r/940929

Change 940929 merged by jenkins-bot:

[mediawiki/core@master] Make "pluralRules" caches static in LocalisationCache

https://gerrit.wikimedia.org/r/940929

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.20; 2023-08-01); removed MW-1.41-notes (1.41.0-wmf.19; 2023-07-25).Jul 25 2023, 8:00 PM

If we can just fix the RL's caching in the apache server of selenium. We can easily brush off 1-5 minutes (!) from each selenium run. See my attempt in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/944984

Novem_Linguae subscribed.Aug 8 2023, 6:18 AM

Krinkle moved this task from Watching to Perf recommendation on the Performance-Team (Radar) board.Aug 18 2023, 8:02 PM

Krinkle edited projects, added Wikimedia-Performance-recommendation; removed Performance-Team (Radar).Aug 18 2023, 8:42 PM

Daimona closed subtask T155147: Do not initialise the database in tests when not needed as Resolved.Aug 31 2023, 3:19 PM

Jdforrester-WMF closed subtask T342428: Avoid creating a page and user for every PHPUnit Database test as Resolved.Sep 7 2023, 1:30 PM

kostajh mentioned this in T351357: In CI, only run tests for code that is affected by a given change.Nov 16 2023, 12:08 PM

Change 698467 abandoned by Kosta Harlan:

[mediawiki/extensions/Echo@master] selenium: Use a single login for browser tests

Reason:

https://gerrit.wikimedia.org/r/698467

Change 698468 abandoned by Kosta Harlan:

[mediawiki/extensions/Echo@master] selenium: Move notifications page test into general suite

Reason:

https://gerrit.wikimedia.org/r/698468

Change 748314 abandoned by Hashar:

[integration/quibble@master] [DNM] Add excimer config

Reason:

https://gerrit.wikimedia.org/r/748314

Krinkle changed the status of subtask T297349: PHPunit: Don't run DeferredUpdates by default from Open to Stalled.Dec 14 2023, 10:56 PM

Change 774409 abandoned by Kosta Harlan:

[mediawiki/core@master] DevelopmentSettings: Use MWLoggerDefaultSpi for debug logging

Reason:

https://gerrit.wikimedia.org/r/774409

wmf-quibble-core-vendor-mysql-php74-docker/37543/console today shows 18m 47s, which is a bit higher than the upper end of the spectrum noted in May 2021 (13-18m).

In T225730#8370008, @kostajh wrote:

Three years later: maybe we should target a more realistic number, like 15 minutes? If we can get to 15 minutes, we could then move on to 10 minutes, and then maybe end up back at this task's current goal of 5 minutes.

I still think this (more realistic scope) might be a useful way to orient our work, though some individual or team would have to own this goal for it to be realistic, and to hold everyone accountable to it.

I'm still convinced that T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel can give us a lot. Looking at this run of wmf-quibble-core-vendor-mysql-php74-docker, the PHPUnit integration non-database suite took 2m33s, and the database suite 13m41s. In total, that's roughly 16 minutes spent running PHPUnit. Assuming a 5x speedup, consistent with the results of my tests in T50217, this total time would go down to ~3.5 minutes, thus saving ~13 minutes.

Tgr updated the task description. (Show Details)Feb 12 2024, 3:01 PM

In T225730#9533714, @Daimona wrote:

I'm still convinced that T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel can give us a lot. Looking at this run of wmf-quibble-core-vendor-mysql-php74-docker, the PHPUnit integration non-database suite took 2m33s, and the database suite 13m41s. In total, that's roughly 16 minutes spent running PHPUnit. Assuming a 5x speedup, consistent with the results of my tests in T50217, this total time would go down to ~3.5 minutes, thus saving ~13 minutes.

The obsolete, by 5 years, CI infrastructure will not be able to handle that. Last time I checked PHPUnit took a full CPU and MySQL took the other half and running them in parallel would exceed the capacity of the executing VM which runs multiple jobs in parallel competing for the same resource.

There are a few things that can potentially be done though such as:

moving Wikibase tests to their own jobs instead of running them from any repository participating in the wmf-quibble jobs T287582
related find a way to avoid running every single tests
do some profiling to find the low hanging fruit. I used to do that regularly and found some oddities such as a MediaWiki a service uselessly triggering, using a large PHPUnit dataprovider which causes each cases to be hit by PHPUnit test overhead
mocking the slow database wherever possible

do some profiling to find the low hanging fruit. I used to do that regularly and found some oddities […]

I also do this regularly, and will happily continue to do so. However. The sad reality is that it doesn't make much of a difference. The slowest individual PHPUnit tests we currently have take about 2 seconds. That's about 0.2% of the total runtime. Even if we could identify and eliminate 200 of such slow tests, the remaining 12,000 tests would still take 10 minutes.

I think the only approach that makes an actual difference is to find more ways to not run all tests all the time. For example, what about a more fine-grained tree of dependencies between codebases that tells us more than the current list of "gated extensions"? For example, while it's critical to run the FileImporter tests whenever something in core's import system is changed, it doesn't make sense to run the Wikibase tests when I make changes in FileImporter. There are a lot of weird combinations like this.

That said, one interesting bottleneck I'm still exploring is the LanguageFactory, see https://gerrit.wikimedia.org/r/940120.

In T225730#9534306, @hashar wrote:

In T225730#9533714, @Daimona wrote:

I'm still convinced that T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel can give us a lot. Looking at this run of wmf-quibble-core-vendor-mysql-php74-docker, the PHPUnit integration non-database suite took 2m33s, and the database suite 13m41s. In total, that's roughly 16 minutes spent running PHPUnit. Assuming a 5x speedup, consistent with the results of my tests in T50217, this total time would go down to ~3.5 minutes, thus saving ~13 minutes.

The obsolete, by 5 years, CI infrastructure will not be able to handle that. Last time I checked PHPUnit took a full CPU and MySQL took the other half and running them in parallel would exceed the capacity of the executing VM which runs multiple jobs in parallel competing for the same resource.

Are there plans to improve the infrastructure or add more resources? As Thiemo said, improving individual tests is, for the most part, not really useful, because there isn't a single source of slowness. Running fewer tests is also an option, but I'm not sure if doing that cleverly is that easy: MW has a complex ecosystem, so it's not just a matter of flipping a switch.

IMHO, the lowest-hanging fruit is parallelization. If we find a way to make that work (both in MW and the CI infrastructure), we should see a huge reduction in CI times, possibly more than can be achieved by running fewer test, and at a lower maintenance cost. Once that's out of the way, perhaps start considering a more fine-grained testing infrastructure to run fewer tests.

At any rate, it's clear that resources are needed to reach this goal. There's no magic trick that will cut the CI runtime by just snapping one's fingers. And I agree with Kosta that this can only happen if a person or team chooses to own this goal, else it'll be really hard to find the necessary resources.

Slow CI can, at times, have a massive effect on productivity. And as "fun" as it might sound, having your patch wait 40 minutes after the +2, only for it to fail a random selenium test and having to go through gate-and-submit again really makes me wanna throw my laptop out of the window sometimes.

In T225730#9535536, @Daimona wrote:

Slow CI can, at times, have a massive effect on productivity. And as "fun" as it might sound, having your patch wait 40 minutes after the +2, only for it to fail a random selenium test and having to go through gate-and-submit again really makes me wanna throw my laptop out of the window sometimes.

Somewhat orthogonal to this task, but wanted to note that T323750: Provide early feedback when a patch has job failures can help you know sooner when there's a failure. It has to be enabled per repo, though. (Patch for MW core if anyone wants to look.) It's a band-aid fix, yes, but you at least know of a failure earlier and can trigger a rebase to restart the tests.

I think there are two common use-cases where slow CI is annoying:

When submitting a change for code review and not learning quickly that it needs further improvements. As Kosta says, the early feedback bot could help with this.
When backporting changes, and having to do a merge (or several merges) in a 60-minute window. IMO we should just make most CI on wmf/ branches non-voting, they provide little value (even when they fail, it rarely represents a real error). T307180: Drop Selenium tests from gate-and-submit-wmf

Agreed that early feedback can help. One more thing to consider though: the bot helps when your patch is the only thing in CI, or at least if there aren't many patches being tested. But CI can get really full sometimes (e.g. if LibUp is running). When that happens, you also need to wait for all the previous jobs to complete before yours even starts, and this can only be improved by making everything faster.

ArthurTaylor mentioned this in T361118: [EPIC][Infra] Reduce CI test runtime for Wikibase and related extensions.Mar 27 2024, 3:26 PM

hashar closed subtask T248531: Abort a Zuul pipeline when one job completed with failures (change zuul scheduler's failure check from areAllJobsComplete to didAnyJobFail) as Declined.Mar 29 2024, 3:06 PM

pmiazga subscribed.Apr 15 2024, 11:00 AM

Change #1036311 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] DNM: Measuring impact of some improvements

https://gerrit.wikimedia.org/r/1036311

Change #1064477 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] qunit: Replace slow mw.messages reset with empty object reset

https://gerrit.wikimedia.org/r/1064477

Change #1064477 merged by jenkins-bot:

[mediawiki/core@master] qunit: Replace slow mw.messages reset with empty object reset

https://gerrit.wikimedia.org/r/1064477

Change #1070884 had a related patch set uploaded (by Krinkle; author: Thiemo Kreuz (WMDE)):

[mediawiki/core@master] phpunit: Set much smaller defaults for RandomImageGenerator

https://gerrit.wikimedia.org/r/1070884

Change #1070884 merged by jenkins-bot:

[mediawiki/core@master] phpunit: Set much smaller defaults for RandomImageGenerator

https://gerrit.wikimedia.org/r/1070884

ReleaseTaggerBot added a project: MW-1.43-notes (1.43.0-wmf.22; 2024-09-10).Sep 5 2024, 9:00 PM

Change #940120 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/core@master] Make MapCacheLRU in LanguageFactory static

https://gerrit.wikimedia.org/r/940120

	F37145008: Screenshot 2023-07-21 at 03.06.49.png
	Jul 21 2023, 2:08 AM

	F37145004: Screenshot 2023-07-21 at 03.02.01.png
	Jul 21 2023, 2:08 AM

	F29513257: Screenshot 2019-06-13 at 15.57.50.png
	Jun 13 2019, 3:28 PM

	F29513308: Screenshot 2019-06-13 at 15.59.18.png
	Jun 13 2019, 3:28 PM

	F30679987: wmf-quibble-core
	Oct 14 2019, 6:51 PM

Reduce runtime of MW shared gate Jenkins jobs to 5 minOpen, MediumPublicActions

Description

Objective

Status quo

Updated status quo

Scope of task

Stuff done

Ideas to explore and related work

Details

Revisions and Commits

Related ObjectsSearch...

Event Timeline

Reduce runtime of MW shared gate Jenkins jobs to 5 min
Open, MediumPublic
Actions

Related Objects
Search...