Page MenuHomePhabricator

[Epic] Move geoshape expansion to Kartographer parse-time
Open, Needs TriagePublic

Description

We want to have more efficient map views for readers by immediately querying the external data (geopoints, geoshapes, etc.) into the extension ParserCache data when editing is done.

Requirements

  • Implement in Kartographer, behind a feature flag in parallel with the existing behavior.
    • POC is done in SimpleStyleParser.php but this normalization function should be moved out of this class anyway.
    • Should be performed before simplestyle transforms, for the Commons "Data" .map path.
    • Set "expensive" parser flag. --> T324973: Set expensive parser flag for maps expansion during parse time
    • Inlined geoshape will go directly into the parser cache along with other geometry, probably no changes needed there.
    • Put it behind a feature flag, old code should still be run by default.
    • Failed requests can fall back to the old behavior gracefully after logging a warning.
  • MediaWiki must be able to route requests to kartotherian directly.
    • Development environment requires some tweaks to docker-compose.yml to allow local MediaWiki to contact local kartotherian.
    • Production might not allow this until we set it up explicitly.
  • Must work for ids and SPARQL query
  • Each service is handled slightly differently by the external data parser, see mapdata for the reference implementations that we must match, https://gerrit.wikimedia.org/g/mapdata/+/refs/changes/66/858566/26/src/ExternalDataParser.js#40 . The distinct handling for each group:
    • "page" or maps from commons are passed through directly, by a non-recursive "extend" or "array_merge" at the GeoJSON object level. --> https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Kartographer/+/867280
    • geoshape, geoline, and geopoint have the same top-level merge, and also any default "properties" along with the original external mapdata link are merged under the properties of each Feature in the expanded "features" of a FeatureCollection, if any.
    • geomask calculates the reverse of a polygon.
      • TODO: It looks like default simplestyle properties are not copied into features for geomask. Document this on the help page and as a follow-up task in Phabricator. T325286
  • Don't damage karotherian code for expanding geoshapes, this is needed for our migration fallback strategy. In the long term, this will happily go away.
  • Must work for mapdata which is a list of GeoJSON objects
    • ExternalDataLoader's "parse" entrypoint iterates over its array argument and calls "extend" for each singular geojson. I'm not sure where a single geojson is coerced to be an array.
  • Monitor kartotherian and geoshapes services during deployment.

Open questions

  • What should we do when the expensive flag signals too expensive?
  • Should we fall back to the old behaviour if it is faulty?

For review

Enable feature flag KartographerExternalDataParseTimeFetch in extension.json.
Change kartotherian server url in local settings:

$wgKartographerMapServer = "https://maps.wikimedia.org";

Example that has default properties and properties in external data:

<mapframe text="Geopoints via sparql" width="300" height="300" zoom="13" align="left" latitude="43.735011" longitude="7.413197">
{
  "type": "ExternalData",
  "service": "geopoint",
  "query": "SELECT DISTINCT ?id ?geo (IF(?type = wd:Q33506, 'museum', '') AS ?marker_symbol) WHERE { ?id wdt:P17 wd:Q235; wdt:P31 ?type; wdt:P625 ?geo.} LIMIT 50",
  "properties": {
  "marker-size": "large",
  "marker-symbol": "circle",
  "marker-color": "#FF5733",
 }
}
</mapframe>

Details

SubjectRepoBranchLines +/-
mapdatamaster+19 -14
mediawiki/extensions/Kartographermaster+15 -15
mediawiki/extensions/Kartographermaster+35 -1
mediawiki/extensions/Kartographermaster+6 -5
operations/mediawiki-configmaster+3 -5
mediawiki/extensions/Kartographermaster+19 -12
mediawiki/extensions/Kartographermaster+31 -49
mediawiki/extensions/Kartographermaster+20 -2
mediawiki/extensions/Kartographermaster+0 -4
operations/mediawiki-configmaster+3 -0
mediawiki/extensions/Kartographermaster+10 -11
mediawiki/extensions/Kartographermaster+217 -4
mediawiki/extensions/Kartographermaster+18 -1
mediawiki/extensions/Kartographermaster+90 -12
mediawiki/extensions/Kartographermaster+196 -7
mediawiki/extensions/Kartographermaster+139 -0
mediawiki/extensions/Kartographermaster+13 -6
mediawiki/extensions/Kartographermaster+68 -0
Show related patches Customize query in gerrit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 855984 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Move geoshape expansion to Kartographer parse-time

https://gerrit.wikimedia.org/r/855984

Change 862931 had a related patch set uploaded (by Svantje Lilienthal; author: Svantje Lilienthal):

[mediawiki/extensions/Kartographer@master] Prepare for testing

https://gerrit.wikimedia.org/r/862931

Change 862940 had a related patch set uploaded (by Svantje Lilienthal; author: Svantje Lilienthal):

[mediawiki/extensions/Kartographer@master] Fixing property handling

https://gerrit.wikimedia.org/r/862940

Change 862952 had a related patch set uploaded (by Svantje Lilienthal; author: Svantje Lilienthal):

[mediawiki/extensions/Kartographer@master] Added test for external data loader

https://gerrit.wikimedia.org/r/862952

Change 862931 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Prepare for testing

https://gerrit.wikimedia.org/r/862931

Change 863007 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Introduce tiny ParserFunctionTracker service

https://gerrit.wikimedia.org/r/863007

Change 862952 abandoned by Svantje Lilienthal:

[mediawiki/extensions/Kartographer@master] Added test for external data loader

Reason:

Merged in 862940

https://gerrit.wikimedia.org/r/862952

awight updated the task description. (Show Details)
awight updated the task description. (Show Details)

Change 862940 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Fixing property handling and tests

https://gerrit.wikimedia.org/r/862940

Change 863007 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Introduce tiny ParserFunctionTracker service

https://gerrit.wikimedia.org/r/863007

Change 865045 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Add basic integration test for ParserFunctionTracker

https://gerrit.wikimedia.org/r/865045

Change 867280 had a related patch set uploaded (by Svantje Lilienthal; author: Svantje Lilienthal):

[mediawiki/extensions/Kartographer@master] Do not parse external data from WikiCommons

https://gerrit.wikimedia.org/r/867280

Change 867679 had a related patch set uploaded (by Svantje Lilienthal; author: Svantje Lilienthal):

[mediawiki/extensions/Kartographer@master] Handle GeoMasks

https://gerrit.wikimedia.org/r/867679

I'm a little confused by the checked-off line,

"page" or maps from commons are passed through directly, by a non-recursive "extend" or "array_merge" at the GeoJSON object level. --> https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Kartographer/+/867280

The patch linked seems to skip "page" data entirely, rather than fetching and passing through. I remember we had planned to do this step incrementally with the first phase being no expansion for "page" data, but in that case we should change the task description to say "skip 'page' for now", and create a follow-up task to implement it later.

Change 867280 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Do not parse external data from WikiCommons

https://gerrit.wikimedia.org/r/867280

Change 867679 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Handle GeoMasks

https://gerrit.wikimedia.org/r/867679

Change 865045 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Add basic integration test for ParserFunctionTracker

https://gerrit.wikimedia.org/r/865045

Change 869178 had a related patch set uploaded (by Awight; author: Awight):

[operations/mediawiki-config@master] [beta] Expand mapframe ExternalData

https://gerrit.wikimedia.org/r/869178

Change 869178 merged by jenkins-bot:

[operations/mediawiki-config@master] [beta] Expand mapframe ExternalData

https://gerrit.wikimedia.org/r/869178

Change 870863 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Remove redundant checks from ExternalDataLoader::extend

https://gerrit.wikimedia.org/r/870863

Change 870864 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Make private ExternalDataLoader methods private again

https://gerrit.wikimedia.org/r/870864

Change 870865 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Don't use non-existing geometry as a mask

https://gerrit.wikimedia.org/r/870865

Change 870886 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Fix geomask expansion possibly failing when HTTP request fails

https://gerrit.wikimedia.org/r/870886

Change 870889 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mapdata@master] Don't use non-existing geometry as a mask

https://gerrit.wikimedia.org/r/870889

Change 866431 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Skip invalid ExternalData without a url

https://gerrit.wikimedia.org/r/866431

Change 870863 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Remove redundant checks from ExternalDataLoader::extend

https://gerrit.wikimedia.org/r/870863

Change 870886 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Fix geomask expansion possibly failing when HTTP request fails

https://gerrit.wikimedia.org/r/870886

Change 866431 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Skip invalid ExternalData without a url

https://gerrit.wikimedia.org/r/866431

thiemowmde renamed this task from Move geoshape expansion to Kartographer parse-time to [Epic] Move geoshape expansion to Kartographer parse-time.Jan 5 2023, 8:41 AM

Change 870864 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Make private ExternalDataLoader methods private again

https://gerrit.wikimedia.org/r/870864

awight moved this task from Doing to Demo on the WMDE-TechWish-Sprint-2023-01-18 board.

We'll enable this feature today—there should be no noticeable user-facing changes, this is just an optimization.

Change 884496 had a related patch set uploaded (by Awight; author: Awight):

[operations/mediawiki-config@master] Revert "Enable kartographer external data parse time fetch for all wikis"

https://gerrit.wikimedia.org/r/884496

Change 884496 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "Enable kartographer external data parse time fetch for all wikis"

https://gerrit.wikimedia.org/r/884496

Mentioned in SAL (#wikimedia-operations) [2023-01-30T12:55:12Z] <awight@deploy1002> Started scap: Backport for [[gerrit:884496|Revert "Enable kartographer external data parse time fetch for all wikis" (T323113)]]

Mentioned in SAL (#wikimedia-operations) [2023-01-30T13:14:37Z] <awight@deploy1002> Started scap: Backport for [[gerrit:884496|Revert "Enable kartographer external data parse time fetch for all wikis" (T323113)]]

Mentioned in SAL (#wikimedia-operations) [2023-01-30T13:16:12Z] <awight@deploy1002> awight: Backport for [[gerrit:884496|Revert "Enable kartographer external data parse time fetch for all wikis" (T323113)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-01-30T13:23:11Z] <awight@deploy1002> Finished scap: Backport for [[gerrit:884496|Revert "Enable kartographer external data parse time fetch for all wikis" (T323113)]] (duration: 08m 34s)

So, we need to consider an exciting detail of this feature: once the external data is expanded, it changes the GeoJSON mapdata which in turn causes the group ID to change. This makes new and old image URLs and mapdata responses incompatible even for the same revision. We need a the migration which allows these to interoperate safely when the feature is deployed. For example, hash the GeoJSON before expanding external data?

Change 886340 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Skip parse-time expansion during preview

https://gerrit.wikimedia.org/r/886340

Change 886340 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Skip parse-time expansion during preview

https://gerrit.wikimedia.org/r/886340

Change 886369 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Kartographer@master] Fix incomplete parse-time expansion for "page" data source

https://gerrit.wikimedia.org/r/886369

Change 886369 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Fix incomplete parse-time expansion for "page" data source

https://gerrit.wikimedia.org/r/886369

Change 870865 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/extensions/Kartographer@master] Don't use non-existing geometry as a mask

Reason:

Obsolete via T332973.

https://gerrit.wikimedia.org/r/870865

Change 870889 merged by jenkins-bot:

[mapdata@master] Don't use non-existing geometry as a mask

https://gerrit.wikimedia.org/r/870889