⚓ T58711 [Task] Update the wb_terms table so it does not have a numeric entity id

Status	Subtype	Assigned	Task
Open		None	T115858 Design improvements for mw.ForeignStructuredUpload.BookletLayout
Open		None	T115865 Insert image in content immediately after it's uploaded, skipping the "General settings" step
Duplicate		None	T115864 Figure out if the description of the image can be used as the caption on-wiki
Open	Feature	None	T53032 When inserting an image, set its caption by default to be the Commons image description
Open	Feature	None	T39534 Wikimedia Commons should support searching by color
Duplicate		None	T39535 Wikimedia Commons should support filtering by color
Resolved		None	T19503 Provide metadata support on Wikimedia Commons
Resolved		None	T51662 VisualEditor: Use Multimedia/Wikidata's proposed rich structured meta-data in the image insertion dialog
Resolved		None	T68108 [Epic] Store media information for files on Wikimedia Commons as structured data
Duplicate		None	T66288 basic support for structured data on mediawiki files
Invalid		Lydia_Pintscher	T76012 make use of new entity type for multimedia / structured data of media files
Open		None	T109579 [Epic] Give more sister projects access to Wikidata
Open		None	T187900 There is no way to reference a specific quote on Wikiquote
Stalled		None	T71753 [Story] Wikibase / Wikidata support on Wikiquote
Open		None	T67626 [Epic] Support for queries on-wiki (automated list generation)
Resolved		Addshore	T76019 [Story] Support new types of Entities in Wikibase Client
Resolved		thiemowmde	T135650 [Task] Migrate PropertySuggester away from assuming all entities are numeric
Resolved		Addshore	T75496 [Epic] Support new types of Entities in Wikibase Repository
Declined		None	T58711 [Task] Update the wb_terms table so it does not have a numeric entity id
Invalid		None	T70378 Write migration script for schema changes to wb_terms table
Declined		None	T114903 Migrate wb_terms to using prefixed entity IDs instead of numeric IDs
Resolved		Lydia_Pintscher	T159718 Evaluate how to best add a column for full entity ID to wb_terms without affecting wikidata.org users
Resolved		None	T159851 Add a column for full entity ID to wb_terms table
Resolved		WMDE-leszek	T168708 Populating full entity column in wb_terms table when running MW update.php should be fast and run only once
Declined		None	T169720 Remove column storing numeric entity ID in wb_terms once it has become redundant
Declined		None	T169721 Remove term_entity_id during database schema update when term_full_entity_id is used instead
Declined		None	T169722 Remove term_entity_id column from wb_terms table when no longer needed on WMF cluster
Resolved		Ladsgroup	T188995 Clean up the code base for term_entity_id
Resolved		aude	T162533 Script for populating term_full_entity_id column in wb_terms table
Resolved		Marostegui	T162539 Deploy schema change for adding term_full_entity_id column to wb_terms table
Resolved		Ladsgroup	T162673 Adjust code reading from wb_terms table to use full entity ID
Resolved		Andrew	T167114 Open view for term_full_entity_id in wb_terms table in labs
Resolved		Ladsgroup	T171460 Populate term_full_entity_id on www.wikidata.org
Resolved		aude	T171461 Populate term_full_entity_id on test.wikidata.org
Resolved		Ladsgroup	T167229 Change configuration of Wikidata in production to write term_full_entity_id
Resolved		Ladsgroup	T165197 Change configuration of test Wikidata to write term_full_entity_id
Resolved		Ladsgroup	T172776 Property labels missing on some items

• bzimport raised the priority of this task from to High. Nov 22 2014, 2:15 AM

• bzimport added a project: MediaWiki-extensions-WikibaseRepository.

• bzimport set Reference to bz56711.

• bzimport added a subscriber: Unknown Object (MLST).

Tobi_WMDE_SW created this task. Nov 7 2013, 9:55 AM

Rationale: we are dropping the assumption that IDs will always be prefix+number. For the current code and use case (wikidata.org) this works fine, but we need to migrate away from it in order to support things like metadata storage on Commons.
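
To illustrate the assumption being dropped, here is a minimal sketch in Python. It is not Wikibase code: `parse_numeric_id`, `storage_key` and the sample non-numeric ID are hypothetical and only show why a numeric column cannot hold every kind of ID.

```python
import re
from typing import Tuple

# Hypothetical pattern for the old assumption: one prefix letter plus a number.
_NUMERIC_ID = re.compile(r"^([A-Za-z])([1-9][0-9]*)$")


def parse_numeric_id(entity_id: str) -> Tuple[str, int]:
    """Split an ID such as 'Q42' into ('Q', 42); raise ValueError otherwise."""
    match = _NUMERIC_ID.match(entity_id)
    if match is None:
        raise ValueError(f"not a prefix+number ID: {entity_id!r}")
    return match.group(1), int(match.group(2))


def storage_key(entity_id: str) -> str:
    """Treat the ID as an opaque string, so any ID scheme can be stored."""
    return entity_id


def fits_numeric_column(entity_id: str) -> bool:
    """Report whether an ID could be stored in a numeric-only column."""
    try:
        parse_numeric_id(entity_id)
    except ValueError:
        return False
    return True
```

With these helpers, `fits_numeric_column("Q42")` holds, while a made-up ID such as `"file:Example.jpg"` can only be stored through `storage_key`, that is, as a full string ID.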

https://gerrit.wikimedia.org/r/#/c/101197/

I fixed compatibility with SQLite and several other issues. The tests now pass: https://gerrit.wikimedia.org/r/#/c/114490/

The commits do some things I don't like, but those can be fixed once we have gotten rid of the main issue, the bad assumption in the table, which this commit fixes.

Springle wrote at https://gerrit.wikimedia.org/r/#/c/101197/

This one still seems dangerous to me :-) I understand the reason for the change, however please do also consider:

1. Have we done any real profiling of the new query forms against the production dataset? I'd really like to see how much of an impact this has on data and index disk usage, and more importantly on runtime memory usage. Happy to do this if a dev can generate a few thousand samples of each query type...

2. Would it be wise to keep a numeric entity ID field as an interim step on the wikidata production dataset, so we can fail back if need be? I.e., treat this as a denormalization step (which is /all it is/ for now) until #1 is assured? That might even make the migration less painful.

3. VARBINARY(255) smells like an arbitrary size choice :-) Variable field widths really start to matter for large datasets, as the server must convert them to fixed-width BINARY while working. If the choice /was/ arbitrary, can we arbitrarily choose to make this smaller from the get-go?
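
The interim step suggested above (keep the numeric column, add the full ID next to it) could look roughly like the following. This is only a sketch against an in-memory SQLite database, not the production schema change: `term_entity_id` and `term_full_entity_id` are the column names used in the related tasks, while `term_entity_type`, the `Q`/`P` prefix mapping, the trimmed-down table definition and the helper names are assumptions made for illustration.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny stand-in for wb_terms (assumed, simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wb_terms ("
        " term_entity_id INTEGER NOT NULL,"
        " term_entity_type TEXT NOT NULL,"
        " term_text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO wb_terms VALUES (?, ?, ?)",
        [(42, "item", "example item label"),
         (31, "property", "example property label")],
    )
    return conn


def add_full_id_column(conn: sqlite3.Connection) -> None:
    """Add the full-ID column and backfill it; the numeric column stays."""
    conn.execute("ALTER TABLE wb_terms ADD COLUMN term_full_entity_id TEXT")
    # Assumed prefix mapping; only items and properties are handled here.
    conn.execute(
        "UPDATE wb_terms SET term_full_entity_id ="
        " CASE term_entity_type"
        "  WHEN 'item' THEN 'Q' || term_entity_id"
        "  WHEN 'property' THEN 'P' || term_entity_id"
        " END"
        " WHERE term_full_entity_id IS NULL"
    )


def longest_full_id(conn: sqlite3.Connection) -> int:
    """Return the longest stored full ID, as input for choosing a column width."""
    row = conn.execute(
        "SELECT MAX(LENGTH(term_full_entity_id)) FROM wb_terms"
    ).fetchone()
    return row[0] or 0
```

Because `term_entity_id` is left untouched, readers can switch back to it if the new column misbehaves. `longest_full_id` gives a measured starting point for a width smaller than 255.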

Lydia_Pintscher added a project: Wikidata. Dec 1 2014, 2:27 PM

Lydia_Pintscher removed a subscriber: Unknown Object (MLST). Dec 1 2014, 2:29 PM

Lydia_Pintscher moved this task from incoming to needs discussion or investigation on the Wikidata board. Dec 27 2014, 4:11 PM

Lydia_Pintscher moved this task from needs discussion or investigation to consider for next sprint on the Wikidata board. Jun 19 2015, 2:38 PM

Lydia_Pintscher moved this task from consider for next sprint to hold on the Wikidata board. Jun 30 2015, 2:03 PM

Fundamental questions came up at today's sprint start: Do we still want this at all? We should try to get rid of the database table anyway, and the change is very expensive on such a large table, so we want to avoid it unless absolutely necessary.

thiemowmde renamed this task from Update the wb_terms table so it does not have a numeric entity id to [Task] Update the wb_terms table so it does not have a numeric entity id. Aug 13 2015, 4:25 PM

thiemowmde updated the task description.

thiemowmde removed a project: Patch-For-Review.

thiemowmde set Security to None.

thiemowmde removed a subscriber: • Wikidata-bugs.

Restricted Application added a subscriber: Aklapper. Aug 13 2015, 4:25 PM

Ricordisamoa subscribed. Aug 22 2015, 2:14 PM

• adrianheine added a subtask: T114903: Migrate wb_terms to using prefixed entity IDs instead of numeric IDs. May 18 2016, 11:50 AM

daniel added a parent task: T140891: Factor EntityPerPage::getEntitiesWithoutTerm out into it's own service. Jul 20 2016, 12:40 PM

hoo removed a parent task: T140891: Factor EntityPerPage::getEntitiesWithoutTerm out into it's own service. Oct 10 2016, 5:27 PM

daniel added a parent task: T159703: [Tracking] Support for Forms and Senses. Mar 9 2017, 2:35 PM

Lydia_Pintscher closed subtask T70378: Write migration script for schema changes to wb_terms table as Invalid. Apr 23 2017, 5:17 PM

Lydia_Pintscher removed a parent task: T159703: [Tracking] Support for Forms and Senses. May 8 2017, 2:01 PM