Option to remove translation pages from search results
Open, Low, Public, Feature

Description

This includes:

  • new keyword in CirrusSearch to remove them from search results
  • new option in AdvancedSearch to do so

Event Timeline

Restricted Application added a subscriber: Aklapper.
Aklapper changed the subtype of this task from "Task" to "Feature Request".

CirrusSearch already provides inlanguage:en: https://www.mediawiki.org/wiki/Help:CirrusSearch#Inlanguage
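For illustration, a minimal sketch of how that keyword could be appended to a query sent through the MediaWiki Action API (`action=query&list=search`). The helper name `build_search_url` and the choice of the mediawiki.org endpoint are assumptions made for this example; it only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumption: the standard Action API endpoint on mediawiki.org.
API_URL = "https://www.mediawiki.org/w/api.php"


def build_search_url(terms: str, language: str = "en") -> str:
    """Build a search URL restricted with the CirrusSearch inlanguage: keyword.

    build_search_url is a hypothetical helper, not part of any library.
    """
    params = {
        "action": "query",
        "list": "search",
        # The keyword is part of the search string itself.
        "srsearch": f"{terms} inlanguage:{language}",
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("translate"))
```

Note that this filters by the page language recorded for each page, so it cannot tell a source page from its /en translation subpage.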

Reopening:

  • Each translatable page also has a /en subpage
  • In fact, the Translate extension allows pages in any language to be marked for translation, and the language of a page can be changed arbitrarily (even without the Translate extension)

Thanks for filing. Just so I better understand your comments above, are you saying that the CirrusSearch language filter is not sufficient because translation pages have English subpages and still show up in search results? I'm trying to better understand why the language filter workaround doesn't work for you.

First, translation pages have English subpages. Second, although it is uncommon, any page on any wiki can be in any language, and there is an interface (Special:PageLanguage) to change it.

Examples:

A big source of frustration here is that for certain keywords on wikis like mediawiki.org or Meta you will get a ton of half-translated Extension:Translate subpages, all containing the English original and thus having high specificity scores while actually being utterly irrelevant. Sometimes search is borderline useless because of that.

I think the ideal solution there would be to not index untranslated text on translated subpages, or maybe somehow coalesce translations of the same page into a single search result.
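As a rough sketch of the coalescing idea, the snippet below groups result titles by their source page on the client side. It is not how CirrusSearch would implement it. It assumes translation subpages are named `Page/<language code>`, and `LANGUAGE_CODES`, `base_title` and `coalesce` are hypothetical names introduced only for this example.

```python
# Assumption: a small illustrative set of language codes; a real wiki
# would take this list from its own configuration.
LANGUAGE_CODES = {"en", "de", "fr", "ja", "ru", "zh", "pt-br"}


def base_title(title: str) -> str:
    """Strip a trailing /<language code> subpage suffix, if present."""
    head, sep, tail = title.rpartition("/")
    if sep and head and tail.lower() in LANGUAGE_CODES:
        return head
    return title


def coalesce(titles):
    """Group result titles by source page, keeping first-seen order."""
    groups = {}
    for title in titles:
        groups.setdefault(base_title(title), []).append(title)
    return groups


if __name__ == "__main__":
    results = [
        "Extension:Translate/de",
        "Extension:Translate",
        "Help:CirrusSearch/fr",
        "Extension:Translate/en",
        "Manual:Hooks/PageSaveComplete",
    ]
    for page, variants in coalesce(results).items():
        print(page, variants)
```

A suffix check like this is only a heuristic: an ordinary subpage whose name happens to match a language code would be merged by mistake, so a real implementation would need to know which pages are actually marked for translation.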