Wiktionary:Requests for moves, mergers and splits
This page is designed to discuss moves (renaming pages), mergers and splits. Its aim is to take the burden away from the Beer Parlour and Requests for Deletion where these issues were previously listed. Please note that uncontroversial page moves to correct typos, missing characters etc. should not be listed here, but moved directly using the move function.
- Appropriate: Renaming categories, templates, Wiktionary pages, appendices, rhymes and occasionally entries. Merging or splitting categories, templates, Wiktionary pages, appendices, rhymes.
- Out of scope: Merging entries which are alternative forms or spellings or synonyms such as color/colour or traveled/travelled. Unlike Wikipedia, we don’t redirect in these sorts of situations. Each spelling gets its own page, often employing the templates {{alternative spelling of}} or {{alternative form of}}.
- Tagging pages: To tag a page, you can use the general template {{rfm}}, as well as one of the more specific templates {{move}}, {{merge}} and {{split}}.
Note that discussions for splitting, merging, and renaming languages, once held here, are now held at WT:Language treatment requests.
2015
West African Pidgin English varieties
Ethnologue has assigned codes to some but not all of the varieties of West African Pidgin English, and we in turn have incorporated some (e.g. pcm) but not all (e.g. not gpe) of those codes. As WP notes, the "contemporary English-based pidgin and creole languages are so similar that they are sometimes grouped together under the name 'West African Pidgin English'" (a name which also denotes their predecessor which developed in the 1700s). WP's examples are illustrative, particularly in that its Ghanaian and Nigerian Pidgin English examples are identical. I propose to merge at least the following three varieties into wes, renaming it "West African Pidgin English":
- Ghanaian Pidgin English (gpe)
- Nigerian Pidgin English (pcm)
- Cameroonian Pidgin English (wes)
We could also discuss whether or not to merge Sierra Leone Krio (kri, which WP notes is often mistaken for English slang due to its similarity to English, but which has a somewhat distinct alphabet), Pichinglis / Fernando Po Creole (fpe), and Liberian Kreyol / Liberian Pidgin English (lir). - -sche (discuss) 21:11, 11 August 2015 (UTC)
- The question is a very complex one. Firstly (but of least importance), scholars are divided on which lects have creolised and which have not, but it is generally agreed upon that at least some of the languages you mentioned are not pidgins, which would make the name "West African Pidgin English" somewhat of a misnomer (the more neutral name "Wes-Kos" has been suggested as an alternative, but even linguists haven't fully adopted it). Secondly, all these lects are remarkably similar on a lexical level, but that's unsurprising; after all, they resulted from separate but very similar language contact events, and then probably modified each other (one scholar posits that Krio and Cameroonian Pidgin English relexified each other to some degree after pidginisation). The similarities are also obscured by the fact that there is nothing close to an agreed orthography for most of these, and pronunciation does differ a bit across West Africa. Linguistically, I'd probably merge them all, but practically that may not be the best decision. I know we have entries in pcm, but probably next to nothing for the rest, and if somebody wants to add them, given how each lect is very neatly assigned to a certain West African country, at least it won't be confusing for them to do so. Conclusion: the literature is schizophrenic, the lects mutually intelligible, and the existing situation remarkably unproblematic. Therefore I abstain. —Μετάknowledgediscuss/deeds 21:19, 16 August 2015 (UTC)
Per Wiktionary:Votes/2011-04/Lexical categories, move:
- Category:en:Exonyms -> Category:English exonyms
- and also all the language variations (Category:es:Exonyms -> Category:Spanish exonyms, etc.)
Rationale: This makes these categories nominally consistent with all other categories that describe the words ("Category:English blablabla") rather than their meanings ("Category:en:blablabla"), such as all categories listed in Category:English terms by etymology.
In fact, I believe Category:English exonyms should be a subcategory of Category:English terms by etymology.
It's interesting to note that Category:English terms by etymology was once called Category:en:Etymology before it was moved multiple times. --Daniel Carrero (talk) 23:22, 11 October 2015 (UTC)
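For anyone scripting the rename proposed above, the title mapping itself can be sketched roughly as follows. This is only an illustration, not how any existing bot or module does it: the function name `pos_style_name`, the `LANGUAGE_NAMES` table and its two entries are assumptions made up for the example, and a real run would need the full code-to-name data plus the parent-category changes discussed in this thread.

```python
import re

# Hypothetical, partial table of language codes to canonical names.
# A real job would take this from the site's language data instead.
LANGUAGE_NAMES = {"en": "English", "es": "Spanish"}

# Matches topic-style titles such as "Category:en:Exonyms".
TOPIC_CAT = re.compile(r"^Category:([a-z-]+):(.+)$")


def pos_style_name(title, names=LANGUAGE_NAMES):
    """Return the "Category:English exonyms"-style title for a
    "Category:en:Exonyms"-style title, or None if the title is not in
    the topic-style format or its language code is unknown."""
    match = TOPIC_CAT.match(title)
    if match is None:
        return None
    code, label = match.groups()
    language = names.get(code)
    if language is None:
        return None
    # Lower-case the first letter of the label ("Exonyms" -> "exonyms").
    # This would be wrong for labels that start with a proper noun.
    return f"Category:{language} {label[0].lower()}{label[1:]}"


if __name__ == "__main__":
    for old in ("Category:en:Exonyms", "Category:es:Exonyms"):
        print(old, "->", pos_style_name(old))
```

Titles that are already in the target format, or that use a code missing from the table, come back as `None`, so a script built on this would skip them rather than guess.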
- Being an exonym is not a matter of how a word was created. In fact, terms often don't start off as exonyms, but become exonyms as the languages diverge and evolve. So it's not appropriate to put it under etymology. —CodeCat 00:11, 12 October 2015 (UTC)
Oppose: Exonyms should remain as a category and English exonyms should be a subcategory of it.Purplebackpack89 20:15, 12 October 2015 (UTC)
- I nominated specifically "Category:en:Exonyms -> Category:English exonyms", you mentioned "English exonyms should be […] ", so I don't see how this would work as an oppose vote to my nomination. I don't suppose you wanted the category to remain named "Category:en:Exonyms", right?
- In any event, the format that other umbrella categories use according to Wiktionary:Votes/2011-04/Lexical categories is "Category:Exonyms by language" -> "Category:English exonyms". Like "Category:Nouns by language" -> "Category:English nouns". --Daniel Carrero (talk) 00:16, 13 October 2015 (UTC)
- Oh, sorry, I missed the "en" in there. Retracting my vote. Purplebackpack89 00:22, 13 October 2015 (UTC)
- No problem, thank you. --Daniel Carrero (talk) 00:26, 13 October 2015 (UTC)
- This should not be controversial, but it's wise to check. DCDuring TALK 23:32, 14 October 2015 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 03:11, 29 October 2021 (UTC)
- This would be a good bot job. - excarnateSojourner (talk | contrib) 00:39, 12 April 2022 (UTC)
- I was going to move these categories as proposed using excarnateSojournerBot, but I discovered that Category:Exonyms's data in Module:category tree/topic cat/data/Places lists
"places"
as one of its parents, which (because it is a topic cat) makes e.g. Category:en:Places a parent of Category:en:Exonyms. I expect if I move the exonyms code from Module:category tree/topic cat/data/Places to Module:category tree/poscatboiler/data/names it will then try to make e.g. Category:English exonyms a child of Category:English places when we instead want Category:Places to remain a topic cat. So I think changing Category:Exonym's parent to something like Category:Terms by etymology subcategories by language will be a necessary part of this operation (but see CodeCat's objection above). Category:Places also currently contains improper nouns such as track and fringe, which should not be descendants of Category:Names. — excarnateSojourner (talk · contrib) 06:32, 27 December 2022 (UTC)
- The situation of all of our names categories is complicated, compounded by the unclear scope of some, e.g. the exonyms category seems to only contain place exonyms, not other exonyms like German or Xerxes. And suppose someone attests a foreign exonym of an English-speaking place [e.g. Japanese-derived "Rondon" for "London"] in English, the way e.g. Deutsch#English or google books:"speak Eigo" are attested in English: would that go in the "English exonyms"/"en:Exonyms" category?
It's been suggested that we need to revamp the system more widely, also doing something about e.g. transliterations of foreign names (Pyotr, Putin, Kaifeng, etc); even the question of whether and how Placenames should be a subset of Names has come up before, though I'm having trouble finding the discussion (I think there's more than just the discussion in the section immediately below this one, and Category talk:en:Place names and Category talk:en:Names and WT:Info desk/2013/July, but I can't find it offhand). On a balance, names are a lot more like a "POS" category than a "topic" category. I agree they aren't per se terms by etymology, since as noted above, they only sometimes originate as exonyms, sometimes they originate as endonyms and then the speakers of the language get forcibly relocated, or the language evolves into two. (Is Icelandic Rín an exonym for the Rhine? Icelanders do not live near the Rhine, but the name goes back to when their ancestors did...) - -sche (discuss) 16:36, 27 December 2022 (UTC)
Recategorize into Category:Names by language
Pinging some editors from the discussion above: @User:Daniel Carrero, @User:Rua, @User:Purplebackpack89, @User:-sche
As I explained above, it seems infeasible to rename cat:Exonyms (and its subcategories) without also changing what its parent category is. So I propose we remove cat:Exonyms from cat:Places, add it to cat:Names by language, rename it to cat:Exonyms by language, and rename its subcategories to e.g. cat:English exonyms. Exonyms are not places; they are names. I realize this would extend the breadth of cat:English names and its siblings. I think this makes sense, but I would also accept cat:Exonyms by language being under cat:Terms by semantic function by language. — excarnateSojourner (talk · contrib) 03:48, 25 February 2023 (UTC)
Continuation of #Category:en:Names into Category:English names
Reviving the earlier discussion, I'm still bothered by the fact that we have two different categories for names. But the previous discussion also made it clear that it's not as easy as just merging them.
- I think Category:en:Place names should probably be renamed to Category:en:Places, since it's really meant to contain terms for places. That is, since it's a topical/set-type category, the focus should be on the referent of the word, whereas part-of-speech categories like Category:English names focus on the word itself. A word is a name, and it refers to something bearing that name.
- Category:en:Named roads should probably be given some other parent than Category:en:Names; roads are not a subset of names, after all. We already have Category:en:Roads, so removing the names category would be enough.
- Category:en:Transliteration of personal names should probably be renamed and made to fit into Category:English terms transliterated from other languages somehow. Transliteration of a name doesn't seem particularly different from transliteration of any other word, so we might also just decide to get rid of the distinction and merge them entirely.
- Category:en:Demonyms is a bit more problematic and I brought it up before, though I don't remember where. "Demonym", again, is a term focused on the word, not the referent. A word is a demonym. Perhaps this could be renamed to something else? Category:en:Peoples maybe?
- Category:en:Languages could probably just be removed from the category.
- Category:en:Letter names seems like a good candidate to be renamed to Category:English letter names, to fit alongside Category:English letters.
- Category:en:Couple nicknames I don't really know about. I suppose it's thematically quite similar to Category:English female given names? So Category:English couple nicknames?
- Category:English surnames from Japanese, finally, should just be removed from the category, as it has more suitable parents already.
—CodeCat 00:45, 10 November 2015 (UTC)
- FWIW, what I am going to say is somewhat off-topic and maybe I'm in the minority on that, but I would not mind using the naming system "Category:English xxxx" for all topical categories: Category:en:Chess -> English terms related to chess. (or any better name along those lines) --Daniel Carrero (talk) 00:59, 10 November 2015 (UTC)
- "Category:en:Transliteration of personal names" could be renamed to "Category:English names transliterated from other languages", I suppose. What's the matter with the demonyms category? It contains demonyms, as expected. Would it be better titled "English demonyms", on the model of "English phrases"? - -sche (discuss) 06:02, 10 November 2015 (UTC)
- "Category:en:Transliteration of personal names" would be better named "English transliterations of (foreigners') personal names". Notice the existence of e.g. Category:Latvian transliterations of English names. Names of non-English speakers are not English names. I agree with CodeCat that place names belong to topic categories. --Makaokalani (talk) 14:32, 10 November 2015 (UTC)
- Here's the old discussion if anyone wants to read it. - excarnateSojourner (talk | contrib) 15:58, 12 April 2022 (UTC)
- Category:en:Place names was deleted by Equinox in 2017-05 because it was empty. Category:Transliteration of personal names (and its language-specific subcategories) were moved to Category:Foreign personal names in 2021-09 with the help of WingerBot. - excarnateSojourner (talk | contrib) 16:14, 12 April 2022 (UTC)
- Move Category:en:Demonyms to Category:English demonyms. This would be another job for a bot. - excarnateSojourner (talk | contrib) 04:57, 5 October 2022 (UTC)
- cat:en:Demonyms has the same problem as cat:en:Exonyms (as explained in the discussion above): it is a child of cat:en:Places, so moving it will not be straightforward. — excarnateSojourner (talk · contrib) 03:51, 6 February 2023 (UTC)
- @ExcarnateSojourner There being no opposition here, only support (albeit mostly old support), and no opposition or interest when I brought this up in the BP, let's revise whatever needs to be revised to put (at a minimum) all given names and surnames into subcategories of Category:Names by language, instead of some of them being in subcategories of Category:Names. The split is haphazard and arbitrary; I see the intention — put a name that was given within English in one top-level category and a name transliterating a foreign name in a different top-level category — but in practice that's not maintained, since e.g. Alexandra in the context of discussing ancient Greek is transliterating the Ancient Greek name, Sergei has been given to babies born in the Anglosphere (and to characters in English fiction), and we don't maintain such a split with place names. - -sche (discuss) 16:01, 24 April 2023 (UTC)
- It making no sense to have Alexandra (in works about ancient Greece where it's romanizing a Greek name), Alexandra (in fiction about ancient Greece where it's a given name), Alexandra (as borne by British or American people today), Sonya, Vadim and Vladimir divided haphazardly into two different top-level categories, "Names" vs "Names by language", I'm now (attempting) editing the modules to consolidate them into "Names by language" subcategories. - -sche (discuss) 14:37, 5 May 2023 (UTC)
- (Assistance solicited at Module talk:names#en:Russian_male_given_names,_etc.) - -sche (discuss) 14:48, 5 May 2023 (UTC)
Recategorize Category:Demonyms and Category:Ethnonyms
Pinging some editors from the discussion above: @User:Rua, @User:Daniel Carrero
As I explained in the discussion about exonyms above, renaming the language-specific subcategories of cat:Demonyms properly will require removing it from the topic category tree and adding it to the set category tree. We should similarly recategorize cat:Ethnonyms, another child of cat:Names that did not yet exist when this discussion started. I propose recategorizing them into Category:Terms by semantic function subcategories by language, unless someone can find a better place, and renaming them cat:Demonyms by language and cat:Ethnonyms by language. — excarnateSojourner (talk · contrib) 06:55, 25 February 2023 (UTC)
2016
Linear A
Strangely enough we have a language code for Linear A [lab], even though Linear A is a writing system and not a language. I have no idea why it was encoded or why we have it. -- Liliana • 15:01, 5 March 2016 (UTC)
- It's very odd. The script code for Linear A is "Lina"; the language code for Minoan is "omn"; but there's also a language code "lab" for a language called "Linear A". I have no idea what ISO and SIL were thinking, but I'm in favor of deleting "lab" from our modules. —Aɴɢʀ (talk) 17:43, 5 March 2016 (UTC)
- I'll bet their thinking is that the language written in the script may be an unknown language, which would be consistent with w:Linear A. There do seem to be a large number of hypotheses about Linear A, nearly on the same order as the total number of recorded instances of the script. DCDuring TALK 18:33, 5 March 2016 (UTC)
- I see. Reading Minoan language more carefully, I see that it's written in both Cretan hieroglyphs and Linear A, but since neither writing system has been deciphered, it isn't known whether it's the same language in two writing systems or two different languages. So maybe "omn" means Minoan in Cretan hieroglyphs and "lab" means Minoan in Linear A, and they may or may not refer to the same language. Given that the language is unknown and undeciphered, I wonder why we have one Minoan lemma: kuro. How do we know this word was pronounced "kuro" and that it means "total"? —Aɴɢʀ (talk) 07:25, 6 March 2016 (UTC)
- It's in the wrong script anyway (it was added before Unicode covered Linear A), but afaik Linear A can be read simply by using the known values for Linear B syllables, which are visually similar. This word is always found at the end of lists, followed by a number, so the meaning was easy to figure out. -- Liliana • 10:39, 6 March 2016 (UTC)
- I reckon we should indeed delete this language code; we can always change our minds once some decoding happens. @Liliana-60 (or anyone else), can we move kuro#Minoan to a Linear A entry? —Μετάknowledgediscuss/deeds 00:25, 2 April 2016 (UTC)
I see no evidence that this exists as a separate language, and move that it be merged with tr. The literature which references it seems to describe the dialect of Turkish which may be spoken by Gagauz people in the Balkan Peninsula. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)
- Wikipedia, citing Ethnologue, insists that Balkan Gagauz Turkish, Gagauz, and Turkish are all separate, and a few sources do seem to take that view, e.g. Cem Keskin, Subject agreement-dependency of accusative case in Turkish, or, Jump-starting grammatical machinery (2009) speaks of "Balkan Gagauz Turkish, Gagauz, Turkish, Iraqi Turkmen, North and South Azerbaijani, Salchuq, Aynallu, Qashqay, Khorasan Turkic, Turkmen, Oghuz Uzbek, Afshar, and possibly Crimean Tatar". Other references speak of Balkan Gagauz Turkish as a variety of Gagauz, e.g. James Minahan's Encyclopedia of the Stateless Nations says "The Gagauz speak a Turkic language [...] also called Balkan Gagauz or Balkan Turkic, [which] is spoken in two major dialects, Central and Southern, with the former the basis of the literary language. Other dialects [include] Maritime Gagauz" (which comports with w:Gagauz's list of its dialects). Matthias Brenzinger's Language Diversity Endangered also treats Balkan Gagauz "or slightly misleading, Balkan Turkic" in his entry on Gagauz, but says that the Balkan "varieties might deserve the status of outlying languages but very little information is available about them." (A few generalist references seem to subsume all gag into tr.) I would leave them all separate, pending more conclusive evidence that they should be merged. - -sche (discuss) 23:58, 3 July 2016 (UTC)
- I think there's some confusion about what exactly we're talking about, and whether it's Gagauz or Turkish. Just because they use the term "Balkan Gagauz Turkish" doesn't mean that they're referring to the language with ISO 639-3 code bgx. When I look at who's citing the references listed for bgx at Glottolog, Manević (the reference for its classification) is cited in papers clearly talking about the dialects of tr. These are the only actual words attributed to this lect that I can find. —Μετάknowledgediscuss/deeds 00:33, 4 July 2016 (UTC)
- @Tropylium, on the subject of Turkic languages spoken in Europe, do you know anything about this one, and about its differences or similarity to Gagauz and standard Turkish? - -sche (discuss) 01:08, 11 May 2017 (UTC)
- I'm not previously familiar with this dispute, but here are a few handbooks on the topic:
- Menges in The Turkic Languages and Peoples has the following slightly complicated quote (p. 11): "The Turkic languages spoken farthest west are the Balkanic dialects of Osman and Gagauz in Bosnia, Bulgaria and Macedonia. These seem to form two groups, one of possibly pre-Osman origin, and a later Osman one. To the former belong the Gaǯaly in Deli-Orman (Eastern Bulgaria), who, according to V. A. Moškov, are descended from the Päčänäg, Uz, and Torci (?), the Surguč, numbering about 7000 people in the district (vilājät) of Edirnä, who call themselves Gagauz. In Moškov's opinion, they, too, go back to the Päčänägs (?) and the Macedonian Gagauz; they number ca. 4000 people in southeastern Macedonia." — It seems clear that some group(s) corresponding to "Balkan Gagauz" is being identified here, but I am not even sure how to parse the sentence structure; e.g. are "Uz" and "Torci" some of the pre-Osman Turkic groups, or some of the alleged ancestors of the Gaǯaly? ("Osman" is, of course, Turkish.)
- Hendrik Boeschoten in a classificatory chapter in Routledge's The Turkic Languages mentions that "a few speakers [of Gagauz] in northern Bulgaria, Romania and Greece, adhere to the Orthodox faith, and have their own history." This again seems to refer to "Balkan Gagauz", but with no indication of being its own language.
- So far I would gather from this that "Balkan Gagauz" is at most a sister language of "non-Balkan Gagauz", and perhaps indeed just a different dialect group (perhaps one whose features are not reflected in written standard Gagauz). But the Manević 1954 paper would be more informative on this topic, if anyone wants to hunt it down. --Tropylium (talk) 11:55, 11 May 2017 (UTC)
- @Allahverdi Verdizade, Crom daba: Here's an old, unresolved issue that could benefit from Turkicist eyes. —Μετάknowledgediscuss/deeds 23:59, 8 September 2018 (UTC)
- I think Balkan Gagauz should be merged with gag, especially since it contains no entries. The few terms that would be specific for Gagauz spoken outside of the traditional Gagauz area in Moldova/Romania/Bulgaria can be dealt with within gag entries. The only thing is that some etymologies of other Turkic languages sometimes refer to Balkan Gagauz instead of Gagauz, because editors didn't know the difference between the two. Otherwise I don't see any problems with merging the two.
- On the other hand, Gagauz should definitely NOT be merged with Turkish, that is pretty obvious to me. Allahverdi Verdizade (talk) 05:09, 9 September 2018 (UTC)
- @Metaknowledge This is a hard question, I can offer only guesswork.
- I can't find any good maps for the distribution of Gagauz and (Muslim) Turks proper in the Balkans, most don't show Balkan Gagauz at all although we know they exist at least in Bulgaria and Macedonia.
- It seems that they are not easily separated geographically from Muslim Turks although they presumably live in different localities. I'm guessing this means that their languages ("Balkan Gagauz Turkish" and "Rumelian Turkish") could be the same, although maybe only the latter call their language "Turkish", so I guess that they (would?) use Standard Turkish in education and administration.
- This would be a good argument to merge Balkan Gagauz into Turkish, except that this paper shows that Balkan Turkic (if this really is a single language) is quite distinct from Anatolian Turkish and perhaps worth considering a different language. Baskakov also considers Balkan Turkish and (Moldovan) Gagauz to form a clade within Oghuz and Anatolian Turkish and Azerbaijani to form another. Crom daba (talk) 21:35, 30 September 2018 (UTC)
- @Anylai, can you find anything in Turkish on the possible differences between Balkan Gagauz and Rumelian Turkish? Crom daba (talk) 21:38, 30 September 2018 (UTC)
- Merge / delete it. The distribution of the name, the way it is “mentioned”, points towards it being a ghost language. The name is not attestable as used by anyone having particular information about it; nobody can add anything under it either in such a situation where it is a content-filled concept for nobody. Its alleged synonyms “Balkan Turkish” and “Rumelian Turkish” show it is just an SOP term for Turkish as spoken on the Balkans respectively Rumelia, i.e. remnant speakers of the Ottoman rule. German Balkantürkisch, distinguished from Türkeitürkisch as a regiolect. Fay Freak (talk) 13:38, 2 December 2020 (UTC)
Even more languages without ISO codes, part 6
This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)
- Alingpo language (tbq-alp) — perhaps should be named Yiqing
- Alo Teqel language (map-alt)
- Antequera Zapotec (omq-anz) — hard to say how different it is, but it's extinct, so a finite lexicon
- Auteco language (azc-aut)
- I have yet to find content in this lect to judge how different it is from our other Nahuatls. - -sche (discuss) 04:32, 24 May 2017 (UTC)
- Aveteian language (map-ave)
- Wikipedia (and Lyle Campbell, Anna Belew, Cataloguing the World's Endangered Languages, 2018) says this is dix "Dixon Reef". Is it not? (Or if it is, should the name associated with that code be changed?) - -sche (discuss) 20:10, 1 August 2020 (UTC)
- Bantang language (tbq-ban)
- Chashan language (tbq-cha)
- Damu language (sit-dam)
- Daylami language (ira-day)
- I've asked some of our editors of Iranian languages for input. - -sche (discuss) 00:17, 2 August 2020 (UTC)
- Based on feedback there, not added at this time, although I note that content in the language seems to exist, which suggests we would eventually need to figure out a header to include it under. - -sche (discuss) 20:44, 2 August 2020 (UTC)
- Jo language (crp-joo)
- Kasabe language (alv-kas)
- Kasong language (aav-kas) — questionable whether this is a separate language
- Komi-Yazva language (urj-kya)
- Perhaps better prm-kya? Also while I am not convinced treating the Komi varieties as separate languages altogether is the best solution, as long as we do so, we might moreover need Old Komi. --Tropylium (talk) 18:44, 11 July 2016 (UTC)
- Kurbet language (crp-kur)
Australian languages
- Bugurnidja language (aus-bug)
- Dyirringany language (aus-dyi)
- Gulidjan language (aus-gul)
- Gunindiri language (aus-gun)
- Kok Thawa language (aus-kth)
- Kureinji language (aus-kur)
- Mirning language (aus-mir)
- has an ISO code (to be added), see BP - -sche (discuss) 04:17, 14 October 2020 (UTC)
- Ngaro language (aus-ngr)
- Ngaygungu language (aus-ngg)
- Ngumbarl language (aus-ngu)
- has an ISO code (to be added), see BP - -sche (discuss) 04:17, 14 October 2020 (UTC)
- Wik Ompom language (aus-wom)
- Wik Paach language (aus-wpa)
- Yiman language (aus-yim)
Tasmanian and other
- Northeastern Tasmanian:
- Northeastern, Pyemmairre language (aus-pye) Done
- alt names/varieties: Plangermaireener, Plangamerina, Cape Portland, Ben Lomond, Pipers River
- North Midlands, Tyerrernotepanner language (aus-tye) — Bowern considers this a dialect; perhaps we should just trust her
- now has an ISO code which should be added instead, see BP shortly - -sche (discuss) 04:27, 14 October 2020 (UTC)
- Lhotsky/Blackhouse Tasmanian language (aus-lbt) — the worst name in Bowern's set!
- I'm not sure... the very language is "reconstructed" by Bowern on the assumption that three wordlists (of which only two make it into the name) attest the same language, although apparently none of the three bothered to name the language. The chance of someone "would run across [a word in] it and want to know what it means" seems nonexistent. If we wanted to host the wordlists, we could do that in an appendix or on Wikisource. - -sche (discuss) 16:09, 9 August 2016 (UTC)
- Bowern's methods are scientific; but I would feel better if more than one scholar was saying there was one language in this set of wordlists, the way that for e.g. Port Sorrell, Dixon & Crowley and Glottolog agree that there is a unit/lect there. - -sche (discuss) 16:55, 4 June 2017 (UTC)
- and what of "Norman Tasmanian"? - -sche (discuss)
- Here is another language we might need a code for: Ma(') Pnaan (poz-map?), also known by the exonyms Punan Malinau and Punan Segah, a language of Borneo / East Kalimantan, summarized by Antonia Soriente here and elsewhere. Compare the other things listed at Punan language. - -sche (discuss) 05:21, 29 August 2016 (UTC)
Marrithiyel
Maridan [zmd], Maridjabin [zmj], Marimanindji [zmm], Maringarr [zmt], Marithiel [mfr], Mariyedi [zmy], Marti Ke [zmg]: should these be merged? References speak of a singular Marrithiyel language. - -sche (discuss) 21:30, 20 July 2016 (UTC)
Some more missing American languages
Here are a few more North American languages for which we could add codes:
Akokisa (nai-ako).WP says it is attested certainly in two words in Spanish records (Yegsa "Spaniard[s]", which Swanton suggests is similar to Atakapa yik "trade" + ica[k] "people"; and the female name Quiselpoo), and possibly in more words in a wordlist by Jean Béranger in 1721 (if the wordlist is not some other language).
Labrador Inuit Pidgin French, less often called Belle-Isle Pidgin, was spoken in Labrador from the late 1600s (probably since before the 1660s, but first written down in 1694) until at least the mid 1760s, based on Inuktitut, French, Basque, Montagnais, and possibly Spanish and Breton. Louis-Jacques Dorais, An Inuit Pidgin around Belle-isle Strait (1996; with reference to "Clermont - Martijn 1980; Dorais 1980; Bakker 1988"), covers the records:
- Louis Jolliet recorded words at Baie Saint-Louis in 1694, including the 'greeting' thou tcharacou, saying the latter word is "peace", which Dorais says is "corroborated by two other sources, from 1717 (characoua [...]) and 1720 (characo [...]). But a text from 1743 (Privy Council 1927: 3284), written by the French merchant Louis Fornel, gives to characo the meaning 'war'." Thou is probably from tu. The other word could be Basque txarrakoa "bad", thus "are you bad?".
- Le Cour in 1742 records some more words: bons camaras "good comrades", tous camaras "all comrades", capitaine "captain", kellanoré (which Dorais says "seems to be Le Cour's [or the pidgin's?] rendering of Inuktitut kinaunali 'but who is he?'"), the personal name Amargo (a rendering of Amaqqut "Wolves"), rénombek "bead" (probably a loanword), maumek "file" (probably a loanword), monkoumek "knife" (probably a loanword from Montagnais mukuma:n, as spelled in Marguerite Ellen MacKenzie Towards a Dialectology of Cree-Montagnais-Naskapi).
- Louis Fornel in 1743 recorded more: tout camara "all comrades", troquo balena "let us trade whale" (from French troquons!), non characo "no war" (sic, per Fornel).
- Jens Haven wrote other words in 1764-5: makagua "peace" (perhaps from Basque bake[a] "peace" plus a suffix -koa), kutta (French couteau "knife"), memek "to drink" (from Inuktitut imiq "drinking water").
- Few references discuss the lect and it is difficult to judge whether it is really a language or just something like broken French or like Spanglish (which I think we exclude), but the fact that the Inuit apparently changed the meaning and even part of speech of words in their own language when speaking pidgin suggests it is more on the pidgin-language side of that continuum than the code-switching side.
- Algonquian–Basque pidgin (crp-abp). Wikipedia has a sample. The Atlas of Languages of Intercultural Communication, citing Bakker, says it was spoken from at least 1580 (and perhaps as early as 1530s) through 1635, and "only a few phrases and less than 30 words attributable to Basque were written down" (though apparently more words, attributable to other sources, were also recorded).
- Guachichil (Cuauchichil, Quauhchichitl, Chichimeca) (nai-gch or, if Guachí is added as sai-gch, perhaps nai-gcl to prevent the two similarly-named lects from being mixed up by only typoing the initial n vs s), apparently sparsely attested.
- Concho (nai-cnc). The Handbook of North American Indians, volume 10, says "three words of Concho [...] were recorded in 1581 [and] look like they may be [...] Uto-Aztecan".
- Jumano (Humano, Jumana, Xumana, Chouman, Zumana, Zuma, Suma, and Yuma) (nai-jmn). The Handbook says "It has been established that the Jumano and Suma spoke the same language. Three words have been recorded" of it.
and from South America:
- Peba / Peva (sai-peb), said by Erben to be more properly called Nijamvo, Nixamvo. Spoken in "the department of Loreto" in Peru. Attested in wordlists by Erben and Castelnau, which Loukotka provides, and which disagree with each other substantially: munyo (Erben) / money (Castelnau) "canoe, small boat"; nero (E) / yuna (C) "demon"; nebi (E) / nemey (C) "jaguar"; teki (E) / tomen-lay (C) "one", manaxo (E) / nomoira (C) "two"; etc. I would even consider that one might not be the same language as the other... what's with these languages that survive in disparate wordlists? lol.
- possibly Saynáwa: fr.Wikt grants a code to this variety of Yaminawá language, described here (see also [1]).
- -sche (discuss) 04:04, 16 August 2016 (UTC)
- Support all except possibly Akokisa. I think it's a dialect of Atakapa, and that the wordlist is very likely not being linked correctly. That said, it's so few words, that there's no real reason not to accept it as a separate language, just to be conservative about it. —Μετάknowledgediscuss/deeds 04:08, 16 August 2016 (UTC)
- Good point about Akokisa. (I am reminded that you had mentioned its dialectness earlier; sorry I forgot!) The wordlist, labelled only with a tribal name per WP, is possibly plain Atakapa, but Yegsa is supposedly recorded as specifically Akokisa; OTOH that doesn't rule out that Akokisa is a dialect. Indeed, M. Mithun's Languages of Native North America treats as dialects Akokisa, Eastern ("the most divergent, [...] known from a list of 287 entries") and Western ("the best documented. Gatschet recorded around 2000 words and sentences, as well as texts [...] Swanton recorded a few Western forms", all published in 1932 in a dictionary). I suppose the benefit to treating it as a dialect would be that we could context-label Yegsa and Quiselpoo as {{lb|aqp|Akokisa}} and then Béranger's forms as {{lb|aqp|possibly|Akokisa}} without needing to agonize over which header to put them under. - -sche (discuss) 15:31, 16 August 2016 (UTC)
Nkore-Kiga
As can be seen at w:Nkore-Kiga language, Kiga [cgg] should definitely be merged into Nyankore [nyn]. Unfortunately, this might require a rename to something that is both hyphenated and considerably less common than just plain "Nyankore" (though that is, strictly speaking, merely the name of the main dialect). —Μετάknowledgediscuss/deeds 05:21, 18 September 2016 (UTC)
- I'm not sure. WP suggests the merger was politically motivated, but many reference works do follow it. Ethnologue says there is "Lexical similarity [of] 78%–96% between Nyankore, Nyoro [nyo], and their dialects; 84%–94% with Chiga [cgg], [...and] 81% with Zinza [zin]" (Kiga, meanwhile, is said to be "77% [similar] with Nyoro [nyo]"), as if to suggest nyn is about as similar to cgg as to nyo, and indeed many early references treat Nkore-Nyoro like one language, where later references instead prefer to group Nkore with Kiga. Ethnologue mentions that some authorities merge all three into a "Standardized form of the western varieties (Nyankore-Chiga and Nyoro-Tooro) [...] called Runyakitara [...] taught at the University and used in internet browsing, but [it] is a hybrid language." (For comparison, Ethnologue says English has 60% lexical similarity to German.) - -sche (discuss) 00:16, 2 June 2017 (UTC)
Itneg lects
See w:Itneg language. All the dialects have different codes, but we really should give them a single code and unify them. I came across this problem with the entry balaua, which means "spirit house" (but I can't tell in which specific dialect). It's also known as Tinggian (with various different spellings), and this may be a better name for it than Itneg. —Μετάknowledgediscuss/deeds 02:09, 23 September 2016 (UTC)
What distinguishes these two? —suzukaze (t・c) 03:31, 9 October 2016 (UTC)
- We have Category:Latin script characters and other subcategories of Category:Characters by script.
- To follow suit, maybe we should merge Category:Chinese hanzi and Category:Chinese Han characters into Category:Chinese script characters Category:Han script characters. --Daniel Carrero (talk) 03:35, 9 October 2016 (UTC)
- Oppose merging both into the already existing Category:Han script characters as it is for all hanzi, while the two being discussed here are for Chinese hanzi only. —suzukaze (t・c) 04:00, 9 October 2016 (UTC)
- All right. Apparently, these two categories are for single-character entries. We have Category:Japanese terms written with one Han script character, so these could be merged into Category:Chinese terms written with one Han script character, and we could populate Category:Chinese character counts like Category:Japanese character counts. --Daniel Carrero (talk) 04:04, 9 October 2016 (UTC)
- Oppose this one too since Japanese entries have Category:Japanese Han characters, for any kanji used in the Japanese language (regardless of whether it can be used independently as a word or not), while Category:Japanese terms written with one Han script character is for words that feature only one kanji, such as 好き. —suzukaze (t・c) 04:17, 9 October 2016 (UTC)
If there is no meaningful difference between these, I propose keeping Category:Chinese Han characters as it is managed by {{poscatboiler}} and merging Category:Chinese hanzi into it. —suzukaze (t・c) 04:17, 9 October 2016 (UTC)
- @Wyang, Atitarev, is there a difference between Category:Chinese hanzi and Category:Chinese Han characters, or can Category:Chinese hanzi be merged into Category:Chinese Han characters as suzukaze proposes? - -sche (discuss) 00:27, 28 March 2017 (UTC)
- They can be merged, IMO. --Anatoli T. (обсудить/вклад) 00:52, 28 March 2017 (UTC)
- (reviving this discussion after almost three years) Merge per Suzukaze-c's proposal above. — justin(r)leung { (t...) | c=› } 03:30, 19 January 2020 (UTC)
There seems to be no notable difference between the two categories so they should be merged I guess. Ffffrr (talk) 21:40, 10 December 2021 (UTC)
Update? For reference, it looks like the "Chinese hanzi" category is populated by this code in Module:zh-pron. 70.172.194.25 00:21, 27 May 2022 (UTC)
Paraguayan Guaraní [gug]
I just noticed that we have this for some reason. Guaraní is a dialect continuum that is quite extensive, both in inter-dialect differences and in geography, and certain varieties have been heavily influenced by Spanish or Portuguese. That said, our Guaraní [gn] content is, as far as I can tell, pretty much entirely on Paraguayan Guaraní, which for some reason has a different code, [gug]. My attention was brought to this by User:Guillermo2149 changing L2 headers (I have not reverted his edits, but they do cause header-code mismatch). We could try splitting up the Guaraní dialects, but it would be hard to choose cutoffs and would definitely confuse potential editors, of which we have had more since Duolingo released a Guaraní course. I think the best choice is to merge [gug] into [gn] and mark words extensively for which dialects or countries they are used in. @-sche —Μετάknowledgediscuss/deeds 01:29, 1 November 2016 (UTC)
- Support [gn] and [grn] are the codes of the macrolanguage, [gug] is the code for the specific dialect spoken in Paraguay, also, until now, I haven't found any [gn] lemma to be out of [gug]. --Guillermo2149 (talk) 01:52, 1 November 2016 (UTC)
- Support. — Ungoliant (falai) 11:00, 1 November 2016 (UTC)
- Support merging gn and gug. - -sche (discuss) 14:33, 1 November 2016 (UTC)
- Support —Aɴɢʀ (talk) 15:02, 1 November 2016 (UTC)
- @Guillermo2149, Ungoliant MMDCCLXIV, -sche, Angr: I see now that there are three more Guaraní dialect codes that we have: Mbyá Guaraní [gun], Chiripá [nhd], and Western Bolivian Guaraní [gnw]. I presume that we should merge these into [gn] as well, but the case is arguably less clear given that in our current state, all our [gn] lemmas are really [gug]. What do you all think? —Μετάknowledgediscuss/deeds 22:51, 14 November 2016 (UTC)
- I stick by my motto, "When in doubt, merge". —Aɴɢʀ (talk) 09:53, 15 November 2016 (UTC)
- I think we should actually merge [gn] into [gug] and not vice versa. By the way, [gn] is the only one that should be merged: [gun] has some similar and some identical words but the language is very different, and [nhd] is very close to [gug] but slightly different, and is always confused with [gug]. --Guillermo2149 (talk) 00:37, 7 December 2016 (UTC)
- Don't forget there's also [gui] and apparently also [tpj]. - -sche (discuss) 04:28, 16 May 2017 (UTC)
2017
Merger into Scandoromani
I propose that the Para-Romani lects Traveller Norwegian, Traveller Danish and Tavringer Swedish (rmg, rmd and rmu) be merged into Scandoromani. TN, TD and TS are almost identical, mostly differing in spelling (e.g. tjuro (Sweden) vs. kjuro (Norway) meaning 'knife', gräj vs. grei 'horse' etc.). WP treats them as variants of Scandoromani. My langcode proposal could be rom-sca, or maybe we could just use rmg, which already has a category. --176.23.1.95 20:19, 25 January 2017 (UTC)
- I'm supporting it. Traveller Norwegian is sometimes referred to as Tavring, and, to be honest, I've never heard anybody use the term Traveller Norwegian for a language. People rather call it Taterspråk or Fantemål, even when books state it is a derogatory term. The other problem is that we've in fact got 2 different Norwegian Traveller languages (the Romani-based and the Månsing-based). So it looks like a total mess right now. Tollef Salemann (talk) 07:55, 2 April 2023 (UTC)
- I don't think this makes sense if the orthographies are consistently different, which seems to be the case. Otherwise, we could use the same logic to merge quite a few of the Slavic languages, which obviously doesn't make sense. Theknightwho (talk) 13:43, 2 April 2023 (UTC)
- Ok, but Traveller Norwegian is not quite the right term, cuz the Romani-based TN has two or more branches, which are quite different from each other, while the main one is almost the same as the Swedish and often had the same name(s). Meanwhile, there is also a Germanic TN version, unrelated to the Romani-ish TN variations. I mean, we need at least two more L2s in this case, even if we're gonna merge TN and Swedish Tavring.
- PS there is also Swedish stuff like Knoparmoj and Loffarspråk and more, and they still have remnants in some rare Swedish/Norwegian sociolects. Maybe they also need their L2? Or can we treat them as sociolects? Tollef Salemann (talk) 13:59, 2 April 2023 (UTC)
Chinese Pidgin English (cpi)
This is not a separate language at all, it's just English with different grammar and some loanwords, but other than that it's completely intelligible with standard English. As such, it should be moved to Category:Chinese English. -- Pedrianaplant (talk) 15:19, 8 February 2017 (UTC)
- That's not at all the impression I get from Chinese Pidgin English. It seems to be a distinct language to me, as much as any other English-based pidgin. —Aɴɢʀ (talk) 16:45, 8 February 2017 (UTC)
- We did delete Hawaiian Pidgin English in the past though (see Template talk:hwc). I don't see how this case is any different. -- Pedrianaplant (talk)
- I know we did, but I didn't participate in that discussion (only 3 people did), and I disagree with it too, probably even more strongly than I disagree with merging cpi. —Aɴɢʀ (talk) 17:02, 8 February 2017 (UTC)
- Basically, this is a terminological problem. There may have been a true pidgin in each of these cases, but it has not been recorded. What is called a pidgin in many descriptive works is instead a dialect of English that is very easy to understand, nothing like the real English-based pidgins and creoles that I have studied. If you look at the actual quotations used to support lemmas in Chinese Pidgin English, you find that it is Chinese English. Support merge, but leave [cpi] as an etymology-only code. —Μετάknowledgediscuss/deeds 23:16, 8 February 2017 (UTC)
- At least some texts seem very distinct, to the point of unintelligibility; consider "Joss pidgin man chop chop begin" (Whedon's translator begins chopping things? or "god's businessman begins right away"?). On the other hand, other sentences given by Wikipedia are quite intelligible...and possibly not attestable under the stricter CFI to which English is subject. I'm not sure what to do. (Our short previous discussion also didn't reach a firm resolution.) - -sche (discuss) 17:46, 8 March 2017 (UTC)
- I mean, I use joss and chop chop in English normally (having grown up in a fairly Chinese environment likely has something to do with that)... and I think that was chosen as an especially extreme example. —Μετάknowledgediscuss/deeds 03:32, 25 March 2017 (UTC)
More unattested languages
The following languages have ISO codes, but those codes should be removed, as there is no linguistic material that can be added to Wiktionary. This list is taken from Wikipedia's list of unattested languages, but I have excluded languages which are not definitively extinct (and thus which may have material become available). If there was any reliable source I could find corroborating the WP article's claim of lack of attestation, it is given after the language. —Μετάknowledgediscuss/deeds 04:15, 4 April 2017 (UTC)
- Aguano language [aga]
- Unclear if it even existed per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).
Barbacoas language [bpb] (the Wikipedia article has a discussion of the conflation of this unattested language with Pasto, which needs a code; for clarity, I think this [bpb] should be retired and an exceptional code made explicitly for Pasto)
- Retired, following the ISO, see Wiktionary:Beer parlour/2020/October#2019-2020_ISO_code_changes. Content, if needed for migration to a Pasto code, was m["bpb"] = { "Barbacoas", "Q2669202", "sai-bar", otherNames = {"Pasto"}, scripts = Latn, } - -sche (discuss) 06:23, 14 October 2020 (UTC)
- Dek language [dek]
- Giyug language [giy]
- AIATSIS has the following to say: "According to Ian Green (2007 p.c.), this language probably died before the 1920's and neighbouring groups in the Daly claim it was the language of Peron Island which was linguistically and perhaps culturally distinctive from the nearby mainland societies. Black & Walsh (1989) say that this may or may not have been a dialect of Wadiginy N31." —Μετάknowledge
- The 1992 International Encyclopedia of Linguistics, v. 1, p. 337, says "Giyug: 2 speakers reported in 1981, in the Peron Islands in Anson Bay, southwest of Darwin." The 2003 edition repeats the claim that "2 speakers remain". Wikipedia says it's extinct and unattested, but Glottolog, although having no resources on it, suggests it's not extinct. Might be best to leave it alone for now. - -sche (discuss) 01:13, 6 August 2020 (UTC)
Mawa language (Nigeria) [wma] (We call this "Mawa"; if removed, [mcw] Mahwa (Mawa language (Chad)) can be renamed to the evidently more common spelling "Mawa".)
- Removed, and mcw renamed. Glottolog had only one reference to support the existence of Mawa, Temple (1922), which does not even include a section under that header. There may be confusion with the section on the "Marawa", but that does not even mention what language those people speak. (Temple also knows very little about linguistics; while skimming through, I found that Margi (a Chadic language) was said to be similar to the languages of South Africa.) —Μετάknowledgediscuss/deeds 01:39, 6 August 2020 (UTC)
- Nagarchal language [nbg]
- Appendix I in The Indo-Aryan Languages records this language as being a subdialect of Dhundari [dhd] and the 1901 Indian Census concurs; this is at odds with its description as an unattested Dravidian language, but the geographical specifications seem to match up.
- Ngurmbur language [nrx]
- AIATSIS says: "Harvey (PMS 5822) treats Ngomburr as a dialect of Umbukarla N43, but in Harvey (ASEDA 802), it is listed as a separate language." Nicholas Evans confirms in The Non-Pama-Nyungan Languages of Northern Australia that it is unattested.
- Tremembé language [tme]
- Truká language [tka]
- Wakoná language [waf]
- Wasu language [was]
- Unclassified due to its absence of data per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).
Yenish
The Yenish "language" (which we call Yeniche) was given the ISO code yec, despite being clearly not a separate language from German. Instead, it is a jargon which Wikipedia compares to Cockney (which has never had a code) and Polari (which had a code that we deleted in a mostly off-topic discussion). The case of Gayle, which is similar, is still under deliberation at RFM as of now. Most tellingly, German Wiktionary considers this to be German, and once we delete the code, we should make a dialect label for it and add the contents of de:Kategorie:Jenisch to English Wiktionary. @-sche —Μετάknowledgediscuss/deeds 00:49, 7 April 2017 (UTC)
- I don't see how that's most tellingly; I don't know about the German Wiktionary, but major language works frequently treat things as dialects of their language that outsiders consider separate languages.--Prosfilaes (talk) 03:01, 10 April 2017 (UTC)
- The (linked) English Wikipedia article even says "It is a jargon rather than an actual language; meaning, it consists of a significant number of unique specialized words, but does not have its own grammar or its own basic vocabulary." Despite the citation needed that follows, that sentence is about accurate, as such this should be deleted. -- Pedrianaplant (talk) 10:53, 30 April 2017 (UTC)
- (If kept, it should be renamed.)
There are those who argue that Yenish should have recognition (which it indeed gets, in Switzerland) as a separate language. And it can be quite divergent from Standard German, with forms that are as different as those of some of the regiolects we consider distinct. Many examples from Alemannic or Bavarian-speaking areas are better considered Alemannic or Bavarian than Standard German. But then, that's a sign that it is, as some put it, a cant overlaid onto the local grammar, rather than a language per se. Ehh... - -sche (discuss) 03:22, 9 July 2017 (UTC)
What's the difference? --Barytonesis (talk) 20:19, 17 April 2017 (UTC)
- Apparently (Google n-grams) the term could be used with or without an object. The definition should be somewhat different. An example of use without a direct object is "to rake over the coals of failure". I don't know how to word this in a substitutable way. It seems to mean something like "to belabor (something negative (result, process), obvious from context) as if in reprimand". DCDuring (talk) 15:14, 3 January 2018 (UTC)
Move entries in CAT:Khitan lemmas to a Khitan script
The Khitan wrote using a Siniform script. Are these Chinese transcriptions of Khitan? —suzukaze (t・c) 02:22, 13 August 2016 (UTC)
- I'm a little confused about what's going on here. Are you RFV-ing every entry in this category? Or are you just looking for evidence that Khitan was written using this script? —Mr. Granger (talk • contribs) 12:45, 13 August 2016 (UTC)
- The Khitans had their own script. These entries use the Chinese script. —suzukaze (t・c) 17:30, 13 September 2016 (UTC)
- I understand that, but I don't understand what your goal is with this discussion. If you want to RFV every entry in the category, then I'd like to add {{rfv}} tags to alert anyone watching the entries. If you want to discuss what writing systems Khitan used, maybe with the goal of moving all of these entries to different titles, then I'm not sure RFV is the right place for the discussion. (Likewise with the Buyeo section below.) —Mr. Granger (talk • contribs) 17:55, 13 September 2016 (UTC)
- Moved to RFM. - -sche (discuss) 21:04, 30 April 2017 (UTC)
Some spurious languages to merge or remove, 2
- remove Adabe [adb]
Geoffrey Hull, director of research for the Instituto Nacional de Linguística in East Timor, notes (in a 2004 Tetum Reference Grammar, page 228) that "the alleged Atauran Papuan language called 'Adabe' is a case of the mistaken identity of Raklungu," a dialect (along with Rahesuk and Resuk) of Wetarese. He notes (in The Languages of East Timor, Some Basic Facts) that only Wetarese is spoken on the island, and Studies in Languages and Cultures of East Timor likewise says "The three Atauran dialects—with the northernmost of which the dialect of nearby Lirar is mutually intelligible—are unquestionably Wetarese, and not dialects of Galoli, as Fox and Wurm suggest for two of them (n. 32). The same authors refer (ibidem) to a supposedly Papuan language of Atauro, the existence of which appears to be entirely illusory." (The error appears to have originated not with Fox and Wurm but with Antonio de Almeida in 1966.) - -sche (discuss) 01:45, 31 May 2017 (UTC)
- We could repurpose the code into one for those three Atauran varieties of Malayo-Polynesian Wetarese, Rahesuk, Resuk, and Raklu Un / Raklungu (the last of which Ethnologue does list as an alt name of adb, despite their erroneous family assignment of it), perhaps under the name "Atauran Wetarese" for clarity. - -sche (discuss) 01:52, 31 May 2017 (UTC)
- remove Agaria [agi]
Glottolog makes the case that this is spurious. - -sche (discuss) 07:57, 31 May 2017 (UTC)
Arma
Arma (aoh) is also said to be "a possible but unattested extinct language"; I am trying to see if that means it is entirely unattested, or if there are personal/ethnic/place names, etc. - -sche (discuss) 09:45, 3 June 2017 (UTC)
- Removed, see Wiktionary:Beer_parlour/2020/October#2019-2020_ISO_code_changes. - -sche (discuss) 06:18, 14 October 2020 (UTC)
The VU Amsterdam report linked to here seems to indicate that one lect has been given multiple codes, and that "Jair" at least is spurious. Further research wouldn't hurt. —Μετάknowledgediscuss/deeds 00:24, 3 October 2019 (UTC)
Categories in Category:Letters
Can we come up with more descriptive names than Category:Aa please? —CodeCat 22:37, 14 May 2017 (UTC)
- IMO they are fine as they are. We could use "Letter Aa", etc, I guess. - excarnateSojourner (talk | contrib) 04:51, 29 April 2022 (UTC)
Apparently this is not a set category, despite its name seeming like one. User:Smuconlaw apparently intended it to be about things related to limbs. I think it should be renamed to more clearly reflect that. —CodeCat 17:35, 17 May 2017 (UTC)
- What is a "set category"? — SMUconlaw (talk) 17:36, 17 May 2017 (UTC)
- A category that contains items belonging to a particular set. See Category:List of sets. A characteristic of set categories is that they have plural names. —CodeCat 17:37, 17 May 2017 (UTC)
- Hmmm, I'm not sure what it's supposed to be. I was just following the example of other categories under "Category:Body" such as "Category:Buttocks", "Category:Face", "Category:Muscles", "Category:Organ systems", "Category:Skeleton", "Category:Skin", and "Category:Teeth". — SMUconlaw (talk) 17:44, 17 May 2017 (UTC)
- I'm currently working with User:-sche on a more permanent solution to issues like this. —CodeCat 19:00, 17 May 2017 (UTC)
- OK, thanks. — SMUconlaw (talk) 22:10, 17 May 2017 (UTC)
- Has this been resolved? - excarnateSojourner (talk | contrib) 23:17, 29 March 2022 (UTC)
- @Rua (CodeCat), @-sche Do you have any idea if this was ever resolved? — excarnateSojourner (talk · contrib) 04:03, 6 February 2023 (UTC)
- Unfortunately, I don't think the issue of how to distinguish set and topic categories has been resolved yet (the basic idea was that they need some more distinct naming convention than just "set categories are plural": something like "Category:en:Set:Foobar" vs "Category:en:Topic:Foobar" or something). However, looking at the contents of this category and the "Muscles" category, it looks like the issue of what should be in a category named "Limbs" (or "Muscles") was resolved by changing its contents to being a "set" category...? - -sche (discuss) 18:04, 6 February 2023 (UTC)
This should be handled with {{liushu}}, since jiajie is one of the six categories (liushu). — justin(r)leung { (t...) | c=› } 18:36, 17 May 2017 (UTC)
- Can both of these templates be renamed to include a language code? —CodeCat 19:01, 17 May 2017 (UTC)
{{jiajie}} should be merged with {{liushu}}, which could be renamed as {{Han liushu}}, following {{Han compound}} and {{Han etym}}. It might not be a good idea to use a particular language code because these templates are intended for use in multiple languages now. They used to be used under Translingual, but we have decided to move the glyph origin to their respective languages. — justin(r)leung { (t...) | c=› } 20:22, 17 May 2017 (UTC)
- You can use script codes as prefixes too. We have Template:Latn-def, Module:Cans-translit and such. —CodeCat 20:26, 17 May 2017 (UTC)
Entries in CAT:Taos lemmas with curly apostrophes
Many Taos entries use curly apostrophes to represent glottal stops. They should either use the easy-to-type straight apostrophe ' that many other languages use, or the apostrophe letter ʼ that Navajo and a few other languages use. - -sche (discuss) 21:36, 20 May 2017 (UTC)
- I agree. The headword template interprets the curly apostrophe as a punctuation mark (because it is), and automatically links words such as adùbi’íne as adùbi’íne. (Personally, I think the apostrophe letter looks better, but there may be other considerations.) — Eru·tuon 21:45, 20 May 2017 (UTC)
- Oh, and I just learned of the Unicode character ꞌ for the saltillo. But no entries use it, and I am averse to introducing yet another visually-almost-identical symbol to represent the glottal stop, next to the three (counting the curly apostrophe) mentioned above that are already in use, plus the ˀ that some entries use. - -sche (discuss) 02:23, 21 May 2017 (UTC)
- I'm in favor of standardizing on U+02BC MODIFIER LETTER APOSTROPHE for any language that uses an apostrophe-looking thing as a letter. —Aɴɢʀ (talk) 13:52, 21 May 2017 (UTC)
- Probably reasonable for glottalizationy apostrophes. At least Skolt Sami uses ʹ U+02B9 MODIFIER LETTER PRIME for suprasegmental palatalization though, which should likely be kept separate. --Tropylium (talk) 16:55, 21 May 2017 (UTC)
- I've moved quite a few of these; about 140 remain to be moved. - -sche (discuss) 04:49, 24 July 2018 (UTC)
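As a rough illustration of the point made in this thread, the sketch below checks how Unicode classifies the two characters and shows what a title normalization could look like. It is only a sketch: the function name `normalize_glottal_stop` is hypothetical, it is not the code any bot actually used for these moves, and it assumes every curly apostrophe in a Taos title stands for a glottal stop.

```python
import unicodedata

CURLY = "\u2019"   # RIGHT SINGLE QUOTATION MARK (the curly apostrophe)
LETTER = "\u02BC"  # MODIFIER LETTER APOSTROPHE


def normalize_glottal_stop(title: str, target: str = LETTER) -> str:
    """Replace every curly apostrophe in `title` with `target`.

    Hypothetical helper: assumes each curly apostrophe in the title
    represents a glottal stop rather than real punctuation.
    """
    return title.replace(CURLY, target)


# The curly apostrophe is punctuation; the modifier letter is a letter.
print(unicodedata.category(CURLY))   # general category starting with "P"
print(unicodedata.category(LETTER))  # general category starting with "L"

old_title = "ad\u00f9bi\u2019\u00edne"  # adùbi’íne, spelled with U+2019
print(normalize_glottal_stop(old_title))
```

Passing `target="'"` would instead produce the straight-apostrophe spelling that was the other option raised above.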
The Category:E language surely has numbers, which would require this category to be used. Other suggestions for the food additive category name would be welcome. Maybe "List of E numbers"? DTLHS (talk) 16:31, 27 May 2017 (UTC)
- If we adopt a systematic naming scheme for topic and set categories as CodeCat and I have been discussing, then I guess it could be "Category:mul:set:E numbers" or "Category:Translingual:set:E numbers". However, independent of whether or not such prefixes ("Translingual:set:") come into use, a more intelligible name like the one you propose, replacing "E" with "European food additive", would be good. Other food-additive numbering schemes in use in Europe could also go in the same category. - -sche (discuss) 18:48, 27 May 2017 (UTC)
- Support. Very good find. —Μετάknowledgediscuss/deeds 03:50, 28 May 2017 (UTC)
- Disagree. They are not called European food additive numbers, they are E numbers. SemperBlotto (talk) 18:05, 28 May 2017 (UTC)
- @SemperBlotto: So what do you want to do about numbers in the E language? —Μετάknowledgediscuss/deeds 18:09, 28 May 2017 (UTC)
- I think you may be implying that the category should be something like mul:E numbers just in case any of our users think E is a language. I wouldn't object to that. SemperBlotto (talk) 18:12, 28 May 2017 (UTC)
- To be clear: E is a language, spoken in China. CAT:E language. (And like CAT:English numbers, it will have a "numbers" category someday when our coverage of it improves.) Perhaps a move should be postponed for a little while, though, while we see if we can come up with a systematic naming scheme for topic and set categories (see my talk page). - -sche (discuss) 18:33, 28 May 2017 (UTC)
- Since there's been no progress towards systematically changing how topic and set categories are named, this one does need to be renamed, because it does conflict with the expected 'numbers' category of the existing E language. Does anyone else want to weigh in on whether the name should be "Category:European food additive numbers" or "Category:mul:E numbers"? - -sche (discuss) 22:34, 18 November 2018 (UTC)
Should perhaps be moved to long story? W3ird N3rd (talk) 06:42, 9 August 2017 (UTC)
- In contrast to long story short, neither seems entryworthy to me. They are quite transparent. Checking “long story” in OneLook Dictionary Search, one notes that none of those references find it inclusionworthy, whereas “long story short” in OneLook Dictionary Search shows some coverage. DCDuring (talk) 11:01, 9 August 2017 (UTC)
sense: Noun: "(aviation) A large multi-engined aircraft. The term heavy normally follows the call-sign when used by air traffic controllers."
In the aviation usage AA21 heavy ("American Airlines flight 21 heavy") the head of the NP is AA21, heavy being a qualifying adjective indicating a "wide-bodied", ergo "heavy", aircraft.
Move to noun with any adjustments required. DCDuring (talk) 13:19, 24 August 2017 (UTC)
- @DCDuring You're proposing we move from noun to noun? Did you mean from noun to adjective? - excarnateSojourner (talk | contrib) 05:57, 18 October 2022 (UTC)
- I don't know what I meant 5 years ago, but that's what I mean now: move it to adjective. Though it would be good to confirm that there is not sufficient attestation of heavies and/or [DET] heavy. DCDuring (talk) 12:48, 18 October 2022 (UTC)
- I can find the plural in reference to large (sometimes restricted to widebody) commercial aircraft and heavy bombers (sometimes 2-engine, always at least 4-). Also "heavy" motor vehicles (eg. large trucks, esp semis). I'm not entirely sure what heavy refers to when used by the pilot of a Cessna. DCDuring (talk) 12:57, 18 October 2022 (UTC)
Renaming mey
We currently have it as "Hassaniya" (which we used to spell as Hassānīya; those macra were removed along the way, presumably by Liliana, although I don't see any discussion; MG deleted the old category once it was empty). To match the other colloquial Arabic languages, it should be "Hassaniya Arabic". (Note: if Arabic is merged, this will become moot.) —Μετάknowledgediscuss/deeds 07:07, 16 September 2017 (UTC)
- This seems a bit different from most of the other forms of Arabic which are "[Adjective referring to a place] Arabic", where just calling the lect "Libyan" (etc) would be more awkward. Still, I have no objection to a rename, though I don't have time to rename all the categories right now. I also notice that, while Hassaniya is probably still the most common spelling overall, it seems like Hassaniyya started to become more common around 2003. - -sche (discuss) 04:03, 29 December 2017 (UTC)
Categories about country subdivisions to include the country name
This will include at least the following:
- Category:Abkhazia → Category:Abkhazia, Georgia
- Category:Alabama → Category:Alabama, USA
- Category:Alberta → Category:Alberta, Canada
- Category:Andhra Pradesh → Category:Andhra Pradesh, India
- Category:Aomori Prefecture → Category:Aomori Prefecture, Japan
- Category:Arizona → Category:Arizona, USA
- Category:Arkansas → Category:Arkansas, USA
- Category:Barisal Division → Category:Barisal Division, Bangladesh
Categories for certain things that are located within these subdivisions will also be renamed, e.g. Category:Cities in Aomori (Prefecture) → Category:Cities in Aomori Prefecture, Japan. —Rua (mew) 13:07, 16 October 2017 (UTC)
- Support. I oppose the existence of categories with language code like "en:" in the first place, but what is proposed here seems to be an improvement over the status quo. --Daniel Carrero (talk) 20:27, 20 October 2017 (UTC)
- I would have opposed a lot of these, but I was too late on the scene. DonnanZ (talk) 15:51, 12 November 2017 (UTC)
- Support all except Category:Abkhazia, Georgia (for which I abstain as I do not properly understand the political situation explained by User:Palaestrator verborum). - excarnateSojourner (talk|contrib) 03:34, 29 October 2021 (UTC)
- US states were moved by MewBot (talk • contribs) in 2017. - excarnateSojourner (talk | contrib) 22:00, 27 April 2022 (UTC)
The rename has been put on hold until there is a clear consensus either way. Please vote! —Rua (mew) 15:11, 14 November 2017 (UTC)
- @Rua It looks sane to me if politics are let out. But why is Abkhazia in Georgia though it is an independent state, statehood only depending on factual prerequisites and not on diplomatic recognition which has nothing to do with it? Where does the Crimea belong to? (article Sevastopol is only in Category:en:Ukraine because it has not really been edited since 2014.) I can think of two solutions: First possibility: We focus on geographical and cultural constants. Second possibility: We focus on the actual political power. I disprefer the second slightly because it can mean much work in cases of war (i.e. how much the Islamic state holds etc., or say the current factions in Libya). But in neither case Abkhazia is in Georgia. But the first possibility does not even answer what the Crimea belongs to, i.e. I am not sure if it is historically correct to speak of the Crimea as Ukraine. And geographical terms are often fuzzy and subject to editorial decisions. All seems so easy if you start your concepts from the United States, which do not even have a name for the region they are situated in. And even for the USA your idea is questionable because the constituent states of the United States are states in their own right (Teilstaat, Gliedstaat in German), as is also the case for the Federal Republic of Germany and the Russian Federation partially (according to the Russian constitution only those of the 85 subjects are states which are called Republic, not the Oblasti etc.). Is Tatarstan Russia? Not even Russians can agree with such a sentence, as in Russia one sharply distinguishes русские and россияне, Россия and Российская федерация. Technically Ceuta and Melilla are in Morocco because Spain is not in Africa. Also, Kosovo je Srbija, and it would become just a coincidence if a place important in Serbian history is listed as X, Kosovo or X, Serbia. Palaestrator verborum (loquier) 16:06, 14 November 2017 (UTC)
@Rua: Most of these categories like Category:en:Special wards in Tokyo are back on the {{delete}} list. I think these should be removed again for the time being. DonnanZ (talk) 18:02, 14 November 2017 (UTC)
- Starting with the above, I don't know how the Tokyo ward system works, but I imagine it's a subdivision of the city. In England wards are subdivisions in cities, boroughs, local government districts, and possibly counties. "Wards in" is the natural usage.
- Municipalities similarly. For example in Norway there are hundreds of municipalities (kommuner) which are subdivisions within counties (fylker). Some of these can be large, especially in the north, but so are the counties in the north. To me "municipalities in" is the natural wording.
- States and provinces in the USA and Canada: In nearly all cases it is unnecessary to add the country name as the names are unambiguous. The only exception I can think of is Georgia, USA. This could also apply to prefectures in Japan and states in India (is there a Punjab in Pakistan?). DonnanZ (talk) 18:52, 14 November 2017 (UTC)
- Yes, there is, like there is in India. Maybe categorisations should be abundant? Cities can belong to Punjab as well as to Punjab, India, and the Crimea is part of administration of both the Russian Federation and the Republic Ukraine at least for some purposes in the Republic Ukraine. We can make the least thing wrong by adding Sheikh Zuweid (presuming it exists) as well to the Islamic State as to the Arab Republic of Egypt, because we do not want to judge morally and formally states and terror organizations are indistinguishable. On the other hand of course we need sufficient data to relate towns to administrative divisions and ISIS presumably does not publish organigrams. Palaestrator verborum (loquier) 19:44, 14 November 2017 (UTC)
This is a newly created (September 2017) topical category. It should be renamed to something that does not imply that it contains expressions that are directive. It contains terms that relate to direction or, more frequently, terms that can be confused with direction. I recognize that Direction would not be a suitable category name. I don't have any suggestion. It may be that the category is ill-conceived. DCDuring (talk)
- I see nothing wrong with it. If it contained directive expressions, it would be called Category:English directives or similar. We have voted in the past to keep topical category naming distinct from other categories, so the naming scheme is considered indicative of its use/meaning/function. —Rua (mew) 20:37, 22 December 2017 (UTC)
- I'm not surprised that you see nothing wrong, what with the cat scheme being otherwise so perfect.
- I favor keeping topical categories as far away as possible from our other entry categories.
- But, unlike other categories that have names that are plural in form, Category:en:Directives contains neither examples nor names of the referents of its category name, ie of directives. It contains a dog's breakfast of terms that the categorizer, User:51.9.55.214, thought to be connected to some sense of the noun(?) directive. One mistake was to pick as name for a concept/category a de-adjectival noun. Probably the name was made plural to avoid confusion with the adjective.
- If you can make sense of the rationale for the membership in the category of ban, bare minimum, beckoning, behest, besaiel, beseeching, bidding, bill, blacklist, blackmail, bloodlust, blueprint, booty call, boundary, boycott, breve, bribe, and bytecode, you, Gunga Din, are a better man than I. I am at a loss to understand the common element among these terms. Is each supposed to be a type of directive? If no one can come up with a better name for the category, or prune membership rationally, or split it into multiple comprehensible categories, or RfDO it, I will RfDO it. DCDuring (talk) 02:43, 23 December 2017 (UTC)
- Bytecode in the sense of compiler directive! Really pushing it a bit. Equinox ◑ 02:49, 23 December 2017 (UTC)
2018 — January
Is {{list helper 2}} an improved version of {{list helper}}? Can all instances of {{list helper}} be converted to {{list helper 2}}? --Per utramque cavernam (talk) 22:33, 3 January 2018 (UTC)
2018 — February
...keeping the redirect. Or is there a sensible distinction between the two that we want to maintain? - -sche (discuss) 18:43, 19 February 2018 (UTC)
- I was hesitant to recreate CAT:English misconstructions, but labelling evolutionary stable strategy as an "eggcorn" seems like a stretch. --Per utramque cavernam (talk) 18:47, 19 February 2018 (UTC)
- Oh wait, that's not what you're suggesting. --Per utramque cavernam (talk) 18:47, 19 February 2018 (UTC)
- I changed the eggcorn template to categorize into the misconstruction category, emptying Category:English eggcorns and Category:Vietnamese eggcorns, although that should be undone if there is some distinction I am missing that it would be good and feasible to maintain. - -sche (discuss) 18:49, 19 February 2018 (UTC)
- Well, I feel that there's a semantic aspect to eggcorns that isn't really present in evolutionary stable strategy, trompe-d'œil or analysises. --Per utramque cavernam (talk) 18:53, 19 February 2018 (UTC)
- True, but that distinction seems a bit fuzzy; e.g., dominate is labelled an eggcorn (because it's homographic to a valid word?) while unfortunant is labelled a misconstruction. And evolutionary in evolutionary stable strategy is also a word. (But I'm not opposed to making a distinction; I'm just pointing out the issues with it, devil's-advocate-style.) - -sche (discuss) 19:25, 19 February 2018 (UTC)
- @-sche: I agree that the distinction is fuzzy (in fact, I'd even say that the distinction between "misconstructed", "nonstandard" and "proscribed" is fuzzy: compare our treatment of developmentation, abortation and pronounciate). Still, I think it's not entirely without merit, although I would be hard pressed to give you a specific set of criteria.
- I wouldn't call dominate an eggcorn, but without any quotation it's hard to judge anyway. In fact, I'm going to RFV it. Not necessary: it's used indeed.
- Another thing: I don't like the way idiosyncratic is used in our def of eggcorn. It seems to be used as a synonym of "odd, strange, peculiar, eccentric", but it shouldn't be. --Per utramque cavernam (talk) 20:02, 19 February 2018 (UTC)
- orange is a result of misconstruction of naranga, isn't it? But orange is certainly not nonstandard. (Other cases of loss of juncture are apron, newt, nickname) Though misconstructions may tend to be nonstandard (for all intensive purposes, at least), they can become standard over time, as with many "errors". DCDuring (talk) 20:09, 19 February 2018 (UTC)
- It's specifically a rebracketing/metanalysis, which you could say is a type of misconstruction. However, I certainly wouldn't want to label orange as a misconstruction; that's true diachronically, but not synchronically. I do want to label it as a rebracketing, though. --Per utramque cavernam (talk) 20:21, 19 February 2018 (UTC)
- It's hard to find references rather than intuition to support classifying terms one way or another, but I suppose the difference between developmentation and pronounciate vs unfortunant and dominate is that I think the first two are intentional (jocular) errors and the second two are unintentional. If we keep the categories separate, should "eggcorns" be a subcategory of "misconstructions" or a "sibling category" on the same level (cross-linked)? - -sche (discuss) 20:39, 19 February 2018 (UTC)
- And then there are entries like firstable which only say they're eggcorns in the etymology, not the definition... - -sche (discuss) 21:06, 26 February 2018 (UTC)
April 2020 duplicate discussion: Template:eggcorn of
To Template:misconstruction of.
- Discussion moved from #Template:eggcorn of.
"Eggcorn" is a lovely term for our own amusement, but it is an inside joke that makes Wiktionary more closed to normal users. I believe that a term like misconstruction is more understandable to normal people and includes all eggcorns, mondegreens, etc. DCDuring (talk) 19:52, 9 April 2020 (UTC)
- A Google search for eggcorn brings up Wikipedia for the first entry. A Google search for misconstruction brings up "is misconstruction a real word" and dictionaries. Eggcorn might be slightly whimsical, but misconstruction is not a word used by normal people.--Prosfilaes (talk) 04:59, 22 April 2020 (UTC)
- Well, the whole concept is not discussed by "normal people". Misconstruction is immediately understandable to the average educated English speaker, even if they've never heard the term before; eggcorn isn't. —Mahāgaja · talk 08:14, 22 April 2020 (UTC)
- Keep as is. It links to the entry eggcorn, so users are never more than a click away from comprehension. If you think it's an inside joke, then the in-group is all of linguistics, and we might apply the same logic to eliminating the word illative from our entries — only linguists know what it means, and why should we use the most exact word when a vaguer one might do? —Μετάknowledgediscuss/deeds 18:17, 23 April 2020 (UTC)
- Keep. I don't like dumbing things down to appeal to the broadest population possible. What about the people who want more precise information, or who want to learn whimsical words to describe things? I'm quite happy with us filling a niche that other dictionaries don't fill, since that's why I use Wiktionary in the first place. Besides, the kind of people who aren't interested in expanding their vocabulary tend not to look up words in the dictionary very much anyway. Andrew Sheedy (talk) 16:02, 24 April 2020 (UTC)
- FWIW I proposed something similar earlier / further up the page, #Template:eggcorn_of_into_Template:misconstruction_of. I think the issue is less that the term is opaque, and more that the distinction is fuzzy/questionable, compare my comments above. - -sche (discuss) 17:35, 3 May 2020 (UTC)
Why is this in the singular? It just looks weird in the case of a title like this. (Somewhat irrelevant, extra issue: the page needs a lede to explain what a shortcut is.) PseudoSkull (talk) 05:23, 21 February 2018 (UTC)
- Support on both counts. —Μετάknowledgediscuss/deeds 19:23, 20 March 2018 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 03:42, 29 October 2021 (UTC)
- @PseudoSkull There has been a section explaining what shortcuts are this whole time. It's just not right at the top, which might have been done intentionally to make the table of common shortcuts as quickly accessible as possible. - excarnateSojourner (talk | contrib) 06:10, 18 October 2022 (UTC)
2018 — March
This is extremely trivial, not to mention something that could be found even if it were not categorised. I think that it suits an appendix much better, so I propose that its contents be moved to Appendix:English words ending in -gry. —Μετάknowledgediscuss/deeds 03:23, 15 March 2018 (UTC)
- A benefit to having it as a category is that theoretically it ought to be addable by the headword templates examining the pagename (like "English terms spelled with Œ"), which, if implemented (...if it could be implemented without excessive memory costs), would allow it to be kept up to date automatically. - -sche (discuss) 17:16, 15 March 2018 (UTC)
- That is true, but I don't really think we should be using headword templates to collate trivia. —Μετάknowledgediscuss/deeds 17:47, 15 March 2018 (UTC)
- Delete per proponent. --Per utramque cavernam 18:09, 31 May 2018 (UTC)
- Is there something like Category:English lemmas but sorted from the end, like anger, ranger, hunger, angry, hungry? --幽霊四 (talk) 19:40, 6 February 2021 (UTC)
- At http://tools.wmflabs.org/dixtosa/ you can get a list of all entries in any category that end with any string you like. —Mahāgaja · talk 20:58, 6 February 2021 (UTC)
- Support the proposed move per nom. - excarnateSojourner (talk|contrib) 05:00, 29 October 2021 (UTC)
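The two operations raised in this thread (picking out titles by their ending, and listing lemmas "sorted from the end") can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sample list is made up from the words mentioned above, and it is not how the external tool or any headword module works.

```python
def words_ending_in(titles, suffix):
    """Return the titles that end with `suffix`, keeping input order."""
    return [t for t in titles if t.endswith(suffix)]


def sort_from_end(titles):
    """Sort titles by their reversed spelling (reverse-dictionary order)."""
    # Note: t[::-1] reverses code points, which is fine for plain ASCII
    # words but would misplace combining diacritics.
    return sorted(titles, key=lambda t: t[::-1])


# Illustrative sample only, not a real category dump.
sample = ["hungry", "anger", "angry", "hunger", "ranger"]

print(sort_from_end(sample))
print(words_ending_in(sample, "gry"))
```

Sorting on the reversed spelling groups words with a shared ending together, which is the order the question above describes.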
2018 — April
Entries for Japanese prefecture names that end in 県 (ken, “prefecture”)
I would like to request the move of the content of entries like 茨城県 (Ibaraki-ken, literally “Ibaraki prefecture”) to simply 茨城 (Ibaraki, “Ibaraki”), cf. Daijisen. 県 is not an essential part of the name.
(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Atitarev, Dine2016, Poketalker, Cnilep, Britannic124, Fumiko Take, Dine2016): —Suzukaze-c◆◆ 03:19, 19 April 2018 (UTC)
- As a counterargument, Shogakukan's 国語大辞典 entry for 茨城 (Ibaraki) has one sense listed as 「いばらきけん(茨城県)」の略 ("Ibaraki-ken" no ryaku, "short for Ibaraki-ken"), and the 茨城 page on the JA Wikipedia is a disambig pointing to 茨城県 as one possible more-specific entry. ‑‑ Eiríkr Útlendi │Tala við mig 03:52, 19 April 2018 (UTC)
- (edit conflict) It seems like a two-word phrase to me. I am not a native speaker, but I think that if someone asked "水戸市は何県?" ((in) What prefecture is Mito?) then "茨城です。" (It's Ibaraki) would be a correct answer. Entries such as 奈良 and 広島 should have both the city and the prefecture. (I see that 奈良 currently does.) Cnilep (talk) 04:01, 19 April 2018 (UTC)
- 茨城県です would also be correct and probably more common. At least 東京 and 東京都 are clearly distinguished. No one in Izu Ōshima would say he/she is from 東京. — TAKASUGI Shinji (talk) 04:04, 19 April 2018 (UTC)
- Yes, 茨城県 is also correct. And if someone asked どこの出身? (Where are you from?) the answer would probably be 奈良県 rather than 奈良, or else expect a follow-up question. But I don't think that is necessarily a matter of word boundaries. Compare Pittsburgh, Pennsylvania and Pittsburgh, Kansas; the fact that it is usually necessary, and always acceptable to specify the latter doesn't mean that Pittsburgh on its own is not a proper noun. By the same token, I think that 茨城 (et alia) is a word. That's the point I had in mind. I will say nothing about what is more common. I don't even have good intuitions about frequency in my native language. Cnilep (talk) 04:54, 19 April 2018 (UTC)
- I fully agree that 茨城 is a term worthy of inclusion. I also think that 茨城県 is a term worthy of inclusion. We have entries for both New York and New York City, and even New York State. Similarly, I think we should have entries for [PREFECTURE NAME], and also for [PREFECTURE NAME]県 and [PREFECTURE NAME]市 and [PREFECTURE NAME]郡, etc., as appropriate. ‑‑ Eiríkr Útlendi │Tala við mig 05:03, 19 April 2018 (UTC)
- I believe New York is a special case because there is both the state and the city. We have Washington State, but we don't have City of Chicago or State of Oregon. —Suzukaze-c◆◆ 18:40, 19 April 2018 (UTC)
- A lot (maybe all?) of the prefecture names minus the 県 (-ken) suffix are polysemous. Listing a few from the north to the south, limiting just to geographical senses, and just in the same regions at that:
- 青森 (Aomori): a prefecture and a city
- 岩手 (Iwate): a prefecture, a city, and a township
- 秋田 (Akita): a prefecture and a city
- 山形 (Yamagata): a prefecture, a city, and a village
- 宮城 (Miyagi): a prefecture, a county, a township, a rural area (ancient Japan), a village, an island, and a mountain
- 福島 (Fukushima): a prefecture, a city, and a township
- 新潟 (Nīgata): a prefecture, a city, a park, and a village
- 栃木 (Tochigi): a prefecture and a city
- 茨城 (Ibaraki): a prefecture, a county, and a township
- Jumping south a bit to touch on Anatoli's example further below:
- 奈良 (Nara): a prefecture, a city, a township, and a village
- I am consequently in support of including both the bare name, and the qualified name(s), much as we already do for similar situations with English terms. ‑‑ Eiríkr Útlendi │Tala við mig 21:35, 19 April 2018 (UTC)
- They are polysemic because most prefectures were named after their capital city during the abolition of the han system. Exceptions include 埼玉 and 沖縄, where cities are named after their prefecture. — TAKASUGI Shinji (talk) 12:23, 23 April 2018 (UTC)
- Generally support. Less duplication is good, and it is not much different from Chinese etc. for which we generally delemmatise, if not completely hard-redirect, these forms. Wyang (talk) 04:49, 19 April 2018 (UTC)
- Support. For a dictionary, I think we don't need to keep entries with both prefecture name and prefecture, despite the usage, but it's always helpful to provide usage notes (e.g. normally used with 県: ~県) and usage examples, e.g. 奈良県 (Nara ken, “Nara (prefecture)”). --Anatoli T. (обсудить/вклад) 05:45, 19 April 2018 (UTC)
Same suffix as in быль (bylʹ), убыль (ubylʹ), прибыль (pribylʹ), отрасль (otraslʹ), поросль (poroslʹ). а belongs to the stem. Guldrelokk (talk) 23:27, 20 April 2018 (UTC)
- @Atitarev, Benwing2, Chignon: Please voice an opinion; if you agree, the couple of entries using this suffix need to be modified. —Μετάknowledgediscuss/deeds 01:52, 16 April 2019 (UTC)
- Agreed. The two entries need a change. --Anatoli T. (обсудить/вклад) 01:57, 16 April 2019 (UTC)
- ruwikt: Категория:Русские слова с суффиксом -ль (Category:Russian words suffixed with -ль). --Anatoli T. (обсудить/вклад) 02:08, 16 April 2019 (UTC)
- @Guldrelokk, Benwing2, Chignon: I have modified entries, the category is orphaned, -ль (-lʹ) still needs to be defined. --Anatoli T. (обсудить/вклад) 03:30, 16 April 2019 (UTC)
- @Atitarev, can you please resolve this? —Μετάknowledgediscuss/deeds 07:41, 6 March 2021 (UTC)
2018 — July
Category:Bengali script and related
After some discussion on Category talk:Baybayin script (that went a bit off-topic), some of the Indian language editors (@Bhagadatta, Msasag and myself) have agreed that this category should be renamed to Category:Eastern Nagari script, the reasons being (1) several languages other than Bengali use this script, and (2) the Bengali alphabet is just a subset of this script and lacks some of the glyphs used by other Bengali-script languages (most prominently Assamese which has a separate r-glyph). I want to make sure that there are no objections to this by editors who were not in the discussion. —AryamanA (मुझसे बात करें • योगदान) 02:06, 20 July 2018 (UTC)
@Asm sultan, Dubomanab Kutchkutch (talk) 05:35, 21 July 2018 (UTC)
- Support -- Bhagadatta (talk) 08:38, 21 July 2018 (UTC)
The result of the discussion is RFM-moved to Category:Eastern Nagari script. --Sbb1413 (he) (talk • contribs) 10:58, 27 March 2023 (UTC)
- Oppose – I had closed the discussion and renamed the category to Category:Eastern Nagari script, only to find out that there's a separate Category:Assamese script. --Sbb1413 (he) (talk • contribs) 11:16, 27 March 2023 (UTC)
The two verb senses are bad IMHO. The first should be at busy oneself, I think, since it is always reflexive AFAIK. The second one doesn't sound right at all -- "He busied her" isn't something I've heard. Is that real at all? 69.255.250.219 02:36, 29 July 2018 (UTC)
- Support the move of verb sense 1 to busy oneself. Send verb sense 2 to RFV. - excarnateSojourner (talk|contrib) 05:46, 29 October 2021 (UTC)
- It's not purely reflexive, so I oppose the move for sense 1. Examples: "I will […] busy him with my affairs till he forgets his own" [2]; "And what has been busying you?" [3]; " […] he busied you with other chores" [4]. Rarer than I thought, since I've heard e.g. "sorry for busying you" in real life, but it's a thing. Sense 2 I'm unfamiliar with. —Al-Muqanna المقنع (talk) 23:51, 2 December 2022 (UTC)
- [[busy oneself]] might be a good hard redirect to the appropriate sense of busy, which would benefit from {{lb|en|usually reflexive}} and corresponding usage examples. DCDuring (talk) 14:46, 3 December 2022 (UTC)
- The sense at [[busy]] should remain, whether or not there is a separate lemma entry for busy oneself. DCDuring (talk) 14:48, 3 December 2022 (UTC)
- Redirecting busy oneself and a label makes sense, agreed. —Al-Muqanna المقنع (talk) 16:10, 3 December 2022 (UTC)
2018 — August
Nahuatl is sometimes treated as a language, and sometimes as a family of languages. Right now, Wiktionary is treating it as both simultaneously, which doesn't make sense. "Nahuatl" should be removed as a language. --Lvovmauro (talk) 11:55, 30 August 2018 (UTC)
- I agree the current arrangement doesn't make sense; it is a relic of very early days on Wiktionary, and has persisted mostly because it's not entirely clear how intelligible the varieties are and hence whether it's better to lump them all into nah, or retire nah and separate everything. But enough varieties are not intelligible that I agree with retiring nah (or perhaps finally converting it to a family code). - -sche (discuss) 20:34, 31 August 2018 (UTC)
- I think a family code for Nahuan languages is really needed since there are many cases where we don't know specifically which variety a word was borrowed from. --Lvovmauro (talk) 09:55, 9 September 2018 (UTC)
- @Lvovmauro: OK, thanks to you and a few other editors, all words with ==Nahuatl== sections have been given more specific headers. However, as many as a thousand translations remain to be dealt with before the code can be made a family code and Category:Nahuatl language moved on over to Category:Nahuan languages. - -sche (discuss) 06:48, 19 September 2018 (UTC)
- A disturbingly large number of these translations are neologisms with no actual usage. Some of them don't even obey the rules of Nahuatl word formation. --Lvovmauro (talk) 11:03, 19 September 2018 (UTC)
- @Lvovmauro: Feel free to remove obvious errors / unattested neologisms. If a high proportion of the translations are bad, it might even be reasonable to start presuming they're bad and just removing them, since they already suffer from the problem of using an overbroad code. - -sche (discuss) 00:28, 21 October 2018 (UTC)
- Someone with more time on their hands than me at the moment will need to delete all the subcategories of Category:Nahuatl language, and then the category itself, in preparation for moving 'nah' from the language-code module to the family-code module so the categories won't be recreated by careless misuse of 'nah' in the labels etc of 'nci' entries. - -sche (discuss) 00:24, 21 October 2018 (UTC)
Mecayapan Nahuatl saltillos
A number of Mecayapan Nahuatl words are currently written with U+0027 APOSTROPHE, which is a punctuation mark and not a letter. And a couple are using U+02BC MODIFIER LETTER APOSTROPHE, which is the wrong shape for this language. They should all be written with U+A78C LATIN SMALL LETTER SALTILLO instead.
- a̱'ti → a̱ꞌti
- babasoti' → babasotiꞌ
- cacahua' → cacahuaꞌ
- ca̱la̱' → ca̱la̱ꞌ
- coyo̱' → coyo̱ꞌ
- epaso̱' → epaso̱ꞌ
- hui̱lo̱' → hui̱lo̱ꞌ
- ichca' → ichcaꞌ
- ilhui' → ilhuiꞌ
- ocoʼ → ocoꞌ
- po̱cho̱' → po̱cho̱ꞌ
- sihua̱' → sihua̱ꞌ
- soqui' → soquiꞌ
- ta̱ga' → ta̱gaꞌ
- tepe̱' → tepe̱ꞌ
- ti̱lti' → ti̱ltiꞌ
- toca' → tocaꞌ
- tomaʼ → tomaꞌ
- to̱ca̱' → to̱ca̱ꞌ
- to̱to̱' → to̱to̱ꞌ
- tzi̱ca' → tzi̱caꞌ
- xo̱chi' → xo̱chiꞌ
--Lvovmauro (talk) 09:48, 31 August 2018 (UTC)
- Or perhaps they should just be moved to use the Modifier Letter Apostrophe, cf WT:RFM#Entries_in_CAT:Taos_lemmas_with_curly_apostrophes, to avoid over-proliferation of different apostrophe-ish letters. I think we should try to be consistent within the Nahuatl languages, at least, in which codepoint we use. - -sche (discuss) 20:26, 31 August 2018 (UTC)
- Most Nahuan languages don't use any sort of apostrophe. Mecayapan is unusual. --Lvovmauro (talk) 01:54, 1 September 2018 (UTC)
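The replacement proposed in this thread can be sketched in a few lines of Python. This is only an illustration, not a bot that exists on Wiktionary: the function names `normalize_glottal` and `codepoints` are hypothetical, and the `target` parameter is an assumption added so the same sketch covers the alternative of standardizing on U+02BC instead of U+A78C.

```python
import unicodedata

APOSTROPHE = "\u0027"           # APOSTROPHE (punctuation mark)
MODIFIER_APOSTROPHE = "\u02BC"  # MODIFIER LETTER APOSTROPHE
SALTILLO = "\uA78C"             # LATIN SMALL LETTER SALTILLO

# Every apostrophe-like character this sketch treats as the glottal stop letter.
_SOURCES = (APOSTROPHE, MODIFIER_APOSTROPHE, SALTILLO)


def normalize_glottal(title: str, target: str = SALTILLO) -> str:
    """Return `title` with each apostrophe-like character replaced by `target`.

    Hypothetical helper: it only computes the new page title, it does not
    move anything.
    """
    table = str.maketrans({ch: target for ch in _SOURCES})
    return title.translate(table)


def codepoints(text: str) -> list[str]:
    """List the code point and Unicode name of each character, for checking."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]
```

For example, `normalize_glottal("cacahua'")` yields the title ending in U+A78C, while `normalize_glottal("cacahua'", target=MODIFIER_APOSTROPHE)` yields the U+02BC variant; `codepoints` can be used to confirm which character an existing title actually contains, since the three look nearly identical in many fonts.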
2018 — September
Arawak and Island Carib
Any objections to me renaming Lokono arw (4 entries) and Kalinago crb (0 entries) to Lokono and Kalhiphona, respectively? Arawak is easily confused with the Arawak/Arawakan proto language and family, and Carib is one of two often confounded languages, the Carib language and the Island Carib language. --Victar (talk) 04:03, 6 September 2018 (UTC)
- No objection to renaming Arawak, but I'm not sure about Kalhiphona, which seems to be quite rare even on a Google web search, and which seems to invite as much possible confusion (in its various spellings) with the various spellings of Garifuna as it avoids with other "Carib"s. - -sche (discuss) 06:56, 19 September 2018 (UTC)
Only used for Danish. —Rua (mew) 17:15, 9 September 2018 (UTC)
- I don't envisage using them in Norwegian. DonnanZ (talk) 13:53, 11 September 2018 (UTC)
It’s not about goon but go-on. Most books on Japanese seem to use kan-on and go-on with a hyphen rather than the correctly Romanized kan’on and goon. — TAKASUGI Shinji (talk) 15:42, 22 September 2018 (UTC)
2018 — October
I propose to rename Category:Korean determiners to Category:Korean adnominals, just like Category:Japanese adnominals. Korean gwanhyeongsa are grammatically almost identical to Japanese rentaishi or adnominals, which may or may not be determiners. Gwanhyeongsa are generally divided into three classes: demonstrative gwanhyeongsa, numeral gwanhyeongsa, and qualifying gwanhyeongsa ([5]). The last ones are not determiners. (pinging @Atitarev, Eirikr, Garam, HappyMidnight, KoreanQuoter) — TAKASUGI Shinji (talk) 23:31, 10 October 2018 (UTC)
- Support. --Garam (talk) 08:21, 12 October 2018 (UTC)
- Tentatively Support. Let's check with User:Wyang who was also involved and had an opinion in a related discussion on the group of words ending in 적 (的, jeok). --Anatoli T. (обсудить/вклад) 02:42, 13 October 2018 (UTC)
- I feel determiner is the more common name for this in English; the different definitions of these terms across languages should not be a concern - e.g. we also use adjective differently for Korean. adnominal may be confused with the -eun, -neun, -eul, -deon forms of Korean verbs and adjectives. Wyang (talk) 03:57, 13 October 2018 (UTC)
- @Wyang: The problem is that Category:Korean determiners contains words other than determiners. It will be all right to have both Category:Korean adnominals and Category:Korean determiners without renaming if you want, just like Category:Japanese adnominals and Category:Japanese determiners. — TAKASUGI Shinji (talk) 10:31, 13 October 2018 (UTC)
@Tibidibi, AG202 —Fish bowl (talk) 11:32, 7 February 2022 (UTC)
ichthyosaur vs. ichthyosaurus, and other terms like these.
I'm in a dispute with an editor over the exact meaning and differences between these two terms - are they the same or must we tell apart the order from the genus? Is there a standard to follow? Дрейгорич (talk) 15:55, 27 October 2018 (UTC)
- The standard is making a survey of contemporary and past usages and using that to inform the definitions. DTLHS (talk) 16:15, 27 October 2018 (UTC)
- I've gone ahead and cleaned up the definitions, and linked to the scientific genus in the entry in case anyone wants that. Дрейгорич (talk) 16:23, 27 October 2018 (UTC)
2018 — November
Language request: Old Cahita
Mayo and Yaqui are mutually intelligible and sometimes considered to be a single language called Cahita. But their speakers apparently consider them to be distinct languages, and they have distinct ISO codes (mfy and yaq) and are currently treated distinctly by Wiktionary.
I'm not requesting that they be merged, but separating them is a problem because an important early source, the Arte de la lengua cahita conforme à las reglas de muchos peritos en ella (published 1737 but written earlier) treats them as a single language, and also includes an extinct dialect called Tehueco. I'd like to add words from the Arte but I can't list them specifically as either Mayo or Yaqui.
One solution would be to treat the language of the Arte as a distinct historical language, "Old Cahita", which would then be the ancestor of Mayo and Yaqui. The downside is there only seems to be one linguist currently using this name. --Lvovmauro (talk) 11:32, 4 November 2018 (UTC)
- On linguistic grounds, it seems like we should merge Yaqui and Mayo. Jacqueline Lindenfeld's 1974 Yaqui Syntax says "Yaqui and Mayo are sufficiently similar to be mutually intelligible", the Handbook of Middle American Indians says "the modern known representatives of Cahitan—Yaqui and Mayo—are mutually intelligible", and various more general references say "Yaqui and Mayo are mutually intelligible dialects of the Cahitan language", "The Yaqui and Mayo speak mutually intelligible dialects of Cahita". (There are political considerations behind the split, which a merger might upset, so adding Old Cahita would also work, but we have tended to be lumpers...) - -sche (discuss) 23:03, 18 November 2018 (UTC)
- I wouldn't object to merging them. --Lvovmauro (talk) 08:58, 19 November 2018 (UTC)
Cleanup suggestions for some badly attested Semitic languages, needing admin action
- Discussion moved from Wiktionary:Grease_pit/2018/November#Cleanup suggestions for some badly attested Semitic languages, needing admin action.
- Pray somebody add |scripts = {"Narb"} to Module:languages/data3/x after line 1026 for xna. (Otherwise mentions of words in it are shown in slanted letters.)
- Added. DTLHS (talk) 03:17, 14 November 2018 (UTC)
- It seems that even MediaWiki:Common.css needs a new class for Narb added, to get font-style: normal; Sarb is there and has it, Narb is not there. If the mention of a North Arabian word in عَنْكَبُوت (ʕankabūt) works then it is complete. Also I see that in Module:scripts/data Narb does not have direction = "rtl" while Sarb has. Fay Freak (talk) 14:43, 15 November 2018 (UTC)
- Good catch. I've updated Common.css and Mobile.css and set it to display rtl. Sadly, it seems there are no fonts that display it. If you or I could find a good image of what the letters are supposed to look like, I might have time to make a basic font iff the letters don't have to be joined the way they do in Arabic. - -sche (discuss) 22:08, 18 November 2018 (UTC)
- As an Arch Linux user, I had a great update three weeks ago that adds display support for Old North Arabian, amongst other things which improved Arabic and Syriac script rendering everywhere. gucharmap gives the name of the font as “Noto Sans Old North Arabian”, which I find in the file list of the noto-fonts package. @-sche Fay Freak (talk) 22:29, 18 November 2018 (UTC)
- I think everything under Category:Old North Arabian script languages should be “Ancient North Arabian” (xna); it is a wonder that Dadanitic (sem-dad), Hismaic (sem-his), Safaitic (sem-saf), Taymanitic (sem-tay), Dumaitic (sem-dum), Hasaitic (sem-has), Thamudic (sem-tha) are separate languages on Wiktionary (some also with no script assigned). (Probably someone went through some lects and added all he found.) Those lects are at a level of attestation or study where it does not even matter whether they are dialects or languages, and “Thamudic” is even a collective term for any of the Ancient North Arabian lects not further classified. Many inscriptions cannot be classified into more specific lects anyway (you know, people also were nomads and wrote graffiti here and there) and they can only be entered as “Ancient North Arabian”. With words being found randomly and in concise consonantal writing I don’t see why one would pursue separation other than by stating the find spot.
- Also, “Qatabanian” (xqt), “Sabaean” (xsa), “Minaean” (inm), “Harami” (xha, redirects to “Minaean” on Wikipedia), Hadrami (xhd) – likewise otiose distinctions, regarding form and amount of attestation of Epigraphic South Arabian, as the name says only epigraphically attested, without any vowels –, have been unpopular in use already; entries and etymologies use the header “Old South Arabian” (sem-srb). I suggest crossing those out. Etymology-only is possible so one can use those in {{cog}} when in an individual case a word is known to be attested as of one of the dialects. North Arabian epigraphy categorization is more complex and it is better anyway to mention in each etymology where a lexeme has been encountered.
- Himyaritic (sem-him), as an attested language, is rather mythical because the Ḥimyarites wrote Sabaean. Wikipedia mentions “three Himyaritic texts”, at the same time in the Encyclopedia of Arabic Language and Linguistics s.v. we read about two: “It is not even possible to establish whether they were written in the same language. The first text dates from around 100 C.E. and the second from around 300 C.E.” And about the secondary material from Early Medieval Arabs: “It is easy to see that quotations from Himyaritic offer very different readings according to the manuscripts.” Or according to others, mentioned in the EALL, Ḥimyarite is the same as Arabic, only with peculiar features (which might as well derive from Arabicized transmission, or later language fusion or whatever, much that could fool us). It could be grouped with those spurious languages if this category held languages from Antiquity.
- Gurage is according to Wolf Leslau, its most eminent scholar, one language with twelve dialects; others share this view. The material for this language, particularly by Leslau across his works, only lists words as “Gurage”, without qualifying if they are “Inor”, “Mesqan” or some other Gurage, so on Wiktionary one cannot simply give “Gurage” words (which has recently been done in Semitic comparisons by abusing the code of the largest dialect Sebat Bet Gurage, in spite of the source saying “Gurage”). The following dialects I find on en.Wiktionary as languages: Kistane/Soddo (gru), Mesqan (mvz), Sebat Bet Gurage (sgw), Silt'e (stv), Inor (ior), Muher (sem-mhr), Mesmes (mys), Chaha (sem-cha), Wolane (wle), Zay (zwa); some of these are considered subdialects of Sebat Bet Gurage. There are more I don’t find on Wiktionary. It’s perhaps like with the Aramaic dialects of yore or the Low German dialects today. People publish Westphalian dictionaries but it’s still Low German and so treated by Wiktionary. I suspect that instead of holding controversial subdivisions deriving from Ethnologue we should, holding to the sources, keep the Wiktionary-language level higher. The source for a certain word can be further qualified by labels as with Coptic. I mean that with language, unlike with biological taxonomy, one cannot simply assume that distinctiveness of a taxon is ascertained by experiments and then authoritatively published in some reference. As the individual forms are described in this dictionary, one must weigh if the data allows distinction at all. Currently it looks to me that hence Gurage must be lumped; I don’t know if, with new data or emerging different literary standards, separating the lects with separate codes will later be convenient (the increase in language material will be disappointing and unlikely someone will come and add Gurage in thousands of entries anyway, let’s be realistic), but I doubt that it would be comfortable.
See also Why is Old Novgorodian a separate language in Wiktionary? This is the question: Is the difference in data enough to justify separation? The actual language-dialect distinction does not matter; it must be seen functionally, for dictionary purposes. And if linguists publish material as “Gurage” the distinction is probably not good for Wiktionary headers. Isn’t it out of scope of Wiktionary to distinguish lect clusters when they are generally unwritten and chiefly written by and variously lumped and split by linguists? That’s a difficult question. Also I fear that such distinctions might be precisely the cause why nobody comes and pours out his rich Gurage knowledge. An adept would not be sure to distinguish, pendulating between two extremes, not knowing whether he should split as much as he can by all kinds of criteria or standardize and abstract. To help though first all mentioned codes need the Ge'ez and Latin script both assigned, and the macrolanguage created. Maybe there will be late order from early ambiguity. Though I would perhaps do the order by lumping and labelling by location, were I that certain aficionado.
- The obese Wiktionary:List of languages currently comprising 8055 lects needs cuts however. Fay Freak (talk)
- This discussion really belongs at rfm, because that's where we normally discuss changes to whether or how we recognize a language. The Grease pit is for discussing how to implement something along those lines- not whether it should be implemented. The other option would be at the Beer parlour, but this seems like something that would benefit from the more specialized focus of rfm. Chuck Entz (talk) 03:39, 14 November 2018 (UTC)
- Good distinction. I hesitated at 4:13 AM where to put it because of the mixed content. Moved. Fay Freak (talk) 14:16, 14 November 2018 (UTC)
- Some prior discussion of Thamudic et al is on Category talk:Hismaic language; IIRC they were separated because literature does mention them as distinct entities, but if they were very similar or often treated as one language, and especially if there's difficulty in assigning specific texts to specific ones due to similarity, that would be an argument for reversing that decision and going back to the conservative approach of treating them all as one language with 'dialect'/'region' labels where appropriate.
(As to the venue, yes, these discussions tend to happen on RFM for quirky historical reasons — originally the discussions entailed actually merging or splitting language templates — although some have proposed the Beer Parlour as a more logical venue. There are minor benefits and drawbacks to either venue; this venue does have the advantage that discussions stay on the page until resolved.) - -sche (discuss) 17:20, 14 November 2018 (UTC)
- I avoided Beer Parlour because I thought it is only for matters already affecting people, but it would not affect anyone we know now. Fay Freak (talk) 14:43, 15 November 2018 (UTC)
- Who is likely to have access to resources on Africa's Semitic languages that could help judge what to do with Gurage? User:Metaknowledge, User:Wikitiki89? Wikipedia insists "The Gurage languages do not constitute a coherent linguistic grouping", which seems incompatible with merging them. William A. Shack, in his book on The Gurage, writes that "each Gurage dialect is usually understood only by its own speakers, and there is a rough correlation between the contiguity of dialect groups and the extent to which their dialects are mutually intelligible." (Steven Danver, in his (general-focus) encyclopedia, says "the languages of the different groups of Ethiopian Gurage are seldom mutually intelligible.") Marvin Lionel Bender, in his 1976 Language in Ethiopia, says "Although seventeen varieties of Gurage dialects are listed, mutual intelligibility reduces this to four languages and three dialect clusters as follows (Hetzron classification):
Gogot, Misqan, Muxir, Soddo
East Gurage (Inneqor, Silti, Urbareg, Weleni, Zway)
Central West Gurage (Chaha, Gumer, Gura Izha)
Peripheral West Gurage (Ener, Geto, Indegegn, Innemor)"
However, his very next sentence is: "Gogot, Muxir, Soddo comprise a geographical (non-genetic) grouping of non-mutually-intelligible languages known as 'North Gurage'", all of which seems to suggest that merging all of the Gurages would not be sound.
- -sche (discuss) 17:28, 14 November 2018 (UTC)
- The cited grouping of course adds to the confusion. Three languages, but four dialect clusters, not mentioning their intersections? Well, we will not find out how one should see them without deep-diving. But the question is which direction Wiktionary should go: likely the current division is not correct. Should Wiktionary just add all possible splits so they can be cleaned up later when someone would commit himself to add the whole Gurage and judge about which distinctions are most convenient or should we have one macro-code because distinction is hopeless? The reason why I have even mentioned Gurage is that for example Leslau’s Etymological Dictionary of Geʿez which I like to use just gives words as “Gurage”, which sounds like there is a common vocabulary. Fay Freak (talk) 14:43, 15 November 2018 (UTC)
- Perhaps you can deduce from Leslau's literature list which Gurage language he gets his data from? He seems to have written an etymological dictionary of Gurage as well, presumably its foreword could clear things up.
- His own field studies. I had linked his Etymological Dictionary of Gurage (“according to Wolf Leslau” etc.). Fay Freak (talk) 15:23, 17 November 2018 (UTC)
- As a volunteer project (run on fancy), we really have no other choice than to wait for someone to investigate the matter deeply and order the languages in a manner that facilitates their lexicographical work.
- Maybe we need non-genetic language group categories and ways to give forms in unindentified languages belonging to language groups. Crom daba (talk) 15:49, 15 November 2018 (UTC)
- @Fay Freak, -sche: A bit late, but here are my responses to the three outstanding problems (your #2–4):
- It is fairly evident that Ancient North Arabian is not a single language, and I advocate that sem-xna be abolished rather than the specific language codes; read Al-Jallad (2018), "What is Ancient North Arabian?". He sees Safaitic (which he has written a grammar of) and Hismaic as being of the same continuum as Old Arabic, but they are obviously too distinct from Classical Arabic for lexicographical purposes. He supports the distinctness of the others as languages, and of the various "Thamudic" lects. Based on Al-Jallad, I would prefer we split Thamudic B, C, D, etc as necessary; each language will have a very small corpus, but it seems like the most honest way to do it, and if more inscriptions are found, the lettered Thamudic wastebaskets will probably get their own names as the others did.
- Old South Arabian is also not a single language, though Sabaean was the standard that the other lects imitated, and I advocate that sem-srb be abolished as well. Multhoff (2019) in The Semitic Languages makes the case for four distinct languages: Sabaean, Minaean, Qatabanian, and Hadrami. She makes no mention, however, of Harami. Macdonald (2000), "Reflections on the linguistic map of pre-Islamic Arabia" explains that "Harami" is a name given to a few Sabaean texts that seem to have been contaminated by other Semitic languages, which is not at all an unusual feature and not unique to that site, so I suggest we remove that code.
- As for Himyaritic, I now think I was wrong to include it. There are three texts often attributed to it, but see Stein (2008), "The ‘Himyaritic’ Language in Pre-Islamic Yemen", which makes a strong argument to consider these as simply very late examples of Sabaean, which is indisputably the language of the other texts of the region in that script.
- Finally, for Gurage, the chief problem is that some scholars follow Hetzron in saying that Gurage is polyphyletic, in which case lumping would be committing a grave error (and the same charge has been levelled for Aramaic, with perhaps more evidence). Meyer (2011) in the International Handbook does seem to support the unity of Gurage, and treats the lects together, which gives me hope for lumping, but he is unwilling to commit to whether they should be considered dialects or languages. I think your Gurage-adding genius is mythical, so we have to choose which is least bad: many languages with scanty coverage, because their forms may be similar to forms entered under a different L2 header; or one Gurage language with decent coverage, but many forms that are not marked for what dialect they belong to and therefore a poor resource. I hesitantly support merger, given those choices. —Μετάknowledgediscuss/deeds 03:13, 10 August 2020 (UTC)
- An addendum: "Hadrami" is a terrible name for xhd, and invites confusion with Hadrami Arabic. Wikipedia uses "Hadramautic", but N-grams and a quick literature review suggests that "Hadramitic" is more common. @Fay Freak, -sche again (yes, I know I'm pestering, but I don't want to move forward on all this alone, both because I am fallible and because some of these, particularly splitting OSA, would require a bit of work, although in that case there is an online corpus that will help immensely). —Μετάknowledgediscuss/deeds 02:40, 17 August 2020 (UTC)
- Re North Arabian: Many works I browsed through speak of Old North Arabian as a unit with dialects, but also carefully specify what lects (including Thamudic B vs C, etc) words are attested in. Some imply, in their presentations, that a large number of words are identical between dialects, at least in the sample of vocabulary that they're treating (e.g., the pronouns treated in Roger D. Woodard, The Ancient Languages of Syria-Palestine and Arabia (2008), pages 197-198), though this seems to be because the authors are presenting 'normal', normalized and romanized forms, given Al-Jallad's evidence that words (even the supposedly distinctive definite article) varied not just among dialects but even within the writings of individual speakers. The native script also loses many possible differences in pronunciation, but then, we are a written, writing-based dictionary. I find slightly more works speaking of "Ancient North Arabian dialects" than "Ancient North Arabian languages", and the fact that some authors have argued the varieties are the same language not only as each other but even as Arabic itself does suggest a high degree of similarity (or that the scholars in question are lumpers). As we're dealing with small, extinct and apparently clearly delineated corpora, it seems like the conservative approach of treating each under its own L2 could be better, and we could retire xna ... unless we need it as a wastebasket for unsorted things, which Al-Jallad (and Fay Freak, above) suggests we would. (Bah, It's messy business, deciding what's a language and what's a dialect...) I will try to dig into the rest later. - -sche (discuss) 04:10, 19 August 2020 (UTC)
Well myself I have added Sabaean, Minaean, Qatabanian entries meanwhile, understanding and quoting a few inscriptions, although apart from some occasional features I noticed little how such an inscription can be classified as either, other than by provenience or rulers or gods mentioned—but that must be due to my blasé comparative approach that also makes me read Romance without recognizing the individual language. So somehow the volition to a merge is gone, though the lumping codes “Old South Arabian” and “Old North Arabian” must be kept for inscriptions no one has classified. Both are useful.
For Himyaritic, however, nothing is left. As here said already, the three alleged Himyaritic inscriptions don’t even need to be in the same language, and they aren’t even from anything to be called Ḥimyar (there are “Lesser Himyarites” and “Greater Himyarites” and the ethnic identity is fragmentary, too, by the way). In the “Critical Reevaluation” of the Ḥimyaritic language – cited by Wikipedia on Himyaritic language one does not know what for: their “undeciphered-k language” header recently introduced is surely a made-up term, oddly suggesting that these inscriptions are yet another language when those “k-language” inscriptions are exactly those otherwise claimed for Himyaritic, so we see Wikipedia editors had no clue and phantasize together languages due to their disdain for primary sources – helpfully includes a map, also coming to the conclusion “we have no reason to assume the existence of some “non-Ṣayhadic” language in pre-Islamic Yemen that was spoken besides the (Late) Sabaic idiom known from the inscriptions.” That from the fact that “Himyaritic” words typically given from Arabic sources are all also found in Sabaic, and the grammar found in the three inscriptions, including the prefixed instead of postfixed article which is only found in two of them, is too either found in Sabaean or can well be ascribed to their being poetry, which is also the reason for their being poorly understood. Many Arabic poems are also hard to understand and mostly helped by the copious material for the language which is not the case for languages with so limited a corpus, like Old South Arabian. Even in the Digests, Latin prose, not all passages are of discoverable meaning.
What would hinder man though to add understood words with quotes from the ominous inscriptions as Sabaean? Or anything from Arabic sources transmitted as Himyaritic instead of Arabic as Sabaean? For there is no evidence for it being a particular language. You see, from the corpus-based standpoint Wiktionary takes Himyaritic must go. Nothing can get the header “Himyaritic”, it can only be mentioned at Sabaean or Old South Arabian entries that Himyaritic nature is suggested by those who have come to believe in this extraordinary claim for which extraordinary evidence is not provided. Fay Freak (talk) 04:18, 6 August 2021 (UTC)
- I went on and moved our only “Ḥimyaritic” entry after that famous sentence to Yemeni Arabic in which the word طَيِّب (ṭayyib) for “gold” turns out otherwise known, and to be nothing else than Classical Arabic طَيِّب (ṭayyib, “good”) meaning “refined” and therefore gold, while Old South Arabian could not have developed such sense, so it is clear the famous quote one has been so inept to classify is at best only macaronic Sabaean-Yemenite Arabic. It is well put by Marijn van Putten:
- The Arab grammarians were interested in describing correct usage of language of Classical Arabic. It is quite clear that Himyaritic (and by extension Yemeni Arabic) did not fall in the category of 'correct usage'. Within this context, it is of course not surprising that anything that is "wrong" and from Yemen might be denoted as Himyaritic. This would then include both varieties of Yemeni Arabic and some surviving vestiges of Ancient South Arabian. Fay Freak (talk) 04:59, 13 September 2021 (UTC)
- Now also in a new article by Koutchoukali like communis opinio, though his blogs transpire by him stalking Wiktionary: later Muslim historians would refer to anything related to South Arabia’s pre-Islamic history as “Himyaritic,” all memory of its other states having passed away. Fay Freak (talk) 01:01, 18 September 2021 (UTC)
Merging Classical Mongolian into Mongolian
"Classical Mongolian" refers to the literary language of Mongolia used from 17th to 19th century created through a language reform associated with increased Buddhist cultural production (this started in the 16th century, but language standardization took place later). In the 20th century, (outer) Mongolia became independent from China and later adopted a Cyrillic orthography based on the spoken language, while Inner Mongolia kept her Uyghur script.
The literary language of Inner Mongolia continues Classical Mongolian in terms of its orthography as well as most of its grammar (to an extent that Janhunen (?) calls the situation bilingual). Modern varieties, in both Outer and Inner Mongolia, have greatly expanded their lexicons through borrowing of modern terms, but they also both consider all of Classical Mongolian lexicon to be a part of their language, and will put it in their dictionaries, even transcribed into Cyrillic.
The actual problem I have with this division is that when it comes to borrowings from (Classical) Mongolian, we sometimes cannot ascertain whether they precede the 20th century or not, or more common still, we know they precede the 19th century (and post-date the 16th), but they obviously come from a spoken variety and not "Classical Mongolian" as a literary language. Crom daba (talk) 17:14, 15 November 2018 (UTC)
- Yes. I find it also strange that Wiktionary distinguishes Ottoman Turkish from Turkish, it’s like distinguishing pre-1918 Russian from “Russian”, or like one reads about “Ottoman Turks” instead of “Turks”. Also Kazakh and the other Turkic languages do not get extra codes for Arabic spelling, this situation is even more comparable, innit. Kazakhs in China write in Arabic script, Mongols in China in Mongolian script, but the languages are two and not four. Or also it sounds as with Pali. Am I correct to assume that Classical Mongolian texts get reedited in Cyrillic script? Then you could base all on Cyrillic and make Mongolian script soft redirects, because even words died out before the introduction of Cyrillic can be found in Cyrillic. Fay Freak (talk) 15:23, 17 November 2018 (UTC)
- @Fay Freak, the situation is similar to Turkish, but it creates fewer problems there since the Arabic script Turkish is obsolete and most relevant loans are pre-Republican.
- In principle it could be possible to collapse all of Mongolian into Cyrillic, but this would be extremely politically incorrect.
- Collapsing everything (potentially even Buryat, Daur and Middle Mongolian) into Uyghur script, like we do with Chinese, would perhaps make more sense, but 1) it's a pain to enter 2) Cyrillic is generally more accessible and useful to our users and (Outer) Mongolians 3) most of my materials are in Cyrillic 4) it corresponds poorly to the spoken forms 5) its Unicode encoding corresponds poorly to its actual form 6) the encoding doesn't correspond that well to the spoken form either. Crom daba (talk) 16:50, 18 November 2018 (UTC)
- This is tricky, because as far as language headers and having entries for terms in the language, it seems like we could often resolve which language a word is in(?) by knowing the date of the texts it's attested in. It is, as you say, etymologies where it's hardest to ascertain dates. (Still, if we merged the lects, we could retain an "etymology only" code for borrowings that were clearly from Classical Mongolian, like is done for Classical Persian, etc.) I'm having a hard time finding any references on the mutual intelligibility of the two stages; most references are concerned with the intelligibility or non-intelligibility of modern Khalkha, Kalmyk, etc. If we kept the stages separate, etymologies could always say something like "from Mongolian foo, or a Classical Mongolian forerunner". - -sche (discuss) 22:50, 18 November 2018 (UTC)
- @-sche, yes, the Persian model would be desirable.
- It doesn't make much sense to speak of intelligibility between Classical and Modern Mongolian, Classical Mongolian is exclusively a written language, its spelling reflects the phonology of 13th-century Mongolian (early Middle Mongolian). The same spelling is used in Modern Mongolian as written in Uyghur script.
- The biggest problem with Classical Mongolian is how redundant it is. For any word that is shared between modern and classical periods, and that is probably most of the lexicon, we would need to make two identical entries in Uyghur script for modern and classical Mongolian. Crom daba (talk) 11:18, 19 November 2018 (UTC)
- That seems not unlike how we handle Serbo-Croatian and Hindi-Urdu. — [ זכריה קהת ] Zack. — 14:25, 30 November 2018 (UTC)
- Indeed. The way we handle them sucks. Crom daba (talk) 12:52, 1 December 2018 (UTC)
- I agree. All this duplication is a huge waste of resources. Per utramque cavernam 13:22, 1 December 2018 (UTC)
- Not exactly; Serbo-Croatian and Hindi-Urdu have redundant entries in different scripts on different pages, while I understand Crom daba's point to be that we would need to have redundant ==Mongolian== and ==Classical Mongolian== entries on the same pages for most Mongolian/Uyghur script words, which would be more like having duplicate Bosnian and Croatian entries on the same pages, not our current system. And Serbo-Croats are testier about their language(s) being lumped than speakers of Classical Mongolian... ;) - -sche (discuss) 17:29, 3 December 2018 (UTC)
- OK, does anyone object to the merge? If not, I can try to do it with AutoWikiBrowser later, or Crom or others could start reheadering our small number of Classical Mongolian entries, fixing any wayward translations, etc. For etymologies of terms that are known to derive from Classical Mongolian, we should be able to just move cmg over to Module:etymology languages/data. - -sche (discuss) 17:29, 3 December 2018 (UTC)
- @Crom daba, Fay Freak I made the few ==Classical Mongolian== entries we had into ==Mongolian== entries (labelled "Classical Mongolian" unless there was already a modern Mongolian section on the same page), but many of the categories still need to be deleted, and one needs to check whether anything else is left that would break before "cmg" is moved from being a language code to being an etymology-only code. - -sche (discuss) 02:46, 27 September 2020 (UTC)
- There's no full correspondence between different Mongolian scripts and none of the scripts is totally phonetic. It's not just the spelling, the phonologies are different but sometimes one script represents the true or historical pronunciation and it's not necessarily Cyrillic, which is strange. There are words that only exist in one or the other, which is quite understandable, cf. modern ᠱᠠᠹᠠ (šafa, “sofa”) in Inner Mongolia (from 沙發/沙发 (shāfā)) and софа (sofa, “sofa”) in Outer Mongolia (from софа́ (sofá)). I support the merge, though I am curious if classical Mongolian terms are equally representable in Cyrillic and Arabic scripts. In other words, are there terms in classical Mongolian which are different from modern and for which there's no Cyrillic form? I think I saw them.
- Duplication of entries is a waste. You may think I am biased but I think Mongolian should be presented/lemmatised in Cyrillic (Uyghurjin should also be available in all entries where it can be found) - for which resources are much more accessible. (Serbo-Croatian should be lemmatised on the Roman alphabet, on the other hand, let's finish the senseless duplications of entries)
- Also supporting the Ottoman Turkish/Turkish merge. --Anatoli T. (обсудить/вклад) 03:25, 27 September 2020 (UTC)
- @Atitarev In Mongol khelnii ikh tailbar toli we see the term уйгуржин бичиг is described as ‘монгол бичгийн дундад эртний үеийн хэлбэр’ (‘early form of the Mongolian/Khudam script’). Middle Mongolian in uigurjin with its own rules shall not be equated with the later ‘Classical’-Modern script and orthography. I maintain uigurjin (with its specific glyph forms and spelling rules) shall be treated as a term only for Middle Mongolian.
- Similarly I also object to treating Northern Yuan – Qing (‘Classical’) Mongolian and Modern Mongolian-script Mongolian as one literary language standard. In fact orthographic standardisations and modifications make written Modern Mongolian so different from Classical. Personally I’d like to display a historical feature of this language collectively under ‘Classical Mongolian’, as only this term directly interlinks with an Inner Asian historical and linguistic tradition. LibCae (talk) 16:40, 7 May 2021 (UTC)
2018 — December
Renaming agu
We currently call this "Aguacateca", but "Aguacateco" is much more common. (Wikipedia opts for "Awakatek", which is rapidly becoming more common but is probably not there yet — not that we can't be crystal-ballsy if we want to when it comes to names rather than entries.) —Μετάknowledgediscuss/deeds 05:42, 19 December 2018 (UTC)
- You're right that several modern (and a few older) sources seem to use Awakatek. In turn, historically Aguacatec has been used in the titles of many reference works on it, and seems like it may be the most common name (ngrams), although it's also the name of the people-group. (Others: Awakateko, Awaketec, Qa'yol, Kayol, and various spellings of Chalchitec, sometimes considered a distinct lect.) - -sche (discuss) 04:31, 19 August 2020 (UTC)
2019 — January
"comparative adjectives" > "adjective comparative forms"
Apparently there was a recent vote to remove the ambiguity of comparative and superlative categories. What I don't understand is why the name "comparative adjectives" was chosen, which suggests a lemma category, yet it's now being subcategorised under non-lemmas. Lemma subcategories are named "xxx POSs", as can be seen in Module:category tree/poscatboiler/data/lemmas. Non-lemma subcategories are named "POS xxx forms", visible in Module:category tree/poscatboiler/data/non-lemma forms. Therefore, the obvious place for comparative forms of adjectives is the "adjective comparative forms" category we used to have. The new name, although voted on, stands out as an exception among all of our existing categories and is inconsistent. It should therefore either be renamed back to reflect its non-lemma status, or it should be moved back under its original lemma parent category. —Rua (mew) 23:57, 10 January 2019 (UTC)
@Surjection, Erutuon —Rua (mew) 00:09, 11 January 2019 (UTC)
- The vote was here: Wiktionary:Votes/2018-07/Restructure comparative and superlative categories. — Eru·tuon 00:13, 11 January 2019 (UTC)
- Participles are not lemmas yet they are called "(language) participles", so it's not as if the comparatives/superlatives would exactly be exceptions of some kind. They even have their own "participle forms" categories! The former also applies to gerunds. — surjection ⟨?⟩ 09:13, 11 January 2019 (UTC)
- And to make it clear, "adjective/adverb comparative/superlative forms" categories are to be made obsolete as a direct result of the vote. — surjection ⟨?⟩ 09:16, 11 January 2019 (UTC)
- Yes, and that should be undone, because as I said, the name "comparative adjectives" suggests that they are lemmas because of our existing naming scheme. Participles are non-lemmas by virtue of being participles, but adjectives are lemmas, so "comparative adjectives" are also lemmas. Are you implicitly proposing to rename all non-lemma categories to this new scheme, e.g. "dual adjectives", "plural nouns", "possessive nouns", "feminine adjectives"? If the vote is upheld then I will propose this change to make things consistent again. —Rua (mew) 12:00, 11 January 2019 (UTC)
- I certainly would not assume "comparative adjectives" refer to lemmas in any way as much as "participles" don't. If we go back to "adjective comparative forms", what do you suggest for the name of the category with inflected forms of such? And don't just say "put them in 'Adjective forms'", because that at the very least isn't consistent as I stated below. In the old system, there was no consistency at all - inflected forms of comparatives and superlatives went to either the same category as them or Adjective forms without any sort of rule. — surjection ⟨?⟩ 12:17, 11 January 2019 (UTC)
- I would not even categorise inflected forms of comparatives in a special way. They are just adjective forms. I don't even think comparatives should be categorised separately at all, there is no obvious need to do so. The example of possessive forms is perhaps the best parallel, since they have inflection tables of their own in Northern Sami and many other languages. Do you propose renaming them to "possessive nouns" so that there can be a separate "possessive noun forms" category? —Rua (mew) 12:28, 11 January 2019 (UTC)
- If you feel comparatives too don't need a special category, I'm personally fine with bunching all of them under "adjective forms", but that will too need wider consensus to implement. When it comes to those possessive nouns, I would argue comparatives and superlatives are closer to participles than to those possessive forms, which is why I believe they're not a good parallel and should be considered separately. — surjection ⟨?⟩ 12:40, 11 January 2019 (UTC)
- Why? —Rua (mew) 12:46, 11 January 2019 (UTC)
- Many participle forms develop into adjectives of their own right and some comparative/superlatives too have developed into their own forms. Possessive forms by comparison basically never have, showing that they are fundamentally different in some way. — surjection ⟨?⟩ 12:49, 11 January 2019 (UTC)
- In fact, unlike this new system which has parallels, I'm fairly sure the old system of having "adjective comparative forms" but then the forms of comparatives under "adjective forms" is more of an exception. — surjection ⟨?⟩ 09:32, 11 January 2019 (UTC)
- Not really. We don't have separate non-lemma categories for everything in Module:category tree/poscatboiler/data/lemmas and in fact we don't need to. Under the old system, all comparative forms could be categorised under "adjective comparative forms", so that includes all case forms of comparatives. There was never any need to separately categorise forms of comparatives. In fact I'm generally opposed to subcategorising non-lemmas, so that's why I moved everything in Dutch to just "adjective forms". We don't need a subcategory for every possible type of non-lemma form. However, if we do have them, then they should be named consistently. —Rua (mew) 12:00, 11 January 2019 (UTC)
- We don't have separate non-lemma categories for the reason that many of them are simply not inflectable on and upon themselves. Again, participles have separate categories for the main participle and inflected forms of such - why should this not apply to comparative and superlative adjectives? — surjection ⟨?⟩ 12:17, 11 January 2019 (UTC)
- What I get out of your argument is that you think "POS xxx forms" should become "xxx POSs" when the form has its own inflections. But then what about cases like English, where comparatives don't have their own forms and are simply adjective forms? Or cases like Dutch or Swedish, where there are multiple superlative forms but their inflections are shown on the lemma? How is an editor supposed to know what the name of the category for any particular adjective form is, when some of them are named differently from others? —Rua (mew) 12:28, 11 January 2019 (UTC)
- That is indeed my argument for comparatives and superlatives due to their so far horridly inconsistent handling. In the case of English and all other languages, they will only have "comparative adjectives", no "comparative adjective forms", much like English would have "participles" that too aren't lemmas but would not have "participle forms". In cases like Dutch, Swedish and such where comparative/superlative forms are more numerous, those need to be handled on a language by language basis, ideally to choose one of the forms as the most lemma-esque (such as which form dictionaries primarily use to describe the comparative/superlative of an adjective), and if not one can be decided, it is more of a tricky situation (possibly all into "comparative/superlative adjective forms"?). Editors in turn can rely on other existing entries and eventually remember these entries much like the existing ones are, or use language-specific headword templates. Yes, the new system is by no means perfect, but I would argue it is miles better than what we had before. — surjection ⟨?⟩ 12:38, 11 January 2019 (UTC)
- But again, how is an editor of these languages supposed to know that, while adjective forms normally go in "adjective xxx forms", it is somehow different for comparative and superlative forms? You still haven't answered this. Your argument is based on sublemma-ness, but this differs per language, not all languages treat comparatives and superlatives as sublemmas. The categorisation should allow for both treatments depending on the needs of the individual language, not force a particular treatment on all languages. The fact that you think it makes sense for Finnish doesn't mean it makes sense for English. Now we have Category:English comparative adjectives for an adjective form, but Category:English noun plural forms for a noun form. How is that consistent? —Rua (mew) 12:45, 11 January 2019 (UTC)
- I did already answer that question - read the latter part of my previous response. Many a time has an editor checked an existing entry to see how something is formatted, and I doubt there would be a single editor that has never done that. Many of the languages with comparatives and superlatives set up have language-specific headword templates, and many of those too have ACCEL which can too give the correct headword category autom- oh wait, it can't anymore since someone removed that capability. — surjection ⟨?⟩ 12:49, 11 January 2019 (UTC)
- You have not answered the question. An editor cannot, based on the rule that non-lemma categories are named "adjective xxx forms", guess the correct name of the category for comparative forms, whereas they could before. Instead, there is now a single exception that comparatives are named "comparative adjectives". Where are all the other "xxx POSs" categories for non-lemmas? Again, are you proposing that all non-lemmas be renamed to match this new scheme? If not, what justifies this single exception? —Rua (mew) 12:54, 11 January 2019 (UTC)
Which question exactly have I not answered? The question was "how would an editor of these languages know the correct name for the categories?", which I have now answered not less than twice in my two previous responses. Instead, what it seems you are arguing is that the new scheme creates inconsistency in terms of the category names for non-lemma forms. Indeed, if other derivations are shown to be just like participles or comparative/superlatives, I'm happy to agree to move them under a similar scheme as well, but the possessive forms you brought up above are not an example of such. — surjection ⟨?⟩ 12:58, 11 January 2019 (UTC)
Since it seems that this is the new norm for naming categories, I have proposed to rename all existing categories to match the new naming scheme at WT:BP. —Rua (mew) 13:16, 11 January 2019 (UTC)
- @Rua Given the edits you have made to the templates and modules are still in place, are you willing to revert those yourself or are you asserting that you are overriding the consensus established by the vote? — surjection ⟨?⟩ 21:10, 11 January 2019 (UTC)
Reconcile Category:#### terms derived from the shape of letters and Category:#### terms making reference to character shapes
See also Category talk:Terms making reference to character shapes by language.
Perhaps they could be merged, or perhaps both could be kept (Japanese: characters; letters?), but the naming should be consistent, at the least. —Suzukaze-c◇◇ 11:08, 20 January 2019 (UTC)
- Merge, perhaps into Category:Terms derived from character shapes by language (a bit shorter, and inclusive of non-letter characters). - excarnateSojourner (talk | contrib) 04:50, 28 April 2022 (UTC)
2019 — February
These should be merged, I think. Per utramque cavernam 12:39, 2 February 2019 (UTC)
- Yes, IMO, into someone's blood runs cold, with hard redirects from both. DCDuring (talk) 15:43, 2 February 2019 (UTC)
- I would support a hard redirect. Imetsia (talk) 23:34, 1 August 2021 (UTC)
- Formerly entitled Category:Taxonomic eponyms
As above. —Rua (mew) 13:35, 2 February 2019 (UTC)
- As with Category:Specific epithets. DCDuring (talk) 15:41, 2 February 2019 (UTC)
- @Benwing2, Rua, DCDuring: I guess there is nothing to move here and this can be solved by an addition to module data so that we can auto-cat after adding {{cln|langcode|taxonomic eponyms}} in entries. I mean, in order to categorize the {{named-after}} stuff more specifically. Fay Freak (talk) 23:45, 7 November 2020 (UTC)
- I think all of these that are entire taxonomic names must be Translingual, by virtue of being taxonomic names. The ones that are specific epithets would have the same language code for the taxonomic eponyms as for the specific epithet. DCDuring (talk) 01:02, 8 November 2020 (UTC)
- @DCDuring: I am not exactly sure what you mean. I mean that “taxonomic eponyms” can be added to the topical data or to the etymological data (Category:Taxonomic names, the supercategory of Category:Taxonomic eponyms, resides in the former for some reason, but I devise the taxonomic eponym categories as motivated by etymological description, so the latter it should be), whereas Category:Taxonomic eponyms cannot because it cannot generally be applied onto all languages (only to Translingual and perhaps Latin words that also are epithets). @Rua mixed up different issues here, the reasoning “as above” is not comprehensible thus. Fay Freak (talk) 12:00, 9 November 2020 (UTC)
- The question then is whether Translingual appears as "Translingual" or "mul:"? I have thought that "specific epithets" is a category having to do with the usage of the term. Thus the categorization should be the result of a label or of a non-gloss definition. DCDuring (talk) 18:57, 9 November 2020 (UTC)
- "Translingual" is a junk supercategory, not comparable to our language categories, based on an attribute of the usage of some terms. The category includes CJKV characters, airport codes, other international abbreviations, symbols, and codes, some non-taxonomic scientific terms, and who-knows-what-else, as well as taxonomic names. The effort to act as if every linguistic entity in Wiktionary fits into a relatively well-defined hierarchy of language families, languages, and dialects comes a-cropper with the entities thrown into Translingual, just as the taxonomic naming system has its troubles with hybridisation and trans-taxon gene transfer (eg, from viruses or from the assimilation of prokaryotes into eukaryotes as organelles).
- Specific epithets have a function within taxonomic terms that has nothing whatsoever to do with the fact that taxonomic names are used translingually, but has everything to do with names in the taxonomic/biological "language". 'Specific epithet' is a grammatical role within certain classes of taxonomic names. DCDuring (talk) 22:02, 9 November 2020 (UTC)
- @Rua, DCDuring, Fay Freak: Heads up that I amended Module:category tree/poscatboiler/data/terms by etymology to standardize these categories and so we now have Category:Taxonomic eponyms by language. I realize this makes the deletion discussion a little more confusing, since the main category has changed, so just giving visibility to the subcats Category:Arabic taxonomic eponyms, Category:English taxonomic eponyms, and Category:Translingual taxonomic eponyms and the fact that the main category under discussion was emptied and deleted for being empty. I've put the notice on the new main category and changed this subheading. —Justin (koavf)❤T☮C☺M☯ 15:54, 13 March 2022 (UTC)
- By which of our definitions of eponym is Anna's hummingbird an eponym? DCDuring (talk) 16:05, 13 March 2022 (UTC)
Seems to be inconsistently integrated in so far as the latter in its name contains “verbs” but the former does not contain “noun”, and the latter gets categorized as Category:Lemmas subcategories by language but the former as Category:Terms by etymology subcategories by language. Outside the category structure we have Category:Taos deverbal nouns which nobody has noticed. I have no tendency towards any gestalt so far, and I can’t decide either. Furthermore somebody will have to make a complement {{denominal}} for {{deverbal}} – so far there is only an Arabic-specific {{ar-denominal verb}}. Fay Freak (talk) 18:31, 25 February 2019 (UTC)
- A lot of this is redundant to our suffix derivation categories. In many cases, the suffix used already determines what something is derived from. For example, -ness always forms deadjectival nouns, it can't really be anything else. —Rua (mew) 18:47, 25 February 2019 (UTC)
- Please see Wiktionary:Etymology_scriptorium/2018/May#основать. Per utramque cavernam 19:13, 25 February 2019 (UTC)
- True, for “a lot”, and if you know the deep intricacies of Wiktionary’s category structure.
- Category:Russian deverbals that contains now 53 entries has only entries the etymology of which consists in just removing the verb ending and using the stem. I see we have for this case Category:Russian words suffixed with -∅ – we just need to implement something like Category:Latin words suffixed with -o that is split by purpose of the suffix, Category:Latin words suffixed with -o (denominative), Category:Latin words suffixed with -o (compound verb) and so on, which is bare laudable. Now you only need to tell people, @Rua, how to create this id stuff, for to me it is a secret thus far.
- However this does not work with non-concatenative morphology thus far – you may link the previous discussions on those infix categorization matters here, but even if that pattern collecting is solved the derived terms listed at صَلِيب (ṣalīb, “cross”), for instance, would only be categorized by pattern but nothing would imply that the terms are denominal –, and the point I have made about the categorization and naming of these categories is still there. But I give you the green light in any case, if you want to replace all those “[language] deverbals” and “[language] denominal verbs” categorizations by suffixation categories of the format “[language] words suffixed with -∅ [deverbal]”, as well as if it concerns action towards categorization of nonconcatenative morphology language terms, since your idea of uniformity is correct. Fay Freak (talk) 19:49, 25 February 2019 (UTC)
- Nonconcatenative morphology is still an underexplored part of Wiktionary, which is kind of annoying. But quite often, we simply show the concatenative part as the affix, and then leave a usage note saying what other changes occur when this form of derivation is used. For example on Northern Sami -i and -hit. —Rua (mew) 20:40, 25 February 2019 (UTC)
- How to create an affix category with an id: add the id to the definition line in the affix's entry with {{senseid|language code|id}}, add {{affix|language code|affix|id1=id}} (at minimum) to the etymology section of a term that uses the affix, find the resulting red-linked category and create it with {{auto cat}}. — Eru·tuon 20:51, 25 February 2019 (UTC)
- Thanks, this is easier than I imagined, so it takes the category name from {{senseid}}. I thought it is in some background module data. Now where to document it? Add it to the documentation of {{affix}} under |idN=? This is the main or even only use of this parameter in this template, right? Fay Freak (talk) 21:18, 25 February 2019 (UTC)
- It's not that {{senseid}} has any effect on the category name, but that a category with a parenthesis after it, such as Latin words suffixed with -tus (action noun), expects a matching {{senseid}} in the entry for -tus, in this case {{senseid|la|action noun}}, because the link in the category description points to -tus#Latin-action_noun, which is the format of the anchor created by {{senseid}}. The |id= type parameters, including in {{affix}}, generally create a link of that type. In {{affix}}, the parameter also has the effect of changing the category name. Sorry, I am not sure if I am explaining this clearly. — Eru·tuon 22:36, 25 February 2019 (UTC)
- You explain this clearly. I just rolled it up from that side that I need to choose the name in {{senseid}} that I want to have in the category name so later with affix I will categorize in a reasonably named category because in other cases the id can be arbitrary – not that {{senseid}} has an effect on the category name. Fay Freak (talk) 22:53, 25 February 2019 (UTC)
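The relationship described in this exchange, where the same id yields both the anchor that the category description links to and the parenthesised part of the category name, can be sketched in a few lines of Python. This is only an illustration built from the single example quoted above (Latin -tus, id "action noun"); the function names are hypothetical, and the exact formats are assumptions read off that example, not taken from the actual module code.

```python
def sense_anchor(affix: str, language_name: str, sense_id: str) -> str:
    """Build the link target that the category description points to.

    Assumed format, inferred from the example "-tus#Latin-action_noun":
    page title, "#", language name, "-", id with spaces turned into underscores.
    """
    return f"{affix}#{language_name}-{sense_id.replace(' ', '_')}"


def suffix_category(language_name: str, affix: str, sense_id: str = "") -> str:
    """Build the category name that the affix template is said to produce.

    Assumed format, inferred from "Latin words suffixed with -tus (action noun)":
    the id, if any, is appended in parentheses.
    """
    name = f"{language_name} words suffixed with {affix}"
    if sense_id:
        name += f" ({sense_id})"
    return name


if __name__ == "__main__":
    # The id is chosen once and reused in both places.
    print(sense_anchor("-tus", "Latin", "action noun"))
    print(suffix_category("Latin", "-tus", "action noun"))
```

The point of the sketch is that the id is a free-form string picked by the editor, so choosing a readable one matters because it ends up in the category name.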
- Our affix system is not sufficient to handle the morphological derivation we have to deal with (unless you want us to introduce lambdas...). Serbo-Croatian hardly has the intricacy of Arabic conjugation, but there are plenty of nouns that are created from verbal roots through apophony, and this needs to be categorized somehow. Crom daba (talk) 17:24, 2 March 2019 (UTC)
- @Crom daba At least for Indo-European, we do have a system for handling combinations of affixation + ablaut, like on *-os (notice the parentheses showing the root grade) and -ος (-os). Our current system totally fails where there is no affix, though, a case which also exists in Indo-European. For example, there are some Indo-European forms of derivation, called "internal derivation", which are built entirely around changing ablaut grades and accents: *krótus (“strength”) > *krétus (“strong”) or τόμος (tómos, “slice”) > τομός (tomós, “sharp”). We have no systematic way to indicate this kind of derivation, but it is sorely needed. —Rua (mew) 23:42, 30 April 2019 (UTC)
2019 — March
1 member in this category, whose purpose I cannot discern and whose name seems like poor English to me. Note: "dismissal" is in Module:labels/data and should be removed from there if this fails. —Μετάknowledgediscuss/deeds 04:16, 31 March 2019 (UTC)
- Thanks, suzukaze. Accordingly moved to RFM with both cats listed; I now see what the intent is, but I still think the name is bad. —Μετάknowledgediscuss/deeds 04:24, 31 March 2019 (UTC)
- I think it fits in the same idea as Category:en:Greetings, but with a different naming scheme. "Greetings" should probably not be a set category, because sets group words by semantics (i.e. what the words refer to), rather than by function. —Rua (mew) 21:16, 7 April 2019 (UTC)
- Add Category:Punjabi dismissals. Module:labels/data specifies that [[Category:<Language name> dismissals]] be added whenever the context label "dismissal" is used, but nothing has been added for this in the relevant category data module. This apparently predates Module:labels/data, since it was migrated in with all the other usage labels in August, 2013. We should either do something with these categories or get rid of the categorization in Module:labels/data- it's silly to have things showing up in Category:Categories with invalid label just because someone added a context label. Chuck Entz (talk) 17:39, 17 October 2020 (UTC)
2019 — April
Topical and set categories group terms based on what they refer to, but this category doesn't contain terms for greetings, it contains terms that are greetings. In other words, the name of the category refers to the word itself, not to its meaning, like Category:English nouns and unlike Category:en:Colors. So the category shouldn't be named and categorised like a set category, but instead should be named Category:English greetings. It belongs somewhere in Category:English phrasebook or Category:English terms by semantic function or something like that. —Rua (mew) 21:21, 7 April 2019 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 06:40, 29 October 2021 (UTC)
As above, these terms do not refer to farewells, they are farewells: the category name pertains to the word rather than the meaning. —Rua (mew) 21:27, 7 April 2019 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 06:41, 29 October 2021 (UTC)
As above. —Rua (mew) 21:28, 7 April 2019 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 06:42, 29 October 2021 (UTC)
Again, as above. —Rua (mew) 21:28, 7 April 2019 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 06:43, 29 October 2021 (UTC)
Category:Translingual numerals or Category:Translingual numeral symbols
- Discussion moved from WT:RFDO#Category:Translingual numerals or Category:Translingual numeral symbols.
We currently have both Category:Translingual numerals and Category:Translingual numeral symbols. If there's a difference, I'm not sure what it is. If not, I'm assuming we should merge one into the other. -- Beland (talk) 21:22, 26 April 2019 (UTC)
- Numerals can be words (one, two in spelling alphabets), while numeral symbols are not (Roman numerals). The difference is subtle, but I think it is there. — surjection 18:51, 19 October 2021 (UTC)
- @Beland It sounds like this discussion would fit better at WT:RFM. Do you mind if I move it there? - excarnateSojourner (talk | contrib) 22:55, 5 November 2022 (UTC)
- I've moved the discussion. — excarnateSojourner (talk · contrib) 19:09, 16 November 2022 (UTC)
Wiktionary:English entry guidelines vs "About (language)" in every other language
Some years ago, there was an RFM to rename all these pages, the discussion of which is archived at Wiktionary talk:English entry guidelines#RFM discussion: November 2015–August 2018. The original nomination mentions "and likewise for other languages", meaning that the intent was to rename these pages in parallel for every language. In the end, only the English page was moved, so that now the English page has a name different from all the others. User:Sgconlaw suggested starting a new discussion instead of moving the pages after the RFM has long been closed.
My own opinion on this is to rename the pages in other languages to match the English one. That was the original intent of the first RFM, and the new name better describes what these pages are for. The name "about" instead suggests something like a Wikipedia page where you can write any interesting fact about the language, which is of course not what they're actually for. Some discussion may be needed regarding the shortcuts of all these pages. They currently follow the format of WT:A(language code), so e.g. WT:AEN but also WT:ACEL-BRY with hyphens in the name. The original shortcuts should probably be kept, at least for a while, but we may want to think of something to match the new page name as well. —Rua (mew) 13:00, 29 April 2019 (UTC)
- Support. —Μετάknowledgediscuss/deeds 18:17, 29 April 2019 (UTC)
- Support renaming for accuracy and consistency. —Ultimateria (talk) 22:32, 14 May 2019 (UTC)
- Support – Jberkel 23:53, 14 May 2019 (UTC)
- @Benwing2, would you be interested in executing this proposal? —Μετάknowledgediscuss/deeds 06:21, 30 March 2020 (UTC)
- @Benwing2, checking in again, as it seems to be causing confusion, judging from a thread lower down on this page. —Μετάknowledgediscuss/deeds 00:44, 29 March 2021 (UTC)
- @Metaknowledge Apologies, I missed this from a year ago. I'll go ahead and rename. Benwing2 (talk) 00:51, 29 March 2021 (UTC)
- @Metaknowledge FYI, this may take a little while. Lots of these pages have redirects to them and MediaWiki doesn't handle double redirects, so I have to find all the links to these pages (at least, those in redirects) and fix them. Benwing2 (talk) 01:19, 29 March 2021 (UTC)
- @Benwing2: You mean you have to fix the redirects themselves, right? I hope that we can continue to use the WT:AFOO redirects even after the moves are complete. —Μετάknowledgediscuss/deeds 01:28, 29 March 2021 (UTC)
- @Metaknowledge Yes, the redirects need to be fixed to point to the new pages. Benwing2 (talk) 01:31, 29 March 2021 (UTC)
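The redirect clean-up discussed here can be sketched as follows. This is a minimal illustration in Python over plain dictionaries, not a bot script: the function names are hypothetical, no wiki API is called, and the page and shortcut titles are illustrative examples following the WT:A(language code) shortcut pattern and the proposed "(language) entry guidelines" names. After a move, the old title becomes a redirect to the new one, so any shortcut that still points at the old title is a double redirect and has to be retargeted.

```python
def find_double_redirects(redirects: dict[str, str]) -> list[str]:
    """Return redirect pages whose target is itself a redirect."""
    return sorted(src for src, target in redirects.items() if target in redirects)


def retarget_redirects(redirects: dict[str, str], moves: dict[str, str]) -> dict[str, str]:
    """Point every redirect at the new title if its target was moved."""
    return {src: moves.get(target, target) for src, target in redirects.items()}


if __name__ == "__main__":
    # Illustrative data only: one page move and one shortcut to the old title.
    moves = {"Wiktionary:About French": "Wiktionary:French entry guidelines"}
    shortcuts = {"WT:AFR": "Wiktionary:About French"}

    # After the move, the old title is left behind as a redirect to the new one.
    after_move = {**shortcuts, **moves}
    print(find_double_redirects(after_move))   # shortcuts that need fixing

    fixed = retarget_redirects(after_move, moves)
    print(find_double_redirects(fixed))        # nothing left to fix
```

Keeping the shortcut itself and changing only its target matches the wish expressed above that the WT:AFOO redirects keep working after the moves.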
- @Metaknowledge One more thing: Some 'About' pages aren't just "About LANG". What should we rename the following?
- WT:About Algonquian languages: Does WT:Algonquian languages entry guidelines work, or should it just be WT:Algonquian entry guidelines?
- WT:About sign languages: Should it be WT:Sign languages entry guidelines, WT:Sign language entry guidelines, or something else?
- WT:About Arabic/Egyptian, WT:About Arabic/Moroccan, WT:About Chinese/Cantonese, WT:About Chinese/Cantonese/Taishanese, WT:About Chinese/Gan, WT:About Chinese/Hakka, WT:About Chinese/Jin, ... (other Chinese varieties), WT:About Lingala/Old: Does WT:Arabic/Egyptian entry guidelines, WT:Chinese/Cantonese/Taishanese entry guidelines, etc. work, or should we normalize to e.g. WT:Egyptian Arabic entry guidelines, WT:Cantonese entry guidelines, WT:Gan entry guidelines (or WT:Gan Chinese entry guidelines?), WT:Hakka entry guidelines (or WT:Hakka Chinese entry guidelines?), WT:Old Lingala entry guidelines, etc.? Cf. also Wiktionary:About Contemporary Arabic.
- Other subpages: Wiktionary:About Chinese/phonetic series, Wiktionary:About Chinese/phonetic series 2, Wiktionary:About Chinese/references, Wiktionary:About Chinese/tasks, Wiktionary:About French/Todo, Wiktionary:About German/Todo, Wiktionary:About German/Todo/missing a-d (and others), Wiktionary:About Greek/Glossary, Wiktionary:About Greek/Draft new About Greek, Wiktionary:About Hungarian/Participles, Wiktionary:About Hungarian/Todo, Wiktionary:About Japanese/Etymology, Wiktionary:About Korean/Romanization, Wiktionary:About Korean/references, Wiktionary:About Korean/Historical forms, Wiktionary:About Norwegian/Layout1, Wiktionary:About Norwegian/Layout2, Wiktionary:About Norwegian/Layout3, Wiktionary:About Spanish/Todo (probably completely outdated), Wiktionary:About Spanish/Todo/missing a-d (and others), Wiktionary:About Swahili/missing a-z, Wiktionary:About Tibetan/references, Wiktionary:About Vietnamese/references
- Wiktionary:About Japanese-English bilingual: What about this?
- Wiktionary:About Han script, Wiktionary:About Hangul script: Does WT:Han script entry guidelines work, or should it just be Wiktionary:Han script guidelines or something else?
- Wiktionary:About International Phonetic Alphabet, Wiktionary:About given names and surnames, Wiktionary:About undetermined languages: Not languages.
Benwing2 (talk) 01:57, 29 March 2021 (UTC)
- @Benwing2: 1. I don't think we need the word "languages". 2. The second option sounds more grammatically correct. 3 & 4. I would go with subpages, but you may want to hold off on those, as some of the pages are heavily used and links to them will have to be fixed. Opinions solicited: @Justinrleung, suzukaze-c, Atitarev, Tibidibi 5. It should be moved somewhere very inconspicuous; we could even delete it and nobody would miss it. 6. I guess the former? 7. The first one is now fine, the second can stay where it is, and the third seems somewhat useless (but @-sche may have an opinion). —Μετάknowledgediscuss/deeds 02:29, 29 March 2021 (UTC)
- No opinion, although I am of the belief that many of our WT:<CJK> pages should be in the Appendix instead. —Suzukaze-c (talk) 04:33, 29 March 2021 (UTC)
- I would probably like WT:Chinese entry guidelines/Cantonese, WT:Chinese entry guidelines/Gan, etc. for the ones in 3 so that they are still treated as subpages of WT:Chinese entry guidelines. — justin(r)leung { (t...) | c=› } 05:57, 29 March 2021 (UTC)
- I agree with Justinrleung WRT the Korean pages as well.--Tibidibi (talk) 01:51, 31 March 2021 (UTC)
- I think there's nothing on Wiktionary:About Algonquian languages that requires that page to exist, anyway, and am just going to make it a hard redirect to the About Proto-Alg. page instead of the soft redirect which is currently its entire contents, keeping the old edit history and old talk page comments. - -sche (discuss) 18:58, 26 July 2021 (UTC)
Note: There is another open discussion below on this exact topic. - excarnateSojourner (talk | contrib) 23:58, 21 October 2022 (UTC)
from Wiktionary:English entry guidelines to Wiktionary:About English (currently only a redirect)
Reason: to align it with all other WT:About LANGUAGE pages, such as:
--幽霊四 (talk) 18:44, 6 February 2021 (UTC)
- See “Wiktionary talk:English entry guidelines#RFM discussion: November 2015–August 2018”. — SGconlaw (talk) 21:33, 6 February 2021 (UTC)
- @Sgconlaw, Rua: Partial closure of the RFM was clearly not the best solution. Someone with a bot should move all of these and update the redirects. Rua, would you be willing to do that? —Μετάknowledgediscuss/deeds 23:04, 8 March 2021 (UTC)
- @Benwing2, would you be interested in helping out with this mess? —Μετάknowledgediscuss/deeds 23:18, 21 July 2021 (UTC)
- @Metaknowledge I looked into this awhile ago and never finished it, sorry, because of various complexities. I will try to look into this soon. Benwing2 (talk) 02:24, 24 July 2021 (UTC)
- @Benwing2 Any update on this? - excarnateSojourner (talk | contrib) 18:01, 27 April 2022 (UTC)
You know a discussion page has become too large and stale when there are two open discussions on the exact same topic. - excarnateSojourner (talk | contrib) 00:02, 22 October 2022 (UTC)
- I've moved this duplicate discussion to be under the same L2 heading as the original. — excarnateSojourner (talk · contrib) 07:41, 25 February 2023 (UTC)
2019 — May
toponyms
I think the categories for toponyms (e.g. English terms derived from toponyms) should be moved to a category just called [language] toponyms (e.g. English toponyms). It feels inconsistent to have English terms derived from toponyms while also having English eponyms. —Globins (yo) 01:14, 6 May 2019 (UTC)
- A term derived from a toponym is an eponym, but is not a toponym itself. So the current names make sense. —Rua (mew) 11:45, 9 May 2019 (UTC)
- Sense 2 for toponym is "a word derived from the name of a place," and the entry mentions eponym as a coordinate term. —Globins (yo) 00:04, 10 May 2019 (UTC)
- @Globins Wiktionary's category structure only follows the first definition, which is the more common meaning. We shouldn't mix up the two definitions. —Rua (mew) 17:52, 13 May 2019 (UTC)
- @Rua: In that case, English eponyms should be moved to English terms derived from eponyms since our current category name follows the less common definition of eponym. —Globins (yo) 21:16, 14 May 2019 (UTC)
- Not really. An eponym is derived from a name. A toponym is a name. So a term derived from a toponym is derived from a name, but a term derived from an eponym is derived from another word that is then derived from a name. They're not equivalent. —Rua (mew) 21:18, 14 May 2019 (UTC)
- I think "eponymic terms" would be better if you want to preserve the "name that a term is derived from" sense of eponym (as opposed to the "term derived from a name" sense). "Terms derived from eponyms" seems odd, maybe tautological, to me because a name is not inherently an eponym, but only when we are discussing the fact that a term is derived from it. — Eru·tuon 21:35, 14 May 2019 (UTC)
- @Globins Do you have any response to Rua or Erutuon? It would be nice to mark this discussion as resolved if it isn't going to go anywhere. - excarnateSojourner (talk | contrib) 03:58, 19 October 2022 (UTC)
- @ExcarnateSojourner I think I agree with Erutuon's category name suggestion then. —Globins (yo) 17:56, 19 October 2022 (UTC)
Category:Thesaurus into various members of Category:Thesaurus entries by language
Right now, this contains a mess of various languages, mostly English but also some others. Thesaurus pages have distinct level 2 language sections just like regular entries do. Thus, we should treat thesaurus pages like we do regular entries and split them by language. The topical categories pose a problem, as they also need to be language-specific, but there is no naming scheme for them yet and no category tree. I propose moving their contents into our regular topical and set categories, to avoid creating a parallel tree of topical thesaurus categories. —Rua (mew) 17:48, 13 May 2019 (UTC)
- I agree that lumping them into one category is messy, and leads users to take categorization into their own hands. See Category:Danish thesaurus entries, which contains pages with 3 distinct naming schemes! I support putting Thesaurus entries into the existing topical categories. Some Thesaurus categories already are, e.g. Category:Thesaurus:Mathematics. Ultimateria (talk) 17:57, 20 May 2019 (UTC)
- Support Allahverdi Verdizade (talk) 21:38, 23 May 2021 (UTC)
I would suggest going into, for example, a Polish entry, removing “Category:Thesaurus” and leaving only “Category:Polish thesaurus entries”. Would that be OK? Ffffrr (talk) 11:22, 25 October 2021 (UTC)
This was previously submitted to deletion, but kept (why it wasn't RFMed instead I don't know). —Rua (mew) 18:46, 19 May 2019 (UTC)
- Support. DonnanZ (talk) 18:50, 19 May 2019 (UTC)
- Comment: perhaps "Category:Automobile parts" would be better, to match the parent category "Category:Automobiles". — SGconlaw (talk) 19:02, 20 May 2019 (UTC)
- I'm ok with that too. @DonnanZ? —Rua (mew) 09:39, 24 May 2019 (UTC)
- @Rua, Sgconlaw: The word "automobile" is not common in British English, but I think "car" is used everywhere, hence my preference for Category:Car parts. DonnanZ (talk) 09:22, 3 June 2019 (UTC)
- But car is more ambiguous. DCDuring (talk) 10:28, 3 June 2019 (UTC)
- I don't mind one way or another, but the whole category tree then needs to be renamed for consistency. (@Donnanz: how is car ambiguous? Do you mean it could be confused for, say, a train carriage or something?) — SGconlaw (talk) 10:34, 3 June 2019 (UTC)
- Well, car is used especially in US English for a railroad car (either freight or passenger), and can be used in BrE for a railway passenger carriage. I feel the word auto can be ambiguous as well; "auto parts" can be used in the UK, but "car parts" is preferred. The word "auto" isn't used for a motor car in the UK. There is another category, Category:Automotive, so Category:Automotive parts may be a solution. DonnanZ (talk) 13:52, 3 June 2019 (UTC)
- I was employed in the motor trade for many years, supplying car parts of all descriptions, even body shells on one or two occasions. DonnanZ (talk) 14:23, 3 June 2019 (UTC)
- In that case it seems to me that "Category:Automobile parts" is least ambiguous. I'm not sure "Category:Automotive" is well named (why an adjective?); "Category:Road transport" would be better. — SGconlaw (talk) 15:15, 3 June 2019 (UTC)
- Category:Nautical also uses an adjective, and there may be others. DonnanZ (talk) 15:46, 3 June 2019 (UTC)
- Yeah, not too hot on that one either. My suggestion would be "Category:Water transport". — SGconlaw (talk) 15:58, 3 June 2019 (UTC)
- As long as I can type {{lb|en|car part(s)}} and get the topical category Category:en:Automobile parts, my increasingly arthritic fingers would be happy. DCDuring (talk) 16:50, 3 June 2019 (UTC)
- I can only sympathise. Depending on the outcome here, if you feel like fiddling around with modules I think Module:category tree/topic cat/data/Technology is the right one. DonnanZ (talk) 11:36, 4 June 2019 (UTC)
2019 — June
As has been pointed out here, "have" isn't part of the term. Chuck Entz (talk) 12:24, 25 June 2019 (UTC)
- As I see it, have isn't part of the metaphor, but it is part of an expression that is not in turn a form of tie someone's hands. The passive (one's) hands are/were/being/been tied are such forms, though none make for a good lemma entry or likely searches. DCDuring (talk) 13:38, 25 June 2019 (UTC)
- @DCDuring: Thanks. Also the second meaning of tied: restricted (which even offers the quotation: but the county claims its hands are too tied) --Backinstadiums (talk) 14:25, 25 June 2019 (UTC)
- It's still a metaphor: a county doesn't directly have hands. DCDuring (talk) 17:39, 25 June 2019 (UTC)
- For an example of tie someone’s hands being used in the active voice: “It will tie our hands for another nine years with respect to a labor contact [sic] with no layoff clauses and raises that are built in.”
- In general, for any expression of form “⟨VERB⟩ someone’s ⟨NOUN⟩”, there is a corresponding expression “have/get one’s ⟨NOUN⟩ ⟨VERBed⟩”. For example, cut someone’s hair → have one’s hair cut. Or knock someone’s socks off → get one’s socks knocked off. Or lower someone’s ears → have one’s ears lowered. If the expression is an idiom, sometimes we have one, sometimes the other, and sometimes both. --Lambiam 21:24, 25 June 2019 (UTC)
- Indeed. Unless the active form is very uncommon, I'd prefer it as the lemma. I don't think that we would be wrong to have both the active-voice expression and the have and/or get expressions, even though we could argue that it is a matter of grammar that one can transform certain expressions in the way Lambiam describes. DCDuring (talk) 22:31, 25 June 2019 (UTC)
Request to merge Haitian Vodoun Culture language [hvc] into Haitian Creole language [ht]
According to Wikipedia and Ethnologue, hvc "appears to not be an actual language, but rather an assortment of words, songs, and incantations – some secret – from various languages once used in Haitian Vodoun ceremonies". Our only entries for it are Langaj and Langay, i.e. the two forms of the lect's name for itself. I suggest we consider it a variety of ht instead. Thoughts? Pinging @EncycloPetey as the creator of the two entries, although he hasn't been around for over a month. —Mahāgaja · talk 12:00, 28 June 2019 (UTC)
- Although Ethnologue says it is "probably not a separate language", it does not say which language it might belong to. So merging it into another language would be original research, unless a source documents its inclusion in Haitian Creole. Nota bene: at the time I created the entries, neither WP nor Ethnologue expressed doubts about the distinctness of Haitian Vodoun Culture language. I am aware of its current doubtfulness, but it could also be considered a liturgical language in its own right. Without some authoritative statement, I'd hesitate to merge it into another language. There is more than one language spoken in Haiti. --EncycloPetey (talk) 23:28, 28 June 2019 (UTC)
- Support. Hebblethwaite says in the excellent Vodou Songs in Haitian Creole and English that this is not only not a language, but that "[t]he words and chunks have mostly become incomprehensible to Vodouists and have a ritual or mystical purpose." Apparently the langaj used with different loa can be from entirely different languages from different parts of Africa. The entries we have are indisputably Haitian Creole (and which I think should not be capitalised according to orthographic rules). —Μετάknowledgediscuss/deeds 04:44, 31 March 2020 (UTC)
- I can find little about this apart from the references mentioned already. The Encyclopedia of Language and Linguistics says it is "used as a second language only", as does Toyin Falola, Niyi Afolabi, and Adérónké Adésolá Adésànyà's Migrations and Creative Expressions in Africa... (2008), which says on page 157 "Langay, also referred to by linguists as “Haitian Vodoun Culture Language” (Gordon 2005), is used in Haiti as a second language for religion, song, and dance. Although it clearly has some Haitian Creole words, it is assumed that some of its vocabulary may be African.", as if they aren't even familiar enough with it to be sure what its vocabulary is. Interestingly, Jeffrey E. Anderson's article on it in The Voodoo Encyclopedia (2015), says that some pieces of it are attested from speakers/songs in the Mississippi Valley and its Voodoo tradition, outside Haiti / Vodou, which suggests the ISO's awkward FYROM-esque designation is, well, awkward. (Anderson does caution that "most [records] show little sign of langaj apart from a few words and some personal names of spirits", and "the origin of those that do appear to incorporate langaj is often unclear; the tendency of early authors to uncritically assume that Haitian Vodou and Mississippi Valley Voodoo were essentially the same thing renders it possible that some songs reportedly belonging to the Mississippi Valley may actually have been Haitian".) From what little I could find, it seems like a set of vocabulary (rather than a language per se) that might be compared to e.g. pandanus-avoidance vocabulary or Polari. (Procedurally, it will be subject to the same attestation requirements either way, and can be labelled and categorized.) - -sche (discuss) 07:54, 2 August 2020 (UTC)
2019 — August
devil a ...
I request that someone renames the entry devil a bit to devil a, because in practice "devil a" can be followed by any noun phrase, and it is neither practical nor desirable to have a headword for every combination. I then would also add divil a as a separate headword with no more content than referring it to "devil a" as a dialect variant. JonRichfield (talk) 08:13, 18 August 2019 (UTC)
We say ourselves in the entry for oxymoron that its use to mean "contradiction in terms" is loose and sometimes proscribed (despite the fact that many people use it this way nowadays). We say much the same thing at contradiction in terms as well.
The so-called oxymorons in this category are all or almost all contradictions in terms, where the contradiction is accidental or comes about only by interpreting the component words in a different way from their actual meanings in the phrase. An oxymoron in the strict sense has an intentional contradiction. I think we should be more precise about this, in the same way as we already are with using the term "blend" instead of "portmanteau", which has a narrower meaning. I therefore suggest we move this page to "Category:English contradictions in terms" (but see my second comment below). Likewise for any corresponding categories for other languages. — Paul G (talk) 06:51, 25 August 2019 (UTC)
On second thoughts, I think this category should be retained but restricted to true oxymorons, such as "bittersweet" and "deafening silence". Ones such as "man-child" and "pianoforte" are not intended to be oxymoronic and are only accidentally contradictions in terms. — Paul G (talk) 17:18, 26 August 2019 (UTC)
- Support. Andrew Sheedy (talk) 01:30, 27 January 2020 (UTC)
2019 — September
Church Slavonic from Old Church Slavonic
Discussion started at Wiktionary:Beer_parlour/2019/September#I_want_to_add_Church_Slavonic_terms. A new language code for a newer version of Church Slavonic? --Anatoli T. (обсудить/вклад) 12:01, 16 September 2019 (UTC)
Three L2 sections for New World languages have diacritics that would seem to require that they be moved. I have no familiarity with these languages. DCDuring (talk) 04:24, 23 September 2019 (UTC)
- If the diacritics are like macrons in Latin, only aiding in pronouncing the term or the like, they may be fine where they are. I don't have time at this moment to check. - -sche (discuss) 15:38, 24 October 2019 (UTC)
- OK. Thanks. DCDuring (talk) 16:44, 9 December 2022 (UTC)
- Possibly relevant: the names of the languages, Purukotó, Sapará and Wayumará, all have the same diacritic on the word-final syllable as the word "tuná" itself. These are all poorly attested Cariban languages in Spanish-speaking countries with names that don't follow Spanish orthographic rules. Chuck Entz (talk) 19:28, 9 December 2022 (UTC)
- -sche can decide this, if it gets gotten around to. I've removed the RfM tag. DCDuring (talk) 20:48, 9 December 2022 (UTC)
2019 — October
Canonical name of "fan"
Our canonical name for fan is "Fang (Guinea)", which is unfortunate since it isn't spoken in Guinea. It's spoken primarily in Gabon and Equatorial Guinea. I'd recommend calling it "Fang (Gabon)" since there seem to be more speakers in Gabon than in Eq.G. and since the name of Gabon is shorter. —Mahāgaja · talk 11:53, 17 October 2019 (UTC) I'd recommend calling it "Fang (Equatorial Guinea)" since according to Ethnologue there are more speakers in that country than in Gabon. —Mahāgaja · talk 12:09, 22 October 2019 (UTC)
- The fact that there are even some speakers in Cameroon (according to Wikipedia) but it's not the same as "Fang (Cameroon)" is icing on the confusion-cake... and they're both Bantoid languages, so disambiguating by family doesn't help, and both spoken in Central Africa, so we can't disambiguate by mere region as we sometimes do. - -sche (discuss) 16:00, 24 October 2019 (UTC)
- @-sche: They are different branches of Bantoid, though. We could call [fak] "Fang (Beboid)" and [fan] "Fang (Bantu)". —Mahāgaja · talk 22:04, 24 October 2019 (UTC)
- Ah, true; I saw that Wikipedia classified Fang language (Cameroon) as "Western Beboid" but caveated that it was not necessarily a valid family, but if Beboid overall is valid, then that works and is clearer (IMO) than picking just one of the countries to list. I would support renaming them in that way. The only other languages that come to mind which are disambiguated by family are "Austronesian Mor" and "Papuan Mor" (both spoken in West Papua), which are mentioned on WT:LANG; probably we should change those to "Mor (Austronesian)" and "Mor (Papuan)" for consistency. - -sche (discuss) 18:00, 25 October 2019 (UTC)
- Moved to Fang (Bantu) language and Fang (Beboid) language. —Mahāgaja · talk 18:46, 1 November 2019 (UTC)
- I was going to rename Austronesian and Papuan Mor to use the 'parenthetical, postpositive' naming format for consistency with the Fangs and with how languages with country-name disambiguators are named, but I notice we also have e.g. Austronesian and Papuan Gimi, Austronesian and Sepik Mari (and some other Maris), and several other such languages, and I don't have time to rename all of those, so consistency will have to wait. (There is also "Sepik Iwam", but it appears to actually get called that, to distinguish it from the other Sepik language called Iwam which is spoken in the same place and belongs to the same Iwam subfamily of Upper Sepik. Confusing!) - -sche (discuss) 08:41, 28 November 2019 (UTC)
- The reason I haven't archived this is that several other languages, mentioned above, still need to be renamed to fit the format of the other languages. I or someone else just need(s) to find the time... - -sche (discuss) 01:48, 6 August 2020 (UTC)
2019 — December
Can 為止 be used without 到? —Suzukaze-c◇◇ 07:21, 31 December 2019 (UTC)
- @Suzukaze-c: In modern varieties, basically no, but there are some fossilized constructions like 迄今為止. — justin(r)leung { (t...) | c=› } 08:40, 31 December 2019 (UTC)
- Hm, then I suppose having both is fine (unless someone thinks 到……為止 should go). 為止 might benefit from usage notes. —Suzukaze-c◇◇ 09:51, 31 December 2019 (UTC)
2020 — January
'Cities in Foo' and 'Towns in Foo'
@Donnanz, Fay Freak, Rua I'm not sure what the real difference is between a city and a town, and I suspect most people don't know either. For this reason I think we should maybe merge the two into a single 'Cities and towns in Foo' category. Benwing2 (talk) 03:54, 17 January 2020 (UTC)
- I oppose this merger. I would not think to look for a category with such an unintuitive name, and I do not know of any examples where this is problematic. Wikipedia seems to be able to choose which word to use without trouble, so why can't we? —Μετάknowledgediscuss/deeds 05:36, 17 January 2020 (UTC)
- Eliminating one of them is a good idea where there is no meaningful distinction between cities and towns. But that's going to be a country-specific decision: England makes the distinction, the Netherlands does not. I think in cases without a distinction, we should keep "cities" and eliminate "towns". —Rua (mew) 10:15, 17 January 2020 (UTC)
- I wouldn't recommend merging them. It's a complex subject though, and the rules defining cities and towns can differ from country to country, and from state to state in the USA; I have come across "cities" with a population of less than 1,000 in the USA, sometimes around 50, but apparently they have that status. Cities in the UK have that status as granted by a monarch, towns can be harder to define in metropolitan areas, and villages can call themselves towns if they have a town council. Some villages large enough to be towns prefer to keep the village title. DonnanZ (talk) 10:34, 17 January 2020 (UTC)
- The odds that editors will accurately/consistently distinguish these categories when adding (the templates that generate) them ... seem low. However, even if the categories are merged, that problem will remain on the level of the displayed definitions. And, apparently some users above want to keep them distinct. So, meh. - -sche (discuss) 05:36, 18 January 2020 (UTC)
- I can see arguments for both sides, actually. The idea needs a lot more thought, as you would probably have to drag in villages etc. as well. DonnanZ (talk) 14:15, 19 January 2020 (UTC)
- Could merge them into Municipalities in Foo and have the various alternatives point to that category. Of course there are some "cities" which contain several municipalities, but I don't think there is a word which comprises every form of village/town/hamlet/city/urban area. - TheDaveRoss 12:47, 21 January 2020 (UTC)
- In New York State alone, we have cities, towns, villages (which are subdivisions of towns), and unincorporated places, all of which exist within counties, except NY City, which is coextensive with 5 counties, each of which is coextensive with a borough of the City. The identities and borders of these places in NYS are generally fairly stable, though subject to occasional revision. Legislative and judicial districts are separate, with legislative districts changing after each decennial census. Census-designated places form a parallel structure with relationships to the state systems. The census system has the virtue of being uniform for the entire US, but the borders of many census places do not necessarily correspond to the borders of larger governmental units such as states and counties. Within New York State there are lists of each type of jurisdiction. In principle each US state has its own names for classes of jurisdictions. Finally, in popular practice, place names for inhabited places can differ from the names of governmental units and tend to have different boundaries even when the names are the same.
- In light of the lack of homogeneity even within the US, let alone between countries, I think we need to respect national and state and provincial naming systems. If there is a worldwide system for categorizing places, we could also follow that, but I have not heard of such a system. Does the EU have some uniform system?
- In the absence of any generally accepted uniform universal or near-universal system for categorizing places, I think we need to accept the fact that nations and semi-sovereign parts of nations (eg, US states, Canadian provinces) each have their own naming systems, which are accepted within their boundaries. I think it would be foolish for us to attempt to have our own system for categorizing places and derelict for us to fail to use the various national and subnational categories.
- If the categories then don't lend themselves to a uniform universal categorization system, too bad. DCDuring (talk) 17:55, 20 September 2020 (UTC)
This has both a Jamaican Creole section and an English section labeled as "Jamaican". I undid the removal of the English section by an IP before I realized what they were doing, but I don't want to revert myself and take it off the radar before someone else has a look at it to make sure the English section really is unnecessary. Chuck Entz (talk) 04:39, 23 January 2020 (UTC)
- @Dentonius Any thoughts? Vox Sciurorum (talk) 00:12, 29 November 2020 (UTC)
- A few of the quotes look like Patwa. But the others suggest that the word is used and generally understood by English speakers. Our -claat words are popular in Hip Hop music and are often used by African Americans (among others). I'd keep the English section. — Dentonius 08:03, 29 November 2020 (UTC)
- I testify that it is correct to have an English section because of widespread use in Multicultural London English, and the pronunciation section is correct. This is also true for bloodclaat and bumboclaat. I.e. while “normal” Britons say bloody many of them new English say bloodclaat instead, and raasclaat. Fay Freak (talk) 11:07, 1 December 2020 (UTC)
2020 — February
Merge with out on a limb. Canonicalization (talk) 11:25, 9 February 2020 (UTC)
- Support DCDuring (talk) 13:57, 9 February 2020 (UTC)
- Nah not merged. GreyishWorm (talk) 23:56, 9 November 2022 (UTC)
- GreyishWorm (Wonderfool, now blocked) closed this without consensus or an explanation, so I've reopened it. — excarnateSojourner (talk · contrib) 07:27, 25 February 2023 (UTC)
Probably should be merged ~somehow~, or at least refer to each other. —Suzukaze-c◇◇ 03:30, 15 February 2020 (UTC)
- There is a slight loss of precision in the synonymy from merging, but I have made funny paper the lemma, without (yet) deleting funny papers, which should, IMO, be just the plural of funny paper. If someone thinks the possibility of lost precision outweighs the more straightforward presentation for normal users, please discuss. DCDuring (talk) 14:53, 15 February 2020 (UTC)
- @DCDuring Our current definitions (which match my understanding) imply that the two are synonymous, rather than funny papers being a plural of funny paper. So I think it makes sense to list funny paper (rarer) as a synonym of funny papers. - excarnateSojourner (talk | contrib) 20:33, 13 April 2022 (UTC)
- I agree with @ExcarnateSojourner – my impression is that funny papers is more common, though feel free to do a search and see. — Sgconlaw (talk) 19:19, 21 April 2023 (UTC)
L.S., LS, lectori salutem, locus sigilli
The only related Wiktionary entry that now appears is that of L.S.; no entries appear for the terms being abbreviated (!), and search of the full terms does not even currently link to the abbreviation/initialism page. (Nor does a search of LS bring one to a disambiguation of L.S. and LS!) All of these issues can be rectified easily by any registered editor with a reasonable understanding of disambiguation and markup (e.g., through creation of disambiguating tags and pages, and through duplication of relevant content for new definitions pages based on the abbreviation page).
Note, as an academic, I will not regularly or traceably work on Wiktionary, because of its lack of sourcing requirements for entry and note content. This leaves it with no basis for veracity, its persistent state, and a poor state indeed. (This weakness is more significant than that of Wikipedia, which is weak in largest part for its failure to adhere to its own rules and guidelines regarding sourcing.) Cheers. 2601:246:C700:19D:49BF:AECD:6AA6:2E34 16:26, 25 February 2020 (UTC)
- I just don't understand what you're after. I reverted your edit because it radically changed what seemed to be an ok entry. We don't have disambiguation pages on Wiktionary and I suggest you read up on our rules and guidelines before you start deleting info again. --Robbie SWE (talk) 18:11, 25 February 2020 (UTC)
2020 — March
We have four definitions here and probably ought to have one; compare centripetal force, with its one, simple def. —Μετάknowledgediscuss/deeds 15:36, 12 March 2020 (UTC)
- See w:History of centrifugal and centripetal forces. The article just lacks cites, dates, etc to support the historical definitions. I suppose that we should just give up on trying to cover the historical definitions and leave that to our betters at WP. DCDuring (talk) 17:55, 12 March 2020 (UTC)
Attested only in West Germanic, so it should be moved to Reconstruction:Proto-West Germanic/smalt. I did that already, but Rua reverted me, so I'm bringing it here for discussion. —Mahāgaja · talk 20:25, 16 March 2020 (UTC)
- It cannot have been formed in Proto-West Germanic, as there was no productive means to do so. Therefore, it must have existed in Proto-Germanic. —Rua (mew) 20:26, 16 March 2020 (UTC)
- Ablaut was still very productive in Proto-West Germanic, as it is still today in Germanic languages. One might claim there is no productive means to form dove as the past tense of dive or snuck as the past tense of sneak in modern English, and yet they exist. —Mahāgaja · talk 20:31, 16 March 2020 (UTC)
- I'm not convinced that ablaut was productive even in the most recent stage of Proto-Germanic, let alone Proto-West Germanic. What evidence do we have that it was? —Rua (mew) 20:39, 16 March 2020 (UTC)
- Well, if the fact that it's productive in the modern Germanic languages doesn't convince you, I don't know what will. —Mahāgaja · talk 20:59, 16 March 2020 (UTC)
- In what way is it productive in the modern languages? —Rua (mew) 10:16, 17 March 2020 (UTC)
- I just said: we still use it in modern English to form past tenses. And often in ways that don't even have parallels, so it can't be simply analogy. Dive/dove can be formed from drive/drove, but sneak/snuck and drag/drug don't have direct parallels that allow us to call them simple analogy, because there aren't any other verbs in /iːk/ → /ʌk/ or /æɡ/ → /ʌɡ/, so the only way speakers can have created them is by knowing that the language has a general process of ablaut. And even in Proto-Germanic *smultą/*smaltą isn't exactly a productive pattern: PG didn't generally create exact synonyms of nouns by changing their ablaut grade without any other affixation. So this derivation is just as irregular in PG as it is in PWG, so why not call it PWG since it doesn't exist outside of West Germanic? —Mahāgaja · talk 11:37, 17 March 2020 (UTC)
- @Mahagaja: English drag derives from a strong verb, and was influenced by Old Norse draga with its indicative past "dró-". English sneak could also be derived from a strong verb but why it has snuck is beyond me, possibly by analogy. dove comes from a strong verb Proto-Germanic *dūbaną and dive from *dūbijaną. English drive has as its past participle "drove, drave, driv" with driv being the original, drove possibly from *draib and drave possibly from before Middle English? "draib" (PG) -> "drāf" (OE) -> "drove" (E). None of this points towards productivity of ablaut or of the -an suffix but that English can reshape strong verbs by merging weak verbs or reshaping their pattern through analogy. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 21:13, 20 March 2020 (UTC)
- Consider also yeet, which is a running joke among young people online. They've decided that the past tense form is yote, which, given that it's completely made up, has no historical process to explain it whatsoever. That means that whoever made this up was aware of ablaut and (sort of) how it works.
- Besides, this wouldn't be the first case of a form appearing to be in complete violation of all the rules of historical linguistics. It's always been a matter of probability, with the occasional exception proving the rule. Those poor early Germanic people didn't have access to the Neo-Grammarian literature, so they can be excused for getting it wrong now and then... Chuck Entz (talk) 06:13, 21 March 2020 (UTC)
- I agree ablaut is still (marginally / semi)productive in modern English (this, page 59 from the 1973 Meeting Handbook of the Linguistic Society of America, says "It is also reasonably clear that semiproductive processes like English ablaut are the subject of general rules. For example, new items like snuck, dove, and drug attest to the viability of the ablaut M - rules."). I lack enthusiasm for figuring out whether this is PWGmc or PGmc. - -sche (discuss) 04:30, 18 August 2020 (UTC)
- FWIW this noun is possibly the source of Finnish malto (“flesh (of fruit)”), which would require a NWG dating at least. Unfortunately not assured, though… there is also a partly homonymous Proto-Northern Finnic *malto (“soft”) with a different Germanic etymology proposed (currently given at malto). In fact this even suggests to me that *smalt might be, instead of a regular derivative of any kind, just a meld of the competing *smultą with the adjective *maltaz, which both can be reconstructed at least for NWG. (The latter's reflexes include ON maltr > maltur and OHG malz.) --Tropylium (talk) 17:31, 1 January 2021 (UTC)
Retiring Moroccan Amazigh [zgh]
We renamed this code from "Standard Moroccan Amazigh" to "Moroccan Amazigh", but failed to note that the "standard" part was key. This is a standardised register of the dialect continuum of Berber languages in Morocco, promoted by the Moroccan government since 2011 as an official language. Marijn van Putten says this is essentially Central Atlas Tamazight [tzm], but most of the people producing texts in it are native speakers of Tashelhit [shi], so there is a bit of re-koineisation. However, if we move forward with good coverage of the Berber languages, every entry in [zgh] will be a duplicate of [tzm] or else a duplicate of [shi] marked with some sort of dialectal context label. By the way, the fact that there is an ISO code seems to be a political consideration rather than a linguistic one; compare the case of "Filipino", which we merged into Tagalog, or "Standard Estonian", which we merged into Estonian. @Fenakhay, -sche —Μετάknowledgediscuss/deeds 21:31, 16 March 2020 (UTC)
- Hmm, I see it's a rather recent attempt at standardization, too. I don't feel like I know enough about Tamazight to be confident about what to do, but it does seem like, if this is based on tzm, it could be handled as tzm (perhaps even, instead of putting "non-[ordinary-]tzm" entries at shi+label, they could be tzm+label, unless they're obviously shi words). - -sche (discuss) 15:44, 19 March 2020 (UTC)
- Generally, it seems the [shi] words are quite obvious; the main differences between [tzm] and [shi] are lexical (as far as I can tell, [tzm] has more internal diversity w/r/t phonology than differences with [shi]). But they're in a continuum anyway, and WP claims that there's debate on where to draw the dividing line. —Μετάknowledgediscuss/deeds 16:35, 19 March 2020 (UTC)
- And “Moroccan Amazigh” does not sound like a language name anyway if you have not been told it is one, it seems like “Berber as spoken in Morocco”, another reason to remove it. Fay Freak (talk) 15:59, 21 March 2020 (UTC)
2020 — April
- Discussion moved to #Template:eggcorn of into Template:misconstruction of.
to blue in the face (now a redirect to until one is blue in the face).
In addition to all the tense, person, and number variants (also contractions) of the current entry one can find variants omitting the pronoun, adding adverbs, using till or 'til instead of until; [VERB] oneself blue in the face; go|become|turn blue in the face; and blue-in-the-face and blue in the face as adjectives outside any of these expressions. The unchanging core of these is the set phrase blue in the face. It also has medical use (synonym cyanotic), which renders the figurative sense evolution and meaning obvious. DCDuring (talk) 17:39, 15 April 2020 (UTC)
The reconstructed infinitive form is useful to understand what the underlying verb is but it is never used in a sentence to convey meaning, like Azerbaijani *imək, Uzbek *emoq. —92.184.116.176 23:50, 20 April 2020 (UTC)
- @Allahverdi Verdizade: Seems to me like reconstructions are not meant for this purpose. Is there a better way to lemmatise this verb? —Μετάknowledgediscuss/deeds 23:38, 8 March 2021 (UTC)
- @Metaknowledge: I agree, and I don't think it's needed for any purpose, at least not for Azerbaijani. There was a user (or anon?) who insisted on adding those "underlying" verbs and creating templates for them, but I never understood the linguistics behind this reasoning. Allahverdi Verdizade (talk) 23:42, 8 March 2021 (UTC)
- @Allahverdi Verdizade: So can we just delete them? How should they be lemmatised? —Μετάknowledgediscuss/deeds 01:27, 9 March 2021 (UTC)
- You could lemmatize imiş as a free morpheme-form of -miş, i.e. Allahverdi Verdizade (talk) 01:36, 9 March 2021 (UTC)
2020 — May
Most use of state's evidence is clearly of state + 's + evidence. I haven't found any use that is suggestive of a restriction to a witness's testimony, except with the use of turn. Also compare “turn state's evidence” in OneLook Dictionary Search with “state's evidence” in OneLook Dictionary Search. DCDuring (talk) 14:16, 13 May 2020 (UTC)
- They can't be state + 's + evidence when the phrase encompasses proceedings where the prosecutor is not a state (e.g., a municipality, county, or country). bd2412 T 05:32, 6 August 2020 (UTC)
- Sense 3 of state should cover it. I think if 3(a) doesn't cover it, then "Never do anything against conscience even if the state demands it." is not an appropriate citation thereof; I think Einstein would consider national, state, and city governments all part of "the state".--Prosfilaes (talk) 07:04, 17 August 2020 (UTC)
I wonder if these all ought to be merged into some entry akin to "play the ____ card" or something. There appear to be other words substituted aside from victim, race, and gender. Tharthan (talk) 22:09, 21 May 2020 (UTC)
- I lament that our way of handling snowclones is not optimal, banishing them to appendix-space, such that the choices here amount to 'have these multiple similar entries in the mainspace where users find them' or 'banish them to a tidy but less-findable appendix'. However, I see that we have a sense at card for this (although the definition could use some work), and between putting a link there and redirects from these entries, I suppose we could get by with migrating these to the snowclone appendix. Centralizing them does seem sensible since there are so many. ("Play the religion card" also exists.) - -sche (discuss) 23:56, 21 May 2020 (UTC)
- Maybe a title like play the prejudice card. — This unsigned comment was added by 2600:387:9:9::bf (talk) at 14:37, 2022 September 4.
- Interesting idea. Perhaps there would be an extensive entry for play the (something) card, but with full entries for the main attestable instances (eg, race/gender and perhaps victim), derived terms, and a usage note about "(something)". Play the X card seems to be something that would be highly productive, unless its use in too many cases would be deemed a microaggression. Attestation for play the (something) card would have to be limited to "somethings" other than the forms that have their own attestation. Other instances that I can readily find are disability, oppression, and queer. The uses of feminist and bully don't fit the "victim" semantics, which might warrant a second figurative definition for play the (something) card in addition to a {{&lit}} "definition". DCDuring (talk) 15:10, 4 September 2022 (UTC)
- BTW, we are not alone in having an entry for these, but MWOnline only has one for use/play the race/gender card. Collins and Cambridge Advance Learner's have only play the race card. DCDuring (talk) 15:21, 4 September 2022 (UTC)
- Besides those, there's "play the poverty card", "play the gay card", "play the abuse card", "play the disabled card", "play the rape card", etc., as well as ones which, as you say, seem like they may have different semantics (e.g. some uses of "play the Muslim card" in reference to legislation ?to get Muslim support?, and some uses of "play the Holocaust card"?) ... it seems too productive to have entries for every attested X (it becomes SOP). Should this be in the mainspace as play the something card, or at Appendix:Snowclones/play the X card like Appendix:Snowclones/X is the new Y? For snowclones like this that require placeholders other than "someone"/"one" in the title, we seem to in recent years prefer to put them in Appendix:Snowclones/ rather than in mainspace, but I do see a handful of mainspace titles where "something" is a placeholder, like give something a go. If we redirect all the variations people might search for, add usexes to the relevant sense we list at card, and maybe add a usex to whichever sense of play is relevant, it should be sufficiently findable. - -sche (discuss) 16:48, 4 September 2022 (UTC)
- I'd favor having a full entry for any term (presumably they would be attestable) that another dictionary had. It is unfortunate that our basic search engine searching for "play the disabled card" (with or without quotes) does not take a user to any of our existing play the X card entries. (I have added test entries for play the card and play the something card.) That would imply that we could use hard redirects for as many attestable instances of the snowclone as seem likely to help users. It may well be that the hard redirects should go to the snowclone appendix subpages, but there is no particular reason to do so in preference to a mainspace entry. Concern about the aesthetics of headwords with placeholders seems misplaced. And (who knows?) someone might actually search for the expression using a placeholder and find it if it were in principal namespace. DCDuring (talk) 20:27, 4 September 2022 (UTC)
- Also, as the MWOnline entry shows play is not strictly essential; it can be replaced by use, among other verbs, such as deploy. So, perhaps a sense of card is an appropriate target for redirects. But I doubt that the entry for card is the right place for an intelligible presentation. For one thing, any etymology (sense derivation), usage notes, and derived terms or collocations (eg, race card) would necessarily be separated from the relevant definition for the polysemous noun, so as not to appear on the same screen. And, even if they did, that they belonged together would not be at all obvious. I realize that this kind of argument, if applied, might make for some inconsistency in our presentation of snowclones and might violate a strict reading of idiomaticity, but cases like this may merit exceptional treatment. DCDuring (talk) 21:05, 4 September 2022 (UTC)
- BTW, we are not alone in having an entry for these, but MWOnline only has one for use/play the race/gender card. Collins and Cambridge Advance Learner's have only play the race card. DCDuring (talk) 15:21, 4 September 2022 (UTC)
- Interesting idea. Perhaps there would be an extensive entry for play the (something) card, but with full entries for the main attestable instances (eg, race/gender and perhaps victim), derived terms, and a usage note about "(something)". Play the X card seems to be something that would be highly productive, unless its use in too many cases would be deemed a microaggression. Attestation for play the (something) card would have to be limited to "somethings" other than the forms that have their own attestation. Other instances that I can readily find are disability, oppression, and queer. The uses of feminist and bully don't fit the "victim" semantics, which might warrant a second figurative definition for play the (something) card in addition to a
I don't think this is a special phrase with "you're", it sounds like a phrasal verb be on. They want a fight? They're on! She issued a challenge, so she's on!. You can also use it in reference to the fight itself, e.g. the fight is on. 76.100.241.89 18:51, 23 May 2020 (UTC)
- Just noting to compare good on you→good on someone above. — 69.120.69.252 02:46, 24 May 2020 (UTC)
- You're on might be considered distinct because it is usually a speech act, indicating acceptance of a bet or dare. DCDuring (talk) 17:34, 24 May 2020 (UTC)
- Hmm, perhaps. But the IP is right that "on" can be used with other pronouns. I suppose the question is whether this is better viewed as someone is on, be on, or just on: we already have a sense for this at on, "(informal) Destined, normally in the context of a challenge being accepted; involved, doomed. "Five bucks says the Cavs win tonight." ―"You're on!" Mike just threw coffee onto Paul's lap. It's on now." - -sche (discuss) 04:25, 1 August 2020 (UTC)
2020 — July
I suggest that this entry be moved to Reconstruction:Proto-Slavic/vьlkodlakъ, since the -dl- cluster in the Czech descendant vlkodlak indicates that the cluster was still present in the Proto-Slavic form and was reduced to -l- in the other descendants. --108.20.184.19 00:44, 10 July 2020 (UTC)
- User:Bezimenen, seems sensible? PUC – 12:02, 10 July 2020 (UTC)
- @PUC: I have no objections to the move, however, I'm not entirely sure that *vьlkodlakъ was the primary form. Semantically, it makes sense to analyze the lemma as Proto-Slavic *vьlkodolkъ = *vьlkъ (“wolf”) + *dolka (“skin”) + *-ъ with -ol- > -la- metathesis or possibly *vьlkodьlakъ (less likely in view of East Slavic forms with *-olo-, e.g. Russian вурдала́к (vurdalák, “vampire”)[1] /first recorded in written form in 18-19 cent./). You should consult with User:Rua in regard to which form should be created - *vьlkodlakъ or *vьlkodolkъ. I'm not so familiar with the style that Wiktionary likes to follow. Безименен (talk) 12:25, 10 July 2020 (UTC)
- If the original form had -dl-, why do we not see it in the other languages that preserve it, such as Polish? —Rua (mew) 13:25, 10 July 2020 (UTC)
- Not sure, but looking again at the entry, it seems not only Czech but also Serbo-Croatian and Slovene preserve the -dl- as well. --108.20.184.19 16:51, 10 July 2020 (UTC)
- Serbo-Croatian (and, I believe, Slovene) never preserves Proto-Slavic -dl- clusters, so the Serbo-Croat form indicates either some such form as Proto-Slavic *vьlkodolkъ or a later epenthesis of -d- by analogy with dlaka. — Vorziblix (talk · contribs) 16:20, 27 July 2020 (UTC)
An example of w:U and non-U English, which probably should be decided for the latter. While “scent” can possibly be broader, this category also has the danger of just about including anything that has a strong odour naturally. Hence I included بَارْزَد (bārzad, “galbanum”) and جُنْدُبَادَسْتَر (jundubādastar, “castoreum”). The English category has a weak six entries since created in 2011. But even Category:en:Perfumes includes dubious things. I doubt perfumes are something that can be categorized well – it’s basically anything smelly? –, maybe delete all? Fay Freak (talk) 01:09, 27 July 2020 (UTC)
- I think a case could be made for "scent" being not something that smells, but smell itself (like musk and maybe putridity). I don't see any reason why perfumes can't be categorized. I don't think it's meant to include anything that could be used as the scent of a perfume, but words that specifically describe perfumes. For instance, cologne isn't "cologne-scented", it's the name of a type of perfume; jasmine is a plant, but it is also used as the word for a perfume, not just to describe a perfume (you could say, "She always wore a liberal quantity of jasmine" and not just "She always wore a liberal quantity of jasmine-scented perfume". Of course, you could also say "She always wore a liberal quantity of Autumn Breeze" because it's a proper noun, but I don't think you could say "She always wore a liberal quantity of lilac". Instead you would say "lilac perfume".) Andrew Sheedy (talk) 03:07, 27 July 2020 (UTC)
- So keep Category:Perfumes, in case I wasn't clear. I'd lean towards keeping Category:Scents as well, but I'd have to hear a few more opinions first. Hearing the value of having the category for other languages would be helpful. Andrew Sheedy (talk) 03:09, 27 July 2020 (UTC)
2020 — August
Should this be moved to one's ship comes in? It can be found introduced by other things than when, e.g. "until one's ship comes in", "waiting for one's ship to come in", and so on. —Mahāgaja · talk 10:14, 5 August 2020 (UTC)
- The when form is the one I hear most often. Vox Sciurorum (talk) 12:08, 8 October 2020 (UTC)
and Category:Dialectal Delta Yokuts, Category:Dialectal Kings River Yokuts, Category:Dialectal Buena Vista Yokuts, Category:Dialectal Northern Valley Yokuts (and possibly a few others) seem to be the only categories named in this manner, so they should perhaps be renamed. But e.g. Category:English dialectal terms does not contain subcategories, so I guess the standard(?) thing to do would be to ... move these to Category:Regional Delta Yokuts(?). (Category:Regional English does ultimately contain several non-regional dialect subcategories, like Category:African-American Vernacular English.) - -sche (discuss) 16:14, 7 August 2020 (UTC)
Move to speak highly of.
I have yet to find usage at Google NGrams that has this expression without a following PP headed by of.
At OneLook no lemming has this, whereas two have speak highly of. DCDuring (talk) 01:29, 14 August 2020 (UTC)
- Speak highly about; speak highly regarding? I see some examples of these. — SGconlaw (talk) 02:28, 11 October 2020 (UTC)
Adjective section should be merged into noun section.
I do not believe that the adjective shows a word that is truly an adjective, rather than a noun used attributively. Moreover, the noun section lacks a definition like "an organism or object with a blue tail", which is precisely the sense claimed by the adjective section.
Is this page used for merging of sections of the same entry, in the same language? DCDuring (talk) 13:59, 31 August 2020 (UTC)
- Yes. the listed derived birdnames are actually compounds blue + tail + X, as becomes obvious in German, e.g. Blauschwanz-Fruchttaube, for which nobody would create an adjective entry. Fay Freak (talk) 14:05, 31 August 2020 (UTC)
2020 — September
Ungjetë is actually just a variant of ungjet, which is the standard form. ArbDardh (talk) 19:13, 12 September 2020 (UTC)
Tagged but not posted here: Merge with up to something.
Be is not essential to the idiom. Some other copulas work, eg. seem, appear, look. DCDuring (talk) 05:13, 20 September 2020 (UTC)
2020 — October
Category:Regional English, Category:English dialectal terms (NOT subcategories)
IMO it does not make sense to have some terms categorized directly into Category:Regional English (not its subcategories) and other terms categorized directly into Category:English dialectal terms, because in practice no-one seems to be maintaining a distinction as far as putting one kind of entry in one and another in the other, it seems haphazard as to whether an entry uses e.g. {{lb|en|US|regional}} / {{lb|en|UK|regional}} like pope, mercury, jack, snap, wedge, phosphate, tab, or gob, or else uses {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} like pope (!), admire, haunt, on, sook, book, yinz, and gon. Many of the {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} terms go on to specify which regions they're used in, like "Pittsburgh and Appalachia" or "Northern England" or "Scotland". And we put every more specific dialect category as a subcat of "Regional", not of "Dialectal". I'm not entirely sure which category the entries in the two top-level categories should be consolidated into, but I'm inclined to think they should go in one or the other. Or do we want to try to implement some distinction? (At the very least, entries that use "regional" but then go on to specify the regions, like "US, regional, Pittsburgh", can drop the unnecessary "regional".) The one situation I can think of where simply changing "regional" to "dialectal" would not work is that some entries are labelled "regional AAVE". Thoughts? - -sche (discuss) 01:06, 10 October 2020 (UTC)
- I personally think that dialectal and regional terms should be separated, since a term for something in a region from an out-of-region dialect should be categorized into both regional dialects. -- 65.92.244.147 16:29, 22 November 2020 (UTC)
- That doesn't make sense. It's not the thing referred to that makes it regional or dialectal, it's the term itself. Do you have an example in mind? Chuck Entz (talk) 18:21, 22 November 2020 (UTC)
- I think the real problem is that it's not clear what we mean when we say something is dialectal. Linguistically, a dialect can be any speech variety that is separate from the rest of the language. With a language such as English that has multiple standards, you could say that much of the language is dialectal, though no one uses the term that way. I suspect there may be a value judgment involved: dialectal English is the way local people talk when they're not using proper English. Regional has less of that: I say potayto and you say potahto, but that's just a matter of geography. Theoretically, sociolects like AAVE and Cockney would be better described as dialectal than regional, but I'm not sure whether they're described as either. For a lot of people, though, it's probably whatever it's called in the references they check (or copy from). Chuck Entz (talk) 18:21, 22 November 2020 (UTC)
- "dialectal English is the way local people talk when they're not using proper English".
- What, pray tell, is proper English? General Australian? Standard Canadian English? General American (*had trouble including that as a suggestion with a straight face*)? Standard Indian English?
- If someone were to suggest that whatever is arbitrarily declared to be the 'standard' dialect of the English in their country is thus "proper English", and every other dialect is not, then that is obvious nonsense. I get that that is the reason why you used the phrasing value judgement, but if what you suggest to be going on is actually going on, then that is a problem.
- Wiktionary aims to be descriptive, not prescriptive. So if the category "Regional English" is being used to suggest that certain dialectal terms are more "proper" than others, then we need to get rid of one category or the other. Tharthan (talk) 18:42, 22 November 2020 (UTC)
- I'm not agreeing with the value judgment. I was too lazy this morning to put everything in quotation marks. The basic problem is that this terminology goes back to earlier academic standards and it's hard to tell what it means in a more modern context. A dialectologist or other linguist would probably have a more rigorous definition, but we don't seem to. Chuck Entz (talk) 19:36, 22 November 2020 (UTC)
There was a discussion about this in 2014 which was closed (in 2016) after little input, but: should this be -trix? The only word listed as a derivative of this which is not -trix is ambassadrix, and viewing it as containing a suffix *"-rix" while simultaneously viewing ambassadress as containing "-ess" is not consistent anyway (why not view it as -ix, at that point? or more compellingly, as a blend influenced by -trix?). Perhaps if there were two more "-rix"es, it could suggest "-rix" had become an alternative form derived from -trix (although again, the lack of a verb *ambassade makes viewing ambassadrix as *ambassade suffixed with -rix rather than ambassador blended with -trix questionable), but the main form appears to be -trix. (Or, actually, the main process by which English acquires -(t)rix words appears to be borrowing directly from Latin without the application of any suffix in English in the first place.) No? - -sche (discuss) 03:00, 10 October 2020 (UTC)
- There may be some cases where the term doesn't exist in Latin. Either way, this page should be moved to -trix. Ultimateria (talk) 22:25, 15 October 2020 (UTC)
- I agree, merge into -trix. - excarnateSojourner (talk | contrib) 06:11, 7 March 2022 (UTC)
(Notifying Atitarev, Tooironic, Suzukaze-c, Justinrleung, Mar vin kaiser, Geographyinitiative, RcAlex36, The dog2, Frigoris, 沈澄心, 恨国党非蠢即坏, Thedarkknightli, Michael Ly): I have no idea what the intended difference between these two categories is, but in practice there's none. The former gets triggered by the Shanghainese Wu label while the latter gets triggered by either Shanghai or Shanghainese. They should be merged. Benwing2 (talk) 04:03, 11 October 2020 (UTC)
- Comment: If we are trying to make a distinction, one category should be referring to Shanghainese Wu, and another should be referring to any variety spoken in Shanghai (i.e. both Shanghainese Wu and Mandarin). I don't know if this distinction should/can be made, though. — justin(r)leung { (t...) | c=› } 04:06, 11 October 2020 (UTC)
- @Justinrleung: Can 硬盤人 be an example that is used in "general Chinese in Shanghai"? --沈澄心✉ 07:45, 11 October 2020 (UTC)
- @沈澄心: Yes, I think so. — justin(r)leung { (t...) | c=› } 07:49, 11 October 2020 (UTC)
- I guess the issue then is, do we have native Shanghainese speakers here who can make this distinction? It looks to me like most entries in both categories are Wu terms. Benwing2 (talk) 22:05, 11 October 2020 (UTC)
- If we have any entries that make this distinction (and one such entry has been convincingly adduced above), then merger would result in losing information. Do you want Shanghai-specific Mandarin terms to go uncategorised as such? —Μετάknowledgediscuss/deeds 03:26, 12 October 2020 (UTC)
- @Benwing2, Metaknowledge: @Thedarkknightli probably knows the Mandarin terms and may know some of the Wu terms. For Shanghainese, we have some resources we can consult, so it's the Mandarin terms that are more difficult to figure out. The terms that are in CAT:Shanghainese are Wu for sure (and I would prefer to call the category "Shanghainese Wu" to make it clear). We would need to sift through the CAT:Shanghainese Chinese category to check what's actually Wu and relabel them with "Shanghainese Wu" or just "Wu". BTW, there might be some need to revamp other labels/categories, like "Sichuan" displaying as "Sichuanese" and categorizing to CAT:Sichuanese Mandarin, which could be confusing when we introduce terms in Sichuanese Hakka or Xiang (which we might have some already). — justin(r)leung { (t...) | c=› } 03:40, 12 October 2020 (UTC)
- (edit conflict) A native Shanghainese speaker would be User:辛时雨 but he is not very active.
- What we lack with regional labels, which is specific to Chinese since the merger needs to work for varieties and subvarieties is the ability to add variety specific categories, {{lb|zh|Shanghai|Wu}} is meant to not only label a term but also categorise it as Shanghainese Wu but {{lb|zh|Shanghai}} is for general Chinese, esp. Mandarin. --Anatoli T. (обсудить/вклад) 03:43, 12 October 2020 (UTC)
- I think you would need to use {{lb|zh|Shanghai Wu}} or something, not {{lb|zh|Shanghai|Wu}}, since I don't think the same label ("Shanghai") can categorize into two categories. Anyway, add my voice to those saying that if there is intended to be a distinction here, the category names (and, probably, boilerplate texts) should be made clearer. We could also consider "see also"-style crossreferencing them, like Category:Louisiana French and Category:Louisiana Creole French language. - -sche (discuss) 17:26, 13 October 2020 (UTC)
- Rename to Category:Shanghai Wu (阿拉) and Category:Shanghai Chinese (硬盤人). —Fish bowl (talk) 06:36, 6 February 2022 (UTC)
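The point made above about labels, that a single label name can only feed one category, can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the mapping table and the function name `categories_for` are hypothetical and are not Wiktionary's actual label module. The label-to-category pairs follow what the thread reports (the "Shanghainese Wu" label feeding CAT:Shanghainese, and "Shanghai" feeding CAT:Shanghainese Chinese).

```python
# Illustrative sketch only; hypothetical sample data, not Wiktionary's real label code.
# Each label name maps to exactly one category, so the label "Shanghai" on its own
# cannot feed both a Wu category and a general-Chinese category.
LABEL_TO_CATEGORY = {
    "Shanghai": "Shanghainese Chinese",
    "Shanghainese Wu": "Shanghainese",
}


def categories_for(labels: list[str]) -> list[str]:
    """Return the categories triggered by the given labels, skipping unmapped labels."""
    return [LABEL_TO_CATEGORY[label] for label in labels if label in LABEL_TO_CATEGORY]


if __name__ == "__main__":
    # Two separate labels: "Shanghai" still resolves to its single category,
    # and "Wu" (unmapped in this toy table) adds nothing.
    print(categories_for(["Shanghai", "Wu"]))
    # One combined label resolves to the Wu-specific category.
    print(categories_for(["Shanghainese Wu"]))
```

In this model, distinguishing the two categories requires two distinct label names, which is the reason a combined label was suggested instead of two separate ones.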
list of entries to examine
- 乃 (nǎi)
- 來得/来得 (láide)
- 哈哈 (hāhā)
- 四季豆 (sìjìdòu)
- 囡 (nān)
- 外加 (wàijiā)
- 套子 (tàozi)
- 差頭/差头
- 後年子/后年子
- 意粉 (yìfěn)
- 拆白黨/拆白党 (chāibáidǎng)
- 拿摩溫/拿摩温 (námówēn)
- 歐巴桑/欧巴桑 (ōubāsāng)
- 特意 (tèyì)
- 盡頭牙/尽头牙 (jìntouyá)
- 硬盤/硬盘 (yìngpán)
- 硬盤人/硬盘人 (yìngpánrén)
- 站頭/站头 (zhàntou)
- 童子雞/童子鸡 (tóngzǐjī)
- 老娘
- 老虎窗 (lǎohǔchuāng)
- 蒲桃 (pútáo)
- 討生活/讨生活 (tǎoshēnghuó)
- 辣 (là)
- 飛機票/飞机票 (fēijīpiào)
- 飯單/饭单 (fàndān)
Eastern Subanen language
@-sche, Metaknowledge We have the northern, western and central variants as languages, as well as a couple of other Subanen/Subanon/Subanuns. Someone manually put Pagadian into Category:Category:Cebuano terms derived from Eastern Subanen, which causes an error. Ethnologue recognizes an Eastern Subanen language and even assigns a code sfe. Should this exist on Wiktionary? Benwing2 (talk) 03:52, 27 October 2020 (UTC)
- I knew this language name/code seemed vaguely familiar to me, and after poking around I find that it's because I noticed we were missing it a few years ago, but I didn't have time/resources to look into whether we should add it (which isn't a helpful comment, I realize). Wikipedia has little to say about it; the one innovation they mention it has is shared with several other Subanens. What little I see on Google Books suggests Eastern Subanun is a more common spelling for the ethnic group; I see little mention of the lect under any spelling. It's hard to draw firm conclusions, but on the face of it I don't see why we're missing the code (it indeed seems to be one of the major dialects / dialect clusters). - -sche (discuss) 08:06, 27 October 2020 (UTC)
2020 — November
Category:en:Artificial languages
This should probably be moved to Category:Conlangs, because:
- The term "artificial language" is ambiguous (see artificial language and w:Artificial language). While Category:Conlangs may be hard to parse for the uninitiated, opacity is better than ambiguity. There are also other categories with potentially opaque names, such as Category:Forteana, Category:Semiotics, and Category:Combinatorics.
- The term conlang/constructed language is generally more common than artificial language, especially within the community that makes/uses them. It's telling that Wikipedia uses w:Constructed languages for what the category covers, and the Language Creation Society doesn't mention "artificial languages" once on its homepage.
- Changing the name of the category will lead to greater consistency with Category:Conlanging, putting the contrast between the purpose of each category (names of constructed languages vs. conlanging terminology) in sharper relief.
. Hazarasp (parlement · werkis) 11:29, 3 November 2020 (UTC)
- The odd choice of wording was intended to avoid the topical category conflicting with Category:Constructed languages, which is a holding category for those languages. Given that our MediaWiki trappings make it impossible to resolve this conflict, I support this proposal as a better compromise. —Μετάknowledgediscuss/deeds 06:16, 8 November 2020 (UTC)
- @Metaknowledge Do you mean that Category:en:Constructed languages would conflict with Category:Constructed languages? Would Category:en:Conlangs be an option? - excarnateSojourner (talk | contrib) 06:23, 7 March 2022 (UTC)
- Strictly speaking, it's not a conflict between Category:en:Constructed languages and Category:Constructed languages, but between two possible versions of Category:Constructed languages. Our structure for topical categories (categories for entries associated with a particular topic) and set categories (categories for names of things) consists of an umbrella category that holds all the language-specific categories that are the same as the umbrella category, but prefixed with the language code of the language in question. In other words, you can't have Category:en:Constructed languages without an umbrella category called Category:Constructed languages. The problem is that Category:Constructed languages can't be both an umbrella category for language-specific categories and the container for things like Category:Esperanto language and Category:Volapük language. In order to have language-specific categories, you have to have an umbrella category that doesn't conflict with the container category for constructed languages. Currently that umbrella category is Category:Artificial languages, but the proposal here is to change it to Category:Conlangs. Thus, Category:en:Artificial languages would become Category:en:Conlangs. Chuck Entz (talk) 07:22, 7 March 2022 (UTC)
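The naming constraint described above can be sketched in a few lines of Python. This is only an illustration: the function names and the set of container categories are hypothetical and do not reflect Wiktionary's actual category modules.

```python
# Illustrative sketch only; not Wiktionary's real category code.
# Container categories that already hold per-language categories such as
# "Category:Esperanto language" (hypothetical sample data).
CONTAINER_CATEGORIES = {"Category:Constructed languages"}


def umbrella_category(label: str) -> str:
    """Umbrella category that holds every language-specific category for a label."""
    return f"Category:{label}"


def language_category(lang_code: str, label: str) -> str:
    """Language-specific category: the umbrella name prefixed with a language code."""
    return f"Category:{lang_code}:{label}"


def conflicts_with_container(label: str) -> bool:
    """True if the umbrella name is already taken by a container category."""
    return umbrella_category(label) in CONTAINER_CATEGORIES


if __name__ == "__main__":
    for label in ("Constructed languages", "Artificial languages", "Conlangs"):
        print(language_category("en", label), conflicts_with_container(label))
```

Under these assumptions only "Constructed languages" collides, which is why an umbrella name such as "Artificial languages" or "Conlangs" is needed.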
One should almost certainly be an alt-form of the other. I’m not sure which is best as lemma or whether the pronunciations should be identical. — Vorziblix (talk · contribs) 05:16, 17 November 2020 (UTC)
It looks like this is only citable with a pronoun, so the lemma should be zijn kat sturen. ←₰-→ Lingo Bingo Dingo (talk) 17:40, 26 November 2020 (UTC)
2020 — December
Tagged by Adam78 in July 2019, but apparently never listed. The specific diffs for the taggings are Special:Diff/53620744 and Special:Diff/53620742. The entries seem to have some distinct definitions listed, with take advantage of having "To exploit, for example sexually." and take advantage having "To profit from a situation deliberately." They also seem to share the definition "To use or make use of."/"To make use of something." Of final note, take advantage has a quotation with a usage that is not followed by of. Feel free to move this entry into the 2019 section if appropriate. —The Editor's Apprentice (talk) 19:58, 1 December 2020 (UTC)
Two senses, one sense in each entry, and each entry is an alt form of the other. Would be nicer to have one entry with the content and the other with only alt forms. Equinox ◑ 00:27, 20 December 2020 (UTC)
- This is a recurring issue; compare blacksnake and black snake which I brought up at WT:TR § when words are interchangeable but senses are usually spelled one way or the other. If the rope sense is usually spelled scare line and the phrase sense is usually scare-line, I could see why someone would set the entries up the way they are, but I agree it'd be easier to get an overview of all the senses if they were in one place, probably whichever spelling is more common overall regardless of sense. Meh. (Compare also breakfast's religious sense.) - -sche (discuss) 20:39, 29 December 2020 (UTC)
- I favor the current approach, but I could be convinced about an alternative if I could see a mock-up. DCDuring (talk) 16:15, 28 February 2021 (UTC)
Merging Thesaurus:large and Thesaurus:gigantic
I recently happened upon Thesaurus:gigantic and Thesaurus:large. They seem to generally serve the same purpose, though gigantic does describe a greater size than large. Does anyone have any objections to me merging them? I would keep whatever page gets the most views and transform the other into a redirect. Depending on the number, I might also replace all links that exist to one with links to the other. Best. —The Editor's Apprentice (talk) 19:32, 25 December 2020 (UTC)
- No objection, provided the distinction between large and gigantic is made clear somewhere. — SGconlaw (talk) 20:34, 25 December 2020 (UTC)
- I've discovered that there are an even larger number of similar thesaurus pages than the two I mentioned in my first post. These include WS:small and WS:tiny, WS:gigantic thing and WS:giant, as well as WS:ample and WS:big. Would it make sense to similarly merge the first two pairs as well as include the third pair in the merger that I mentioned in my first post? Finally, would it make sense to cut down WS:size so that it is only a directory to other thesaurus pages? —The Editor's Apprentice (talk) 19:13, 28 December 2020 (UTC)
- I think it's fine to merge thesaurus pages that relate to the same part of speech, so long as the shades of meaning of the different words are made clear on the unified page. However, pages that relate to different parts of speech (e.g., gigantic (adjective) v. gigantic thing (noun)) should probably be kept separate – and if I read your post correctly you aren't suggesting such pages be merged. No objection either to most of the content in Thesaurus:size being moved to specific pages and the former being converted to a directory (though synonyms of the word size can be retained). — SGconlaw (talk) 15:08, 29 December 2020 (UTC)
- Sounds good. For the record, the way I am currently thinking of noting that there are shades of meaning is by putting a note at the top of the page stating something along the lines of "the implied extent that the object is larger than usual is different for each of these terms with some referring to even bigger sizes than others". Is this sufficient in your mind, or would it be necessary to use dashes to separate different parts of the list based on their implied size, as is sometimes done on thesaurus pages? —The Editor's Apprentice (talk) 20:52, 30 December 2020 (UTC)
- I'm concerned such a note won't be clear enough. It would probably be better to split the terms into completely different sections, with shorter notes (e.g., one section for big: "Sense: the weakest sense of great size"; and another for gigantic: "Sense: of very great size"). — SGconlaw (talk) 21:33, 30 December 2020 (UTC)
- Fair enough. I think I should be able to do that and still be able to list antonyms in a meaningful way, specifically by doing a similar division between them based on the strength of their senses. For some of the terms I may not be able to accurately determine what section they should be in due to my unfamiliarity, but I'll do my best and shouldn't run into too much trouble. —The Editor's Apprentice (talk) 01:06, 1 January 2021 (UTC)
- Hah, separate pages for "large" vs "big" is particularly unnecessary; at least with gigantic vs large/big it's obvious what distinction someone was going for. I wouldn't necessarily consider "most page views" the deciding metric as far as where to put the page, since a majority of page views likely come from (a) which page our entries link to, and (b) which page has content, both of which would change if we update the page title and/or links. In this case, though, it seems that large not only gets the most pageviews (44 in the last 30 days, vs 18 for big, also more views in the last 90 days) but is also the more common word. - -sche (discuss) 20:24, 29 December 2020 (UTC)
- I agree with you that page views aren't as relevant for thesaurus pages. I think the content should be at the most generic or common synonym. With that, I think it would make sense to look at informal language rather than book corpora, which tend to be more formal. I suspect big is more common in informal contexts, but I have currently hit the rate limit for english-corpora.org so cannot do the relevant research yet. When the month changes, I should be all good and I plan to follow the links here on which corpora cover informal English. —The Editor's Apprentice (talk) 20:52, 30 December 2020 (UTC)
- Okay, I did the searches in the "(personal) blog", "spoken", "interview", "non-academic", and overall "internet" genres of various BYU corpora (those at english-corpora.org) as well as in WorldLex-English in the "BlogFreqPm" (frequency in blogs, per million words) column. I decided to exclude television corpora because I couldn't identify a way to filter out scripted television and just focus on live television. I found big to be more common than large in almost all cases. Given that, do you think that it would make sense to put the content at WS:big? Thanks and I wish you the best. —The Editor's Apprentice (talk) 20:05, 5 January 2021 (UTC)
- So far I've merged Thesaurus:small and Thesaurus:tiny as well as Thesaurus:gigantic thing and Thesaurus:giant. I think it's going pretty well so far, though slowly. —The Editor's Apprentice (talk) 23:46, 11 January 2021 (UTC)
- Support original merge proposal. - excarnateSojourner (talk|contrib) 06:50, 29 October 2021 (UTC)
- @The Editor's Apprentice I've merged big and gigantic into large, following SGconlaw's advice to keep large and gigantic as separate sections. (I chose large mostly arbitrarily over big. If anyone prefers big I don't mind it being moved there.) All that remains is updating links. - excarnateSojourner (talk | contrib) 08:05, 7 March 2022 (UTC)
- Thank you for completing the merger and finalizing the work I abandoned. —The Editor's Apprentice (talk) 18:58, 11 March 2022 (UTC)
- Oppose Wikisaurus has long needed quality improvement. Looking at WS:large, for example, are hefty, tall, broad, long, and rotund really synonyms? Why are they in a pool of synonyms with the heading "Sense: of considerable or relatively great size or extent"? A shotput ball is hefty without being large in normal use. I'd have thought that we would differentiate linear, areal, and volume spatial extent and mass/weight, as well as the various terms that are applied to nouns that don't actually have such properties, e.g., problem, success. This proposed merger is probably a step backward and certainly a distraction from the underlying quality problem. DCDuring (talk) 14:52, 7 March 2022 (UTC)
- Good point. When you mention differentiating between linear, areal, and such, are you thinking separate pages? The pages I merged into Thesaurus:large did not have any of these distinctions, and now we have one page to split into linear, areal, etc. terms from, so I don't see how it is a step backward. - excarnateSojourner (talk | contrib) 21:26, 9 March 2022 (UTC)
2021 — January
Two assorted groups of adjective and adverb senses. Merge? Equinox ◑ 14:48, 5 January 2021 (UTC)
- I'd bet that you couldn't come up with definitions on the merged entry that were both complete and substitutable as both adjective and adverb in such definitions. Also I'd expect that synonyms might need to be distinguished by PoS. DCDuring (talk) 16:22, 5 January 2021 (UTC)
- I don't mean that Adj and Adv should be merged, but rather that the two named entries should be merged. Equinox ◑ 21:48, 5 January 2021 (UTC)
- Support for the sake of deduplication. - excarnateSojourner (talk|contrib) 06:52, 29 October 2021 (UTC)
These are terms that were historically used in the Dutch East Indies, perhaps to some degree also in Malay-speaking territories of the Dutch East India Company. A rename to Category:Dutch_East_Indies_Malay makes the most sense. It is doubtful that a category "Netherlands Malay" is needed because the number of speakers of Malay in the Netherlands is not very high. ←₰-→ Lingo Bingo Dingo (talk) 19:57, 10 January 2021 (UTC)
- Support, assuming this can be demonstrated with cites for all the entries. Categories for polities that existed in the past is a good practice when merited; cf. Category:Rhodesian English. —Μετάknowledgediscuss/deeds 22:54, 10 January 2021 (UTC)
Move to “merry Christmas and a happy New Year” as merry and happy are adjectives which do not need to be capitalized, and per the quotation and the entries merry Christmas and happy New Year. J3133 (talk) 08:55, 15 January 2021 (UTC)
- The capitals seem to be common, albeit not mandatory ([6], [7], [8]). Other forms might be:
- As alternative forms (comma or not, different capitalisation) get their own entries, keep what exists. As for the main entry, which form is the most common? --幽霊四 (talk) 18:58, 6 February 2021 (UTC)
- I see the capitalized form more often than the other forms (as illogical as it is), so oppose. But a usage note or at least an alternative forms section should make note of the other forms ("New Year" is rarely lowercase, in my experience). Andrew Sheedy (talk) 22:41, 4 April 2021 (UTC)
- Similar entries include Merry Xmas, Happy Christmas, Happy Holidays, and Happy Thanksgiving. (There are a few in lowercase as well.) - excarnateSojourner (talk|contrib) 07:01, 29 October 2021 (UTC)
- Equinox moved Happy Holidays to happy holidays, saying "caps not required by default, though of course banners and greeting cards etc. often use title casing". - excarnateSojourner (talk | contrib) 20:56, 21 April 2022 (UTC)
Another pair of almost duplicate entries. Equinox ◑ 01:16, 16 January 2021 (UTC)
Merge; both contain the same code. J3133 (talk) 12:30, 20 January 2021 (UTC)
- These seem clumsy anyway. {{m}} and its friends already know not to italicize scripts other than Latn; can't {{quote-book}} and its friends be taught the same thing? —Mahāgaja · talk 10:09, 30 January 2021 (UTC)
2021 — February
Significant overlap. Equinox ◑ 18:43, 6 February 2021 (UTC)
Merge (note that both have additional translations and reconnoiter was WOTD). J3133 (talk) 09:08, 11 February 2021 (UTC)
- I've merged the verb translations for reconnoiter and reconnoitre. They are all now on reconnoiter#Translations, and reconnoitre#Translations points to this with {{trans-see}}. Voltaigne (talk) 12:38, 20 March 2021 (UTC)
Partly duplicated material, and then there are two noun senses that only occur in one of the entries, plus the verb (which probably doesn't occur without a u). Equinox ◑ 03:32, 13 February 2021 (UTC)
Not synonyms, of course, but certain senses overlap almost entirely (except people have edited one and not the other without realising). Equinox ◑ 04:12, 14 February 2021 (UTC)
- An approach would be to put all and only the true definitions that most commonly use a given spelling in that spelling and also have a definition in each saying that it is a synonym of the other spelling. That might not be exactly true, but would be close. To rely on the other term appearing in related terms seems a bit weak. DCDuring (talk) 04:19, 14 February 2021 (UTC)
- Yeah, I think that's what we may have to do, with glosses in the {{synonym of}}s to make clear that each entry being a {{synonym of}} the other is not (just) circular. Like egoist vs egotist (we are not the only dictionary to have a sense line defining each of those terms as the other, in addition to other definitions). - -sche (discuss) 19:45, 14 February 2021 (UTC)
- If possible and if properly executed, the approach I advocate gets you out of circularity for each individual definition. DCDuring (talk) 23:00, 14 February 2021 (UTC)
banane Gros Michel to Gros Michel (French)
I would do this myself (it's SOP), but I don't know gender and other grammatical details. Chuck Entz (talk) 07:12, 16 February 2021 (UTC)
2021 — March
Equinox ◑ 00:38, 1 March 2021 (UTC)
- Merge into whichever is used most often. - excarnateSojourner (talk|contrib) 07:03, 29 October 2021 (UTC)
Probably verb main lemma Oxlade2000 (talk) 11:06, 13 March 2021 (UTC)
American Italian or Italian English? Category fusion or entry deletion or renaming or... help.
Fairly new user here. I've been adding some Italian English/American terms to the wiki and found out, a few additions in, that the category was named American Italian, which is why I couldn't find it. Which term should we use, and can we pick one so we can put them all in one place? (Also, on the topic, what's a good way to categorize things when we know they're from one of four dialects but aren't sure...which?)
https://en.wiktionary.org/wiki/Category:American_Italian I have the American Italian category here; I've managed to move gabbadost and jamoke there, but https://en.wiktionary.org/wiki/Italian_English will have to be renamed I think, depending on which is a more accurate name; additionally, I don't know what the correct syntax relationship between "definition of the dialect, dialect category, etc." is.
- American Italian is the Italian language as spoken in America. Italian English is the English language as spoken by Italians. Since the US is the main place with English-speaking Italians, there tends to be overlap, especially with code-switching.
- That said, they aren't the same thing. Most English-speaking Italians in the US don't really speak Italian, but their usage has enough differences from mainstream English for the "Italian English" label and category to be useful. Likewise, Italian as spoken in the US has enough differences from Italian as a whole that the label and category "American Italian" are useful.
- The same people may often be speaking both, but the important distinction is: are they speaking Italian, or are they speaking English? Chuck Entz (talk) 21:40, 13 March 2021 (UTC)
I'm not sure about that? Andrew Longo categorizes American Italian as the pidgin language that comes from using both Italian and English together so I'm not sure how to answer "are they speaking Italian or English" when the answer frequently seems to be "both" and "neither". And part of the issue re: American Italian in particular is that they are speaking Italian - it's just dialect/Neapolitan or Sicilian, which is why it doesn't register as 'Italian' to Italians. Elliott Dunstan (talk) 16:34, 14 March 2021 (UTC)
I think this should be moved to sprong in het duister and converted to a noun. The forms with maken are not overwhelmingly common compared to other uses. ←₰-→ Lingo Bingo Dingo (talk) 14:56, 28 March 2021 (UTC)
2021 — April
There are several "rotation" senses that are patchily duplicated between these entries. Equinox ◑ 19:39, 9 April 2021 (UTC)
The definitions are different (I think fight shy is better; the other is too vague) and it seems that the entries should be merged anyway. Note that fight shy can occur alone, without of. Equinox ◑ 02:45, 12 April 2021 (UTC)
This is a subcategory of "Category:English irregular plurals". Should it be renamed "Category:English irregular plurals ending in "-en"", to match other similar subcategories? (If so, "Category:English plurals ending in "-a"" and "Category:English plurals ending in "-oi"" should also be renamed.) — SGconlaw (talk) 17:18, 24 April 2021 (UTC)
- I would support this, but I note that there is some historical dispute about what makes a plural "irregular". For example, it is generally agreed that adding an "s" to a noun is the prime example of a regular plural, and it is generally agreed that adding "es" after a final "s" or "x" or "ch" (bosses, boxes, crutches) is also a way of forming regular plurals. Some consider "f"/"fe" to "ves" endings (wolf to wolves, wife to wives) to be regular plurals as well, though I consider that a stretch. bd2412 T 19:49, 24 April 2021 (UTC)
- Sounds like that needs to be decided first, then. If there is consensus that it is regular in English for plurals to be created by adding -a, -en and -oi (?!), then the corresponding subcategories should not have "Category:English irregular plurals" as a parent. — SGconlaw (talk) 19:53, 24 April 2021 (UTC)
- We could go by what it says in the Wikipedia article on the subject, which classifies "f"/"fe" to "ves" endings as "near-regular plurals". bd2412 T 22:53, 24 April 2021 (UTC)
- Yes, that's a good idea. Going by that article, we should then rename the subcategories mentioned above as "Category:English irregular plurals ending in "-a"", "Category:English irregular plurals ending in "-en"" and "Category:English irregular plurals ending in "-oi"". However, I'm not sure if we should rename "Category:English irregular plurals ending in "-ves"" as "Category:English plurals ending in "-ves"". The article seems to be talking more about the sounds that form plurals than the spelling. — SGconlaw (talk) 14:20, 25 April 2021 (UTC)
- JoeyChen (talk • contribs) moved Category:English plurals ending in "-a" to Category:English irregular plurals ending in "-a", and similarly for "-en" and "-oi" in May and June, 2021, but did not thoroughly move the plurals themselves to the new category names. I have manually moved the (remaining) "-a" and "-oi" terms as there were not many, but "-en" still contains over 400, so I'm going to leave that for a bot. These plurals should use {{en-irregular plural of|<singular form>|ending=en}} on their sense line to add them to the new ("irregular") category, and should have the old, explicitly added category removed. - excarnateSojourner (talk | contrib) 02:46, 22 October 2022 (UTC)
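The per-entry change that the comment above leaves for a bot could be sketched roughly as follows. This is only an illustration under stated assumptions: the shape of the existing sense line (`# {{plural of|en|...}}`) and the exact text of the old category link are guesses, and `migrate_entry` is a hypothetical helper name. Only the target template call is taken from the comment itself.

```python
import re

# Assumption: the old, explicitly added category appears on its own line like this.
OLD_CATEGORY = '[[Category:English plurals ending in "-en"]]'

# Assumption: the existing sense line looks like "# {{plural of|en|ox}}".
SENSE_LINE = re.compile(r"^# \{\{plural of\|en\|([^|}]+)\}\}$")


def migrate_entry(wikitext: str) -> str:
    """Swap the generic sense-line template for the irregular-plural one
    and drop the old explicit category line (hypothetical helper)."""
    out = []
    for line in wikitext.splitlines():
        if line.strip() == OLD_CATEGORY:
            continue  # the new template is expected to categorize the entry
        match = SENSE_LINE.match(line)
        if match:
            singular = match.group(1)
            line = "# {{en-irregular plural of|%s|ending=en}}" % singular
        out.append(line)
    return "\n".join(out)
```

A real bot run would also need to handle entries whose sense lines are written differently, which this sketch deliberately ignores.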
2021 — May
Should be moved to spit in the face of, since one can spit in the face of, e.g., the law, the government, hip-hop culture, and other non-people nouns. Imetsia (talk) 23:48, 7 May 2021 (UTC)
Actually it should be spit in one's face, as there are already many such set phrases involving a genitive construction where the variable object is represented as one's. For example: change one's mind—though there is also change someone's mind, which is redundant and should probably be deleted.— 69.120.64.15 02:31, 10 May 2021 (UTC)
- In this case (spitting) it should be "someone's", because "one" spits in "someone" else's face. We use "one" where the phrase is constructed so that it happens to oneself. Equinox ◑ 20:27, 23 May 2021 (UTC)
- On the other hand, it's conceivable that someone says, "How dare you spit in my face?", meaning that the person addressed has treated the speaker disrespectfully. — SGconlaw (talk) 17:44, 4 June 2021 (UTC)
- Ah I see now, the distinction is that one's constructions are supposed to be reflexive. The distinction in titling however is not obvious and I wish it were made clear somewhere. — 69.120.64.15 03:57, 5 June 2021 (UTC)
Canonical name of "mep"
Currently, the canonical name of the language in WT is spelled Miriwung, even though every primary/secondary source I could find recommended the spelling Miriwoong, as that is consistent with the language's own orthography, while the spellings "Miriwung" and "Miriuwung" are considered nonstandard. Can someone fix it? --Numberguy6 (talk) 14:47, 8 May 2021 (UTC)
- It's not exactly hard to find sources spelling it as Miriwung, but I'm sure you're right. @-sche? —Μετάknowledgediscuss/deeds 22:52, 21 July 2021 (UTC)
Fix to the standard format for reference templates.
For the same reason, also:
and perhaps:
- Template:R:AFB → Template:R:gd:AFB
- Template:R:LG → Template:R:gd:LG
- Template:R:MacBain → Template:R:gd:MacBain
and others of similar form.
There are many others, but I won't attempt to list them all. Some, such as Template:R:Duden, Template:R:DIL and Template:R:Dwelly, are used on hundreds of pages and therefore moving the template would be a disruptive change; in such cases the original titles should be kept as redirects for convenience–but I doubt that these more established ones will be considered for moving at all.
There doesn't seem to be any official policy on the naming of reference templates; the common Template:R:[code]:[name or abbreviation(:[year])] format is merely a convention. Some categories of reference templates such as Category:Proto-Indo-European reference templates have members which are almost entirely named according to this format; others have none at all; most are a mix. The inconsistency is confusing. Would it be improper to suggest a rule that all (future) reference templates include the respective language/family code when applicable? I would appreciate feedback from any big contributors on how you handle this type of thing, considering that templates have often been renamed according to this "standard" formatting in the past, despite lack of much discussion or an official guideline. Such a discussion would at least inform Wiktionarians on how to name reference templates in the future.
— 69.120.64.15 02:20, 10 May 2021 (UTC)
- Edit: Silly me, I didn't notice these parlour discussions on it:
- — 69.120.64.15 04:46, 10 May 2021 (UTC)
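For anyone wanting to list which reference templates already follow the Template:R:[code]:[name or abbreviation(:[year])] convention described in this request, a check along these lines might be a starting point. It is a rough sketch, not an existing tool: the assumed shape of a language/family code (two or three lowercase letters, optionally followed by hyphenated three-letter parts) and the function name `follows_convention` are both assumptions made for illustration.

```python
import re

# Assumed code shape: 2-3 lowercase letters, optionally with hyphenated
# 3-letter extensions (e.g. "gd", "ine-pro"). The year, if present, is 4 digits.
CONVENTION = re.compile(
    r"^Template:R:"
    r"(?P<code>[a-z]{2,3}(?:-[a-z]{3})*):"
    r"(?P<name>[^:]+)"
    r"(?::(?P<year>\d{4}))?$"
)


def follows_convention(title: str) -> bool:
    """Return True if a template title has the R:[code]:[name](:[year]) shape."""
    return CONVENTION.match(title) is not None


for title in ("Template:R:gd:AFB", "Template:R:AFB", "Template:R:Duden"):
    print(title, follows_convention(title))
```

Because the check relies on the code being lowercase, a title such as Template:R:AFB is reported as not following the convention, which matches the moves proposed in the list.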
Can/should the Irish religion senses at these two entries be merged somehow? Equinox ◑ 20:26, 23 May 2021 (UTC)
2021 — June
All of the other Proto-Tocharian entries so far use ⟨y⟩ for this phoneme */j/, equivalent to Adams' ⟨i̯⟩. This is also the letter used on the Wikipedia article for Proto-Tocharian and in the standard romanization of Tocharian languages, which we use, not to mention for the corresponding phoneme in PIE, *y. It would be nonsensical and confusing to use ⟨j⟩ instead for the Proto-Tocharian stage only. The page was created recently (April), so presumably its creator just forgot to check the existing entries. — 69.120.64.15 03:37, 5 June 2021 (UTC)
- Wait, apparently there is a distinction in how Adams uses ⟨i̯⟩ versus ⟨y⟩ for Proto-Tocharian, but I have no clue what it is. (It has nothing to do with PIE *d versus *y, for instance, and nothing to do with laryngeals.) — 69.120.64.15 04:18, 5 June 2021 (UTC)
- Ok, it seems to be non-phonemic and have to do with the following vowel. */jä/ (⟨ä⟩ ≈ IPA /ɨ/) and */jē/ are written ⟨i̯ä⟩ and ⟨i̯ē⟩ respectively, but /jV/ for all other vowels seem to use ⟨y⟩. I doubt this is a necessary distinction for Wiktionary to make, since it seems entirely predictable from environment, but I'm still unsure what purpose it is meant to serve. @GabeMoore, might you be able to weigh in? — 69.120.64.15 04:34, 5 June 2021 (UTC)
when all you have is a hammer, everything looks like a nail versus if all you have is a hammer, everything looks like a nail
Suggest making when all you have is a hammer, everything looks like a nail the primary and if all you have is a hammer, everything looks like a nail the alt form. Rationale: sounds better and more hits on Google in a 4:3 ratio. Cheers, Facts707 (talk) 17:19, 12 June 2021 (UTC)
- Support for the sake of deduplication. - excarnateSojourner (talk|contrib) 07:14, 29 October 2021 (UTC)
I'm all but certain that one can't have a word without pronounced vowels, but I feel that it reads better if it's explicitly stated anyway. Johano★ 01:15, 15 June 2021 (UTC)
- One can't? Hmmm... Chuck Entz (talk) 03:28, 15 June 2021 (UTC)
- @Chuck Entz: Thanks for the chuckle. — Fytcha〈 T | L | C 〉 04:08, 13 January 2022 (UTC)
- @Fytcha: Shhh...Chuck Entz (talk)
- Those categories also have some questionable entries, particularly with Welsh loanwords like cwm and crwth... Seems like it just checks for the absence of a/e/i/o/u/y. – Guitarmankev1 (talk) 12:48, 24 June 2021 (UTC)
- Maybe "Category:English words without vowel letters"? —Mahāgaja · talk 14:03, 24 June 2021 (UTC)
- Yeah (except maybe "English terms"), that would also reduce how dumb it looks that the category includes lots of numbers which are quite regularly pronounced with vowels, and things where the vowels have merely been obscured (b****cks), and abbreviations that aren't even "words" per se, like BHD. - -sche (discuss) 22:03, 8 July 2021 (UTC)
- I support moving to Category:English terms spelled without vowels. - excarnateSojourner (talk|contrib) 07:19, 29 October 2021 (UTC)
Yeah I think the category should be renamed then. Ffffrr (talk) 20:50, 16 December 2021 (UTC)
Also, why is it a subcategory of Category:English shortenings? Sure, a lot of shortenings omit the vowels, but the converse isn't true: hmm, grr, 1984 (unless every number is a shortening of its spelled out form, which doesn't seem all that useful). Do I need to start a separate request to remove a subcategory? Medmunds (talk) 18:53, 18 March 2022 (UTC)
- Sorry, that was off topic here. Answered my own question; moving this to Category talk:English words without vowels. — Medmunds (talk) 00:20, 21 March 2022 (UTC)
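The guess made earlier in this thread, that the category simply checks for the absence of a/e/i/o/u/y, can be illustrated with a short sketch. This is not the actual categorization code, which is not shown in this discussion; `lacks_vowel_letters` is a hypothetical name, and the logic is only the letter test described in that comment.

```python
VOWEL_LETTERS = set("aeiouy")


def lacks_vowel_letters(term: str) -> bool:
    """Return True if the term contains none of the letters a/e/i/o/u/y."""
    return not any(ch in VOWEL_LETTERS for ch in term.lower())


for term in ("cwm", "crwth", "1984", "BHD", "hmm", "size"):
    print(term, lacks_vowel_letters(term))
```

A purely letter-based test like this accepts Welsh loanwords, numerals, and abbreviations alike, which is why the thread leans toward a title that mentions spelling or vowel letters rather than vowels.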
2021 — July
Like other Proto-Dravidian compound words, should not contain a hyphen. — 69.120.64.15 23:33, 19 July 2021 (UTC)
A bit fiddly: one entry is a verb and the other a noun, and they both have multiple senses with slight distinctions that should be ironed out. Equinox ◑ 13:13, 25 July 2021 (UTC)
Lenape
The decision to deprecate Lenape as a canonical language needs to be revisited. It seems to me that removing Lenape as a canonical language is like removing French as a canonical language and instead having only Quebecois and Parisian. Classifying Wënami and Munsee as separate canonical languages is a narrow linguistic differentiation that quickly spills into the political. In a 2013 discussion (archived at Category talk:Unami language) the difference between Wënami and Munsee was compared to English and Scots. This is a faulty comparison because the political and cultural situation is in no way analogous (and language versus dialect is always a deeply political not just academic subject). The continuum of dialects once spoken in Southern New York, New Jersey, Delaware, and Pennsylvania was on the brink of disappearing forever. What we are witnessing now is a process of standardization that could revive the Lenape language. I hope the Wiktionary community would not want to contribute to fracturing such efforts. Unnecessary rigidity about what is to be considered a canonical language could jeopardize such standardization and revival. I strongly urge everyone to reconsider having a canonical language category for Lenape, and specifying the dialect of origin for specific terms, usages, and grammatical conventions. Note that all orthographies for Lenape are relatively recent inventions based on various European languages, and that standardization here is more than warranted. Andreas.b.olsson (talk) 05:17, 29 July 2021 (UTC)
- @-sche Chuck Entz (talk) 05:45, 29 July 2021 (UTC)
- Also pinging @DCDuring as a participant in the previous discussion. —Mahāgaja · talk 06:06, 29 July 2021 (UTC)
- This is tricky. Some people want to document each language as such, especially with an eye to their past as mutually unintelligible languages (Marianne Mithun, The Languages of Native North America, 2001, page 331; Siebert even seems to suggest Munsee was more closely related or similar to Mahican than to Unami) with extensive differences in orthography and phonology (e.g. in having l vs r in their reflexes of PA's *r/*l phoneme). (Archaeological as well as linguistic evidence suggests the distinction between the two groups goes back to prehistoric times.) Merging northerly Munsee into the now more dominant southern Unami could complicate documentation of that critically endangered language, which some people are studying and learning from its remaining speakers in Canada. On the other hand, the most prominent revitalization efforts in the United States, based mostly on southern Unami, do seem to speak of Lenape as a single language (and FWIW, Canadian Munsee efforts, albeit with a different spelling, also seem to speak of a Lunaape language), apparently aiming to standardize the two into one language with an eye towards keeping it alive into the future. We have to think carefully about what to do here. - -sche (discuss) 17:52, 29 July 2021 (UTC)
- I have nothing to contribute beyond the wish for some precision in the etymologies of toponyms, which can easily be accomplished with qualifiers or labels without complicating the creation of good language entries. DCDuring (talk) 18:33, 29 July 2021 (UTC)
- You say on Talk:mochipwis that "I'm a learner of modernized Lenape, a language in the process of being revived. I am not an expert in the differences between Wënami versus Munsee. What I do know is that there are no first-language speakers of either dialect left." This goes to the heart of the issue, I think: Wiktionary does not only cover modern languages as they are currently spoken, but also covers languages that existed in the past. Although (as noted in the earlier discussion) I initially created "Lenape" content under a unified code, following the "lumper" approach of the US revivalists, the differences between the lects are (as Chuck noted in that same discussion) historically extensive, to the point that the modern linguistic literature I've been able to find that says anything about their intelligibility accepts Mithun's statement that they were mutually unintelligible, which militates against combining them. (OTOH, if modern speakers are trying to merge them, that militates in favor of a merger.) Wiktionary does merge e.g. many Sinitic languages even when they're mutually unintelligible when spoken (and conversely we split e.g. Bokmål and Nynorsk even though they're not merely mutually intelligible but the same language), so either approach could be made to work. - -sche (discuss) 19:29, 29 July 2021 (UTC)
- @-sche: What about recognizing all three codes? We could have Lenape (either using del or creating a new code for it, e.g. del-len, so as to keep del for the family) for the modern language undergoing revitalization, and also have unm and umu for the historical stages. —Mahāgaja · talk 19:57, 29 July 2021 (UTC)
- Adding a link here to the important Lënape Talking Dictionary maintained by the Tribe of Delaware Indians. It should be recognized that the code del (for Delaware) is potentially problematic and controversial. If I say in New York City “I speak a little Delaware” I’m likely to get quizzical looks. But if I say “I’m learning Lënape” an educated resident of the city is likely to remember social studies classes in elementary school, contextualize, and understand what I mean. Wouldn’t it be better to have a new code for modernized Lënape that contains the letters in Lënape? Andreas.b.olsson (talk) 21:56, 29 July 2021 (UTC)
- We use the ISO 639-3 codes for languages, so that's out of our control. They're not going to change the code; they've refused for www, which has some significant real-life problems. In practice, the codes may sometimes look like abbreviations, but in theory, there's no necessary connection; it's just a three letter code. --Prosfilaes (talk) 01:30, 30 July 2021 (UTC)
- @Prosfilaes there is no ISO designation for the Lënape variants. So what language does the ISO code del stand for? Old Wënami? Modern Lënape? Munsee dialects still spoken in Moraviantown? Andreas.b.olsson (talk) 02:50, 30 July 2021 (UTC)
- @Andreas.b.olsson: In the ISO itself, del stands for a macrolanguage called Delaware whose individual languages are Munsee (umu) and Unami (unm) (see https://iso639-3.sil.org/code/del). We're not obliged to use ISO's names, though, so we are free to call del Lenape or Lënape rather than Delaware. Don't get hung up on the letters used for the code; they don't have to bear any relation to the name of the language (xcl stands for Old Armenian, for example, and the code for Mapudungun is arn). —Mahāgaja · talk 07:40, 30 July 2021 (UTC)
- @Mahagaja: Fair enough about the abstraction. I did not know that ISO had a more complex sub-categorization. Thank you for providing the link and helping my knowledge about these standardizations grow. My concern was related to the fact that del is derived from Delaware – the European name for the region – and not Lënapehokink, the Lënape name. However, as far as I understand the Lënape are themselves comfortable with the syncretic designation and retain it in their name (Nation of Delaware Indians). I should therefore not assume any offending connotation in the designation del. I’m trying to reach out to the Oklahoma branch of the Lënape. Let’s see what they say about this issue. As a dominant constituency they should have a say about this actively evolving linguistic situation. On an additional note, there is a significant difference in the orthography being actively taught by the mainly Munsee branch in Moraviantown, Canada. Andreas.b.olsson (talk) 08:49, 30 July 2021 (UTC)
I’m not sure there is anything wrong in the abstract splitting of the dialects/languages. It’s the naming in Wiktionary for these lects that does not reflect how these lects are referred to in actuality (where there is unfortunately ambiguity). A proper and respectful name is needed for the following variants:
- Variant currently taught at Talking Lënape Dictionary and extensively documented by Jim Rementer et al. in recent history (late 20th century).
- Variant currently taught in Moraviantown, where there seem to be wide variations between the last handful of first-language speakers. I’m not sure any of them are still alive, but the community seems to be sticking with their own historical orthographical variant rather than using the revised system of Rementer et al. (the one I’m learning and using).
- Historical variant as recorded in the time of David Zeisberger (18th century).
The modern variant in Moraviantown seems to be referred to alternatively as Lunaape and Munsee Language. The modern variant based on the work of Rementer et al. is consistently referred to as Lënape or alternatively without the schwa (Lenape). Andreas.b.olsson (talk) 13:06, 30 July 2021 (UTC)
- How about the following naming:
- Lënape
- Lunaape
- Old Lenape
- The distance between the lects spoken by the Unami and Munsee tribes at the time of Zeisberger and prior seems largely speculative. Andreas.b.olsson (talk) 16:27, 30 July 2021 (UTC)
- For the purpose of historical linguistics Old Lenape can be divided into:
- Southern Unami
- Northern Unami
- Munsee
- Note here that Unami seems to be a term derived from historical linguistics. I believe it should not be used to denote any of the revitalized lects currently in use. However, the Stockbridge-Munsee do refer to their Lunaape variant in English as the “Munsee language”. Andreas.b.olsson (talk) 14:54, 31 July 2021 (UTC)
- Unless the Stockbridge-Munsee Community object, I would suggest using Lunaape as the name of their orthographic variant and dialect. Andreas.b.olsson (talk) 15:04, 31 July 2021 (UTC)
Here is what I propose in terms of naming, categorization, and ISO encoding.
- Old Lenape: del
- Munsee: umu
- Unami: unm
- Lënape: lnp
- Lunaape: lne
LNP and LNE are not used yet by ISO 639-3. Andreas.b.olsson (talk) 15:29, 1 August 2021 (UTC)
North and South Unami can be treated as minor variants (i.e. dialects). Munsee is treated as distinct, leaving the question open about whether it was more closely related to Mohican. Andreas.b.olsson (talk) 15:35, 1 August 2021 (UTC)
- We can't be making up our own three-letter codes, but we can put them after a family code, separated by a hyphen. So if we wanted to, we could in theory create alg-lnp, alg-lne, etc. —Mahāgaja · talk 10:08, 2 August 2021 (UTC)
- Why can’t we be making up codes if they are not used? This can be in conjunction with a request to ISO that these lects be added with those codes. It’s one thing to request a change, another to treat the ISO 639 as extensional. With its increased influence Wiktionary ought to have some influence on ISO. Andreas.b.olsson (talk) 16:40, 2 August 2021 (UTC)
- We can't make up codes because ISO might assign them to some other languages in the future. I think you might be overestimating Wiktionary's influence; also, the ISO committee would take a whole lot of convincing before they agreed to two more codes. Considering you started out this discussion arguing against treating Unami and Munsee as two different languages, it seems like it would be quite a stretch to convincingly argue that they are, in fact, five different languages. From everything you've explained above, I'm starting to think we should recognize only del as a canonical language and reassign unm and umu to be etymology-only codes for subvarieties of del; if necessary, we can add more etymology-only codes (since these are local to Wiktionary and have nothing to do with ISO) for other chronolects and regiolects. —Mahāgaja · talk 18:38, 2 August 2021 (UTC)
- @Mahagaja I think you’re low-balling Wiktionary’s growing influence, and a few people’s ability to influence seemingly immobile institutions. My point has evolved based on @-sche’s input about linguistic historicity and the strong difference between the orthography used by those who learn Lunaape versus Lënape. Importantly, there is extremely strong merit in treating the language recorded by Zeisberger et al. in the 18th century as a different language. It is grammatically quite distant from Lënape and Lunaape. Note that the codes del, unm, and umu seem to be based on the historical study by Moravian missionaries (who referred to the language as “Delawarische Sprache”). There needs to be a clear distinction between these historical lects (which I’m referring to as “Old Lenape”) and the languages spoken today. Andreas.b.olsson (talk) 03:26, 3 August 2021 (UTC)
- Seeming immobile to whom? They've seemed relatively responsive to me; they offer a final decision on most requests in an annual wrapup. I certainly question any need to bluster ahead and then expect SIL to follow; that seems likely to be counterproductive.--Prosfilaes (talk) 09:31, 4 August 2021 (UTC)
- I would also note here that the German entry in Wikipedia is titled “Delawarisch” and talks about it first as a single language, and then alternatively as two languages depending on one’s view. The equivalent English entry is “Delaware languages”. However, in both entries the focus is on the language(s) studied by Moravian missionaries et al., and then by linguists studying those initial European studies (which haphazardly invented wildly different orthographies based on the authors’ own mother tongues). @-sche is right: it’s complicated. Here are the facts:
- Old Lenape is grammatically very different from modern variants.
- There are two very different modern orthographical variants.
- If this were Norwegian, you would without hesitation have Gammelnorsk, Bokmål, and Nynorsk. Andreas.b.olsson (talk) 04:01, 3 August 2021 (UTC)
- So if we are going to base decisions on precedent – which one ought to – then the way Wiktionary has approached the orthographically distinct Norwegian Bokmål and Nynorsk speaks for having distinct language codes for Lënape and Lunaape regardless of whether they should be treated as the same language or not. Unfortunately, del really in the end refers to Old Lenape since the ISO further subdivides it into Unami and Munsee, which are really references to the language(s) studied by Zeisberger et al. and those who continue to study their studies. To use it for Lënape and Lunaape would be confusing and improper. Andreas.b.olsson (talk) 04:25, 3 August 2021 (UTC)
- This is not Norwegian, and there's no other language that is presented like Norwegian. The closest are Ottoman Turkish and modern Turkish and Urdu and Hindi, and in both of those cases you're looking at different scripts, Arabic and Latin or Devanagari plus vocabulary divergence. One has to look at all the precedent, not just one example.--Prosfilaes (talk) 09:31, 4 August 2021 (UTC)
- On a last note for tonight, the codes alg-lnp and alg-lne proposed by @Mahagaja for Lënape and Lunaape could be a reasonable compromise. Andreas.b.olsson (talk) 04:34, 3 August 2021 (UTC)
- IMO, as someone who has added a good portion of the Munsee entries, unless someone can demonstrably show the need, I would say the way things are set up right now is perfectly fine. More work should be done to show the linguistic differences before any new codes are created. This all seems like hypothetical conjecture to me. -- {{victar|talk}} 05:25, 3 August 2021 (UTC)
- Having looked closer at Zeisberger’s work, I should say that how distant Old Lenape is from Lënape and Lunaape is uncertain to me. There seems to have been a lot of simplification in how verbs are used, with certain modes falling greatly out of favor (e.g. the subordinate mode). The issue is complicated by the very different orthography used by Zeisberger, who used German-inspired phonological conventions. If Wiktionary were to want to document these changes though, it ought to leave a space for Old Lenape. The issue reminds me of Swedish, which went through a drastic change in orthography in the late 19th and early 20th century.
- I think what I get back to is that Unami is not the proper L2 name for a language. The proper name for this language is Lënape. So even if we keep the coding structure intact, the name of that lect needs to be changed (Unami => Lënape). Andreas.b.olsson (talk) 09:24, 3 August 2021 (UTC)
- Out of curiosity @Victar, how come you chose Munsee as the name of the L2 lect and not Lunaape? I see that the community in Moraviantown uses either designation. Nonetheless, what made you choose one over the other? If you are a member of the Stockbridge-Munsee community, please help me better understand, and my sincerest apologies for potentially overstepping. Andreas.b.olsson (talk) 09:36, 3 August 2021 (UTC)
- We (Wiki/Wikt) base much of our language nomenclature and classifications on the recommendations of Ethnologue. -- {{victar|talk}} 19:15, 3 August 2021 (UTC)
I guess another thing that bothers me is that the code unm – however abstract in the mind of the Wiktionary community – carries the stigma of a split. Its continued use will cause continued sclerosis. It does not seem a term that all parties – Lënape in Oklahoma, Munsee in Canada, and enthusiasts like myself in New York City – can rally around. It seems to me that there is something quite new going on, an amalgamation and standardization that should be captured in vivo by Wiktionary. Andreas.b.olsson (talk) 10:14, 3 August 2021 (UTC)
(Sorry, I was busy/distracted.) If we could reach out to Lenape folks, including Munsee speakers or learners, for clarification of whether they consider the lects one language (in particular, if Munsee consider their language to be the same as Unami, or if it is only the US-based folks running the Unami-based Lenape Talking Dictionary who name it as if it speaks for all Lenape), that'd provide useful evidence: if speakers want to consider the lects one language, it'd be evidence that we should merge (and distinguish with qualifiers, etc); otherwise, the current setup is fine. It seems some of the objection comes down to just the names, preferring "Lenape" to "Unami"; in another discussion Andreas notes that speakers would sooner say the equivalent of "I speak Lenape" than "I speak Unami", but this is not evidence that they are one language (consider various Dogon languages, which are all Dogons though not all mutually intelligible, including Bangime language whose speakers consider themselves Dogon who "seem unaware that [their language] is not mutually intelligible with any Dogon language"). Simply renaming "Unami" to "Unami Lenape" (and "Munsee" to "Munsee Lenape") is an option, although we usually try to use the most commonly used names for languages, which may be the current names. The shape of the codes is a non-issue since they're mostly not visible to readers, only editors. This discussion does make me realize that they're reader-facing in the names of certain categories, and that this is probably confusing, but that's a general issue not specific to Lenape. (How many people looking at our Khanty content would guess that the list of berries is at "Category:kca:Berries"? Shouldn't we always use language names?) There's no linguistic basis for introducing more than one new code (e.g. a five-way split!), as floated above. - -sche (discuss) 02:24, 5 August 2021 (UTC)
- I’m the newcomer here @-sche. Though I have used Wiktionary for a long time, I did not prior to this participate in its maintenance. I will defer to you all on the best back-end encoding, and whether proliferating codes for anticipated future linguistic differentiation makes little sense. My objection – and this was not clear to me before either – seems reduced to the fact that “Lënape” was stripped from how the lect(s) are referred to. I think there is sufficient online evidence to make the affirmation that the so-called Unami branch is almost exclusively referred to as Lënape. Therefore I think it would be warranted to make this change as soon as possible. Having looked at quite a few of the “Unami” entries, it seems to me that they are almost all derived from the work of the Talking Lënape Dictionary project. Hence the issue of the distance between the lect recorded by Zeisberger et al. versus the more recently collected speech samples of the Talking Lënape team is moot, and can be resolved later. Note though that this may mean having to encode for another lect at some point in the future (especially since the spelling used by the Moravian missionaries was based on German and very different). I agree fully that the Munsee community should have a say in whether to treat their lect as the same as that taught by the Talking Lënape Dictionary project. It should be noted though that it can be factually stated that they have to date used a very different orthography in lessons and materials they have posted online. Andreas.b.olsson (talk)
@-sche, @Mahagaja, @Victar can we split out the decision about the naming for the lect with the code unm and rename it from Unami to Lënape? There is overwhelming evidence that Lënape should be the proper name of the language, and not Unami. The code issue and whether to treat the Munsee branch as a separate language (and what to call it) can be dealt with separately and later.
- @-sche, @Mahagaja, @Victar asking again about splitting out the issue of the language's proper name and making an initial narrower decision. I would like to continue building out Wiktionary's Lënape entries, but first I would like to make sure the language currently referred to as "Unami" is referred to by its proper name. Andreas.b.olsson (talk) 13:01, 12 August 2021 (UTC)
- I have no objection (other than that it's hard to type) to renaming unm to Lënape. —Mahāgaja · talk 13:10, 12 August 2021 (UTC)
- AFAICT "Lenape" is a far more common name than "Lënape", so renaming "Lenape" to "Lënape" does not seem appropriate. And because Unami is only one variety of Lenape, renaming Unami to Lenape/Lënape would be confusing. The difficulty seems to be that the main US-based language project produces its Talking Dictionary seemingly based on Unami but named as if it's one overarching "Lenape"/"Lënape" language. (I know this kind of thing has come up before, where a prominent dictionary of what it considers "a language" doesn't specify which of two actually-distinct lects its words are, or conflates them; I will track down other examples if necessary, but it seems tangential.) I don't want to stand in the way of a Native language revitalization effort, so the above-discussed option of leaving Unami and Munsee for the historical (and historically not mutually intelligible) varieties and making Lenape a language code (rather than a family code) for the "unified" language has some appeal, but it would entail duplicating a lot of content which is currently Unami. There is also the option of renaming Unami to "Unami Lenape" to get the "Lenape" name in there. - -sche (discuss) 19:19, 30 August 2021 (UTC)
- Again, I agree -- renaming Unami to Lenape would simply cause confusion whilst adding no benefit and renaming it to "Unami Lenape" does not serve to disambiguate. -- {{victar|talk}} 20:13, 30 August 2021 (UTC)
- I’m not sure I understand for whom this would be confusing @Victar. I agree @-sche that Lenape may be simpler to type than Lënape. Anyway, the schwa indicator is optional in the dominant modern dialect of Lënape. I have a deep hunch here that Linnaean taxonomic orthodoxy and perpetuation of erroneous past anthropological claims are at fault for the strict linguistic distinction between Unami and Munsee. The more I study Lënape, the fewer distinctions I see in the various dialects except for in the choice of orthography (which noticeably stems from quite recent European attempts to record an unwritten language). Andreas.b.olsson (talk)
I suspect that Wiktionary is perpetuating a European construct erected between the late 18th and early 20th century. Of course, these past errors of European linguists have had a real effect, seemingly causing “two languages” to appear to exist. I need to further study the Munsee branch to substantiate my claims.
I continue to believe Unami is a culturally and linguistically inappropriate term for any dialect of Lënape. If the language spoken by the descendants of the Unami tribe of the Lënape were to have any other name than Lenape – which I insist is the appropriate designation – it might be Delaware. This is a name I believe the (predominantly) Unami descendants living in Oklahoma accept and sometimes use in English despite its European origins.
Wanishi Andreas.b.olsson (talk)
2021 — August
Category:English temporal location adverbs > Category:English punctual adverbs, also for Gothic and Hungarian
Module:category tree/poscatboiler/data/lemmas already links to "punctual adverbs" but this module does not include "temporal location adverbs", although these apparently correspond to each other. (The former is defined as "adverbs that express a single point or span in time".) Also, "temporal location" looks a bit like a contradiction in terms.
For your convenience, the links to the other two category pairs are
- Category:Gothic temporal location adverbs > Category:Gothic punctual adverbs and
- Category:Hungarian temporal location adverbs > Category:Hungarian punctual adverbs.
Thanks in advance. Adam78 (talk) 17:21, 18 August 2021 (UTC)
Translations (and definitions, the usage notes, etc.) should be merged. J3133 (talk) 16:40, 25 August 2021 (UTC)
- Support for the sake of deduplication. - excarnateSojourner (talk|contrib) 07:30, 29 October 2021 (UTC)
Overlap with armenisht. Also, it may be an adjective not a noun Wubble You (talk) 19:01, 27 August 2021 (UTC)
2021 — September
Names of sah, alt, xgn-kha and request for Soyot
The Constitution of the Republic of Sakha (Yakutia) (https://iltumen.ru/constitution) officially uses язык саха ("Sakha language") to refer to the language sah. A government decree («О Правилах орфографии и пунктуации языка саха», "On the Rules of Orthography and Punctuation of the Sakha Language"), which approved the language’s current orthography, uses язык саха instead of якутский язык ("Yakut language") in its annexe. However, this usage is not mandatory. I suggest that Sakha be adopted instead of Yakut on the basis of the Constitution.
Since atv ‘Northern Altai’ is not a single language/dialect but a group of several (Kumandy, Chelkan & Tubalar), atv should be split into subcodes. Furthermore, Southern Altai is only a classifying term; Altai, as the official term, is suggested for alt.
As for Khamnigan xgn-kha, a transitional dialect (with conservative phonology) between Buryat and Mongolian, its simple name may not create ambiguity.
In addition, I also request a code for Soyot. It will help in contrasting Sayan Turkic languages. LibCae (talk) 06:36, 2 September 2021 (UTC)
- The Constitution of the Republic of Sakha is not our guide to using English names. In the case of [sah], most scholarly descriptions use "Yakut" (e.g. The Turkic Languages), there are far more raw Google hits for "Yakut language" than "Sakha language", and Google Ngrams show a preference for "Yakut" that has not waned over time (but we don't know past 2008, after which the data are incomplete).
- I can't comment on the other code requests, but it would be more convincing if there were some evidence in favour of the need for these codes and their distinctiveness from their closest relatives. —Μετάknowledgediscuss/deeds 16:11, 2 September 2021 (UTC)
- I don’t see the argument how more information would come to light if we split Northern Altai. Surely also Northern Altai and Southern Altai are the most usual names, in either English or Russian. For that number of speakers Northern Altai has, how could there be a benefit? The major factor for editors is what sources they use, whether they indicate the sources and whether those are clear about the place of origin. I had many books about “the Aramaic dialect of [village X]” where I don’t know which damn language code of Wiktionary it is supposed to belong to, Wiktionary making codes centered around city A and B but not village X, in the end I ignored to add anything. Fay Freak (talk) 17:00, 2 September 2021 (UTC)
- Oppose renaming Yakut
- Support splitting atv
- Support renaming alt to Altai
- Abstain regarding xgn-kha
- Support creating a code for Soyot, quite strongly so. Allahverdi Verdizade (talk) 17:13, 2 September 2021 (UTC)
Not the same as take a knee TVdinnerless (talk) 23:56, 2 September 2021 (UTC)
- What's the difference? - excarnateSojourner (talk|contrib) 07:35, 29 October 2021 (UTC)
Merge into checked and entering? —Suzukaze-c (talk) 06:11, 3 September 2021 (UTC)
Saltillo in Rapa Nui
Previous discussion: User talk:Kwamikagami#Saltillo
Pinging @Kwamikagami, Metaknowledge as users that were already part of the discussion. For others, the following TLDR:
- User:Kwamikagami has moved a few Rapa Nui pages en masse from a straight apostrophe (U+0027) to a saltillo (U+A78C)
- The reason they give for this is that, since Unicode classifies the apostrophe as a punctuation mark, rather than a letter, it shouldn't be used as a letter, and thus the visually similar saltillo should be used.
- The counter-reason given is that Unicode's classification is arbitrary and has little to do with actual usage in the language, which we as Wiktionary want to follow.
- There is one mention of the saltillo being open to usage, in Kieviet (2007).
- There is yet to be found at least one usage of the saltillo for Rapa Nui in the wild, since both Kieviet (2007) and schoolbooks published by the Chilean government use either a straight apostrophe (U+0027) [this one is most common], or a curly apostrophe (U+2018) provided with a font that renders it similar to a prime (U+2032). Other grammar books and dictionaries use any of the three characters.
I believe we should move these pages back to a straight apostrophe, and set the use of the straight apostrophe in stone at WT:ARAP. What do others think? Thadh (talk) 10:53, 3 September 2021 (UTC)
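For anyone who wants to check the Unicode classification that this disagreement turns on, the following is a minimal sketch using Python's standard unicodedata module. The dictionary labels and the helper names (describe, is_letter) are illustrative assumptions, not part of any Wiktionary tooling, and the sketch says nothing about which character Rapa Nui sources actually use.

```python
import unicodedata

# Code points mentioned in this discussion. The labels are informal
# descriptions chosen for this sketch, not official Unicode names.
CANDIDATES = {
    "straight apostrophe": "\u0027",
    "left curly quote": "\u2018",
    "right curly quote": "\u2019",
    "prime": "\u2032",
    "modifier letter apostrophe": "\u02BC",
    "saltillo": "\uA78C",
}


def describe(ch):
    """Return (code point, official Unicode name, general category) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def is_letter(ch):
    """True if the general category of ch is one of the letter categories (L*)."""
    return unicodedata.category(ch).startswith("L")


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        code_point, name, category = describe(ch)
        kind = "letter" if is_letter(ch) else "not a letter"
        print(f"{label}: {code_point} {name} [{category}] -> {kind}")
```

The sketch only reports how the Unicode Character Database classifies each code point; whether that classification should decide page titles is exactly the question under discussion here.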
- We have four sources:
- We have Du Feu, who used a special font because the usual fonts available to her were inadequate for Rapa Nui, which required two special letters (the glottal stop and the engma). If the ASCII apostrophe were adequate for glottal stop, there would've been no need for a special letter.
- We have Kieviet, who states that, now that Unicode provides for the saltillo, there is no longer a need for a special font.
- We have the ministry dictionary, which uses an apostrophe letter -- not ASCII input with smart quotes, because it has the '9' shape at the beginning of a word.
- We have the ministry educational material, which uses a hodgepodge of ASCII apostrophes, curly apostrophes and curly quotation marks -- that is, sometimes '1' shaped, sometimes '9' shaped and sometimes '6' shaped, with little consistency. Presumably we wish to aim for better than that, even if it is common.
- In most languages that use an apostrophe-like letter for glottal stop, it's common to substitute a keyboard <'>, but that doesn't mean we should do the same. When writing Chechen, it's common to use a digit <1> for palochka, but again that doesn't mean we should do the same. When writing Ossetian, it's common to use a Latin rather than Cyrillic æ, but if you did that in a domain name, it would likely be tagged as phishing. The shortcuts people take with typography may be common, but a dictionary is expected to be more professional. kwami (talk) 15:10, 3 September 2021 (UTC)
- To briefly summarise the important points of what I said on Kwamikagami's talk page: This move should have been raised here first, so the weight of the evidence should have to point to the saltillo for us not to move it back. Kwami is from Wikipedia, and believes that we should be "more professional", even at the cost of ignoring all actual usage in a language community. (He has not, to the best of my knowledge, taken me up on my suggestion that he should go to the Wikipedias of languages like Rapa Nui and Hausa that use the apostrophe, and tell them that they're doing it wrong — just us.) I was open to the possibility that the saltillo might see actual use, but the fact that it doesn't makes this seem to be all about the Unicode specifications, which are not relevant to a descriptive dictionary. As a result, I support moving back to the apostrophe. —Μετάknowledgediscuss/deeds 16:15, 3 September 2021 (UTC)
- There are several recent cases where @Mahagaja has advocated a particular Unicode character instead of the straight apostrophe in such cases, but I don't remember the specifics off the top of my head. Chuck Entz (talk) 16:18, 3 September 2021 (UTC)
- We do need to use Unicode correctly. The straight apostrophe (U+0027) and curly apostrophe (U+2018) are punctuation marks and should not be used as letters. That's what the saltillo (U+A78C) and modifier letter apostrophe (U+02BC) are for. If using punctuation marks as letters were acceptable, Unicode wouldn't have bothered creating those characters. Using punctuation marks for letters is as bad as mixing Latin and Cyrillic (which is something we used to do for Montenegrin Serbo-Croatian, but don't anymore), as Kwami points out, and just because other sources do it doesn't mean we should. We can, of course, have hard redirects from spellings with the more easily typable straight apostrophe, or put the correct page name in {{also}} if the spelling with the straight apostrophe exists (as a punctuation mark) in another language. But Kwami was quite right to move these Rapa Nui pages to the spelling using the correct character, and they should not be moved back. —Mahāgaja · talk 16:39, 3 September 2021 (UTC)
- @Mahagaja: So if nearly everyone writing text in a given language (say, tens of millions of people) use a character that you consider "wrong", we should still avoid it because it doesn't respect Unicode? Whatever happened to descriptivism? (And if you think this is a silly hypothetical, it's not — I just described the situation with the apostrophe in Hausa.) —Μετάknowledgediscuss/deeds 21:56, 3 September 2021 (UTC)
- It reminds me of when I started adding entries in the Cupeño language and had to figure out how to deal with a letter that the (pre-Unicode) main source defends as being very easy to replicate by filing bits off the $ key on a typewriter. People work with what they have available, and it doesn't always fit neatly into the right categories. Chuck Entz (talk) 23:00, 3 September 2021 (UTC)
- @Metaknowledge: If tens of millions of people used Rapa Nui, it would have its own keyboard layout and the saltillo would be easy to type for them. Descriptivism applies to language, not orthography. It's not anti-descriptivist to say that recieve is a misspelling, and using an apostrophe as a letter is also a misspelling. The only difference is that using an apostrophe instead of a saltillo isn't a mistake that can be made when writing by hand or by typewriter or that can be detected in a photocopy or a scan, so it's more subtle (like mixing Latin and Cyrillic), but it's still a mistake. —Mahāgaja · talk 06:49, 4 September 2021 (UTC)
- @Mahagaja: As I said, my example wasn't a hypothetical. There are somewhere around 60 million native speakers of Hausa per WP. Mac offers lots of keyboards for lots of languages, including one for Hawaiian complete with ʻokina, but it doesn't provide a Hausa one. When I search for Hausa keyboards on Google, they provide the apostrophe and quotation marks, but no character designated by Unicode as a letter. So are you really maintaining that nearly all typed material in Hausa is misspelt? —Μετάknowledgediscuss/deeds 07:11, 4 September 2021 (UTC)
- Yes, though of course that's not the Hausa users' fault, it's the fault of the software companies that care more about providing support for a minority language spoken by 24,000 people in the United States than about providing support for a language spoken by tens of millions of people in Africa (i.e. systemic racism). I don't blame Hausa users for doing the best they can with the materials available to them, and I know it's unrealistic to expect them all to type ʼ instead of just hitting the apostrophe key, but as a dictionary it's our responsibility to do things the right way rather than the easy way. —Mahāgaja · talk 07:29, 4 September 2021 (UTC)
- @Mahagaja: Systemic racism is the root cause of lots of annoying things, but some of those things are set in stone. At this point, Hausa users have no reason to follow Unicode rules even when they can. I'm sure the editors at Hausa Wikipedia can figure out how to get the "correct" character if they wanted to, but I see that you too have no interest in going over there and telling them they're doing it wrong. I have a radical idea: let's respect their choices. —Μετάknowledgediscuss/deeds 17:57, 4 September 2021 (UTC)
- I have a better idea. We'll let Hausa Wikipedia worry about Hausa Wikipedia, and we'll worry about Wiktionary, which, as I said, has a responsibility to use Unicode correctly, even when other Wikimedia projects use it wrong. —Mahāgaja · talk 18:11, 4 September 2021 (UTC)
- I thought we had a responsibility to document languages, not be Unicode purists. —Μετάknowledgediscuss/deeds 18:33, 4 September 2021 (UTC)
- If Wiktionary or our browsers represent the languages incorrectly, because they follow the Unicode definition that punctuation marks are punctuation marks, then we are not documenting the languages correctly. If a language commission chooses a specific Unicode point that is one thing, but that's seldom the case. Since we by necessity choose a Unicode point for each letter regardless, we might as well choose one that represents the language well. kwami (talk) 01:17, 11 September 2021 (UTC)
- Just to jump in quickly, as someone who is Nigerian and has had to go through the process of creating my own keyboards to be able to type properly in Yorùbá and as someone who is learning Hawaiian, while there definitely is systemic racism when it comes to African languages, I really would not pit them against Hawaiian. Hawaiian still lacks a ton of support, sometimes even less than Hausa, Igbo, & Yorùbá (see: spellcheck on PC Microsoft Word or language packs for Windows), and people are still trying to get more support for it. At the same time, Hawaiian is more than just "a minority language spoken by 24,000 people in the United States", it is an indigenous language that currently is the product of tons of effort gone towards revitalizing it and making sure that it's well-supported. And so, please do not pit them against each other saying that Hawaiian having more support (even though it doesn't) is systemic racism. The communities are aiming for very similar goals and are all dealing with racism in our own ways, not from each other.
- Re: the main issue at hand. I would go with what the speakers of the language use. It's similar to what we do for Hausa, Igbo, & Yorùbá tones. No matter how annoying it can be, since the majority of speakers don't write tones out, we don't put them in page titles and only in headword lines, since we want people to be able to find words that they see "in the wild", which will often not be tone-marked. So it's a similar issue here, if the majority of speakers and majority of texts don't use the special character and it's not hard prescribed, then the page title shouldn't change, and the special character can be put in the headword line. That's my personal take on that issue. AG202 (talk) 13:48, 11 September 2021 (UTC)
- @Mahagaja: As I said, my example wasn't a hypothetical. There are somewhere around 60 million native speakers of Hausa per WP. Mac offers lots of keyboards for lots of languages, including one for Hawaiian complete with ʻokina, but it doesn't provide a Hausa one. When I search for Hausa keyboards on Google, they provide the apostrophe and quotation marks, but no character designated by Unicode as a letter. So are you really maintaining that nearly all typed material in Hausa is misspelt? —Μετάknowledgediscuss/deeds 07:11, 4 September 2021 (UTC)
- @Metaknowledge: If tens of millions of people used Rapa Nui, it would have its own keyboard layout and the saltillo would be easy to type for them. Descriptivism applies to language, not orthography. It's not anti-descriptivist to say that recieve is a misspelling, and using an apostrophe as a letter is also a misspelling. The only difference is that using an apostrophe instead of a saltillo isn't a mistake that can be made when writing by hand or by typewriter or that can be detected in a photocopy or a scan, so it's more subtle (like mixing Latin and Cyrillic), but it's still a mistake. —Mahāgaja · talk 06:49, 4 September 2021 (UTC)
- It reminds me of when I started adding entries in the Cupeño language and had to figure out how to deal with a letter that the (pre-Unicode) main source defends as being very easy to replicate by filing bits off the $ key on a typewriter. People work with what they have available, and it doesn't always fit neatly into the right categories. Chuck Entz (talk) 23:00, 3 September 2021 (UTC)
- @Mahagaja: So if nearly everyone writing text in a given language (say, tens of millions of people) use a character that you consider "wrong", we should still avoid it because it doesn't respect Unicode? Whatever happened to descriptivism? (And if you think this is a silly hypothetical, it's not — I just described the situation with the apostrophe in Hausa.) —Μετάknowledgediscuss/deeds 21:56, 3 September 2021 (UTC)
- We do need to use Unicode correctly. The straight apostrophe (U+0027) and curly apostrophe (U+2019) are punctuation marks and should not be used as letters. That's what the saltillo (U+A78C) and modifier letter apostrophe (U+02BC) are for. If using punctuation marks as letters were acceptable, Unicode wouldn't have bothered creating those characters. Using punctuation marks for letters is as bad as mixing Latin and Cyrillic (which is something we used to do for Montenegrin Serbo-Croatian, but don't anymore), as Kwami points out, and just because other sources do it doesn't mean we should. We can, of course, have hard redirects from spellings with the more easily typable straight apostrophe, or put the correct page name in
This isn't a case of leaving out elements like tone marks. All RS's for Rapa Nui use the glottal stop. It's a matter of deciding which Unicode point to use for it, not whether to include it.
Re. the poor support for W. African languages, that's not racism so much as bias in the interests of the people developing Unicode. When Unicode decided they would no longer accept precomposed Latin, there was a call for people to get what they needed in before the deadline. But the respondents were all working on European languages. After West Africans started complaining that Unicode didn't adequately support their languages, the Unicode people realized they'd fucked up. At least, the ones I've talked to say they wished they'd realized what was going to happen, and spent more time on major African languages than on minor European languages.
Now that there are supplemental planes, there's room for more precomposed Latin. But as computers improve, there's less and less need for it, so I doubt they'll start accepting precomposed letters again.
I find it amazing that you could write Yoruba without tone. I mean, you can write it without vowels as long as you include the tone! kwami (talk) 23:06, 18 October 2021 (UTC)
- There are many reasons that folks don't write Yoruba with tones, partly because of a lack of solid education, partly because of a lack of technological support, partly because you can (usually) tell what you mean from context, and a ton more. There was a solid seminar done last year at the British Library about it actually, but yea it's complicated. I wish that precomposed characters could be brought back, but that's a pipe dream. I don't think that you can write it without vowels as long as you include the tone though, as Yoruba is very vowel-heavy, and it'd get confusing quickly.
- In terms of the question at hand though, I brought up the comparison more as a way to show how prescribed writing & everyday writing can interact on Wiktionary. If the majority of speakers type/write one way in informal & formal registers, that way should be the one primarily reflected on Wiktionary, while the prescribed way can be shown in the headword line or an alternative form or whatever. However, I don't know the specifics of the situation with Rapa Nui, so I won't comment directly on the specifics of addressing it. AG202 (talk) 23:15, 18 October 2021 (UTC)
- My impression of Yoruba, from the very very little I think I know of it, is that in fluent colloquial speech the vowels tend to assimilate to each other, and even consonants sometimes drop out, so that you might be left with a long [ooooo] with a bunch of tones and just a few consonants. It's the tone that makes it comprehensible. But that's by ear; I guess it wouldn't work well in writing.
- But Hausa, yeah, I can see omitting the tone without any problem, except maybe the need to dab an occasional word. You might learn to write those few words with tone, the way accent marks distinguish homonyms in Romance languages, and otherwise ignore it. And some languages mark changes in tone, rather than the tone of each syllable. But I doubt that would work for Yoruba either. kwami (talk) 23:23, 18 October 2021 (UTC)
- I have a suggestion that might be able to square this circle, but it's a bit awkward to explain so bear with me:
- There are situations when it makes sense to remember the difference between the orthographic character a person intends to write, and the Unicode character which they actually use. A good example of this is the full stop "." (U+002E), which is also used in English (and Translingually) to represent the decimal point. We all agree that a full stop and decimal point are two different things, because any competent French translator would have to treat them differently, but the important thing is that that remains true regardless of which Unicode codepoints we happen to use. Indeed, it's true whether or not we're even encoding the characters at all. The same is true in French with the decimal point and the comma, too. Equally, nobody who receives "A-" on their homework is receiving "A dash" or "A hyphen".
- Conversely, just because I write in full-width doesn't mean that any of us actually think "ｊ" has a distinct identity to "j" etc. There might be technical, historical and/or stylistic reasons why we have both, but the point is that we consider them to have the same orthographic identity.
- However, none of this prevents us from having a particular manual of style when it comes to certain characters. If we want to start using the en dash "–" (U+2013) or minus sign "−" (U+2212) in places where people intended to use them (i.e. intended characters with those orthographical identities), then that's fine. It would be no more of a problem than our choice to use a clear, black, legible font on white by default, when the original might be scrawled on a barely legible manuscript. Obviously there are no codepoints to interpret in cases such as that. Hell, a lot of the time the codepoints "used" are actually just whatever the OCR software vomited up anyway. Just like with misspellings, there needs to be some genuine intention, and it needs to be considered with respect to the orthographical identity of the characters, and not the codepoints they happened to pick.
- A final point is that writers of a language don't necessarily know their own language perfectly, or they might not perceive a conscious distinction between two characters that does actually exist, because the context usually makes it so obvious (e.g. the full stop and the decimal point). It's not enough to say "yes, they intended to write an apostrophe because that's what they used". Are they really treating it as one?
- I don't know enough about Rapa Nui to know whether the saltillo is the most appropriate character, but I hope that's a framework that makes it easier to determine the answer. Theknightwho (talk) 16:00, 7 July 2022 (UTC)
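For reference, the distinction this thread keeps returning to (characters Unicode classes as punctuation versus characters it classes as letters, and codepoints that differ while sharing one orthographic identity) can be checked mechanically. The following is a minimal sketch using only Python's standard `unicodedata` module; the helper names `is_letter`, `describe` and `same_identity_nfkc` are made up for illustration, and NFKC equivalence is only a rough stand-in for "same orthographic identity", not a definition anyone in the discussion proposed.

```python
import unicodedata

# Characters mentioned in the discussion, keyed by a descriptive label.
CANDIDATES = {
    "straight apostrophe": "\u0027",
    "right single quotation mark (curly apostrophe)": "\u2019",
    "modifier letter apostrophe": "\u02BC",
    "modifier letter turned comma (okina)": "\u02BB",
    "small saltillo": "\uA78C",
}


def is_letter(ch: str) -> bool:
    """Return True if the Unicode general category of ch is a letter (L*)."""
    return unicodedata.category(ch).startswith("L")


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME [category]'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} [{unicodedata.category(ch)}]"


def same_identity_nfkc(a: str, b: str) -> bool:
    """Rough proxy for 'same orthographic identity': equal after NFKC folding.

    This catches compatibility variants such as full-width letters, but it
    does NOT unify an apostrophe with a modifier letter apostrophe.
    """
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        print(f"{label}: {describe(ch)} letter={is_letter(ch)}")
    # Full-width j (U+FF4A) versus ASCII j.
    print(same_identity_nfkc("\uFF4A", "j"))
```

A check like this only reports what Unicode assigns to each codepoint; it says nothing about which character writers of a given language intend, which is the question the framework above is meant to address.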
See WT:Etymology scriptorium/2021/September#korku, -u and -i
This Turkish suffix entry is probably the same as -i, and possibly -ı, due to vowel harmony. While I don't know much about Turkish, the fact that this was created as the only Turkish edit ever by this contributor and the other two were created by a veteran contributor who is a native speaker has to count for something. Chuck Entz (talk) 05:45, 18 September 2021 (UTC)
- -u is a harmonized form of -i, as are -ü and -ı. The canonical forms of suffixes are those with i and e. I disagree with the current policy of essentially providing the same definition 4 (or 2) times, see for instance the situation with -im, -ım, -um, -üm where only one contains all meanings and etymological information. I'm in favor of keeping the harmonized realizations of the suffixes as separate articles but I'm strongly in favor of converting the non-canonical forms into simple referral pages (see -dük). --Fytcha (talk) 09:59, 18 September 2021 (UTC)
Apparently the same thing. Reduce one entry to a synonym? Equinox ◑ 16:08, 19 September 2021 (UTC)
2021 — October
Sundanese by @Rankf. I'm guessing this should be lowercase. —Μετάknowledgediscuss/deeds 07:20, 2 October 2021 (UTC)
·~ dictátor·mundꟾ 18:24, 15 October 2021 (UTC)
- Agree. DovaModaal (talk) 09:55, 16 October 2022 (UTC)
- Support — excarnateSojourner (talk · contrib) 07:52, 25 February 2023 (UTC)
Mölmsch
We have an entry at Männeken in Mölmsch (see w:de:Mölmsch (Dialekt)), which is a dialect of Brabantian spoken in Mülheim, Germany. There is no code for Brabantian, which we consider a dialect of Dutch. So what do we do with Männeken? I'm really not comfortable calling it Dutch, even labeled {{lb|nl|Brabant}} (especially since Mülheim isn't in either Belgian Brabant or Dutch Brabant). So I'd like to create a new code, but for what? One specific to Mölmsch? One for Kleverländisch (the subvariety of Brabantian that Mölmsch belongs to)? One for Brabantian? What do others think? —Mahāgaja · talk 18:42, 16 October 2021 (UTC)
- According to Wikipedia it's not part of Brabantian or Dutch, but it belongs to Low Franconian, to which also belong Dutch (nl) and Limburgs (li). LVR differentiates the following Low Franconian dialects in the Rhineland: Kleverlandish, South Low Franconian, Ostbergisch. Mülheim is classified as Ostbergisch. So some possibilities are:
- Have an umbrella code for Low Franconian in Germany.
- Have a code for Kleverlandish and another for Ostbergisch. South Low Franconian, according to Wikipedia, is a synonym (see Limburgisch) or holonym (see Limburgish#Expanded) of Limburgish. In the first case, there's already the code li. In the second case, either all Limburgish entries would have to be moved to South Low Franconian, or there would be South Low Franconian except Limburgish or maybe South Low Franconian in Germany.
- --Myrelia (talk) 21:09, 20 October 2021 (UTC)
- Since there's already a Wikipedia article and Wikidata item for the Bergish dialects, I'm creating a code for Bergish and not worrying about the rest. —Mahāgaja · talk 13:04, 11 February 2022 (UTC)
Move to Wikipedia?
It's encyclopaedic and not really about words, for example the etymology of footsies isn't explained (related to foot?). --Myrelia (talk) 21:09, 20 October 2021 (UTC)
- There is good stuff but it's mostly written like a long essay or book. Equinox ◑ 00:53, 4 November 2021 (UTC)
- This would almost certainly not survive on Wikipedia. I have mixed feelings about the Appendix namespace here, as it seems a lot of things go there that would never be acceptable in the main dictionary, but that appendix pages are so hard to find for the casual user that it doesn't really bring us down. —Soap— 11:36, 13 April 2023 (UTC)
- If we do end up deleting this, I'd hope we could try to contact the editor who wrote most of it (see User_talk:DKThel). There are other wikis that could host content like this where it wouldn't be pushed into the background like our appendix pages are. Admittedly the trade-off for that is having ads and using a site that is itself harder to find. —Soap— 11:47, 13 April 2023 (UTC)
- There's definitely no point in moving it to Wikipedia, since it was originally moved here from Wikipedia, so they already decided they don't want it and foisted it on us. That's why it's written so encyclopedically. If anyone's interested in it, they should clean it up to make it more dictionarian; otherwise we should just delete it. —Mahāgaja · talk 12:34, 13 April 2023 (UTC)
- A lot of these, to be honest, should just have their own entries. AG202 (talk) 19:53, 13 April 2023 (UTC)
2021 — November
These two entries link to each other rather confusingly and there may be redundancy in it. Equinox ◑ 00:52, 4 November 2021 (UTC)
This entry seems to have been created in title case by mistake; while proper nouns are capitalized in Esperanto, "eŭkaristio" is not a proper noun and thus should be moved to eŭkaristio. --Martelkapo (talk) 19:51, 10 November 2021 (UTC)
- What does actual usage look like? Eucharist is a capitalized common noun in English; maybe it is in Esperanto too. —Mahāgaja · talk 20:25, 10 November 2021 (UTC)
Seems to be an alternative spelling of sfacimma, or the other way round. I don't know which should be made the main entry. --Akletos (talk) 09:03, 13 November 2021 (UTC)
- Sfacimma is an alternative spelling of sfaccimma, which is how the word has been mostly written in the past years; the IPA pronunciation of the word is nowadays always /ʃfat͡ʃˈt͡ʃimmə/, with the voiceless postalveolar affricate having always the gemination, hence the spelling cc. Antomanu14 (talk) 13:59, 10 December 2022 (UTC)
Senses 1 and 3 seem to be the same thing. Maybe this could just be reduced to "dated form of guarantee", even. Equinox ◑ 10:56, 13 November 2021 (UTC)
- I think it's been an alternative form of guarantee for a couple of centuries. Only recently (this decade) has it become much less common than guarantee. DCDuring (talk) 17:39, 13 November 2021 (UTC)
- OTOH, MWOnline has differing, but overlapping, definitions for the two terms. DCDuring (talk) 17:42, 13 November 2021 (UTC)
In Britain the -y spelling seems to be almost exclusively used. Equinox ◑ 04:39, 18 November 2021 (UTC)
(There's whanghee too, but that is suitably an alt-form stub entry.) Equinox ◑ 11:22, 20 November 2021 (UTC)
2021 — December
Split Cuneiform script by language (Akkadian, Sumerian and Hittite)
Cuneiform script changed a lot over time, and the default fonts for cuneiform signs may be accurate for Sumerian, but they aren't for Akkadian and Hittite. This issue can be addressed by displaying separate fonts for each language. As far as I'm aware, this can only be achieved by splitting the cuneiform script code (Xsux) into three separate script codes. ObnoxiousCoder (talk) 01:17, 8 December 2021 (UTC)
- Linguistically, I fully agree. But I don't know anything about how to execute this properly. @Fish bowl, Erutuon, do you have any script-handling wisdom? —Μετάknowledgediscuss/deeds 21:29, 22 December 2021 (UTC)
- No need for a split: .Xsux:lang(akk). —Fish bowl (talk) 22:03, 22 December 2021 (UTC)
- @ObnoxiousCoder, in that case, I think all you have to do is ping Fish bowl with the fonts you'd like to be assigned to each language. —Μετάknowledgediscuss/deeds 07:16, 28 December 2021 (UTC)
- @Metaknowledge, Nice, in that case I will discuss it with @Sartma before taking a decision. ObnoxiousCoder (𒅀𒀀𒋾𒁺𒁍𒌒) 18:43, 28 December 2021 (UTC)
- @ObnoxiousCoder, Metaknowledge, Fish bowl, Erutuon: The issue here is that we should at the very least have the possibility to use one font for Sumerian/Old Babylonian, and another for Neo-Assyrian. It means that we should find a way to have Sumerian and Old Babylonian use the same font. We can then leave a general "Akkadian" for Neo-Assyrian and all the other Akkadian dialects. What solutions do we have? Sartma (talk) 12:50, 7 January 2022 (UTC)
- To clarify: I'm only talking about the possibility of having different fonts for quotes and examples (and maybe in the cuneiform spellings table?) not for the main headword. Headwords are given in Old Babylonian (and Romanised), so the current Monumental style works when we give the spelling in reconstruction pages and mentions/links. It would be great if we could use akk-obb and akk-nas (as per WT:List of languages/special) as language codes when needed, for example. Sartma (talk) 15:17, 7 January 2022 (UTC)
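The selector approach suggested in this thread (one script class, with the font chosen by a `:lang()` selector instead of a separate script code) could be sketched roughly as follows. This is only an illustration in Python that builds the CSS text: the font names are placeholders rather than fonts the site is known to use, the helper names `font_rule` and `stylesheet` are invented here, and it is an assumption that the rendered pages carry `lang` attributes such as `akk-obb` for the etymology-only codes.

```python
from typing import Mapping, Sequence


def font_rule(script_class: str, lang: str, fonts: Sequence[str]) -> str:
    """Build one CSS rule such as: .Xsux:lang(akk) { font-family: "A"; }"""
    families = ", ".join(f'"{name}"' for name in fonts)
    return f".{script_class}:lang({lang}) {{ font-family: {families}; }}"


def stylesheet(script_class: str, fonts_by_lang: Mapping[str, Sequence[str]]) -> str:
    """Emit one rule per language code, preserving the mapping's order."""
    return "\n".join(
        font_rule(script_class, lang, fonts)
        for lang, fonts in fonts_by_lang.items()
    )


# Placeholder font names (hypothetical). Order matters: :lang(akk) also
# matches lang="akk-obb", and both selectors have equal specificity, so the
# more specific code has to come later in the stylesheet to win.
EXAMPLE_FONTS = {
    "sux": ["Placeholder Monumental Cuneiform"],
    "hit": ["Placeholder Hittite Cuneiform"],
    "akk": ["Placeholder Neo-Assyrian Cuneiform"],
    "akk-obb": ["Placeholder Monumental Cuneiform"],
}

if __name__ == "__main__":
    print(stylesheet("Xsux", EXAMPLE_FONTS))
```

Under these assumptions, Sumerian and Old Babylonian text would share one font while other Akkadian text falls back to the general `akk` rule, which is the arrangement asked for above.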
Renaming [nlo]
Wikipedia uses the phrase "Ngul (including Ngwi)" to describe this language, which we currently call "Ngul", but this paper indicates that these are just two of several synonyms, and uses "Ngwi" as the primary name. We should follow suit. —Μετάknowledgediscuss/deeds 00:19, 21 December 2021 (UTC)
Renaming [amf]
We currently call this language "Hamer-Banna", after two of its dialects; WP uses "Hamer". This hyphenated name is found in the literature, though it excludes the third dialect, Bashaɗɗa. Modern publications, following the lead of Petrollino's grammar, use the spelling "Hamar" for that dialect. As I see it, if we stick with the hyphenated name, we should change it to "Hamar-Banna", but we could also consider elevating the name of the primary dialect to cover the language as a whole, as WP does, though in that case we should use "Hamar" instead. —Μετάknowledgediscuss/deeds 07:56, 22 December 2021 (UTC)
Equinox ◑ 05:43, 26 December 2021 (UTC)
Indus Valley Language
We currently have this language, which Wikipedia refers to as the Harappan language, as [xiv]. I suggest that we retire the code, because the language is undeciphered and its script has not been encoded, so there is nothing to add to Wiktionary in the foreseeable future. I also suggest that we retire the script code [Inds], which is only used for this language. @AryamanA —Μετάknowledgediscuss/deeds 07:14, 28 December 2021 (UTC)
- Support retiring both xiv and Inds. If the script should be deciphered and the language interpreted someday, we can always unretire them then. —Mahāgaja · talk 10:09, 3 December 2022 (UTC)
Merging Yoruba dialects
Currently, we have codes for [mkl] "Mokole" (see Mokole language (Benin)), [cbj] "Ede Cabe", [ica] "Ede Ica", [idd] "Ede Idaca", [ijj] "Ede Ije", [nqg] "Ede Nago", [nqk] "Kura Ede Nago", [xkb] "Manigri-Kambolé Ede Nago", and [ife] "Ifè" (all of which are lumped into Ede language). These lects are all very close to Yoruba proper (which they use for formal and liturgical purposes), and spoken by people who are considered ethnic Yorubas; moreover, they are included in the Global Yoruba Lexical Database. I have added them as dialects of [yo] "Yoruba" in MOD:labels/data/subvarieties, but treating Yoruba as a macrolanguage means we must remove these codes. (Note: the family code [alv-ede] would have to be removed as well.) @AG202, Oniwe, Oníhùmọ̀ —Μετάknowledgediscuss/deeds 07:29, 28 December 2021 (UTC)
- Merge, obviously again Ethnologue’s fabrications, which were then copied over from Wikipedia and some other “encyclopedias” with their impractical credulity towards this reference. Fay Freak (talk) 07:54, 28 December 2021 (UTC)
- If anything I would keep the Ede family code and change the lects to be etymology-only languages (edit: excluding probably Ifè since it is much more documented), but putting them all under Yoruba I unfortunately oppose for now. The Western Ede languages as seen here have a higher degree of separation from Nuclear Yoruba, and it checks out more when comparing, at the very least, the words and phrases of Ifè to nuclear Yoruba: Ifè-French Dictionary, Peace Corps - IFÈ O.P.L. WORKBOOK, J'apprends l'ife: Langue Benue-Congo du Togo. While there are obviously words that are shared due to them being related languages, it doesn't feel like a dialect of Yoruba (to me at least), so I feel uncomfortable grouping it under Yoruba. Though I do admit that I haven't really looked into the other Ede languages nearly as much. Edit: This paper may be helpful and at least shows some of the differences between Ifè & Yoruba and some aspects of the dialect continuum. Obviously some Ede varieties are much closer to Yoruba, but then I wonder what to do about the other ones. AG202 (talk) 15:09, 28 December 2021 (UTC)
- @AG202: Thanks for the sources. The question of whether to lump a code is in part based on how much extra work is entailed; would you be willing to work through a subsample to see how much we would just be duplicating Yoruba entries, and how much would be distinct? I'm not sure what you're actually advocating, because making them etymology-only languages (which you say you support) would require merging them (which you say you oppose). —Μετάknowledgediscuss/deeds 07:18, 29 December 2021 (UTC)
- @Metaknowledge Yea, sorry for that being unclear. I oppose the merger under solely Yoruba. Regarding the etymology-only part, I would support having all the Ede lects (excluding Ifè) under the header "Ede" and then differentiating on the definition line which Ede lect it is, mainly because they have much less coverage than Ifè, and it's harder to tell their mutual intelligibility. (Though as mentioned I'm not as well-versed with the other lects, so I might be entirely wrong about their continuum) In terms of working through a subsample, I am up to do so, though I am swamped at the moment so it'd definitely take a while, but from what I've seen so far, I'd be worried about putting possible Ifè terms like ɖíɖì (“belt”) or àntã̀ (“chair”) under a Yoruba header and keeping nice clear entries for readers. AG202 (talk) 07:52, 29 December 2021 (UTC)
- Looks reasonable. To clarify, my main note relates to observation that the language names currently in the data are too unnatural to find use and are not even meeting our CFI, which again means there is no entrotopy for those who know the languages to assign material to the designations with little doubt, as there is little to confirm the meanings of the language names, which should be a consideration if you devise new namings, in so far as you would like to not have private language but more or less obvious to new editors what the language codes are for. So I was not to mean that there cannot be a split in a different manner, or a smaller merge, but the current ones should be recognized as off the wall, and then there will have to be something that interrelates the remaining codes if one stumbles upon one, else it will be a reoccurring problem that an editor did not see the distinction of the available language codes. Fay Freak (talk) 01:36, 30 December 2021 (UTC)
2022 — January
Equinox ◑ 10:35, 1 January 2022 (UTC)
probably the same thing. Br00pVain (talk) 13:19, 17 January 2022 (UTC)
Seemingly synonyms. Equinox ◑ 19:00, 18 January 2022 (UTC)
Tagged but not listed in August 2021 by User:Caoimhin ceallach, providing the reason:
I'm in favour of moving this page to *én. As {{R:ine:LIPP|page=221|vol=2}} shows, there is no evidence that points to an initial laryngeal and Greek and Vedic speak against it.
I've redacted the preceding quote by incorporating the reference in the superscript. Thadh (talk) 11:16, 25 January 2022 (UTC)
- We reconstruct all PIE terms with an initial laryngeal on the project, per current PIE theory, so *én = *h₁én. Sidenote, {{R:ine:LIPP}} is an embarrassment in the academic community, and should never be used as a primary source. --Sokkjo (talk) 02:24, 8 February 2023 (UTC)
- can you elaborate on why you think {{R:ine:LIPP}} is an embarrassment? --Ioe bidome (talk) 15:58, 4 March 2023 (UTC)
- @Ioe bidome: His hypothetical system of deriving roots from particles is largely considered crackpottery. --– Sokkjō 20:42, 4 March 2023 (UTC)
- @Sokkjo @Ioe bidome I don't care where we hold this conversation, as long as you reply. I'm the second person who has asked you to elaborate. If you can't, your concern will have to be dismissed. —Caoimhin ceallach (talk) 08:04, 27 March 2023 (UTC)
- I have elaborated on why academics largely reject {{R:ine:LIPP}}, and referred you to this unfavorable review, DOI:10.1515/zcph-2019-0009. That's all neither here nor there, as on this project, we subscribe to laryngeal theory, which also calls for word-initial laryngeals before vowels. If you wish to make an argument for why we should do away with that standard, feel free to start a discussion in the WT:Beer parlour, but as is, your move request is unwarranted. --– Sokkjō 08:52, 27 March 2023 (UTC)
- @Sokkjo
- You seem to have not read that review. If you did you'd see that it is overwhelmingly positive:
- "It will be a reference work for a long time to come."
- "These remarks take nothing away from the importance of the work, which can serve as a basis both for informed synchronic research devoted to one or another group of languages and for a properly comparative study."
- The other review I'm aware of is also overwhelmingly positive:
- "In this massive, and truly monumental, two-volume work that was years in the making, author George Dunkel (henceforth D) draws on the extensive research, and the literally dozens of articles, that he has done throughout his distinguished career as an Indo-Europeanist, investigating the uninflected bits and pieces – the ἄπτωτα (áptota), the indeclinabilia¹ – of the Indo-European lexicon that are so indispensable to the phrasal and sentential syntax and to discourse and text structure in all the family’s languages." https://www.jbe-platform.com/content/journals/10.1075/dia.33.4.05jos
- Nothing about this thing takes away from laryngeal theory.
- I'm going to ask again: please elaborate on your misgivings about LIPP. —Caoimhin ceallach (talk) 10:28, 27 March 2023 (UTC)
- Since I'm guessing you don't have academic access to the second page:
One reservation remains. Despite the author's caution, the processes by which the grammemes he studies were formed belong, by definition, to reconstruction, and the work does not study in detail the processes that take place in the historical period. At times the reader may have the impression that the reconstructed system is of a complexity that makes it typologically implausible; thus, vol. 1, pp. 24–26, the author believes he can reconstruct for Indo-European four pronominal stems of proximal exophora, two of distal exophora, and four anaphoric stems (George Dunkel writes that the stems linked to proximal and distal exophora do not contrast semantically with one another, but only with the absence of deixis; this point is obscure in the reviewer's eyes). Such a wealth of demonstrative stems would require an explanation. Moreover, the opposition between proximal and distal exophora is not necessarily sufficient to cover all the stems of Indo-European, which may have possessed, for example, three degrees of exophora.
In fact it may seem that the reconstruction of Indo-European grammemes is doomed to vagueness, for lack of data allowing one to study, in particular, the exact semantics of the elements concerned at the different chronological stages and in the different geographical areas to be taken into account.
- Again, I'm not here to argue about LIPP -- that's beside the point. The point is that the established convention we follow on the project for reconstructing PIE is that #VC- only possibly exists in pronouns, if even there. See {{R:ine:IEL|52}}. – Sokkjō 23:03, 27 March 2023 (UTC)
- @Sokkjo, I read the whole review. I even quoted from the second page. As I said before, a reservation does not equal an invalidation.
- The validity of LIPP is very much on point. I would like to mention an alternative reconstruction *én (and, if others agree, move the page), which is supported by evidence instead of resting on some misplaced assumption. You preclude any discussion by rejecting the evidence out of hand.
- In addition, I would like to continue citing LIPP, so your violent objection to it ("embarrassment to the academic community") is relevant to me. I think it is fair to say that if you could back up your objection you would have done so by now.
- You seem to have not read that review. If you did you'd see that it is overwhelmingly positive:
- @Sokkjo
- I have elaborated as to why academics largely reject
- @Sokkjo @Ioe bidome I don't care where we hold this conversation, as long as you reply. I'm the second person who has asked you to elaborate. If you can't, your concern will have to be dismissed. —Caoimhin ceallach (talk) 08:04, 27 March 2023 (UTC)
- @Ioe bidome: His hypothetical system of deriving roots from particles is largely considered crackpottery. --– Sokkjō 20:42, 4 March 2023 (UTC)
- can you elaborate on why you think
- I am of course aware that roots had a CₓVCₓ structure. There are good reasons for assuming this. However, this is not the case for suffixes, it is not the case for pronouns, it is not the case for adverbs and it is, indeed, not the case for particles.
- Your "established convention" that #VC- entries aren't allowed, doesn't exist. If you think otherwise please point to it. WT:AINE does not mention the phonotactics of entries. And at any rate, WT:RECONS clearly says that "variants and disputed forms can then be addressed in great detail within the text of the pages themselves". If you don't want *én on the page, you need to have (at the very least) substantive arguments why the evidence supporting it is wrong.
- But I'd ask you to please be more careful about your references. You keep quoting things which don't support your position. {{R:ine:IEL|52}}: "It seems that onsetless initial syllables (#VC-) were rare", i.e. not nonexistent. LIPP, the first systematic study of Indo-European particles, documents evidence for a substantial number of exactly these. —Caoimhin ceallach (talk) 01:23, 29 March 2023 (UTC)
- I'm aware of what I cited, "rare" meaning they are limited to pronouns, and to continue on to the following sentence, "It is common practice now to reconstruct initial laryngeals even when not strictly provable". You seem to be under the impression that I created this "common practice" and that I set that convention here on the project. I'm honored you think I have that seniority, but despite my contributing here for over a decade, it long precedes me. If you want to argue against the status quo, not just on this project, but in academia at large, the weight is on you to do so. – Sokkjō 03:51, 29 March 2023 (UTC)
2022 — February
The art/literature senses are defined very differently at the two entries, which seems like a problem. One is already tagged for cleanup, so, good luck! Equinox ◑ 00:21, 1 February 2022 (UTC)
Different spellings of the same word, from Yiddish. 70.172.194.25 19:46, 2 February 2022 (UTC)
Members:
@Justinrleung, RcAlex36, 沈澄心 —Fish bowl (talk) 05:55, 6 February 2022 (UTC)
- @Fish bowl: Gansu means actual Gansu in China, but Gansu Dungan should be its own label perhaps. I'm not sure why those entries are labelled specifically as Gansu Dungan, though, because do we know if it's not used in other varieties of Dungan? Pinging @Mar vin kaiser to know why he chose to label it as Gansu Dungan specifically. — justin(r)leung { (t...) | c=› } 06:03, 6 February 2022 (UTC)
- @Justinrleung: There's this website, I can't find the link now, that was like a mini Dungan dictionary, and for some of its words, it has a dialectal label. I think I got it from there. --Mar vin kaiser (talk) 08:39, 6 February 2022 (UTC)
- @Mar vin kaiser: This? I know these words are marked as Gansu here, but I wonder if we need to specify it as Gansu specifically when we don't know if other Dungan varieties use it. — justin(r)leung { (t...) | c=› } 09:02, 6 February 2022 (UTC)
- @Justinrleung: Oh, I added the label Gansu with the assumption that it's specifying that it's only used in Gansu. Aren't there just two dialects, Gansu and Shaanxi? --Mar vin kaiser (talk) 14:03, 6 February 2022 (UTC)
Change "Chinese" [zh] from descendant of "Middle Chinese" [ltc] to ancestor of "Old Chinese" [och]
- The family tree looks weird.
- Old Chinese is subsumed under ==Chinese==.
—Fish bowl (talk) 06:34, 6 February 2022 (UTC)
- I wonder if we should push it even further and have Proto-Sino-Tibetan be the ancestor? Even Old Chinese being the ancestor of Old Chinese may be weird. — justin(r)leung { (t...) | c=› } 06:54, 6 February 2022 (UTC)
I note that Category:English language similarly belongs to Category:Middle English language, although a major difference in Wiktionary treatment is that ==English== does not cover Category:Middle English language or Category:Old English language. —Fish bowl (talk) 11:19, 7 February 2022 (UTC)
- @Fish bowl I'm not a Chinese editor, but from an outside perspective, that'd feel more weird to see (how can Chinese be the ancestor of Old Chinese?). It'd also make the Chinese lects go like "Chinese -> Old Chinese -> Middle Chinese -> lect", which seems more confusing to me. Honestly, at this rate, I'd just remove Chinese from the family tree entirely with its current treatment. Or, at least make it on the same level as Old Chinese, rather than an ancestor. AG202 (talk) 04:33, 4 March 2022 (UTC)
Merge Category:Hokkien, Category:Hokkien Chinese; and perhaps move Category:Hainanese depending on the result of the previous
Category:Hokkien is an etymology language, while Category:Hokkien Chinese belongs to the {{dialectboiler}} system.
Category:Hainanese is presently both.
—Fish bowl (talk) 11:10, 7 February 2022 (UTC)
Category:Classical Chinese presently has ~270 pages.
Category:Chinese literary terms presently has ~8,000 pages.
(Thankfully, Category:Literary Chinese lemmas has 0 pages.)
Category:Classical Chinese is currently described as:
{{dialectboiler|zh|the 5th century BC to 2nd century AD, and continued as a [[literary language]] until the 20th century}}
[[Category:Old Chinese lemmas]]
[[Category:Middle Chinese lemmas]]
[[Category:Literary Chinese lemmas]]
Category:Korean Classical Chinese is a child category of Category:Classical Chinese, and may be a rationale for keeping Category:Classical Chinese in some form.
—Fish bowl (talk) 11:17, 7 February 2022 (UTC)
They look mergeable. Equinox ◑ 20:36, 11 February 2022 (UTC)
- @Equinox Did you mean just the English sections? Both entries also have Indonesian sections. — excarnateSojourner (talk · contrib) 19:52, 25 February 2023 (UTC)
- Yes, I meant the English sections. Merge request stands! Equinox ◑ 02:51, 13 March 2023 (UTC)
Seems like meat puppet sense #6 is the same as meatpuppet. 70.172.194.25 06:10, 19 February 2022 (UTC)
The conjunction وَ (wa) is not part of the phrase really. The phrase does occur frequently with it, but this is mainly owing to the "idiomaticness" of conjunctions in Arabic, mostly in prose. It is a sentence in itself, roughly "There is no(thing) equal to", not like the English adverbials that have comparable meanings (such as particularly and especially or above all). The entry should therefore be moved to لَا سِيَّمَا (lā siyyamā), with the "variant" with وَ (wa) deleted. Roger.M.Williams (talk) 18:11, 24 February 2022 (UTC)
2022 — March
Slavic phylogeny
East Slavic codes
Following up a long discussion on the Old East Slavic About: page, I'd like to propose the following splits:
- Split off Old Ruthenian (zle-ort)
- Set Old Ukrainian (zle-ouk) and Old Belarusian (zle-obe) as etymology-only descendants and labels of Old Ruthenian
- Set Ukrainian (uk), Belarusian (be) and Rusyn (rue) as descendants of Old Ruthenian
- Change Old Russian (zle-oru) to Middle Russian (zle-mru) and set this as a label of Russian (ru)
On the final point there was quite some discussion, and I personally support making Middle Russian a full-fledged code, but since we couldn't reach consensus, I propose making that a separate discussion if need be.
The proposed historical borders of the languages are as follows:
- Old East Slavic (until the 14th century)
- Middle Russian (=Moscow Literary language; 14th century-18th century) [Peter the Great's reforms]
- Old Ruthenian (='West Russian' Literary language; 14th century-19th century) [Kotliarevsky's Eneïd]
Pinging @Atitarev, ZomBear, Useigor, Ентусиастъ, Benwing2, Rua, Ogrezem. I apologise if I forgot anyone. Thadh (talk) 12:43, 2 March 2022 (UTC)
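The proposed tree can be sketched roughly as data. This is only an illustration in Python, not Wiktionary's actual module data: the names `ANCESTOR`, `LABEL_OF`, `ancestors` and `header_for` are hypothetical, and the links from Old Ruthenian and Russian up to Old East Slavic are assumptions read off the historical borders above rather than something the proposal states as a code change.

```python
from typing import Dict, List, Optional

# Hypothetical model of the proposal: full languages and their proposed parent.
# The two links to "Old East Slavic" are assumptions based on the dates above.
ANCESTOR: Dict[str, Optional[str]] = {
    "Old East Slavic": None,
    "Old Ruthenian": "Old East Slavic",
    "Russian": "Old East Slavic",
    "Ukrainian": "Old Ruthenian",
    "Belarusian": "Old Ruthenian",
    "Rusyn": "Old Ruthenian",
}

# Etymology-only varieties and the language whose entries they are filed under.
LABEL_OF: Dict[str, str] = {
    "Old Ukrainian": "Old Ruthenian",
    "Old Belarusian": "Old Ruthenian",
    "Middle Russian": "Russian",
}


def header_for(lect: str) -> str:
    """Return the language header an entry for this lect would sit under."""
    return LABEL_OF.get(lect, lect)


def ancestors(lect: str) -> List[str]:
    """Return the chain of proposed ancestors, nearest first."""
    chain: List[str] = []
    parent = ANCESTOR[header_for(lect)]
    while parent is not None:
        chain.append(parent)
        parent = ANCESTOR[parent]
    return chain


print(header_for("Middle Russian"))
print(ancestors("Ukrainian"))
```

Keeping labels in a separate table from ancestry mirrors the distinction drawn in the proposal between a full code and an etymology-only label; if Middle Russian were later made a full-fledged code, it would move from `LABEL_OF` into `ANCESTOR`.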
- I still support only the introduction of Old Ruthenian, which is missing but as before, I don’t claim to be an expert on the matter. The Russian corpus in the other discussion was helpful. When I filtered on “Middle Russian”, I think I was able to find a couple of words, which are now considered obsolete. The rest were words, which just need to be respelled to find quotes in (early) Modern Russian. I found a few different ways to abbreviate and also numerous misspellings. Overall I sort of feel why these additional splits are not so popular - little strong evidence to work with. Middle Russian may be allowed to be added, let’s just look for good cases.
- To make decisions easier, why don’t we add a couple of specific examples for each new language code proposed - something to work with. (They can be vocab, grammar or pronunciation cases). The proponents should have examples in mind to make the case(s) stronger. We can work together on confirming or disputing those cases. --Anatoli T. (обсудить/вклад) 22:57, 2 March 2022 (UTC)
- I'll see if I can make a list of features that distinguish Middle Russian from (Modern) Russian. In any case, for the time being, treating Middle Russian like Old East Slavic makes little sense to me, especially if we're splitting off Ruthenian (otherwise we get some kind of Dutch-Afrikaans situation), so we could go ahead with that now and in the meantime continue discussing MR's position as a separate code. Thadh (talk) 23:30, 2 March 2022 (UTC)
- (edit conflict) You can use any of the examples already in discussions used as evidence, e.g. онтарь/оньтарь, агистъ, etc. BTW, I see that "Old Russian" was used incorrectly by ZomBear when actually talking about Middle Russian. "Old Russian" = "Old East Slavic". The Russian term for Middle Russian is старору́сский (starorússkij) but Old East Slavic (Old Russian) is древнеру́сский (drevnerússkij). --Anatoli T. (обсудить/вклад) 00:21, 3 March 2022 (UTC)
- Quick update, I've found a relevant discussion from three years ago, Wiktionary talk:About Russian#Middle Russian?. Also, The Russian Language before 1700 (Matthews 1953) argues your and Fay Freak's point (that Middle Russian is too similar to modern Russian to warrant a linguistic distinction). Fun point, it also provides съмьрть's accentuation :0. I'll still look for differences in the corpora, but if the languages are too similar I guess I don't mind keeping the two together - as long as the descendants sections don't get too cluttered, I'm fine. Thadh (talk) 00:02, 3 March 2022 (UTC)
- BTW, I didn’t get back to you on the concern I have in regards to introduction of word stresses in Old East Slavic. My reason being there are many cases where assumptions can go wrong based on descendants. We should only use referenced data. Well, we don’t have native speakers to prove us wrong, do we? —Anatoli T. (обсудить/вклад) 23:03, 2 March 2022 (UTC)
- Sure, but of course we can still use sound laws for words without referencing the specific word's reconstruction. A word like съмь́рть will have the stress on the second syllable, because otherwise the Russian term would be something like **со́мерть rather than сме́рть. However, I wouldn't know where to look for any reference on this specific word, and googling "съмь́рть" returns no results. Thadh (talk) 23:30, 2 March 2022 (UTC)
- Of course, there could be strong (?) assumptions on vowels, which became silent (i.e. they are unstressed) but I wouldn't be so sure even on e.g. вода́ (vodá) (if it weren't referenced), since the word is stressed on the first syllable in some Ukrainian dialects, if you know what I mean. --Anatoli T. (обсудить/вклад) 00:21, 3 March 2022 (UTC)
- @Thadh: I support your suggestions. Ентусиастъ (talk) 16:19, 3 March 2022 (UTC)
- I have already spoken before. I'm for it too.--ZomBear (talk) 00:57, 4 March 2022 (UTC)
- @Thadh: Unfortunately, I see that the discussion has stopped again. It's been almost a month since anyone has written anything. Every day I look forward to the solution of this issue with the Old Ruthenian language. --ZomBear (talk) 07:32, 21 March 2022 (UTC)
- Done. What we need now is to split all pages into either Old East Slavic, Russian (with the Middle Russian label) or Old Ruthenian (with or without the Old Belarusian/Old Ukrainian label). Thadh (talk) 18:43, 21 March 2022 (UTC)
@Thadh how about adding more etymology-only language codes? Modern dictionaries use more than just Old Belarusian/Ukrainian. I saw Middle Bulgarian, Old Slovak, Old Slovene, Old Serbian, Old Croatian, Old Serbo-Croatian, Old Bulgarian, Old Upper Sorbian, Old Lower Sorbian. Possibly Middle Czech and Middle Polish also would be useful sometimes. Old Sorbian was also used by Boryś (Old Sorbian peleš as cognate for Polish pielesze), however we can't just link to both Lower and Upper Sorbian at once, so that would require full support for this language (?). Scientific publications mention Old Polabian as the language of the Polabian Slavs in the Middle Ages; it is usually used for proper nouns like given names, theonyms, toponyms, sometimes ordinary words mentioned in Latin texts, and it is always a reconstructed language. I would like to have it tho. Sławobóg (talk) 14:32, 28 May 2022 (UTC)
- @Sławobóg What I'll need from you in order to determine if the splits are worth it is:
- - Exact boundaries of the languages' stages
- - You need to check how much literature there is in the earlier stages of the language.
- - You need to check how much the languages differ from their modern stages.
- Once you do that, we can continue the conversation about splitting them. It seems pointless to split a language off just because there are two inscriptions in some dusty old book. Thadh (talk) 15:15, 28 May 2022 (UTC)
- @Thadh: IMO Middle Polish would benefit greatly from the split.
- Boundaries: As it is with extinct languages, there aren't really any exact boundaries, but it's usually defined as between the 16th and the 18th century; Polish Wiktionary has settled on years 1500 to 1750 to account for Doroszewski's dictionary.
- Literature: There are two major corpora, accessible on the SPXVI and ESXVII websites.
- Differences: I reckon the spelling and pronunciation differences, especially the employment of "slanted vowels" (samogłoski pochylone, I have no idea what their name is in English), should be enough.
- Plus, like, this would help with attestation. Hythonia (talk) 11:08, 30 July 2022 (UTC)
- Middle Polish is also thusly defined on Wikipedia. I also think it would make more sense to have Middle Polish as an LDL. The alternative would be having a label. If we split, we'd have to add Middle Polish both to Proto-Slavic descendant entries and as intermediates in etymologies. Vininn126 (talk) 11:52, 30 July 2022 (UTC)
- Also pinging @KamiruPL, as an editor for Old Polish. Do you think we should fully split Middle Polish, create a label, or some other alternative? Vininn126 (talk) 13:44, 30 July 2022 (UTC)
- @Vininn126: I treat Arabic before the spread of printing in the Arab world, which is from 1800 (Napoléon brought the press to Egypt, which was then a state business that over time was rented by privates who would copy it), as LDL. The reason becomes more obvious for Hebrew where we are eager to include hapax legomena in the Tanakh and due to lacking distinctness of the Modern to the Biblical language, from which the former has been resurrected, have little desire to split. This is in analogy to the split of English from Middle and Old English, where basically the split happens following the new medium of printed books—accordingly if Polish literacy in the same fashion starts only somewhere in the 18th century then we become stricter only then.
- Circumventing attestation criteria is no reason to split language headers, as your perception about whether something is another language is the same and only disingenuously modified by that consideration of its description. So more appropriate attestation criteria – and I think of the many carefully collected variants sadly left even unmentioned as a consequence of no sense of proportion applied to the teleology of our rules – by no means should serve motivation to split languages; we can already derive them by the accepted statutory interpretation methods.
- To be clear, since legal thinking is unwonted and mysteriously strange to many in spite of people rightly being appointed for it in any society: In this case this is really just systematic interpretation: Since the community authoring the policies was biased towards English but the splits of other languages wrought comparative inconsistency with its situation according to which it has been split by chronolects, we break the criteria down to be suited for the languages they were only roughly devised for. Fay Freak (talk) 09:51, 31 July 2022 (UTC)
- In all honesty a label is likely the best option. Vininn126 (talk) 10:05, 31 July 2022 (UTC)
- I've thought about this more, and I think there might be a case for Middle Polish as an L2. If we agree it should be split, I can help convert the existing entries to Middle Polish.
- Here is my reasoning:
- Old Polish, Middle Polish, modern Polish, and Silesian are four lects that are hard to separate accurately. Part of this argument hinges on Silesian, which we currently treat as an L2, and I don't see that changing. There are political, historical, and linguistic reasons
- ===Why Silesian should be an L2===
- Its speakers feel strongly that it is a language, not a dialect, most Polish linguists pushing that it is a language include Jan Miodek, who is a notable prescriptivist who pushes more nationalistic views of how languages should be treated, and I believe that treating Silesian as a dialect is done partially to stifle any sense of individuality to further Polish control. However, I recognize that theory has some tinfoil-hat conspiracist vibes to it, so I'll stick to the point that its speakers strongly feel it is.
- Significant linguistic difference: Silesian has a different phonology to Polish, and other grammatical features, such as retaining the Proto-Slavic aorist in an analytical past tense, as opposed to a more agglutinative/morphological one in Polish. It also recently has undergone strong standardization, as can be seen on silling.org and the ślabikŏrzowy szrajbōnek.
- Significant lexical differences: Silesian differs quite a bit from Polish in terms of lexical information. Core inherited words are of course similar, but look at other Slavic languages. It's also been heavily "Policized", but so has Kashubian, which we also treat as an L2 and is recognized as a separate minority language in Poland, and both Kashubian and Silesian are recognized by ISO and Glottolog.
- Finally, the key point to the overall argument: Silesian is a descendant of Middle Polish. Most claims that it is Czechoslovakian are refuted by Silesian philologists.
- ===Why Middle Polish should maybe be an L2===
- So if we decide that Silesian is an L2, that would give Middle Polish multiple descendants. This would "fix" many inherited etymologies, such as wszystek. This would also fix Latinate borrowings, where Silesian inherited an older pronunciation of Latinate words, and also the chain generally works better as Learned borrowing into Middle/Old Polish -> Polish + Silesian, as opposed to setting multiple Learned borrowings.
- Furthermore, Middle Polish was significantly different from Modern Polish in terms of phonology and grammar (I recently updated the Middle Polish Wikipedia page). In terms of lexical content - there were significant shifts, I would say less than the standard differences between Slavic languages, but there were still trends, and dictionaries such as {{R:pl:SXVI}}, {{R:pl:SXVII}}, and occasionally {{R:pl:SJP1807}} or {{R:pl:SJP1900}} would be key in this. Furthermore, Middle Polish is otherwise resource poor, and should be treated as an LDL, label or not. Having it as an L2 is cleaner in terms of citations.
- If we agree that this should be done, I would recommend setting the cutoff dates as c. 1500-c. 1780, with a language code of zlw-mpl. Vininn126 (talk) 12:39, 24 April 2023 (UTC)
- @Atitarev@Fay Freak@Hythonia@Sławobóg@Thadh@ZomBear@Ентусиастъ Vininn126 (talk) 17:30, 24 April 2023 (UTC)
- Update: there is debate as to whether Silesian should be listed as from Old Polish or Middle Polish, which really affects the above argument. Vininn126 (talk) 14:53, 25 April 2023 (UTC)
- Just flagging up that it's possible to give Middle Polish an etymology-only language code, and to set it as the ancestor of Polish (and Silesian, if desired). This would be a way to keep its entries under the Polish L2, while allowing etymologies to formally mention it. In turn, Middle Polish could have Old Polish set as its ancestor.
- Of note is the fact we already have Middle Russian, Old Ukrainian, Old Belarusian, Middle Bulgarian and Early Modern Czech, which are all currently handled in the same way. Theknightwho (talk) 16:14, 25 April 2023 (UTC)
Old Slovak ?
How about adding a code for Old Slovak (zlw-osk) as well? In the same {{R:sla:ESSJa}} (ЭССЯ), especially in recent editions, Old Slovak is constantly listed separately. In this case, an etymology-only code is sufficient. --ZomBear (talk) 07:32, 21 March 2022 (UTC)
There still seems to be a lot of overlap here, e.g. the chandelier sense. Is there any sense of the word that cannot be spelled both ways? Equinox ◑ 03:47, 4 March 2022 (UTC)
- Any English word ending with -er occasionally shows up as -re. It doesn't seem like this needs a tag and discussion, though. Some editor at lustre was just wrong: There's no reason to say "alternate form of luster" + 3 repeated senses that're already at luster. Maybe add a usage note if some American speakers tend to still use re unexpectedly more often in some cases.
- That said, the luster entry is currently a bit off. 'Shininess', '5-year period', and 'den' all get spelled with an -re in standard British English but using it for 'one who lusts' would still seem like a misspelling. The alt form needs to be with each etym that uses it and not headlining like it is now. — LlywelynII 23:40, 13 June 2023 (UTC)
Merge 參巴#Chinese, 叁巴#Chinese
—Fish bowl (talk) 05:38, 4 March 2022 (UTC)
Merge lyme#English, lime#Etymology_4
—Fish bowl (talk) 14:55, 6 March 2022 (UTC)
Some circularity, with each form linking to the other for certain senses. Equinox ◑ 00:20, 8 March 2022 (UTC)
- I think I was trying to show that the open form was used more commonly for some definitions and the closed for others. Is there pure circularity remaining? DCDuring (talk) 17:18, 9 March 2022 (UTC)
- I have tried to make clearer the differences and have simplified the "Further reading" sections. I don't see why they should be moved, merged, or split, whichever it is that you are seeking. DCDuring (talk) 18:08, 9 March 2022 (UTC)
These look like the same word. 70.172.194.25 09:17, 9 March 2022 (UTC)
- It's not, just a doublet that came into the language by a different (rather convoluted) route. Serynga is listed as an alternative form at English seringa, but it looks like it's really a borrowing from French, where it's an alternative form of French seringa. The spelling is no doubt influenced by the taxonomic name.
- From what I can gather, Latin syringa developed into Dutch sering, which was borrowed into Portuguese as the name for rubber plants in the genus Hevea and into French for the syringa, Philadelphus coronarius, both with an "a" added. English borrowed Portuguese seringa for the rubber plant and French serynga for the syringa.
- If you're not confused by all of this, you're not paying attention... Chuck Entz (talk) 15:40, 9 March 2022 (UTC)
- So, are the definitions in both entries correct? Because they currently claim to both have the same first two definitions... in which case we should either have {{syn}} crosslinks between them or reduce both senses on one to {{synonym of}} (+gloss) of the other. - -sche (discuss) 08:07, 28 March 2022 (UTC)
Possible move to align names
I made Category:English terms spelled with underscore a few hours ago and I just now discovered Category:Translingual terms spelled with low line. "Low line" and "underscore" refer to the same character: which one should be the name used? I've never encountered "low line" before, but it looks like this has some popularity. Thoughts? —Justin (koavf)❤T☮C☺M☯ 06:43, 12 March 2022 (UTC)
- I too would've thought "underscore" was the usual name; I'm intrigued to see that "low line" pages have apparently been around longer. My !vote is to consolidate the pages on "underscore". (Related issue: "Unsupported titles/Low line interfix" is an unintuitive name for that page; it seems like unsupported titles are more often named in more plainly descriptive ways; now that the page no longer has an interfix on it, can we change it to something like "hyphen underscore hyphen"?) - -sche (discuss) 03:31, 13 March 2022 (UTC)
- I support moving any categories that use low line to use underscore instead, since underscore is more common in my experience, it is the title of the Wikipedia article, and our entry for low line was missing the punctuation definition until I added it just now. - excarnateSojourner (talk | contrib) 17:31, 3 May 2022 (UTC)
- @ExcarnateSojourner: I created the “low line” category, because we have the character at “Unsupported titles/Low line”. You stated that you support moving the categories specifically, however, which would not be consistent with the entries. J3133 (talk) 14:27, 13 May 2023 (UTC)
- @J3133, ExcarnateSojourner, -sche, Koavf I have gone ahead and merged these both into underscore: these categorisations are now automatic, and I really, really didn't want to add a special case giving them different names for the sake of this glacial discussion. I simply picked the one which most speakers are more familiar with. If anyone has strong feelings about changing it to "low line" then I'm happy to change it.
- In terms of making page titles match, there are only 4 entries which actually use "low line" in the title: -_-, >_<, _ _ and _ itself, as they're the only pages where we have no other choice but to do it that way. In most instances, we can just use a space, as the "Unsupported titles/" prefix is enough to differentiate it: compare snake case and snake_case. Theknightwho (talk) 17:58, 14 May 2023 (UTC)
- <3 —Justin (koavf)❤T☮C☺M☯ 20:45, 14 May 2023 (UTC)
- I'll probably move the low line entries if no one objects. — excarnateSojourner (talk · contrib) 02:15, 15 May 2023 (UTC)
- Moved, though I need an administrator to update a MediaWiki page so that the entry titles display correctly. — excarnateSojourner (talk · contrib) 20:35, 23 May 2023 (UTC)
Defined recursively in terms of each other. Equinox ◑ 22:36, 12 March 2022 (UTC)
70.172.194.25 09:38, 13 March 2022 (UTC)
Lots of duplicate information, including but not limited to translations. — Fytcha〈 T | L | C 〉 19:27, 13 March 2022 (UTC)
in line with Category:Redlinks by language. —Fish bowl (talk) 03:57, 14 March 2022 (UTC)
- Fish bowl (nom) created cat:Chinese redlinks/zh-see in April 2022. — excarnateSojourner (talk · contrib) 20:18, 25 February 2023 (UTC)
- I don't remember doing this lol. It makes sense though; 肉体的 points to 肉體的#Chinese which doesn't exist. —Fish bowl (talk) 21:27, 25 February 2023 (UTC)
move Category:Sino-Vietnamese words with uncreated Han etymology to Category:Vietnamese redlinks/vi-etym-sino
Not sure whether "descrescendo" should be a misspelling or alternative form, though. It has quite a few hits on Google Books. 70.172.194.25 23:49, 15 March 2022 (UTC)
- If it's kept at all, it should be a {{misspelling of}}. —Mahāgaja · talk 07:56, 16 March 2022 (UTC)
2022 — April
Are these different? 98.170.164.88 01:33, 2 April 2022 (UTC)
Is there a difference? w:Political subdivisions redirects to w:Administrative division. —Fish bowl (talk) 04:05, 4 April 2022 (UTC)
- Even if there is, I’m doubtful that it’s relevant to Wiktionary. Theknightwho (talk) 07:56, 4 April 2022 (UTC)
- The first category contains things like "state", "county", "province". The second contains things like "California", "Yorkshire", "Guangdong". 70.172.194.25 18:58, 12 April 2022 (UTC)
- Just found that there are two like categories. No. Editors can’t apply any distinction. Administration executes political decisions. Fay Freak (talk) 21:46, 20 February 2023 (UTC)
- The intended distinction (which, when I spot-check a few categories, actually seems to be decently well maintained) seems to be as IP 70.172 says. But I am inclined to agree that the current names don't convey a meaningful distinction. If we want to continue having separate categories for "county, burgh, kingdom, ..." vs "Mayo, Yorkshire, Idaho, ...", it would be better to devise more distinct names for the categories... - -sche (discuss) 23:14, 20 February 2023 (UTC)
- IP is right. I just came here because Ottoman Turkish قضا (kaza) was in the wrong category, and pushed the panic button. The naming should be something more intelligent. Fay Freak (talk) 03:33, 21 February 2023 (UTC)
- I agree that the names are highly confusing. Maybe we should rename the first one “types of administrative division”, or something similar. Incidentally, that’s exactly the name of the corresponding en.wikipedia category. 70.172.194.25 03:39, 21 February 2023 (UTC)
- Now the yerba gave me the idea. We just name the latter “named political subdivisions”, to avert the exemplified mistake. The former shall not be renamed because it is added manually while the other is a mediate effect of Template:place etc. I also briefly thought about going to Wikipedia to see how they do but we don’t have the same problems. Fay Freak (talk) 03:47, 21 February 2023 (UTC)
also Talk:point-blank#merge with point blank. – Jberkel 23:51, 12 April 2022 (UTC)
—Fish bowl (talk) 07:42, 13 April 2022 (UTC)
Proposal to rename Ottawa (otw) to Odawa
I think Ottawa should be renamed to Odawa; It's the more common English name used to refer to the language nowadays, and preferred by speakers. What do you think? /mof.va.nes/ (talk) 15:47, 15 April 2022 (UTC)
- Support —Mahāgaja · talk 07:26, 18 April 2022 (UTC)
So far as I can tell, the two senses refer to the same thing. Is this a case where differing terminology between chemistry and physics means that it's worth keeping both to better aid understanding? If so, we should probably clarify that they aren't referring to different things.
If I'm wrong and they are actually distinct, could someone with more knowledge than I do make that clearer? Theknightwho (talk) 14:42, 16 April 2022 (UTC)
- The redundancy was added in diff, I've merged the senses. There may be another sense, to which the first etymology (positron + ium) would apply, for positronium conceived of in sci-fi etc as an element or substance a la uranium, polonium, unobtainium, etc. - -sche (discuss) 01:09, 29 May 2022 (UTC)
Both entries are for nouns referring to groups of people that are predominantly or exclusively attested in the plural. The cited dictionaries lemmatize them at the nominative plural forms, and I think we should also. The forms ending in "is" are in any case not the right headwords, as these would only be genitive singular (Latin-style); the nominative singular forms are Greek-type Telchis/Telchin and Hyas respectively, so those would be the right headwords if we decide it's preferable to lemmatize them at the nominative singular.--Urszag (talk) 02:06, 18 April 2022 (UTC)
- Since nobody objected, I moved Telchines, but since there is a preexisting page at Hyades, there will need to be a merge for that one. I'm not familiar with how to do that correctly.--Urszag (talk) 16:50, 1 December 2022 (UTC)
- @Urszag The history of Hyadis indicates that it was created by Theeddiebear888 a few hours after this discussion was opened. I guess it was deleted and recreated? In any case it has only a Latin section, and Hyades has only English and French sections, so it doesn't make sense to merge them. — excarnateSojourner (talk · contrib) 21:12, 25 February 2023 (UTC)
- @ExcarnateSojourner I don't understand, why do the languages that exist on each page affect whether a merge makes sense? My understanding was that when a page is at the wrong title, it should be moved to the correct one or merged if a page with the correct title already exists to preserve its edit history.--Urszag (talk) 21:18, 25 February 2023 (UTC)
- @Urszag Oh, sorry, I misunderstood. I thought you were talking about merging content from one language section into another language section. But you're right, moving an entire language section from one entry to another may make sense. — excarnateSojourner (talk · contrib) 21:25, 25 February 2023 (UTC)
Move to Wiktionary:Foreign word of the day to be consistent with other pages in the namespace. - excarnateSojourner (talk | contrib) 19:48, 27 April 2022 (UTC)
To מ־ש־ך, to be consistent with other Hebrew entries. 70.172.194.25 18:13, 29 April 2022 (UTC)
—Fish bowl (talk) 23:23, 30 April 2022 (UTC)
2022 — May
This is only an English entry, and on English Wikipedia it is not capitalized inside the middle of sentences. The rationale for capitalizing it in 2007 was that it is a German language entry, except there has never been a German language section on this page. -- 65.92.246.142 03:20, 13 May 2022 (UTC)
- Science Direct: "witzelsucht"
- Prezi: "witzelsucht"
- MD Edge Neurology: "witzelsucht"
Man has taken a quick look and found that eponym is a term not restricted to words after persons, bare logically since ὄνυμα (ónuma) merely means name, i.e. in modern linguistic terminology proper noun, so we might reckon that our definition under eponym (inconsistent with the single adjective definition and Wikipedia) is a prescriptivist legend (of the dark mid 20·00s to which the categorizations and definitions date) and we rather have to move the category to Category:Terms derived from anthroponyms by language to attain consistency with Category:Terms derived from toponyms by language and mislead the public less. Fay Freak (talk) 20:31, 16 May 2022 (UTC)
a few or few
We have a few fries short of a Happy Meal (created by @Equinox) and few cards short of a full deck, few cards shy of a full deck, few sandwiches short of a picnic, few X short of a Y, which were moved/created by @TNMPChannel. J3133 (talk) 08:21, 29 May 2022 (UTC)
- IMO these should all be at "a few...", since a few means something quite different from few. —Mahāgaja · talk 09:21, 29 May 2022 (UTC)
- Whichever form we lemmatize, I guess we might as well leave redirects from the other. Several of these also have variants like google books:"several fries short of a" Happy Meal / happy meal, google books:"several cards short of a" full deck / full pack, which presumably need hard or soft redirects. - -sche (discuss) 19:00, 29 May 2022 (UTC)
- I definitely agree that all of the headwords mentioned should be at "a few ...". Unfortunately there are probably more (attestable?) alternatives besides what -sche has found. Redirects from "few ..." are especially useful because many with beginning knowledge of English seem to have problems with English determiners. DCDuring (talk) 19:23, 29 May 2022 (UTC)
- This is a good issue to raise. I've mentioned before that, with proper nouns, we don't seem to have (or at least we don't consistently use) anything about the determiner/article: I mean it's the Eiffel Tower and the Cold War, but ∅ Dijkstra's algorithm and ∅ Greenpeace. Proper nouns aside, I usually drop the determiner/article from entry titles unless it seems absolutely 100% necessary all the time. But that's pretty vague and comes out of my wacky head. Equinox ◑ 01:53, 4 June 2022 (UTC)
- @Equinox: Yes, but it's [ a few ] [ cards short of a full deck ], not [ a ] [ few cards short of a full deck ] (note the alternative form one card short of a full deck, where "one" replaces "a few" ) Chuck Entz (talk) 02:14, 4 June 2022 (UTC)
- Maybe. Yeah. I would imagine "some few..." etc. might be possible. But even I have better things to do than attest them. Just an observation. Equinox ◑ 02:17, 4 June 2022 (UTC)
- It's a snowclone with many possible variants. I don't think many people are going to look up few or short of expecting to find this full phrase. And those words aren't in every variant anyway ... one can also say "two cards shy of a full deck" which uses neither of them.
- What would be nice is if the Appendix namespace was in the default search space so that the snowclone page might at least turn up in a search. As it stands, I don't think we need all these mainspace pages since they are all exact synonyms of each other, but if we delete them there will be no way for a naive user to find the snowclone pages unless they somehow know that it's tucked away in the Appendix. —Soap— 20:05, 30 June 2023 (UTC)
2022 — June
I've created the page using the wrong character. It should be moved to دلكو instead. Dohqo (talk) 07:21, 18 June 2022 (UTC)
- @Dohqo: Because there's already a Persian entry on the same page, and the character is correct for Persian, it doesn't make sense to move the page. Just delete the Old Anatolian Turkish section from this page and create it on the correct page. You can do it all yourself, no admin rights needed. —Mahāgaja · talk 08:36, 18 June 2022 (UTC)
Given that our planet's name is usually capitalized, I think this should be moved to Category:Flat Earth. Binarystep (talk) 03:51, 20 June 2022 (UTC)
- We have flat-earthism and flat-earther, but Wikipedia has flat Earth. Hmm. I support the move proposal.- excarnateSojourner (talk | contrib) 01:35, 6 November 2022 (UTC)
Duplicate definitions and potentially missing parts of speech. — Fytcha〈 T | L | C 〉 15:42, 22 June 2022 (UTC)
- The hyphenated form is now defined as an alternative of the spaced form (but the solid and spaced forms still need to be merged). — excarnateSojourner (talk · contrib) 21:38, 25 February 2023 (UTC)
Delete and merge into Category:Places in France or some appropriate equivalent. France does not have dependencies: all its overseas territories are integrated into the French Republic. See also: Category:ca:Dependent territories of France, Category:zh:Dependent territories of France, Category:nl:Dependent territories of France, Category:en:Dependent territories of France, Category:fi:Dependent territories of France, Category:fr:Dependent territories of France, Category:de:Dependent territories of France, Category:el:Dependent territories of France, Category:hu:Dependent territories of France, Category:ga:Dependent territories of France, Category:it:Dependent territories of France, Category:ja:Dependent territories of France, Category:lv:Dependent territories of France, Category:nrf:Dependent territories of France, Category:nb:Dependent territories of France, Category:nn:Dependent territories of France, Category:pl:Dependent territories of France, Category:pt:Dependent territories of France, Category:rar:Dependent territories of France, Category:ro:Dependent territories of France, Category:ru:Dependent territories of France, Category:es:Dependent territories of France, Category:sv:Dependent territories of France, Category:tr:Dependent territories of France, Category:vi:Dependent territories of France, and Category:vo:Dependent territories of France —Justin (koavf)❤T☮C☺M☯ 04:10, 23 June 2022 (UTC)
- @Koavf What do you call New Caledonia, French Polynesia and such if not dependent territories? Benwing2 (talk) 00:40, 24 June 2022 (UTC)
- New Caledonia is a sui generis overseas collectivity of France. It has membership in the French parliament and France's rule of law and citizenship extends there just like in Corsica or Guadeloupe or Lyons. None of these are dependencies: they are all first-level administrative divisions of the French Republic. —Justin (koavf)❤T☮C☺M☯ 00:48, 24 June 2022 (UTC)
- I want a category for all overseas territories of France, and I don't much care about the technicalities. What is the right category? Benwing2 (talk) 01:52, 24 June 2022 (UTC)
- I would have absolutely no objection to Category:Overseas France as a subcat of Category:Places in France and that can include everywhere other than Metropolitan/European France (the mainland, Corsica, and other nearby islands). Seems sensible to me. —Justin (koavf)❤T☮C☺M☯ 04:00, 24 June 2022 (UTC)
- That is hard to do in the current framework without major hacking. There used to be Category:Collectivities in France populated by these entities, will that work? Are they all collectivities? Benwing2 (talk) 05:05, 24 June 2022 (UTC)
- Guadeloupe, Mayotte etc are not collectivities. Unfortunately Justin is right, we need CAT:Overseas France if we're going to be strictly correct here. This, that and the other (talk) 04:20, 26 June 2022 (UTC)
These were recently moved by @Apisite from their own user namespace to the Wiktionary namespace under "Requested entries (Chinese)". All of these pages are not requested entries but pronunciation requests. I'm not entirely sure where these should be moved instead, but I don't think they're in the right place currently. — justin(r)leung { (t...) | c=› } 16:59, 23 June 2022 (UTC)
- Seems like they should be subcategories of Category:Requests for pronunciation in Chinese entries. —Mahāgaja · talk 20:50, 23 June 2022 (UTC)
Duplicate content, move all to fiber. Ngram — Fytcha〈 T | L | C 〉 19:17, 26 June 2022 (UTC)
- @Fytcha: A word of caution: anything involving pondian variation should be handled carefully. There are good arguments for going either way on most of these, and we don't want to start any kind of conflict. Our general practice has been to arbitrarily go with whichever version was first, though it's been a while since one of these came up. Chuck Entz (talk) 20:17, 26 June 2022 (UTC)
- In this case, fibre is older, but by only 14 hours. Also, the translation tables are all already at fibre, so I feel like making fibre the primary spelling and fiber the alternative spelling will be less work. —Mahāgaja · talk 21:09, 26 June 2022 (UTC)
- From Google N-Grams: Since 1911 fiber has been more common. As of 2009 it is about three times as common. DCDuring (talk) 21:14, 26 June 2022 (UTC)
- We don't apply that when it comes to AmEng/BrEng differences. Theknightwho (talk) 21:37, 26 June 2022 (UTC)
- Who says? DCDuring (talk) 21:51, 26 June 2022 (UTC)
- Also, since 2016 fiber has been more common in Google's British English N-Gram corpus and six times more common in the American English corpus. DCDuring (talk) 21:54, 26 June 2022 (UTC)
- @Chuck Entz: I see. If that is de facto policy then the meat should go to fibre. However, if I could have devised the policy, I would have made it so that it always aligns with the frequency because that way the users land on the non-redirecting spelling more often. — Fytcha〈 T | L | C 〉 22:26, 26 June 2022 (UTC)
- We actually had an attempt by a Russian internet troll (geolocating to Crimea) to get us arguing about UK vs. US issues, but it went nowhere. At the time I just thought it was odd, but with the revelations after Trump was elected I finally put two and two together and realized what was going on. I still have no idea why they even bothered, since our discussion forums aren't exactly the center of the universe. I do know that the mutual respect between our US and UK editors, helped by this kind of practice, was the main reason it was such a non-issue. Chuck Entz (talk) 23:20, 26 June 2022 (UTC)
- @Chuck Entz I have it on my to-do list to build a template that duplicates the material from the "primary" entry, which should hopefully circumvent issues like this anyway. I've done something similar with Tangut already (e.g. see 𗁘 (*rjijr²), 𗁩 (*tẽ¹), 𗀏 (*par²)), though the implementation would need some tweaking. Theknightwho (talk) 00:18, 27 June 2022 (UTC)
- @Fytcha Both of these entries have sections for English, Danish, and Norwegian Bokmål. Can you confirm that you are only talking about the English sections? — excarnateSojourner (talk · contrib) 21:49, 25 February 2023 (UTC)
2022 — July
Sense 3: Relating to the spoken rather than written form of a word or name, as opposed to orthographic.
Feels like this could be merged in some way with sense 1: Relating to the sounds of spoken language.
Theknightwho (talk) 08:44, 7 July 2022 (UTC)
- I suppose this might be trying (unclearly) to express the sense used in "a phonetic spelling" (one based on how it sounds) as contrasted with, say, "a phonetic sketch of Urama" (one describing its phonology). Whether this merits a different sense I'm not sure. - -sche (discuss) 23:30, 23 July 2022 (UTC)
Inconsistent capitalization of I/internet slang
We capitalize it as a label — (Internet slang), but not in the category name — Category:English internet slang. When I was adding this category I thought it would also be capitalized, like in the label (but it was a red link). J3133 (talk) 09:16, 15 July 2022 (UTC)
- Not just for English slang; see Category:Internet slang by language and Category:Internet laughter slang by language. --Lambiam 15:30, 15 July 2022 (UTC)
- I am aware that the slang category is not exclusive to English; however, @Lambiam, what is our solution? J3133 (talk) 15:49, 15 July 2022 (UTC)
- The regular approach is to list these at WT:RFM. This seems, however, a place where proposals go to linger in limbo: there is an unresolved category move request (WT:RFM § Category:WC) from 2015. The sledgehammer approach is to create a vote at WT:VOTE. --Lambiam 17:20, 15 July 2022 (UTC)
- Good point; I have moved it. J3133 (talk) 17:33, 15 July 2022 (UTC)
- @J3133 What did you move? Category:English internet slang has not been moved since 2019. - excarnateSojourner (talk | contrib) 00:16, 23 October 2022 (UTC)
- @ExcarnateSojourner: I moved this discussion from the Beer parlour. J3133 (talk) 18:11, 13 March 2023 (UTC)
- It doesn't matter Dunderdool (talk) 16:44, 15 July 2022 (UTC)
- It would be good if we fixed it, as we have with category and label inconsistencies previously. If not now, I am sure someone will bring this issue up and fix it sometime. J3133 (talk) 16:50, 15 July 2022 (UTC)
- Also if it would not matter I would fix this without a discussion but it would matter to someone. J3133 (talk) 16:52, 15 July 2022 (UTC)
- Personally, I would lowercase the label (and anything else). On the other hand, Google Books Ngrams suggests Internet is more common. That said, it's less work to lowercase the label than to move all the categories... - -sche (discuss) 23:33, 23 July 2022 (UTC)
- According to capitalization of Internet, cited by excarnateSojourner, the trend of capitalising the I in internet is decreasing. So I'll support lowercasing the I in the label. The higher rank of Internet in Google Ngram Viewer is maybe because internet sometimes occurs at the beginning of a sentence and is thus capitalised. Sbb1413 (he) (talk • contribs) 08:07, 9 March 2023 (UTC)
- Looks like it was originally capitalized. Mnemosientje (talk • contribs) lowercased it back in 2019. - excarnateSojourner (talk | contrib) 00:15, 23 October 2022 (UTC)
- The capitalization of Internet is a whole thing, but for what it's worth Wikipedia does capitalize it. - excarnateSojourner (talk | contrib) 00:15, 23 October 2022 (UTC)
- @ExcarnateSojourner: Wikipedia's inconsistent too. Compare Category:History of the Internet and Category:People related to the internet. However, the capitalized spelling does seem to be more common in category names, so I support capitalizing Category:English internet slang. Binarystep (talk) 03:11, 13 March 2023 (UTC)
- It should be capitalised. There is such a thing as "an internet" or internetwork (generic; although you very rarely hear this terminology any more), versus "the Internet" (the global thing we all use all the time). Same deal with "the Web" versus (I suppose) "a web" although I don't remember even the most braggart webmasters using the latter. As always, citable usage trumps what I say, but I am historically correct. Equinox ◑ 03:14, 13 March 2023 (UTC)
Split Category:Thieves' cant into subcategories by language
Currently, this category contains 123 English entries, 1 Japanese entry, and 3 Yiddish entries. This is inconsistent with how categories usually work (compare Category:English fandom slang or Category:English Polari slang). I suggest that we split Category:Thieves' cant into Category:English thieves' cant, Category:Japanese thieves' cant, and Category:Yiddish thieves' cant, allowing for more subcategories if the need arises. Binarystep (talk) 11:33, 21 July 2022 (UTC)
- Sounds reasonable. 98.170.164.88 11:41, 21 July 2022 (UTC)
- Yeah. The category description suggests it was originally intended only for English (compare Category:Rotwelsch, for one or two languages not directly specific in the name). If multiple languages have thieves' cants, as seems to be the case, then this should be split per nom. - -sche (discuss) 23:35, 23 July 2022 (UTC)
- I assumed this was a mistake, to be quite honest. Seems odd to assume this would only exist in English in the first place, really. Theknightwho (talk) 23:40, 23 July 2022 (UTC)
- In fairness, the Wikipedia article and most books I can find about it take it as given that it's an English thing; I've occasionally even seen it capitalized as if people thought of it as the name of a specific lect. Apparently the term for it in other languages and other time periods of English is criminal slang, I now realize. Hmm, now I wonder whether we should split this after all, since then the "thieves' cant" and "criminal slang" categories would overlap. But the current name is clearly too ambiguous, since people are adding non-English entries to it. Maybe we should move the English entries to "English thieves' cant" (for the historical lect) and disperse the other languages and any modern English developments to Category:Criminal slang by language? - -sche (discuss) 23:51, 23 July 2022 (UTC)
- Other languages definitely have thieves' cants, but they might not use the term "thieves' cant". Rotwelsch is German thieves' cant, but it's just called Rotwelsch. —Mahāgaja · talk 06:56, 30 July 2022 (UTC)
- It seems reasonable to use "criminal slang" as the proper category for such terms. "English thieves' cant" can be made a subcategory of that. — Sgconlaw (talk) 07:01, 30 July 2022 (UTC)
- I think that’s a good idea, because that can be broken down by language, and allows for categories like this to go under the language categories. No doubt there are numerous lects of criminal slang in English alone. Theknightwho (talk) 21:23, 14 August 2022 (UTC)
- OK, I moved the three Yiddish entries and one Japanese entry to "criminal slang". For the English entries, are we renaming the historical English "Thieves' cant" lect(s) to "English thieves' cant" for clarity? And then are we
  - making the "thieves' cant" label English-only, i.e. changing it from always adding plain_categories = { "Thieves' cant" }, regardless of language code to always adding plain_categories = { "English thieves' cant" }, (a subcategory of "English criminal slang"), and ongoingly removing uses outside English?
  - or allowing for other languages to have their own "thieves' cant" subcategories of "criminal slang" (which entails changing the label to use pos_categories so each language could have its own "thieves' cant" subcategory of "criminal slang")?
- - -sche (discuss) 20:39, 21 August 2022 (UTC)
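For readers unfamiliar with the two settings, the options in the comment above can be sketched roughly as follows. This is a toy Python model, not the actual label module code: the function categories_for and the three dictionaries are hypothetical, and the sketch assumes only what the comment describes, namely that plain_categories adds a category name unchanged for every language, while pos_categories yields a per-language category by prefixing the language name.

```python
# Toy model (hypothetical, for illustration only) of how a label's
# category settings might expand into category names.

def categories_for(label_data, language_name):
    """Return the categories a label would add for one language."""
    # plain_categories: added as-is, regardless of language.
    cats = list(label_data.get("plain_categories", []))
    # pos_categories: assumed to be prefixed with the language name.
    cats += [f"{language_name} {c}" for c in label_data.get("pos_categories", [])]
    return cats

# Current behaviour described in the comment: one shared category.
current = {"plain_categories": ["Thieves' cant"]}
# Option 1: English-only label, fixed English category.
english_only = {"plain_categories": ["English thieves' cant"]}
# Option 2: each language gets its own "thieves' cant" category.
per_language = {"pos_categories": ["thieves' cant"]}

for name, data in [("current", current),
                   ("english_only", english_only),
                   ("per_language", per_language)]:
    print(name, categories_for(data, "Yiddish"))
```

Under option 1, a Yiddish entry using the label would still land in the English category, which is why that option requires removing uses outside English; under option 2 the same entry would get a Yiddish category of its own.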
- @-sche: I’d say leave “Thieves’ cant” as English-only. It seems a peculiarly archaic English expression rather than a general term of art. — Sgconlaw (talk) 11:33, 17 September 2022 (UTC)
One is tagged as obsolete and defined as A kind of furnace used in refining, to separate the metal from cinders and other foreign matter., another not obsolete defined as A furnace in which slags of litharge left in refining silver are reduced to lead by being heated with charcoal.. Good luck to the potential merger Dunderdool (talk) 18:06, 29 July 2022 (UTC)
DoggoLingo is the jargon used in doge memes. This should be changed to "Category:English DoggoLingo," since it contains only English terms, and to remain consistent with similar categories (e.g. Category:English 4chan slang). WordyAndNerdy (talk) 05:29, 30 July 2022 (UTC)
- @WordyAndNerdy: 4chan is not only English: see DoggoLingo: “A form of English-language Internet slang related to dogs”. Compare Category:Rotwelsch, etc. J3133 (talk) 06:11, 30 July 2022 (UTC)
- "Category:[Language] [word type]" is the standard naming convention of lexical categories. Category:English irregular nouns, Category:English onomatopoeias, Category:English fandom slang, etc. This category contains only English-language DoggoLingo terms, and thus the correct name should be "Category:English DoggoLingo". German-language DoggoLingo terms would go under "Category:German DoggoLingo", French DoggoLingo would go under "Category:French DoggoLingo", etc. (Presuming this meme has spread to other languages.) WordyAndNerdy (talk) 06:24, 30 July 2022 (UTC)
- @WordyAndNerdy: I just stated why it does not have “English” in the title: see DoggoLingo: “A form of English-language Internet slang related to dogs”. For the same reason, Category:Rotwelsch is not “German Rotwelsch”, etc. J3133 (talk) 06:29, 30 July 2022 (UTC)
- We could use some empirical data here. Does DoggoLingo or a close equivalent actually exist in German or French? If it does, that provides some reason to approve this proposal (and possibly to update the relevant articles). If not, it provides some reason to reject it. 98.170.164.88 06:40, 30 July 2022 (UTC)
- I agree with this. WordyAndNerdy, do you have proof that Internet slang related to dogs (i.e., of the type of DoggoLingo) exists in other languages, and would use the same name derived from English slang? J3133 (talk) 06:45, 30 July 2022 (UTC)
- *deep existential sigh* English-language lexical categories have an established naming convention. I have never seen an English-language lexical category that was just "Category:Word type" (e.g. "Category:Fandom slang", "Category:Military slang", etc.) in 10+ years of contributing. Can't speak for lexical categories in other languages, but if someone wants to change an established convention, they need to do so by obtaining consensus, not by unilaterally imposing a new standard. This is an extremely straightforward request and having to get bogged down in bureaucratic discussions like this means less time for doing productive things like attesting Internet slang. WordyAndNerdy (talk) 07:15, 30 July 2022 (UTC)
- The difference is that “DoggoLingo” is a proper noun, not just another word type. J3133 (talk) 07:18, 30 July 2022 (UTC)
- *deeper existential sigh* This reasoning is, to be perfectly frank, bizarre and arbitrary. Twitch-speak is a proper noun too. Guess what the relevant English-language lexical category is named? There's an established convention, and this category's name doesn't follow it. WordyAndNerdy (talk) 07:27, 30 July 2022 (UTC)
- A proper noun that specifically refers to English, if you are not already aware. Like Rotwelsch is a proper noun referring to German. J3133 (talk) 07:30, 30 July 2022 (UTC)
- The Wikipedia article defines DoggoLingo as an "Internet language" and doesn't specify that it's limited exclusively to English in said definition. In any case, this is completely perpendicular to the issue of what the category should be named. No one had to prove the existence of Dutch Twitch-speak, Korean Twitch-speak, etc. to create "Category:English Twitch-speak." That's what the category ought to be named following the established naming convention of English-language lexical categories. (And given that you haven't incorporated this category into the category tree module -- which is like step two of creating a new category -- maybe it isn't prudent to act as if you have special expertise or authority in this area.) WordyAndNerdy (talk) 07:51, 30 July 2022 (UTC)
- You are the one acting as if you have authority here, though. Category:Rotwelsch is not in the category tree either; not sure what your point is. J3133 (talk) 07:54, 30 July 2022 (UTC)
- Let consensus decide, instead of assuming this is a “straightforward request”. J3133 (talk) 08:02, 30 July 2022 (UTC)
- Support. I'm reminded of the discussion I started about how to handle Category:Thieves' cant. Binarystep (talk) 06:37, 31 July 2022 (UTC)
- And the consensus of that discussion seems to be that "Thieves' cant" is a strictly English historical example of criminal slang, and that the non-English entries in Category:Thieves' cant should be moved to language-specific criminal slang subcategories- the opposite of this proposal.
- It's true that there's a naming convention to put language names in category names, but that doesn't apply to this kind of entry, and saying it does shows a misunderstanding of the convention. While there's nothing to stop other languages from having their own equivalents to DoggoLingo, it seems to have been created by English-speakers using humor based on the peculiarities of the English language. If other languages come up with their own equivalents, I sincerely doubt that they would be called DoggoLingo. DoggoLingo is a variety of English, just like pig Latin and double Dutch, and "English DoggoLingo" would be redundant. Chuck Entz (talk) 08:13, 31 July 2022 (UTC)
- Funny you should mention Pig Latin, since the category for that is called Category:English Pig Latin terms. Binarystep (talk) 02:54, 1 August 2022 (UTC)
- For the record, that category is very poorly formatted unsurprisingly. There’s no overarching Category:Pig Latin or Category Pig Latin terms, nor does there seem to be other languages linked to it, so there really shouldn’t be an English label there. AG202 (talk) 11:02, 1 August 2022 (UTC)
- I agree, the English label should be removed; see the RFM. J3133 (talk) 11:16, 1 August 2022 (UTC)
- As I did not vote: oppose per Chuck Entz. J3133 (talk) 10:02, 31 July 2022 (UTC)
2022 — August
As AG202 stated in the DoggoLingo category RFM, “There’s no overarching Category:Pig Latin or Category Pig Latin terms, nor does there seem to be other languages linked to it, so there really shouldn’t be an English label there.” This was after Chuck Entz used the argument there that “English DoggoLingo” would be redundant, “just like pig Latin”, then Binarystep pointed out that the Pig Latin category does use the English label—redundantly. J3133 (talk) 11:24, 1 August 2022 (UTC)
This idiom is far more versatile than the specific and somewhat informal phrasing we have here (which doesn't even match the quotation we have), it's a fully fledged verb phrase — see the examples at Teaching grandmother to suck eggs.
Two points: there is such a wide range of familiar terms for grandmother that can be used in this phrase so I think it's best to stick with "grandmother". However I think it's worth investigating if it's more common with or without the possessive pronoun (here "one's"); to me it sounds more natural with it but there are citations both ways. 86.145.59.120 18:42, 14 August 2022 (UTC)
- We also have teach grandma how to suck eggs. J3133 (talk) 08:19, 15 August 2022 (UTC)
- I'm somewhat inclined to pick a most common or general negative form to lemmatize like not teach grandmother how to suck eggs, and also have the positive form (maybe teach grandmother to suck eggs since a possessive doesn't seem required? or if a pronoun is more common, then redirect the pronounless form to the pronouned form, either works). This is both because it's unclear how many translations can have the negative removed and because in general, as I said in the discussion of all it's cracked up to be further up this page, when we redirect a negative expression to a positive one or vice versa there's a risk that a reader who doesn't notice they were redirected will come away thinking the phrase means the opposite of what it actually means. To avoid duplication we could make the negative form almost a soft redirect, defining it like "To not teach grandmother to suck eggs (“presume to give advice to someone who is more experienced”)" or even "To not teach grandmother to suck eggs (see that entry)"; I don't know, I don't like splitting content across multiple pages, but I also think it's risky to silently strip away the negative polarity with a seamless little redirect and expect IPs who sometimes don't even notice they're on Wiktionary and not Wikipedia to notice and understand that the polarity of the headword has changed and thus that the definition of the term they looked up is the opposite of the one we're giving them. - -sche (discuss) 15:20, 19 August 2022 (UTC)
- Negative polarity is "licensed" in many forms, starting with the negative being separated from the rest of the expression: conditionals, questions, infinitives with certain verbs (eg, try to) or other expressions (eg. hard to). These might lead someone to look up the positive form. I think that a "negative-polarity item" label (with link to WP or our Glossary), usage examples with adjoining and disjoint not and n't, and redirects would enable us to use the positive form as the lemma. I don't see how to use redirects in the other direction. Even usage examples would be problematic with not in the headwords. DCDuring (talk) 21:07, 19 August 2022 (UTC)
- What I mean is, I'm somewhat inclined to have both "not teach grandmother to suck eggs" defined as "not give advice to someone more experienced", and then also "teach grandmother to suck eggs" defined as "give advice to someone more experienced", redirecting all the various negative forms to the first one and the positive forms to the second one. But I'm not opposed to only having the positive form and redirecting everything to it; I do dislike splitting content across multiple pages, I just also think there's always a danger when someone types "not teach grandmother to suck eggs" into the search bar and is seamlessly sent to "teach grandmother to suck eggs" where they read a definition that's inverted from that of the term they typed in and which they think they looked up. - -sche (discuss) 21:46, 19 August 2022 (UTC)
On the subject of removing "English" from certain category names (like several requests above)... should we just call this "Category:Translation hubs"? Even rare cases currently discussed in the Tea Room where a proverb exists in several other languages but not English (which we currently don't categorize at all, but which Fay Freak says he creates templates to usefully crosslink translations of, which probably should be categorized together with our other translations hubs) are better handled with one "Translation hubs" category than with, say, "Russian translation hubs", "Chinese translation hubs", no? - -sche (discuss) 19:26, 21 August 2022 (UTC)
- Support: More concise and less confusing. — excarnateSojourner (talk · contrib) 07:00, 22 May 2023 (UTC)
Split [zhx-pin] into [cnp] and [csp]
[zhx-pin] is an etym-only code added back in 2014 (diff) as [pinhua] and later renamed to [zhx-pin] in 2019. [cnp] and [csp] are ISO 639-3 codes added in January 2020. Note that the current data module incorrectly suggests [yue] (Cantonese) to be the parent of [zhx-pin], but they are generally considered to be distinct, which is mentioned in ISO's comment on the change request. -- Wpi31 (talk) 14:40, 23 August 2022 (UTC)
- Support 沈澄心✉ 12:29, 6 October 2022 (UTC)
- Support — justin(r)leung { (t...) | c=› } 16:10, 6 October 2022 (UTC)
Should be moved to a different title as “gender-neutral” is misleading: e.g., femxle, mxn, and womxn are not gender-neutral (only one is—Mixter). Created by WordyAndNerdy who stated “This is the senseid name used to link the -x- infix. Maybe a different name would work better, but this senseid is already baked into links in entries and the category name.” J3133 (talk) 11:03, 24 August 2022 (UTC)
- I have no strong feelings in this matter. This was created after -x, which required a disambiguating sense-id, as there are multiple distinct senses. The simplest solution here would be to just cut "gender-neutral" out of the category name since there is currently only one English sense for the infix. WordyAndNerdy (talk) 11:41, 24 August 2022 (UTC)
- If we decide this is best handled as an affix, I agree we should try to find a better name (possibly just "...words infixed with -x-" as WordyAndNerdy says). Whether this is best handled as an affix is under discussion at Wiktionary:Tea room/2022/August#uses_of_x:_to_cover_at_x_or_as_affixes_-x_and_-x-?. - -sche (discuss) 23:21, 24 August 2022 (UTC)
Should have one's heart in one's boots be moved to just one's heart in one's boots because it also occurs without have (e.g. when someone stands/waits/etc google books:"with her heart in her boots")? That is why have was dropped from one's heart in one's mouth, according to the edit history. FWIW all three expressions can be found without even the pronouns, as in google books:"heart in throat". - -sche (discuss) 10:35, 28 August 2022 (UTC)
- The throat form has been moved to omit have, and is defined as a synonym of the mouth form. — excarnateSojourner (talk · contrib) 22:17, 25 February 2023 (UTC)
2022 — September
bighorn sheep and bighorn
Plenty of overlap, spesh with translations. Maybe there's just one species called this, maybe two... something for the animal nerds here... you know who you are Almostonurmind (talk) 00:48, 8 September 2022 (UTC)
- Not finding evidence that O. dalli is ever called "bighorn" or "bighorn sheep". It's called Dall sheep or thinhorn sheep AFAICT. —Mahāgaja · talk 07:09, 8 September 2022 (UTC)
- Formally, that's probably true, though I doubt most people make the distinction consistently colloquially. But that wouldn't be particular to bighorn. I think people who didn't make the distinction would be just as likely to use bighorn sheep when describing Dall's sheep. Andrew Sheedy (talk) 15:18, 21 October 2022 (UTC)
- I have split both into two subsenses and RfVed the O. dalli subsenses. I have not yet found any evidence that either term is applied to O. dalli. I would include O. dalli and thinhorn sheep under See also at both of these entries. DCDuring (talk) 15:57, 21 October 2022 (UTC)
Re-merge Kven and Meänkieli into Finnish
@-sche, Chuck Entz, Rua, Tropylium, Hekaheka, Surjection, Brittletheories, Mölli-Möllerö
In the previous discussion on this topic ([1]) it seems everyone has agreed that it's best to merge Kven and Meänkieli into Finnish. However, the discussion was closed without actually merging the codes, and currently we (again) have 40 Kven and 30 Meänkieli lemmas, many of which are also duplicated as Finnish for the reasons discussed in the above discussion. Has anyone changed their opinion or does anyone have anything to add to this or can we actually go ahead and merge the languages?
I guess related to this is also the question of how to handle dialectal morphology of Finnish dialects, but maybe that's a bit out of scope for this discussion. Thadh (talk) 16:24, 23 September 2022 (UTC)
- The strongest arguments in favour of splitting them are political and should therefore be ignored. Our task is to best present the most information, and that would best be achieved by merging the three lects. The dozens or so new dialectal terms will fit in quite well with the 1250 pre-existing ones. brittletheories (talk) 16:49, 23 September 2022 (UTC)
- Incubator says "Wikimedia does not decide for itself what is a language and what is a dialect. We follow the ISO 639 standard." This means that it's up to the agency that grants language codes, not to us, right? Meänkieli and Kven have written standards so they should stay as they are. (In my view, Tver Karelian should also be treated as a language so I could add Tver Karelian words without knowing if they're used in the more usual "vienankarjala" dialect.) Mölli-Möllerö (talk) 19:55, 23 September 2022 (UTC)
- The Incubator standards are not the same as our standards. Our language treatment does not strictly follow ISO 639. — SURJECTION / T / C / L / 20:33, 23 September 2022 (UTC)
- @Mölli-Möllerö: On the Tver Karelian issue, you could also just leave the first parameter of {{krl-regional}} empty or |1=? it, and it will automatically be sorted in Category:Karelian term requests, and I'll be able to add the terms later. Or you could use either {{R:krl:KKS}} or another Viena source, the correspondences are usually quite easy. Thadh (talk) 20:44, 23 September 2022 (UTC)
- Wrong. There's a big difference between Wikimedia's administrative needs and the lexical needs of a dictionary. As for written standards: the world is full of languages with multiple written standards: Brazilian and European Portuguese, European and Canadian French, Austrian and German German, etc. We can't let others decide for us- each case needs to be considered on its own. We've chosen to merge languages treated as separate by ISO and recognize languages with no ISO codes. In other cases we've gone with the ISO. Chuck Entz (talk) 20:59, 23 September 2022 (UTC)
- For outsiders, Meänkieli (in Sweden) and Kven (in Norway) are languages or rather dialects that have become languages by virtue of being across the border (the Finnish-Swedish border and the Finnish-Norwegian border, respectively). Finnish speakers can easily understand nearly 99% of Meänkieli or Kven, and the main differences are either dialectal features also found in Far Northern Ostrobothnian dialects or (the lack of) recent developments within the past 200 years (in one or the other).
- Linguistically they are 100% dialects, but politically both Sweden and Norway respectively have recognized them as separate languages, which is also what their speakers think. A more cynical person might say that they have deluded themselves into thinking their language is not Finnish in order to avoid persecution of Finnish that was prevalent in Sweden and Norway in the 19th and 20th centuries ("Finnish? what Finnish? we're not speaking Finnish, it's Meänkieli/Kven").
- However Wiktionary best handles cases like these, I don't know. 200 years is not enough for what is generally a phonologically conservative language for it to become anywhere near unrecognizable. It could be compared to how Karelian is now almost universally treated as a separate language, even though it forms a dialect continuum and has been diverging now for at least about 800 years (ever since the 1323 Treaty of Nöteborg).
- Finnish sources almost exclusively consider Meänkieli and Kven to be dialects, even more so when these sources are linguistic-oriented (some other sources take a political stance and recognize that they are considered "minority languages" in their respective countries). — SURJECTION / T / C / L / 20:34, 23 September 2022 (UTC)
- "The main differences are either dialectal features also found in Far Northern Ostrobothnian dialects or (the lack of) recent developments within the past 200 years (in one or the other)"... and the additional Swedish/Norwegian loanwords found in Meänkieli/Kven, of course. But many of these are also found in Finnish dialects. — SURJECTION / T / C / L / 21:37, 23 September 2022 (UTC)
- The divergence of Karelian from Finnish, FWIW, almost certainly goes back at least 1200 years (to the archeological / mentioned-in-Novgorod-sources Old Karelian culture). The initial split-off of Northern Finnish dialects is probably about as old too.
- What I would think of as the best argument against treating Meänkieli and Kven as languages is that they're not even internally well-defined — typically they're just catch-all terms for "Northern Finnish in Sweden" and "Northern Finnish in Finnmark" with relatively various dialects encompassed by each. There's some efforts (schoolbooks, etc.) towards a "standard" Meänkieli based on the Torne Valley dialect but I don't think it could be called actually standardized just yet. I suppose one thing we could do is to document whatever is done on this specifically under "Meänkieli" and leave anything else as dialectal Finnish, but that might be a bit premature still too. --Tropylium (talk) 07:44, 24 September 2022 (UTC)
- I would not say that "everybody" agreed on the merger. I didn't. I can only comment Meänkieli but I would not be surprised if similar argumentation would also apply for Kven:
- The overall small number of Meänkieli words in Wiktionary only proves that we don't have an active editor in Meänkieli. There seem to be some 30,000 entries in this Meänkieli-Finnish-Swedish dictionary[13].
- The small sample of words we have proves nothing of similarity of the vocabularies. If you study the dictionary I mentioned (press "tutki") you'll find that there are considerable differences between Finnish and Meänkieli. In addition to vocabulary, conjugation of verbs seems to differ (e.g. Meänkieli: tukeat - Finnish: tuet - English: you support).
- This article[14] promotes the opinion that Meänkieli is a dialect. However the writers admit that the two are not readily mutually understandable: Finnish-speakers usually understand Meänkieli relatively well, partly because of their knowledge of Swedish, but for Meänkieli speakers Finnish isn't as easy. If we took a Finn who does not know a word of Swedish, they would be lost with a Meänkieli speaker.
- This article[15] starts from the maxim that Meänkieli is a dialect of Finnish but finishes with the conclusion that at the end of the day it is the speakers of a language themselves who decide the status of a language/dialect. Meänkieli speakers have made their opinion clear: they want it treated as a language. How competent are we to second-guess their point of view? Has any of us studied Meänkieli more than superficially?
- Here is also a link to a Kven-Norwegian dictionary[16]--Hekaheka (talk) 09:44, 24 September 2022 (UTC)
- To be fair all these points would still hold for Ingrian and Savonian dialects, too, and of Ingrian dialects I'm fairly certain no Finnish speaker would readily understand them much better than, say, Izhorian or Karelian. Thadh (talk) 09:51, 24 September 2022 (UTC)
- A clear-cut solution would be to stick to ISO. Ingrian has an ISO code, Savo hasn't. Is Ingrian currently treated as Finnish dialect? I think it shouldn't. --Hekaheka (talk) 12:05, 24 September 2022 (UTC)
- You're confusing Ingrian (inkeroinen) and Ingrian (inkerin (suomalainen)). The first one is the same as Izhorian and is handled as a distinct language, has an iso code, and is spoken by the orthodox Izhorians. The latter one is the same as Ingrian Finnish and is handled as a Finnish dialect, does not have an iso code, and is spoken by the lutheran Ingrian Finns. My remark concerned the latter. Thadh (talk) 13:46, 24 September 2022 (UTC)
- I've come around to say that I think they should be merged. We don't consider Valencian, Ulster Scots nor Lemko (the linguistic case is very similar between those examples and this one) to be their own languages despite political arguments that they should be considered as such (and even some recognition like in the ECRML). We shouldn't do so here either. And don't even mention the whole thing going on with Serbo-Croatian... The general trend on en.wikt seems to be to consider the linguistic argument more important than any political ones (which I can appreciate). — SURJECTION / T / C / L / 11:51, 3 October 2022 (UTC)
- As a Norwegian, I find it odd that there is a proposal to merge Kven with Finnish - as Kven is an officially recognized minority language in Norway (Finnish is not). I do not agree with this merge, for the following reasons:
- At least in Norway, Kven and Finnish are considered separate languages. You are able to get elementary school education and books in Kven (but not in Finnish, as far as I know) - you can even study Kven at the University of Tromsø and receive a bachelor's and master's degree in the language (there is a Finnish one as well, and they are considered two separate degrees). Kven people are considered a separate ethnicity, along with their language, descendant from Finns/Finnish.
- Political reasons are of course relevant, not just linguistic ones. The average Kven speaker has never set foot in Finland, never studied any Finnish, nor consumed any part of Finnish culture and media (music, literature, etc.). An argument was that Finnish speakers understand 99% of Kven - as a Norwegian I understand up to 99% of Swedish and Danish, but they are not getting merged into one language called Scandinavian (for political reasons).
- If merged, then in theory thousands of new Finnish entries on Wiktionary would emerge, in the form of "dialectal" words which are actually Kven words. If someone bothered to add them all (I, stubbornly, might) - then every Kven word and declension would need to be added under Finnish, and certain words and forms which don't even exist in Finnish dialects in Finland would be present. Every Kven word, even if the nominative singular is identical to Finnish, has a separate declension chart, every single one - there would then need to be a separate template to show these (I think Finnish Wiktionarians would be quite annoyed by this).
- Kvens in Norway have fought very hard for their language, they have gotten their own language institute with a promotion of literature and culture in the Kven language - erasing their language from Wiktionary and treating it as a dialect of a language they don't even speak would be a huge slap in the face. Finns in Finland who speak a dialect of Finnish, also all know standard Finnish, Kven people do not. If a Kven person handed in an essay at a school in Finland, every other word would be marked as wrong or a typo. Supevan (talk) 22:49, 2 November 2022 (UTC)
- This entire argument can be boiled down to "Kven is standardized". So is Valencian and Croatian, but we still don't treat them as separate languages. — SURJECTION / T / C / L / 14:57, 5 November 2022 (UTC)
- @Surjection: Actually, Kven isn't firmly standardised afaik. Thadh (talk) 14:58, 5 November 2022 (UTC)
- We should. Supevan (talk) 17:35, 5 November 2022 (UTC)
- @Supevan Most of these points were already raised for Meänkieli. I will try to answer them anyways.
- 1) First, our standard procedure is to emphasise linguistics over politics, even when much more controversial (see WT:Serbo-Croatian).
- 2) Secondly, and most importantly, you claim all Kven inflection should be incorporated into Finnish. This is false. There is already a ridiculous amount of variation in the inflection of the various Finnish dialects, and none of it is represented here. We simply do not have the capacity to maintain 30 different tables containing dozens of inflected forms. Additionally, natives do not stick to one variety of Finnish but mix standard Finnish grammar with that from various dialects and registers. It would also be naive to assume that Kven speakers all use one well-defined standard themselves. A language with a morphology as rich as that of Finnish leaves much space for variation.
- 3) You say, "thousands of new Finnish entries [– –] would emerge, in the form of 'dialectal' words which are actually Kven words", but this is only true if one assumes Kven not to be a collection of Finnish dialects, which is not a popular opinion among linguists. Besides, only a small number of these terms are exclusive to the Ruija dialects.
- brittletheories (talk) 13:46, 27 January 2023 (UTC)
2022 — October
ghc: Classical Gaelic aka Early Modern {Irish / Gaelic}
I’d like to propose adding the Classical Gaelic language with the code ghc to Wiktionary, ie. split the ghc (called Hiberno-Scottish Gaelic in Ethnologue) code from ga and gd.
The code had existed on Wiktionary before (due to being an accepted ISO-639-3 code) but it was merged in 2013 into ga and gd and the move was backed by two arguments: “it seems crazy the number of Irishes we have over time” and “[t]here's no reason 17th-century Irish can't be simply ga (…) [s]peakers of Modern Irish have no more difficulty reading Geoffrey Keating than speakers of Modern English have reading Shakespeare” and I believe this merger was a mistake – especially when we also keep Old and Middle Irish distinct.
The first argument is just a subjective view (on a language with pretty good attestation from 4th century CE till today) of a person not familiar with the history of Goidelic languages. Also, we somehow have no problem with the amount of old- and East-Slavics that we have (reconstructed Proto-Slavic, Old Church Slavonic, Old East Slavic, Old Novgorod, Old Ruthenian – all of them often listing exactly the same forms).
The second one is not applicable to the stage in question in general. The Early Modern stage of Irish and Scottish Gaelic covers the language from the early 13th century up to late 17th century. That’s half a millennium of language change. Now, we treat this 13th century stage as modern Irish (mostly, rarely as Scottish Gaelic), thus the Irish label and the ga language code is supposed to cover everything Irish from conservative 13th century literature up to colloquial 21st century language.
Also important to note is the term Classical Gaelic (sometimes Classical Irish) – generally applied to the literary standard created in late 12th century and used consistently in dán díreach over the centuries and taught in bardic schools of late medieval and early modern Ireland and Scotland. From The linguistic training of the mediaeval Irish poet, Brian Ó Cuív (1973), DIAS:
Nowadays we regard Early Modern Irish as beginning about the end of the twelfth century. This view is based on the fact that from that time on professional poetry has two distinctive features, the first linguistic, the second metrical. On the linguistic side we can observe the poets using, as a literary medium, a standard language which seems to have had as its basis a normative or prescriptive grammar. On the metrical side we have, from about the year 1200 on, a clear-cut distinction between strict versification in syllabic metres, dán díreach, and other types, such as óglachas and brúilingeacht. (…). I have discussed this development in a recent article in Éigse where I have suggested that the final stage in the development was reached in the second half of the twelfth century.
(…)
I have implied that the vernacular showed variation from place to place, and we may be sure that even within any one area it showed variation between speakers of one age-group and another. The master-poets did not balk at these difficulties. What they did, it would appear, was to examine fairly thoroughly the various current forms of speech against the background of the existing literary usage, taking into account both morphology and syntax. If what they observed of the language at that time had been written down and identified according to regions, and if the manuscripts containing their observations had survived the vicissitudes of the intervening centuries, we would have to-day a fascinating and unique collection of descriptive linguistic material. However, what the poets did was to co-ordinate this material to produce a prescriptive grammar. I suggest that the resultant literary language about the year 1200 had the following elements:
- A large residuum of the older language surviving in all areas in modern form (i.e. allowing for phonetic changes, etc.),
- Variant forms which had been in use in the language for some considerable time and which were retained either generally or regionally in the modern language,
- Modern speech-forms in general use, adopted to the exclusion, or near exclusion, of the corresponding old forms where such forms had existed,
- Modern speech-forms, possibly in use at local level only, adopted beside surviving old forms,
- Modern speech-forms showing variation, possibly reflecting regional differences, adopted to the exclusion of the corresponding old forms where such forms had existed,
- An archaic element consisting of forms which, being either obsolete or obsolescent, were not normal in the ordinary language.
This language was a codified standard of late 12th century spoken Gaelic dialects and includes some features long lost in modern Gaelic languages (and as Ó Cuív mentioned, probably not common already in 13th c.): infixed object pronouns, conjugated copula, accusative-of-motion, accusative direct objects, etc. It is well attested in bardic poetry of 13th–17th centuries and its grammar is described in bardic grammatical tracts (some of the most important of those were published by Osborn Bergin as Irish Grammatical Tracts and by Lambert McKenna as Bardic Syntactical Tracts). Morphologically and syntactically this language is closer to Middle Irish than to Modern Irish or Scottish Gaelic.
(It also was not used in prose – it was purely a poetic standard. Even parts of poems written in prose do not adhere to it, and also the aforementioned grammatical tracts do not follow it closely in their main text, they just explain its rules and give examples in verse that do follow the standard closely.)
So giving Geoffrey Keating as an example of a ghc-language author was cherry-picking a prose author from the most recent stage of the 500-year-long period.
It was also claimed in the old discussion that Early Modern words can be added to either Irish or Scottish Gaelic as obsolete or archaic, wherever they are attested. But how exactly do you classify words used by Scottish poets writing a praise-poem for a Connacht king and preserved in an Irish manuscript? Or a Munster poet living in Scotland and writing for Scottish lords? That happened, and both Irish and Scottish poets used the same standard, and as I understand, they were trained in the same bardic schools.
The lack of Classical Gaelic as an acknowledged stage on Wiktionary limits its usefulness to anyone interested in pre-18th century Irish. It also makes us list some nonsensical forms. For example the verb ibh (“drink”) is the standard bardic verb meaning ‘drink’, but as far as I’m aware it hasn’t been used in spoken Irish for centuries. But Wiktionary lists inflected forms of Irish verbs and thus we list the regular present tense form ibheann in that article – this form is not attested in the Irish historical corpus in texts from 1600–1926 even once. The standard bardic 3rd person sg. of this verb is ibhidh (independent) or ibh (dependent, with ibheann hypothetical but unattested(?) variant).
We also list adhaigh (“night”) which, in this nominative form, does not exist anywhere outside of bardic poetry for centuries, having been replaced by dative form oíche, oidhche. And yet not that long ago we listed nonsensical unattested forms *adhaighe and *adhaigheanta as if it were a regular modern variant.
Those terms exist in FGB and Dinneen’s dictionary (which generally are modern Irish dictionaries), but that’s just because they do sometimes list medieval words as literary.
There are also terms used in the grammatical tracts like taoibhréim for ‘genitive’ which are not used and understood today, not even listed in FGB or Dinneen’s – I won’t add them to Wiktionary because I am not sure if I should do that under ga, gd, or nowhere. And whether I should provide inflection for them or not. I did add the classical sense and examples for nar but am not sure if it was the right thing to do, since that’s also not part of the language for centuries.
Thus my proposal is to restore ghc as a separate language code under the header Classical Gaelic. Since the language was a prescriptive literary standard, I wouldn’t consider it the direct ancestor of Irish or Scottish Gaelic but rather an independent well-attested historical stage (that sometimes would be useful in etymology sections though). I don’t see a problem with continuing to include 16th (maybe 15th) century ⁊ later Irish and Scottish Gaelic prose under the modern languages, but I would consider everything from bardic verse and grammatical tracts to be Classical Gaelic instead. If, for a given lemma, usage differed in prose and poetry, usage notes could clarify what was classical and what was early modern.
Alternative solution: if the “crazy number of Irishes” is an issue, we could merge Old Irish (sga), Middle Irish (mga) and add Classical Gaelic to them – especially since the line between Old and Middle Irish often isn’t clear (while there are some unambiguously Old Irish texts, many Middle Irish manuscripts contain Old Irish stories that mix older and newer forms, the two stages also use mostly the same spelling although that changes later due to some MIr. sound changes) – all three under a single heading like Early Irish, Early Gaelic, or similar. Then, how the forms changed could be documented inside that entry, somewhat similarly to what we do with Ancient Greek (which tries to cover over 2 millennia of development). That’s also what the Dictionary of the Irish Language (the historical dictionary for Old/Middle Irish) does, it covers everything from 7th century until 17th century (but does it in a very confusing, hard to use way if you’re not very well familiar with all the sources it cites – we could be better here). // Silmeth @talk 19:32, 4 October 2022 (UTC)
- This proposal seems pretty reasonable to me. I particularly agree with the points about the inflectional morphology of 13th-17th century Classical Gaelic which is severely under-represented on Wiktionary at the moment. It doesn't make much sense to ignore the inflectional morphology of this significant stage in the language as the developments since then have been too extensive for it to count under 21st century varieties. This includes the nominal morphology with the extra accusative case, dual number, and dative plural endings, and an extensive range of copular and verbal forms with personal endings that are now long gone from any modern variety. It would be hard to imagine either L1 or L2 speakers of today's Gaelic being able to recognise such forms without a significant degree of study. I think making space for Classical Gaelic makes the most sense. It certainly doesn't make any sense to lump it all under Modern Irish when it served as a literary standard for both Ireland and Scotland during the Medieval period.
- I would say that if we don't assign a space to this period of Gaelic then we should at least consider a massive upgrade to the inflectional morphology under both Modern Irish and Scottish Gaelic (because why only one and not the other?). One of the challenges with upgrading the forms for the modern varieties would be within the realm of the significant orthographic developments that have occurred since Classical Gaelic. I'm not sure what the best approach would be to achieve this in order to represent a wide range of inflections that no longer occur in either variety, both having their own orthographic standards. It might be easier to achieve with Scottish Gaelic since its orthography is generally more conservative, and closer to that of Classical Gaelic. Though, this ultimately suggests that it would be easier to set a single space for Classical Gaelic with its own orthographic and morphological standards and definitely less work overall while achieving a much better representation of this historical form that is barely being represented at all at the moment.
- Currently, the jump between Old/Middle Irish to Modern Irish and Scottish Gaelic is too great a leap to properly show the respective etymologies and inflections. I hope this proposal goes ahead and am excited to see what this could lead to. Erisceres (talk) 21:59, 4 October 2022 (UTC)
- I forgot the pings: @Marcas.oduinn, Mahagaja, Mellohi!, Moilleadóir, Rua, Catsidhe, Embryomystic, Akerbeltz – not sure who else is active in (historical) Gaelic // Silmeth @talk 11:43, 5 October 2022 (UTC)
- I'm still not convinced that Early Modern Irish and Classical Gaelic can't be adequately covered with "ga", "gd", and generous use of the {{lb|ga|archaic}} and {{lb|ga|obsolete}} labels, but whatever. —Mahāgaja · talk 12:10, 5 October 2022 (UTC)
- What about a special label like we do with {{lb|pl|Middle Polish}}? Vininn126 (talk) 12:40, 5 October 2022 (UTC)
- Should we then put them under ga or gd? I guess ga by default cause most of the sources are from Ireland, but then what about those Scottish poets and some Scottish manuscripts? Should we just list basically the same bardic language under two different headers?
- What about inflection tables? Should we remove the inflection forms from ibh completely, or replace them with classical ones marking each of them as classical? What with stuff like feic which classically was faic (doesn’t exist right now, its Scottish Gaelic cognate’s entry is there though) and had forms like 1st sg. do-chiú, ad-chiú – should I create the entries faic and faicsin under Irish as “classical/obsolete form/verbal noun of feic” and add obsolete/classical forms to the main modern entry? What about ag derived from OIr. *aicc ‘see’ used in phrases ag so (+ accusative) ‘this is’, ag sin (+ accusative) ‘that is’ – from which modern Irish sin fear, sin é an fear and Sc. Gaelic seo mo mhàthair, etc. phrases come, but have dropped the ag part and were reanalyzed as copular?
- Should we list classical pronunciation under modern Irish and Sc. Gaelic headers? That’s important for the classical poetry (syllabification especially differs from modern languages) and often not reflective of modern spelling (although, truth be told, dialectal pronunciations often diverge from the standard spelling too).
- I mean, sure we could treat anything 13th century+ as modern Irish (and sometimes Sc. Gaelic) but if we do, I’d like to have some clear policy on it, and one that makes it clear to the reader that a given form is not modern. And as I wrote above, in such a case I’d rather group classical language under one header with Old and Middle Irish.
- I know Old Irish scholars often do use the term Modern Irish to refer to anything 13th century+, but it does not make forms like adhaigh, faic, do-chiú, fhiora, caiméal, meic, ibhidh, inéasad, -fuile, do bhádhas, etc. modern (coincidentally, thanks to the choice of imperative as the headword form of verbs, the hypothetical classical and modern-Irish-in-pre-reform orthography headwords often are the same) // Silmeth @talk 13:12, 5 October 2022 (UTC)
- This proposal seems more than reasonable to me. We cannot adequately accommodate Classical Gaelic under gd/ga plus an archaic label because that would simply create a weird overlap with words which are archaic in MODERN ga/gd but aren't as far back as Classical Gaelic. For example gd has the archaic verb fimir 'must' which became archaic/defunct when the dialects around Inverness died off in the last 50-100 years but AFAIK it's not even attested as a form in Classical Gaelic.
- I don't think the number of Irishes is crazy, if it comes across as such to some people, it's perhaps merely a reflection of the unusually long history of writing in Ireland which goes back further than other less well documented languages which only joined the writing party centuries later. Akerbeltz (talk) 13:26, 5 October 2022 (UTC)
- Support I am convinced that the grammar and vocabulary of Classical Gaelic is distinct enough from today's Irish that they should be treated separately. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 15:38, 5 October 2022 (UTC)
- Support Well, I'm convinced. embryomystic (talk) 20:26, 5 October 2022 (UTC)
- Support I am not an expert on the history of the Gaelic languages, but it seems to me that Silmeth who is much more knowledgeable than I am has presented a very well argued case for adopting the use of the code ghc in this way.
- I don’t know where the code ghc “Hiberno-Scottish Gaelic” came from. I guess it originated with Ethnologue and mostly from an ill-conceived idea of covering the in-between Gaelic of Rathlin Island. Rathlin Island Gaelic was never large or different enough to merit a language code all of its very own, and sadly it has long since gone as a living language. Hence the code has been abandoned by Ethnologue and marked as “historical” by ISO 639-3. And since then the code ghc seems to have been increasingly “repurposed”, or “more clearly defined” if you like, a lot further back in time. The ISO 639-3 page for ghc links to “ISO 639:ghc” on Wikipedia, which redirects to “Early Modern Irish”. It also links to code hibe1235 on Glottolog, which links to the Wikipedia page “Classical Gaelic”. It seems to me to be a good idea to allow code ghc on Wiktionary and to define it clearly to be the Classical Gaelic of the bardic schools.
- At times when we need to force languages into a genetic tree structure, I guess that the Goidelic tree would show pgl as ancestor of sga, which is ancestor of mga, which is immediate ancestor of all of ghc, ga, gd and gv? (Rather than ghc being ancestor of ga, gd and gv.) I guess this will make the descendant “trees” on Wiktionary look a bit odd at times, with ghc forms similar to parent mga forms being shown on the same level as the modern languages ga, gd and gv.
- @Caoimhin: Yeah, that’s how I imagine it, pgl → sga → mga → {ghc, ga, gd, gv}.
- Except that the ghc head-words would often look closer to modern Irish and/or Scottish Gaelic forms – because the spelling was changing in late Middle Ages. (But sometimes, well, longer, eg. classical mathghamhain, Irish mathúin, Gaelic mathan) – many (especially later) classical manuscripts use spelling close to pre-reform modern Irish spelling (or rather – pre-1950s Irish used spelling pretty close to classical). Hence many modern normalized editions of classical texts use basically the spelling of Dinneen’s Dictionary, but there are exceptions – notably Eoin Mac Cárthaigh’s modern edition of the “first” or “Introductory” (though it’s really neither) tract, The Art of Bardic Poetry: A New Edition of Irish Grammatical Tracts I, normalized the spelling in much older fashion, to something closer to what the tract itself prescribes (eg. Gáoidheilg instead of Gaoidhilg or Gaoidheilg; brég instead of bréag, a ttigh instead of i dtigh, etc.). I think we should stick to Dinneen’s spelling though, I believe that’s the most common practice nowadays. We can of course list all variant spellings.
- The inflected forms, if we list them, would be closer to Middle Irish with modernized spelling.
- One thing we’ll need to settle is how we lemmatize verbs:
- we use imperative for modern Irish, Sc. Gaelic, Manx and 3rd sg. for Old and Middle Irish,
- DIL uses 3rd person for classical and so does the word-list in aforementioned The Art of Bardic Poetry,
- léamh.org glossary (mostly based on word list in Aithdioghluim Dána by McKenna) uses 1st person sg., as does Dinneen (for modern pre-reform Irish).
- I don’t have a strong preference in any direction here. // Silmeth @talk 15:25, 6 October 2022 (UTC)
- @Caoimhin: also, you made me think about the code. I see Wikipedia has claimed the ghc code is intended for Gáoidhealg Chlasaiceach at least since 2007, and I can’t find anything publicly available on the Internet that would use the name Hiberno-Scottish Gaelic for anything else. But of course your suggestion that the intention could be to represent Rathlin Gaelic (being basically a Scottish Gaelic variety native to Ireland) makes perfect sense! So I wonder if we have here an example of a mistake made on Wikipedia due to misunderstanding forming the reality instead of describing it. Do you know if there are any documents that would explain the original intention of the code and whether they’d be publicly available? I see the SIL ISO-639-3 website doesn’t list anything (except for noting the type change to “historical” in 2019).
- I guess Ethnologue ed. 15 would be the place to look (but I can’t access that)? I managed to verify that ed. 12 doesn’t list this code or language under Ireland or United Kingdom. // Silmeth @talk 22:32, 6 October 2022 (UTC)
- Ethnologue ed. 15 (2005), page 565:
- Gaelic, Hiberno-Scottish (Gaoidhealg, Hiberno-Scottish Classical Common Gaelic) [ghc] Extinct. Ireland and Scotland. Class: Indo-European, Celtic, Insular, Goidelic. Lg Dev: Roman script. Bible: 1690. Other: Archaic literary language based on 12th century Irish, formerly used by professional classes in Ireland until the 17th century and Scotland until the 18th century. vso.
- 98.170.164.88 22:39, 6 October 2022 (UTC)
- Ah, so indeed it was intended for the classical language, before Wikipedia. That’ll make me sleep a bit easier! :) Also I had no idea Ethnologue is available in Internet Archive, didn’t cross my mind to look there. Thanks for the quote and link! // Silmeth @talk 23:07, 6 October 2022 (UTC)
- Ah, very good! So I was wrong in guessing that ghc was originally meant to refer to Rathlin Island Gaelic and the like. I was misled a bit by the name “Hiberno-Scottish Gaelic”, and by the fact that the code originated with Ethnologue, which normally only deals with living (or very recently extinct) languages. I suppose the ‘c’ in ‘ghc’ maybe even stood for “classical”. By the way, I notified a few people much more knowledgeable than myself and got replies from John Cowan (linguistics, computing, language codes) and David Stifter (Old Irish, early Celtic). Both said that the proposal sounded reasonable to them in principle, even though this was not their main area of expertise. John Cowan reminded me that it was not just the “Classical Gaelic of the bardic schools” as I wrote but that the grammars were aimed also at the prose-writing needs of lawyers, etc. David Stifter was strongly against any suggestion of combining Old and Middle Irish, with a reminder that Old Irish embedded in Middle Irish manuscripts is still Old Irish, not Middle Irish, and that it was normally very clear which is which. Mark Wringe made the comment that the term “Classical Gaelic” refers solely to a written language, and that the terms “Early Modern Gaelic” or “Early Modern Irish” are used when talking about phonology, language evolution, dialects and suchlike. That is ok, since written language is what Wiktionary primarily deals with at the moment, but I guess it means that we should avoid associating descriptions like “aka Early Modern {Irish / Gaelic}” with the code ghc. // --Caoimhin (talk) 13:07, 7 October 2022 (UTC)
- @Caoimhin:
Mark Wringe made the comment that the term “Classical Gaelic” refers to a written language – but that is ok, since written language is what Wiktionary primarily deals with at the moment.
- I would strongly disagree with that. It was a literary language, artificially mixing forms from multiple dialects – yes, but definitely not mainly written. The tracts do focus on pretty minute details of the pronunciation and often mention when something might be spelt differently to how it’s pronounced – and they do emphasize that it is the pronunciation that is important for the poem: rhyming, syllabification, etc., depend on the pronunciation, not on the spelling (and the rules for delenition for example are very strict). We might not know the exact values of the vowels and some consonants (especially how the poets pronounced dh and th – they seem to have merged with gh, sh during Middle Irish, but are still treated differently for the purposes of poems), but the bardic tradition was definitely based on the Gaelic sounds. // Silmeth @talk 16:31, 7 October 2022 (UTC)
- By the way, thank you very much for getting input from all those scholars! I do appreciate it (and didn’t even dare to hope for getting any of them involved in this in any way), it is really great! :) // Silmeth @talk 16:46, 7 October 2022 (UTC)
- @Caoimhin: oh, sorry, I somehow missed the later edits earlier. I totally agree that we should clearly distinguish Early Modern dialects and Classical Gaelic though! And I agree the title I wrote here is bad in this regard.
- That’s why I suggested we don’t treat ghc as the ancestor of modern languages, and continue to allow 16th (or 15th) century and later prose to be included in the modern languages, and that whenever we deal with something outside of the standard, we clearly mark it as non-classical (even if described under Classical Gaelic).
- For example it’s clear that the preposition do-chum has existed since Old Irish (dochum) continuously up to today (chun), and was often used in the early modern prose, but the grammatical tracts deem it incorrect in verse and AFAIK bardic poetry does avoid its usage (using gus an etc. instead). I think we should include do-chum as a lemma under ghc but mark it as (proscribed) and explain its status in the poetry under Usage notes. I hope this approach is reasonable. // Silmeth @talk 09:19, 10 October 2022 (UTC)
- Support. I'm always of the notion that with proper care, separating L2s out like this will be much more beneficial in the long-run for editors and readers. It looks like with a separate header, Classical Gaelic would have more attentiveness towards it rather than being lumped under (Modern) Irish or Scottish Gaelic and having misaligned inflected forms. AG202 (talk) 05:39, 11 October 2022 (UTC)
Support. A sound proposal. We should try to reflect historical facts rather than imposing the later divergence on the Classical language. I’d hardly say I was “active in (historical) Gaelic”, but thanks for the ping. ☸ Moilleadóir ☎ 05:18, 11 October 2022 (UTC)
Support. You've made a very good case for this. I look forward to this being created, wiktionary could serve as a brilliant resource for this stage of the language's history. Moling Luachra (talk) 07:21, 12 October 2022 (UTC)
It’s been over a month and a half with multiple voices of support and no direct opposition (except for one “I'm still not convinced (…) but whatever” voice). So… what happens next? Should I proceed somehow myself? // Silmeth @talk 21:41, 21 November 2022 (UTC)
- Been ~3 months now. Any opinions, suggestions, directions? (Pinging, because I’ve seen ye involved in language treatment changes: @-sche, Mahagaja, Metaknowledge) // Silmeth @talk 15:27, 9 January 2023 (UTC)
- Instead pinging @Benwing, who is more skilled in technical matters and given his participation in the Brythonic dispute. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 02:19, 21 January 2023 (UTC)
- @Benwing2: Wrong account pinged. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 20:53, 21 January 2023 (UTC)
- @Mellohi! I don't think I received your ping and I just noticed this now. What is the request exactly again? Add the code 'ghc' = Classical Gaelic? Can you spell out exactly what needs to be changed in the language data? Apologies, I'm not very familiar with the staging of Irish/Gaelic. Also I'd like to get any comments/thoughts from long-time multi-language editors User:-sche, User:DCDuring and User:Mahagaja to make sure there are no blocking objections. Benwing2 (talk) 07:36, 28 February 2023 (UTC)
- Mahagaja already gave his opinion above: “I'm still not convinced that Early Modern Irish and Classical Gaelic can't be adequately covered with "ga", "gd", and generous use of the {{lb|ga|archaic}} and {{lb|ga|obsolete}} labels, but whatever.”
- As for what the proposal is, as I wrote above:
Thus my proposal is to restore ghc as a separate language code under the header Classical Gaelic. Since the language was a prescriptive literary standard, I wouldn’t consider it the direct ancestor of Irish or Scottish Gaelic but rather an independent well-attested historical stage (that sometimes would be useful in etymology sections though). I don’t see a problem with continuing to include 16th (maybe 15th) century ⁊ later Irish and Scottish Gaelic prose under the modern languages, but I would consider everything from bardic verse and grammatical tracts to be Classical Gaelic instead. If, for a given lemma, usage differed in prose and poetry, usage notes could clarify what was classical and what was early modern.
- ie. ghc as Classical Gaelic, which would be a separate “leaf node” in Goidelic tree descending from Middle Irish, itself not a direct ancestor of any modern languages, pgl → sga → mga → {ghc, ga, gd, gv}.
}. - (Also not sure what your definition of “long-time multi-language editors” is, but I’ve been editing entries for multiple languages on Wiktionary since 2010 – even if not nearly as actively as some others here). // Silmeth @talk 10:17, 28 February 2023 (UTC)
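- To make the tree above concrete, here is a minimal sketch of the proposed ancestry. It is only an illustration: the `PARENT` mapping and the helper names `ancestors` and `is_leaf` are hypothetical and do not reflect the format of Wiktionary's actual language-data modules. It just encodes pgl → sga → mga → {ghc, ga, gd, gv} and checks that ghc descends from Middle Irish without being an ancestor of anything.

```python
# Hypothetical parent map for the Goidelic codes discussed above.
# This is NOT Wiktionary's real language-data format, only an illustration.
PARENT = {
    "pgl": None,   # Primitive Irish (root of this small tree)
    "sga": "pgl",  # Old Irish
    "mga": "sga",  # Middle Irish
    "ghc": "mga",  # Classical Gaelic (proposed leaf node)
    "ga": "mga",   # Irish
    "gd": "mga",   # Scottish Gaelic
    "gv": "mga",   # Manx
}


def ancestors(code):
    """Return the ancestors of `code`, nearest first."""
    chain = []
    parent = PARENT[code]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def is_leaf(code):
    """A code is a leaf if no other code lists it as its parent."""
    return code not in PARENT.values()


if __name__ == "__main__":
    print(ancestors("ghc"))
    print(is_leaf("ghc"))
    print("ghc" in ancestors("ga"))
```

Under this arrangement the modern languages keep Middle Irish as their immediate ancestor, so existing etymology chains for ga, gd and gv would not need to route through ghc.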
- I am not aware of any problem that is solved or caused by such a change at this time. I suppose that there is a potential issue with the code being used for another language by those who determine such things, but that could probably be solved easily. DCDuring (talk) 13:35, 28 February 2023 (UTC)
- The code is an official ISO 639-3 code, not a local Wiktionary code, so it's not going to be used for another language. —Mahāgaja · talk 14:02, 28 February 2023 (UTC)
- For my part, I have basically the same opinion as Mahagaja, that this seems unnecessary — it seems like the lect / words can be (sometimes already are) handled under existing codes/L2s — but I'm not going to stand in the way of the numerous people above who support doing it (which is why I didn't bother saying anything before). It's a finite corpus, so it's not the sort of open-ended issue that splitting two questionably distinct living languages would be, and as long as we define a clear cutoff date / criterion as to when something is ghc vs ga/gd, there shouldn't be any more problems with deciding which language a word/text belongs to than we already face with other languages (e.g. if the exact date of a text is uncertain and borderline between Middle and modern English). And I hear the point that we already split a lot of Slavic stages. So, whatever. - -sche (discuss) 00:25, 1 March 2023 (UTC)
- I’ll just reiterate that this is a standardized literary language and we do have a kind of standardized vocabulary too (tracts on declensions and irregular verbs listing lots of paradigms). The morphology is also drastically different from modern Sc. Gaelic and significantly different from any modern Irish dialect or standard. So it’s not an attempt at creating a language stage out of nothingness, the bards of 13th–17th centuries made the whole “defining what is Classical Gaelic” thing for us pretty well already.
- Also, regarding the “this seems unnecessary” concern – as someone who’s read a fair bit of 19th and 20th century Irish literature (and Wiktionary was quite useful with that), I almost never use Wiktionary when reading Classical Gaelic, because it’s pretty much useless to me here. I occasionally look at Old Irish lemmata here when I can guess the Old Irish form, but never at modern Irish entries – despite ga and gd supposedly covering the “early modern” period. And whenever I consider adding something classical to Wiktionary, I typically give up as I have no idea where I should include it and how exactly I should handle it (see my comments above about taoibhréim, also adhaigh and nar).
- I mostly use eDIL (mostly Old and Middle Irish dictionary, and very difficult to use by an amateur), Léamh.org glossary (website dedicated to Early Mod.Ir., with word-lists scraped from EMI texts editions, also not the greatest “dictionary”), and if those fail – Dinneen’s dictionary (pre-reform, early 20th c., modern Irish dictionary that has a lot of vocab used in poetry). But I think Wiktionary could become the most useful dictionary for Classical Gaelic, if it dealt with it in its own right.
- I guess a specific clear policy on handling Classical Gaelic and earlier parts of Early Modern (13th–15th c., I’d imagine) stuff instead of treating it as a separate language could work too (but then, since the language was already described by its main users in treatises and already has an official ISO code… why not just treat it as such?) // Silmeth @talk 10:34, 1 March 2023 (UTC)
- I've stayed out of this so far, but I have two opinions on the subject.
- 1. We know that there were differences between Irish and Scottish as far back as the Book of Deer. I would have thought that Classical Gaelic, describing the uniting literary register which held the Irish and Scottish dialects together until the Gaelic Academic establishment fell apart over the 17C, would be a good catchall for that 1200-c.1650 period where there aren't clear signs of regionalisation. It would also tend to emphasise how extended the Modern Irish period is. By labelling everything after 1200 ga, you're mixing words which can't have been coined or borrowed before the 20th century with words which can't be attested after the 15th century. That seems to me to be a bit more extreme than "archaic". Maybe my expectations are wrong.
- 2. To a first degree of approximation, ghc would seem to be as good a code as we've got for Early Modern Irish, so maybe it would make sense to have pgl → sga → mga → ghc → {ga, gd, gv}.
- But then, what would I know. Catsidhe (verba, facta) 10:43, 1 March 2023 (UTC)
- @Catsidhe: I suggested keeping ghc as the leaf node, and not the ancestor of the other languages, as:
- a) I’m not sure how Manx fits into this (I know Christopher Lewin often cites Classical Gaelic forms when discussing Manx etymologies, but he’s not stating directly they derive from Classical Gaelic forms, I take this practice rather as providing a “Pan-Gaelic” cognate form, to make it easier to connect it to the other Goidelic langs; also AFAIK the Classical G. literary tradition has never existed in Man),
- b) Classical Gaelic mostly is an artificial prescriptive literary tradition and there are developments happening in the background, so not all modern Irish forms derive from something approved as Classical Gaelic or attested in the early modern period at all, Scottish Gaelic often keeping its own features dating to before Classical: hiatus in motha, the p in piuthar, etc.)…
- but I agree that at least pgl → sga → mga → {(ghc → {ga, gd}), gv} in general could work and be helpful in etymologies. // Silmeth @talk 11:04, 1 March 2023 (UTC)
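For readers comparing the three tree shapes proposed in this thread, the following is a minimal sketch, not how Wiktionary's language data is actually stored. It assumes each proposal can be written as a child-to-parent map; the names `LEAF`, `ANCESTOR`, `MIXED` and `ancestors` are hypothetical and exist only for this illustration.

```python
# Hypothetical child -> parent maps for the three trees proposed above.
# LEAF:     pgl -> sga -> mga -> {ghc, ga, gd, gv}
# ANCESTOR: pgl -> sga -> mga -> ghc -> {ga, gd, gv}
# MIXED:    pgl -> sga -> mga -> {(ghc -> {ga, gd}), gv}
LEAF = {"sga": "pgl", "mga": "sga", "ghc": "mga", "ga": "mga", "gd": "mga", "gv": "mga"}
ANCESTOR = {"sga": "pgl", "mga": "sga", "ghc": "mga", "ga": "ghc", "gd": "ghc", "gv": "ghc"}
MIXED = {"sga": "pgl", "mga": "sga", "ghc": "mga", "ga": "ghc", "gd": "ghc", "gv": "mga"}


def ancestors(code, parents):
    """Return the ancestor chain of `code`, nearest ancestor first."""
    chain = []
    while code in parents:
        code = parents[code]
        chain.append(code)
    return chain


if __name__ == "__main__":
    # Show which modern languages would list ghc as an ancestor under each proposal.
    for name, tree in (("leaf", LEAF), ("ancestor", ANCESTOR), ("mixed", MIXED)):
        with_ghc = [c for c in ("ga", "gd", "gv") if "ghc" in ancestors(c, tree)]
        print(name, with_ghc)
```

The practical difference between the proposals is only which modern codes end up with ghc in their ancestor chain, which in turn decides where a Classical Gaelic form could appear as an inherited ancestor in etymology sections.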
duck decoy and decoy-duck
Probably some crossover here. And alternative forms like duck-decoy and decoy duck to be made. GreyishWorm (talk) 23:01, 7 October 2022 (UTC)
Split into black book GreyishWorm (talk) 22:32, 9 October 2022 (UTC)
It would seem that have no chill is the proper lemma for this verb. (Probably not possible to instead treat no chill as a noun phrase because it's very awkward to define other than as the lack of the definition for chill, which would be SOP.) 86.144.233.189 13:50, 12 October 2022 (UTC)
- Not always "have". "I got no chill" is also heard, for instance. Equinox ◑ 19:08, 17 October 2022 (UTC)
Are these the same? The Kiowa language does not appear to be related to Shoshone, nor does the Wikipedia article on the Kiowa people claim that they are from North Platte, Nebraska. I have a hunch that Kioway is an alt form of something, and this seems like the most obvious answer, but someone should check and try to make sense of this before merging them. 98.170.164.88 07:56, 14 October 2022 (UTC)
- Ah, it's likely that Webster was referring to the North Platte River and not the specific city of North Platte, NE (which is where our entry Kioway currently links). But the rest of the confusion remains. 98.170.164.88 08:00, 14 October 2022 (UTC)
- Shoshonean is an old term for the northern part of the Uto-Aztecan languages, from Shoshone. Many of the names for the Numic languages are only loosely correlated with linguistic reality, so terms like "Shoshone", "Paiute" and "Ute" are kind of hard to pin down without qualifiers. There is a Shoshoni language, but peoples like the Timbisha and the Bannock are also called Shoshone.
- Kiowa is part of the Tanoan languages, which may very well be related to Uto-Aztecan as the Aztec–Tanoan languages, but linguists have yet to completely connect the dots. It was speculative in 1913, and it's still not definitively established in 2022. It reminds me of the Achilles and the Tortoise paradox.
- It's all part of the confusion that results from early efforts to classify wide-ranging nomadic peoples who have moved into different regions and adopted different cultural patterns and lifestyles. Just as the Comanche were Great Basin Numic people who moved to the Great Plains and adopted a Plains Indian culture, the Kiowa also moved from the pueblos into the Great Plains and adopted a similar culture.
- I would make Kioway a simple alternative form of Kiowa and lose the Interesting Facts™ in the definition. Chuck Entz (talk) 16:17, 14 October 2022 (UTC)
Distinguish slang terms from terms with slang senses
I think it is worth splitting cat:Terms with slang senses by language (and subcategories) out of cat:Slang by language (and subcategories) similarly to how we have cat:Terms with dated senses by language distinct from cat:Dated terms by language and cat:Terms with uncommon senses by language distinct from cat:Uncommon terms by language. I am developing a crossword game that uses Wiktionary data (which I do not wish to link as it is associated with my real-life identity), and it would be useful to me if the categories made this distinction for English in particular. - excarnateSojourner (talk | contrib) 02:47, 25 October 2022 (UTC)
- Examples of English terms that have non-slang primary senses, but are currently in cat:English slang because of less common slang senses: aardvark, absolute zero, AC/DC, acid. - excarnateSojourner (talk | contrib) 02:52, 25 October 2022 (UTC)
2022 — November
Robbie SWE pointed out in 2016 that the correct initialism is ABCA (see Wikipedia). But now that page already exists with the relevant sense (among others), so I do not know what should be done, if anything. - excarnateSojourner (talk | contrib) 00:09, 8 November 2022 (UTC)
- Done: I decided to mark the sense as obsolete and give the current name of the association (but not move or redirect anything), since the term still has meaning and expands to a different name than the modern initialism. — excarnateSojourner (talk · contrib) 22:59, 25 February 2023 (UTC)
The definitions we give for all three terms are essentially identical, but the forms differ because they are borrowed from different Chinese lects (Mandarin, Cantonese, and Taishanese itself, respectively). Should these use {{alt form}} or {{syn of}}? 98.170.164.88 23:08, 14 November 2022 (UTC)
- Oof, yes; as it stands, the entries make it seem like these refer (respectively) to the inhabitants of three different places. - -sche (discuss) 02:15, 15 November 2022 (UTC)
- Merged into the first form which, per ngrams, is the most common. (For the place rather than the -ese, Taishan is particularly lopsidedly more common than the alternatives.) - -sche (discuss) 07:26, 6 March 2023 (UTC)
These two are essentially the same phrase sharing the same meaning, with the more common 食花生 being derived from the other. – Wpi31 (talk) 12:00, 18 November 2022 (UTC)
Requesting to move snowsquall to a space-separated form snow squall. The unspaced form doesn't appear to have been used purposefully or frequently, if at all, in the past or present. It also does not appear to be used by either the US-American NWS or the Canadian MSC, and hasn't appeared in any online news coverage. Bailmoney27 (talk) 19:14, 19 November 2022 (UTC)
- I support this. I've never seen the bunched spelling before and I've been following winter weather for many years. It does seem to be in use, but distinctly less common. Wikipedia's favoring of the bunched spelling seems to be largely a matter of the article having been created early in Wikipedia's lifetime, and with a radar scan from 2004 featuring that bunched spelling. Essentially, we had a model to follow and we stuck to it, but it happens that most people, including the national weather services of both the US and Canada, prefer the two-word form. —Soap— 21:22, 10 December 2022 (UTC)
- Support. Binarystep (talk) 06:55, 21 December 2022 (UTC)
Well, I decided to just move the page myself, as it's been up here unopposed for six months, and because I want to fill in the usual "see also" hatnote which would require that both spellings exist. Since this would make a non-admin move impossible, I moved the page before I put in the hatnote. —Soap— 11:19, 13 April 2023 (UTC)
(And its sister categories in other languages.) This is currently a subcat of English terms by orthographic property, but this is not an orthographic property. I suggest moving it, but I don't know whither.—msh210℠ (talk) 11:30, 21 November 2022 (UTC)
2022 — December
These are just alternative case forms, but they have slightly different glosses and large translation tables on both pages. —Al-Muqanna المقنع (talk) 21:10, 2 December 2022 (UTC)
Etymologies 1 and 2 (including translations) should be merged. J3133 (talk) 05:02, 10 December 2022 (UTC)
- Not sure this is a good idea. Etymology 1 is directly imitative; etymology 2 is from the French. — Sgconlaw (talk) 18:55, 5 January 2023 (UTC)
- @Sgconlaw: I meant that “Etymology 1” and “Etymology 2” (but not “Etymology 3”) in one entry should be merged with the respective etymologies in the other entry. J3133 (talk) 19:02, 5 January 2023 (UTC)
- @J3133: mmm, I'm seeing only one etymology section in ha ha, and only two in ha-ha (for English, that is; all the other language sections have only one etymology section as well). — Sgconlaw (talk) 19:14, 5 January 2023 (UTC)
- @Sgconlaw: Sorry, I meant haha (which has three etymologies), not ha ha—fixed. J3133 (talk) 19:17, 5 January 2023 (UTC)
- @J3133: ah, ha ha! I take it you mean that etymology 1 in haha and ha-ha duplicate each other, so one entry should be made the main lemma and the other converted to an alternative form; and likewise for etymology 2 in those entries. — Sgconlaw (talk) 19:19, 5 January 2023 (UTC)
- Yes, that’s correct. J3133 (talk) 19:29, 5 January 2023 (UTC)
Shouldn't this be in the Reconstruction namespace? Tbh I'm not sure why we need an entry for this at all, even granting that the suffixes -iġ and -eġ are alternative forms of each other. If this specific non-attested form were mentioned in secondary literature then I could see a case for it, but I can't find anything. To be generous, it's plausible that a version with an /e/ vowel existed in Anglo-Saxon speech, if the versions of the suffix were interchangeable. For now at least, I'll just leave this at RFM, but feel free to send this to RFD if desired. 98.170.164.88 10:49, 11 December 2022 (UTC)
Shouldn't this be in the Reconstruction namespace? Special:PrefixIndex/Reconstruction:Old Persian already includes plenty of other entries for names not directly attested in Old Persian sources, but found in Greek, Elamite, Semitic, etc. 98.170.164.88 11:04, 11 December 2022 (UTC)
- @Skiulinamo. Seems like IP has a point, but I don't know enough about the topic. Thadh (talk) 12:02, 19 December 2022 (UTC)
Currently this redirects to arsed and, further to the discussion in the Tea Room, I propose that we undo the redirect. After all we aren't currently redirecting can't be fucked or can't be bothered. It seems better to have stub entries for all synonyms of can't be bothered listing them as alternative forms only, with all the synonyms and translations listed on the same page. Though I'm not suggesting creating be arsed and be fucked, we should probably keep be bothered as a translation hub and for the purpose of distinguishing it from the rare word bebothered as we currently do. --Overlordnat1 (talk) 01:49, 14 December 2022 (UTC)
- I agree. I hate these redirects to single words - they rarely make sense without the rest of the term, and they're unintuitive even for experienced users. Theknightwho (talk) 23:39, 4 January 2023 (UTC)
- Agree — Saltmarsh🢃 08:10, 5 January 2023 (UTC)
- Support undoing the redirect. lattermint (talk) 21:56, 15 August 2023 (UTC)
- I'd be happy to redirect can't be bothered to an appropriate sense of bother#Verb.
- I don't know whether there are any other uses of fuck to mean "bother", nor of arse with that meaning.
- Why wouldn't we RfV arse#Verb "To make, to bother" if the redirect doesn't seem right? If virtually the only usage with the "bother" sense is can't be arsed there is no reason for this not to be a lemma. DCDuring (talk) 23:44, 15 August 2023 (UTC)
- @DCDuring It is possible to use it separately, but it's not common, and I strongly suspect it's a back-formation(?) from can't be arsed. For example, "can you be arsed with this? Me neither." You can do the same thing with fuck, too. Theknightwho (talk) 12:54, 16 August 2023 (UTC)
The capitalization of these entries is inconsistent, even though they are all coordinate terms for different views on the same issue. Note that Miaphysite and dyophysite don't (currently) exist, while both capitalizations of monophysite do. Also, some of these have adjective senses and some don't. Not technically a request for a move, merger, or split, but it's a similar issue to what often comes up here, so this seemed like a fitting venue. 70.172.194.25 11:37, 19 December 2022 (UTC)
- I agree, they should have the same capitalisation for the main lemmas, and lower-case makes most sense IMO. —Al-Muqanna المقنع (talk) 03:48, 27 February 2023 (UTC)
It seems the two terms are sometimes (erroneously?) used interchangeably. But maybe not. Flackofnubs (talk) 10:39, 25 December 2022 (UTC)
Each of these entries contain overlapping definitions (and similar etymologies), and they should probably be merged into a single entry as they seem to be alternative spellings of the same term. OED2 has sirkar and circar. Einstein2 (talk) 14:20, 26 December 2022 (UTC)
Both etymologies give "stupid person" as a definition. Perhaps some etymology can be merged, or moved to Dumbo, or just mentioned at Wikipedia. Flackofnubs (talk) 22:06, 27 December 2022 (UTC)
2023 — January
The entry one fell swoop is lemmatized at the noun phrase. one foul swoop redirects to that. Meanwhile, the prepositional phrase in one foul swoop has its own separate entry. I think the latter should drop the "in" for consistency. Perhaps it could even be given as an {{alt form}} or {{syn of}} the main entry, but I'm not sure. 70.172.194.25 08:22, 5 January 2023 (UTC)
- IMHO, at the very least, one foul swoop needs explanation and therefore needs a full entry. Also, it has a distinct pronunciation and fell and foul are not close cognates, so they don't seem to be alternative forms of one another. One foul swoop seems to refer to (be derived from) one fell swoop. If one foul swoop gets the main entry I think it deserves, then in one foul swoop should redirect thereto. DCDuring (talk) 16:59, 5 January 2023 (UTC)
- I agree that one foul swoop deserves a separate entry and that in one foul swoop should redirect thither. —Mahāgaja · talk 11:31, 6 January 2023 (UTC)
- Agreed. As in one fell swoop redirects to one fell swoop, redirecting in one foul swoop to one foul swoop would seemingly be the only logical and consistent course of action. --Overlordnat1 (talk) 11:43, 6 January 2023 (UTC)
parlez vous, parleyvoo, and parley-vous are all treated as separate words
parlez vous, parleyvoo, and parley-vous, whilst having the exact same meanings and roughly the same pronunciation, all have their own pages and the others are listed as synonyms. Two have the meaning of “a Frenchman”, one has “the French language” and all of them have “to speak a foreign language, especially French”. Are these all not the same word, with differing spellings? -CanadianRosbif (talk) 10:37, 7 January 2023 (UTC)
- We should probably merge them into parlez vous but list the other two spellings as alternative forms. There is also the song 'Mademoiselle from Armentieres' aka 'Hinky Dinky Parley Voo'[17] which has the form parley voo, the spaced version of parleyvoo, though I don't think this bawdy WW1 song would be a good example to include in our entry, fun though it is, as it's not clear what the final refrain of parley voo at the end of each line is actually supposed to mean. There is also a version that appears in the final credits of Peter Jackson's film They Shall Not Grow Old which can be found on YouTube and which is where I first came across the song. --Overlordnat1 (talk) 02:21, 8 January 2023 (UTC)
- It's lemmatised as parleyvoo in the OED, and this is indeed how their most recent citations spell it (well, with or without a hyphen). Ƿidsiþ 10:59, 17 April 2023 (UTC)
The senses and translations should probably be listed under one page, with the other being listed as an alternative form (adjective sense only for fucked up). I'd personally prefer the more common one be listed as the main lemma, but I'm open to suggestions otherwise. AG202 (talk) 11:29, 9 January 2023 (UTC)
- Yeah, they're clearly the same expression in two different spellings. The hyphen doesn't even change the meaning since it's not an attributive noun; I think this is just a matter of people's spelling preferences. I made the hyphenated spelling an alternate of the spaced spelling and will merge the translations later. —Soap— 07:53, 12 July 2023 (UTC)
Rename Pomeranian to Proto-Pomeranian
"Pomeranian" is essentially a term for the family consisting of the Kashubian and Slovincian languages. In fact, Pomeranian (or its Polish counterpart, język pomorski) has always been used as a synonym of Kashubian (Polish język kaszubski), as witnessed by the 1893 dictionary called Słownik języka pomorskiego czyli kaszubskiego ("Dictionary of the Pomeranian a.k.a. Kashubian language").
There are no written records of an ancestor of both Kashubian and Slovincian, and any attestation of a Pomeranian lect will automatically fall into either of the two categories according to our current handling. However, the two languages do share an ancestor, and this ancestor did influence other languages. As such, it seems only logical to set this language as a proto-language, and rename it to Proto-Pomeranian in reference to its being the unattested ancestor of a language family, rather than being an attested language.
Pinging @Sławobóg, Vininn126, Gnosandes. Thadh (talk) 13:55, 9 January 2023 (UTC)
- Support. Vininn126 (talk) 13:59, 9 January 2023 (UTC)
- Support. Sławobóg (talk) 14:10, 9 January 2023 (UTC)
- Support. // Silmeth @talk 15:09, 9 January 2023 (UTC)
- Oppose. Gnosandes ✿ (talk) 16:47, 9 January 2023 (UTC)
- Support. —Mahāgaja · talk 08:24, 11 January 2023 (UTC)
- Support. — Fenakhay (حيطي · مساهماتي) 11:56, 11 January 2023 (UTC)
- Support I have looked into usage and it appears that this ancestor (not only in English, but also Russian and other current tongues of linguistic science) can only be called “Pomeranian” in the same way as Proto-Slavic can be called “Slavic” language, and in the same way as “Slavic” is any Slavic language, or “Turkic” any Turkic language etc., “Pomeranian” is any language of the said group. Fay Freak (talk) 12:35, 11 January 2023 (UTC)
- Done. @Vininn126, Thadh, Sławobóg, Silmethule, Gnosandes, Mahagaja, Fenakhay, Fay Freak Currently there is no family corresponding to Proto-Pomeranian and the code for this language is 'zlw-pom', which is exceptional in lacking the '-pro' suffix normally given to proto-languages. I'm thinking we should add a 'Pomeranian' family with code 'zlw-pom' and give Proto-Pomeranian the code 'zlw-pom-pro'. Thoughts? Benwing2 (talk) 07:25, 28 February 2023 (UTC)
- Yes, absolutely. zlw-pom should be a family code and zlw-pom-pro the protolanguage code. —Mahāgaja · talk 08:13, 28 February 2023 (UTC)
- @ZomBear, Mahagaja Can you take a look at Reconstruction:Proto-Slavic/žarъ? @ZomBear listed a non-reconstructed descendant for Proto-Pomeranian and I'm not quite sure how to fix this as I'm not sure where that term comes from. Benwing2 (talk) 23:18, 28 February 2023 (UTC)
- @Benwing2 error corrected by adding * (in *žarъ). Pomeranian term I met here — Martynaŭ, V. U., editor (1985), “жар”, in Этымалагічны слоўнік беларускай мовы [Etymological Dictionary of the Belarusian Language] (in Belarusian), volumes 3 (га! – інчэ́), Minsk: Navuka i technika, page 210 -- ZomBear (talk) 03:16, 1 March 2023 (UTC)
- BTW I renamed the codes. Benwing2 (talk) 23:24, 28 February 2023 (UTC)
- Obviously, there is no “Proto-Pomeranian”. There are no reconstructions, no comparisons of paradigmatic morphology, and so on. Gnosandes ❀ (talk) 08:41, 1 March 2023 (UTC)
- The little reality of the field of study does not equal the irreality of the language itself. Subbranch proto forms are often left unreconstructed for being too small a fish when a larger one is available. Fay Freak (talk) 22:28, 4 March 2023 (UTC)
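The code convention agreed above (a family code such as 'zlw-pom', with its proto-language taking the same code plus a '-pro' suffix) can be illustrated with a short sketch. This is only an illustration of the naming rule as described in this thread; the helper names `proto_code` and `is_proto_code` are hypothetical and are not part of Wiktionary's modules.

```python
PROTO_SUFFIX = "-pro"  # suffix normally given to proto-language codes, per the discussion above


def proto_code(family_code: str) -> str:
    """Derive the proto-language code from a family code (hypothetical helper)."""
    return family_code + PROTO_SUFFIX


def is_proto_code(code: str) -> bool:
    """Check whether a code follows the proto-language naming convention."""
    return code.endswith(PROTO_SUFFIX)


if __name__ == "__main__":
    family = "zlw-pom"  # proposed Pomeranian family code
    print(family, "->", proto_code(family))
```

Under this rule the old assignment of 'zlw-pom' to the proto-language itself was the exception, since `is_proto_code("zlw-pom")` is false.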
The correct form is “Etóña”, with an acute diacritic, as written on [Wikipedia]. 100.undentifieduser (talk) 20:05, 18 January 2023 (UTC)
Split Category:European politics into Category:EU politics
Currently, {{lb|de|EU politics}} categorizes as Category:European politics and the lede in Category:European politics says "terms related to politics of the European Union." I don't dispute that this ridiculous misnomer is widespread but we don't do ourselves any favors by leaning into it. I propose that we repurpose Category:European politics and make it the category of all European (i.e. taking place in or relating to the continent of Europe) politics categories and entries, not just those related to the politics of the European Union. Entries and categories pertaining to EU politics should instead be part of Category:EU politics which itself should be a subcategory of Category:European politics. — Fytcha〈 T | L | C 〉 08:22, 19 January 2023 (UTC)
- Support. That makes sense, especially since CAT:Swiss politics is currently a subcategory of CAT:European politics even though Switzerland isn't in the EU yet. —Mahāgaja · talk 08:37, 19 January 2023 (UTC)
- Support. It wouldn't make sense to have Category:en:UK politics moved to Category:EU politics post-Brexocalypse but it would make sense to have both of these as subcategories of Category:European politics. --Overlordnat1 (talk) 10:19, 19 January 2023 (UTC)
- Support. — Fenakhay (حيطي · مساهماتي) 10:24, 19 January 2023 (UTC)
- Support Vininn126 (talk) 10:40, 19 January 2023 (UTC)
- Support J3133 (talk) 12:31, 19 January 2023 (UTC)
- Support although maybe it should be called 'European Union politics' as we tend to avoid abbreviations in category names. Benwing2 (talk) 05:07, 27 January 2023 (UTC)
- Support Prefer Benwing's variant. As a matter of curiosity, would the current unpleasantness in Ukraine belong in the repurposed Category:European politics? DCDuring (talk) 14:34, 27 January 2023 (UTC)
- Support. But the EU politics version. It's consistent with Category:US politics, which does use an abbreviation. Theknightwho (talk) 16:28, 27 January 2023 (UTC)
- @Theknightwho This is true but at the same time we have CAT:New Zealand politics not #CAT:NZ politics. In general I actually think we should replace 'Fooan politics' with 'Politics of Foo'; this is keeping with CAT:History of the United States, CAT:Languages of the United States, CAT:Political subdivisions of the United States, etc. Besides the politics categories, there are no categories that abbreviate US or UK except for a few odd stragglers (e.g. Category:Upper Midwest US English), while there are hundreds of categories that spell out 'United States'. Similarly, we already have CAT:European Union (not #CAT:EU). Benwing2 (talk) 06:50, 28 January 2023 (UTC)
- Fair point. We should probably change "US politics" to "United States politics" and "UK politics" to "United Kingdom politics", in that case. Best to be consistent with country/supranational entity names. Theknightwho (talk) 14:00, 12 July 2023 (UTC)
- Support and prefer Benwing's variant. — excarnateSojourner (talk · contrib) 05:51, 30 January 2023 (UTC)
- Support, and apparently both EU and UK need to be spelt out for consistency, but this is a secondary issue. Fay Freak (talk) 22:26, 4 March 2023 (UTC)
2023 — February
Per citations. I suppose sir is capitalized as a term of address. J3133 (talk) 09:22, 1 February 2023 (UTC)
- The reason it's capitalized in the citations is because it's mentioned as a standalone sentence, in which it happens to be the first word. But yes, the ‘proper’ way would be to capitalize sir/ma'am in the vocative, like mom → Mom. However in practice the distinction is rarely made online, and this is an internet slang term. Maybe list it as an alternative form instead. — 69.121.86.13 19:58, 3 February 2023 (UTC)
Personally, I'm from the US and I've only ever seen/heard "pompom". Ultimateria (talk) 18:26, 1 February 2023 (UTC)
Merge into Reconstruction:Proto-Indo-European/(s)mel-. Most modern sources agree these are part of one and the same root. The only descendant that (traditionally) requires PIE *a is Latin malus, which fits semantically better with the gloss at *(s)mel- anyway. In fact it is unnecessary to reconstruct *a at all, in light of *mo > *ma unrounding in an open syllable with coda resonant (see de Vaan:2011 p. 8: 7.1; p. 360), the same process that resulted in mare (“sea”) < *móri. In any case the reconstruction of the vowel is irrelevant to whether the Latin, Slavic and Germanic words are cognate, despite the last sentence of the Latin etymology 1 described at malus. — 69.121.86.13 19:31, 3 February 2023 (UTC)
No idea why there are both templates existing where the only difference is lower and upper cases on N/n. --Liuxinyu970226 (talk) 08:16, 5 February 2023 (UTC)
- I agree, and have raised this issue before. I think they should be merged. @Erutuon? — Sgconlaw (talk) 05:15, 8 February 2023 (UTC)
- Rename the lowercase one. They have distinct purposes, and are both useful. — excarnateSojourner (talk · contrib) 23:22, 11 May 2023 (UTC)
Church Slavonic and Moravian
Technically Old Church Slavonic and Church Slavonic should be two separate languages (?), but we only have the former, probably because of the small number of editors. These languages are always treated as two different languages in etymology. For now, in etymologies and Proto-Slavic pages (*viňaga), we trick it as Church Slavonic: {{l|cu|асдф}} or Church Slavonic: {{desc|cu|асдф|nolb=1}}. That is not very convenient; we should have a separate etycode for Church Slavonic.
We should also have an etycode for Czech Moravian, which is also pretty often used in Proto-Slavic pages (and many etymological dictionaries). Serbo-Croatian has codes like that (ckm, sh-kaj, sh-tor). Sławobóg (talk) 12:53, 5 February 2023 (UTC)
- @Павло Сарт, Atitarev, Kamen Ugalj, Skiulinamo, Rua, ZomBear, Bezimenen, IYI681, Vininn126 pinging some people that might be interested. Thadh (talk) 13:03, 5 February 2023 (UTC)
- Support @Sławobóg I completely agree with you, we need a separate etymological code for the usual Church Slavonic language. I constantly thought about it, why is it not there.. --ZomBear (talk) 19:32, 5 February 2023 (UTC)
- Support for Church Slavonic Безименен (talk) 13:45, 7 February 2023 (UTC)
- Oppose for Czech Moravian: there would be 20-30 more regional varieties that could spring if one started Balkanizing Slavic languages + I don't want to give food for thought to Z-Russians. There are already talks for forging Novorussian, Transnistrian, or Lipovan Russian in order to justify their expansive aspirations over former Imperial Russian territories. Безименен (talk) 13:45, 7 February 2023 (UTC)
I also propose to do away with similar problems in the tree of Slavic languages once and for all. I suggest:
- South Slavic:
- 1. Add etymological code for Old Serbo-Croatian (zls-osh). With a redirect to modern Serbo-Croatian. Meets regularly in {{R:sla:ESSJa}}.
- 2. Add etymological code for Old Slovene (zls-osl). With a redirect to modern Slovene. Meets regularly in {{R:sla:ESSJa}}.
- 3. Move the Macedonian language to the descendant of Old Church Slavonic, as it was done some time ago with the Bulgarian language.
- 4. Add etymological code for Church Slavonic (cu-chu). Perhaps even with a division into Russian Church Slavonic (cu-rcu), Serbian Church Slavonic (cu-scu) and others, if any.
- West Slavic:
- 1. Add etymological code for Middle Polish (zlw-mpl). With a redirect to modern Polish or (?). @KamiruPL, Vininn126
- 2. Add etymological code for Old Slovak (zlw-osk). With a redirect to modern Slovak. It was high time to do it! Meets regularly in {{R:sla:ESSJa}}. Especially if even Early Modern Czech (cs-ear) was awarded a separate code.
- 3. Possibly add (family code) a Czech–Slovak languages (zlw-csk)?. Just like there are Lechitic (zlw-lch) F.
- 4. It's possible: add etymological code for "Old Sorbian" (see Wendish/Lusatian ?) (zlw-osb)? Perhaps with a redirect to Upper Sorbian or (?).
- East Slavic:
- 1. Rename etymological codes Old Ukrainian (zle-ouk) & Old Belarusian (zle-obe) → Middle Ukrainian (zle-muk) & Middle Belarusian (zle-mbe), respectively. A similar request from another user was about six months ago (Wiktionary:Beer parlour/2022/September#“Old Ruthenian” language). Therefore, with "Old" for those languages, these are "parts" of Old East Slavic until the 14th c. (this is indicated on the en.Wikipedia).
- 2. Probably it is worth removing the Old Novgorod from the descendants of the Old East Slavic. Make it a separate and parallel ancient language in the East Slavic subgroup. --ZomBear (talk) 19:32, 5 February 2023 (UTC)
- 3. Add etymological code for Pannonian Rusyn with a redirect to Rusyn (rue).
- PS: LOL, I'm serious, add an etymological code for "Early Proto-Slavic" (sla-ear) (?) with a redirect to Proto-Balto-Slavic (?). Because Wiktionary "for the standard" uses a rather late version of the Proto-Slavic language. And sometimes in the Etymology section it may be necessary to indicate an earlier form, and the presence of a separate etym-code for "Early PSl." would not be superfluous. --ZomBear (talk) 19:50, 5 February 2023 (UTC)
- I don't think any "Old Sorbian" is attested. Both Upper Sorbian and Lower Sorbian are attested only from the 16th century, and they were already distinct at that point. In theory there could be a code for Proto-Sorbian, but it would have to be a full-fledged protolanguage, not an etymology-only language. —Mahāgaja · talk 20:17, 5 February 2023 (UTC)
- @Mahagaja Yeah, I'm not sure about "Old Sorbian" either. This suggestion is only possible. I relied on the fact that in {{R:sla:ESSJa}} sometimes there are words with abbreviations "ст.-луж."/"др.-серболуж." ("старолужицкий"/"древнесерболужицкий" = translation "Old Sorbian") without specifying where the word belongs - to the Upper or Lower Sorbian language. --ZomBear (talk) 21:09, 5 February 2023 (UTC)
- @ZomBear: I agree with most of your suggestions, except for Old Serbo-Croatian and Old Sorbian. Serbs and Croats never had an organized shared language until the 17th-18th century. One could perhaps talk about an Old Serbo-Croatian stage in the development of the Dinaric Slavic complex, but there never was a common language that could be associated with this period (leaving aside the Bosno-Rascian recension of Church Slavonic or Glagolitic Croatian). The same holds in even greater magnitude for Sorbian. Sorbs may self-identify as one people ethnically, but linguistically their languages are noticeably divergent.
- PS I also don't see much educational value in copying all the distinctions that you can find in ESSJa. Note that it often gives old spellings that precede various spelling reforms, dialectal forms which don't follow any orthographic standard, morphological variants (like diminutive forms, etc.) which don't contribute much additional insight, it provides local colloquial meanings which are clearly recent innovations, etc. I personally prefer a more concise and economical presentation for reconstructed terms rather than having 10-15 dialectal spellings of Serbo-Croatian or those monstrosities that are given as dialectal variants of Polish/Bulgarian/Slovenian by ESSJa. In my opinion, such information should go to the respective page of the daughter language, rather than overblowing the proto-Slavic Descendants section.
- PS2 Early proto-Slavic is a useful designation, however, I don't know exactly where one should draw the border between Early, Middle and Late proto-Slavic and what notation should be applied. Безименен (talk) 13:30, 7 February 2023 (UTC)
- As it stands, Middle Polish is listed as a variant of Modern Polish. We do see some significant phonological changes and a few semantic ones as well, however, it's hard to say whether it should have its own code or not. Even if it did, it would certainly be a redirect to Modern Polish, seeing as it's a period of only about 1250 years (1500-1750). Vininn126 (talk) 13:36, 7 February 2023 (UTC)
- @Vininn126: That's 250 years. —Mahāgaja · talk 15:16, 7 February 2023 (UTC)
- The one and the two are right next to each other.
The prefix is from Glottolog, which is a proper noun. The capital G should be included in the article's name. 18:13, 15 February 2023 (UTC)
- This request was opened by Akalendos. — excarnateSojourner (talk · contrib) 03:53, 27 February 2023 (UTC)
English. As the entry says "capitalization varies". I see no compelling reason that this shouldn't be a noun sense at boot with "always with 'the'", or something of the sort. Chuck Entz (talk) 23:43, 18 February 2023 (UTC)
- (BTW the Saatse Boot is also referred to as "the Boot".) - -sche (discuss) 23:21, 20 February 2023 (UTC)
- The Boot meaning the Saatse Boot should be somewhere uppercase, I think, whether Boot or the Boot or The Boot I'm not entirely sure, because it functions as a proper noun place name. I'm not familiar with how the (b/B)oot meaning Louisiana is used; in the one cite in the entry, or others I can imagine like referring to LA as America's boot, it seems like a metaphorical general sense for something or somewhere boot-shaped. So it may be an RFV question, does Saatse-style use as a proper noun place name exist (for either place ... I can't actually find the Saatse one in books, either, only online). - -sche (discuss) 17:51, 24 April 2023 (UTC)
The forms with noses are pretty dated at this point and not in widespread use. I think it'd be better if the noseless forms were the main entries, perhaps with a note on the older forms indicating that they were used first. Binarystep (talk) 20:24, 25 February 2023 (UTC)
- Support per nom. — excarnateSojourner (talk · contrib) 23:22, 25 February 2023 (UTC)
2023 — March
Polish Silesian and Silesian
@Shumkichi @KamiruPL The Cieszyn Silesia Polish category has many terms that should probably be moved to Silesian proper. Can we figure out which ones we need to fix? Vininn126 (talk) 12:29, 8 March 2023 (UTC)
- Also maybe @Hythonia, @Sławobóg Vininn126 (talk) 12:30, 8 March 2023 (UTC)
- Idk where Silesian proper starts and Silesian Polish ends so I don't think I'll be of much help o_ _ _ _ _ _ _ _ _ _ _ _ O Maybe let's just assume they'd all be used in Silesian anyway, and then we can add Polish headers to the few entries that can be considered dialectal Polish after we find some sources later??? Shumkichi (talk) 13:33, 8 March 2023 (UTC)
- @Vininn126, Shumkichi Not to throw a monkey wrench into this discussion but ... I read the Wikipedia article on Silesian and it seems there's debate over whether it's a separate language as well as a not-yet-established writing system. Given this, I wonder if it wouldn't be better to unify Silesian and Polish similarly to the way that all Chinese lects as well as Serbo-Croatian are unified. The motivation here is practical: it's significantly more difficult to implement and maintain all the infrastructure for two separate L2's vs. one unified L2, and the minority status of Silesian means it's likely to not get much love as a separate L2 (compare the situation with Jeju vs. Korean and Scots vs. English). Benwing2 (talk) 06:19, 16 March 2023 (UTC)
- @Benwing2 I've actually been trying to do some research on this. One problem with that system is the politics involved - there is a considerable Silesian group that considers it separate. I've also been trying to do some research on the pronunciation, but there are some major differences that point to Silesian having come from an older variant of Polish, as opposed to a modern one. And as to the orthography, recently, Ślabikorz śląski was introduced and has been fairly widely adopted, even silling.org has a normalizer - I've included all of this in WT:About Silesian, and I would actually like to go through all the entries and do a major cleanup. I've even been trying to set up other infrastructure. Vininn126 (talk) 09:59, 16 March 2023 (UTC)
- As to the fact of it coming from an older variant - there are significant sound differences, such as maintaining distinctions from previous long vowels, having more of a 7 vowel system like in Italian, and some significant grammatical differences like continuing the old aorist in a past tense system that's completely different. Vininn126 (talk) 10:22, 16 March 2023 (UTC)
- @Vininn126 I think it's a mistake to conflate whether language A and B are different languages with whether they need separate L2's in Wiktionary. IMO the latter question should be determined by what makes for less work and duplication. If the majority of terms in Silesian are the same as in Polish (which I suspect they are), it might make sense to unify them. The current set of lemmas is non-representative in that it mostly covers lemmas that are different in Silesian. Benwing2 (talk) 15:25, 16 March 2023 (UTC)
- @Benwing2 In order to determine that we need more data on that and currently there aren't any major Silesian dictionaries aside from Silling, which is relatively new, and it's currently doing a massive import of words. Currently they are importing a Polish-Silesian dictionary so based on that alone it would suggest a lot of sharing. However further work needs to be done to determine how different they really are. As someone who works with it more, I'd say it's not any more different than some of the differences between other Slavic languages, which are remarkably similar. Vininn126 (talk) 15:34, 16 March 2023 (UTC)
- @Vininn126: Makes sense, thanks. Benwing2 (talk) 15:42, 16 March 2023 (UTC)
- @Benwing2 And I think you didn't understand his point. Silesian is not a dialect of Polish since it doesn't come from modern Polish - they both come from Middle Polish (or you could call it Middle Silesian, it doesn't matter, it's just that Polish's always had more speakers, hence the privileged position of Polish over other dialects). That's why your comparison to Serbo-Croatian makes no sense since S-C. is a single language with most of its officially recognised "varieties" not even being different dialects nor even subdialects but simple local variants with at most a few different words, lol. Silesian and Polish, on the other hand, are full of seemingly small but SYSTEMATIC differences that all add up to them being sufficiently different (more so than e.g. Czech and Slovak, I'd say). And the important thing is that they differ not only in vocabulary but also in syntax.
- "If the majority of terms in Silesian are the same as in Polish (which I suspect they are)" - no, they are not the same, and your suspicion is wrong. It's as if you looked at the spelling of some Kashubian words and compared them to their Polish cognates - yes, their orthographies are quite similar but it's just a superficial similarity. Shumkichi (talk) 20:17, 16 March 2023 (UTC)
- @Shumkichi Don't get all worked up over this. You didn't even read the first line of my comment: "I think it's a mistake to conflate whether language A and B are different languages with whether they need separate L2's in Wiktionary." Benwing2 (talk) 20:33, 16 March 2023 (UTC)
- @Benwing2 I'm not worked up??? And I did read it, that's why I said the orthographies are different, and that's enough NOT to merge Silesian entries with Polish ones. Polish has an official body that regulates its orthography so it can't use two different spelling norms that also differ in pronunciation. Capisci? Shumkichi (talk) 20:55, 16 March 2023 (UTC)
- Also, according to your argument, we should merge Czech and Slovak. But KKK, as they say in Polent. Shumkichi (talk) 20:56, 16 March 2023 (UTC)
- Alright, let's cool it here. It seems like Silesian is here to stay at least for the time being. Vininn126 (talk) 21:17, 16 March 2023 (UTC)
We have two different entries for the same thing, while links generated with {{m}} or {{l}} like *vьśegъda link to the latter (vьsegъda) as they seem to ignore ś in Proto-Slavic reconstructions, which IMO is unexpected. This has left the former (vьśegъda) ignored and forgotten recently. I guess both entries should be merged and the language modules should be tweaked to make Proto-Slavic stuff ś-aware? // Silmeth @talk 12:28, 15 March 2023 (UTC)
- @Silmethule Converting ś to s seems intentional, and asserts that there's no separate ś phoneme in Proto-Slavic. Reconstructing ś seems ahistorical to me; it's rather that the third (and second ...) palatalizations occurred post-Proto-Slavic. Benwing2 (talk) 06:27, 16 March 2023 (UTC)
- @Benwing2: but it has different reflexes in different branches. So, either those palatalizations happened post-Proto-Slavic and *ś is a valid dia-phoneme projected back and reconstructing *s in those places for Proto-Slavic is wrong, or it was an actual Proto-Slavic phoneme with some value separate from both *s and *š that merged with those at a later stage – in which case we’re justified to reconstruct *ś and *s is wrong. In either case, unless we undo all progressive and 2nd regressive palatalizations of *x (and all the other sounds? there are traces of non-palatalization in *otьcь in the east too), we need to treat *ś as a (dia)phoneme of its own and *s is wrong. Also WT:About Proto-Slavic seems to treat *ś as a separate phoneme (and even ascribes a specific IPA value to it). // Silmeth @talk 10:00, 16 March 2023 (UTC)
- @Silmethule What do the primary sources say? Benwing2 (talk) 15:10, 16 March 2023 (UTC)
- @Benwing2: What primary sources? Proto-Slavic is a reconstructed, not directly attested, language.
- If you mean etymological dictionaries and historical linguistic papers – depends, you get all sorts of things (*vьšь in Polish dictionaries, *vьsь in some southern ones, non-palatalized *vьxъ in Vasmer, etc.) – although in general progressive and 2nd regressive palatalizations are commonly marked. But *x is problematic as it has different reflexes in the west vs south+east; hence Derksen’s notation with *ś, as he puts it: “The introduction of *ś, on the other hand, could not be avoided, cf. *vьśь ‘all’ vs. *vьsь ‘village’” // Silmeth @talk 17:04, 16 March 2023 (UTC)
- @Silmethule We need some other people to weigh in. The current situation with no ś was done intentionally so we shouldn't change it willy-nilly. Benwing2 (talk) 20:35, 16 March 2023 (UTC)
- OK. I’ll leave some pings then: @Fay Freak, Ivan Štambuk, Sławobóg, Thadh, Useigor, Vorziblix, ZomBear. // Silmeth @talk 20:54, 16 March 2023 (UTC)
- I already agreed that we should use ś. Third palatalisation is only absent in Old Novgorodian and most of our entries already do apply the sound law to stops, so I don't see why we should treat the sibilant any differently. Thadh (talk) 22:23, 16 March 2023 (UTC)
- I agree that *ś should be used. Make the main reconstruction - *vьśegъda, and the form *vьsegъda (maybe?) as a redirect. ZomBear (talk) 06:43, 17 March 2023 (UTC)
- Keep Reconstruction:Proto-Slavic/vьśegъda and delete Reconstruction:Proto-Slavic/vьsegъda. Old Polish wszegdy proves it can't have been ×vьsegъda. —Mahāgaja · talk 20:17, 27 June 2023 (UTC)
Renaming Proto-Mon-Khmer to Proto-Austroasiatic
Proto-Mon-Khmer is deprecated. The name of Category:Proto-Mon-Khmer language needs to be changed to Category:Proto-Austroasiatic language, just like how we have Category:Proto-Sino-Tibetan language rather than Category:Proto-Tibeto-Burman language. See the Wikipedia article on Austroasiatic languages to get an idea of why Mon-Khmer is no longer valid, because Munda and Nicobarese are simply regular branches that are sisters of the other so-called Mon-Khmer languages.
The page names can simply be renamed, and the lemmas do not need to be changed. Category:Proto-Sino-Tibetan language is a perfect example of this. The Proto-Sino-Tibetan lemmas are actually all Proto-Tibeto-Burman reconstructed forms by James A. Matisoff, who considers Tibeto-Burman to be a branch of Sino-Tibetan. Now, more scholars are thinking that Chinese is simply another regular sister branch of the various Sino-Tibetan languages out there, rather than its own special branch. Same goes for Mon-Khmer.
So how can this name change be done? Ngôn Ngữ Học (talk) 22:23, 18 March 2023 (UTC)
Formerly:
- Austroasiatic
- Munda
- Mon-Khmer (which Shorto reconstructed)
- (about a dozen branches)
Now the consensus is that the tree has a rake-like structure (per Sidwell):
- Austroasiatic
- (about a dozen branches including Munda)
That's why Mon-Khmer is an obsolete term now.
Similarly, with Sino-Tibetan, it formerly was:
- Sino-Tibetan
- Chinese
- Tibeto-Burman (which Matisoff reconstructed)
- (dozens of branches)
Now the consensus among many scholars is that the tree has a rake-like structure with many "fallen leaves" (quoting George van Driem), making Tibeto-Burman obsolete:
- Sino-Tibetan
- (dozens of branches including Chinese)
Ngôn Ngữ Học (talk) 22:27, 18 March 2023 (UTC)
- Support. If this change happens we should delete Category:Mon-Khmer languages. Benwing2 (talk) 23:41, 18 March 2023 (UTC)
- Abstain. I prefer to wait for when an actual new reconstruction of Proto-Austroasiatic is published to do the move, see what I wrote at Wiktionary:About Proto-Mon-Khmer, but I do not actually oppose moving now. However, if the move does happen, I would like to see a line like "This reconstruction is from Shorto (2006) for the obsolete concept of Proto-Mon-Khmer, and should not be treated as actual reconstruction of Proto-Austroasiatic, which as of now has not yet fully materialized, and is simply "placeholder" for the actual Austroasiatic etymologies" (probably as a template) to be added as warning for every reconstruction item. I very much want the same thing to happen to "Proto-Sino-Tibetan", considering a lot of them are nowhere near actual Proto-Sino-Tibetan, and the reconstruction items themselves are "icky" to say the least. PhanAnh123 (talk) 01:52, 19 March 2023 (UTC)
- @PhanAnh123: Take a look at Sidwell's Proto-Austroasiatic reconstruction and Shorto's Proto-Mon-Khmer reconstruction. Sidwell's inclusion of Munda and Nicobarese had virtually no impact on his Proto-Austroasiatic reconstruction (versus if he had only included the "Mon-Khmer" languages) because he considered Munda to be highly innovative and restructured, with few original retentions from Proto-Austroasiatic. Furthermore, it would be very confusing to have duplicates for both Proto-Austroasiatic and Proto-Mon-Khmer. I would just merge them as Proto-Austroasiatic. Ngôn Ngữ Học (talk) 19:25, 19 March 2023 (UTC)
- I have no intention to keep Proto-Austroasiatic and Proto-Mon-Khmer separated (I consider Proto-Mon-Khmer to be likely a ghost after all), what I mean is that we either should keep the entries as they are until an actual Proto-Austroasiatic reconstruction comes about, or move the "Proto-Mon-Khmer" items to Proto-Austroasiatic but with the warning added. I know what you mean by "inclusion of Munda and Nicobarese had virtually no impact", because like Sidwell, I do think these branches are quite innovative, however, that does not mean I agree to move Shorto's Proto-Mon-Khmer reconstruction to Proto-Austroasiatic without any warning, since Austroasiatic linguistics has progressed quite a lot even outside of those two branches. The vocalism in Shorto (2006) was reconstructed in a very rudimentary way, which the reconstruction of the descendant branches as well as the recent "sneak peek" of the Proto-Austroasiatic reconstruction by Sidwell improved upon; furthermore, the syllable structure itself is also slightly changed, it is now thought that a glottal stop was phonetically present in any Proto-Austroasiatic word that ended in a pure vowel (meaning any word ending in *aːj would still have *aːj, but those ending in **aː would automatically become *aːʔ), plus there is the status of *ʄ- that very much awaits assessment in the actual reconstruction of Proto-Austroasiatic. Like I said, I don't oppose moving, but there must be strings attached. PhanAnh123 (talk) 01:53, 20 March 2023 (UTC)
- @PhanAnh123, Ngôn Ngữ Học Such a warning can be added by bot to the top of all entries if both of you agree. Benwing2 (talk) 03:30, 20 March 2023 (UTC)
- @Benwing2: Agree, a warning placed by a bot should be sufficient. Also @PhanAnh123, we can use Sidwell & Rau (2015) for some of the basic Swadesh list words, but a full reconstruction of Proto-Austroasiatic is currently being done by Sidwell. It should come out in a few years. Ngôn Ngữ Học (talk) 10:19, 20 March 2023 (UTC)
- We are all in agreement then, so obviously now I support moving. With this Munda cognates can be directly added to the entries. PhanAnh123 (talk) 10:29, 20 March 2023 (UTC)
- Agree on the support.
- <s>Abstain</s> Support. I've seen assertions that Mon and Khmer actually form a subgroup within the traditional Mon-Khmer grouping. Of course, it could be something messy as with Indo-European, where we have at least Indo-Iranian and Balto-Slavonic. --RichardW57m (talk) 16:19, 21 March 2023 (UTC)
- There is no such thing as a Mon+Khmer grouping within Mon-Khmer. Some classifications propose Eastern, Southern, and Northern groupings within Mon-Khmer, but none of them put Monic and Khmeric together. Please consult the Austroasiatic languages article on Wikipedia to get a basic refresher of all the major previous classifications. Ngôn Ngữ Học (talk) 15:04, 23 March 2023 (UTC)
- The cited articles do show that their crown group is larger than Monic + Khmeric, but it does look as though we don't need to worry about anyone using 'Mon-Khmer' to denote their (weak) association. --RichardW57m (talk) 11:36, 28 March 2023 (UTC)
Renaming Proto-Hmong to Proto-Hmongic
- Category:Proto-Hmong language needs to be changed to Category:Proto-Hmongic language. See Hmongic languages and Hmong language on Wikipedia.
- Category:Proto-Mien language needs to be changed to Category:Proto-Mienic language. See Mienic languages and Iu Mien language on Wikipedia.
The Hmong-Mien language tree is like this:
- Hmong-Mien
  - Hmongic
    - Hmong
    - (dozens of languages)
  - Mienic
    - Iu Mien
    - (several languages)
Proto-Hmong thus refers only to Hmong, not Hmongic. There are dozens of Hmongic languages that are not Hmong. They include Hmu, Pa Hng, Bunu, She, and others.
Same goes for Proto-Mienic. Proto-Mien technically refers to Proto-Iu Mien, but does not include Kim Mun, Biao Min, and Dzao Min.
Ngôn Ngữ Học (talk) 22:23, 18 March 2023 (UTC)
- Support. If we make this change we also need to rename the families, i.e. Category:Hmong languages -> Category:Hmongic languages and Category:Mien languages -> Category:Mienic languages. This is similar to the change from Category:Korean languages -> Category:Koreanic languages, which was implemented in Jan 2022. Benwing2 (talk) 23:45, 18 March 2023 (UTC)
- Support. Theknightwho (talk) 17:57, 1 June 2023 (UTC)
They are defining the same thing, using various grades of nautical jargon Van Man Fan (talk) 10:39, 24 March 2023 (UTC)
- I don't think this would normally be spelled with a hyphen, at least not as a verb. heave-to with a hyphen looks like a noun, probably meaning "the act of heaving to", though as a landlubber I don't know if such a noun exists. —Mahāgaja · talk 10:47, 24 March 2023 (UTC)
I think the two senses should be merged. The distinction we're trying to draw doesn't seem fundamental and isn't borne out by other dictionaries. PUC – 21:49, 25 March 2023 (UTC)
- I agree. The term is neutral/ambiguous about whether or not physical alteration is involved in the reuse for a purpose other than a previous purpose. AHD uses or in its definition to make the ambiguity explicit. Maybe our users need to be informed that the term is used for both sorts of repurposing. DCDuring (talk) 15:44, 26 March 2023 (UTC)
- I agree; merged. - -sche (discuss) 17:40, 24 April 2023 (UTC)
Renaming Wiradhuri (wrh) to Wiradjuri
I think we need to change this one because—so far as I can tell—"Wiradjuri" is the most current and by far most common English spelling for this language since at least the 1980s, as opposed to our current spelling (see "Category:Wiradhuri language"). "Wiradjuri" is also the form used in official signage and communication (see for instance: local shire boundary signage; a city council webpage; a unit from NSW state school curriculum; cultural information from the National Indigenous Australians Agency—a federal government agency). Helrasincke (talk) 03:44, 29 March 2023 (UTC)
- Support. Even on Glottolog, where they use the -dh- form, the -dj- form (and then -dg- forms) is more common in the names of the reference works about it they have catalogued. - -sche (discuss) 20:54, 1 April 2023 (UTC)
2023 — April
kaffir should probs be the main form It is probably (talk) 08:11, 13 April 2023 (UTC)
- You may be right as kaffir seems to be slightly more widely used than kafir, though oddly enough we (and Wikipedia) have an entry for Kafiristan and not Kaffiristan (which is a far more prevalent form on GoogleBooks). Though on a raw Google search 'Kafir' is twice as popular as 'Kaffir' and 'Kafiristan' is a lot more popular than 'Kaffiristan' and there does seem to be a slight tendency of late to differentiate the 2 words so that 'kaffir' is the South African insult and 'kafir' is the Islamic one. --Overlordnat1 (talk) 09:06, 13 April 2023 (UTC)
So from doing a lot of research and hearing testimonies from elders who speak this North African Judeo-Spanish language, I think there should be a separate list and code for Haketia. It has been associated as just a dialect of Ladino but that is not the case. Haketia has consonants and words directly from Arabic that are never used in Ladino as well as an array of different phrases and spellings. It is a separate language. Let me know if this can be done. I have a lot of words, pronunciations and phrases ready for adding to it after it is set up. Shukur/thanks. Nevermiand. (talk) 18:43, 16 April 2023 (UTC)
Looking at google books:"od's niggers" and google books:"odd's niggers", it seems like the O is always (or almost always?) capitalized, as if treated as a name, like Odd, which we currently only have as Norwegian but which is also attestable in English—as a non-God-related given name, I mean. Odd might also be attestable as a minced oath for God, given the variety of other oaths like this I see used or mentioned in old books, including Odd's pittikins, Odd's blood, Odd's hounds, Odd's dickens, Od's fish, Od's heft. For od's bobs the hits are more split, but that entry too should possibly be capitalized. - -sche (discuss) 04:32, 19 April 2023 (UTC)
2023 — May
“akrasia” is currently listed as the alternative spelling of “acrasia”, which contradicts Wikipedia, as well as the fact that “acratic” is (correctly) listed as the alternative of “akratic”. Also, “akrasia” has 4.5× as many Google results as “acrasia” does. (There’s probably a better metric I could cite, but oh well.) IMO we should swap the two and make “akrasia” the main one. —Will • B [talk] 23:30, 5 May 2023 (UTC)
Tagged a long time ago, I'm just bringing it here. I tend to think they should be merged to be up to, as all of the citations include a form of to be. 76.100.240.27 19:54, 8 May 2023 (UTC)
Proto-Witotoan and Proto-Huitoto-Ocaina
Currently, the way they are treated makes no sense: Proto-Huitoto-Ocaina is handled as the parent of Proto-Witotoan and nothing else. In our word list, Proto-Witotoan is used to denote Proto-Bora-Witoto, which is a macroreconstruction that is very speculative. I propose we merge these into one language, Proto-Witotoan, which seems to be the more common term.
Notifying @-sche, Mahagaja. Thadh (talk) 15:32, 9 May 2023 (UTC)
- No objection here. —Mahāgaja · talk 15:36, 9 May 2023 (UTC)
- If I recall and understand correctly (?) the current setup is based on your request here, where I noted the issue of the broader/older grouping having been set in some entries as the child of a smaller/younger grouping. No objection to changing it. I see that a handful of Murui Huitoto entries use Proto-Witotoan in their etymologies, but these should be unaffected by removing Boran from its scope. - -sche (discuss) 16:53, 9 May 2023 (UTC)
- Oof, I didn't even recall that discussion - I guess I was a bit too hasty, sorry. If there are no further objections, I'll just manually remove and/or change the reconstructions from the Murui Huitoto etymologies and fix this. Thadh (talk) 17:05, 9 May 2023 (UTC)
Should be put behind oneself methinks. 76.100.240.27
- Also whip it on someone should be at whip it on and trust someone to should be at trust to. 76.100.240.27
- @76.100.240.27 How is the idiomatic meaning of trust someone to used in a sentence? Would one say, "I trust her to spend all day reading" to mean "It is predictable for her to spend all day reading"? If so that makes our current gloss unsubstitutable. Is it ever used without any words between trust and to? If not I think we should not move it. — excarnateSojourner (talk · contrib) 16:34, 22 May 2023 (UTC)
- @76.100.240.27 I don't think so: "It was a rough time, but I've put all that behind me now." Cambridge Dictionary has, "You need to put it behind you now". — excarnateSojourner (talk · contrib) 16:34, 22 May 2023 (UTC)
- I agree with ExcarnateSojourner about put behind one.
- I agree that trust to is a worse location for the expression than trust someone to, though both are worse than trust + to, IMHO.
- Doesn't whip it on require a person (or personified object) as complement? I suppose we could handle that with a label. Also, it is possible that there might be another meaning involving inanimate objects or other expressions. It would probably then be easier on users to be able to compare meanings. DCDuring (talk) 00:42, 23 May 2023 (UTC)
- I would not move this entry. People say "I put it behind me" and "You need to put it behind you and move on", not *"I put it behind myself" and *"You need to put it behind yourself and move on". —Mahāgaja · talk 06:48, 23 May 2023 (UTC)
Middle Polish (yet again)
I propose we make Middle Polish an Etymology only code with the language code zlw-mpl. Would be very useful for linking and mentioning.
@Sławobóg @ZomBear @Mahagaja @Thadh @KamiruPL Vininn126 (talk) 13:11, 11 May 2023 (UTC)
- I know this is a boring topic that many people find irrelevant, but in what contexts would this code be used? (e.g. are there a lot of Ruthenian borrowings from Middle Polish?) Thadh (talk) 16:03, 11 May 2023 (UTC)
- Or a lot of Middle Polish only inheritances from Old Polish, or Silesian derivations. Vininn126 (talk) 16:13, 11 May 2023 (UTC)
- @Thadh in the Old Ruthenian there are a lot of Polish borrowings, just the period of the 1500-1700s. This is, to some extent, one of its features that alienated the modern Ukrainian and Belarusian languages first from Middle Russian and then from modern Russian. --ZomBear (talk) 16:46, 11 May 2023 (UTC)
- To me, as the editor of the Old Ruthenian entries, this would be helpful. In the Old Ruthenian (zle-ort) language (existed in the period ~ 1387-1798), there are extremely many Polish borrowings. Words borrowed in the 1400s (before 1500) have to be indicated from Old Polish (zlw-opl), everything is fine here. But borrowings in the period of 1500-1700 have to be indicated as borrowed from the modern Polish language. The presence of a separate code for Middle Polish would solve this inaccuracy. --ZomBear (talk) 16:42, 11 May 2023 (UTC)
- @ZomBear @Sławobóg Done, thanks @Theknightwho! Vininn126 (talk) 14:00, 18 May 2023 (UTC)
- @Vininn126 thank you and everyone who contributed to this. I have already created the first one, the Old Ruthenian гартова́ти (hartováti), where it is listed as a borrowing from Middle Polish. ZomBear (talk) 17:55, 18 May 2023 (UTC)
molly-mawk is given as an alternative form of mollemoke, and not of mollymawk. The etymologies given for those two are half-different too, while both mention fulmars. There's probably some obsolete taxonomy in there too, so a taxo-specialist's eyes would be more than welcome. Skisckis (talk) 20:40, 11 May 2023 (UTC)
(Not sure where best to put a suggested subcategory. Here? The BP?)
CAT:en:Heraldry has 600+ entries [update: now less, because I moved some to "Heraldic charges"] and more are on my list to add. We subcategorize "charges" and "tinctures" (I am migrating more entries to those categories as I go through the main category). Should we also put "heraldic attitudes" like at gaze, dormant, urinant in a subcategory? (On the model of {{lb|en|heraldiccharge}}, which displays the same as {{lb|en|heraldry}} but adds the "heraldic charges" subcategory, I'd create a label like {{lb|heraldic attitude}} to display heraldry but effect subcategorization.) The main issue I foresee is that the scope is a little fuzzy: I'd tentatively be inclined to include e.g. affrontant, and probably flotant even though it's an attitude of an object rather than a beast, but what about e.g. palewise (google books:"lions palewise") or flamant, are those attitudes or just descriptions? (Is it unmaintainable to have many fine-grained categories?) - -sche (discuss) 15:50, 12 May 2023 (UTC)
- I share your doubts. If the definition of what the subcategory is supposed to contain is unclear to ordinary editors (especially for a specialized topic like heraldry), you are going to have to constantly maintain the subcategory because most editors will probably just dump entries into the main category. — Sgconlaw (talk) 05:40, 15 May 2023 (UTC)
heraldic complement, plenitude, etc → in complement, etc?
In heraldry, you can bear a sun in (its/his) splendor, suns in glory, peacocks in their pride, pelicans in piety: I created those entries at in (x), because AFAICT that sense of the noun isn't used without in. But I see we have the moon in its/her complement, increment [moved] or plenitude defined at the bare nouns. For complement, I can't find it used without in, e.g. *the moon's complement, so I'm inclined to move it to in complement. I can't find plenitude or detriment without in in heraldry, but I can find google books:"plenitude of the moon", google books:"moon in detriment" outside heraldry: should I still move the heraldic sense? (Alternatively we could just create a redirect from in plenitude and in her plenitude etc to the relevant sense of plenitude to make sure people searching for in... forms got to the right place.) - -sche (discuss) 18:31, 15 May 2023 (UTC)
English. There seems to be some conflation between the two. {{lb|en|China}} categorizes into the former, though people often do mean the latter, which only has 3 entries. For example, typhoon shelter, Hong Kong foot, add oil, and aiya are labelled as both {{lb|en|China}} and {{lb|en|Hong Kong}}.
Also, "Chinese English" technically includes Hong Kong English by the criteria of geography, but linguistically and lexicographically speaking, there is very little influence on HKE from the mainland, which means there are not many instances where we actually need to categorize into both; the existing ones in the category that I'm aware of are (excluding the four already mentioned above) joss stick, Ins, KMT, and ACG. Note that this also causes abominations like the one at ACG, which is meant to include Taiwan as well. (We can ignore Macau for the sake of simplicity, since the English used there is basically a toned down version of formal HKE) – Wpi (talk) 17:29, 30 May 2023 (UTC)
- Off-topic: In my opinion, KMT and joss stick are not regional forms of English; indeed, the latter is currently not labelled as such. (Indeed, 'joss' is not so labelled, though it's not part of my active vocabulary.) --RichardW57m (talk) 09:18, 2 June 2023 (UTC)
2023 — June
"Crossdresser" is a more common term for this now. It also feels weird having this thesaurus entry here. CitationsFreak: Accessed 2023/01/01 (talk) 16:30, 1 June 2023 (UTC)
- Support - transvestite is borderline derogatory these days, so we shouldn't be using it as the generic term. Theknightwho (talk) 17:51, 1 June 2023 (UTC)
- Support - I don't think transvestite is derogatory in the clinical sense (our current sense 2), but it's derogatory to apply it to transgender people and to drag queens. It's thus much more specific than cross-dresser (which bgc/ngrams says is the most common spelling, so if we move it, we should move it to Thesaurus:cross-dresser), a term which can include drag queens/kings. The vast majority of transgender people are of course not cross-dressers since they generally dress in clothes appropriate to their gender identity. If we move it to Thesaurus:cross-dresser, we should add drag queen and drag king to the synonyms, but should probably remove trans and shemale. —Mahāgaja · talk 10:33, 2 June 2023 (UTC)
- Support. Vininn126 (talk) 10:47, 2 June 2023 (UTC)
- Comment - At this time, transgender issues are a hot button political issue in the USA. My biggest work on this area on Wiktionary has been nailing down some senses at transgender. But I would say that a shift in the dictionary made because of a purely political trend could ultimately undermine the legitimacy of the dictionary project in the eyes of the readers, so be cautious. I only say this because of the timing of events: there is a political focus on this area at this time. But I believe a legitimate, academically grounded decision can be made, and it may very well be that crossdresser should be the core word instead of transvestite. I have no further opinion on this, but I just wanted to flag this potential issue. --Geographyinitiative (talk) 11:05, 2 June 2023 (UTC)
English. Needs splitting into vulcanian. Probably some crossover Elevenpluscolors (talk) 08:03, 7 June 2023 (UTC)
- @Elevenpluscolors Per the OED entry, no, it shouldn't be split but geological senses (and probably the cuckold sense too) should be separately listed as appearing with a lower-case initial letter. Whichever is currently the more common form should be the main entry. The other one should still have those senses but use the template for alternative case form of. — LlywelynII 22:04, 9 June 2023 (UTC)
Okinoerabu and Tokunoshima
- Discussion moved from Wiktionary:Beer parlour/2023/June.
These are two Ryukyuan languages that we currently call Oki-No-Erabu and Toku-No-Shima, because that’s how they’re spelled in ISO 639. However, literature invariably uses the unhyphenated forms, and they’re also much easier to read.
Could we please therefore rename them to the unhyphenated forms? Theknightwho (talk) 19:39, 4 June 2023 (UTC)
- I dislike the EN penchant for glomming Japanese names into long undifferentiated strings, as I find that this instead makes them harder to read, and it erases the distinction between the actual component terms.
- In some cases, the resulting interpretation or partial-expansion goes sideways, as we see at w:Tokunoshima, where the English text describes this as "Tokuno Island" -- the no portion is simply the genitive particle の (no), so as Japanese, this is better thought of as "Toku Island".
- The Japanese historical record bears this out, with the first mention in a 699 text as 度感. At the time, this may have been pronounced as something like twokom or dwokom, based on the Middle Chinese readings and known man'yōgana sound values, although some sites render this as toku or doku; it is not clear to me where the ku reading for 感 comes from. At any rate, the no is not part of the base of the name.
- For those interested and who can read Japanese, here are several references at the Kotobank aggregator site. Search the page for 度感.
- See also this entry at Nihon Jiten, which also lists 度感嶋 as an attested spelling with the pronunciation Toku Shima, further evidence that the base name is simply Toku and that the no is the particle.
- That aside, I do see that w:Tokunoshima language lists the alternative rendering "Toku-No-Shima", and the w:Okinoerabu dialect cluster similarly lists the alternative rendering "Oki-no-Erabu". A quick-and-dirty Google hits comparison (including "the" to filter for English hits):
- google:"tokunoshima" "the" -- 129K
- google:"toku-no-shima" "the" -- 55.7K
- google:"toku no shima" "the" -- 4.7K
- google:"okinoerabu" "the" -- 57.3K
- google:"oki-no-erabu" "the" -- 4.2K
- google:"oki no erabu" "the" -- 30K
- In the English-language web, the allthewordsruntogether renderings appear to be most common. Meanwhile, the Language Subtag Registry based on ISO 639 and maintained by IANA (https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) does indeed use the hyphenated descriptors.
- Meh. After digging into this some, I realize I just don't care all that much one way or the other. ‑‑ Eiríkr Útlendi │Tala við mig 22:09, 9 June 2023 (UTC)
- Searching on Google Scholar, it seems the unhyphenated forms are more common, but I concur with Eirikr's views that they look worse.
- However, I would suggest that if we were to retain the hyphens, the two languages should be renamed to "Oki-no-Erabu" and "Toku-no-shima" (or the rarer "Toku-no-Shima"), since these are more common on Google Scholar, and also because "no" is a particle that shouldn't be capitalised in a proper noun, cf. Southend-on-Sea, Stoke-on-Trent or von, de, etc. in surnames. – Wpi (talk) 11:20, 21 June 2023 (UTC)
"Swaziland" categories to "Eswatini"
@LlywelynII wrote at Category talk:en:Swaziland on 20 April 2022, “I don't know who tends to these pages / But we should move this to Eswatini. They formally changed their English name a few years back.” @Paintspot moved Category:en:Swaziland to Category:en:Eswatini on 5 December 2022, also writing in edit summaries, “The Swaziland/Eswatini situation should be dealt with soon.”; “Is it possible to get WingerBot involved in moving all the other Swaziland categories like we did for North Macedonia? We should be consistent with Wikipedia.” Currently, it is in Category:Categories with incorrect name and with a “The automatically-generated contents of this category has errors.” notice, because the category is still indicated as “Swaziland” in our modules. J3133 (talk) 14:13, 9 June 2023 (UTC)
- I'm not sure why I was pinged. I mean... unless they changed their name back, I myself am still in full support of my own idea, sure. xD I can't help with the coding that should obviously be how this is handled though.
- Thanks for getting the ball rolling, though, and showing me where I should've come to make this request. — LlywelynII 21:34, 9 June 2023 (UTC)
- Support renaming the categories and editing the modules to accept the new name. —Mahāgaja · talk 07:35, 10 June 2023 (UTC)
- Support. This should be an easy fix honestly. CC: @Benwing2, @-sche AG202 (talk) 17:35, 10 June 2023 (UTC)
- Are we sure that Eswatini is the most commonly used name in English? I'm reminded of the Ivory Coast vs. Côte d'Ivoire issue (and more recently Czech Republic vs. Czechia), where a similar request was made in 1985 but doesn't seem to have stuck. (I still say "Swaziland" and "Ivory Coast" although I admit I rarely have found it necessary to refer to the former.) Benwing2 (talk) 19:13, 10 June 2023 (UTC)
- It seems to me that category names are among the items where we can enact prescriptive naming. DCDuring (talk) 00:46, 11 June 2023 (UTC)
- I would fix this, but the "edit category data" links no longer point to where this information is stored. Where is the instance of Swaziland that makes auto-cat work stored, the one that needs to be changed? Is it the (redundant-looking?) instances in Module:place/shared-data/tables and Module:place/shared-data? - -sche (discuss) 03:00, 11 June 2023 (UTC)
- @-sche: If you look at Special:WhatLinksHere/Module:place/shared-data and Special:WhatLinksHere/Module:place/shared-data/tables, it should be obvious. Chuck Entz (talk) 03:22, 11 June 2023 (UTC)
- @-sche: I deleted Module:place/shared-data/tables. It was a failed experiment to reduce memory usage that actually ended up increasing it. Module:place/shared-data should be where this info is found. If you make this change we need to move all the relevant categories (top-level and language-specific). Benwing2 (talk) 06:29, 12 June 2023 (UTC)
- Also, I'm not sure whether you can set a country-level alias in Module:place/data; we'd want existing places that reference Swaziland to still work (and categorize into 'Places in Eswatini', 'Cities in Eswatini', etc.). I'll look into this tomorrow and see whether the support is already there and if not, how to implement it. Benwing2 (talk) 06:31, 12 June 2023 (UTC)
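For illustration only, the alias idea raised in the comment above can be sketched as follows. This is a minimal Python sketch, not the actual layout of Module:place/data (which is a Lua module); the names COUNTRY_ALIASES, canonical_country and place_categories are hypothetical and chosen only for this example.

```python
# Hypothetical alias table: old country name -> current canonical name.
# This does not mirror the real module data; it only illustrates the idea.
COUNTRY_ALIASES = {"Swaziland": "Eswatini"}


def canonical_country(name: str) -> str:
    """Return the canonical country name, following an alias if one exists."""
    return COUNTRY_ALIASES.get(name, name)


def place_categories(place_type: str, country: str) -> list[str]:
    """Build category names from the canonical country name."""
    canonical = canonical_country(country)
    return [f"Places in {canonical}", f"{place_type} in {canonical}"]


# An entry that still says "Swaziland" categorizes under "Eswatini".
print(place_categories("Cities", "Swaziland"))
```

With a table like this, existing entries that reference the old name would keep working while categorizing under the new one, which is the behaviour the comment asks for.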
- Support "Eswatini", also "Cabo Verde", "C'ote d'Ivoire" (with appropriate diacritics), "Czechia", "Timor-Leste", and "Turkiye" (with appropriate diacritics). —Justin (koavf)❤T☮C☺M☯ 08:52, 14 June 2023 (UTC)
- FWIW I oppose all the other suggestions of User:Koavf on grounds of common usage. Benwing2 (talk) 07:41, 15 June 2023 (UTC)
This page seems to no longer be maintained by User:AryamanA and is of general interest to the Wiktionary community. Also: (@Koavf) this wouldn't break any links as long as we turn the old page into a redirect. Ioaxxere (talk) 03:00, 14 June 2023 (UTC)
Support per nom. — excarnateSojourner (talk · contrib) 06:34, 14 June 2023 (UTC)
- Oppose: On second thought, I prefer Thadh's suggestion to move it to User:Wonderfool. — excarnateSojourner (talk · contrib) 05:04, 15 June 2023 (UTC)
- Support. (Re broken links: @Ioaxxere: I think Koavf was referring to the user pages of Wonderfool accounts, which were all redirected to User:AryamanA/Wonderfool; double redirects do not work. As there would be many to fix, a bot could help.) J3133 (talk) 07:56, 14 June 2023 (UTC)
- Support —Justin (koavf)❤T☮C☺M☯ 08:42, 14 June 2023 (UTC)
- Oppose - There has to be some limit to how much we glorify someone for refusing to behave normally. Project pages are beyond that limit in my view. I'm deleting the Wiktionary namespace page. - TheDaveRoss 22:44, 14 June 2023 (UTC)
- Oppose This is not Wiktionary material, it's userspace material. If you don't want to host it at Aryaman's, host it directly at User:Wonderfool. Thadh (talk) 22:46, 14 June 2023 (UTC)
- @Thadh: What Ioaxxere mentioned at User talk:AryamanA/Wonderfool but did not here is that Wikipedia uses the project namespace (see subpages at w:Wikipedia:Long-term abuse/List). They have decided (including by four deletion nominations – all kept) that it should be there. J3133 (talk) 23:09, 14 June 2023 (UTC)
- This is not a page used to help stifle long-term abuse, it is to make WF feel good. A much less valid rationale for a project page. - TheDaveRoss 23:18, 14 June 2023 (UTC)
- Oppose WF is definitely not deserving of a project page dedicated to them. Megathonic (talk) 04:54, 16 July 2023 (UTC)
- Oppose But I welcome anyone to edit it, I was never the one primarily maintaining it (all the fancy code stuff was added by Erutuon iirc). —AryamanA (मुझसे बात करें • योगदान) 18:34, 8 August 2023 (UTC)
Should these categories be merged? Many terms in the -cide categories end in -icide, and thus should be moved, unless we decide not to make this distinction. J3133 (talk) 12:45, 14 June 2023 (UTC)
- Support -icide isn't really even a real suffix, it's just -i- + -cide. Ioaxxere (talk) 00:02, 16 June 2023 (UTC)
- What (s)he said. Nicodene (talk) 16:09, 17 June 2023 (UTC)
- Support unless someone comes up with a really good argument otherwise. DCDuring (talk) 16:17, 17 June 2023 (UTC)
Is there a maintainable distinction here? The pages claim that jail is pre-trial and prison is post-trial, but our entries define both as being for those awaiting trial as well as those being jailed/imprisoned, and most of the entries listed on Thesaurus:jail are for places people are confined long-term post-sentence (so according to the claimed distinction, belong on the other page). My initial inclination is that it'd be better to merge these. - -sche (discuss) 01:53, 15 June 2023 (UTC)
- It would be a further demonstration that our thesaurus entries don't have explanatory value, whatever their value for writers and translators and others who need to be reminded of words they at least recognize. It's a small niche, but the explanatory thesaurus of my dreams has yet to be written. DCDuring (talk) 02:31, 15 June 2023 (UTC)
- I think the words are synonymous and interchangeable in most varieties of English. — Sgconlaw (talk) 04:41, 15 June 2023 (UTC)
- In a narrow technical sense in some jurisdictions, these are distinct- but thesaurus entries are about more than a narrow technical sense. The second is really a subset of the first- there's nothing about the wording in the first one to exclude anything that goes in the second. The result is that the first has three times as much as the second, with considerable redundancy between the two and even a few terms like can, clink, jug and slammer in the second one that aren't particularly associated with long-term incarceration. The imbalance can be at least partly corrected by sorting through everything, but it would be far easier to just move everything in Thesaurus:prison to Thesaurus:jail that isn't already duplicated there. We could even have a section set aside in Thesaurus:jail for long-term facilities. Chuck Entz (talk) 06:05, 15 June 2023 (UTC)
Not a bad idea either Ioaxxere (talk) 00:02, 16 June 2023 (UTC)
- Oppose It's suspicious that WF supports this move. Daniel.z.tg (talk) 22:25, 16 June 2023 (UTC)
- @Daniel.z.tg WF moved it to Wiktionary:Wonderfool (not User:Wonderfool). Do you mean because WF moved the page anywhere you are opposed to moving it anywhere? — excarnateSojourner (talk · contrib) 16:05, 17 June 2023 (UTC)
- I don't really know. This time I just decided to concur with the people who opposed WF recently on WT:V.
- I originally had weak support for this move. I was thinking there was a minor cleanup benefit, and two people supported it. Then WF moved it which caused me to change my mind. Normally WF starting something is only a small negative point and, as seen in the previous vote or two, usually doesn't cause me to change my mind. There I mainly consider the merits of the proposal itself instead of the champion and I found them concretely beneficial. This time I started out at Weak Support so what WF did was able to move me to (Weak?) Oppose. Daniel.z.tg (talk) 16:53, 18 June 2023 (UTC)
- Support Keeps it in userspace without implying it "belongs" to any one (non-WF) editor. — excarnateSojourner (talk · contrib) 16:05, 17 June 2023 (UTC)
- Support Thadh (talk) 12:18, 21 June 2023 (UTC)
- Support, but it should be made clear that it is not Wonderfool’s user page. J3133 (talk) 12:23, 21 June 2023 (UTC)
- Support Binarystep (talk) 20:38, 28 June 2023 (UTC)
- Oppose I'd rather delete User:AryamanA/Wonderfool entirely and have all sockpuppet accounts redirect to User:Wonderfool. Including a long list of sockpuppet accounts and their edit totals just adds to the attention that WF craves and encourages their behavior. WF's original userpage could be modified to contain nothing more than a very brief message saying if you were redirected here from a user's page, it's because they're a sockpuppet of WF. Short and to the point, it lets users know who they are dealing with without adding to the attention that WF gets. The page itself would be set to the template editor/admin lock level to prevent WF from using another sockpuppet to modify it. Megathonic (talk) 04:38, 16 July 2023 (UTC)
- Agreed. User:Ioaxxere, are you going to fix the double redirects (e.g. with AWB)? —Justin (koavf)❤T☮C☺M☯ 02:48, 22 July 2023 (UTC)
- I also agree, now that the vote to unblock WF has been created. The timing of the resolution of this discussion is unfortunate, I think. It would be better to wait until we see the outcome of the vote. Ah, well. Andrew Sheedy (talk) 03:10, 22 July 2023 (UTC)
Moved. Ioaxxere (talk) 02:01, 22 July 2023 (UTC)
- All of WF's sockpuppets now need to have the redirect link updated. For whoever wants to do it. Megathonic (talk) 02:07, 22 July 2023 (UTC)
Splitting Selkup
@Surjection, Tropylium, Kaarkemhveel After a while of deliberation with Kaarkemhveel and two other future Selkup editors, we have come to the conclusion that it's best to split Selkup into two codes: Northern Selkup (sel-nor) and Southern Selkup (sel-sou) [the exact form of the codes is up for debate], which will both be part of the Selkup family (sel).
These two dialect areas are so different that treating them as a single language would be too bothersome. All subdialects are going to be marked with labels, and provided as languages in descendants sections (much like the two Karelian proper varieties are, or the Zyrian dialects).
The two branches are often named as different: Glottolog splits Selkup into "Kety-Central-Southern Selkup" (Southern) and "Taz-Turukhan" (Northern); The Oxford Guide to the Uralic Languages also shows a split between "Northern Selkup" and "Tomsk region Selkup" (p.778). A few more examples of papers that do this include Wurm (1997), Budzisch (2015), Vorobeva et al. (2017)...
There is precedent for treating these as different languages: ELP splits the family into three full-fledged languages ([18] [19] [20]). On the pages there is the following reasoning for this split: "The three main varieties of Selkup have traditionally been counted as dialects of a single language; their differences are, however, comparable to those between, for instance, Ket, Yug, and Pumpokol".
The Russian institute RAN also splits Selkup into Northern and Southern, as two full-fledged languages.
So, does anyone have an issue with this split? Thadh (talk) 11:04, 19 June 2023 (UTC)
- Not oppose as there are clear differences both lexical and cultural. Tollef Salemann (talk) 11:14, 19 June 2023 (UTC)
- The Wikipedia article also mentions a Central Selkup. What are you doing with that one? Does it belong to Southern Selkup? —Mahāgaja · talk 14:03, 19 June 2023 (UTC)
- Yes, that one will then be handled as Southern Selkup, just like it is by the above sources. Thadh (talk) 14:12, 19 June 2023 (UTC)
- No opposition on this much, Northern Selkup is by now clearly distinct from non-Northern and has its own literary standard. Bridging historical data exists but would be probably better handled in Proto-Selkup entries anyway, about all of it is field records and not direct literary use by the speaker community.
- Depending on how work on non-Northern Selkup develops, further division could be eventually meaningful too. The other recent handbook, Routledge's The Uralic Languages, Second Edition discusses things from a primarily tripartite Southern / Central / Northern perspective and notes that, though the sharpest modern boundary is Central vs. Northern, the most taxonomically significant difference is Southern vs. {Central, Northern}. I believe currently Southern is better-documented than Central, but the latter is what still has some attempts at literary usage and revival. --Tropylium (talk) 14:48, 19 June 2023 (UTC)
- Done. Cleanup is ongoing. Thadh (talk) 20:01, 28 June 2023 (UTC)
English. There's gotta be some overlap between pinnulated and pinnulate. Someone smarter than me could have a go at fixing it. Sub zero Temps (talk) 12:55, 23 June 2023 (UTC)
These should be merged. I'm not sure which should be the main form or how to figure that out. Andrew Sheedy (talk) 19:07, 27 June 2023 (UTC)
- Merged. The unspaced form seems more common. - -sche (discuss) 17:57, 16 August 2023 (UTC)
The pronoun doesn't have to be present: google books:"queer the pitch". Should the lemma be moved to queer the pitch, or should that be a synonym, ...? - -sche (discuss) 18:12, 28 June 2023 (UTC)
- I am familiar with the similar queer the deal, defined by "NetLingo" as "To ruin a potential business deal or arrangement despite all favorable odds. For example, 'They are a liberal company, so don't queer the deal by letting them know our conservative tactics.'" The "deal" version is more common with "the" than with possessive pronouns. But I wonder whether the right approach isn't to make both the "the"" and the "someone's" versions redirect to the right sense of queer#Verb, adding usage examples there. DCDuring (talk) 20:15, 28 June 2023 (UTC)
- Sense 4 of queer#Verb is the right definition for these. I wouldn't have called these dated, but then I'm dated. DCDuring (talk) 20:21, 28 June 2023 (UTC)
2023 — July
Chinese. Move to 屄養的: 小 + 屄養的. 恨国党非蠢即坏 (talk) 02:34, 1 July 2023 (UTC)
Tagged over 5 years ago but never listed, a request to merge {{desc-top}} and {{desc-bottom}} into {{des-top}} and {{des-bottom}}. Both create collapsible two-column tables of descendants, but {{desc-top}} provides only the header "Descendants" while {{des-top}} provides the header "Descendants of [term] in other languages". Seems reasonable to me; at any rate, we certainly don't need both. —Mahāgaja · talk 12:27, 4 July 2023 (UTC)
- I prefer the text generated by {{des-top}}, but the name {{desc-top}}. Why don't we move {{des-top}} and {{des-bottom}} to {{desc-top}} and {{desc-bottom}}. I don't think anything bad would happen, at least not in principal namespace. DCDuring (talk) 13:52, 4 July 2023 (UTC)
- As long as there's a hard redirect from the name not selected, it doesn't matter to me which name is selected. In fact, given our fondness for long names with short redirects, we could move {{des-top}} to {{descendants-top}} and have redirects from both short names. —Mahāgaja · talk 14:12, 4 July 2023 (UTC)
- Why not. DCDuring (talk) 14:14, 5 July 2023 (UTC)
- Do template redirects slow down page rendering at all? —Soap— 17:06, 12 July 2023 (UTC)
- Merge. I don't really care about the names. — excarnateSojourner (talk · contrib) 21:16, 15 July 2023 (UTC)
- Merge. I agree here with User:DCDuring about preferring the name {{desc-top}} because it matches {{desc}}. I feel like "Descendants of [term] in other languages" is a bit wordy; maybe it should just say "Descendants in other languages" since the term is implicit? However, I'm not strongly attached to this, and we can always change the wording after the merger. Note that this falls under the purview of WT:RFDO#remove lesser-used column templates where I proposed removing both sets of templates and replacing them with a more general collapsing template (e.g. something like {{box-top|<d>}} for descendants), but that is maybe a longer-term discussion. Benwing2 (talk) 21:45, 24 July 2023 (UTC)
- Actually I really like "Descendants in other languages" because then it isn't necessary to format the word. At the moment, {{des-top}} just puts the term in italics no matter what, but non-Latin scripts aren't supposed to be italicized here, and even without italics, text is supposed to be tagged as the correct language. And changing the text to "Descendants in other languages" is a much simpler solution than changing the template so that you have to type {{desc-top|grc}} (or whatever) and then have the template or module format the text appropriately. —Mahāgaja · talk 06:55, 25 July 2023 (UTC)
English. cotton, cotton on, cotton on to, cotton to. – Jberkel 18:57, 10 July 2023 (UTC)
Correct language names
Could you correct Juǀ'hoan to Juǀʼhoan, Kwak'wala to Kwakʼwala, and K'iche' to Kʼicheʼ? There's no punctuation in the ethnonyms. If we want to use assimilated English forms, then the latter would be Quiché; I'm not sure about Juǀʼhoan. kwami (talk) 19:16, 13 July 2023 (UTC)
- Support. To clarify for people using low-resolution screens: the request is to use the modifier letter apostrophe character ʼ rather than the typewriter apostrophe '; the categories are currently at Category:Juǀ'hoan language (ktz) and Category:K'iche' language (quc). Our usual practice is to use the spelling most common in contemporary English-language discussions of the language. Which is more common in current books and journal articles, Kʼicheʼ or Quiché? —Mahāgaja · talk 19:30, 13 July 2023 (UTC)
- Just to be clear, I personally don't care about ASCII substitutions in category names; what I'm concerned about is proper headers in the dictionary entries. But it's fine by me if the two go together.
- As for Kʼicheʼ or Quiché, the English-language lit has been moving from the Spanish form to the ethnonym. That's an ongoing trend, though of course not universal (e.g. 'German', 'Greek', 'Armenian' etc.). kwami (talk) 21:15, 13 July 2023 (UTC)
- The L2 headers and category names do need to match, at least for readers using tabbed browsing. Otherwise, the categories won't appear in the correct language tab. I think there are also bots that require the L2 header to be the canonical language name in order to work properly. —Mahāgaja · talk 22:20, 13 July 2023 (UTC)
- Okay. Works for me. kwami (talk) 22:24, 13 July 2023 (UTC)
- @Kwamikagami Normally at Wiktionary we use typewriter apostrophes rather than curly single quotes, and this issue is somewhat controversial, so this change is unlikely to happen without significant further discussion and consensus. Benwing2 (talk) 04:27, 24 July 2023 (UTC)
- I'm not requesting quote marks. That would also be incorrect. Rather, since we are attempting to use the endonym, IMO it should be the glottal stop or ejective diacritic that's in the orthography. kwami (talk) 04:41, 24 July 2023 (UTC)
- Indeed, no one is advocating curly single quotes. The modifier letter apostrophe is a different character; it's a letter, not a punctuation mark. There are several other language names besides these two that ought to be using it. —Mahāgaja · talk 06:23, 24 July 2023 (UTC)
- Sarci, for example, which was just moved to its endonym (minus tone marking). But I thought I'd wait to see how things went before attempting a more comprehensive proposal. kwami (talk) 06:27, 24 July 2023 (UTC)
- Support - this isn't a matter of using curly quotes vs straight ones; it's a matter of using the correct letter instead of punctuation. We already do this extensively in entries for languages that use it anyway. Theknightwho (talk) 15:39, 24 July 2023 (UTC)
- Going through WT:LOL, these are the languages whose names have the modifier letter apostrophe at Wikipedia but the typewriter apostrophe here:
- aah: Abu' Arapesh
- bcr: Babine-Witsuwit'en
- bei: Bekati'
- bbj: Ghomala'
- bko: Kwa'
- byd: Benyadu'
- caa: Ch'orti'
- crq: Iyo'wujwa Chorote
- crt: Iyojwa'ja Chorote
- fmp: Fe'fe'
- gaq: Gata'
- gwi: Gwich'in
- ilu: Ili'uun
- kek: Q'eqchi
- kjb: Q'anjob'al
- ktz: Juǀ'hoan
- kuk: Kepo'
- kuy: Kuuku-Ya'u
- kwk: Kwak'wala
- lni: Daantanai'
- lra: Rara Bakati'
- lul: Olu'bo
- mgo: Meta'
- mhy: Ma'anyan
- mlu: To'abaita
- mtk: Mbe'
- muc: Mbu'
- mym: Me'ne
- nea: Eastern Ngad'a
- nnz: Nda'nda'
- ood: O'odham
- pav: Wari'
- phq: Phana'
- poh: Poqomchi'
- pqa: Pa'a
- quc: K'iche'
- rob: Tae'
- sda: Toraja-Sa'dan
- srs: Tsuut'ina
- ssq: So'a
- stv: Silt'e
- tfn: Dena'ina
- tln: Talondo'
- tyh: O'du
- tzj: Tz'utujil
- ulm: Ulumanda'
- ulu: Uma' Lung
- wmh: Waima'a
- xkk: Kaco'
- xky: Uma' Lasan
- xoc: O'chi'chi'
- myn-chl: Ch'olti'
- Other languages with typewriter apostrophe whose Wikipedia article uses a different character include:
- gez Ge'ez → Geʽez with ʽ (U+02BD modifier letter reversed comma)
- hps Hawai'i Pidgin Sign Language → Hawaiʻi Pidgin Sign Language with ʻ (U+02BB modifier letter turned comma)
- num Niuafo'ou language → Niuafoʻou with ʻ (U+02BB modifier letter turned comma)
- tct T'en → Tʻen with ʻ (U+02BB modifier letter turned comma)
- tsl Ts'ün-Lao → Tsʻün-Lao with ʻ (U+02BB modifier letter turned comma)
- I support making all of these changes. —Mahāgaja · talk 19:54, 24 July 2023 (UTC)
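For readers who cannot tell the characters in the lists above apart on screen, the following Python sketch prints the code point, Unicode name and general category of each apostrophe-like character mentioned in this discussion. It uses only the standard `unicodedata` module; the labels in `CANDIDATES` and the helper name `describe` are illustrative choices, not anything used on Wiktionary.

```python
import unicodedata

# The apostrophe-like characters named in the discussion above.
# The dictionary keys are informal labels chosen for this sketch.
CANDIDATES = {
    "typewriter apostrophe": "\u0027",
    "modifier letter apostrophe": "\u02BC",
    "modifier letter turned comma": "\u02BB",
    "modifier letter reversed comma": "\u02BD",
}


def describe(ch: str) -> str:
    """Return code point, Unicode name and general category for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} [{unicodedata.category(ch)}]"


for label, ch in CANDIDATES.items():
    print(f"{label}: {describe(ch)}")

# The two spellings look alike but compare as different strings.
print("K'iche'" == "K\u02BCiche\u02BC")
```

The general category is the relevant part: the typewriter apostrophe is classed as punctuation, while the three modifier characters are classed as letters, which is the distinction the supporters of the change are pointing to.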
- I oppose these changes. What is the actual benefit? From the above discussion, there are at least three different Unicode apostrophe-like characters involved, which are easily confused, and it will make it significantly harder to type the language names into headers, categories and the like. This is going to be a major pain in the ass for people like me who will have to clean up wrongly-typed apostrophes in language headers in innumerable articles created by IP's and other occasional contributors, who are unlikely to be able to type the right character. Furthermore, even with these changes, the language names in many cases will not actually match their endonym spelling; cf. the proposed Oʼodham, which is actually spelled ʼOʼodham natively with two apostrophes. Similarly, as pointed out by User:Kwamikagami, our spelling of the CAT:Tsuut'ina language doesn't include the tone mark that is present in the native orthography, and wouldn't even with the change in apostrophe. I should add that Wikipedia uses these Unicode chars specifically because Kwami went around renaming all the articles (formerly they used the straight apostrophes), and is not consistent, e.g. the article on the name of the people is still at O'odham with a straight apostrophe. Glottolog uses straight apostrophes for O'odham; so does [21], the Endangered Languages Project. In general, our policy is to use the *English* names for languages; we are not forced to use the exact native spelling. While I agree it's a good idea to approximate the spelling (e.g. avoiding exonyms where possible), I disagree we have to take this to the extreme of using the "correct" Unicode apostrophes (which I bet you will find native speakers not using in many cases as well). Benwing2 (talk) 20:22, 24 July 2023 (UTC)
- Other people's carelessness in using Unicode is no excuse for us to be careless, and anyway, language names can always be inserted by typing {{subst:\|xyz}}, which doesn't involve any non-ASCII characters. Latin a and Cyrillic а look identical in every font and font style too, but substituting one for the other is an error; it's no different with ' and ʼ. —Mahāgaja · talk 07:05, 25 July 2023 (UTC)
- I think you're missing the point. We don't include Cyrillic letters in language names, either. Benwing2 (talk) 07:13, 25 July 2023 (UTC)
- I know that. My point is that using ' where ʼ belongs is as bad as using Cyrillic letters in Latin-script language names. —Mahāgaja · talk 07:24, 25 July 2023 (UTC)
- Hmm... like Benwing, my initial inclination is to oppose this, because the odds of anyone being able to type names with the fancy characters when adding entries is low (and given recent events, I wonder if one or more admins would block people for 'adding wrong language names' if people keep typing the names they're able to type). OTOH, I recognize that we require entries themselves to be input using correct spellings (with accents etc) and not in hacky ways... If we had a system like the French Wiktionary where no-one had to type the language names (instead only typing language codes, which only consist of easily-typeable ASCII characters), then changing the displayed character would be less of a problem (though still hard for navigating to categories, etc). Do we have a template with a simple short name people could subst: to produce the untypeable names, so they could write =={{subst:langname|foo-bar}}== to get ==Fooʾbar==? Or if we took this type of functionality and had a button people could periodically press (hosted on here like that Javascript is, not as a Python script on the computer of a user who might leave the project or be too busy to run it) that would search the database for instances of the typeable names and update them to the untypeable names, then it would be less of a problem (although it'd still be creating an unending maintenance task). - -sche (discuss) 16:22, 16 August 2023 (UTC)
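The periodic cleanup pass suggested in the comment above (find headers written with the typeable name and rewrite them to the canonical name) could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the mapping `TYPEABLE_TO_CANONICAL` lists only the three names from the original request, the function name `normalize_headers` is hypothetical, and a real tool would presumably read the names from Wiktionary's language data and edit pages through a bot framework rather than operate on a plain string.

```python
import re

# Hypothetical mapping from ASCII-typeable names to the proposed canonical names.
# Only the three names from the original request are listed here.
TYPEABLE_TO_CANONICAL = {
    "K'iche'": "K\u02BCiche\u02BC",
    "Kwak'wala": "Kwak\u02BCwala",
    "Ju\u01C0'hoan": "Ju\u01C0\u02BChoan",
}

# Matches a level-2 header such as "==K'iche'==" on its own line.
# Level-3 headers ("===Noun===") do not match because the name may not contain "=".
L2_HEADER = re.compile(r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", re.MULTILINE)


def normalize_headers(wikitext: str) -> str:
    """Rewrite level-2 language headers that use a typeable name; leave all others untouched."""

    def fix(match: re.Match) -> str:
        canonical = TYPEABLE_TO_CANONICAL.get(match.group(1))
        return f"=={canonical}==" if canonical else match.group(0)

    return L2_HEADER.sub(fix, wikitext)


page = "==K'iche'==\n\n===Noun===\n# ...\n\n==English==\n"
print(normalize_headers(page))
```

A pass like this only touches headers; categories, links and template arguments that spell out the language name would need separate handling, which is part of the unending maintenance task the comment mentions.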
Chinese. Move to 有你的: 真 + 有你的. 恨国党非蠢即坏 (talk) 07:50, 18 July 2023 (UTC)
English. Move to bee's knees: like shits for "the shits". 恨国党非蠢即坏 (talk) 03:58, 24 July 2023 (UTC)
- Maybe, but does bee's knees attestably occur other than as a part of the bee's knees? DCDuring (talk) 15:33, 24 July 2023 (UTC)
- @DCDuring: Are attestations needed to leave "the" out? Other entries seemingly just always leave "the" out, like United States。 恨国党非蠢即坏 (talk) 10:52, 8 August 2023 (UTC)
- It would be better if we had some evidence for all similar cases. OTOH, if we think new normal users are able to use the failed-search page, then they would find [[bee's knees]], even if they searched for "the bee's knees" (and vice versa). I personally think that normal users can't be assumed to make good use of that page. DCDuring (talk) 12:53, 8 August 2023 (UTC)
- We are quite inconsistent about whether we include the or not, e.g. cat's pyjamas redirects to the cat's pyjamas, contrary to the direction of the the shits → shits redirect. It would be better to try to decide on a general approach rather than move entries piecemeal. DCDuring, you argued in favor of redirecting verb oneself to verb even when it's never attested other than with a reflexive pronoun; it seems to me the same logic would make it better to centralize content at bee's knees, too. "The" is dropped from constructions like this when they're used attributively and in certain other cases (peruse the cites at google books:"and bee's knees"), and in headlinese ("Mayor Says New Parks Are Bee's Knees"). I pointed this out about Talk:The Rock, too. - -sche (discuss) 16:15, 16 August 2023 (UTC)
Old Polish. Should be kamień. Vininn126 (talk) 11:55, 27 July 2023 (UTC)
@-sche, Sgconlaw {{circa2}} was created apparently to work around the fact that {{circa}} adds (or added) a comma automatically. Now that I'm changing {{circa}} (along with {{ante}} and {{post}}) not to do this, I don't see any use for {{circa2}} and propose merging it into {{circa}}. Benwing2 (talk) 21:47, 1 August 2023 (UTC)
- @Benwing2: it seems like {{circa}} and {{circa2}} serve different purposes. The template {{circa}} (and {{ante}} and {{post}}) appear to have been created for quotations in entries that do not use quotation templates. That is why the year appears in bold and there is a comma after the year. On the other hand, {{circa2}} is for adding circa or c. before a year in other contexts, such as in etymology sections or image captions. I suppose {{circa}} and {{circa2}} could be merged, but then some parameter would have to be added to allow for switching between the two formats. Alternatively, if all quotations using {{circa}}, {{ante}}, and {{post}} were replaced with quotation templates, then {{ante}} and {{post}} could be eliminated and {{circa2}} could be renamed as {{circa}}. — Sgconlaw (talk) 05:55, 2 August 2023 (UTC)
- @Sgconlaw I have eliminated the aftercomma from {{circa}}, {{ante}} and {{post}}. What differences remain? Just the boldface? That seems a pretty small thing to have two templates for, esp. given the horrible naming. Benwing2 (talk) 06:17, 2 August 2023 (UTC)
- BTW {{ante}} etc. frequently appear inside of quotation templates. What is the way to do without them? Benwing2 (talk) 06:18, 2 August 2023 (UTC)
{{quote-book|en|year={{ante|1597}}|first=William|last=Shakespeare|authorlink=William Shakespeare|title={{w|The Merry Wives of Windsor}}|section=Act 3, Scene 5|passage=No, Master Brook, but the peaking '''cornuto''' / her husband, Master Brook, dwelling in a continual / 'larum of jealousy, comes me in the instant of our / encounter, after we had embraced, kissed, protested, / and, as it were, spoke the prologue of our comedy}}
- I am cleaning all of them up to use e.g. a. 1597 instead. Note that |origyear=, |year_published=, etc. now support a., c. and p. prefixes. Benwing2 (talk) 08:22, 13 August 2023 (UTC)
- Hmm. I would not merge these as things stand now, with them having the differences re bolding that they do, and the differences in where they're used: they currently serve different purposes. (Since we don't normally bold years in etymologies, descendants lists, etc, a template used in etymologies to qualify a year as circa shouldn't bold the year either, whereas we do normally bold years at the start of quotation metadata, so a template that supplies circa there should bold the year.) However, if we replace all of the relatively few (~680) uses of {{circa2}} with just the spelled-out word "circa" — formatted however: "circa", "c.", whatever we decide — rather than a template, we could just delete {{circa2}}. And/or if we made sure all uses of {{circa}} were inside quotation templates (not manually-formatted quotations), then we could presumably have the quotation templates know that if year={{circa|####}}, then format #### in bold (but don't bold circa?), and then if {{circa}} and {{circa2}} stopped differing in the formatting they apply, they could be merged. - -sche (discuss) 16:06, 16 August 2023 (UTC)
I think these should be merged. The former (which includes a usage example with I can’t even) was created in 2015, the latter in 2022. J3133 (talk) 16:15, 4 August 2023 (UTC)
- We don't have any attested use of any kind for either entry. We would want evidence of use with subjects other than I, including use with subjects not in first person. DCDuring (talk) 16:40, 4 August 2023 (UTC)
There's no Wiktionary:Cleanup process 1, so it is quite trivial. Requires admin powers to override the redirect, I think. Pious Eterino (talk) 00:10, 9 August 2023 (UTC)
Ktunaxa, Secwepemctsín
Could we rename Kutenai (kut) to Ktunaxa, and Shuswap (shs) to Secwepemctsín please? The first names are the Anglicized terms for the languages, and are somewhat outdated and/or not in use among speakers. GKON (talk) 22:46, 12 August 2023 (UTC)
- @-sche Can you weigh in here? There is nothing wrong per se with having exonyms for languages (we say "German" not "Deutsch" for example), and I note that Wikipedia still uses Kutenai and Shuswap. The main issue in my view is (a) avoid pejorative terms, and (b) use the most common terms as found in English-language sources. Benwing2 (talk) 23:37, 15 August 2023 (UTC)
- For Shuswap, almost no-one uses Secwepemctsín in English, either in books overall as tracked by Ngram Viewer, or in reference works about the language at Glottolog. For kut, Kutenai was the main name (in reference works/Glottolog and overall/Ngrams) until a few years ago, when Ktunaxa started to just barely overtake it. - -sche (discuss) 17:45, 16 August 2023 (UTC)
Should be merged into Category:Christianity. Ioaxxere (talk) 17:38, 13 August 2023 (UTC)
- @Ioaxxere No: the category covers terms mainly used by religious figures, not terms that merely relate to Christianity. Many of the terms in Category:Thai ecclesiastical terms, for example, have nothing to do with Christianity. Theknightwho (talk) 22:21, 15 August 2023 (UTC)
PIE *ḱléyteh₂
I think this should be *ḱleytéh₂, in order for Verner's law to give Proto-Germanic *hlīdō, which is the only way to get German Leite. Is this correct? —Caoimhin ceallach (talk) 18:42, 14 August 2023 (UTC)
- @Caoimhin ceallach: I've moved it. --{{victar|talk}} 19:40, 14 August 2023 (UTC)
Template:lj and Template:jaru into Template:rja
Both {{lj}} and {{jaru}} are aliases for {{ruby/ja}}, which calls {{ruby}} and wraps it using {{lang|ja|...}}. Now, why doesn't {{ruby}} take a lang code in the first place? That is strange. But the aliases are terrible; I propose eliminating them both in favor of {{rja}}, which is a logical shortening of "ruby/ja". We could have for example {{rko}} for Korean ruby, if it is so needed. Pinging the Japanese work group (Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Chuterix, Mcph2); sorry for the wide ping. Benwing2 (talk) 08:14, 15 August 2023 (UTC)
- All three -- {{lj}}, {{jaru}}, and {{ruby/ja}} -- were the creations of Fumiko Take. They did very little to document any of the templates or aliases they created, and if dim memory serves, they were even aggressively oppositional when asked to provide documentation.
- Stepping back -- what is the use case for this infrastructure? Do we not already have functional ruby text provided by {{ja-r}}?
- Granted, {{ruby/ja}} offers the ability to specify arbitrary ruby text -- but I struggle to think of when we'd actually want that. It's used to great effect in manga, when authors will not uncommonly spell a word to convey a particular sense, and gloss it with ruby to indicate a different word entirely -- but for a dictionary, this is aberrant behavior outside of direct quotes of such texts. I suspect that, in most cases, {{ja-r}} would do just fine for our needs.
- Stepping back a bit further -- do we need ruby text at all?
- Serious question. Tiny kana over the kanji is something that only provides value to people who can already read kana, and is otherwise likely to confuse anyone unfamiliar with Japanese typography (which is probably the greater part of our user base). If a given user can already read kana, they are likely to be savvy enough to be able to match up any provided romanized string to the kanji, much as we get when using {{m|ja|TERM|tr=romanization}}.
- I argue that kana ruby text over kanji is snazzy, but it also presents usability issues.
- At any rate, I would welcome an overhaul of the listed {{lj}}, {{jaru}}, and {{ruby/ja}} templates. ‑‑ Eiríkr Útlendi │Tala við mig 08:34, 15 August 2023 (UTC)
{{ja-r}}
(and{{ryu-r}}
), but I disagree that we should be getting rid of rubytext. I think the aim should be to incorporate rubytext into{{l}}
,{{m}}
(et al). The infrastructure for language-specific formatting in links already exists (and is already used by Chinese and the Chinese lects to generate simplified forms), so we could add something for the Japonic languages that essentially reimplements{{ja-r}}
(for the relevant language). Theknightwho (talk) 22:09, 15 August 2023 (UTC) Forgot to ping Benwing2. Theknightwho (talk) 22:12, 15 August 2023 (UTC)- Just to add a bit further to this - I'd also like to automate much of the kanji/kana mapping which is currently necessary with
{{ja-r}}
. It won't be possible to do away with it entirely, due to redlinks or when there are multiple possibilities, but{{ja-pos}}
(and all the other headword templates) are able to do this already by looking at the input for{{ja-kanjitab}}
, so there's no reason why link templates shouldn't be able to do this as well. - This would greatly simplify a lot of the complexity encountered when adding Japanese links, which would help with the usability issues Eirikr mentions. Theknightwho (talk) 22:17, 15 August 2023 (UTC)
- @TheknightwhoIs there a way to automatically convert
{{ruby}}
to{{ja-r}}
? There are over 1,000 uses of{{ruby}}
(often appearing as{{lj}}
) and I'd like to get rid of them if possible. Benwing2 (talk) 22:36, 15 August 2023 (UTC)- @Benwing2 It doesn’t look like it’ll be straightforward, as the syntax is pretty different unfortunately. I’ll need to look at it more in-depth to get a better idea. Theknightwho (talk) 23:11, 15 August 2023 (UTC)
- Blah. So much crappy East Asian code (and templates) out there. Even if the conversion is possible automatically in only, say, 80% of the cases, that would probably be good enough, as we can do the remainder by hand or just leave them. If for example there are cases that can be handled using {{ruby}} and not with {{ja-r}}, that is probably fine, but we should not have two ways of doing the same thing and randomly use one or the other. Benwing2 (talk) 23:31, 15 August 2023 (UTC)
- @Benwing2 Yeah, I suspect a conversion is possible, and as a last resort 1,000 uses is doable manually if a few of us handle it.
- On the subject of crappy East Asian templates (and before I forget), it’s worth you having a look at the templates reliant on Module:th and Module:km as well. Theknightwho (talk) 01:30, 16 August 2023 (UTC)
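As a rough illustration of the kind of automatic conversion discussed above, the following Python sketch splits the `[base](reading)` bracket markup accepted by {{ruby}} into a base string and a kana string, and then assembles a {{ja-r}} call from them. The function names `split_ruby_markup` and `to_ja_r` are hypothetical, and the sketch assumes plain markup without wikilinks, nested templates or extra parameters; any such case would fall into the remainder to be handled by hand.

```python
import re

# Matches one "[base](reading)" pair of the bracket markup used by {{ruby}}.
RUBY_PAIR = re.compile(r"\[([^\[\]]+)\]\(([^()]+)\)")


def split_ruby_markup(markup: str) -> tuple[str, str]:
    """Return (base_text, kana_text) for plain bracket markup.

    Assumption: the markup contains no wikilinks or nested templates.
    Text outside the bracket pairs (e.g. okurigana) is kept in both strings.
    """
    base = RUBY_PAIR.sub(lambda m: m.group(1), markup)
    kana = RUBY_PAIR.sub(lambda m: m.group(2), markup)
    return base, kana


def to_ja_r(markup: str) -> str:
    """Build a two-argument {{ja-r}} call (hypothetical helper)."""
    base, kana = split_ruby_markup(markup)
    return "{{ja-r|" + base + "|" + kana + "}}"


if __name__ == "__main__":
    print(to_ja_r("[振](ふ)り[仮](が)[名](な)"))
```

Inputs that contain links, such as the Module:ja-ruby example quoted further down, are deliberately left out of scope here, since the link target and the displayed text would need separate handling.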
- @Theknightwho, my usability concern is not about editing, it's about reading, and about accessing the text as it is rendered in the browser.
- On the reading side, things like 漢方 (kanpō) are visually unclear to anyone not already somewhat familiar with Japanese typography -- it looks like the entire block of kanji + furigana together is the Japanese "word", ruby and all, when in fact the Japanese term is 漢方. Even if a reader understands that the ruby text is not actually part of this term, the kana are only useful for someone who already knows how to read kana. The kana are also superfluous, as we already include a romanization, which provides the same information just in a different script.
- In terms of the accessibility of the rendered text, for reasons obscure to me, the <ruby> element in the HTML seems to render the Japanese term un-copyable. If I select the text "things like 漢方 (kanpō) are" as rendered, and hit CTRL+C and then try to paste that somewhere, I only get "things like (kanpō) are" -- the Japanese text itself is missing entirely. Meanwhile, if I select the text "the Japanese term is 漢方." and do the same, I get "the Japanese term is 漢方." -- the pasted text includes everything I expected.
- I'm curious, why do you think we should use ruby more? ‑‑ Eiríkr Útlendi │Tala við mig 23:00, 15 August 2023 (UTC)
- @Eirikr I think realistically, most Japanese entries are going to be used by people already familiar with Japanese enough to know what the function of the rubytext is. Although we’re a dictionary in English, that doesn’t change the reality that most dictionary entries are of little use to a complete novice.
- You’re right about there being an issue from a copy and paste point of view, and it’s something that it would be good to solve if at all possible. I’m sure there is a solution, but I’d need to look into it. Theknightwho (talk) 23:07, 15 August 2023 (UTC)
- Also, just adding that the rubytext does actually serve an additional purpose to the romanisation, as it shows the reading for each kanji; romanisation can’t do that (unless we used rubytext for that instead, which I don’t think would be very helpful as it wouldn’t show semantic word breaks). Theknightwho (talk) 23:14, 15 August 2023 (UTC)
- If folks are familiar enough with Japanese to where they understand both kana and how furigana (kana used as ruby text) work, then they also have some idea of how Japanese phonemes break down, and how kanji readings work -- so again, furigana wind up largely superfluous to the only audience that knows how to use them.
- I really think we (speaking generally) get too caught up in technical details and the coolness factor, and lose sight of usability and usefulness. Outside of those manga-esque cases where the spelling and the intended reading are really orthogonal, like 騎士 (naito), I honestly don't think that furigana are useful enough to offset the negative impacts on usability.
- ... One idea occurs to me. Is there any easy way of toggling ruby display on and off? Thinking further, would there be any way of indicating in the wikicode if ruby is really needed (as in the 騎士 (naito) example, otherwise anyone who can read Japanese that looks at 騎士 would expect to read it as kishi), or if the ruby is optional (such as when the ruby just indicates the regular reading of a given spelling)? ‑‑ Eiríkr Útlendi │Tala við mig 19:51, 16 August 2023 (UTC)
- @Eirikr I know next to nothing about Japanese but I can see how ruby text is useful. For example, I can read Cyrillic but I don't know the ins and outs of irregular pronunciations in Russian; in cases like that we show a respelling in Cyrillic as well as give the IPA, and I think the Cyrillic respelling is useful. I imagine there are plenty of Japanese learners who will be able to read Hiragana (it's probably one of the first things taught) but have difficulty with Kanji (keep in mind it takes around 10 years for native speakers to learn to read and write Kanji, and probably only a few weeks to learn Hiragana). Benwing2 (talk) 20:30, 16 August 2023 (UTC)
- I was afraid of some confusion, and indeed, here we have it. :)
- Speaking specifically about ruby for Japanese -- I grant that there are plenty of other use cases in other languages. By no means do I advocate for getting rid of {{ruby}}. I'm looking solely at the use case for {{ruby/ja}} and redirects.
- → For Japanese itself, how is ruby using kana any more useful than simply providing the romanization in parentheses? ‑‑ Eiríkr Útlendi │Tala við mig 20:58, 16 August 2023 (UTC)
- @Eirikr The ruby text seems to allow for convenient markup of running Japanese text without interrupting the flow; putting romanizations in parens in the middle of a sentence would interrupt the flow, which is why it gets added at the end. I could imagine putting romanization in ruby text but it seems that isn't conventional. Benwing2 (talk) 21:05, 16 August 2023 (UTC)
{{usex|ja}}
and{{ja-usex}}
put romanization afterwards, not mid-text. I can't think of any case where a romanization would be inserted in the middle of an otherwise-running Japanese text.{{usex|ja|これは見本です。|This is an example.|tr=Kore wa mihon desu.}}
→
- これは見本です。
- Kore wa mihon desu.
- This is an example.
{{ja-usex|これは見本です。|これ は みほん です。|This is an example.}}
→
- これは見本です。
- Kore wa mihon desu.
- This is an example.
- We could also leverage
{{ja-r}}
.{{ja-r|これは見本です。|^これ は みほん です。|This is an example.|linkto=-}}
→
- これは見本です。 (Kore wa mihon desu., “This is an example.”)
- In terms of the wikicode used to call the templates, I'd argue that
{{ruby/ja}}
is more of a mess, and the syntax is confusingly different from the rest of our Japanese infrastructure. - From the markup example on the Module:ja-ruby page (what
{{ruby/ja}}
actually invokes):[[振る|[振](ふ)り]][[仮名|[仮](が)[名](な)]]
- Yuck. Granted, part of the problem here is borderline link abuse, but by way of comparison, we could use
{{ja-r}}
to similar effect, with a more straightforward syntax:{{ja-r|[[振る|振り]][[仮名]]|ふりがな}}
→
- Separately, in looking for examples of
{{lj}}
just now, I'm finding cases where{{lj}}
seems to have been used as a replacement for{{lang|ja}}
-- there are no ruby characters provided. See this snippet of the wikicode source at 会う#Japanese, for instance:{{quote-book|ja|year=1923|author={{lj|夢野久作}}|title={{lj|約束}}}}
→
- This kind of template misuse should probably be cleaned up as part of this overhaul. ‑‑ Eiríkr Útlendi │Tala við mig 22:32, 16 August 2023 (UTC)
- @Eirikr For reference, on MacOS, using Chrome, when I copy the text with Ruby in it and paste it into TextEdit I get this:
- things like 漢方
- かんぽう
- (kanpō) are
- The same thing happens using Safari, which suggests it's an OS issue, although possibly there are carriage returns in the underlying text that are leading to this. Benwing2 (talk) 23:10, 15 August 2023 (UTC)
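For anyone who wants to check what the base text of a ruby-annotated snippet should be, independently of how a given browser or OS handles copying, here is a minimal Python sketch that drops the content of `<rt>` and `<rp>` elements and keeps everything else. The class name `RubyBaseExtractor`, the helper `base_text` and the sample markup are assumptions for illustration only; the HTML that the templates actually emit may be structured differently.

```python
from html.parser import HTMLParser


class RubyBaseExtractor(HTMLParser):
    """Collect text content while skipping <rt> and <rp> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <rt> or <rp>
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("rt", "rp"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("rt", "rp") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not part of the ruby annotation.
        if self._skip_depth == 0:
            self.parts.append(data)


def base_text(html: str) -> str:
    """Return the text of an HTML fragment without ruby annotations."""
    parser = RubyBaseExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


if __name__ == "__main__":
    # Hypothetical markup, not copied from the rendered page.
    sample = "things like <ruby>漢<rt>かん</rt>方<rt>ぽう</rt></ruby> (kanpō) are"
    print(base_text(sample))
```

This only shows what a "clean" copy would contain; it does not explain why the clipboard drops or reflows the text in the cases reported above.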
- Geez. I asked this one in February and again in March to update the documentation of Module:languages/data/2 for the "generate_forms" stuff that is otherwise largely unexplained. With the promise "I'll add it shortly" half a century passed and the documentation is still nowhere to find. Now he suddenly jumps out and complains how Japanese does not follow the Chinese model... -- Huhu9001 (talk) 01:54, 16 August 2023 (UTC)
- @Huhu9001 Not interested in your drama. Theknightwho (talk) 04:02, 16 August 2023 (UTC)
- "Eliminating them both": does "both" mean t:ruby/ja and t:ruby, or t:lj and t:jaru? -- Huhu9001 (talk) 08:36, 15 August 2023 (UTC)
- @Huhu9001, Eirikr My original proposal was to rewrite {{lj}} and {{jaru}} into {{rja}} as a shortcut for {{ruby/ja}}, but given what Eirikr says, maybe we don't need either of them, or {{ruby/ja}} for that matter. It sounds like maybe the best thing is for {{ruby}} to take a language code and use it to wrap the generated text appropriately, and to simply use {{ruby|ja|FOO}} when you really need to display arbitrary ruby that can't be handled by {{ja-r}}. Then we can get rid of {{ruby/ja}} and its shortcuts. Thoughts? Benwing2 (talk) 19:42, 15 August 2023 (UTC)
- That sounds like a saner approach (using something like {{ruby|ja|FOO}}), but I say this in ignorance of the implementation details. ‑‑ Eiríkr Útlendi │Tala við mig 20:36, 15 August 2023 (UTC)
- T:ruby sometimes serves to prevent double wrapping of language HTML classes, mainly in |title= or |chapter= of quotation templates, like this one |title={{lw|ko|s:님의 침묵/생의 예술|{{ruby|[生](생)의 [藝](예)[術](술)}}|tr=Saeng-ui yesul}} in 열.
- If anyone wants to get rid of t:ruby and replace it with t:ja-r entirely, that could mean you will have to type {{ja-r|.....|linkto=-|tr=-}} every time you want just pure text but nothing else. -- Huhu9001 (talk) 01:54, 16 August 2023 (UTC)
- @Huhu9001 We seriously need to avoid having to wrap one template in another. Maybe we need to make {{ruby}} smarter so that it can handle cases like the one above. Can you enumerate other cases where {{ruby}} gets wrapped in another template, or vice-versa, that can't simply be replaced by the equivalent of {{lang|FOO|{{ruby|...}}}}? Benwing2 (talk) 03:43, 16 August 2023 (UTC)
- There are some cases when you want to ruby only a part of text. Then it can be done like: {{lang|LANG|unrubied text, blahblah, {{ruby|somehow rubied text}}, more blahblah}}. One such usage is in 閣下. -- Huhu9001 (talk) 04:14, 16 August 2023 (UTC)
- @Huhu9001 Assuming that {{ruby}} is modified to do lang markup, why can't you just wrap the whole text in {{ruby}} and only annotate the portion of text you want the Ruby stuff added to? Benwing2 (talk) 04:26, 16 August 2023 (UTC)
- -- Huhu9001 (talk) 05:05, 16 August 2023 (UTC)
- @Huhu9001 You need to think outside the box a bit. For #1, we're talking for the moment about {{ruby}} not {{furigana}}, but {{furigana}} can be made smarter like {{ruby}} is, so that you can annotate part of the text. For #2, {{quote}} should be modified not to language-tag text that already is language-tagged, so it's OK to write {{ruby|ja|...}} inside of {{quote}}; and/or we make a ruby-quote template, similar to how we already have {{ja-x}}; and/or we add built-in support to {{quote}} for ruby text. In general, having to manually wrap using both {{lang}} and {{ruby}} inside of each other is super ugly and should be avoided. Benwing2 (talk) 05:22, 16 August 2023 (UTC)
- @Huhu9001 What I'm probably going to do is modify {{ruby}} so it takes a language param, but you can write {{ruby|-|...}} to force no language wrapping, so that if you really want to embed one template in another, you can do it without fear. Benwing2 (talk) 05:27, 16 August 2023 (UTC)
- I do not think inside or outside the box. I just tell you the current situation because you asked. -- Huhu9001 (talk) 05:40, 16 August 2023 (UTC)
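The proposed behaviour of a language parameter with a "-" opt-out can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the module's actual implementation: the function name `wrap_ruby` and the use of a `<span lang="…">` wrapper are assumptions, and the real templates would apply whatever language tagging {{lang}} uses.

```python
from html import escape


def wrap_ruby(lang: str, rendered_ruby: str) -> str:
    """Wrap already-rendered ruby HTML in a language-tagged span.

    Assumption: "-" means "no language wrapping", so the output can be
    embedded in another template that does its own tagging.
    """
    if lang == "-":
        return rendered_ruby
    return f'<span lang="{escape(lang, quote=True)}">{rendered_ruby}</span>'


if __name__ == "__main__":
    ruby = "<ruby>漢<rt>かん</rt></ruby>"
    print(wrap_ruby("ja", ruby))  # tagged once, by the ruby template itself
    print(wrap_ruby("-", ruby))   # left untagged for an outer template
```

With this shape, an outer template that already tags its contents would call the "-" form, which avoids the double wrapping mentioned earlier in the thread.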
- I don't see any real need for {{lj}} or {{jaru}}, but I haven't looked at any current uses. It seems to me that {{ruby|ja|}} should suffice. As regards e.g. {{ja-r}}, it makes sense to me to use hiragana ruby with kanji, as this is fairly commonly done in Japanese-learning materials. It seems to me (again naively, without having done any specific research into the question) that users are likely to include a fair number of Japanese language learners. Cnilep (talk) 01:49, 16 August 2023 (UTC)