Wiktionary:Grease pit/2023/June


Burmese book titles

Is there an approved method for getting book titles in a non-Roman script rendered well when they appear in {{quote-journal}} and friends? On my home system (Firefox on Ubuntu), the default rendering of Burmese (and Thai for that matter) is unreadably small. I find a strong temptation to wrap them in {{lang|my}}, but this doesn't seem a proper approach to me, though I was impelled to resort to it at မမ this morning. --RichardW57m (talk) 11:32, 1 June 2023 (UTC)[reply]

I've already resorted to baking {{lang}} into the quotation template {{Template:RQ:pi:Burmese Dighanikaya 1}} for the title and the Burmese version of the page numbers. --RichardW57m (talk) 11:46, 1 June 2023 (UTC)[reply]
@RichardW57m Can you give me a specific example I can work against? It sounds like there's just some missing language-wrapping in Module:quote. Benwing2 (talk) 19:06, 2 June 2023 (UTC)[reply]
Will Special:Diff/69547799 do for testing? That's the version of မမ before I fixed it. The current display doesn't look too bad on Windows 10. --RichardW57 (talk) 22:57, 2 June 2023 (UTC)[reply]
@RichardW57 One thing that occurs to me is that the reason for the issue here is that you're using a title and author written in Burmese script. This is the English Wiktionary, so IMO we should not do this, particularly for the author. In fact there is a |trans-title= parameter for giving a translation of the title, but no |trans-author= param; User:Sgconlaw can comment but this may be intentional, with the expectation that the author's name should be transliterated in all circumstances. In general, it's tricky to handle a field that may be in one of several languages without an indication of the language. One thing I could do is make it so that if the |trans-title= param is provided, it is assumed the |title= param is in the language of the quotation, and it is automatically styled appropriately using the equivalent of {{lang}}; and provide a language prefix syntax for any remaining cases, so e.g. if for some reason you want to write a title or author in Burmese and not translate it, you could write {{quote-journal|my|author=my:ခင်စန္ဒာတိုး|title=my:နောင်ချိုဒေသရှိ ဓနုဒေသိယစကားမှ နေ့စဉ်သုံးစကားများလေ့လာချက်|work=Journal of the Myanmar Academy of Arts and Science|year=2018|worklang=my|url=http://maas.edu.mm/Research/Admin/pdf/6.%20Dr%20Khin%20Sandar%20Toef.pdf|volume=XVI|issue=6B}} or similar. (But keep in mind that I and the majority of other Wiktionary readers can't read Burmese, so having the title and author in a foreign script is decidedly user-hostile.) Benwing2 (talk) 06:08, 3 June 2023 (UTC)[reply]
One problem with translating a title is that it may be impossible to get back to the original for searching, and personal names can also be difficult. Converting Romanised Thai names back to Thai requires serious domain knowledge. (One Thai newspaper, the Nation, used to ask people in the news how they wanted their name romanised!) Wikipedia's w:template:cite journal has the trio of |title=, |trans-title= and |script-title= to handle non-Roman script titles, so that one can give a transliteration, a translation and the original. In this particular case, the file name suggests that the name ခင်စန္ဒာတိုး is being transliterated (or ASCIIfied) as Khin Sandar Toe. And I've just noticed that the wrong template was used; it should have been {{cite-journal}}, not {{quote-journal}}, a common mistake. I will correct it in this instance. --RichardW57 (talk) 11:09, 3 June 2023 (UTC)
The default language for titles should, I suggest, be taken from |worklang= for quote-*. --RichardW57 (talk) 11:09, 3 June 2023 (UTC)[reply]
If we use a language prefix, I urge that it be documented as a font selection hint. I can imagine CJK text being headed by a Roman script title, and us not wanting a CJK font to render the title. The prefix 'en:' could then be legitimately used to select a font suitable for English, whatever the actual language of the heading. Note that in this (Burmese) example, the journal name, volume and issue are in the Roman script, while the article title, author and text are (mostly) in the Burmese script. --RichardW57 (talk) 11:09, 3 June 2023 (UTC)[reply]
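As a rough illustration of the language-prefix idea discussed above, the following Python sketch splits an optional prefix such as my: off a parameter value and otherwise falls back to a default language (for instance the value of |worklang=). The function name split_lang_prefix, the two-or-three-letter prefix pattern and the fallback rule are all assumptions made for this sketch; Module:quote is written in Lua and may handle this quite differently.

```python
import re

# Hypothetical pattern: a 2-3 letter lowercase code, a colon, then the text.
# The negative lookahead keeps URL schemes such as "ftp://" from being
# mistaken for a language prefix.
_PREFIX = re.compile(r"^([a-z]{2,3}):(?!//)(.+)$", re.DOTALL)


def split_lang_prefix(value, default_lang):
    """Return (lang, text) for a parameter value with an optional prefix.

    The returned language is only a font/script selection hint, as urged
    in the thread; it need not be the actual language of the text.
    """
    match = _PREFIX.match(value)
    if match:
        return match.group(1), match.group(2)
    # No prefix: fall back to the caller's default, e.g. |worklang=.
    return default_lang, value
```

A pattern this simple would also treat a title that happens to begin with a short lowercase word and a colon as prefixed, so a real implementation would presumably check the prefix against the list of known language codes.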

Hindi spelling generation in Urdu headwords gone?

Urdu headword isn't producing the Hindi/Devanagari equivalent for some reason, when the head parameter is given. It was working fine up until recently? نعم البدل (talk) 14:26, 1 June 2023 (UTC)[reply]

(Pinging @Atitarev, Benwing2). نعم البدل (talk) 15:15, 2 June 2023 (UTC)[reply]
@نعم البدل Can you give me an example? I haven't touched the code in question but User:Atitarev actually thinks this Hindi equivalent shouldn't be generated as it causes lots of problems. Benwing2 (talk) 18:34, 2 June 2023 (UTC)[reply]
BTW see WT:Grease pit/2023/May#Urdu headword issues with two Hindi equivalents. Benwing2 (talk) 19:16, 2 June 2023 (UTC)[reply]
@Benwing2, @نعم البدل: I think Hindi/Devanagari equivalents are still produced on full vocalisations but need to see that example.
In the other topic, I said that alternative headers shouldn't automatically produce Hindi equivalents and the reverse, alternative Hindi spellings shouldn't produce alternative Urdu headers. It just gets a bit messy that way; one pair is enough. Anatoli T. (обсудить/вклад) 13:32, 3 June 2023 (UTC)

FWOTD (today and tomorrow)

This is time-sensitive, and is not the biggest issue ever, but it would be nice if someone with the right permissions could edit today's and tomorrow's Foreign Words of the Day to remove the apostrophe error in today's and the glaring preposition doubling in tomorrow's.

@Lingo Bingo Dingo Is this your domain? SaryaniPaschtorr (talk) 21:45, 1 June 2023 (UTC)[reply]

@SaryaniPaschtorr What apostrophe issue? Vininn126 (talk) 21:48, 1 June 2023 (UTC)[reply]
@SaryaniPaschtorr By the way if the issue is in the entry itself, just change it yourself. Vininn126 (talk) 21:48, 1 June 2023 (UTC)[reply]
"Thieves' cant" needs an apostrophe. I have tried to change the entry myself but FWOTDs are protected (with reason) — I got the following message:
"This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please start a new Grease pit discussion and describe what you were trying to do. A brief description of the abuse rule which your action matched is: WOTD Protection". SaryaniPaschtorr (talk) 22:46, 1 June 2023 (UTC)[reply]
Plus the second definition for today's entry has 'to' for 'in'. This stuff should really be looked over by a detail-oriented and fluent speaker of English before it makes the main page! SaryaniPaschtorr (talk) 22:58, 1 June 2023 (UTC)[reply]
Done. This, that and the other (talk) 23:06, 1 June 2023 (UTC)
Thanks! SaryaniPaschtorr (talk) 20:52, 2 June 2023 (UTC)[reply]
"This stuff should really be looked over by a detail-oriented and fluent speaker of English before it makes the main page!"
That is unlikely to be feasible, in view of the generally low community interest. It is also somewhat rude to the native speaker of English who edited the Polish entry. ←₰-→ Lingo Bingo Dingo (talk) 20:11, 6 June 2023 (UTC)[reply]
@SaryaniPaschtorr It's not an error of native grammar - I am a native speaker and I made the entry. Mistakes leak through, especially when you're adding thousands of bits a day, like I am. Vininn126 (talk) 20:14, 6 June 2023 (UTC)[reply]

Creation of wiktionary page for "dignayan"

I was creating a page for the Tagalog translation for "geology" but for some reason, the page I was making was automatically tagged as "various spamming behavior" which I find odd. This was flagged in error. Paul Christian B. Yang-ed (talk) 11:56, 2 June 2023 (UTC)[reply]

@Paul Christian B. Yang-ed: Creating a page with links to sites like Facebook is in fact a common thing that spammers do. An abuse filter can't tell good Facebook links from bad ones, and 99 percent of the time the links are bad ones. Leave out the link and you should have no problem. Also, consider whether something on Facebook is going to still be there 10 years from now. There are details in pages like this that no one ever gets around to checking after they're added, so it's best to err on the side of stability. Chuck Entz (talk) 14:39, 2 June 2023 (UTC)[reply]
Well, unfortunately, the National Museum of the Philippines (which is the source of my translation) does not have a web version of the source page I am trying to cite. They only posted it in their official social media account. How can I get around this problem? Paul Christian B. Yang-ed (talk) 14:50, 2 June 2023 (UTC)[reply]
Cite it without a link, find their source and cite it, find another source, or find three uses in durably archived media and don't cite any authority. DCDuring (talk) 15:05, 2 June 2023 (UTC)[reply]

Request for Verification of Quotations

Is it intended that {{rfv-quote}} simply display a note and add the page to category "Requests for quotations in LANG"? (That is the function of {{rfquote}}.) I would expect at least a separate category for challenges, if not a dedicated request page. --RichardW57m (talk) 12:03, 2 June 2023 (UTC)[reply]

@RichardW57 Yeah this is used on all of three pages but I made it also add to 'Requests for verification in LANG entries'. There don't seem to be enough uses at this point to warrant a dedicated request page. BTW why do you have two accounts (one of which appears to be for mobile, one for non-mobile), and which one should I ping? Benwing2 (talk) 19:12, 2 June 2023 (UTC)[reply]
I see, the m one appears to be for editing on non-trusted machines. Just curious, can you explain how this increases security? Benwing2 (talk) 19:14, 2 June 2023 (UTC)[reply]
If it gets hijacked, I just jettison it. Recent changes by RichardW57m would then need scrutiny for maliciousness. As to effectiveness, it's implicitly recommended at w:WP:SECURESOCK. If I ever get privileges, then they should probably be restricted to RichardW57. I have used RichardW57m for mobile, but that's just because the password for RichardW57 is so secure that it's difficult to access. --RichardW57 (talk) 22:37, 2 June 2023 (UTC)[reply]
I overlooked that category. But 'Requests for verification in LANG entries' is for challenged entries, rather than challenged quotations. It's a slightly different category of users who would respond. --RichardW57 (talk) 22:46, 2 June 2023 (UTC)[reply]
@RichardW57 Maybe yes, theoretically, but in practice given that there are only 3 pages across all languages using {{rfv-quote}}, it wouldn't make sense at this point to have separate language-specific (or IMO even combined) pages for verification of quotes. Maybe we could rename 'Requests for verification in LANG entries', but IMO the name is fine; it refers generally to verification of any sort in the entry, not necessarily to the term itself (that would be 'Requests for verification of LANG terms'). Benwing2 (talk) 06:11, 3 June 2023 (UTC)[reply]

Had to try to add comment about vandalism multiple times until it was allowed.

The comment was not vandalism but it was banned as "probably vandalism". My solution was to add zero width spacer everywhere 2001:818:E801:C000:C077:4EB3:8FB3:EDC7 17:30, 2 June 2023 (UTC)[reply]

Unhelpful comment removed. Obviously we don't want people to vandalise. Equinox 17:31, 2 June 2023 (UTC)[reply]

User:Theknightwho vandalising Japanese modules again

(Not Grease pit, moved to Beer parlour.) -- Huhu9001 (talk) 01:38, 4 June 2023 (UTC)[reply]

This was not vandalism, for the record. Theknightwho (talk) 02:06, 4 June 2023 (UTC)[reply]

I propose we bring this back but as a categorizing etymology template that contrasts with {{uncertain}} and {{unknown}} and that we bot replace instances of it in etymologies. Thoughts? Vininn126 (talk) 13:11, 4 June 2023 (UTC)

@Vininn126 Where is the line between disputed and uncertain? They kind of go hand in hand; almost all uncertain etymologies that aren't simply unknown are that way because there's a dispute. Benwing2 (talk) 19:04, 4 June 2023 (UTC)[reply]
@Benwing2 I was hoping specifically you'd reply with this exact question! (is that weird?) The difference is: uncertain etymologies are when a given etymologist says "I'm not sure, but:" vs "X etymologist is wrong" etc. Vininn126 (talk) 19:05, 4 June 2023 (UTC)[reply]
@Vininn126 What I mean is that in practice, most etymologies that are "uncertain" but not unknown list several possibilities proposed by different sources; to me this necessarily suggests there's a dispute of some sort. My main concern with introducing this template is that it will simply result in splitting the "uncertain" etymologies in a fairly random fashion; maybe we should do a bot addition of the {{uncertain}} template to any etymology containing the word "Disputed" at the beginning. Benwing2 (talk) 19:23, 4 June 2023 (UTC)[reply]
@Benwing2 I disagree with that notion. Oftentimes you see many etymologists saying "uncertain, but I think this" vs "I think X etymologist is wrong". Both are uncertain but in different ways. Vininn126 (talk) 19:24, 4 June 2023 (UTC)[reply]
I was more familiar with {{unknown}} than with {{uncertain}}. It looks like they put words into the same category of "Terms with unknown etymologies". They differ in the displayed text (which can be overridden anyway); are there any other important differences? I'm not sure I get the use of having separate templates even for these two. Is the idea that using different templates theoretically allows for them to be treated differently in the future, or by bots? Although the guidance on when to use each seems fairly clear, I can think of some edge cases, and I'm not sure I've used them correctly in the past. In particular, what is the right thing to do when some old source offers a hypothetical derivation, but newer, more reliable sources implicitly or explicitly reject that derivation and offer no new hypothesis to replace it? An example I touched recently that comes to my mind is arbutus, where a 19th-century source suggests an etymology, but two 20th-century sources say that the etymology is unknown. I guess that in this case {{uncertain}} is correct, but I don't see how meaningful the distinction is between this and a case where we happen not to have an unreliable etymology suggested by an earlier source. (I guess by mentioning L&S's suggestion, you could argue that the entry implicitly suggests that it is possibly correct.) For now, I agree with Benwing2 that it doesn't seem like a third contrasting template is needed, but I feel like I need to understand better what the point of these distinct templates is.--Urszag (talk) 03:08, 5 June 2023 (UTC)[reply]
@Urszag {{unknown}} and {{uncertain}} categorize differently. Uncertain is used when a source that we decide to use says "I'm not sure, but I think X", as opposed to just "I don't know." Vininn126 (talk) 09:10, 5 June 2023 (UTC)
In what sense do they categorize differently? Both of the templates' examples end by saying that they "automatically categorise the page in Category:English terms with unknown etymologies". There is no Category:English_terms_with_uncertain_etymologies (although I wonder why there isn't, if the difference between these is considered to be clear and important). In the case of arbutus, Lewis and Short (1879) says "kindr. with arbor" (without expressing uncertainty about that) while Ernout and Meillet (1985) and Schrijver (1991) say there is no certain etymology (without offering suggestions about possible etymologies). So none of the cited sources here says "I'm not sure, but I think X".--Urszag (talk) 09:19, 5 June 2023 (UTC)
On balance, I think this is probably splitting hairs more finely than it makes sense to, as the others above have said. - -sche (discuss) 20:05, 5 June 2023 (UTC)
@Urszag I think when I proposed changing {{uncertain}} to its current definition, I suggested renaming 'terms with unknown etymologies' to 'terms with unknown or uncertain etymologies'. That can still be done without too much work. An alternative as you note is to have two separate categories. I'm not sure which is better, User:Vininn126 what do you think? Benwing2 (talk) 20:12, 10 June 2023 (UTC)[reply]
Interesting, thanks for informing me of the context! I'm a bit less confused now about why I didn't know about the distinction between {{unknown}} vs. {{uncertain}} in etymology sections given that it apparently was implemented in January of this year. I agree with what Al-Muqanna said: "The problem with splitting them rigorously is that I think the intuitive distinction is less clear-cut—IMO it's useful to call something where there are conjectures, but they're relatively weak and/or there are a lot of them, "unknown" rather than just "uncertain" (e.g. Latin -ensis)". It still seems to me like {{uncertain}} is functionally equivalent to {{unknown|title=Uncertain}}, so I don't think the template is really necessary, but I guess it's fine as a kind of shortcut (although on balance I think I would prefer for it not to exist). I don't think it's either straightforward or worthwhile to try to maintain a strict separation, so I would not like for them to be split into two separate categories. I do think it would be less confusing to rename the category to 'terms with unknown or uncertain etymologies'. I see -sche also agreed at that time that this name would be preferable.--Urszag (talk) 00:27, 11 June 2023 (UTC)[reply]
@Benwing2 I was under the impression there were different categories already! Which is why I was considering a third (and probably last) one. Vininn126 (talk) 07:17, 11 June 2023 (UTC)[reply]

{{vern}} on mobile

Does {{vern}} display its red error message only on mobile, or is it something I've done with CSS? I use a highly customized CSS, but nothing in my code seems like it would suppress an error message, and the error message still didn't appear when I used the URL to switch to a different skin, Monobook.

At the page https://en.wiktionary.org/w/index.php?title=apple_pear&oldid=72892933 I see no error message, but if I use a mobile device it gives me https://en.m.wiktionary.org/w/index.php?title=apple_pear&oldid=72892933 , which displays a red error message next to the word papple. Was the display of the error message recently turned off for desktop users? (It is rather unnecessary, I think, especially considering that the link to papple works perfectly even in the supposedly wrong diff of the page.) If so, should it also be turned off for mobile users? Thanks, Soap 19:57, 4 June 2023 (UTC)[reply]

It's not really an error message. It's designed to tell someone who's editing the entry that the {{vern}} template is no longer necessary and can be replaced by plain wikitext (it also adds a maintenance category, but that's also hidden). Because the template itself can't tell the difference between preview in edit mode and regular display, it uses a CSS class that is supposed to be invisible except in edit mode. That way readers aren't bothered with ugly red text. The problem here is that the CSS for mobile apparently doesn't have the code necessary for it to be invisible. @Erutuon. Chuck Entz (talk) 20:34, 4 June 2023 (UTC)[reply]
This should fix it. — Eru·tuon 19:00, 5 June 2023 (UTC)[reply]
Still seems to work as I need it to in edit preview desktop. Thanks. DCDuring (talk) 19:37, 5 June 2023 (UTC)[reply]

Buggy citations button

Recently, I've been seeing this buggy citations button below the title, though it appears only sometimes and disappears after refreshing the page. I'm using the Vector (2022) skin. Has anyone had the same problem? Tc14Hd (talk) 15:17, 5 June 2023 (UTC)

Acceleration for female equivalents

I notice that Module:accel/es (and /pt, /fr) has "special handling for female equivalents" that converts {{head|es|noun|g=f}} to {{es-noun|f}}. I would like the same for Catalan and Italian, but before I create those submodules, I'm wondering if that code belongs in a more centralized module since it can be used by so many languages. Ultimateria (talk) 16:28, 6 June 2023 (UTC)

@Ultimateria The code needs to make reference to a language-specific template (e.g. {{es-noun}}, {{fr-noun}}, {{pt-noun}}) and it's only 3 lines or so, so it should probably stay in the specific submodules. (Generally the language-specific templates work similarly — and in cases where they don't, I have been working to harmonize them — but they may not all work this way.) Benwing2 (talk) 04:32, 7 June 2023 (UTC)[reply]
@Benwing2: And @Theknightwho has been deharmonising them to avoid proliferation! Where have you two shared your consultation on the matter with the rest of us? --RichardW57m (talk) 13:51, 7 June 2023 (UTC)
As noted in WT:Grease pit/2023/June#Selective Forcing and Disabling of Transliteration below, {{head|LANG|noun}} and {{LANG-noun}} are not always equivalent. --RichardW57m (talk) 13:51, 7 June 2023 (UTC)[reply]
@RichardW57m What? I don’t understand what your comment refers to. Theknightwho (talk) 16:55, 7 June 2023 (UTC)[reply]
If you've replied to the wrong paragraph, I'm referring to your breaking the pattern by replacing {{ryu-noun}} by {{ryu-head}}, for which you subsequently provided a justification. --RichardW57 (talk) 18:37, 7 June 2023 (UTC)[reply]
@RichardW57 That was specific to Ryukyuan languages only. Theknightwho (talk) 18:47, 7 June 2023 (UTC)[reply]
@Benwing2: Thanks, I've gone ahead and created those two modules. Ultimateria (talk) 01:35, 8 June 2023 (UTC)[reply]

Selective Forcing and Disabling of Transliteration

In Pali headword lines, we (to be precise, @Benwing2) want to suppress transliteration when it is the same as the standard Roman script form, which is displayed by {{pi-sc}} at the start of the sense definitions. For most lemmas, the display is suppressed by default. One method is that {{head|pi|noun|sccat=1}} and {{pi-noun}} are not equivalent. In the absence of explicit transliteration or its explicit suppression, the former provides default transliteration and the latter suppresses transliteration. I'm very slowly reviewing (in User:RichardW57/WIP#Pali Headword Templates) the mechanisms to control the use of the default transliteration (forcing by explicit override is undesirable, but it is the main populator of category:Terms with redundant transliterations/pi), and I fear @Theknightwho reasonably condemns the method that currently works best where it will work, which is documented so far at {{pi-verb form}}.

One method that seems intuitive is to generalise the use of '-' to suppress transliteration to use '+' to force the display of the default transliteration. How acceptable would users find this? At present, supplying '+' as a transliteration results in its being accepted as the literal transliteration to use. I could implement it just in Module:pi-headword and then make all Pali headword templates invoke the module, but that would be potentially confusing. --RichardW57m (talk) 13:47, 7 June 2023 (UTC)[reply]

@RichardW57 I have used the + code for similar uses in the past so I don't see that it would be an issue. However I think it would be better for this to somehow be incorporated into the general code rather than implemented only in Module:pi-headword. Maybe User:Theknightwho can comment. Benwing2 (talk) 20:07, 10 June 2023 (UTC)[reply]
@Benwing2, Theknightwho: In the absence of further comments, I've started implementing it for Pali headword templates and module. (Not all the templates use the module - it doesn't always help and sometimes hinders.) These changes will not interfere with incorporation of the convention in general code, though they will stop '+' reaching general code. The Pali suppression logic and conversion will go both in the module and in new template {{pi-hwtr}} and its invocations. I can then start removing redundant input of transliterations. --RichardW57m (talk) 12:09, 28 June 2023 (UTC)[reply]
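The '-' and '+' convention proposed in this thread can be summarised in a few lines. The Python sketch below is only an illustration of the decision logic as described here; the function name resolve_translit and the show_default flag (standing in for the difference between {{head}} and {{pi-noun}}) are assumptions, and the real implementation lives in Lua in Module:pi-headword and the templates.

```python
def resolve_translit(explicit, default, show_default=True):
    """Return the transliteration to display, or None to display nothing.

    explicit     -- the editor-supplied value, or None/"" if absent
    default      -- the automatically generated transliteration
    show_default -- whether the template shows the default when nothing
                    is supplied (True models {{head}}, False {{pi-noun}})
    """
    if explicit == "-":
        return None      # existing convention: suppress transliteration
    if explicit == "+":
        return default   # proposed convention: force the default
    if explicit:
        return explicit  # a literal, manually supplied transliteration
    return default if show_default else None
```

With this arrangement an editor who wants the default shown never has to retype it, which is what would otherwise populate the redundant-transliteration category.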

Template:quote-newsgroup: trans-title also translates newsgroup

E.g., {{quote-newsgroup|fr|newsgroup=newsgroup|title=title|trans-title=title translation|year=2000}} results in:

2000, “title”, in newsgroup (Usenet):

J3133 (talk) 16:51, 7 June 2023 (UTC)[reply]

I have fixed this. J3133 (talk) 13:31, 9 June 2023 (UTC)[reply]

NEC still uses trans-mid

While creating terminal strip using NEC I got this error:
The {{trans-mid}} template is obsolete and should no longer be added to entries. Please remove it before clicking "Publish changes" again. Thanks for your contributions!
See User:Yair rand/newentrywiz.js#L-875. — Alexis Jazz (talk) 19:10, 8 June 2023 (UTC)[reply]

@Alexis Jazz Should be fixed. Benwing2 (talk) 20:10, 10 June 2023 (UTC)[reply]

What to do with thousands of declension tables on the right side?

As I read in Wiktionary:Style guide#Right-hand side, it is discouraged to place declension tables on the right side of the page instead of a regular section. However, this is exactly how Georgian adjective declension tables are placed, for example: შემდეგი. {{ka-adj}} has almost 3000 transclusions. What should be done to align the practice with the guide? Extract the declension table from {{ka-adj}} and massively add the declension section with it to the pages with {{ka-adj}}? Gradilion (talk) 20:27, 8 June 2023 (UTC)

@Gradilion Definitely not what you're suggesting. We'd just need to change the module code to not right-align, but it would be good to get some comments from regular Georgian editors, if there are any. Benwing2 (talk) 20:03, 10 June 2023 (UTC)[reply]
If we just remove the right alignment, the section order would be wrong: Declension should go after the definitions. {{ka-adj}} is a template that combines a headword line with a declension table. Gradilion (talk) 13:15, 11 June 2023 (UTC)[reply]
@Gradilion It sounds like the most sensible thing would be to create a new declension template and to put it in a new declension section on each page. I assume that's what you were suggesting? This could be done with a bot job. Theknightwho (talk) 13:59, 11 June 2023 (UTC)[reply]

Disabling Thai line-breaking in inflection tables

Is there any good way of disabling Thai line-breaking in inflection tables? I've just added the future system to Pali เท⁠เส⁠ติ (de⁠se⁠ti), perhaps excessively including the middle voice and conditional, and the inflection table display on Microsoft Edge is now vomitous, with words being unnecessarily split in the inflection table.

At present, I use the usual link generation method for the word forms in the abugidic Thai script writing method, but have to pass the transliteration in for the alphabetic Thai script writing method.

One brute force method would be to saturate the forms with word joiners (U+2060), though there is a possibility that Microsoft might simply ignore the character for Thai line-breaking, but it worked with the linked Thai script word above. I think the modified forms would have to be passed in as an 'alt' form so that links would still work.

A seemingly hard method would be to make the units of the display word plus transliteration, though I can imagine ways even that would fail.

A potential method would be to use <nobr>, but I don't know how that works when words do have to be broken - Pali has some rather long inflected words - I don't think anupādāaparitassanā is the worst of them. Also, I'm not sure how to wrap it round the formatted inflected word.

Ideally, the line-breaking system would know that Thai line-breaking was inappropriate for Pali, but that day still seems a long way off.

So, is there a better way than the brute force method? --RichardW57m (talk) 12:18, 9 June 2023 (UTC)[reply]

Note: Currently, sequences of word plus transliteration are joined together to make markup-rich strings, which are then arranged in the inflection table. --RichardW57m (talk) 12:29, 9 June 2023 (UTC)[reply]
One way of making the units of display be word and transliteration is to force line breaks between the units using HTML tag <br/>. It's a vast improvement in the display of Pali เท⁠เส⁠ติ, but it will not be an improvement for short Roman script words with few alternatives. RichardW57m (talk) 12:42, 9 June 2023 (UTC)[reply]
@RichardW57 I'm afraid you might have to figure this out yourself; I for one don't know much about Thai, Thai script or Pali and I don't know that there's anyone else who can help either. IMO inserting line breaks in the module output using <br /> is not a bad idea, and you can always check the length of the Roman script and not insert line breaks for short words. Benwing2 (talk) 19:55, 10 June 2023 (UTC)[reply]
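Two of the approaches mentioned in this thread can be sketched briefly: saturating a form with word joiners (U+2060), and forcing line breaks between word-plus-transliteration units only when the Roman-script text is long. The Python below is a hedged illustration, not the module's code; the function names, the 12-character threshold and the "form (transliteration)" layout are assumptions. One detail added here that the thread does not mention: the joiner is not placed in front of combining marks, on the assumption that separating a Thai vowel or tone mark from its base would harm rendering.

```python
import unicodedata

WORD_JOINER = "\u2060"


def add_word_joiners(word):
    """Insert U+2060 between characters, but never before a combining mark."""
    out = []
    for i, ch in enumerate(word):
        if i > 0 and not unicodedata.category(ch).startswith("M"):
            out.append(WORD_JOINER)
        out.append(ch)
    return "".join(out)


def join_units(units, max_inline_len=12):
    """Join (form, translit) pairs into one cell of an inflection table.

    A <br/> is forced between units only when some transliteration is
    longer than max_inline_len; short alternatives stay on one line.
    """
    any_long = any(len(translit) > max_inline_len for _, translit in units)
    separator = "<br/>" if any_long else ", "
    return separator.join(f"{form} ({translit})" for form, translit in units)
```

As noted above, the joined string would presumably have to be passed as the display ('alt') form so that the link target remains the unmodified word.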

QQ, quote-book and Google's language codes

When formatting a Google Books quote, WT:QQ pulls language codes from ... somewhere, presumably Google, to pass them to T:quote-book. But Google(?) uses some codes which are different from ours, e.g. "un" for our "und", which causes quote-book to resolve to nothing but a big red error. Can we add a function to QQ where when it's pulling the codes from wherever, it looks at a table (which maps Google-specific codes to Wiktionary codes) to determine if it needs to modify the value it pulled from Google (e.g. "un" to "und" or "en-GB" to "en")? - -sche (discuss) 03:20, 11 June 2023 (UTC)

@-sche Line 52 of MediaWiki:Gadget-QQ.js. The only thing it seems to do is replace 'iw' with 'he', otherwise keeping the source language code. This should use a table of mappings, yes. If you could supply this table, maybe User:Erutuon can add this mapping; I might be able to do it otherwise although I would have some issues testing it as I don't know how Quiet Quentin works. Benwing2 (talk) 03:57, 11 June 2023 (UTC)[reply]
I'm not sure I understand what that line is doing (what's "iw"? what is the code doing to "iw"?).
Looking through Google lists of codes like this and this, I spot several codes which I recall appearing (and causing issues) in QQ results: pt-BR (should just be pt), pt-PT (likewise), zh-CN, zh-TW, hr and sr (should be sh), en-GB (to which I can add en-US and un, although they don't appear in the aforelinked lists). IMO QQ should check if the code it's pulled from Google is in the table, and if it's not, then use the Google-supplied code as-is (so we don't have to list all the codes that don't change, which is most of them). - -sche (discuss) 06:59, 11 June 2023 (UTC)
@-sche Right, your suggestion is how it will be implemented. The line is replacing the first instance of the string or substring 'iw' occurring anywhere in the language code with 'he'. I have no idea what 'iw' is; 'he' sounds like Hebrew but in that context 'iw' rings no bells. We could potentially write it to cut off any country code appended onto a language code, or change the {{quote-*}} methods to accept an etymology-only language (which I'll probably do anyway); I know e.g. 'en-US' is an etymology-only variant of 'en'. Benwing2 (talk) 06:36, 12 June 2023 (UTC)[reply]
The code 'iw' is the ISO 639:1988 code for Modern Hebrew. --RichardW57 (talk) 07:04, 12 June 2023 (UTC)[reply]
For what it's worth, Google's detection for zh-CN and zh-TW is very bad, and it lacks zh-HK. We should simply change them to zh; the quote templates already need to do script detection and there's no practical reason to use the country-specific codes. – Wpi (talk) 09:30, 12 June 2023 (UTC)[reply]
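The table lookup agreed on in this thread amounts to a dictionary with a pass-through default. The sketch below is written in Python purely for illustration (the gadget itself is JavaScript); the names GOOGLE_TO_WIKTIONARY and normalize_lang_code are hypothetical, and the table holds only the codes named above, with zh-CN and zh-TW mapped to zh as Wpi suggests.

```python
# Google-specific code -> Wiktionary code. Only codes mentioned in the
# thread are listed; anything absent is passed through unchanged.
GOOGLE_TO_WIKTIONARY = {
    "iw": "he",     # old ISO 639 code for Hebrew
    "un": "und",
    "pt-BR": "pt",
    "pt-PT": "pt",
    "zh-CN": "zh",  # per Wpi's suggestion
    "zh-TW": "zh",  # per Wpi's suggestion
    "hr": "sh",
    "sr": "sh",
    "en-GB": "en",
    "en-US": "en",
}


def normalize_lang_code(google_code):
    """Map a Google-supplied code to a Wiktionary code, else keep it as-is."""
    return GOOGLE_TO_WIKTIONARY.get(google_code, google_code)
```

An exact-match lookup like this also avoids the side effect of the current substring replacement, which would rewrite 'iw' wherever it occurred inside a longer code.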

J3133 (talk) 09:51, 12 June 2023 (UTC)[reply]

@J3133 Is this a request to delete the category? Such categories end up in CAT:Empty categories and get deleted over time (whenever I run the script to delete them). Benwing2 (talk) 03:46, 13 June 2023 (UTC)[reply]
@Benwing2: It was not empty before, but the entries seem to not be categorized in it automatically any more. J3133 (talk) 03:54, 13 June 2023 (UTC)[reply]
@Theknightwho: Re Special:Diff/75731560: I do not know how to automate this category. J3133 (talk) 09:51, 20 August 2023 (UTC)[reply]
@J3133 I’m on mobile right now, but I think we decided not to have hyphen/dash categories. This hasn’t been populated automatically in ages, in any event. @Benwing2 Do you know more? Theknightwho (talk) 10:40, 20 August 2023 (UTC)[reply]
@Theknightwho @J3133 I am not sure, unfortunately, but I completely agree with @Theknightwho that we should not add such categories manually. Either they should or shouldn't be populated, and if they should be populated it needs to be done automatically as otherwise it will be unmaintainable and incomplete. My view is that a "terms spelled with hyphen" category makes no sense whatsoever for English because there are a zillion such terms, and it's probably the same for a "terms spelled with en-dash" since any term involving two joined names can be written with either a hyphen or an en-dash. "Terms spelled with X" should be reserved for rare or relatively rare characters. Benwing2 (talk) 20:56, 20 August 2023 (UTC)[reply]

Ignore . , / in English category sorting

We already ignore spaces, apostrophes, and hyphens. J3133 (talk) 09:51, 12 June 2023 (UTC)[reply]

Also, the problem with (manual) categories that are not placed in templates ({{lb}}, {{C}}, {{cln}}, etc.; i.e., [[Category:…]]) is that they do not sort correctly. J3133 (talk) 10:08, 12 June 2023 (UTC)[reply]

We should aim to eliminate all uses of manual categories wherever possible, for this reason. Theknightwho (talk) 14:37, 12 June 2023 (UTC)[reply]
Yes, I agree. There is no way to automatically make manually-specified categories sort correctly without manually giving the sort key. I have a script to templatize manual categories and I have run it for various languages. Benwing2 (talk) 17:17, 12 June 2023 (UTC)[reply]
I'm not entirely sure what the suggestion is here. I currently rely on the difference in how, for example, "[[Category:English quotation templates|Shelley Adonais]]" and "[[Category:English quotation templates|Shelley, Mary Frankenstein]]" are sorted to ensure that works by Mary Shelley are grouped together after Percy Bysshe Shelley's works, and not interspersed among them (which would be the case if the sorting key of the latter was "Shelley Mary Frankenstein"). — Sgconlaw (talk) 18:28, 12 June 2023 (UTC)[reply]
@Sgconlaw: My suggestion is a follow-up to Wiktionary:Beer parlour/2023/April § Should spaces and punctuation be ignored in column template sorting?, which, as I mentioned, was implemented for spaces, apostrophes, and hyphens. For consistency, the ones I listed should also be ignored. E.g., aca/fan and aca-fan should be listed next to each other. J3133 (talk) 21:17, 12 June 2023 (UTC)[reply]
@J3133: so does your suggestion affect the use of commas in sorting keys in categories? — Sgconlaw (talk) 21:35, 12 June 2023 (UTC)[reply]
@Sgconlaw: Manual sorting keys/manual categories are not affected. There is consensus for ignoring punctuation (which commas are). J3133 (talk) 21:38, 12 June 2023 (UTC)[reply]
@J3133: I see. Thanks. — Sgconlaw (talk) 21:41, 12 June 2023 (UTC)[reply]
@Sgconlaw Nothing here would affect the sorting of templates, as those are outside of mainspace. Theknightwho (talk) 02:21, 13 June 2023 (UTC)[reply]
@Sgconlaw, J3133 Are you sure manual sort keys in templates like {{C}} and {{cln}} are unaffected? If it's only manual sort keys in manual categories, I feel uneasy having such a difference between the two as I regularly convert the latter to the former. Benwing2 (talk) 03:45, 13 June 2023 (UTC)[reply]
They would only be affected if the template overrides the manual sort keys. @Theknightwho: Is this correct? J3133 (talk) 03:54, 13 June 2023 (UTC)[reply]
@J3133 @Benwing2 Correct, but I would prefer that we change over to a "sort base" system: i.e. where the sort= parameter determines what goes into the sortkey algorithm, rather than overriding it entirely. At the moment, manual sortkeys tend to be a disorganised mess, because they use various ad hoc methods to achieve a particular sort order that may not work well with the automatically generated keys of other entries. Theknightwho (talk) 03:58, 13 June 2023 (UTC)[reply]
@J3133, Theknightwho I am inclined to agree with you, although I'd like to see a proposal with a bit more detail. Benwing2 (talk) 07:10, 15 June 2023 (UTC)[reply]
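
To make the two ideas in this thread concrete, here is a minimal Python sketch, not the real sortkey module. The names `IGNORED_CHARS`, `make_sortkey` and `category_sortkey` are hypothetical, and lowercasing is an assumption made only so the example is self-contained. The first function drops the characters under discussion; the second shows the "sort base" idea, where a `sort=` value is fed into the same algorithm instead of bypassing it.

```python
import re
from typing import Optional

# Space, apostrophe and hyphen are already ignored according to the thread;
# ".", "," and "/" are the additions proposed here.
IGNORED_CHARS = re.compile(r"[ '\-.,/]")


def make_sortkey(sort_base: str) -> str:
    """Build an automatic sort key by dropping the ignored characters."""
    return IGNORED_CHARS.sub("", sort_base).casefold()


def category_sortkey(pagename: str, sort: Optional[str] = None) -> str:
    """'Sort base' variant: sort= replaces the input, not the algorithm."""
    base = pagename if sort is None else sort
    return make_sortkey(base)


entries = ["acb", "aca/fan", "aca-fan"]
ordered = sorted(entries, key=category_sortkey)
```

Here `aca/fan` and `aca-fan` get the same key and so end up next to each other. Under a sort-base system a key such as "Shelley, Mary Frankenstein" would also lose its comma, which is the situation Sgconlaw asks about above; as stated in the thread, manual sort keys are not affected by the current proposal.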

{{surf}} and non-affix-based word-formation

It would be nice if {{surface analysis}} could handle things like {{deverbal}} and the like. I hope that wouldn't be too hard of a change but I suspect it might be. Vininn126 (talk) 10:49, 12 June 2023 (UTC)[reply]

@Vininn126 It depends on how general you want it. Currently {{surf}} is just a simple wrapper around {{affix}}. The tricky thing about the various other etymology templates is they don't all work exactly the same. What would be the proper syntax for e.g. a surface analysis involving {{deverbal}}? Benwing2 (talk) 17:19, 12 June 2023 (UTC)[reply]
@Benwing2 These are all questions I've been wondering as well. I'm wondering if we shouldn't include most, even {{ellipsis}} or {{clipping}}. @Catonif suggested making the template essentially just link to the glossary and then we could type something like {{surf}}, {{deverbal|pl|term|nocap=1}}. This would be at least more convenient than typing By {{glossary|surface analysis}}, a {{deverbal|pl|term|nocap=1}}., which I have done on some entries. If there's a way to set it so that maybe parameter 2 calls a specific template, which then determines other parameters, maybe that would be better. Vininn126 (talk) 18:05, 12 June 2023 (UTC)[reply]
Yes this is possible although for backward compatibility's sake we would either need to prefix template names in param 2 with a special char, or create a separate template ({{surf+}}???) that takes the template name as param 2. Benwing2 (talk) 20:56, 12 June 2023 (UTC)[reply]
As a proof of concept (and because I wasn't reading the conversation, just heard the ping) it works (tests). Catonif (talk) 21:27, 12 June 2023 (UTC)[reply]
Whatever works easiest technically. Part of me wants to raise the issue of changing the name, but I remember how that went last time... Vininn126 (talk) 05:59, 13 June 2023 (UTC)[reply]
@Benwing2 I suppose we can use {{surf+}} for now. Vininn126 (talk) 09:07, 15 June 2023 (UTC)[reply]
@Benwing2 If there aren't any objections or problems with the code I linked earlier, I'd move that to the mainspace into a template {{surf+}}. Catonif (talk) 21:09, 27 June 2023 (UTC)[reply]
@Catonif I haven't had a chance to take a look, give me a day to look at it. I think I'd actually prefer that we use a special char along with {{surf}}, something like this: {{surf|it|+deverbal|amare}} although it depends on what User:Vininn126 prefers. Benwing2 (talk) 21:13, 27 June 2023 (UTC)[reply]
@Benwing2 I don't really care, to be honest. Vininn126 (talk) 21:14, 27 June 2023 (UTC)[reply]
@Benwing2 Any update? Vininn126 (talk) 17:16, 2 July 2023 (UTC)[reply]
@Vininn126 Taking a look now. Benwing2 (talk) 18:17, 2 July 2023 (UTC)[reply]
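
The special-character option mentioned above, as in {{surf|it|+deverbal|amare}}, amounts to a small dispatch step on the second parameter. The following Python sketch only illustrates that logic; the function name `parse_surf_args` is hypothetical and the real template would be written in wikitext or Lua. It assumes, as stated in the thread, that a plain {{surf}} call wraps {{affix}}.

```python
from typing import List, Tuple


def parse_surf_args(args: List[str]) -> Tuple[str, str, List[str]]:
    """Split positional args into (language, target template, remaining args)."""
    lang, *rest = args
    if rest and rest[0].startswith("+"):
        # "+deverbal" selects the named template and is dropped from the args.
        return lang, rest[0][1:], rest[1:]
    # No prefix: existing calls keep behaving as a wrapper around affix.
    return lang, "affix", rest
```

Because unprefixed calls fall through to the old behaviour, existing uses would not need to change, which is the backward-compatibility concern raised above.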

completely failing to understand the syntax of quotation templates

I'm trying to design two quotation templates that would work like this:

  • {{RQ:Mishneh Torah|Book|Section|Chapter|Halakha}} — for Hebrew quotes
  • {{RQ:Targum|Source|Book|Chapter|Verse}} — for Aramaic quotes, but more general and versatile than {{RQ:Onkelos}}

I'd like them to work similarly to the way that the current {{RQ:Jewish Oral Law}} and {{RQ:Tanach}} quotation templates work, where you can insert numbers that reflect a list of parameters, and the text itself would use {{ux}} instead. I'd like to fill out all the data entry myself. I just can't wrap my head around the syntax of those pages! Does anyone have any advice for how to make things like that? And would my user page be the appropriate place to sandbox it, or is there another place to practice? I'm sorry if this is a really basic issue, I have a lot of trouble with programming in general but I think both of these would be useful templates for the community to have. Isaacmayer9 (talk) 13:26, 12 June 2023 (UTC)[reply]

@Isaacmayer9 You can maybe ask for help from User:Sgconlaw, who is the quotation template expert around here. In general, template syntax is not very user-friendly, unfortunately; Lua code is often easier to make sense of if you are familiar with programming. In terms of testing such templates, what I normally do is put them in user space, so that e.g. instead of Template:RQ:Targum you'd create User:Isaacmayer9/RQ:Targum with the same definition, and then you can call it like this: {{User:Isaacmayer9/RQ:Targum|...}}. Once you're satisfied with the results, you'd copy it to Template:RQ:Targum. Benwing2 (talk) 17:25, 12 June 2023 (UTC)[reply]
@Isaacmayer9: I've left a message on your user talk page. — Sgconlaw (talk) 18:30, 12 June 2023 (UTC)[reply]
Quotations should use {{quote}}, not {{ux}}. --RichardW57m (talk) 09:22, 13 June 2023 (UTC)[reply]
@Isaacmayer9, Sgconlaw I believe we typically use {{Q}} for quotes in classical languages. I'm not sure if these works would fit perfectly to what this template does. However, if we are not referring to a specific edition but rather the "text" at large, {{Q}} might do the trick. This, that and the other (talk) 11:16, 13 June 2023 (UTC)[reply]
@This, that and the other: ah, well if {{Q}} is to be used then someone else will need to assist as I’m unfamiliar with that template and with Lua. — Sgconlaw (talk) 13:10, 13 June 2023 (UTC)[reply]

Multiple etymology entries: a proposal

I noticed we don't have categories for (English?) entries with multiple etymologies, which I think can have a decent educational value and is quite easy to create and automate. I just wanted to test the waters first, lest I'm missing anything. Assem Khidhr (talk) 16:39, 12 June 2023 (UTC)[reply]

@Assem Khidhr The reason for this is that it's not so easy to implement automatically; you'd need to add page scraping to Module:headword, which implements the generation of headword lines, and that may cause large pages to go over their limits. The only other way is to have a bot script running frequently to automatically add the appropriate categories, and we don't really have anyone running such scripts now. Benwing2 (talk) 17:29, 12 June 2023 (UTC)[reply]
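
For the bot-script route, the detection step itself is simple. Below is a rough Python sketch of what such a script might check on a page's wikitext; it is not an existing bot, and the function name `languages_with_multiple_etymologies` is made up for illustration. It assumes the usual entry layout, with level-2 language headings and numbered level-3 "Etymology 1", "Etymology 2" headings, and it does not cover fetching pages or saving edits.

```python
import re
from typing import List

# "==English==" style language headings (exactly two equals signs).
LANG_HEADING = re.compile(r"^==([^=].*?)==\s*$", re.M)
# "===Etymology 1===" style numbered etymology headings.
ETYM_HEADING = re.compile(r"^===\s*Etymology \d+\s*===\s*$", re.M)


def languages_with_multiple_etymologies(wikitext: str) -> List[str]:
    """Return the language sections that have two or more numbered etymologies."""
    parts = LANG_HEADING.split(wikitext)
    # parts is [preamble, lang1, body1, lang2, body2, ...]
    found = []
    for lang, body in zip(parts[1::2], parts[2::2]):
        if len(ETYM_HEADING.findall(body)) >= 2:
            found.append(lang.strip())
    return found
```

A script like this would have to be run over a dump or page list on a schedule, which is the maintenance cost mentioned above.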

Cite-book is not displaying consistently

Maybe I'm just dumb here, but I've been adding quotations using {{cite-book}}. On lechuga iceberg, you can see how I have a date in ISO format and it displays correctly as "2011 February 8" at the beginning of the line. On suferia, I have added three instances of the template, also using ISO dates and none of them display. What is happening here? Is there something that I'm doing wrong? I don't understand how this is even possible. —Justin (koavf)TCM 02:13, 13 June 2023 (UTC)
I am dumb: {{quote-book}} is not {{cite-book}}. —Justin (koavf)TCM 02:22, 13 June 2023 (UTC)[reply]

Sgconlaw (talk) 03:44, 13 June 2023 (UTC)[reply]

Hi. Ŵ and ŵ link to each other, but clicking on the links takes you to W or w. kwami (talk) 04:05, 13 June 2023 (UTC)[reply]

@Kwamikagami: The links in the Welsh section work just fine. That tells me that it has something to do with Chichewa being set somewhere in the language data modules to strip diacritics. Chuck Entz (talk) 04:59, 13 June 2023 (UTC)[reply]
You can also navigate with the Unicode infobox, so not a big deal. kwami (talk) 05:01, 13 June 2023 (UTC)[reply]
@Kwamikagami We shouldn't be sloppy about this and have them lead to the wrong page. I've corrected the links, but Chuck is right about what was happening. Theknightwho (talk) 05:04, 13 June 2023 (UTC)[reply]
Ah, I'd already forgotten that trick! kwami (talk) 05:06, 13 June 2023 (UTC)[reply]
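
The behaviour Chuck Entz describes, where a language's settings strip diacritics before a link target is built, can be illustrated with a short Python sketch. This is not the language data module itself, and the Chichewa settings are not quoted in this thread; the function `make_entry_name` and its `strip_marks` parameter are hypothetical. It shows how stripping the combining circumflex turns ŵ into w, so the link lands on the wrong page.

```python
import unicodedata
from typing import Set

COMBINING_CIRCUMFLEX = "\u0302"


def make_entry_name(term: str, strip_marks: Set[str]) -> str:
    """Compute a link target by removing the given combining marks from term."""
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in strip_marks)
    # Recompose so that untouched letters come back in their usual form.
    return unicodedata.normalize("NFC", kept)
```

With the circumflex in `strip_marks`, `make_entry_name("ŵ", {COMBINING_CIRCUMFLEX})` gives the plain letter; with an empty set, as would presumably apply to a language like Welsh where the links work, the letter is preserved.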

Using "verbal participles" for Hungarian

  1. Why can't Category:Hungarian verbal participles handle "auto cat"?
  2. Why can't {{participle of}} handle verbal participles, even if the necessary parameter is supplied, linking only to "participle" rather than "verbal participle"?

In theory, all relevant templates have been modified accordingly. Adam78 (talk) 08:44, 13 June 2023 (UTC)[reply]

@Adam78 What's a "verbal participle"? Benwing2 (talk) 18:48, 13 June 2023 (UTC)[reply]
Aren't participles verbal by default? What would a theoretical "nominal participle" be? Vininn126 (talk) 19:00, 13 June 2023 (UTC)[reply]

@Benwing2, Vininn126 I'm not sure if you checked out its entry: verbal participle, igei igenév.

As you know, participles combine the features of the verb (downwards in the syntax level) and another part of speech (upwards in the syntax), e.g. in "Wanting a chocolate, he entered the store", "wanting" has an object (or in "swimming in the water, she enjoyed the sunshine", "swimming" has an adverb of its own), so they function downwards as verbs, while they are the adverbs of "entered" (or "enjoyed"), so they function upwards as adverbs; that's why they may be called adverbial participles.

So there is this verbal participle (e.g. az apám írta levél, lit. "the my-father-wrote-it letter", i.e. the letter written by my father). It is semantically identical with but syntactically distinct from az apám által írt/írott levél, "the by-my-father-written letter", i.e., the exact equivalent of what you call the past participle. Incidentally, this latter is identical in Hungarian in most cases with the indefinite- or generic-object third-person singular past tense form of verbs.

This verbal participle functions downwards as a verb, and even more so than the past participle, as it has a subject. That's why it's called a verbal participle: it is more expressly verb-like than a usual, adjective-like participle (upwards it still functions as an adjective, an attribute if you like, hence it's a participle). In addition, it is identical in form with the definite-object third-person singular past tense form of the verb, that's why I rendered it literally as "the my-father-wrote-it letter".

On the other hand, the regular counterpart of the past participle behaves as an adjective, and therefore it can take an adverb to express the agent, as in "the letter written by my father". This latter form is more productive in Hungarian, although I found 100+ examples for the verbal participle. These forms are mostly interchangeable with past participles, after the necessary conversion in syntax of course, though there is some difference in style.

The related Hungarian Wikipedia article mentions that there was a debate over whether it should be called a verbal participle or a participial verb, but the former name caught on. Adam78 (talk) 20:14, 13 June 2023 (UTC)[reply]

@Adam78 The term "verbal participle" strikes me as questionable since, as User:Vininn126 points out, all participles are verbal, so the term is a pleonasm. For example, I'm pretty sure that gerunds in Latin can take the object in either the genitive case (more noun-like) or the accusative case (more verb-like), but that doesn't fundamentally change what they are. English gerunds are similar: you can say either "Him/John leading the company is problematic" or "His/John's leading the company is problematic" (the latter is considered more correct but the former is more common). Are you sure that this term is standard in English-language sources that discuss it? (You created the entry for verbal participle and it links to the Wikipedia article on attributive verb, which doesn't mention Hungarian at all.) If we are to accept this usage, it should be Hungarian-specific as the term "verbal participle" is not used elsewhere. (The reason why {{auto cat}} and {{participle of}} don't work the way you describe above is that neither of them knows about the concept of "verbal participle". {{auto cat}} only recognizes certain sorts of participles as parts of speech, and likewise {{participle of}} only categorizes certain sorts of participles. We'd need to add a Hungarian-specific module that has a label for 'Hungarian verbal participles' and puts it somewhere, maybe under 'Hungarian participles', and we'd need to modify the language-specific form-of rules in Module:form of/cats, around lines 340-345, so that verbal participles are recognized and categorized appropriately.) Benwing2 (talk) 05:43, 14 June 2023 (UTC)[reply]
Grammatical aside: I don't know if the term is standard (we have had no RfV against it on Wiktionary), but I could find information in English about the term. What it refers to is functionally similar to the other participles, but it is inflected as a verb. It reminds me of the Middle Egyptian 'relative form'. I also found 'verbal participle' defined to mean the 'adjectival participles' - par for the course in naming non-finite forms. --RichardW57m (talk) 13:08, 14 June 2023 (UTC)[reply]

@RichardW57m: I find this construction more similar to Chinese, e.g. 我父亲写的信, compare az apám írta levél "the my-father-wrote-it letter", where the section 我父亲写(了) literally corresponds to the finite sentence az apám írta (a levelet) ("my father wrote [the letter]"). That's why I linked "attributive verb".

@Benwing2 "Are you sure that this term is standard in English-language sources that discuss it?" No, unfortunately I'm not, so we can treat it as Hungarian-specific if you prefer. Or do you think the notion of "attributive verb" should be applied to this? — The term "verbal participle" is the literal translation of igei igenév. As for this term being a pleonasm, and reacting to @Vininn126's theoretical "nominal participle", all I can do is rephrase what I wrote above: this form is a repurposed but otherwise full-fledged verb form with its own subject. ("Regular" participles could indeed be called "nominal" in the sense that they behave like adjectives [= nominals] in terms of their arguments.)

I looked up my Latin grammar, and I found that the gerundium can take an object and an adverb (examples given for the object: dolum adhibendō, mittendō in urbem lēgātōs, potestās lēgātōs mittendī, fructūs servandī causā, artem discendō; for the adverb: diligenter discendō, ad beatē vīvendum, ars bene vīvendī, ad bene vīvendum, bene vīvendī causā), just like English constructions such as "reading a book is enjoyable", i.e. with an object. In fact, this isn't possible (or rather uncommon) in Hungarian, as we normally say either "the reading of a book…", or "book-reading". — However, this is the gerund use of the -ing form (speaking of an action on its own, as opposed to a property of another entity), which is a bit different from this topic.

As far as the technical details are concerned: I already created Module:category_tree/poscatboiler/data/lang-specific/hu to handle the issue with {{auto cat}}, so I think your second criterion is already completed. (I also expanded Module:form_of/data2 with the verbal participle a few months ago, but to no avail.) The other relevant templates and modules seem to be protected from editing so I don't think I can modify them. Would you possibly perform the remaining changes, please? I don't know if it's better to link the entry "verbal participle" or create a new entry in Glossary and link that. Adam78 (talk) 21:42, 14 June 2023 (UTC)[reply]

@Adam78 Before we go down this road any farther I think it's important to look through some recent English-language papers on Hungarian syntax and see how they refer to this "verbal participle". It's not good to make up bespoke grammatical terminology even if it's literally translating the standard term in Hungarian. I see for example some references to "adverbial participles" and "adverbial adjuncts" but I don't know if they are the same thing as what you're referring to. Benwing2 (talk) 23:22, 14 June 2023 (UTC)[reply]
I found this section in Lépésenként magyarul. Hungarian Grammar – not only for Hungarians by Valéria M. Korchmáros (→ISBN): page 153 (bottom), page 154, and page 155. I don't think I can supply more reference for you in English, maybe because this form is not so interesting for language learners or for language description as such (it's not as productive as the usual past participle, mostly because it requires a subject, it's somewhat literary in style, and its form is just the repurposing of a past-tense verb form). I hope this source will suffice for you. Adam78 (talk) 20:56, 15 August 2023 (UTC)[reply]
@Benwing2, Vininn126: Hungarian grammar sometimes refers to the infinitive as the nominal participle (főnévi igenév), and similarly indefinite attributive participles are sometimes referred to as adjectival participles (melléknévi igenév). These terms are not generally found in English-language literature since they don't represent anything very distinctive (e.g. the English present and past participles can also be used attributively). "Adverbial participle" by contrast is more widely used in Anglophone literature since there is no particularly obvious and preferable equivalent. As for the "verbal participle", as far as I know and as Adam says, the form itself is always equivalent to the standard definite past tense—it is a syntactic and not a distinct morphological feature.
On the face of it, I'm not sure there's much point to adding a separate POS header for it. The only vaguely relevant English-language technical paper I can find (M. den Dikken, "Hungarian Nominal and Verbal Inflection Integrated", in Boundaries Crossed, at the Interfaces of Morphosyntax etc.) argues that the Hungarian past tense is itself basically nonfinite, and adduces the attributive use as evidence. This would suggest that it should be considered a single POS.
What I would add though is that certain features of Hungarian syntax are variously obscure, confused, and controverted in the English-language literature and we need to make our own decisions in any case. If we did want to treat it as a distinct POS I'm not sure it'd be fair to call calquing a broadly accepted Hungarian term just making things up, given that a settled English terminology simply doesn't exist. Likewise bringing in other stuff like gerunds would IMO certainly not be helpful since they generally have a quite different meaning. —Al-Muqanna المقنع (talk) 21:47, 15 August 2023 (UTC)[reply]
@Adam78 @Al-Muqanna In general I'd like to avoid adding another POS just for this specific use case. If we can get away with just labeling it as the past tense that sounds best. Benwing2 (talk) 22:28, 15 August 2023 (UTC)[reply]
@Benwing2 The problem is that simple past-tense verb forms normally can't be attributes of a noun, while verbal participles can; that's how they can be identified as such. It's like conflating English past-tense and past-participle forms just because they're the same for regular (and some irregular) verbs, as in We held a conference and The conference held was great. The syntax of these Hungarian words is just as distinct as that of the two above. I don't really understand why and how the difference between a verb form and a participle (whatever their morphological form) could eventually amount to nought (especially here on Wiktionary). I don't even understand how the label you suggested would address the issue I brought up originally. Anyway, you can give it some thought and we'll get back to the matter later. Adam78 (talk) 00:38, 16 August 2023 (UTC)[reply]
@Adam78 From what User:Al-Muqanna says, they are always the same in form, not just sometimes, and as a result it can be viewed as a syntactic rather than a morphological issue. It reminds me a bit of Arabic, where (if I remember correctly; my memory of Arabic is very rusty) you can say essentially كَانَ يَكْتُبُ (kāna yaktubu) which means "he was writing" but literally "he was he-writes", where يَكْتُبُ is normally a finite verb "he writes" but can follow another finite verb in a position where in English a participle would be required, and takes on a participial meaning. We would not say that the finite verb يَكْتُبُ is a participle merely for this reason. Benwing2 (talk) 01:19, 16 August 2023 (UTC)[reply]
It is identical afaik. As the source Adam linked above says, "it is homonymous with 3rd person singular definite past tense forms". The analysis at work is just that it is being derived from the past participle and that this identity is in a sense a coincidence. (This is, I think, specifically the sort of analysis den Dikken rejects in the piece I mentioned, at least in the sense of it being a coincidence.) The Hungarian Wikipedia article has a good amount of detail on the various, contradictory ways in which Hungarian grammarians have understood the "verbal participle", though a lot of it is 19th-century guff—no point listing all of it but suffice to say there have been many theories. In any case the held example is odd IMO because we do treat those two uses under the single "verb" header, on one line: see held. —Al-Muqanna المقنع (talk) 01:34, 16 August 2023 (UTC)[reply]
@Al-Muqanna, Benwing2 A participle can have inflection in Hungarian (declension), listed under its own subheader (as well as compound words of its own), while a verb form cannot have any more suffixes or compounding elements. I think it was one of the reasons why we decided we should have a separate section for the participle. (By the way, the more productive Hungarian past participle form also coincides with a verb form, namely the indefinite third person singular in the past tense. That form cannot have its own subject, unless with an adverbial "by" structure.)
Now, the practical question is: what template shall we use instead of {{participle of|hu|<verb>||verbal}} for the 133 verbal participles I collected in the past few months? It should link to verbal participle as such, and the current template cannot. Is it okay to create something like {{hu-verbalparticiple}}? Asking @Panda10 as well. Adam78 (talk) 13:11, 16 August 2023 (UTC)[reply]
I'm a native speaker fwiw. This already seems to be handled inconsistently, though: Hungarian wp claims színre lépte előtt as an example of the verbal participle, but we treat lépte (along with jötte) simply as a noun, glossed "his/her/its stepping" etc. If we ignore the Hungarian wp's statement about "vagy pedig valamely toldalékkal, névutóval" and only apply the verbal participle to the attributive (the ... írta könyv syntax), what further inflection or declension would it have beyond the definite past form? To me it seems that it needs to be thought through more carefully. —Al-Muqanna المقنع (talk) 13:23, 16 August 2023 (UTC)[reply]
@Adam78 I agree with User:Al-Muqanna; before we go off and create a bunch of templates, esp. language-specific ones, we need to think through whether they are really necessary. I'm not convinced the "verbal participle" is really a thing at all; Adam, do you understand my point about participle-like uses of the non-past tense finite verb in Arabic (and BTW the past tense can be used similarly; كَانَ كَتَبَ (kāna kataba) means "he had written" but literally "he was he-wrote")? If Hungarian is similar, it would suggest that we should not be tagging 3rd person singular definite past tense forms as participles at all (regardless of whether some grammatical analyses of Hungarian posit "verbal participles"). Benwing2 (talk) 20:36, 16 August 2023 (UTC)[reply]
One of my first classes in linguistics was about how to define a word class, and the solution was what kind of gap it can fill in a sentence. If it behaves like nouns, then we must call it a noun, and so on. (This is how "go" can be a noun in "give it a go" etc.) As I wrote, a verbal participle takes the role of the adjective (attribute) in a sentence, so it's not only a verb because verbs don't do that. This is how syntax works! Are we supposed to dump syntax altogether, so that we can save your simplifying view of this phenomenon? Or are we supposed to make a big notice like "EXCEPTION! Syntax doesn't apply here." around this form? (I don't know any Arabic so I don't know why grammarians do what they do concerning the form you brought up, provided they actually do what you claim they do. I can't relate.) Why don't you just accept the past centuries of linguistic tradition about the established ways of defining word classes? Adam78 (talk) 22:35, 16 August 2023 (UTC)[reply]
Everyone here's well aware that syntax is a thing, but e.g. we still classify English nouns that are exclusively used attributively as nouns, not adjectives, for good reason. The appeal to past centuries seems odd because the problem here is precisely that there's no definitely settled tradition, either within or outside of traditional Hungarian grammar, about what part of speech a "verbal participle" is—even the Hungarian wp article offers "participial verb" (igenévi ige) in the lede as an equivalent alternative title. Similarly the den Dikken article touching on the verbal participle phenomenon I mentioned above is not "simplifying" the syntactic analysis by arguing that the Hungarian past tense in general is nonfinite ("inalienably possessed infinite participles"), implying that the verbal participle should not therefore be considered separate from the definite past tense; it's a dense and detailed paper. —Al-Muqanna المقنع (talk) 22:48, 16 August 2023 (UTC)[reply]
It almost seemed like you guys deny the mere existence of conversion. ("The process whereby a new word is created without changing the form, often by allowing the word to function as a new part of speech.")
Even if you were to ignore syntax, there's also a morphological difference: a maguk alkotta szabályok ("the themselves-created rules," with a verbal participle) vs. maguk alkották a szabályokat ("they created the rules themselves", a verb form). As you can see, the verbal participle doesn't reflect the plural subject, so they're not only syntactically but also morphologically different in such cases. QED. The first-person form will also be different when constructed with the reflexive pronoun: magam alkotta szabályok vs. magam alkottam a szabályokat (verb form: conjugated; participle: not necessarily conjugated for person and/or number).
The verbal participle has been a thing in Hungarian grammar for a century or so, see the two links I recently added to the end of -ta about "Isten-adta"-like structures ("God-given", as in Isten adta tehetség, "God-given talent," literally "the God-gave-it talent"). {{R:MagyarGrammatika 2000}} also explains this case. Next time I can upload those pages as well if you like. Adam78 (talk) 01:18, 17 August 2023 (UTC)[reply]
@Adam78 I can't speak to the agreement phenomena (User:Al-Muqanna as a native speaker will have to take this), and I don't deny that it's possible for the same word to function in different parts of speech; it just seems in this particular case that the analysis of the word as having two distinct parts of speech causes more issues than it solves. Note that whether there's agreement is also not sufficient to demonstrate that there are two POS's involved. As an example of this, participles in French and German are invariable (in most cases) when used with the auxiliary "to have" but agree with the subject when used with the auxiliary "to be"; but that doesn't mean there are two different parts of speech involved. Benwing2 (talk) 02:07, 17 August 2023 (UTC)[reply]
I think I know what you mean concerning French (sections 281–283 in this grammar) but I don't see anything surprising about the POS remaining the same (and even native speakers make mistakes about it, including professors and writers such as Proust, as it's described in the small print). I don't know about anything similar in German, except they indeed use haben and sein depending on the verb in present perfect.
I only see issues with not treating this form as a participle. The syntactic test above (being an attribute of a noun like adjectives) is supposed to be satisfactory on its own. If you reject it, many other decisions become doubtful, like why do we have separate POS sections for (English-language) nouns that were converted from gerunds (e.g. reading), why do we have separate POS sections for (English) adjectives that were created from present or past participles (e.g. taken), why do we have separate POS sections for (French) determiners that derive from prepositions (de), and so on. Why? Because they work differently in a sentence. That's all and that's just enough.
We can also look into semantics. As I wrote above (see apám írta levél), the verbal participle can always be rephrased into a past participle without the slightest change in meaning (which you can't do with a verb form, obviously). So why would we call it anything other than a participle?
We can try vocabulary as well: how it can associate with other lexemes. The verbal participle can form compound words with their subjects (those listed at their own Derived terms, e.g. csípte, ette, marta, csinálta, adta, rágta, lakta, szítta, fútta), which verbs cannot do. – After four tick marks in four tests from four domains of linguistics, I hope we can now conclude this discussion. Adam78 (talk) 19:00, 17 August 2023 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── @Adam78: Template: Using the general {{participle of}} template would be better than creating a specific Hungarian one. I’m sure eventually a link can be added to the appropriate glossary item.

The term verbal participle: Google returns hits for this term for other languages, so it is not unknown. The Hungarian-English Linguistic Dictionary (Magyar-Angol Nyelvészeti Szakszótár) provides two translations for igei igenév: attributive verb or verbal adjective. I’ve used this dictionary before and wasn’t always impressed by the translations, so I can’t say these are accurate or appropriate.

Separate header: While there is no consensus among linguists about this, the latest grammar books classify them as independent participles, not a subtype. (Magyar grammatika by Borbála Keszler). I'm for a separate header. Reasons:

1. It has a different etymology. It is an inflected form of the past participle. Example:

A nagymama készítette étel (the food made by the grandmother)
készít (to make), készített (made, past participle) + -e (possessive suffix)

Compare:

Az ételt tegnap készítette. - She made the food yesterday.
készít (to make) + -ette (past-tense suffix)

2. It can be inflected, although the inflected forms are rarely used.

Panda10 (talk) 21:16, 17 August 2023 (UTC)[reply]

@Panda10: That seems reasonable to me in general, and I recognise den Dikken's viewpoint will be a minority one. I don't have Keszler so I'm not sure what her specific analysis is, but are we then including or excluding stuff like lépte and jötte, which the Hungarian WP seemingly does include as verbal participles? For 2., could you be more specific on the further inflected forms? Like I mentioned above, if we're restricting it to attributives (as in a nagymama készítette étel) I'm not sure what further inflections would occur, though with lépte etc. the nominal declension is obvious. —Al-Muqanna المقنع (talk) 21:22, 17 August 2023 (UTC)[reply]
@Al-Muqanna: Verbal participles can be created only from transitive active verbs (per Magyar grammatika: Igei igenevet csupán tárgyas cselekvő igéből hozhatunk létre). The words lépte and jötte are nouns derived from lép (to step) + -te (noun-forming suffix) and jön (to come) + -te (noun-forming suffix). They are not verbal participles. I can't think of an example where I would use them as verbal participles. Incorrect: *a medve lépte ösvény (*the path stepped by the bear). The Hungarian WP uses this example: Kolumbusz színre lépte előtt (before Columbus entered the scene). In that phrase lépte is a noun, so it shouldn't be listed as a verbal participle. I could not find where jötte is used in the Hungarian WP. As for the inflection, Magyar grammatika brings these examples: az (én) hoztam alma (the apple brought by me), a (te) hoztad alma (the apple brought by you), etc. The author notes that the phrases using the first and second-person forms are strange, almost foreign sounding in modern Hungarian. Panda10 (talk) 00:02, 18 August 2023 (UTC)[reply]
@Panda10: On hoztam etc., that's what I mean, those forms are also identical to the definite past tense, so they wouldn't constitute evidence against treating the verbal participle as a use of the definite past. I brought up jötte as being generically similar to lépte (as with other -te verbal nouns). Den Dikken's analysis, as mentioned above, is that for strong diachronic as well as syntactic reasons the -te suffix should be analysed as performing a single function, as the possessive suffix of a nonfinite past tense, rather than separate coincidentally related functions in which it forms the definite past tense, the "noun-forming" suffix, and the "verbal-participle-forming" suffix. These things might appear clear within the recent tradition of Hungarian grammar, but it's not straightforward, from a comparative perspective, to say that lépte in színre lépte előtt is clearly functioning as a noun, for the same reason that technical grammars don't consider English gerunds to be nouns. Hence the Hungarian WP's confusion (if it is confusion). All of that said, your original suggestion above seems like the least confusing way forward in the circumstances. —Al-Muqanna المقنع (talk) 06:13, 18 August 2023 (UTC)[reply]
Színre lépte is synonymous with színre lépése, where lépés is a regular verbal noun (the first form is a bit more literary in style). It's not really a gerund because a gerund can have an object (retaining its verb-like feature), as in "painting the room was fun", while the Hungarian term can only correspond to "the painting of the room". It's still fairly close to the "verb-like" character of the verb, e.g. it can still have adverbial arguments like a verb (though optionally inserting the grammar particle való, adjectivizing it), even if it can also have adjectival attributes like a noun. (I found the related Wikipedia article informative.)
I think the reasons why we cannot treat "lépte" exactly like "lépés(e)" is that (1) it comprises an obligatory possessive suffix ("his/her/its"), which can be replaced by any other person instead of the third person singular and (2) the former suffix is not productive. There are literally barely a handful of nouns where this -ta/-te survived without a possessive ending (-a/-e). The verbal noun (with the possessive) has a few dozen examples at least. Adam78 (talk) 15:19, 18 August 2023 (UTC)[reply]
@Adam78 @Panda10 @Al-Muqanna @Benwing2 I’m pretty sure adverbial participles are converbs, which we do have a part of speech for. The Wikipedia article even gives “adverbial participle” as another name for them. Would using that term make sense instead? I don’t really see how the verbal participle as described is meaningfully outside of the definition of converbs as given either, but I may be (read: am probably) misunderstanding something. Theknightwho (talk) 22:53, 25 August 2023 (UTC)[reply]
Yes, I think adverbial participles do correspond to converbs. However, it's a different form from those described above (with the suffix -va/-ve or somewhat archaically -ván/-vén, the latter being preferred when expressing a reason rather than a manner). They work like adverbs.
On the other hand, verbal participles take the role of attributes, i.e. adjective-like elements, and they are synonymous with past participles. The only difference is that they express the agent as their subject, rather than with a by-phrase (e.g. "God-given", i.e., "given by God"), while they both modify a noun phrase (rather than a verb phrase, as converbs do). Verbal participles can be described as attributive verbs.
The case for using the term "participle" rather than anything else seems convenient as it roughly corresponds to Hungarian igenév, although the latter includes infinitives as well. Hungarian grammar currently has four types of igenév:
  • főnévi igenév (infinitive), ending in -ni,
  • melléknévi igenév ([present, past, and future] participle), ending in -ó/-ő, -t/-ott/-ett/-ött, and -andó/-endő, respectively,
  • határozói igenév (adverbial participle, or converb if you like), ending in -va/-ve,
  • igei igenév (verbal participle, or attributive verb if you like, synonymous with the past participle), ending in -ta/-te. — Incidentally, this suffix can also form possessive-suffixed nouns from verbs (synonymous with verbal nouns, below, aside from the possessive suffix), which may have caused some confusion.
Instead of gerunds, Hungarian uses full-fledged nouns (suffixed with -ás/-és), which cannot take objects as gerunds can (like "driving a car is easy"); the object is normally converted into a genitive instead (as "the driving of a car"). That's why it's not one of the participles, and it's referred to as a verbal noun instead.
In short, if we could retain the term "participle" for the three latter types (excluding the infinitive, of course), then (most of all) we could reflect that they are equally verb-derivatives taking the role of non-verb elements in a sentence, and (as a bonus) we could be relatively faithful to the existing Hungarian terminology and grammar description. Adam78 (talk) 23:37, 25 August 2023 (UTC)[reply]
@Theknightwho I just realized I didn't ping you back when I replied; now I do so. Adam78 (talk) 16:50, 6 September 2023 (UTC)[reply]

transliteration of Greek gamma

[edit]

On the έγκυος page, we have the ancient Greek word transcribed as énkuos, but the modern descendant is spelled as égkyos. I know the vowels are different because of a sound change, but should we standardize the gamma?

Also, this might not be important, but when testing out an edit in preview mode I got a message saying

The ancestor of Greek is Katharevousa (el-kth) (an etymology-only language whose regular parent is Greek (el))..

This error message says Greek is the ancestor of its own parent. Is there a word wrong, or would this message only show up if someone was doing something wrong anyway? What I did was to type {{inh|el|el|έγκυος}} in the etymology section, just to see if the spelling would change. So, not something a person would normally need to do, but it still makes me wonder if the module has a mistake. Thanks, Soap 19:45, 13 June 2023 (UTC)[reply]

Now I realize the same error message is generated when any other language besides grc or el-kth is used as the source language. Soap 20:01, 13 June 2023 (UTC)[reply]
For both ancient and modern Greek, "gk" is simply a direct transliteration of the letter pair "γκ". It isn't phonetic or phonemic in either. For Ancient Greek, the transcription "nk" represents the phonemes involved (phonetically realized as [ŋ.k]). In modern Greek, the plosive is voiced. Our entry gives the modern Greek phonemic transcription as /ˈeŋ.ɟi.os/, but I'm suspicious of the "ŋ.ɟ" here, since from what I've read, modern Greek tends to not have phonemic nasal + plosive sequences, but instead has voiced plosive phonemes that are optionally but non-phonemically prenasalized when they come after vowels (that is, I suspect a better transcription would be along the lines of /ˈe.ɟi.os/ or /ˈe.ᵑɟi.os/). But I'm no expert on Greek.--Urszag (talk) 05:59, 14 June 2023 (UTC)[reply]
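The difference between the two schemes described above can be sketched in a few lines of Python. This is only a toy illustration, not the code of any Wiktionary module: the function name transliterate, the ancient flag and the letter table are all made up for this example, and the table covers only the letters of έγκυος.

```python
import unicodedata

# Toy letter table: covers only the letters of the example word.
# A real transliteration module needs the whole alphabet, diphthongs,
# breathings and so on; this is an assumption-laden sketch.
BASE = {
    "\u03b5": "e",       # ε
    "\u03ad": "\u00e9",  # έ -> é
    "\u03b3": "g",       # γ
    "\u03ba": "k",       # κ
    "\u03bf": "o",       # ο
    "\u03c3": "s",       # σ
    "\u03c2": "s",       # ς
}
VELARS = set("\u03b3\u03ba\u03be\u03c7")  # γ κ ξ χ


def transliterate(word, ancient):
    """Romanize word; ancient=True writes gamma before a velar as n."""
    word = unicodedata.normalize("NFC", word)
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == "\u03b3" and ancient and nxt in VELARS:
            out.append("n")  # phonemic: the gamma stands for a velar nasal
        elif ch == "\u03c5":
            out.append("u" if ancient else "y")  # υ differs between schemes
        else:
            out.append(BASE[ch])
    return "".join(out)


ancient_form = transliterate("έγκυος", ancient=True)
modern_form = transliterate("έγκυος", ancient=False)
print(ancient_form, modern_form)
```

With the flag set the gamma is written according to the sound it stands for; without it the letter pair is copied over letter by letter, which is the contrast between énkuos and égkyos noted at the top of this thread.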
Hmm. Well the Wikipedia article for the Yoruba surname w:Adetokunbo shows how nt can spell a simple /d/ and nmp spells the sequence /nb/. But this may be a foreign language loaning convention. I'm pretty sure there are at least Greek dialects that preserve the nasals, and that Greek immigrants to America may have brought those dialects with them. I remember hearing when I was young that a stereotypical Greek accent will have nasal-insertions like "stromberry" due to plain voiced stops not occurring in that position in Greek. But it's possible that those pronunciations stand out in America because they are the most exotic, just as we do with Italian accents. The Wikipedia article on modern Greek phonology says that voiceless stops are being lenited into weakly voiced ones now in modern Greek speech, which I wouldn't expect if the nasal+stop sequences were also turning into plain voiced stops. But it also says Greeks often confuse consonant sequences in loans, so maybe there really is at least a weak merger going on. Soap 20:20, 14 June 2023 (UTC)[reply]
@Soap You’d get the same issue with Latin: {{inh|la|la|…}} will throw an error saying that the ancestor of Latin is Old Latin, an etymology-only language whose regular parent is Latin. This is fine: it’s because we want to acknowledge that a particular stage is the ancestor of the (rest of the) language without giving it an L2 in its own right. Theknightwho (talk) 19:44, 14 June 2023 (UTC)[reply]

Adding all entries in a dictionary to Wiktionary

[edit]

Hello! I would like to know if it would be possible to add all entries from Oulhadj's English-Amazigh dictionary to the Tashelhit Wiktionary? The dictionary is free for access in academia.edu Taluzet (talk) 13:39, 15 June 2023 (UTC)[reply]

What is the copyright status? If not specified further, we must assume "all rights reserved", which means we cannot copy it wholesale. — SURJECTION / T / C / L / 13:50, 15 June 2023 (UTC)[reply]
In general, there's a big difference between using a dictionary as a source (with proper attribution), and systematically copying it whole. It's true that the fact that these are terms in Tashelhit and that they have certain meanings can't be copyrighted, but the wording of the definitions can be, as can the choice and arrangement of terms covered as a whole. At any rate, the free availability of a resource isn't a license to copy it wholesale. Most "freely available" works (including Wiktionary- see the legal language below the edit window) retain their copyright and only license the use of their content under conditions that are explicitly stated somewhere. We need to know the terms of that license before we can answer your question.
Even if there's no legal obstacle to copying content, we need to give attribution to meet the requirements of our license and as a matter of principle. It may be possible to copy everything, but it's important to do it right. Chuck Entz (talk) 14:51, 15 June 2023 (UTC)[reply]

I'd personally prefer manually entering the entries. Is there a guide to doing this manual work quickly while respecting Wiktionary's standards? Something like 100 entries/hour. Taluzet (talk) 13:39, 15 June 2023 (UTC)[reply]

@Taluzet 100 pages per hour means less than a minute per entry; that's way too fast to create quality entries. Manually entering the entries wouldn't be any better from a copyright standpoint than automatically importing them. I looked through the dictionary you linked to and I can't find any mention of copyright; you'd have to contact the author and see what terms they're licensing the work under. Benwing2 (talk) 23:07, 15 June 2023 (UTC)[reply]

Interface admin?

[edit]

May I be appointed an interface admin? I have had it in mind to make this request for some time, as I often notice minor CSS/JS issues that I could uncontroversially repair myself, but the ultimate impetus was my desire to remove the defunct Wikiwix search engine from MediaWiki:Gadget-SpecialSearch.js. Sure, I could ask someone else to do it - but I am capable of doing it myself.

My CV: I'm fluent in CSS and JS. About ten years ago I spent time as the lead developer of Wikipedia's Twinkle gadget. More recently (and more locally), I spearheaded the push to convert our translation tables from a fixed two-column layout to a flexible number of columns depending on the width of the user's screen.

Apparently interface adminship doesn't require a formal vote, or even a discussion (see Wiktionary:Beer parlour/2022/February#Request for interface administrator permissions), so I'm pinging some bureaucrats @Surjection, Chuck Entz. This, that and the other (talk) 03:14, 18 June 2023 (UTC)[reply]

FWIW I am in favor of this. Benwing2 (talk) 04:06, 18 June 2023 (UTC)[reply]
Support. --{{victar|talk}} 05:01, 18 June 2023 (UTC)[reply]
Done. SURJECTION / T / C / L / 07:08, 18 June 2023 (UTC)[reply]
Thanks all! I appreciate the support. This, that and the other (talk) 08:59, 18 June 2023 (UTC)[reply]

These are, indeed, useful, but they all use insource: with regular expressions. To quote mw:Help:CirrusSearch "Warning: Do not run a bare insource:/regexp/ search. It will probably timeout after 20 seconds anyway, while blocking the queries of responsible users." Indeed, these all return only a handful of results, along with a nasty timeout message. Can we find a way to make these work without timing out? Chuck Entz (talk) 15:38, 18 June 2023 (UTC)[reply]

I would say "if nothing else, we could search the database dumps for them semi-regularly instead", but those have also been only spottily available in recent months. - -sche (discuss) 18:23, 19 June 2023 (UTC)[reply]
I would use these with additional "filters" per Help:CirrusSearch. DCDuring (talk) 18:31, 19 June 2023 (UTC)[reply]
For example, the "sense" search would benefit from the addition of 'hastemplate:"sense"'. DCDuring (talk) 18:40, 19 June 2023 (UTC)[reply]
Also the "the the" search is efficient and focused if it is as follows '"the the " insource:/[Tt]he the /'. ("the the" is fast because it uses the word indexes. The "insource" part eliminates instances where punctuation intervenes and also diacritic-laden variants of the characters.) DCDuring (talk) 18:49, 19 June 2023 (UTC)[reply]
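The two-stage idea in the comment above (a cheap word-based match first, the expensive regular expression only on what survives) can be imitated outside the search engine. The Python below is a hedged sketch on three made-up pages: it does not use CirrusSearch, the names PAGES, phrase_candidates and regex_filter are invented for the example, and the first stage merely mimics what a word index would return.

```python
import re

# Made-up page texts standing in for wiki pages.
PAGES = {
    "A": "He saw the the dog.",
    "B": "Whatever the, the case may be.",
    "C": "No doubled article here.",
}


def phrase_candidates(pages, phrase):
    """Cheap stage: titles whose word sequence contains the phrase.

    Like a word index, this ignores case and punctuation, so it
    over-matches (page B qualifies despite the comma).
    """
    wanted = phrase.lower().split()
    n = len(wanted)
    hits = []
    for title, text in pages.items():
        tokens = re.findall(r"\w+", text.lower())
        if any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1)):
            hits.append(title)
    return hits


def regex_filter(pages, titles, pattern):
    """Expensive stage: run the regex only on the surviving titles."""
    compiled = re.compile(pattern)
    return [t for t in titles if compiled.search(pages[t])]


candidates = phrase_candidates(PAGES, "the the")
hits = regex_filter(PAGES, candidates, r"[Tt]he the ")
print(candidates, hits)
```

Here the regular expression is run on two pages instead of three; on a whole wiki the same narrowing is what keeps the regex stage within the time limit.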

Template:Q getting "bad argument #1 to 'lc' (string expected, got function)" error.

[edit]

I don't think this error message was there the last time I edited this page, although maybe I just overlooked it: second bullet at Citations:quae. I can't tell if I misformatted something or this is a bug (maybe affected by some recent change to Module:Quotations/la/data)? The error message seems to disappear if I remove the fifth argument, but from what I understand including that shouldn't cause an error. Urszag (talk) 12:13, 19 June 2023 (UTC)[reply]

@Urszag It's probably a module bug. There have been a ton of changes in the last day to the Latin data module. I'll see if I can pinpoint where the issue is. Benwing2 (talk) 18:32, 19 June 2023 (UTC)[reply]

"no <references/> tag" warning doubles CAPTCHA

[edit]

Steps to reproduce:

  1. Attempt to save an edit containing references with external links on a page with no <references/> tag. You'll be asked to solve a CAPTCHA.
  2. Solve it, then resubmit. You'll see the following message: Warning: You're trying to save page with a <ref> tag but no <references/> tag! This action has been automatically identified as likely in error. If you believe this edit to be correct, you may click "Publish changes" again to confirm it. Otherwise, please fix it and then click "Publish changes".
  3. Resubmit. (Perhaps you want to add the tag in the next edit within a minute).

Expected result:

  • The page is saved.

Actual result:

I didn't even know this site has CAPTCHA's at all. (a) this must be fairly new; (b) it's certainly an IP-only thing; (c) I'm not sure we even have control over this, it's probably a MediaWiki thing. Benwing2 (talk) 20:55, 19 June 2023 (UTC)[reply]
It definitely isn't new, it's been like this for years. 89.64.68.162 23:26, 19 June 2023 (UTC)[reply]
Yeah, IIRC it's something IPs and new accounts get if they have fewer than some number of edits (10?), to restrict spammers from spamming links (separate from our own local abuse filter against new users adding links under certain conditions). AFAIK this issue would occur on all wikis which have edit filters which warn against certain actions, so this would be something to report on Phabricator, not a local issue. - -sche (discuss) 00:01, 20 June 2023 (UTC)[reply]
[edit]

(Sorry, I meant {{no_entry}}, not {{LDL}}.) --RichardW57m (talk) 11:36, 20 June 2023 (UTC)[reply]

And that's my problem! I'd used {{LDL}} instead of {{no_entry}}. When I use {{no_entry}}, I get the blue links I want! **NO ISSUE**. --RichardW57m (talk) 12:11, 20 June 2023 (UTC)[reply]

I have orange links enabled. When I link to a language section containing only {{LDL|sa}}{{no entry}} and there are populated sections on the page for other languages, the link displays as orange. For example, at अतिथि (atithi), the links from the 'Alternative scripts' section to the Khmer and Assamese/Bengali script sections are orange, even though Sanskrit sections exist on those pages. What should be done to remove the orange misdirection warnings? --RichardW57 (talk) 19:49, 19 June 2023 (UTC)[reply]

@RichardW57: There may be Sanskrit sections on those pages, but there are no Sanskrit entries there. I think the orange links should stay orange. Chuck Entz (talk) 20:18, 19 June 2023 (UTC)[reply]
Then can we stub them out at source in the invocation of {{sa-alt}}? Notifying Sanskrit editors: (Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat): . Generating spurious blue links is an insult to users. --RichardW57 (talk) 20:29, 19 June 2023 (UTC)[reply]
@RichardW57, RichardW57m: Including the {{LDL}} template by default in {{sa-alt|Deva=}} or {{sa-sc}} may not be needed for commonly used modern scripts that do not require much post-processing in MOD:sa-convert such as the Gujarati script. This may be more useful for extinct scripts and those that require post-processing (post_replace_fix and the Unresolved Issues section at MOD:sa-convert) since the accuracy of their forms may be less certain. So if the {{LDL}} template is to be shown, it could perhaps be on a script-by-script basis rather than being included in every transclusion of {{sa-alt|Deva=}} or {{sa-sc}}. Kutchkutch (talk)
This is an inter-page problem. One page will contain an invocation of {{sa-alt}}, but for example typically one cannot find a durably archived source with the Thai script spelling (and I've not managed to identify, let alone obtain, a dictionary for Thai script Sanskrit), and it will then match a word recorded in Thai script. (Well, I think the Hare Krishna Thai-script spelling would qualify, but in general that is not what is generated by {{sa-alt}}.) When I entered the Thai script spellings, they were summarily deleted out of process. Similarly, editors attempting to add Bengali-script Sanskrit were asked to desist without sound attestation. At present, logged-out users see a blue link that merely leads to the page, but will appear to lead to the first language on the page. Logged-in users with orange links enabled are warned by the orange colour if the link would misdirect, but confusingly the same colour currently occurs if the section exists but has only {{no entry}}, which is merely a sign of work in progress.

I thought I had hit upon a solution - provide a proper target for the link in that it is in a Sanskrit section, but with content being {{no entry}}, requesting evidence of the word being spelt thus in the script. Thus, if durably archived Sanskrit in the Tai Tham script is restricted to a single slab, only words on that slab would get a proper entry, but no others.

One solution, generally not as easy as having {{no entry}} result in blue links, is to add parameters to stop a normal link being generated. (For example, it might be colour-coded - red still has the right general semantic, though it wouldn't be a conventional red link.) This will mean that dozens of pages have to record this information - or one could wrap problematic {{sa-alt}} calls in a word-specific template. (I've come to favour that approach for {{pi-alt}}.) One may already need that sort of approach for whether repha triggers duplication of the phonetically following consonant, a once common practice. --RichardW57m (talk) 11:36, 20 June 2023 (UTC)[reply]

Time-outs in Massed Brahmi Declension

[edit]

Module:pra-decl/noun/gallery/documentation is currently timing out. I hadn't checked cat:E for a while, but there's a reasonable chance that tweaking of link displays by @Theknightwho is not responsible for tipping it over the edge this time. I do notice that there is a module error in the related formal testbed Module:pra-decl/noun/testcases, and I would suggest that anyone interested in fixing this timing issue start out with that error. I'm confident it was introduced by the obvious optimisation in Special:diff/72638514, but I intend to tackle it myself tonight unless TKW thinks a change is as good as a rest. (It may, of course, be a red herring.) Notifying @Svartava, Kutchkutch, but they seem innocent. --RichardW57m (talk) 11:15, 21 June 2023 (UTC)[reply]

Fixed. I cut out the frame object interface in Module:pra-decl/noun, and both problems went away. I did this by exposing the function interpret_shortcuts() from Module:typing-aids. --RichardW57m (talk) 12:57, 21 June 2023 (UTC)[reply]
Outstanding issues:
  1. I should instead have made the function replace() in Module:typing-aids work from Lua function calls in addition to #invoke:.
  2. There may still be a lurking compute-time issue somewhere.
--RichardW57m (talk) 13:20, 21 June 2023 (UTC)[reply]

We do have a fallback precedent in case we can't fix the run-time problem within a week. --RichardW57m (talk) 11:15, 21 June 2023 (UTC)[reply]

@RichardW57: it's back (or it was- now it's gone again). This page has been showing up intermittently in CAT:E for years, with Lua time usage that varies from 8 to 10+ seconds for no apparent reason. The errors go away with null edits, though sometimes it may take a couple tries. Quite often you see things like no error on the page itself- even though it's on the CAT:E page- then after a null edit it displays a module error on the page itself but it disappears from the CAT:E page at the same time. Apparently it changes back and forth in the midst of the save process.
In other words, your "fix" probably had less to do with the result than random chance- it was the null edit you made at the end rather than the changes you made on the back end. Recently, Appendix:Pahari Swadesh lists has been doing the same thing.
I think you need to split the page: with that many transclusions, even the slightest variation gets magnified into something significant when multiplied over them all. I would suggest having a separate page for each script, to start with. Chuck Entz (talk) 06:32, 22 June 2023 (UTC)[reply]
Splitting makes sense - I'm just not sure which way to split - by script or by dialect, or perhaps both ways. Input appreciated - @Svartava, Kutchkutch. In the mean time, I've bought some time by replacing unconditional mw.ustring.gsub by gsub from Module:string_utilities. On my one-off measurements, that got the Lua CPU time down from 9.997s (phew!) to 7.763s. I think it's possible to further squeeze time out of the transliteration, but that hits the law of diminishing returns. I'm surprised how many languages seem to be doing inflection in Roman transliteration - that relies on a faithful transliteration being available and writing system variations not being correlated between stem and flexion. --RichardW57m (talk) 12:18, 22 June 2023 (UTC)[reply]
So the two issues were unconnected. --RichardW57m (talk) 12:22, 22 June 2023 (UTC)[reply]

Invisible characters in Frequency lists

[edit]

Hello, it's me again accidentally triggering the vandalism filters (see also [1]). This latest set of frequency lists I'm working on seem to contain a lot of invisible characters which my script wasn't initially set up to handle and which I have been rather slow to identify during the upload phase. Disappointingly, even after filtering them out (as per [2]), I am still getting a vandalism warning preventing me uploading some pages, e.g. Wiktionary:Frequency lists/Danish/Mixed sources/1350001-1400000.

I am happy to take any suggestions from others as to how best proceed. In the short term, could someone please identify if possible the specific character(s) I need to remove to pass upload? In the longer term, could we possibly consider implementing a user-friendly tool to specifically identify offending characters, ideally with location similar to the "show changes" feature when editing a page? Or even better a prompt to strip such characters when publishing changes (though my uploads are possibly too much of an edge-case). Helrasincke (talk) 22:38, 21 June 2023 (UTC)[reply]

@Helrasincke The edit matched two filters: one which catches invisible characters - however, this filter does not prevent you from saving your edit - and a second one which is designed to stop recurrent vandalism. It was the second filter that prevented you from saving this particular page of the frequency lists. I have disabled the relevant part of the filter so you should now be able to save the missing frequency list page. This, that and the other (talk) 05:07, 22 June 2023 (UTC)[reply]
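For the short-term request above (finding which characters to remove, and where), a small script can do the job before upload. The Python below is a minimal sketch, assuming that "invisible" means Unicode format characters (category Cf) plus stray control characters; the set of characters the abuse filter actually checks is not stated in this thread and may differ, and the function names are made up for the example.

```python
import unicodedata


def is_invisible(ch):
    """Assumed definition: format characters, or controls other than tab/CR."""
    cat = unicodedata.category(ch)
    return cat == "Cf" or (cat == "Cc" and ch not in "\t\r")


def find_invisible(text):
    """Return (line, column, code point, name) for each suspect character."""
    found = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if is_invisible(ch):
                name = unicodedata.name(ch, "<unnamed>")
                found.append((line_no, col, "U+%04X" % ord(ch), name))
    return found


def strip_invisible(text):
    """Remove every suspect character, keeping newlines."""
    return "".join(ch for ch in text if ch == "\n" or not is_invisible(ch))


sample = "hund\u200b 12\nkat\u00ad 7"
report = find_invisible(sample)
cleaned = strip_invisible(sample)
for entry in report:
    print(entry)
```

Running find_invisible over a frequency-list page before saving would give the line and column of each offending character, which is roughly the location report asked for above.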

Issue with bold apostrophes in quotations

[edit]

Outside quotation:

&#39;'''bold'''&#39;
'bold'
Closing apostrophe is not bold.

Inside quotation:

{{quote-book|en|passage=&#39;'''bold'''&#39;}}
(Can we date this quote?), (Please provide the book title or journal name):
'bold'
Closing apostrophe is bold.

@Theknightwho J3133 (talk) 12:28, 22 June 2023 (UTC)[reply]

@J3133 Thanks - the template is treating it as though it were ''''bold'''', which gives 'bold' as you see here, because the escaped apostrophes must be resolving too early in the process. I'll take a look at some point, but I'm caught up with a bunch of other things at the moment. Theknightwho (talk) 13:05, 22 June 2023 (UTC)[reply]
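The effect described above can be reproduced on the raw string. This Python sketch only shows what the text looks like once the escaped apostrophes have been resolved; it does not model the MediaWiki parser or the quotation module, and the helper leading_apostrophes is made up for the example.

```python
import html

raw = "&#39;'''bold'''&#39;"

# If the entities are resolved before the bold markup is parsed, the escaped
# apostrophe merges with the three apostrophes of the bold markup.
resolved = html.unescape(raw)


def leading_apostrophes(text):
    """Count the run of apostrophes at the start of text."""
    return len(text) - len(text.lstrip("'"))


print(resolved)
print(leading_apostrophes(raw), leading_apostrophes(resolved))
```

In the raw string the parser sees three apostrophes on each side of the word; after early resolution it sees four, which is the ''''bold'''' case mentioned above.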

English adverbs and adjectives that are only sometimes gradable

[edit]

If some senses of an English noun pluralize and others don't, you can set {{en-noun|~}}. Is there, or should there be, an equivalent parameter for if only some senses of an adjective or adverb are gradable and others aren't? I was about to combine the two ===Adverb=== headers at queer#Adverb under one header, but then noticed they are asserted to differ in gradability. (Do they really? Are there other adverbs where some senses are gradeable and others aren't?) - -sche (discuss) 13:07, 23 June 2023 (UTC)[reply]

June (not generally comparable, comparative more June, superlative most June), is that what you mean? Vininn126 (talk) 13:18, 23 June 2023 (UTC)[reply]
Aha, that works! I missed that in the documentation. I've combined the adverb sections now. I guess if it's felt to be useful, labels could be added to indicate which sense is (sometimes comparable) vs (not comparable). - -sche (discuss)

Bring T:rootsee in line with T:root

[edit]

Currently {{rootsee}} is quite janky. It defaults to PIE; the other possible origins of the root are nv, ar, mt, akk, tzm, he, hard-coded in the template, which are given as the entry language and then used as the default root language. Then there's |3=, which tries to handle this but only allows the value ine-pro and, even more weirdly, supports PIE as an alternative value for it. It should maybe be noted that the entry language initially defaulted to Russian. I suggest we bring the arguments of {{rootsee}} in line with {{root}}, that is:

  • |1=: language code of the entry
  • |2=: language code of the root
  • |3=: the root
  • |id=: the ID

Hopefully the change can be put into practice by bot. Catonif (talk) 15:24, 23 June 2023 (UTC)[reply]

1. I think you mean not "the language code of the entry", but "the language code of the terms to be listed".
2. I would expand "the ID" to the "sense or etymological ID of the root". Hmm, I see a conflict between these two IDs. Do we have guidelines for which to use with {{root}}?
3. If the root will default to anything, it should be to {{PAGENAME}}.
4. In the entries for roots themselves, the commonest usage will be for the language of root and displayed entries to be the same.
This makes what I take to be the current order of the positional parameters the best for defaulting:
I think we may have a conflict for the defaulting of the language code of the root. Those using {{rootsee}} to populate lists of related words for English may want it to default to PIE, while those documenting native roots of attested languages will want it to default to the language of the terms to be listed - at least, I'm using it this way for scrappy lists for Pali. (I impose structure on ample lists, which makes them handcrafted and naturally incomplete.) --RichardW57m (talk) 17:07, 23 June 2023 (UTC)[reply]
Yes, I meant the language code of the terms to be listed, which should be the language of the entry the template is under, no? I can't think of an example where it isn't the case.
To get rid of {{PIE root see}} as well, probably together with similar templates, I re-propose:
  • |1=: the language code of the terms to be listed. Mandatory, i.e. no default value. Can be set to - for the general category, so for example:
    {{rootsee|-|ine-pro}} is what would be placed at *dʰeh₁-
    {{rootsee|ru|ine-pro|*h₁rewdʰ}} is what would be placed at рдеть
  • |2=: the language code of the root. Mandatory if |3= is also given; otherwise it can default to the value of |1=, so for example:
    {{rootsee|pi}} is what you'd place in the derived terms of a Pali root.
  • |3=: the root itself. Defaults to page name.
  • |id=: the ID of the root (what's the other one?)
Note that I would keep this order, which is more in line with {{root}} and pretty much any other template (i.e. first the codes, then the rest).
However we wish to further modify the proposal, it doesn't make any sense for PIE to be defaulted in any parameter, as this should be a general-purpose template without an IE bias. Catonif (talk) 10:59, 24 June 2023 (UTC)[reply]
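The defaulting rules in the re-proposal above can be sketched in a few lines. This is only an illustration in Python, not the actual template or module code: the helper `resolve_rootsee`, the `RootSeeArgs` tuple and the choice to reject `-` with no |2= are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class RootSeeArgs(NamedTuple):
    term_lang: Optional[str]  # None stands for the general category (|1=-)
    root_lang: str
    root: str
    root_id: Optional[str]


def resolve_rootsee(args: dict, pagename: str) -> RootSeeArgs:
    """Apply the proposed defaults to raw template arguments (hypothetical helper)."""
    term_lang = args.get("1")
    if not term_lang:
        raise ValueError("|1= (language of the terms to be listed) is mandatory")

    root = args.get("3")
    root_lang = args.get("2")
    if root_lang is None:
        if root is not None:
            raise ValueError("|2= is mandatory when |3= is given")
        if term_lang == "-":
            # Assumption: with no entry language there is nothing to default to.
            raise ValueError("|2= cannot default to |1= when |1= is '-'")
        root_lang = term_lang  # |2= defaults to the value of |1=

    return RootSeeArgs(
        term_lang=None if term_lang == "-" else term_lang,
        root_lang=root_lang,
        root=root if root is not None else pagename,  # |3= defaults to the page name
        root_id=args.get("id"),
    )
```

Under these assumptions, the three invocations given in the proposal resolve without any PIE-specific default, and a call that gives |3= without |2= is rejected.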
But an IE bias makes sense for three variously related reasons:
  1. English and its major loan sources are IE languages.
  2. IE is the largest family (or is this an outdated assessment?).
  3. IE languages are a major source of loans in many vigorous languages.
The downside of ditching this default is that many invocations of {{rootsee}} in English entries will now need all three positional parameters, whereas at present the root language defaults.
I concede that the inclusion of the functionality of {{PIE root see}} argues for being able to just write {{rootsee|-|sa}}. I had conceived of being able to list, say, the Prakrit derivatives of an Old Indic root in the WT-Sanskrit page of that root, so say {{rootsee|pi|sa}} in the code-code-root scheme, but there are probably too many Indic languages for this to work well.
There are two ID schemes in operation, those defined under {{senseid}} and {{etymid}}; for each combination of term and language, they share a name space. The former directs to a sense, which is part of a PoS section; the latter directs to a clutch of PoS sections. For links, one just chooses the desired level of precision. For root identification, this causes a problem, because there is no record of what sense IDs imply what etymology IDs. By the time one gets to a language like Pali, there is plenty of evidence of formerly distinct but now homonymous roots. --RichardW57 (talk) 15:03, 24 June 2023 (UTC)[reply]
I haven't gone through this thread in detail but I'm pretty sure I agree with User:Catonif. Benwing2 (talk) 23:09, 24 June 2023 (UTC)[reply]
Me too. The current state of {{rootsee}} is completely untenable. The proposed change by Catonif would bring the template in line with other etymology templates like {{bor}} and {{inh}}, and consistency in that regard is far more important than saving English (or Indo-European) editors a handful of keystrokes. The obvious solution is to integrate it into Module:etymology, instead of the weird half-baked version we have now. Theknightwho (talk) 22:37, 17 August 2023 (UTC)[reply]

Double period headword

The headwords at Unsupported titles/Double period should be “..” (cf. ...), not “Unsupported titles/Double period”. J3133 (talk) 20:02, 23 June 2023 (UTC)[reply]

@J3133 This is because they use {{mul-morse letter}}, {{ja-morse letter}} and so on, which plug into Module:Morse, which uses a shitty hack to generate the display form that can't cope with unsupported titles properly. I'll see what I can do to fix it up, as it should never have been coded like that in the first place. Theknightwho (talk) 22:43, 17 August 2023 (UTC)[reply]
@J3133 This is fixed. Theknightwho (talk) 10:40, 20 August 2023 (UTC)[reply]

“Reddit slang” is listed in Module:category tree/poscatboiler/data/terms by usage but this category does not work. J3133 (talk) 18:56, 24 June 2023 (UTC)[reply]

Category:Usenet slang by language also has this problem. I suspect the initial capital. @Benwing2, Theknightwho ? This, that and the other (talk) 09:28, 28 June 2023 (UTC)[reply]
Done: interim fix applied by using {{poscatboiler||Reddit slang}} directly instead of {{auto cat}}. This, that and the other (talk) 10:05, 29 June 2023 (UTC)[reply]
@This, that and the other This isn't the right fix in that I'm trying to eliminate direct calls to {{poscatboiler}}. There should be another way of fixing this but I haven't had a chance to look into it yet. Benwing2 (talk) 20:57, 29 June 2023 (UTC)[reply]
OK, this will require some code changes to the auto cat code. Benwing2 (talk) 21:17, 29 June 2023 (UTC)[reply]

How to suppress alternative traditional form in {{zh-forms}} template

In the entry 嘩佬, the alternative form "譁佬" is generated automatically, but this term is not commonly written that way. Mahogany115 (talk) 10:20, 27 June 2023 (UTC)[reply]

@Mahogany115: this is currently not supported by the template. I suggest that we could set |t2=- to disable that. What do you think? – Wpi (talk) 19:03, 27 June 2023 (UTC)[reply]
Alternatively, just remove 嘩譁 from Module:zh-forms/data, and add t2=譁○ manually to other entries. As the one who added that function, it's not a big deal. —Fish bowl (talk) 22:29, 27 June 2023 (UTC)[reply]
I followed what you said and failed. Mahogany115 (talk) 23:44, 27 June 2023 (UTC)[reply]
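For illustration, the |t2=- behaviour suggested above could work roughly as follows. This is a Python sketch under stated assumptions, not the real {{zh-forms}} code: the `VARIANTS` table is a hypothetical stand-in for whatever Module:zh-forms/data actually stores, and `alternative_form` is an invented helper.

```python
from typing import Optional

# Hypothetical stand-in for the variant data: maps a traditional character
# to the alternative traditional character that is generated automatically.
VARIANTS = {"嘩": "譁"}


def alternative_form(title: str, t2: Optional[str] = None) -> Optional[str]:
    """Return the second traditional form to display, or None for no form."""
    if t2 == "-":
        return None  # proposed: |t2=- suppresses the automatic form
    if t2:
        return t2  # a manually supplied |t2= always wins
    generated = "".join(VARIANTS.get(ch, ch) for ch in title)
    return generated if generated != title else None
```

With this logic, 嘩佬 would still get 譁佬 by default, while an entry that sets |t2=- would show no alternative form; removing the pair from the data and adding |t2= by hand elsewhere corresponds to emptying `VARIANTS` and relying on the second branch.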

I was trying to add the Persian word for "disrespectful"

I kept editing the page to make it look the way I envisioned. I'm new to this website, but my action was identified as harmful. Sweetishpie (talk) 15:08, 28 June 2023 (UTC)[reply]

@Sweetishpie Welcome to Wiktionary! Consider using the "Show preview" button to check that your edit does what you intend before you click "Publish changes". This, that and the other (talk) 01:47, 29 June 2023 (UTC)[reply]
@Sweetishpie: You created the entry احترامی (ehtarami) but put the header as بی‌احترامی (bi-ehtarami), which is a different word or a word with the preposition. This is not right.
If احترامی (ehtarami) is always used with بی (bi) (with a space, with a ZWNJ, or written as one word), there is a way to mention it:
  1. بی احترامی (bi ehtarami) with a space
  2. بی‌احترامی (bi-ehtarami) with ZWNJ
  3. بیاحترامی (bi-ehtarami) as one word
Anatoli T. (обсудить/вклад) 02:18, 29 June 2023 (UTC)[reply]
@Sweetishpie:
Also, the punctuation should be physically in the right position, so the exclamation mark should be placed at the end of the sentence, not the beginning. Since Persian is written right-to-left, the templates will automatically place the ! in the right position, like this:
بی‌احترامی نکنی! (bi-ehtarami nakoni!) “Do not be disrespectful!”
Compare with your version. Anatoli T. (обсудить/вклад) 02:24, 29 June 2023 (UTC)[reply]
Hello there, I do not know how to change the heading into بی‌احترامی Sweetishpie (talk) 11:24, 29 June 2023 (UTC)[reply]
@Sweetishpie: Don't. If the entry is احترامی, why should the header be بی‌احترامی? You don't make the header for the word disrespectful as disrespectfully, do you? Anatoli T. (обсудить/вклад) 02:22, 30 June 2023 (UTC)[reply]