Wiktionary:Beer parlour/2011/August
This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.
earliest-attestation categories
In such cases as we can say with some certainty — perhaps through research or by appeal to the OED and other authorities — that the earliest a word can be attested is 1922, or circa 1922, do we want to categorize it as such? Say, category:English words first attested 1900-40, with corresponding categories for "...1940-60", "...1960-80", "...1980-2000", "...2000-20", and, working down, "...circa 1900", "...1860-1900", "...circa 1850", "...1800-1850", "...circa 1800", "...1750-1800", and so on (with, for earlier centuries, perhaps fewer than the four categories per century I've envisioned for the 19th and 18th. Anyway, my choice of specific categories here was off the top of my head. I'm asking about the general idea. Too, specific categories will vary by language, with Esperanto, say, having pre-1887, 1887-1904, and other categories, perhaps).—msh210℠ on a public computer 06:07, 4 August 2011 (UTC)
- I like the general idea. --Daniel 23:41, 7 August 2011 (UTC)
- It would be a lot of work, and we would find that many authorities disagreed on the earliest attestation of a word (I discovered that making this list; for quartz, for example, there's a range of more than a century). On the other hand, it could be quite useful for some things and to some people. I don't oppose the idea. - -sche (discuss) 23:46, 7 August 2011 (UTC)
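The bucketing idea proposed in this thread can be sketched in a few lines of Python. This is only an illustration: the function name `attestation_category`, the table `RANGES`, and the treatment of each range as half-open are assumptions made for the example, since the proposal leaves the exact boundaries open. Only the 1900-2020 ranges quoted in the proposal are covered.

```python
# Hypothetical sketch: map a year of first attestation to a category title.
# The labels come from the ranges quoted in the proposal; treating each range
# as half-open [start, end) is an assumption, not part of the proposal.
RANGES = [
    (1900, 1940, "1900-40"),
    (1940, 1960, "1940-60"),
    (1960, 1980, "1960-80"),
    (1980, 2000, "1980-2000"),
    (2000, 2020, "2000-20"),
]


def attestation_category(year, language="English"):
    """Return a category title for `year`, or None if no range covers it."""
    for start, end, label in RANGES:
        if start <= year < end:
            return f"Category:{language} words first attested {label}"
    return None


print(attestation_category(1922))
```

A lookup table like this would make it easy to give each language its own ranges (for instance, different boundaries for Esperanto), which is the kind of per-language variation the proposal anticipates.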
Special:NewMessages
What's the deal with New messages, which appears in the upper right-hand corner of every page between "My watchlist" and "My contributions"? If I had new messages, wouldn't they be on my talk page? —Angr 13:38, 4 August 2011 (UTC)
- We have installed here, for use by those users who want it on their talkpages, LiquidThreads. See, e.g., [[user talk:Yair rand]]. If you post something using that system and get a reply, it will show up on your watchlist as "You have new messages" or something like that with a link to [[special:newmessages]], which latter (is also linked to from the top of each page, as you've seen, and) lists all the replies you've gotten using LiquidThreads. If you want for any reason to hide the "New messages" link atop each page, add
#pt-newmessages{display:none!important}
to your CSS ([[special:mypage/vector.css]] if you use Vector).—msh210℠ (talk) 15:51, 4 August 2011 (UTC)
- OK, thanks. It does seem like it would be better to call it something other than "New messages", since that's exactly what new messages on one's user talk page are called. —Angr 15:57, 4 August 2011 (UTC)
- Yes, the whole thing is somewhat poorly executed.—msh210℠ (talk) 16:03, 4 August 2011 (UTC)
Updating anagram format
I've actually only avoided raising this issue as I consider it so minor in relation to other areas where we could make progress. Wiktionary:Votes/pl-2009-12/Modify anagram section of ELE is now out of date as {{alphagram}} displays nothing unless the first parameter isn't a valid page name. So this example:
* {{alphagram|opst}}: [[opts]], [[pots]], [[spot]], [[stop]], [[tops]]
displays in fact
* : opts, pots, spot, stop, tops
That is, an isolated colon with a space either side of it, preceded immediately by a bullet point. I'd simply like to amend this vote to exclude {{alphagram}} and delete it (or RFDO it and let the community make that decision separately). Not necessarily delete it to never come back again, but it shouldn't be allowed to be used whilst it's blank, and Conrad.Bot, which added it in the first place, is inactive, and Conrad.Irwin hasn't said whether he intends to use it ({{alphagram}}) again. --Mglovesfun (talk) 14:07, 4 August 2011 (UTC)
- Sounds good to me.—msh210℠ (talk) 15:48, 5 August 2011 (UTC)
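For readers unfamiliar with the term, an alphagram is the letters of a word sorted alphabetically, so every anagram of a word shares the same alphagram. The following Python sketch shows how the example line quoted in this thread could be generated. The function names are hypothetical, and this is not the code that Conrad.Bot used.

```python
from collections import defaultdict


def alphagram(word):
    """Return the letters of `word` in sorted order, e.g. 'stop' -> 'opst'."""
    return "".join(sorted(word.lower()))


def group_anagrams(words):
    """Group words that share an alphagram; each group is sorted alphabetically."""
    groups = defaultdict(list)
    for word in words:
        groups[alphagram(word)].append(word)
    return {key: sorted(group) for key, group in groups.items()}


def format_anagram_line(key, group):
    """Build one wikitext bullet in the format quoted in the thread."""
    links = ", ".join("[[" + w + "]]" for w in group)
    return "* {{alphagram|" + key + "}}: " + links


groups = group_anagrams(["stop", "pots", "tops", "opts", "spot"])
for key, group in groups.items():
    print(format_anagram_line(key, group))
```

Dropping the template, as proposed, would amount to removing the `{{alphagram|...}}: ` prefix from `format_anagram_line` and emitting only the list of links.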
Romanizations of languages in ancient scripts
This point has been brought up before but it has never really been properly solved. Many old languages on Wiktionary were written in scripts that are no longer common and the texts in which they appear are more commonly published in romanized form than in the original script. The situation would be as if ancient Chinese texts were now almost exclusively published in pinyin. So although the original script was the only script used in contemporary attestations, modern readers will almost exclusively read texts in that language in Latin script. Grammars and dictionaries are written in Latin script as well, and this is the script that people will most likely want to look up words in. So I think using Latin script as the main script of these languages would have far more practical value for users than the original script ever will. I'm not saying that the words should not be present in the original script, but I would prefer it if we turned the tables: that the entries in original script link to the modern Latin-script versions of the terms. —CodeCat 14:15, 4 August 2011 (UTC)
- Sounds like a good idea in principle, but there may be a spectrum with no clear boundary here. For example, Sanskrit is usually written in Devanagari in India, but usually in romanization in the West, so it's not clear which should predominate. (Of course, Devanagari is a script that's still widely used for modern languages too, so that may tilt the tables in its favor.) Some languages' scripts aren't even encoded in Unicode yet, like Tocharian, so everything in Category:Tocharian A language and Category:Tocharian B language is already necessarily in romanization. But I think you have a good point for, say Gothic and Primitive Irish. No one really goes around reading Gothic script or Ogam nowadays; instead, romanization is practically universal. Definitely worth thinking about. —Angr 14:42, 4 August 2011 (UTC)
To be honest, I'm a bit tired of you bringing up that topic once again. Last time, it was shown that the community doesn't want this; repeating the whole discussion will not change a thing. You can try starting a vote if you really want to push this through, but whether or not this will pass is entirely up to the people here. -- Liliana • 12:39, 5 August 2011 (UTC)
- It bears consideration. Earlier this year somebody deleted our once considerable collection of romanized Sumerian, Akkadian, etc. I think these ancient languages that used inadequate or little-known scripts deserve at least the treatment that we permit for Chinese and Japanese. If an ancient language is usually studied in the English-speaking countries using the Latin script, then we should have the romanized spelling just like we do with Pinyin. Whenever entries can be created in the original ancient script (whether cuneiform, Devanagari, hieroglyphics, Mayan logographic script, Linear B, or whatever), then the romanized spelling could be made to redirect to the ancient script. Deleting ancient words that are added in the Roman alphabet was a dreadful loss, and maintaining all the entries in these ancient languages strictly in their lesser used and rather inaccessible scripts makes them not very useful to the people who want to study those languages. —Stephen (Talk) 13:24, 5 August 2011 (UTC)
- I'm not at all sure about making the Latin alphabet the main script for these languages, as CodeCat suggests, i.e. having the Latin-alphabet entry be the primary one, while the original-alphabet entry merely says "<Original alphabet> spelling of <Latin alphabet>" or the like. But we should definitely have listings for the romanizations of such entries. For example, qino should be an entry saying something like "Romanization of 𐌵𐌹𐌽𐍉 (qinō)" rather than a red link. —Angr 14:13, 5 August 2011 (UTC)
- Last time it came up, I saw no definitive resolution, certainly not that "the community doesn't want this". I find it silly to have entries only in scripts that they literally have never been published in.--Prosfilaes 21:32, 5 August 2011 (UTC)
- Romanizations are generally mentions, not uses. However if you romanize a whole text, or a whole series of texts in Gothic, this would be Gothic in Latin script, right? So the uses of the words are uses and not mentions. Am I missing something here? So you don't need an exception to CFI. Mglovesfun (talk) 11:22, 6 August 2011 (UTC)
- Yes, you’re missing something here. Ancient dead languages are not like modern living languages. Books and magazines are not being published in Sumerian or Akkadian. The vast majority of the known texts are in the form of images. Words of Spanish are used, but words in Sumerian are only mentioned. A couple of ancient languages are in the process of being revived, which is why we have an Old English Wikipedia; and a couple (Old Coptic, Ge'ez, Old Church Slavonic) are still in limited use liturgically, but most of these languages are simply studied, compared, and referenced. —Stephen (Talk) 12:36, 6 August 2011 (UTC)
- I agree with Angr: ... qino should be an entry saying something like "Romanization of 𐌵𐌹𐌽𐍉 (qinō)" rather than a red link. — I support the idea of romanisation of some languages, Gothic and others, for practical reasons. I've been thinking about Gothic script for a while (see [1] and here) but I guess that most users won't be able to type any Gothic characters, so there should be some kind of romanisation. Most dictionaries and grammars use romanised Gothic, so we shouldn't be "more catholic than the pope". --MaEr 14:17, 6 August 2011 (UTC)
- I agree with User:Angr and MaEr, "qino should be an entry saying something like "Romanization of 𐌵𐌹𐌽𐍉 (qinō)" rather than a red link". - -sche (discuss) 23:31, 6 August 2011 (UTC)
- I agree too, it makes no sense to have entries that nobody can search for. Romanised entries would be very helpful. BigDom 09:48, 7 August 2011 (UTC)
- I don't know if you're missing anything, but it doesn't seem to have been our practice to record Gothic in the Latin script, even though it is so published.--Prosfilaes 18:37, 6 August 2011 (UTC)
- Maybe I'm missing something, discussing is not my strong side. — Indeed, it isn't practice to record Gothic words in Latin script in this wiktionary, but in my opinion there should be some romanised entries that link to the Gothic script entries (as Angr suggested). Otherwise users have nearly no chance of looking up a Gothic word.
- Imagine you find a word like aþþan in an etymological dictionary: how would you look up this word in Gothic script? --MaEr 09:22, 7 August 2011 (UTC)
- Appendix:Gothic script -- Liliana • 11:39, 7 August 2011 (UTC)
- Do you really expect people to copy and paste each individual letter for every word they look up in Gothic?? —CodeCat 12:05, 7 August 2011 (UTC)
- No. This is why the Gothic script is featured in the edittools, so you can just click on all the letters you need to create your words. -- Liliana • 12:22, 7 August 2011 (UTC)
- If I had to do that just to look up one word, I'd probably find a better dictionary instead... —CodeCat 12:29, 7 August 2011 (UTC)
- I would think you could integrate the edittools toolbar onto the Main Page somehow, given good enough JavaScript skills. The setting on WT:PREFS doesn't seem to work for me, but in my opinion, this solution would be much easier to implement than a policy change. It wouldn't be a hassle for readers at all if it were implemented this way, as very many non-Latin dictionaries feature such a system. -- Liliana • 12:34, 7 August 2011 (UTC)
- This may work for some scripts like Gothic that are still superficially similar to Latin. But it would not work for cuneiform which is very different. Should we expect students of Hittite to learn cuneiform? —CodeCat 12:49, 7 August 2011 (UTC)
- This would not work even for Gothic. If somebody wanted to look up the Gothic words eyz or noicz, how would he be able to transliterate into the Gothic script unless he knew the alphabet? (And, of course, he would need a Gothic font installed or he wouldn’t see anything but boxes.) Many Gothic transliterations are even more cryptic than these two. The edittools are there for our editors, and not really for someone trying to look up a word. People who study ancient dead languages have completely different goals than those who study modern languages, and most of them have little need to learn the ancient scripts, particularly if it is a difficult one like cuneiform or hieroglyphics, and more often than not will carry out most of their studies on words in the Roman alphabet. —Stephen (Talk) 13:13, 7 August 2011 (UTC)
- It's not a hassle for readers at all to force them to transliterate words into an archaic script that isn't used any more? I think you have a different definition of the word hassle than I do, because I think that would be a PITA.--Prosfilaes 20:04, 7 August 2011 (UTC)
- I count six people in this discussion who (seem to) want or tolerate Romanisations, and one person who doesn't want them; I presume more supporters and opponents have commented in other discussions. (I hope one of the other discussions explained why pinyin and romaji are allowed.) So, let's set up a page for a vote (but not start the vote yet), so that we can begin working out how the vote should be set up, and ultimately decide this issue. As to how to set up the vote: I suggest having (on the same page) different votes for different languages, so users can (for example) vote to allow Romanisations of Gothic but oppose allowing Romanisations of Hittite, if they want. - -sche (discuss) 20:35, 7 August 2011 (UTC)
AWB
Hi, I wondered if I could be put on the approval list to use AutoWikiBrowser. I used it over at Wikipedia even before becoming an admin there so I know how to use it. At the moment, I need it to fix a small error I found in the conjugation table {{lb-conj-regular}} which has affected a few pages and using AWB would be far quicker than going through the individual entries. Cheers, BigDom 16:51, 5 August 2011 (UTC)
- I don't see a problem with it. Granted. -- Liliana • 16:54, 5 August 2011 (UTC)
- Thanks, appreciated. BigDom 16:58, 5 August 2011 (UTC)
Glosses in old languages
Some words in old languages, such as Old High German, are attested only as glosses with translations in foreign-language (usually Latin) texts, instead of in running texts. In theory this is a mention and not a use, so as far as I know those words would fail CFI. However, because of the special situation of old languages, especially those that are sparsely attested, should there be an exception for these cases? —CodeCat 21:46, 5 August 2011 (UTC)
- This is also a problem for rare and recently extinct languages and dialects, such as the Vegliot dialect of Dalmatian, for which nearly all information comes from a German translation of an Italian text written by the scholar Matteo Giulio Bartoli and based on an interview with the sole surviving speaker of the language (and he was old, polylingual, partly deaf, and hadn't spoken the language in 20 years at the time of the interview).
- For Classical languages and languages known only from scholarly publications, the attestation criteria are normally relaxed. --EncycloPetey 04:51, 8 August 2011 (UTC)
- Liliana (Prince Kassad) is a fan of allowing mentions for otherwise unattested languages. I'm not really a fan myself, but I doubt I would actually object to it if it were voted on. But I also doubt I'd support it. --Mglovesfun (talk) 11:00, 9 August 2011 (UTC)
- For clarity's sake, I'd object to mentions being allowed for all dead languages, particularly as some dead languages are quite well attested, better attested in writing than some living languages! But for otherwise unattested languages such as Dalmatian, I would neither oppose nor support it (what I said above). --Mglovesfun (talk) 11:01, 9 August 2011 (UTC)
Regional distribution of colloquial terms
For colloquial terms it is often difficult to find information about regional distribution. Evidence for pan-UK and pan-US usage is not too hard to find, but print evidence for other usage seems more difficult. How can we accumulate evidence on other colloquial use? Is there a tag-and-category arrangement that would help? Can we accumulate votes or opinions somehow, perhaps using an entry's talk page?
AFAICT, we have never had a systematic effort to address this. We have had individuals who advocated particular dialects (Ireland, Canada, Australia, and Singapore come to mind). No catch-all category addresses this problem. Do we need a project page for each region to provide focus for potential contributors who may have some familiarity with a particular region or dialect? What would be good regions or dialects for experimenting? Scotland? Ireland? Australia? AAVE? Canada? Southern US? India? DCDuring TALK 20:21, 6 August 2011 (UTC)
Two entries that illustrate issues are toey (See also WT:RFV#toey.) and stupid-head (which bears an invisible remark in {{attention}}). DCDuring TALK 20:43, 6 August 2011 (UTC)
Reviewing entry Talk pages that contain "black" yields some candidates for an AAVE page (or one otherwise named) that might also improve some definitions. DCDuring TALK 20:43, 6 August 2011 (UTC)
Compound tenses in conjugation templates
I have recently been creating conjugation templates for Luxembourgish verbs; example {{lb-conj-regular}}. I was wondering if there is really any need to have the compound tenses in there, as the patterns don't change between verbs so just showing the relevant auxiliary verb should be enough. Currently, the number of parameters required to use the templates is getting out of hand. This is mainly due to the Eifel Rule, which means that -n or -nn endings are removed if the following word begins with certain consonants. Would anyone object if I removed these tenses and just included the forms that are actual conjugations of the verb, as in the Dutch templates (e.g. {{nl-conj-wk}})? The only problem would be that I would have no idea how to convert the existing template calls into the new form. BigDom 20:34, 6 August 2011 (UTC)
P.S. the new template would look like this, which has parameters to add a line for preterite indicative and simple conditional if needed (only a few verbs have these conjugations). BigDom 23:09, 6 August 2011 (UTC)
- You could take a look at the tables on the Galician verb cantar and Latin verb amō, to see how this has been handled for some compound tenses in Romance languages. I'm not familiar enough with Luxembourgish to offer an opinion to you. --EncycloPetey 04:43, 8 August 2011 (UTC)
- That's fair enough, not many people are too familiar with Luxembourgish! I had a look at those, and came up with User:BigDom/Template:lb-conj, which is based on the Latin and French templates. BigDom 18:13, 8 August 2011 (UTC)
Including compound tenses in tables is a help to readers, and there is no good reason to exclude it (in a paper dictionary, the good reason would be the paper used). Lmaltier 19:31, 18 August 2011 (UTC)
d, di, de
It is not "bullshit", d and di are the alternatives of de (的) and de (地). Please see Basic Rules of Hanyu Pinyin Orthography Chapter 7.4. And see here and here. Engirst 11:28, 7 August 2011 (UTC)
- That site (pinyin.info) does support your claim. How reliable is it? Do any printed reference works make the same claim? - -sche (discuss) 20:42, 7 August 2011 (UTC)
- Please see the original Chinese edition Basic Rules of Hanyu Pinyin Orthography (The national standard of the People's Republic of China). Engirst 22:47, 7 August 2011 (UTC)
- uhha... What exactly is your point? It obviously says de is the accepted form and the others are secondary (i.e. COULD be used) in 4.7.4, which comes back to my point, why are you putting in translations with secondary forms instead of the accepted primary form? Or maybe you just want to be difficult? Jamesjiao → T ◊ C 22:52, 7 August 2011 (UTC)
- Also please see printed reference works here and here. Engirst 04:36, 8 August 2011 (UTC)
- And Chinese Romanization: Pronunciation and Orthography. Engirst 05:01, 8 August 2011 (UTC)
- Right, "d" and "di" are clearly secondary forms. Do you think we should use them rather than the primary forms? Why? - -sche (discuss) 05:08, 8 August 2011 (UTC)
- "But it may be desirable in certain situations to differentiate the three. In this case, they may be assigned different written forms: 的, the most commonly used, as "d"; 地 as "di"; and the third, 得, as "de"." (Please see here)
- Anyway, the entry "di" shouldn't be deleted (Please see here). Engirst 05:56, 8 August 2011 (UTC)
- Have you lived in China at all? Try using pinyin with people, yes that's not going to turn out good for ya. So no, pinyin will never replace characters and will always STAY a pronunciation scheme. Anyway the pdf you linked, it states at the very end of the section that Note: when necessary for technical purposes, the characters (referring to the 3 discussed here) may be spelled as d, di, and de respectively. What technical purposes? What was your purpose to prefer di over de in your translations? It makes no sense whatsoever to do that. Jamesjiao → T ◊ C 23:15, 8 August 2011 (UTC)
- The subject of this topic is only to let everyone know that it is not "bullshit". Engirst 01:36, 9 August 2011 (UTC)
- de is the dominant form and will always be. You can list di and d as alternative forms, but should never use them in translations. It's misleading. By the way, stop creating pinyin entries until an agreement has been reached on how we will go about creating in the future. You will suffer a block again if you persist in your singleminded approach. Jamesjiao → T ◊ C 20:52, 7 August 2011 (UTC)
- We should follow Wiktionary's current rules and you too. Some entries of new format for experimental purpose just following your edit (please see here). Engirst 23:08, 7 August 2011 (UTC)
- Alright, my understanding of the above-cited references seems to mirror Jamesjiao's: "d" and "di" are secondary forms which exist, and which should definitely be mentioned in the main entries ([[的]] and [[地]], I presume), but which should be disused elsewhere in favour of the primary forms. - -sche (discuss) 04:48, 8 August 2011 (UTC)
- The pronunciation "di" (with no tone) exists for the two out of the three particles that have the normal reading "de" - 的 and 地. This pronunciation is still common in songs and poems as the alternative to "de". It's seldom used in dictionaries and in my observation, it's discouraged in China like everything non-standard. I wouldn't include "d" at all. It must be incorrect pinyin, standard hanyu pinyin NEVER uses consonants on their own (without a vowel), apart from "r" (as a final only).
- It is encouraged in China and is a National Standard. Please see the National Standard of the People's Republic of China for your reference. Engirst 06:45, 8 August 2011 (UTC)
- YES, it is a standard. It is a standard for PRONUNCIATION for Mandarin speakers just like IPA is an international standard for pronunciation. A pronunciation standard is such that it doesn't contain any ambiguity (that's why English words themselves cannot be used as a pronunciation guide because their pronunciations are ambiguous!). It does not replace Chinese characters and never will. Jesus, how far would you go to twist and turn words like that to fuel your vain attempt at degrading this dictionary into a pinyin dictionary? I am not sure what I have to do to drill this into your brain. Why don't you just make an IPA dictionary as well for all the languages on this website? Go ahead. Jamesjiao → T ◊ C 23:15, 8 August 2011 (UTC)
- Where do you see "d" on its own? Hanyu pinyin is a national standard for romanisation and a learning tool, not a replacement for the proper script - hanzi. --Anatoli 20:54, 8 August 2011 (UTC)
- Pinyin entries are convenient for users to learn Chinese. Only you said that Pinyin are for replacing Hanzi. Engirst 00:44, 9 August 2011 (UTC)
- BTW, pinyin.info is an interesting site using a lot of pinyin but their objective is to replace hanzi with pinyin as the standard Chinese Mandarin script, same as our ill-famed abc123 aka Engirst, etc. --Anatoli 05:33, 8 August 2011 (UTC)
- You have also chosen articles that favour your arguments. See this: w:zh:汉字改革, w:zh:汉语拼音, especially the section on 汉语拼音化 (pinyinisation). Oh yeah, another interesting thing to note is that everything on the zh wp is written in characters, not pinyin!! Does that tell you something? Jamesjiao → T ◊ C 23:29, 8 August 2011 (UTC)
- Don't depart from the topic. The subject of this topic is only to let everyone know that it is not "bullshit". Engirst 01:16, 9 August 2011 (UTC)
- Alright, let's put usage notes at [[的]] and [[地]] explaining that "d" and "di" exist as (nonstandard? uncommon?) secondary romanisations of the characters, noting (if desired) which authorities/references give them as secondary romanisations. (Unless someone has a specific argument against providing this information, e.g. that the information is invalid. Even then — even if it is invalid — if it is in printed reference works, it would seem helpful to users to have a usage note like "XyzReference lists "d" as a secondary romanisation of this character, but this is wrong...") Consensus, however, is not to use those romanisations anywhere else. There furthermore appears to be an argument about whether Chinese is written in characters (such as 革) or in pinyin, which is spilling over into this thread from elsewhere; consensus on that issue is clearly that Chinese is written in Chinese script characters. - -sche (discuss) 23:55, 8 August 2011 (UTC)
Vote: Attestation of extinct languages 2
FYI, I have opened the vote: Wiktionary:Votes/pl-2011-05/Attestation of extinct languages 2. --Dan Polansky 09:57, 8 August 2011 (UTC)
The problems of Mandarin entries
What is your suggestion for solving these problems? Engirst 10:11, 9 August 2011 (UTC)
- Untoned pinyin is not allowed. We should follow rules. :) —CodeCat 11:20, 9 August 2011 (UTC)
- We are talking about the search and redundancy problems. Please read these problems clearly first. Engirst 12:29, 9 August 2011 (UTC)
- Entries are already searchable by pinyin on Wiktionary. Type in "yinyue" into the Wiktionary search bar and you'll see pinyin and characters are all searchable. ---> Tooironic 13:38, 9 August 2011 (UTC)
- Please read these problems clearly first. Engirst 17:26, 9 August 2011 (UTC)
- Engirst --
- It is clear that Tooironic has already "read these problems clearly". As previously noted, entries are already searchable by pinyin on Wiktionary. Try it. Seriously. Enter toneless pinyin into the Wiktionary search bar, and the results you get are quite close to the "good solution" you link to on Jamesjiao's Talk page. WT effectively already implements what you are suggesting, obviating any need for toneless pinyin entries. -- Eiríkr Útlendi | Tala við mig 18:10, 9 August 2011 (UTC)
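As a rough illustration of why a toneless query can still match toned pinyin, the following Python sketch strips tone marks from a pinyin string. It is only an assumption about how such matching could be done, not a description of how the Wiktionary search actually works; the function name `strip_tones` is hypothetical.

```python
import unicodedata

# Combining marks used for the four pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Return the pinyin string with tone marks removed (hypothetical helper).

    The diaeresis on "ü" is kept, because it is part of the letter
    rather than a tone mark.
    """
    # Decompose each letter into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that "u" + diaeresis becomes a single "ü" again.
    return unicodedata.normalize("NFC", kept)


print(strip_tones("yīnyuè"))
print(strip_tones("píngguǒ"))
```

With a normalization like this applied to both the query and the indexed text, "yinyue" and "yīnyuè" would compare equal without any separate toneless entry.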
- Thanks for your response. But the problems are: one problem about yapo mentioned by Contributions/71.66.97.228; another I am talking with Jamesjiao is about the duplication of traditional and simplified character entries. Engirst 20:33, 9 August 2011 (UTC)
- Perhaps you could restate the exact issues, then? Reading User_talk:Jamesjiao#yapo, the primary issue appears to be about searching, which is already addressed, and about page overlap between toneless pinyin entries and other languages, which is moot since toneless pinyin pages are not needed and should be (are in the process of being?) removed.
- I see your mention of duplication issues, but you do not give enough detail there for me to understand what you mean. Is your concern about duplication that the same entry content is duplicated across multiple heading words, such as 馬 and 马? This is an issue for multiple languages, even English (c.f. color vs. colour -- the content should be mostly identical, as these are essentially the same word, only spelled differently -- just as, for example, 呪い and 詛い in Japanese).
- Please explain. As it is, the main concern of yours that I can understand has already been dealt with. -- Cheers, Eiríkr Útlendi | Tala við mig 20:50, 9 August 2011 (UTC)
- Yes, we are talking about the duplication such as 馬 and 马 (Please see here as Jamesjiao mentioned).
- This should be a good solution. There is no duplication of entries of the dictionary of this "good solution". Please see the search results of "蘋果", "苹果", "ping2guo3" and "pingguo", there is no duplication indeed. Engirst 03:57, 10 August 2011 (UTC)
- Again, just ignore him, he's trolling again. It is true that the trad/simp entries could be synchronised in a way to make it easier to contribute, but so far no one has come up with any kind of solution. ---> Tooironic 00:46, 10 August 2011 (UTC)
- For all that, Engirst has apparently hit upon a real issue that has been a conceptual niggling thorn in my side as well. However, the crux of the issue -- the need to have multiple index fields having the same descriptor content -- touches on one of the core limitations of the wiki structure: you can transclude, but you can't have more than one index field (i.e., headword) per page. Dictionaries like the one that Engirst points to as potential solutions use very different back-end database structures, something that is just not possible on the current generation of wiki software (and probably won't be possible for the foreseeable future). This structure works fine for an encyclopedia, but it has real shortcomings when people try to apply it to a dictionary.
- Several months back, I recall participating in a similar discussion about how to unify English-language entries such as color and colour. There just doesn't seem to be an elegant way to do it; labeled section transclusion presents itself as one option, as does fancy selective transclusion using
{{#ifeq:}}
calls, but then the trouble is still that the content must reside under just one headword and then be referenced by the alternate spellings. Another option might be redirects, but then the destination of the redirects must include some way of explaining the alternate spellings and the reason for the redirection. The Semantic MediaWiki extension seems the most promising, and some folks have built interesting tools using this that might do the kind of many-headwords-to-one-entry structure that Engirst seems to desire, but I don't think this extension is enabled for WT, and it would require a gargantuan amount of work to support here.
- So Engirst, if you're reading this, I do feel your pain -- but there's nothing for it, unfortunately, as the reason that Wiktionary needs separate pages for 蘋果 and 苹果, or 馬 and 马, or colour and color, comes down to the core fundamentals of how the wiki software is designed -- and that's not going to change any time soon. -- Cheers, Eiríkr Útlendi | Tala við mig 06:31, 10 August 2011 (UTC)
- Don't feel too sympathetic, if you don't know the full story. The technical limitations were always there but the work on Mandarin and Serbo-Croatian was continued nevertheless, despite the necessity to maintain duplicate entries. People like Engirst slow down the work by not following the accepted rules and creating further redundancy, completely out of synch with existing simplified/traditional Mandarin entries, causing a lot of extra work for others. All the requests and blocks were ignored and he continued to do what he wanted using multiple anonymous accounts. --Anatoli 06:44, 10 August 2011 (UTC)
- The problem doesn't really come down to "the core fundamentals of how the wiki software is designed", exactly, only to how the Wiktionary editing system works. If we were to switch to using javascript tools as the primary way to edit entries, synchronizing data would be pretty simple. Side question: Is there any specific reason that the pages with toneless pinyin titles don't get
{{also}}
added to them, pointing to the actual entries, or is it just that nobody bothered? --Yair rand 06:54, 10 August 2011 (UTC)
All the sources I've found that list hypocorisms agree on some entries that we currently label as diminutives here. For example, our definition of the latter term is:
- A word form expressing smallness or youth
And the article Johnny says:
- A diminutive of the male given name John
And I doubt that all the Johnnies were named from an older or bigger John...
That's why I suggest standardizing these etymologies by replacing the various labels below with Template:hypocorism:
- Alec: diminutive
- Alex: shortened form
- Lex: pet form
- Kat: short form
- Joe: common nickname
- Deb: abbreviated form
JackPotte 18:01, 9 August 2011 (UTC)
- A diminutive also means a hypocorism. Isn't it simpler to add that definition to diminutive? Is the Wikipedia your only source? The diminutive definition is built in Template:given name. All diminutives/hypocorisms used to be defined as "given names", hence the confusion of terms above. Pet forms of given names are used in a different way in every language so strict standardization might not be a good idea. --Makaokalani 15:08, 10 August 2011 (UTC)
- Wikipedia is reliable, and the boundary is clear, there as in all the dictionaries I've read, including French ones, in which the translations (hypocoristique & diminutif) are fully transparent.
- "hypocoristic diminutive" isn't a pleonasm.
- The origins of many surnames are obscured by one characteristic of the hypochoristic forms of many personal names, that is, the pet forms, diminutives, or 'short' forms of names.
- Jacko is a diminutive (informal) whereas Jacky is an hypocorism...
- I'll add these findings to our two entries once everyone has made up their mind. JackPotte 20:56, 12 August 2011 (UTC)
Common nouns and proper nouns
I seem to be unclear on the difference between common nouns and proper nouns. Why, for example, is German a common noun when it means "a person from Germany" but a proper noun when it means "the German language"? It's capitalized in both meanings. —Angr 13:49, 10 August 2011 (UTC)
- Capitalization does not make a noun proper, nor does lower case make it common. The difference between common and proper nouns is intrinsic and lexical, with the decision of whether to capitalize or not being secondary. Capitalization in most languages is more by convention than by type. Spanish does not capitalize the names of languages, even though they are proper nouns. English capitalizes the days of the week, but does not really use them as proper nouns. German capitalizes all nouns. Further, capitalization even in English has varied through time, so that abstract nouns like socialism and liberty were once regularly capitalized even though they are not capitalized today. This reflects a change in style of writing, and not a change in grammar.
- For more than you want to know about the difference between the two categories of noun, see the draft I started at User:EncycloPetey/English proper nouns. --EncycloPetey 14:36, 10 August 2011 (UTC)
- Thanks for the link to your draft. It's the first time I've ever seen a definition of proper noun that wasn't circular. Usually when I try to pin someone down on why something is a proper noun, they say "Because it's capitalized." And when I ask why it's capitalized, they say "Because it's a proper noun". What will be easiest for me to remember is that proper nouns are always definite and don't get pluralized (although some proper nouns are pluralia tantum, like "the Netherlands" and "the United States"). As for weekdays, I think they can be both. If I say "I'll do it on Friday", it's a proper noun as it's referring to a single unique day, but if I say "There are five Fridays in this month", it's a common noun because it's referring to members of a class. Noticing that your draft is called "English proper nouns", I wonder if it's possible to come up with a cross-linguistic definition of proper noun. Other parts of speech like "noun", "verb", "adjective", and "preposition" can be defined without reference to the language they occur in (though of course not all languages have all parts of speech). —Angr 14:52, 10 August 2011 (UTC)
- It is possible, but pushes into the realm of abstract linguistic philosophy, which will be understandable by few people. My choice was to work on a page treating English as exhaustively as possible, with enough examples and discussion to allow people familiar with other languages to make the extrapolation of the principles themselves. Even the quality of "definite" doesn't work across all European languages because there are shades of difference in what that means. Some languages have a "definite" and "indefinite" form for all their nouns. --EncycloPetey 14:55, 10 August 2011 (UTC)
- CGEL makes a distinction between proper nouns and proper names which we never have in English PoS headings, AFAICT, though former versions of CFI did.
- "The central cases of proper names are expressions which have been conventionally adopted as the name of a particular entity - or, in the case of plurals like the Hebrides, a collection of entities."
- "Proper nouns, by contrast, are nouns which are specialized to the function of heading proper names."
- As I would apply these definitions "German" is both a common and proper noun. It is a proper name when referring to the language. This usage seems to make it a proper noun. When referring to the people "(the) Germans" would seem to be the proper name. When referring to an individual "German", it is a common noun (capitalized). It also seems to function as a full adjective, being gradable, comparable, and able to serve as a predicate without any article or determiner. DCDuring TALK 15:09, 10 August 2011 (UTC)
- The CGEL distinction between "proper noun" and "proper name" hinges on the fact that they define a word as a cohesive unit lacking internal spacing. Since Wiktionary works with terms as its units, and permits internal spacing in these terms, the distinction between a "proper name" and "proper noun" becomes moot. But, your summary of what CGEL says is spot on. --EncycloPetey 15:24, 10 August 2011 (UTC)
- It is also true that typically we do not have Proper noun PoS sections at entries like [[Germans]] for "the Germans". Shouldn't we? DCDuring TALK 19:11, 10 August 2011 (UTC)
- Also, aren't informal demonyms also proper names and, therefore, proper nouns for Wiktionary purposes, eg, "(the) Brits"? Even derogatory ones would be. DCDuring TALK 19:18, 10 August 2011 (UTC)
- Informal demonyms have some properties of a proper noun, but that's true of most substantive biological nouns, and not just demonyms. Compare: "We planted a conifer." to "The conifers grow in boreal climates." In the former sentence, you are speaking of a member of a group, but in the latter, you are referring to the category as a whole. We generally do not create a separate entry for these collective senses. --EncycloPetey 19:23, 10 August 2011 (UTC)
Nei Mongol - Why is it locked?
"Mongol" of Nei Mongol shouldn't be an "abbreviation for Mongolia" (please see here for reference). Anyhow, the entry shouldn't be locked. Engirst 02:33, 11 August 2011 (UTC)
Why is it locked?
The etymology of Nei Mongol seems to have a problem. Engirst 00:57, 1 October 2011 (UTC)
Library of Congress vocabularies
Over at http://id.loc.gov/ you will find search and download entries to the Library of Congress' subject headings, name authorities and other vocabularies. For example, in the Geographic Areas file, you will find that "Sweden" has a broader term "Europe" and narrower terms such as "Lapland". This means library books tagged as Lapland may contain information about Sweden and Europe. It might be a hint that the Wiktionary entry Sweden should contain a pointer to the Wiktionary entries Europe and Lapland. I don't know if this is a useful source of ideas, but you can download it and play with it. Wiktionary has 81 links to loc.gov but none yet to id.loc.gov. --LA2 10:52, 11 August 2011 (UTC)
- It might be great as an authoritative substitute for our current unreferenced, whimsical topical category structure. It would allow the reversal of the hijacking of the usage context labels. DCDuring TALK 12:17, 11 August 2011 (UTC)
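The broader/narrower idea from this thread can be sketched in a few lines of Python. This is a toy illustration only: the record below contains nothing beyond the Sweden example given above, and the dictionary layout and the name `suggested_links` are assumptions, not the actual format of the id.loc.gov downloads.

```python
# Toy record modelled on the example in the post above. The field names
# ("broader", "narrower") are hypothetical, not the real data format.
RECORDS = {
    "Sweden": {"broader": ["Europe"], "narrower": ["Lapland"]},
}


def suggested_links(term, records):
    """Return entries that the page for `term` might point to (hypothetical helper)."""
    record = records.get(term, {})
    related = set(record.get("broader", [])) | set(record.get("narrower", []))
    return sorted(related)


print(suggested_links("Sweden", RECORDS))
```

A list produced this way would only be a hint for a human editor; whether each suggested pointer belongs in the entry would still need to be judged case by case.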
Admin-only definition editing options trial
It was suggested in the earlier discussion on enabling the definitions editing tool for a trial period that it would be better to first have opt-out trials for only administrators. So what do people think about turning it on for two weeks for admins? --Yair rand 18:54, 11 August 2011 (UTC)
- Support. —RuakhTALK 18:04, 12 August 2011 (UTC)
- Support. DCDuring TALK 18:36, 12 August 2011 (UTC)
- Okay.—msh210℠ (talk) 17:53, 14 August 2011 (UTC)
- Support, sounds good to me. --Neskaya … gawonisgv? 07:04, 15 August 2011 (UTC)
- Okay, trial started. I'll put a disabling button in this section. Anywhere else that should have a disable button? Maybe in WT:News for editors, or is that not really the kind of thing that goes there? --Yair rand 21:06, 17 August 2011 (UTC)
- The trial is now over. --Yair rand 20:58, 31 August 2011 (UTC)
What counts as a "derived term"?
A string of dubious and excessive edits to Japanese entries leads me to wonder, what counts as a "derived term"? Do simple compounds warrant listing as "derivations"?
By way of example, have a look at the 魔法#Japanese page. The list of "derived terms" includes things like 魔法カード "magic card" and 魔法能力 "magic ability", among others. Both of these are just plain old compounds -- one word plus another -- and I could just as validly say 魔法茄子 "magic eggplant" or 魔法鉛筆 "magic pencil". Note that these terms are not customary set phrases, like magic carpet, but just plain old compounds.
Do compounds like this, of the exceptionally prosaic and unremarkable sort, merit inclusion in lists of "compounds" or "derived terms" on entry pages? -- Cheers, Eiríkr Útlendi | Tala við mig 06:14, 12 August 2011 (UTC)
- To clarify, now that my brain has picked up some speed, how do we decide if a combination of words is just a sum of parts, or if it counts as something more? -- Eiríkr Útlendi | Tala við mig 06:16, 12 August 2011 (UTC)
- Have you looked at WT:AJA? There is an associated talk page. Note the existence of an "Idioms" header. DCDuring TALK 18:40, 12 August 2011 (UTC)
- Thank you DCDuring, yes, I have looked at that page. What I'm wondering about here is not quite about idioms, but rather what counts as a "derived term" or "compound". The WT:AJA subsection Derived terms doesn't quite answer the question. (But thank you for prompting me to read through that page again, as it clarifies that only kanji headwords should have a "Compounds" section.) -- Cheers, Eiríkr Útlendi | Tala við mig 19:01, 12 August 2011 (UTC)
- I was hoping yours was a Japanese-specific issue.
- I don't think this is a settled question at the margins - and the margins are ample. What can be under Derived terms would include all morphological or historically derived terms that meet WT:CFI. (I personally would prefer to put terms that are historically derived from other languages despite there being a morphological process to which the etymology could be ascribed in Related terms.) But there has also been inconclusive discussion about the desirability of inserting common collocations under Derived terms. I personally prefer having certain collocations illustrated in usage examples rather than in Derived terms, but, especially for large entries, citations appear on the Citations page where they are not searchable by default. DCDuring TALK 19:46, 12 August 2011 (UTC)
unified Serbo-Croatian... by bot
Would it be acceptable to convert the subpages of Category:Croatian parts of speech to Serbo-Croatian by bot, as opposed to only by hand? It would be risk free; it wouldn't be possible to add the Cyrillic spellings, but that's about the only thing a bot can't do. Specifically
- Convert ==Croatian== to ==Serbo-Croatian==
- Convert |hr}} to |sh}} (etyl templates)
- Convert {{hr-decl-noun| to {{sh-decl-noun|
- Convert |lang=hr to |lang=sh
- Convert [[Category:Croatian to [[Category:Serbo-Croatian
- Convert [[Category:hr: to [[Category:sh:
- Convert {{infl|hr| to {{infl|sh|
This would leave {{hr-adj}}
, {{hr-noun}}
and {{hr-noun-coll}}
. {{hr-noun-coll}}
has few enough transclusions that it can be done by hand in a few minutes; not so for {{hr-noun}}
and {{hr-adj}}
though; these as a temporary measure could categorize in [[Category:Serbo-Croatian <adjectives|nouns>]] while waiting for them to be removed; or, depending on your taste, AWB can skip any pages featuring these two templates. I'm pretty sure you can set up AWB (AutoWikiBrowser) to skip if it finds a certain sequence of characters on a page, such as {{hr-noun|. Mglovesfun (talk) 12:59, 14 August 2011 (UTC) IFYPFY.—msh210℠ (talk) 00:37, 14 November 2011 (UTC)
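The conversions proposed above amount to a handful of literal string replacements plus a skip rule, which can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the AWB configuration or any bot that was actually run: the function name `convert_page` is hypothetical, and it takes the "skip" option for pages that use {{hr-noun}} or {{hr-adj}}.

```python
# Literal substitutions taken from the list in the proposal (old -> new).
REPLACEMENTS = [
    ("==Croatian==", "==Serbo-Croatian=="),
    ("|hr}}", "|sh}}"),
    ("{{hr-decl-noun|", "{{sh-decl-noun|"),
    ("|lang=hr", "|lang=sh"),
    ("[[Category:Croatian", "[[Category:Serbo-Croatian"),
    ("[[Category:hr:", "[[Category:sh:"),
    ("{{infl|hr|", "{{infl|sh|"),
]

# Pages containing these templates are left for manual handling.
# Note that "{{hr-noun" also matches {{hr-noun-coll}}, which the
# proposal says would be converted by hand anyway.
SKIP_MARKERS = ("{{hr-noun", "{{hr-adj")


def convert_page(wikitext):
    """Return the converted wikitext, or None if the page should be skipped."""
    if any(marker in wikitext for marker in SKIP_MARKERS):
        return None
    for old, new in REPLACEMENTS:
        wikitext = wikitext.replace(old, new)
    return wikitext


sample = "==Croatian==\n===Noun===\n{{infl|hr|noun}}\n# [[dog]]\n[[Category:hr:Animals]]"
print(convert_page(sample))
```

Plain substring replacement is deliberately naive: a pattern such as `|lang=hr` would also match a longer code beginning with "hr", so a real run would want a dry run and manual review of the diff first.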
- The work of editors that have chosen Croatian headers, etc. must be respected. Lmaltier 19:14, 18 August 2011 (UTC)
- Even if I bought into this argument (which I don't) Ivan Stambuk created the majority of the Croatian entries, and he supports converting them to Serbo-Croatian. Mglovesfun (talk) 08:58, 19 August 2011 (UTC)
- I have no objections. —Internoob (Disc•Cont) 19:02, 21 August 2011 (UTC)
- It looks interesting and I support it but not sure what to do with existing Croatian, Bosnian and Croatian translations, they may coincide (same words) or differ with existing Serbo-Croatian translations, many of them are not formatted with
{{t}}
, only have square brackets. Serbian may have nested Cyrillic and Roman (occasionally Latin) translations and sometimes no nesting. If they coincide with existing Serbo-Croatian translations, they should not add duplications. --Anatoli 00:26, 22 August 2011 (UTC)
- User:Mglovesfun/vector.js converts Serbian to Serbo-Croatian in translation tables, but does not convert Bosnian and Croatian to avoid possible duplication. Note: translation templates don't appear in the above proposition. The reason the vector converts Serbian is that they're the closest alphabetically and that only Serbian uses both scripts. Also, some Serbian translations only use the Cyrillic script and use a transliteration into the Latin script, this despite the fact the Latin script is official in Serbian. I read a book on the matter while I had no Internet, and the book wasn't even a recent print! Mglovesfun (talk) 07:58, 22 August 2011 (UTC)
- Serbian uses both Cyrillic and Roman. --Anatoli 09:50, 22 August 2011 (UTC)
- Yes... Latin and Cyrillic. Mglovesfun (talk) 10:12, 22 August 2011 (UTC)
Current votes
These are the current votes:
--Daniel 04:04, 15 August 2011 (UTC)
- Thanks! --Neskaya … gawonisgv? 07:03, 15 August 2011 (UTC)
native-languages.org
Hello, do you think we could get information from this website? In their FAQ, we can read:
- Q: May I reprint information from your website on my own website or blog?
- A: Yes, as long as you link back to our website from the page where you have used our information.
Yet, there is also
- Q: I am a teacher. May I use information from your website in my classroom?
- A: Yes. All of the materials on our website may be freely used for noncommercial educational purposes.
The problem is that the first answer seems to mean we can use this information if we cite them, but the second says it may only be used for noncommercial purposes, which is not compatible with the Wiktionary licence. Maybe we could write to them to ask whether we can exceptionally import these data into Wiktionary? What do you think? Pamputt 07:41, 17 August 2011 (UTC)
- To be honest I don't really trust their information, so it's probably not worth asking. -- Liliana • 09:31, 17 August 2011 (UTC)
including context tags in inflected forms (of sh entries)
This vote (which links to kolovoza) raises for me a point worth discussing: for Serbo-Croatian entries, should we allow dialect/sublanguage context tags not only in main entries (kolovoza), but also in form-of entries? That would have the benefit of clarifying that the series of letters kolovoza is only used in Croatian; it might have the disadvantage of making readers think (until they clicked through to the main entry) that kolovoza was a Croatian genitive of a pan-Serbo-Croatian word and Serbian used a different genitive, like *kolovozu. Note that I say allow, not necessarily require (uniformity is good, but the work could be left to the editors who wanted to do it). - -sche (discuss) 07:55, 18 August 2011 (UTC)
- Dunno, as an analogy I dislike something like:
==English==
===Noun===
'''favors'''
# {{US}} {{plural of|favor}}
As my initial reaction reading this is 'what is the non-US plural of favor'? -Mglovesfun (talk) 10:17, 18 August 2011 (UTC)
- Inflected form entries are just glorified redirects, not mirrors of lemma entries. We should limit such additional content to cases for which the inflection has a context different from that of the lemma. The Serbo-Croatian aspect of this is a result of the vote on that matter. I'm surprised we haven't gotten more pushback on that vote. DCDuring TALK 12:26, 18 August 2011 (UTC)
- I agree with Mglovesfun that this part of your comment: "it might [… make] readers think […] that kolovoza was a Croatian genitive of a pan-Serbo-Croatian word and Serbian used a different genitive, like *kolovozu" hits it exactly on the nose. Form-ofs shouldn't duplicate lemmata's context-tags. That said, what if kolovoza really were a Croatian-specific genitive of a pan–Serbo-Croatian word? What about rare/archaic/dialectal plurals of ordinary English nouns? I think context-tags are potentially useful in those cases. [After e/c: this also seems to be what DCDuring is saying.] —RuakhTALK 12:40, 18 August 2011 (UTC)
- I agree with this, I can find one specific example:
====Verb====
'''spelt'''
# {{chiefly|British}} {{past of|[[spell#Verb|spell]]}}
This to me seems to be correct. Mglovesfun (talk) 19:22, 18 August 2011 (UTC)
- In addition to that type, with an explicit context tag, we have entries like [[boyz]], which uses
{{form of}}
with the in-template equivalent. Arguably it should have a better register indication than "informal". DCDuring TALK 20:05, 18 August 2011 (UTC)
Pinball category?
What up homies. I've been adding some pinball terms. I don't know whether it merits a category, and don't know or care enough about categories to create one, but if anyone is so inclined then the following terms might possibly qualify: autoplunger, backbox, backglass, flipper, flipperless, gobble hole, kickback, knocker, multiball, rollunder, rollover, rolldown, outhole, outlane, pinballer, playboard, plunger, silver ball, sinkhole. Equinox ◑ 21:46, 18 August 2011 (UTC)
- I don't see why not. Mglovesfun (talk) 08:13, 19 August 2011 (UTC)
- Did you miss digit counter (it's in "Pinball Wizard"). SemperBlotto 08:19, 19 August 2011 (UTC)
{{suffix|verb|t}}
Could I please continue adding etymology to verb forms ending in -t‽ --Pilcrow 22:40, 18 August 2011 (UTC)
- I think you'd better avoid adding "===Etymology=== {{suffix|verb|t}}" to the likes of "dreamt", and avoid adding "===Etymology=== {{temp|suffix|verb|ed}}" to the likes of "dreamed". In general, I think verb forms should better have no etymology section, with exceptions in those cases where the etymology is unusual and of special interest.
- For the record, there exist the following categories:
- Category:English words suffixed with -ed - 342 members
- Category:English words suffixed with -t - 21 members
- --Dan Polansky 10:14, 20 August 2011 (UTC)
- But the -t suffix is irregular. If etymology sections should not be included, could I at least categorize those forms‽ --Pilcrow 16:40, 20 August 2011 (UTC)
Note that the hand picture links for these Dutch, Swedish and Mandarin entries point to #English in the links - how to fix this? ---> Tooironic 01:15, 21 August 2011 (UTC) Also note that Category:Visual_dictionary contains both English and LOTE entries - is it supposed to be mixed up like that? ---> Tooironic 01:20, 21 August 2011 (UTC)
{{picdiclabel}}
has a language parameter (lang). Mglovesfun (talk) 10:09, 21 August 2011 (UTC)
- How do you use it? I'm a newbie about these kind of things. ---> Tooironic 12:46, 22 August 2011 (UTC)
- Like this, as you can see it took me two goes to get it right. Mglovesfun (talk) 12:48, 22 August 2011 (UTC)
- Awesome. Thanks. How about the Dutch and Swedish at hand? ---> Tooironic 21:34, 23 August 2011 (UTC)
- Was just about to do this for you, but Mglovesfun has beaten me to it! It's fixed now anyway. BigDom 22:32, 23 August 2011 (UTC)
Preferred forms for Japanese lemmata
Haplology and I wound up conversing a bit on the subject of lemma forms for Japanese, as relating to the keiyōdōshi part of speech (also known as "quasi-adjectives", and better known among Japanese learners as "な (na) adjectives"). So far, every Japanese dictionary that I've ever seen uses the uninflected base form of a な adjective as the headword -- except Wiktionary. For reasons lost in the mists of history, Wiktionary alone uses an inflected form of な adjectives as the headword, by including the な on the end. This causes some odd inconsistencies, such as the base uninflected forms being mostly just stub entries, sometimes being missing, sometimes being redirects to the inflected forms with な, and also to the base forms sometimes being classified as nouns (which is never correct AFAICT).
I was under the general impression that, while Wiktionary happily includes inflected forms of a word, the main entry should be under the uninflected form, with the inflected forms mostly just pointing to the main entry. (I cannot find an explicit description of this policy, neither at WT:ELE nor at WT:AJA; perhaps this could be added?) This holds true at least for English, German, Spanish, Latin, Korean, and Navajo terms, and for Japanese verbs and い (i) adjectives, to the best of my knowledge. If this understanding is correct, would anyone object to Japanese editors keeping the main entries for な adjectives under the uninflected, な-less base forms? -- Eiríkr Útlendi | Tala við mig 16:14, 23 August 2011 (UTC)
- I second this. It's easier as well, otherwise you would need to keep both entries - with な and without it. The same is true for の (-no) adjectives. You can add an adjective section to existing noun entries, e.g. 病気. --Anatoli 20:28, 23 August 2011 (UTC)
@Anatoli: Thanks for replying.
Some additional questions / considerations:
There are lots of entries with [Japanese word] + な, or [Japanese word] + に. By everything I've read (not just on WT), these な and に are essentially particles, which makes these entries just sum-of-parts and thus not meeting WT:Criteria for inclusion. I tried to explain some of how this area of Japanese grammar works, and why this means such entries are SOP and thus not valid, over at WT:RFD#親切に. The way Japanese keiyōdōshi work makes this even more important, in that keiyōdōshi are *both* adjectives and adverbs at the same time; in kanji compounds, there is no distinction between adjectival or adverbial senses, and in spoken or running text, the distinction is made by using either the な (or の for those rarer の-type keiyōdōshi) or the に particles -- i.e., by adding a separate word.
I propose that keiyōdōshi entry content be kept under the main keiyōdōshi headword, without any particles. Adjective/adverb senses can be shown by using Template:ja-na or similar, as is currently the case over at 特別. I further propose that the keiyōdōshi + particle entries be deleted, as these are sum-of-parts and are thus no more worthy of inclusion than English phrases like an apple or to the store.
However -- since keiyōdōshi are equally adjectives and adverbs, the WT:AJA#Quasi-adjectives (形容動詞) recommendation to use level-three or -four Adjective headings seems inadequate, as this ignores adverbial senses. So:
- Should we include both Adjective and Adverb headings for keiyōdōshi?
- Should we instead use some other heading, such as Keiyōdōshi, Quasi-adjective, or something else?
- Should we use just the Adjective heading, and 1) add something to Appendix:Japanese_glossary about this? 2) create Appendix:Japanese_grammar? 3) refer users to w:Japanese grammar?
TIA for your input, -- Eiríkr Útlendi | Tala við mig 18:34, 25 August 2011 (UTC)
I would go with including both Adjective and Adverb headings for keiyōdōshi. While clear to people familiar with Japanese grammar, other headings would be confusing to most readers. If both Adjective and Adverb are included next to each other, it should be clear from the juxtaposition that the word is both parts of speech at the same time. Such a system would also be easier for new contributors to pick up. In addition, it would be the most aesthetically pleasing, in my opinion. In any case I agree that pages with -na or -ni should be deleted and the content moved to the headword without -na or -ni. Haplology 15:36, 27 August 2011 (UTC)
- I've changed my mind. In line with Eirikr, I conclude that na-adjectives or keiyōdōshi should be listed under the header "Nominal" in their uninflected form, that is, without a "na" or "ni". For example, おぼろ. Haplology 17:48, 29 October 2011 (UTC)
- I really don’t understand why English Wiktionary uses forms with な as entries. At least they should end with だ. First of all, there are two conflicting analyses of what they call na-adjectives:
- In the traditional Japanese grammar with keiyōdōshi, 科学的 is not a word but a stem, but it is against my native speaker’s intuition. It is a noun, even though its behavior is different from common nouns. If a Japanese pupil doesn’t know the meaning of 科学的な, he or she will quite naturally look for 科学的 in a dictionary, not 科学的だ or 科学的な. Why don’t we have a small table to show what they call keiyōdōshi and some inflected forms under the noun entry like the following?
- It is clearer, and relatively free from the academic disagreement. — TAKASUGI Shinji (talk) 12:30, 30 October 2011 (UTC)
- I think I agree and so does Eirikr, as I understand, at least mostly. I'm just not sure I understand your proposal exactly, so just to confirm:
- Which parts-of-speech headers to have and what to call them? Do you mean one header, Noun, with Attributive, Adverbial, and Predicative forms in a table inside Noun? Is that better than using Nominal instead of Noun?
- Table: The current {{ja-na}} template produces a table which has all of those forms and others, except it has Predicative for [[科学的だ]] instead of Terminal. How about we use that table?
- Page title: In your example, is the page 科学的?
- In short, how would you change 素直?
- Thanks Haplology 16:23, 30 October 2011 (UTC)
- I didn’t know the nominal header. However, 素直 is a noun functioning like an adjective, rather than a word of another class functioning as a noun. It is not uncommon to say 素直が一番だ instead of the standard 素直なのが一番だ ("Being honest is the best thing"). If we don’t have an adjectival noun header, I believe it should be a noun, even though that is not traditional. The use of {{ja-na}} will be fine. — TAKASUGI Shinji (talk) 18:05, 30 October 2011 (UTC)
- Thank you for chiming in, Takasugi-san. I must admit I shy away from using the ===Noun=== header, precisely because 1) these do not function as regular nouns, as you point out, and 2) a number of these words cannot be used grammatically as nouns. For instance, I don't believe you can use the -的 (-teki) words as the subject of a sentence in a purely noun sense. A quick Google search for google:"科学的は" gives tons of hits, but most with punctuation between the 的 and the は; those few instances without punctuation instead show 科学的は as shorthand for 「科学的」とは or some similarly elided construction. This is not unlike English utterances such as "'decidedly' is an interesting word", where a part of speech that is not a noun is used as the noun subject of a sentence. One such Japanese example is here on an Amazon book listing page:
科学的はどんな推論をし、アプローチするのか?
- A quick translation of this might be:
Scientific is what kind of reasoning or approach? (direct translation to match the source language)
- 科学的 (scientific) is clearly being used as the topic/subject of this sentence, but I would argue that it is still adjectival in nature. In the same vein, verbs and verb phrases can be used as subjects, as in 覚えておくがいい (it would be good to remember), but these are still verbs.
- After mulling for some time on what would be the best header for this POS, I find my thoughts keep coming back to the simple label na- adjective. My reasons:
- As noted above, these are not straight nouns, and labeling them ===Noun=== would cause no end of confusion.
- ===Nominal=== works to some extent, except I find that this is seldom used, and further thought leads me to suspect that few WT users would know what this is. Ditto for Nominal adjective, Copular adjective, Descriptive noun, Adjectival noun. And, FWIW, I've only ever seen the label Quasi-adjective here on WT.
- The term na- adjective is widely used in Japanese textbooks for the English speaker, whereas I have only seen other labels used by linguists in dense academic writing that most WT users probably won't have read. This label makes it clear that this is an adjective, but also that it's somehow different from being just an adjective.
- I poked around the Wiktionary talk:About Japanese page, but I couldn't find anything that looked like a real discussion of the proper labels for this part of speech. Since it looks like everyone (so far, at least) is happy to locate the lemmata for these under the stem form (i.e. without any trailing -na or -da), it looks like this particular Preferred forms for Japanese lemmata issue is closed. I will start a new thread momentarily about what to call 形容動詞 (keiyōdōshi) in English for purposes of the POS label. -- Cheers, Eiríkr Útlendi │ Tala við mig 16:15, 31 October 2011 (UTC)
- Yes, we seem to have agreed on the issue of lemmata. Just one thing to comment: がいい is rather a suffix today and you cannot have a pause between が and いい, and you cannot replace いい with another adjective. Its meaning is also slightly different from と いい or のが いい.
- 覚えておくと、いい。
- 覚えておくのが、いい。
- *覚えておくが、いい。 (ungrammatical: がいい cannot be split)
- おそらく覚えておくといい。
- おそらく覚えておくのがいい。
- *おそらく覚えておくがいい。 (ungrammatical: がいい cannot coexist with some modal adverb)
- 私が行くといいだろう。
- 私が行くのがいいだろう。
- *私が行くがいいだろう。 (ungrammatical: がいい cannot be used for the first person)
- — TAKASUGI Shinji (talk) 16:02, 1 November 2011 (UTC)
- Excellent note, Takasugi-san, thank you very much for that analysis. 勉強になります。 We should probably have a がいい page then, as well as the alternates がよい and が良い, as this usage is clearly idiomatic. (And congratulations as well on your adminship!) -- Eiríkr Útlendi │ Tala við mig 21:39, 1 November 2011 (UTC)
- I updated WT:AJA slightly to reflect the decision not to have -na in headwords, namely by removing -na from the example provided under the section Quasi-adjectives. At the same time I deleted a few of my own creations, 無力な, むりょくな, and muryoku na, citing this discussion. This is just to double check that (a) that was right to do and (b) we should delete the other pages with -な, because they are SOP and therefore fail CFI? I want to make sure because there are a whole lot of them.
- By the way, @Takasugi-san, congratulations on becoming an admin! Actually I didn't know about the vote until today. Haplology 16:34, 1 November 2011 (UTC)
- Hmm, given the stated goal of "all words in all languages", and the way that other languages happily include inflected forms such as aß or hablo or even 見た, perhaps we should keep the entries ending in -na? If so, these entries should include just a brief description that the -na forms are the prepended adjectival inflections, and otherwise act as stubs pointing to the main entries without the -na. Similarly for entries ending in -ni, including a brief description that these forms are the adverbial inflections and linking through to the lemmata. -- Eiríkr Útlendi │ Tala við mig 21:39, 1 November 2011 (UTC)
How to remove "Lithuanian nouns lacking gender"? ---> Tooironic 21:40, 23 August 2011 (UTC) Or is that supposed to be there...? :S ---> Tooironic 21:41, 23 August 2011 (UTC)
- Someone has to add the gender info into the entries, then the number will be reduced. --Anatoli 22:20, 23 August 2011 (UTC)
- Does {{g}} belong in translations? DCDuring TALK 22:22, 23 August 2011 (UTC)
- Yes, the gender was lacking from the translation. I've added a pos parameter to {{g}}, so in theory at least you could do {{g|lt|pos=translations}} and move it to Category:Lithuanian translations lacking gender. Mglovesfun (talk) 22:29, 23 August 2011 (UTC)
- I thought gender was supposed to be a named parameter of {{t}}. DCDuring TALK 22:44, 23 August 2011 (UTC)
- Unnamed actually, the third parameter (like {{t|fr|foo|f}}). But when there is no gender, users can choose to add {{g|fr}} (or {{g|French}}). This doesn't cause any problems per se, but could apply to probably every English entry with a translation table. I tend to add genders when they're missing, but as long as the gender is in the target (the translation) I say there's no need to worry about it. Mglovesfun (talk) 10:28, 24 August 2011 (UTC)
Been thinking about this and seeing this specific entry. I've decided to use it as an example. WT:CFI line one says "As an international dictionary, Wiktionary is intended to include “all words in all languages”." Is this a word in a language? WT:CFI makes no attempt to define word or language in this specific context. Normally I'd be happy to consider this a word, but for our purposes it might be better to consider commercial coinages like this to be nonwords. Furthermore, I wouldn't consider this English, but rather Translingual. For example, on a bottle of Palmolive shower gel I had, the translations into Russian and Greek (as well as all the other languages) used the word Palmolive in the Latin script. So, one possible solution to the issue of brand names is to not consider them words in any language for CFI purposes. Mglovesfun (talk) 22:19, 23 August 2011 (UTC)
- Bear in mind that a lot of brand names are actually translated, though. I've seen some ingenious cases: if you get an Arabic or Georgian bottle of Coca-Cola (yeah, my paper shop gets the bottles that "fell off a lorry"), it has almost the same logo, but reworked slightly so as to write the name in the appropriate script. I suppose that makes it a different word, even if it's only a transliteration. Equinox ◑ 22:21, 23 August 2011 (UTC)
- One of the best things we could do, IMHO, is to remove the section for brand names from CFI, and include brand names whenever they are single-word and attestable. Brand names do not create any problems; they are just disliked as uncustomary for a dictionary, in spite of the fact that useful lexicographical information can be recorded on them, including pronunciation and etymology. You seem to be proposing the very opposite: to exclude all brand names. We can argue whether brand names are words, but the fact is they have many properties typical of words: they get pronounced, they get printed, they take positions in sentences, they serve as a basis of derivation (there is the Czech word "adidasky" derived from "Adidas"), they have an etymology, etc. --Dan Polansky 09:03, 24 August 2011 (UTC)
- That's another approach, used by the French WT:CFI. It can get a bit silly doing it that way; names of films and books and TV series and whatnot. Mglovesfun (talk) 10:28, 24 August 2011 (UTC)
- You don't need to include all multi-word names of works ("Much Ado About Nothing") in order to include all attestable single-word brand names. By contrast, "Lysistrata", a play by Aristophanes, should IMHO be included, if only for its pronunciation--Wikipedia even has different UK and US pronunciations. --Dan Polansky 12:36, 24 August 2011 (UTC)
- FWIW, regarding specifically Talk:Adidas, was the RFD ever closed? Because it looks like a fail with two people wanting to delete it, and none wanting to keep it. Mglovesfun (talk) 18:28, 25 August 2011 (UTC)
- From Talk:Adidas and the archived discussion, it follows that a RFD started on 16 September 2007. There, two people wanted to delete the page--Connel MacKenzie (who claimed that it was "spam"[2], having tagged the entry in this revision, which had the non-promotional definitions "The German sports apparel manufacturer adidas AG, formally founded in 1949" and "A clothing product of this brand, especially a pair of shoes") and Williamsayers79, while two people were sympathetic with the entry even if stating no boldfaced "keep": DAVilla, and bd2412. Connel was utterly anti-brand, as follows from the vote linked to below: "There is no reason to include any brand name, product name or trademark in a dictionary [...] --Connel MacKenzie 17:41, 31 August 2007 (UTC)". In case of doubt, you may send "Adidas" to a new RFD (for which I vote keep) or to RFV via WT:BRAND, but this is the sort of entry that is likely to meet the current strict requirements of WT:BRAND. The 2007 RFD on "Adidas" was running in parallel with the second vote on brand names, which was running from 5 September 2007 to 5 October 2007 (Wiktionary:Votes/pl-2007-08/Brand_names_of_products_2), a vote that had bearing on whether "Adidas" met CFI. --Dan Polansky 08:23, 26 August 2011 (UTC)
"Category:en:Planets" with proper nouns only, etc.
Since Wiktionary:Votes/2011-07/Categories of names failed and our categories for names of things are named with language codes, I suggest letting these categories be populated only with proper nouns.
This means:
- removing ice giant and extrasolar planet from Category:en:Planets;
- removing pulsar and red giant from Category:en:Stars;
- and so on.
This is already the common practice concerning most of these categories. Compare:
- desert is not a member of Category:en:Deserts;
- skerry is not a member of Category:en:Islands;
- tributary is not a member of Category:en:Rivers;
- planetary is not a member of Category:en:Planets;
- French is not a member of Category:en:Countries (or of Category:en:Countries of Europe, for that matter).
Thoughts? --Daniel 00:25, 25 August 2011 (UTC)
- I think I intuitively approve of this. Disregarding my dislike for Category:Fictional characters, I would rather it only contained actual characters and not things like protagonist and soubrette. Equinox ◑ 00:36, 25 August 2011 (UTC)
- I think it could be useful to group words relating to (for example) rivers, though, like river, tributary, etc. I would not mind using appendices for that, though, or very long ===See also=== sections, or (perhaps the best option) linking to appendices from ===See also=== sections. I have no very strong feelings/interest in the matter. - -sche (discuss) 00:48, 25 August 2011 (UTC)
- Category:en:Rivers would be a terrible place to look for tributary, because that word would be effectively hidden among a long list of names of rivers. --Daniel 01:19, 25 August 2011 (UTC)
- I agree with Daniel. The terms "river", "tributary" and the like are also found in Wikisaurus:watercourse; while this is done outside of the category system, it is at least a workaround for those who prefer categories. --Dan Polansky 10:30, 25 August 2011 (UTC)
OK. I created this. --Daniel 23:42, 26 August 2011 (UTC)
Fancy button in rhyme pages
When editing rhyme pages such as Rhymes:English:-eɪm, I see a row that says "Add new rhyme:" followed by an input field. I find this pretty annoying and would like to see this disabled at least for me. How can I disable it?
What this button does is add an item to a wikilist. A person who cannot add an item to a wikilist should not edit a wiki, IMHO. --Dan Polansky 10:23, 25 August 2011 (UTC)
- I don't think it's for people who can't edit a list; the tool just makes it quicker and easier. No idea how to remove it, although I'm sure someone will be able to help you there. BigDom 10:32, 25 August 2011 (UTC)
- In the gadgets section of Special:Preferences, there's an option to "Disable the rhymes editor". --Yair rand 16:55, 25 August 2011 (UTC)
- (BTW, it doesn't only add it to the wikilist, it also adds the {{rhymes}} template to the rhyme's entry, and a pronunciation section if it doesn't already have one.) --Yair rand 17:05, 25 August 2011 (UTC)
- Thank you. Pretty straightforward; I should have looked in Special:Preferences myself. --Dan Polansky 07:58, 26 August 2011 (UTC)
- Unfortunately, it adds the "Rhymes" to the top of the Pronunciation section instead of to the bottom, and this is never correct. It also means that random vandalism or erroneous additions to Rhymes pages require additional cleanup, since users no longer have to open the page code and see the warning about stress on the correct syllable. --EncycloPetey 17:34, 28 August 2011 (UTC)
- I've changed the script so that it adds rhymes to the bottom of the pronunciation section. --Yair rand 21:52, 31 August 2011 (UTC)
Removing words from Wiktionary:Wanted entries
I recently removed yhe from Wiktionary:Wanted entries as I believe that all Google book hits are either OCR errors or just typographical variants (for the). Someone else reinstated it because "it might be a word in some language". Have we got a policy for such actions? SemperBlotto 14:02, 26 August 2011 (UTC)
- Why do we even have the page? Is it an inheritance from WP? It is really not much help to have a term "wanted" without a language specified. We have the whole family of requested entries by language. DCDuring TALK 15:24, 26 August 2011 (UTC)
- Would it be unreasonable to treat it as a cleanup page until emptied and discourage or "forbid" additions to the page? For example, we could restrict changes to admins and have blue links deleted daily or weekly. DCDuring TALK 15:36, 26 August 2011 (UTC)
- The only stumbling block I can think of is the way the top of the WT:Wanted entries list shows up at the top of a user's Watchlist. If there could be some way of allowing users to specify a language's "wanted" list (or maybe for multiple languages?) to display on the Watchlist, then I'd be all for DCDuring's proposal here. -- Eiríkr Útlendi | Tala við mig 15:45, 26 August 2011 (UTC)
- It actually says to check Special:WhatLinksHere before removing terms from the list. In some cases, all the incoming links are from outside the main namespace, often user pages and user talk pages. Such ones can be removed; any link in the main namespace should be checked for validity, such as typos. --Mglovesfun (talk) 15:59, 26 August 2011 (UTC)
- Couldn't a bot take care of the Special:WhatLinksHere check, then? The list is large enough, that would seem to make more sense than going through manually. -- Eiríkr Útlendi | Tala við mig 17:23, 26 August 2011 (UTC)
- Not really a Wiktionary bot, no, since no edits would be involved. A java script might be able to do it. But I'm not the person that can tell you about that. Mglovesfun (talk) 23:21, 27 August 2011 (UTC)
- I agree with DCDuring; the language-specific pages should be used (including Wiktionary:Requested entries:Unknown language). - -sche (discuss) 19:05, 26 August 2011 (UTC)
- I agree with removal since your work should be reflected somehow. Leaving all the terms there in perpetuity is not going to progress us toward our goal. If the language is unknown then it's even more useless. DAVilla 22:16, 27 August 2011 (UTC)
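The Special:WhatLinksHere check discussed above (remove a wanted term when every incoming link comes from outside the main namespace, otherwise check the main-namespace links by hand) could be sketched roughly as follows. This is only an illustration, not an existing tool: the function names and the prefix list are hypothetical, and it assumes the incoming page titles have already been collected by some other means.

```python
# Hypothetical, incomplete list of namespace prefixes; a real wiki defines more.
NON_MAIN_PREFIXES = (
    "User:", "User talk:", "Talk:", "Wiktionary:", "Wiktionary talk:",
    "Appendix:", "Template:", "Category:", "Rhymes:",
)


def is_main_namespace(title: str) -> bool:
    """Treat a title as main-namespace if it carries none of the known prefixes."""
    return not title.startswith(NON_MAIN_PREFIXES)


def triage_wanted_entry(incoming_links: list[str]) -> str:
    """Return "remove" if no incoming link is from the main namespace,
    otherwise "check" (a human should look at those links for typos etc.)."""
    main_links = [title for title in incoming_links if is_main_namespace(title)]
    return "check" if main_links else "remove"
```

A term linked only from user pages would come back as "remove", while a term with even one link from an ordinary entry would come back as "check" and still need a human to judge whether that link is valid.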
WOTD
I was keeping this updated fairly well for the last few months, but RL work situation has made it impossible for the moment. In a couple of weeks when I'm back from Libya I am happy to crack on, otherwise if anyone else wants to update them then feel free. Ƿidsiþ 08:16, 27 August 2011 (UTC)
- Stay safe! Mglovesfun (talk) 10:04, 27 August 2011 (UTC)
- Yes, stay safe! I have set words and changed the templates for the 28th of August through the 3rd of September; that should give other editors (or me) time to set the rest of September. I have also set the 1st, 2nd, 4th, 5th, and 7th of October. (I think we should pick a word derived from German, or having to do with unity, for the 3rd of October, Germany's Day of Unity; it would be topical.) - -sche (discuss) 08:56, 28 August 2011 (UTC)
- Thanks! I'm not familiar with all the details of word-of-the-day, but I do have a few tips from what I've observed:
- The WOTDers try not to re-use words of the day. Therefore:
- Before setting something as word of the day, they check the upper-right-hand-corner of the entry to make sure it wasn't used before. According to [[pareidolia]], that word was already word of the day (17 February 2011).
- Conversely, when setting something as word of the day, they add {{was wotd}} to the entry (à la [[pareidolia]]) so that future editors can see that it has already been used (or is about to be used).
- When setting words of the day, they list them at (e.g.) [[Wiktionary:Word of the day/Archive/2011/August]], so other editors can see what they are. (There are a few editors who keep watch for upcoming WOTDs and do last-minute cleanup.)
- —RuakhTALK 13:57, 28 August 2011 (UTC)
- That pareidolia appeared twice is evidence of the conspiracy! (Wait, that's not an example of pareidolia, that's an example of paranoia.)
- Ok, I've checked all of the other August, September, and October words I set, they all look new (no WOTD links at the tops of the pages or in Whatlinkshere); thanks for pointing that out! I've also added {{was wotd}} to the words. Thanks for adding August to the archive; I've started a September archive. - -sche (discuss) 18:24, 28 August 2011 (UTC)
- Will someone be selecting words for September? I notice that "-sche" has selected some for October, but most of September has not been set as far as I know. I would be willing to do November, to give Widsith a break (if needed; I needed the same from time to time when I was selecting them), but I don't think I'd have time right now. I even went most of this last week without logging on much because of duties in the physical world. --EncycloPetey 17:24, 28 August 2011 (UTC)
Am hoping this will solve a problem or two; see my comment at Wiktionary talk:Votes/2011-07/Pinyin entries#Romanization. Mglovesfun (talk) 10:01, 27 August 2011 (UTC)
Languages written in more than one script, attestation
Is it a good policy or not to require that for a language using more than one script, each script form of a word (term, idiom, etc.) should be attested for it to be included? For example, do we expect верс to be attested separately with three citations from vers, or will three citations for both forms, Cyrillic and Latin, do? Mglovesfun (talk) 10:24, 27 August 2011 (UTC)
- I think it varies based on language. For something like Serbian, where the scripts have a one-to-one correspondence, I think demanding that each script be separately attested is pointless and bureaucratic. For other languages, that don't have a simple transliteration between scripts, it's probably necessary to attest each separately.--Prosfilaes 10:41, 27 August 2011 (UTC)
- Or languages that have a dominant script and a rare one. I seem to think Tatar can be attested in the Latin and Arabic scripts, but is predominantly written in Cyrillic. Mglovesfun (talk) 11:06, 27 August 2011 (UTC)
- And for some like Japanese, where a word might be usually spelled in kanji, for instance, but hiragana and romaji (i.e. Latin alphabet) words are added as well to aid learners.
- Attestation concerns aside, given the kerfuffle about pinyin entries for Chinese and the discovery from that discussion that searching for pinyin finds hanzi entries just fine, I find myself asking -- do we really need Japanese headwords in kana and romaji? If we do, then wouldn't we also need pinyin headwords? What's the distinction? -- Eiríkr Útlendi | Tala við mig 18:33, 27 August 2011 (UTC)
- We allow pinyin as headword and have done for some time, though don't ask me when the first one was created (and not deleted). Mglovesfun (talk) 21:32, 27 August 2011 (UTC)
- I'm a bit invested in the romaji and kana pages so I'm biased, but at least they do serve to replace a Homophones section. Having one canonical kana page eliminated duplication or missed entries in Homophone sections scattered across many pages, and a single, comprehensive list of homophones might serve some benefit to learners of Japanese. Putting romaji pages in topical categories increases the duplication of entries but makes it easy for learners to read a group of related terms at a glance, which is helpful since learning groups of related terms at once is the best way to learn a foreign language. Having romaji terms in topical categories would allow people with no knowledge of Japanese to learn a handful of terms in a few seconds. There are arguments for and against, but I favor having them. Haplogy 04:16, 1 September 2011 (UTC)
- The disambiguation role of romaji and pinyin is just fine. The trouble is when all the contents that should be in kanji/hanzi entries goes into pages only serving learners to cope with the complex script and find a proper word. It also seems that some people have an agenda of promoting non-standard writing. The proper native script should not be replaced with the romanisation.
- On Serbo-Croatian (or Serbian alone). One-to-one conversion is only 99% ok. Care should be taken on borrowed words, as Roman script often uses the original spelling and some letter combinations have variants. (I'm leaving the differences in dialects Ekavian/Ijekavian).
- On Tatar, Belarusian. Some nationalistically minded users created a bunch of entries in Roman, especially in Tatar. Tatar (but not Crimean Tatar, it's a different language) is officially written in Cyrillic, so is the majority of the online and printed material in Tatar. You may think I am biased but Azeri, Turkmen and Uzbek are now officially in Roman, even though they were written in Cyrillic, the entries should be primarily created and have translations in Roman. Failure to do so confuses the readers.
- We are on the way to create pinyin policy. Perhaps we should address some other languages, written in multiple scripts. There is no definiteness for some (e.g. Konkani in India) but we could have a guide for patrollers to use. --Anatoli 04:35, 1 September 2011 (UTC)
- The original question on верс. Yes, a correct Serbo-Croatian word and a correct conversion from Roman. --Anatoli 04:41, 1 September 2011 (UTC)
- In response partly to Haplology and partly to Anatoli, my view on Latin alphabet (and kana) entries for languages that generally use another script is to view them a bit like disambig pages. For Japanese in particular, romaji and kana pages may include multiple possible kanji, making romaji or kana pages very much indeed like disambig pages. As such, entries in non-standard scripts should probably do no more than provide a bulleted list of the main terms, with brief glosses to help users make the correct selection. -- Eiríkr Útlendi | Tala við mig 02:23, 2 September 2011 (UTC)
- This is in line with the current pinyin vote. --Anatoli 02:33, 2 September 2011 (UTC)
- So to sum up, trying to stay global rather than comment on specific cases, it depends on the language. Different languages should get different treatments. How am I doing? Mglovesfun (talk) 07:01, 2 September 2011 (UTC)
Template:ante and Template:post
Why do {{ante}} and {{post}} abbreviate their output to a. and p.? It saves two whole characters on each of them, and makes them quite a bit more opaque in meaning.--Prosfilaes 22:07, 27 August 2011 (UTC)
- IFYPFY.—msh210℠ (talk) 04:59, 28 August 2011 (UTC)
- I would support changing them so that they do not abbreviate. - -sche (discuss) 23:01, 29 August 2011 (UTC)
- I think the usual meanings are (and our glossary says the meanings are) "not after" and "not before" rather than "before" and "after". Thus, a quotation dated a. 1924 might be from 1924. So whatever this conversation decides, it should not be to change the displays to "before" and "after" (unless someone goes through every single time the templates are used and edits the dates!). "Ante" and "post" have a similar problem (as people know what they mean); so do "a." and "p.", but not as badly. But maybe "ante" and "post" don't have it badly enough to worry about: I don't know.—msh210℠ (talk) 01:49, 30 August 2011 (UTC)
- Surely we shouldn't justify having an abbreviation because we have a weird definition and that makes it more opaque.--Prosfilaes 02:09, 30 August 2011 (UTC)
- Msh210: I understood this as a request to change "a." to "ante" and "p." to "post". But are you saying "a." means something different than "ante"? (Are you saying "a." means "no later than" but "ante" means "before"?) If so, I echo Prosfilaes' comment. If not, the expansion will not cause a semantic problem. PS, note that while Wiktionary:Glossary does not define "a." or "ante", it defines "p." as "post or after, often used in quotations", which disagrees with Appendix:Glossary... one or the other should be corrected. - -sche (discuss) 21:12, 31 August 2011 (UTC)
- To address your last point first, Wiktionary:Glossary shouldn't define either, as they're not used in discussions here (and, anyway, are in Appendix:Glossary). To your first point: Ante and a. both mean (translate as) "before", and post and p. "after". But we don't use them that way, so in our citations they don't mean that. That's a bad thing (unless it's the same as what other dictionaries do, in which case it's okay, I suppose). But if that's the way it is, then yes, essentially, a. means "not after", since (as it's an abbreviation) we can make it mean whatever we want: people might not know it comes from ante; otoh, ante, a Latin word, clearly means "before". So a. is better. Best of all, though, would be to change our system (again, provided it doesn't match other dictionaries'), which should be doable by a complicated bot. (It would need to look for post and ante, written in by hand or by template, and change the year by one unless the "year" is a century or the like (in which case leave it) and unless the citation had been added or edited since the decision was made to switch over (in which case tag it for human attention). Or something like that.)—msh210℠ (talk) 15:37, 1 September 2011 (UTC)
- I dunno about dictionaries specifically, but isn't the use of "ante [year]" to mean "in or before [year]" pretty normal? I mean, do you take “They were reviewed in this journal when they originally appeared (ante 1973), III, 103–4 and (1976) IV, 125–6” and “The projected growth rates of labour supply under ‘normal’ (that is, ante 1973) demand conditions in both countries are about the same as those prevailing since the mid-1960s” (both c/o b.g.c.) to mean strictly before 1973? —RuakhTALK 17:58, 1 September 2011 (UTC)
- I take it to mean "strictly before", yes. If I'm wrong as to the general intention of writers, or in the minority among readers, ignore my 2c, above.—msh210℠ (talk) 18:42, 1 September 2011 (UTC)
- I don't know. I understand it differently from how you do, but it really just might be me. (I'm pretty sure I'm the one who gave the current glosses at Appendix:Glossary.) Very relatedly — I'm reading a book called Semantic Antics, about various English words that have changed meanings in bizarre ways, and it frequently says that a certain word has a certain meaning (say) "before 1483". I've been taking that to mean "by 1483", since it seems strange to emphasize that a year after which the word is already known to have a certain sense, but again, maybe that's just me? —RuakhTALK 20:17, 1 September 2011 (UTC)
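The year-shifting rule that msh210 describes for the bot can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `convert_bound`, the marker sets, and the `edited_since_switch` flag are hypothetical, and the sketch works on an already-extracted marker and date rather than on real wikitext or the actual templates. It assumes the current displays mean "not after" and "not before", so a strict reading needs the ante year raised by one and the post year lowered by one.

```python
import re

# Hypothetical marker spellings; real citations may use other forms.
ANTE_MARKERS = {"a.", "ante"}
POST_MARKERS = {"p.", "post"}

# Only a bare year is shifted; "15th century" and the like are left alone.
PLAIN_YEAR = re.compile(r"\d{1,4}")


def convert_bound(marker, date, edited_since_switch=False):
    """Convert an inclusive bound to a strict one.

    Returns (marker, new_date, needs_human_attention).
    """
    key = marker.strip().lower()
    if key not in ANTE_MARKERS and key not in POST_MARKERS:
        raise ValueError(f"not an ante/post marker: {marker!r}")

    # Citations touched after the switch-over may already use the new
    # convention, so they are only tagged for a human to check.
    if edited_since_switch:
        return marker, date, True

    text = date.strip()
    if not PLAIN_YEAR.fullmatch(text):
        # A century or similar vague date: leave it unchanged.
        return marker, date, False

    year = int(text)
    if key in ANTE_MARKERS:
        # "not after 1924" becomes "strictly before 1925".
        return marker, str(year + 1), False
    # "not before 1924" becomes "strictly after 1923".
    return marker, str(year - 1), False
```

Under these assumptions, `convert_bound("a.", "1924")` moves the year to 1925, while a date such as "15th century" passes through untouched and a recently edited citation is flagged instead of changed.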
Position of Template:was wotd
The current vertical position of {{was wotd}} is about level with the L1 page title, above all the language (L2) entries (no matter where it is included). This implies that the whole page was featured when in fact it was just the English entry. This has two problems. (1) About 500 pages where this template is used have entries below the English section for which this implication is false, and (2) eventually we might want to feature an English entry where there is a preceding Translingual entry, which would break the even looser implication that the template applies to the following/top entry. I therefore propose that this template float on the right-hand side (I think this is the only RHS one that doesn't) like others do. We can then move the template uses into the English entries, at least where there's a possibility for confusion. Sound OK?--Bequw → τ 16:08, 28 August 2011 (UTC)
- Well, we've only ever chosen English words, so the initial decision was to place the template as far up the page and out of the way as possible, so that it would not overlap any page content. In practice, this varies a bit by browser.
- Placing it in the English section will not be any less misleading. We only ever feature one part of speech, and English words often have more than one part of speech, so I don't see that moving the template would solve any actual problem. --EncycloPetey 17:24, 28 August 2011 (UTC)
- I know historically there were some "overlapping" layout problems, but our current floating RHS content doesn't suffer from any that I know of. And actually, the current position overlaps with the section-0 (page header) [edit] link that can be added with JS (originally from an en.wiki gadget). I find it quite useful and at least one other person uses this. As for the change being less misleading, the template could be placed in the actual part of speech that was featured. It'd be hard to see how narrowing the indication from the whole page down to a language's part of speech isn't more accurate. The move also helps with the logic/consistency that language content should be in its language section. I can't think of any other language content template that isn't in its own section. Does the current position help the WOTD maintainers? If so we can have the default layout be "float"ing and then have some WT:PREFS code to move it to its current, absolute position. --Bequw → τ 14:58, 29 August 2011 (UTC)
- Well, it helped me during the years I ran it. It was easiest to spot at the top of the page, rather than having to look around in (possibly) several places, where it might be hidden by images or wikipedia-link boxes. --EncycloPetey 05:11, 4 September 2011 (UTC)
The move you have made to the new position has resulted in a serious problem. The text that the template is supposed to display is no longer visible in many of the entries. Please correct the problem so that the text is visible, or please revert the change in position. --EncycloPetey 18:11, 5 September 2011 (UTC)
- That's odd. In none of my browsers (Chrome, FF, IE 9 on Vista) has the display changed on entries where I've moved the template position into the English entry (eg putrescible). The CSS positioning is "absolute" so it shouldn't change (and there shouldn't be any difficult cache issues since I didn't change anything else). What's your setup? Does it happen when you're logged out? What if you view it using the Monobook skin (I assume you use Vector)? Does anyone else have this problem? --Bequw → τ 01:22, 8 September 2011 (UTC)
- It's fine on my Mac at home using Safari, but not in IE (Windows) at work. I'm not sure which old version we're using, but it's school software and can't be changed. If the template is going to display at the top of the page, then I don't understand how it will help anyone to position the coding for the template inside a language section. That will just confuse future editors. --EncycloPetey 02:57, 10 September 2011 (UTC)
- Checking pages against IE 5.5, 6, 7, and 8 I couldn't find any oddities with WOTD. Maybe your school has a mixed-up configuration that isn't popular. I think it is best to make this template float by default and create WT:PREFS to return it to the original, top position for those that prefer it. This is partly why I moved into the English entries those WOTD invocations on pages with multiple entries. See the start of a broader cleanup at Wiktionary:Todo/Anomalous section0 content. --Bequw → τ 03:09, 11 September 2011 (UTC)
- I've made the template simply floatright like other RHS templates. It might look weird as caches catch up. If you prefer the old style, you can get the raw CSS from the documentation shown at {{was wotd}}, or if you have the default (vector) skin you can use WT:PREFS (look for "was-WOTD" in the bottom of the display section). --Bequw → τ 13:21, 18 September 2011 (UTC)
Klategory?
There are loads of these Ku Klux Klan terms. The following might be eligible for a category (some used only within the organisation and others more widely): Kladd, Klankraft, Klaliff, Klokan, Klarogo, Klexter, Klokard, kludd, klavalier, Kloran, klonvocation, klecktoken, kligrapp, Kleagle, Klabee, Klansman, Klanswoman, kloncilium, klonklave, klansman, klan, Klan, Ku Klux Klan, antiklan, klavern, KKK, Klannish. Equinox ◑ 10:57, 31 August 2011 (UTC)