View source for Thread:User talk:Bequw/OCS alternative forms and spellings/reply (7)
Archives: 2007, 2008, 2009, 2010
Notice: I will reply to messages wherever they are posted. Thus, if you leave a message here, I will respond here. If I leave a message on your page, please respond on your page; I will be watching it.
flood flag
I don't know whether you watchlist the page, so am pointing out that you might be interested in [[Wiktionary talk:Requests for flood flag#procedure]].—msh210℠ (talk) 00:11, 2 November 2011 (UTC)
Hi, I have removed your admin rights due to our policy on admin inactivity, as you have not used any admin tools in the past five years. This removal is without prejudice and you can request your admin rights to be restored at any time.
Hi!
You get this message because you are an admin on a Wikimedia wiki.
When someone edits a Wikimedia wiki without being logged in today, we show their IP address. As you may already know, we will not be able to do this in the future. This is a decision by the Wikimedia Foundation Legal department, because norms and regulations for privacy online have changed.
Instead of the IP we will show a masked identity. You as an admin will still be able to access the IP. There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right. We are also working on better tools to help.
If you have not seen it before, you can read more on Meta. If you want to make sure you don’t miss technical changes on the Wikimedia wikis, you can subscribe to the weekly technical newsletter.
We have two suggested ways this identity could work. We would appreciate your feedback on which way you think would work best for you and your wiki, now and in the future. You can let us know on the talk page. You can write in your language. The suggestions were posted in October and we will decide after 17 January.
Thank you. /Johan (WMF)
18:14, 4 January 2022 (UTC)
- Hello, Bequw
- How does the character info template get the corresponding image file?
- Does it identify the file in the Wikimedia database?
- Shlomo.
Please remove the page "Uzbekistani". There is no such word as "Uzbekistani". One can use the word "Uzbek" when referring to Uzbekistan, but not "Uzbekistani". Search, for example, the Oxford dictionary: https://en.oxforddictionaries.com/search?utf8=%E2%9C%93&filter=dictionary&query=uzbekistani
Sorry, Uzbekistani is in use in the English language. See Google Books. The word meets our criteria for inclusion. Please notify the Oxford dictionary to include Uzbekistani in their next edition. Thank you.
I have deleted this as {{proto}} has been deleted.
[1] - please stop doing this for OCS. There is a reason why these two are separate.
I agree with BQ here, if they're synonyms they go in the synonyms sections, but entries shouldn't use both alternative forms and spellings. The second is just a more specific case of the first.
Alternative spellings is for the spellings in the alternative script (Glagolitic/Cyrillic), and alternative forms is for variant forms of the same word. The OCS corpus is stratified geographically and diachronically, so some words have variant forms reflecting different sound changes in different dialects or epochs.
I too agree with BQ - there is no reason why any entry needs both "alternative forms" and "alternative spellings".
Both Japanese and Serbian, which also use multiple scripts, put the form of the word in the other scripts on the inflection line. Is there a reason this is not done for OCS? Thryduulf (talk) 14:34, 15 July 2010 (UTC)
Spelling in another script is not an "alternative form" of a word - it's the same word in another script. Spelling is also not a more specific case of form: spelling implies alternative script or orthography (which could all be "proper" and in simultaneous use), while form encompasses various kinds of etymologically related variants of the same word. In languages with phonological orthography, where differences in pronunciation are reflected in writing, which is the case with OCS but not e.g. English, the difference is much more obvious. dzvězda and zvězda cannot be treated as synonymous; they're as "synonymous" as e.g. English color and colour. Yes, the scripts could be handled in the inflection line, and that's probably the best solution, but that will not work in cases where more than 2 scripts are used for a language. For now, merging these two sections, in the case of OCS at least, doesn't make sense to me.
I think you're just taking a narrow view of the term "form" whereas others (including myself) are taking a broader view. I want a single general header as people's distinctions between "forms" and "spellings" are often vague and contradictory (as seen here). The key information is still represented under a unified header (the script qualifier accomplishes this). Putting the different scripts for a word on the inflection line is a problem for both layout (when multiple PoS exist) and clarity (a non-inflection being on the inflection line).
"Yes the scrips could be handled in the inflection line, and that's probably the best solution, but that will not work in case where there is more than 2 scripts used for a language."
Japanese manages to show three scripts on the inflection line without any issue. Are there any languages that use more than three scripts? Thryduulf (talk) 21:21, 15 July 2010 (UTC)
Sanskrit is written in almost all Indic scripts, although we currently only use Devanagari. Perhaps it works for Japanese, but I don't think that handling 2+ additional scripts would work in the inflection line in alphabetic languages with rich inflection, some of which would have to be listed in the inflection line too. The whole line would be too long. Serbo-Croatian has historically been written in 2 other scripts besides Cyrillic and Latin: Glagolitic and an alphabetic form of Arabic script. There is no way that these 4 would all fit in the inflection line beside the usual information listed. There is the preliminary template {{sh-variant}} to handle this, based on the successful usage of {{fa-regional}} for Persian variant forms, but it's not used yet. A template-based table approach seems to me superior to either of those two alternatives.
MediaWiki:Gadget-CommentsInLocalTime.js -- via importScript( 'User:Bequw/comments in local time.js' ) -- is not working as intended. I've left the short details on MediaWiki talk:Gadget-CommentsInLocalTime.js. w:en:MediaWiki:Gadget-CommentsInLocalTime.js is working correctly -- via importScript( 'User:Gary King/comments in local time.js' ) -- Thanks for your attention to this matter (and I'm always on IRC if you need to ask me something quickly; wikitech:User:Technical_13 is a list of most of my IRC nicks).
Hi! Would it be possible to maintain a list in Wiktionary:Todo/proto problems, of entries that create a quasi-link without actually linking to anything? In particular entries like dorst (in the Dutch etymology) or afraid (Frankish word), which have a word in the second parameter of {{term}} (and the first is empty, so no link) but with a * in front. Such cases should be replaced by {{recons}}, so it would be good to have a list of them. I don't know if you could also make this work when someone writes *''thurst'' without using {{term}} at all (it might give a lot of false positives) but it would also be useful. Thanks in advance!
The template {{proto}} is deprecated and shouldn't be used, so the message your bot generates should probably be changed as well.
Also, I wonder about "quasilinking". I have noticed quite a lot of cases where reconstructed terms use *{{term||word}} or some variety of that, with the asterisk right before the template. Your bot doesn't seem to catch those, but I'm not sure why. Could it be that the bot only looks at the first parameter, and skips when it is empty? It would be very helpful if it caught cases like the above as well, because in many cases something like *{{term||word}} should really use {{recons||word|lang=xx}} or even {{recons|word|lang=xx}} with a link.
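For what it's worth, here is a rough Python sketch of how such cases might be picked up. It is only an illustration, not how the bot actually works: the function name find_quasi_links and the pattern are my own assumptions, and it only covers an asterisk directly before the template or at the start of the second parameter.

```python
import re

# Matches {{term}} with an empty first parameter. The asterisk may sit right
# before the template (*{{term||word}}) or at the start of the second
# parameter ({{term||*word}}). Nested templates are not handled.
QUASI_LINK = re.compile(
    r"(?P<outer>\*)?\{\{term\|\|(?P<inner>\*)?(?P<word>[^|{}*]+)(?:\|[^{}]*)?\}\}"
)

def find_quasi_links(wikitext):
    """Return the words of starred {{term}} calls that have no link target."""
    hits = []
    for match in QUASI_LINK.finditer(wikitext):
        # Only report calls that actually carry an asterisk somewhere.
        if match.group("outer") or match.group("inner"):
            hits.append(match.group("word"))
    return hits
```

Each hit would then be a candidate for conversion to {{recons}}; an unstarred {{term||word}} is deliberately left alone.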
I am trying (and failing) to add support for {{lg-noun}} and {{lg-proper noun}}. I made these changes in an attempt to add it, but Yair rand reverted me, probably because I did something horribly wrong. Can you please explain to me how to go about this correctly? Thank you so much in advance!
Well, if you can't find the requisite documentation or whatever in the future, I think Category:Portuguese terms needing attention gets cleaned out semi-regularly, so you could always use {{attention}}.
Well it doesn't have to be updated much. It's more useful than {{MonobookTab}} now that the default is Vector. It probably should be used on some help pages.
Ok. I've been trying to get as many templates as possible properly categorized, partially to make them more findable by purpose or other characteristics, partially to eliminate deadwood from the categories, and partially to make the list of uncategorized templates a more useful cleanup list than it had been with about 8-9,000 potential members, but actual membership limited to 5,000. The categories and documentation should be on separate documentation pages transcluded by <noinclude>{{temp|documentation}}</noinclude> (or equivalent) to apply to the template itself. This is intended to make it easier to edit documentation without adding to the transclusion-processing queue. Many of the uncategorized template pages are template subpages of the form "langcode/script", some of which are very widely transcluded. I'm not at all sure that others share my belief that this is worth doing, but it seems important to a healthy project.
I enjoy the regexp practice. I've made WT:Todo/non-standard pronunciation transcriptions where I've listed the pattern for people who would like to rerun it. Let me know if it needs tweaking.
Cool, thanks! That list includes entries with "s inside <ref> tags, which is not the same issue per se, but I'm glad we didn't think to exclude them: most of them happen to have other problems that need addressing. (Mostly they're archaic spellings that should use {{alternative form of}} but instead duplicate the content of the lemmata.)
Is this needed? No topical categories use zh: anymore, they all use cmn.
You created it, just in case you weren't aware of that. I don't know what it does to be honest.
Your bot removed a lot of names from Wiktionary:Etymology/language templates and I realised this is because a lot of the code templates were recategorised recently. All code templates are now in Category:Code templates, and they are further subdivided. I'm letting you know in case your bot needs to be updated.
There are some more issues on Wiktionary:Todo/etyl problems with entries being marked with 'nonexistent etymology code'. I assume it has the same cause?
Hi! Do you think you could add something to your bot so that entries that use {{term}} without specifying a language are added to a cleanup list? {{term}} doesn't actually default to English like many people might expect, but instead it defaults to 'no language'. It's not strictly an error, but from a usability point of view it's less than optimal, and it would be better I think if most of those cases could be fixed.
I already tried to make a list using categories a few days ago, but there ended up being so many (well over a hundred thousand!) that it seemed hopeless. But I think maybe your bot would be better suited because it can subdivide the entries and make them more manageable. Of course, we can't actually say what language {{term|water}} is supposed to link to... so I think for now the best way to subdivide the entries would be based on what language section the template appears in. So, if a German section contains {{term|water}}, add it to the subpage for German entries. (Wiktionary:Todo/Temp without language/German?) It would probably be a good idea to make further subpages by letter as well, so if Wasser contains {{term|water}} it would be listed on Wiktionary:Todo/Temp without language/German/w.
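To make the subdivision concrete, a small Python sketch of the grouping might look like this. The names BASE, subpage_for and group_entries are made up for illustration, and the base title is simply the one proposed above.

```python
from collections import defaultdict

# Base page title as proposed in the post; treat it as a placeholder.
BASE = "Wiktionary:Todo/Temp without language"

def subpage_for(language, entry):
    """Build the cleanup subpage title from the section language and entry name."""
    return f"{BASE}/{language}/{entry[0].lower()}"

def group_entries(pairs):
    """Group (language, entry) pairs by the subpage they would be listed on."""
    pages = defaultdict(list)
    for language, entry in pairs:
        pages[subpage_for(language, entry)].append(entry)
    return dict(pages)
```

With this, an entry Wasser found in a German section lands on the German/w subpage, as in the example above.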
What do you think of this idea? Is it doable?
It's pretty simple to construct using regular expressions. I used =German==([^-]|-[^-])+\{\{term(\|(?!lang=)[^\|\}]+)+\}\} to construct one for German just using AutoWikiBrowser. (Did you want "term" in the title instead of "temp", or were you planning on expanding this to other templates that should have language parameters?) It's probably better for someone to just run this regexp for languages for which there's cleanup interest (non-English would be the most bite-sized). Once the backlog has been whittled down, it can easily be turned into a periodic maintenance cleanup list.
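For anyone without AutoWikiBrowser, roughly the same check can be sketched in Python. This is not a translation of the regexp above but a looser variant under my own assumptions: it splits the entry on level-2 headers first and then looks for {{term}} calls with no lang= parameter; the names language_sections and terms_without_lang are hypothetical.

```python
import re

# A {{term|...}} call without nested templates inside it.
TERM_CALL = re.compile(r"\{\{term\|([^{}]*)\}\}")
# A level-2 header such as ==German== on its own line.
SECTION = re.compile(r"^==([^=\n]+)==[ \t]*$", re.MULTILINE)

def language_sections(wikitext):
    """Yield (language, body) for each level-2 section of an entry."""
    heads = list(SECTION.finditer(wikitext))
    for i, head in enumerate(heads):
        end = heads[i + 1].start() if i + 1 < len(heads) else len(wikitext)
        yield head.group(1).strip(), wikitext[head.end():end]

def terms_without_lang(wikitext, language):
    """Return the {{term}} calls in the given language section lacking lang=."""
    found = []
    for name, body in language_sections(wikitext):
        if name != language:
            continue
        for call in TERM_CALL.finditer(body):
            params = call.group(1).split("|")
            if not any(p.strip().startswith("lang=") for p in params):
                found.append(call.group(0))
    return found
```

Unlike the AWB pattern, this also reports calls with an empty first parameter, so expect a slightly longer list.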
Oh yes that was a mistake, sorry. I meant 'term'. I can't use AWB because it's only for Windows, but I could probably do the same with Python. I don't really know how the XML dumps work though, could you explain it a bit maybe?
mw:Manual:Pywikipediabot has a module xmlreader that you can use to parse the xml dump into pages. Other tools are mentioned at m:Data dumps.
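If you would rather not depend on the bot framework, the dump can also be read with Python's standard library alone. The following is a minimal sketch, assuming the usual export layout of page, title and revision text elements; iter_pages and local are names I made up, and this is not the xmlreader module mentioned above.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]

def iter_pages(source):
    """Yield (title, text) for each <page> in a MediaWiki XML export.

    source may be a file name or a binary file object.
    """
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the page's children once they have been read
```

Each yielded text can then be fed to whatever check you want to run per entry.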
User:Koavf has been tagging files to be deleted or moved to Commons, and I've been deleting the no-longer-needed ones I uploaded and moving the useful ones others uploaded. I was going to move this one, but you didn't formally licence it. Could you stick a Creative Commons licence on it, or re-upload it to Commons yourself? Or if you'd rather delete it, or leave it here licenceless, I actually don't mind.