User talk:Ivan A. Krestinin


I'm not sure why you deprecated the DOI on Barringer Medal citation for Michael R. Dence (Q101634288). It's a working DOI, so that deprecation seems unfounded. Trilotat (talk) 07:27, 26 November 2020 (UTC)[reply]

Are you asking about this edit? The bot did not change the deprecation rank; it just made the value uppercase. — Ivan A. Krestinin (talk) 15:49, 2 January 2021 (UTC)[reply]

Non-capturing regex group (?:)

Hi Ivan,

At Wikidata:Property_proposal/URL_match_pattern, we are trying to figure out which would be a sensible default pattern for replacement.

At Property_talk:P973, this would probably be \2, but, if Krbot supports non-capturing regex groups, we could use "\1".

This tries to test it with that url. Will Krbot convert it? It seems to be busy with other things in the meantime.

Wikidata:Property_proposal/URL_match_pattern could probably be useful for Krbot as well. used by (P1535) could qualify the ones used by Krbot. --- Jura 07:54, 27 November 2020 (UTC)[reply]

KrBot uses PCRE v8.39; please see the library documentation for the supported syntax. As I understand it, the general idea is to replace {{Autofix}} with properties. The Autofix template currently supports several different use cases. Do you have an idea how to describe all of them using properties? — Ivan A. Krestinin (talk) 16:03, 2 January 2021 (UTC)[reply]
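As a side note for this thread, the group-numbering question can be illustrated with Python's `re` module, whose syntax matches PCRE for this feature. This is only an illustration with a made-up URL, not KrBot's actual code:

```python
import re

# With a capturing group before the interesting part, the path ends up in \2:
m = re.match(r"(https?)://example\.org/(.*)", "https://example.org/page1")
assert m.group(2) == "page1"

# With a non-capturing group (?:...) instead, the path moves to \1:
m = re.match(r"(?:https?)://example\.org/(.*)", "https://example.org/page1")
assert m.group(1) == "page1"
```

So whether the sensible default replacement is \1 or \2 depends on whether the leading group captures.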

Connection to other wikis

Hey! Can you move Q9212417 to Q8564503?

The topic is the same: en.: Category:Jazz clubs

it.: Categoria:Locali jazz

The move will allow the connection to other wikis.

Thanks! --CoolJazz5 (talk) 10:59, 2 December 2020 (UTC)[reply]

@CoolJazz5 I've done this. In the future, please include links to the pages you write about; this greatly simplifies handling your request. Michgrig (talk) 22:41, 2 December 2020 (UTC)[reply]
Michgrig Ok, thanks!

Question about the edits made by the bot

Why did the bot edit the two entries Q87402631 and Q104417514? It is true that the statements the bot moved were displayed with a warning, but that does not mean they are wrong. --Gymnicus (talk) 10:48, 24 December 2020 (UTC)[reply]

As I noticed just after reverting, there were no warnings at all. For this reason, the bot's edits make even less sense from my point of view. --Gymnicus (talk) 10:59, 24 December 2020 (UTC)[reply]
It is a very common mistake for a person-specific property to be set on items that describe groups of people. See this edit as an example. The bot fixes such cases. It is logically incorrect to use properties like country of citizenship (P27) or sex or gender (P21) for human groups. — Ivan A. Krestinin (talk) 16:21, 2 January 2021 (UTC)[reply]
I don't see it as simple as you do. For country of citizenship (P27) I would agree with you, but I can't accept that for occupation (P106). There is also the example of Igor and Grichka Bogdanoff (Q12920), where your bot does not remove this property, and that statement was added on November 22, 2020. So why does it remove it from the examples I mentioned? --Gymnicus (talk) 10:28, 6 January 2021 (UTC)[reply]
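Ivan's explanation above boils down to filtering person-specific properties off items about groups. A minimal sketch of that rule, with a hypothetical and incomplete property list; as the thread shows, which properties belong on the list (e.g. occupation (P106)) is itself debatable:

```python
# Hypothetical examples of person-specific properties; the bot's real list
# differs, and occupation (P106) is a contested case in the discussion above.
PERSON_ONLY = {"P21", "P27", "P569", "P570"}  # gender, citizenship, birth, death

def clean_group_item(claims: dict) -> dict:
    """Drop person-specific claims from an item describing a group of people."""
    return {prop: value for prop, value in claims.items()
            if prop not in PERSON_ONLY}

claims = {"P27": "Q183", "P106": "Q82594", "P571": "+1955"}
assert clean_group_item(claims) == {"P106": "Q82594", "P571": "+1955"}
```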

Request

Could you make your bot:

  • Add P750 > Q1486288 to items with P2725
  • Add P750 > Q1052025 to items with P5944
  • Add P750 > Q1052025 to items with P5971
  • Add P750 > Q1052025 to items with P5999
  • Add P750 > Q22905933 to items with P7294
  • Add P750 > Q135288 to items with P5885

--Trade (talk) 21:55, 27 December 2020 (UTC)[reply]

Please add statements like:
  • property constraint (P2302) = item-requires-statement constraint (Q21503247)
    • property (P2306): distributed by (P750)
    • item of property constraint (P2305): PlayStation Store (Q1052025)
    • constraint status (P2316): mandatory constraint (Q21502408)
The bot fixes such statements automatically. — Ivan A. Krestinin (talk) 16:49, 2 January 2021 (UTC)[reply]

Fenestella troubles

There are two different genera named Fenestella: an animal one (Q20975616) and a fungus one (Q17317929). Despite the names, they are completely different, and so should their corresponding article items and category items (Q9651255 and Q18283983, respectively) be. However, the name confusion has led to mistakes, like linking the Swedish fungus article to the Commons category for the animals. Unhappily, these mistakes also seem to have caused confusion in Wikidata.

I do not understand why your robot talks about a "redirect from Q20975616 to Q17317929" in the summary of this edit, but I know that this edit, and others mentioned in the summary details, have contributed to the confusion. I'll revert these edits, correct what I can, and add further "different from" properties to the central items (the four enumerated supra). I hope that this will lessen the risks for this particular confusion in the future. Best regards, JoergenB (talk) 20:01, 29 December 2020 (UTC)[reply]

I think different from (P1889) should be enough. Thank you for your work! — Ivan A. Krestinin (talk) 17:17, 2 January 2021 (UTC)[reply]

Ανδρέας (Greek given name) abusively mutated to Slavic equivalent(?) Анджей

The problem here may be caused by one or more edits on Q14941830... Correcting it manually is hard: I got an error message: 'Item Q87263878 already has label "Ανδρέας" associated with language code el, using the same description text.' Perhaps a bot can be more helpful.

Happy New Year, Ivan A. Krestinin, Klaas `Z4␟` V11:31, 1 January 2021 (UTC) ( on behalf of NameProjectMembers & notifiers)[reply]

Looks like everything was fixed already. Happy New Year! — Ivan A. Krestinin (talk) 17:21, 2 January 2021 (UTC)[reply]

Protection of user page

Hi! As suggested here, I've raised the protection level of your user page to administrators, in order to avoid accidental creations by registered users. If you want to create the page in the future, you can always make a request to WD:AN. Best regards and happy 2021! --Epìdosis 14:46, 1 January 2021 (UTC)[reply]

Good, thank you! Happy new year! — Ivan A. Krestinin (talk) 17:35, 2 January 2021 (UTC)[reply]

Error in BnF correction

Hi,

The correction made here is wrong; the correct id is 12746940n.

When an id starts with FRBNF, the last character is always wrong and must be recalculated.

eru [Talk] [french wiki] 17:54, 1 January 2021 (UTC)[reply]
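For context, the trailing character of these BnF identifiers is a NOID-style check character computed over the ARK name (e.g. "cb12746940"). A sketch of recalculating it; this is my reading of the NOID check-digit algorithm, verified only against the id in this thread:

```python
# NOID "betanumeric" alphabet (29 characters), as used by BnF ARK names
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"

def bnf_check_char(name: str) -> str:
    """Check character for an ARK name like 'cb12746940':
    position-weighted sum of character values, modulo 29."""
    total = sum(pos * ALPHABET.index(ch)
                for pos, ch in enumerate(name, start=1))
    return ALPHABET[total % 29]

# FRBNF-derived ids drop the 'cb' prefix; recomputing confirms eru's value:
assert "12746940" + bnf_check_char("cb12746940") == "12746940n"
```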

What did KrBot do?

Hi Ivan, I don't get the reason for this change by your bot. As far as I can see, the bot didn't change anything :-)

and of course - a happy new year. greetings from Germany --Z thomas (talk) 13:01, 2 January 2021 (UTC)[reply]

Hi Thomas, the bot removed non-printable characters at the end of the name. Happy new year! — Ivan A. Krestinin (talk) 17:53, 2 January 2021 (UTC)[reply]
Thanks. I assumed something like that. Greetings --Z thomas (talk) 18:19, 2 January 2021 (UTC)[reply]
Hi, could you also add non-breaking spaces (\u202F and \u00A0) and multiple spaces (\s{2,} -> " ") to your script, please? Recently there was a not very successful import from DACS (ping @Hannolans:) containing such spaces in en/nl labels. --Lockal (talk) 16:31, 9 January 2021 (UTC)[reply]
Ah, I wasn't aware of this; this was a download of the unmatched Mix'n'match entries that I uploaded with OpenRefine. It would be great if the bot could repair this. Double spaces would also be very useful. --Hannolans (talk) 22:36, 9 January 2021 (UTC)[reply]
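The cleanup Lockal requests could look like the following, as a hypothetical standalone helper (not KrBot's code): map the two non-breaking space characters to plain spaces, then collapse whitespace runs.

```python
import re

def normalize_label(label: str) -> str:
    """Replace narrow no-break space (U+202F) and no-break space (U+00A0)
    with plain spaces, then collapse multiple spaces into one."""
    label = label.replace("\u202f", " ").replace("\u00a0", " ")
    return re.sub(r"\s{2,}", " ", label).strip()

assert normalize_label("Jan\u00a0 van\u202fEyck") == "Jan van Eyck"
```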

Murder -> Homicide

Hi!

Could you check the discussion at Property_talk:P1196#Allow_assassination? and, unless there is a good reason that I and others have so far managed to overlook, kindly ask your bot to stop edit-warring? Best, --Matthias Winkelmann (talk) 19:58, 2 January 2021 (UTC)[reply]

Hi! Are you asking about this edit? Please remove the corresponding {{Autofix}} rule from the Property talk:P1196 page if the rule is inapplicable in some cases. — Ivan A. Krestinin (talk) 22:53, 14 February 2021 (UTC)[reply]

Edits made based on a wrong constraints on MUBI ID

Hey, Trade added two item-requires-statement constraint (Q21503247) statements on MUBI film ID (P7299) that weren't right, since MUBI has IDs for many films that are not available on MUBI; they are basically there to show up in search and then suggest similar titles to potential viewers. Your bot is adding statements based on these constraints. Could you undo them? thanks, Máté (talk) 05:18, 3 January 2021 (UTC)[reply]

Seconding this. Please undo these edits. Trivialist (talk) 18:50, 24 January 2021 (UTC)[reply]
Hi! ✓ DoneIvan A. Krestinin (talk) 23:58, 14 February 2021 (UTC)[reply]
Thanks! Máté (talk) 08:21, 15 February 2021 (UTC)[reply]

Bot adding of distributed by (P750)

Hi! I have noticed that you have added a number of distributed by (P750) statements based on identifiers on items. First of all, I have not found any bot request for this work; a new task like this needs approval. For music items it is not correct to add Spotify (Q689141), Tidal (Q19711013), Deezer (Q602243), etc. as distributors. That is like saying ham (Q170486) is distributed by (P750) Walmart (Q483551). Within the music industry, distribution is the way recorded music is made available to consumers, and this is usually done by a record label. So please remove the added items on music-related pages and create a bot request for this work. --Premeditated (talk) 14:08, 5 January 2021 (UTC)[reply]

Wait, I thought we followed the same model with music releases as we do with video games and films? --Trade (talk) 15:20, 5 January 2021 (UTC)[reply]
What do you mean? Like Rocky IV (Q387638) is distributed by (P750) Metro-Goldwyn-Mayer (Q179200), not Netflix (Q907311) or Apple TV+ (Q62446736) (just examples), because it is available on those sites. For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games. --Premeditated (talk) 09:02, 6 January 2021 (UTC)[reply]
The theatrical and home-media versions of Rocky IV (Q387638) are distributed by (P750) Metro-Goldwyn-Mayer (Q179200), while the video on demand (Q723685) version is distributed by (P750) Netflix (Q907311) (by virtue of being distributed on Netflix's video-on-demand service).
'For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games' The publisher is the one who publishes the game. The distributor is the website the game download is sold on, though sometimes there are exceptions for physical releases and streaming platforms. @Premeditated: --Trade (talk) 12:59, 8 January 2021 (UTC)[reply]
@Trade: I think you are mixing up distribution format (P437) with distributed by (P750). For example, The Beatles (Q3295515) has distribution format (P437) music streaming (Q15982450). --Premeditated (talk) 13:47, 8 January 2021 (UTC)[reply]

I corrected my examples. So, why do you think that listing music streaming platforms is outside the scope of distributed by (P750)? @Premeditated:--Trade (talk) 00:42, 10 January 2021 (UTC)[reply]

@Trade: Sorry for the late response. I think a new property named "distribution platform" should be created, which could be used for all of those platforms like Steam (Q337535), Spotify (Q689141), Microsoft Store (Q135288), etc., instead of cluttering distributed by (P750). - Premeditated (talk) 12:22, 20 January 2021 (UTC)[reply]

KrBot malfunction at Wikidata:Database reports/Constraint violations/P8988

Hello, your bot failed to detect any violations at Wikidata:Database reports/Constraint violations/P8988, which is improbable. Please could you look at what is wrong? Vojtěch Dostál (talk) 15:08, 5 January 2021 (UTC)[reply]

Looks like everything is fine with the page now. — Ivan A. Krestinin (talk) 22:45, 14 February 2021 (UTC)[reply]

KrBot malfunction at Wikidata:Database reports/identical birth and death dates

Hello, some entries on Wikidata:Database reports/identical birth and death dates were fixed some days ago but not removed; could you have a look? Some examples:

Thank you! --Emu (talk) 21:40, 6 January 2021 (UTC)[reply]

Now all the items are removed. Maybe somebody fixed the items, or maybe it was some caching issue... — Ivan A. Krestinin (talk) 22:43, 14 February 2021 (UTC)[reply]

KrBot2 sleeping

Hello Ivan! KrBot2 has fallen asleep. Please wake him up! :) Thanks Palotabarát (talk) 00:13, 13 February 2021 (UTC)[reply]

Ah, I get it. Now I know which dump provides the data. Thanks for the reply! Palotabarát (talk) 00:15, 15 February 2021 (UTC)[reply]
The 20210303.json.gz dump is also corrupted, unfortunately. The issue is tracked here. — Ivan A. Krestinin (talk) 09:54, 8 March 2021 (UTC)[reply]
FixedIvan A. Krestinin (talk) 19:52, 29 March 2021 (UTC)[reply]

Resolving redirects

My understanding is that withdrawn identifiers should be handled by deprecating them and marking them as withdrawn. Your bot is instead simply replacing them. The withdrawn (and replaced) identifiers are still used in other systems, and linking them may still be desired.

For example, withdrawn VIAF identifiers are still used by WorldCat Identities. Though WorldCat should update and merge their entries, until they do, the old VIAF ID is still useful. Int21h (talk) 00:12, 15 March 2021 (UTC)[reply]

@Int21h: Hi! For VIAF ID (P214) there was consensus here for the removal of redirected and withdrawn IDs since VIAF clusterization has many problems (e.g. Q212872#P214) and keeping trace of it would be quite problematic. Bye, --Epìdosis 07:44, 15 March 2021 (UTC)[reply]
Ok, thanks, I wasn't aware of previous discussions. Good to know! Int21h (talk) 16:28, 15 March 2021 (UTC)[reply]

KrBot and Single Constraint

Hi, when checking the single-value constraint violations of identifiers, would it be possible to ignore the ones that have deprecated rank and reason for deprecated rank (P2241): redirect (Q45403344) as a qualifier? One example would be this one, which is listed in the constraint report. Those are considered valid values and should be kept, so having them in the report makes maintenance and cleanup harder. -- Agabi10 (talk) 18:00, 18 March 2021 (UTC)[reply]

Hello, there are technical troubles implementing this. Maybe I can propose an alternative. Does IMDb allow getting all valid (non-redirect) identifiers? If so, I can create a bot that will fix such redirects continuously. — Ivan A. Krestinin (talk) 19:56, 29 March 2021 (UTC)[reply]
I don't know if it allows getting all identifiers, but at least for now they shouldn't be replaced; as long as they have been valid identifiers, they should be kept with deprecated rank. If checking the qualifier is too much trouble, would just ignoring statements with deprecated rank when creating the report be more feasible? -- Agabi10 (talk) 09:45, 7 April 2021 (UTC)[reply]
@Agabi10: That's a good interim solution; yes, skipping items with deprecated statements would really be best. Vojtěch Dostál (talk) 10:12, 7 April 2021 (UTC)[reply]

Hi, it's been more than a year since the request. If checking the qualifiers and ignoring the ones with given deprecation reasons isn't technically feasible, would it be feasible to ignore all claims with deprecated rank for the single-value constraint violations? It's not ideal, but most of the violations in the single-value constraint of IMDb ID (P345) are for claims that should be left as is, which makes that section of the report completely useless, at least in this case. -- Agabi10 (talk) 13:20, 27 September 2022 (UTC)[reply]
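The interim solution proposed in this thread amounts to the following; a simplified sketch with a made-up statement structure, skipping every deprecated-rank statement rather than only those qualified as redirects:

```python
def single_value_candidates(statements: list) -> list:
    """Values to consider for the single-value constraint report,
    ignoring statements with deprecated rank."""
    return [s["value"] for s in statements if s["rank"] != "deprecated"]

statements = [
    {"value": "tt0000001", "rank": "normal"},
    {"value": "tt9999999", "rank": "deprecated"},  # e.g. a kept IMDb redirect
]
assert single_value_candidates(statements) == ["tt0000001"]
```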

Reverted merge

Hello. Q20540007 was mistakenly merged with Q17165321. Then KrBot re-linked statements pointing to the redirect. Then the merge was reverted. Could you also revert the bot-actions? Thanks in advance. Greetings, --FriedhelmW (talk) 14:52, 21 March 2021 (UTC)[reply]

Hello, ✓ DoneIvan A. Krestinin (talk) 19:57, 29 March 2021 (UTC)[reply]
Thank you! --FriedhelmW (talk) 16:09, 30 March 2021 (UTC)[reply]

NGA number

Hello. I have seen your bot doing great work fixing up light characteristic (P1030) and ARLHS lighthouse ID (P2980). I wonder if you might be able to help with NGA lighthouse ID (P3563)? Often these are written as a 5-digit number but are missing the 3-digit volume prefix. (Compare [1]). The volume depends on the geographic area, which may be deducible from country (P17). This map shows how the 7 volumes are distributed. If you can help, that would be great. MSGJ (talk) 21:45, 22 March 2021 (UTC)[reply]

Hi, this is outside the scope of current bot tasks. It would be better to put the request at Wikidata:Bot requests. — Ivan A. Krestinin (talk) 20:06, 29 March 2021 (UTC)[reply]

David van Dantzig

@KrBot: Hi Ivan: The University of Utrecht was never an employer of David van Dantzig. Please see the biography of van Dantzig written by Gerard Alberts, Twee geesten van de wiskunde : biografie van David van Dantzig, published in 2000, or the paper by his student Jan Hemelrijk, The Statistical Work of David Van Dantzig (1900-1959), published in 1960, or the short biography in Academic Genealogy of Mathematicians (page 310) by Sooyoung Chang, published in 2011. Moreover, Utrecht University is cited neither in the Complete Dictionary of Scientific Biography nor in the MacTutor History of Mathematics.--Ferran Mir (talk) 11:22, 23 March 2021 (UTC)[reply]

Please @KrBot:, read my arguments against the statement that the University of Utrecht was an employer of David van Dantzig.--Ferran Mir (talk) 15:00, 23 March 2021 (UTC)[reply]
Hi Ferran, KrBot is just a bot :) It uses a very simple rule: each item with a Catalogus Professorum Academiae Rheno-Traiectinae ID (P2862) property should have an employer (P108) = Utrecht University (Q221653) statement, according to the constraints specified on Property:P2862. This edit will help. — Ivan A. Krestinin (talk) 20:17, 29 March 2021 (UTC)[reply]
OK @KrBot: @Ivan A. Krestinin:, I have seen the exception included in the constraint. That's right! Thanks.--Ferran Mir (talk) 07:40, 30 March 2021 (UTC)[reply]

Qualifier reason for deprecated rank (P2241) on property constraints

Hi Ivan A. Krestinin,

to use Help:Property_constraints_portal/Entity_suggestions_from_constraint_definitions, some constraint statements have the above qualifier (and deprecated rank). Can you make KrBot skip those constraints? In the most recent update, the report throws an error. --- Jura 10:23, 13 April 2021 (UTC)[reply]

Hi Jura, I added the property to the ignore list. The nearest update is already in progress, so it will still report the error; the next update should be fine. — Ivan A. Krestinin (talk) 20:48, 24 April 2021 (UTC)[reply]

The bot is doing strange things on Petit-Rocher Lighthouse (Q106498634) — Martin (MSGJ · talk) 20:17, 19 April 2021 (UTC)[reply]

The bot executes rules from {{Autofix}}. The rules were added by Jura, so it is better to discuss the issue with him. — Ivan A. Krestinin (talk) 20:32, 24 April 2021 (UTC)[reply]
Yes, it seems to work as planned (removing the dots). --- Jura 20:48, 24 April 2021 (UTC)[reply]
It took 8 edits to do it, though? — Martin (MSGJ · talk) 19:29, 25 April 2021 (UTC)[reply]
To avoid breaking things, I think I did "."→" ", as sometimes the space following them was missing and some "." shouldn't be replaced.
We could probably have more Autofix rules that try to do it in fewer steps, but then these would have to be checked on every run as well.
This report has more patterns that might need to be normalized, but it's a tricky thing. --- Jura 07:11, 29 April 2021 (UTC)[reply]


constraint scope (P4680) qualifier, error on KrBot update

Hi Ivan,

Maybe the qualifier should be handled somehow or ignored.

I removed it at [2] for [3], but maybe there are cases where it's useful (possibly at this property). --- Jura 15:11, 29 April 2021 (UTC)[reply]

single-value constraint (Q19474404) is checked for the main value only, so the property does not look useful to me. — Ivan A. Krestinin (talk) 15:57, 10 May 2021 (UTC)[reply]
I tend to agree. I think people started adding them as there was some oddity with the Wikibase extension (initially checking by default everywhere). Or was that about the distinct value constraint? Go figure. --- Jura 09:38, 12 May 2021 (UTC)[reply]
In that case, would you be able to provide constraint reports while ignoring that qualifier (instead of throwing an error and producing no report, as is currently done)? Mahir256 (talk) 18:35, 6 June 2021 (UTC)[reply]
@Ivan A. Krestinin: Thoughts on the idea of ignoring that qualifier? Mahir256 (talk) 17:56, 17 June 2021 (UTC)[reply]
Hi! I reviewed several usages of the property. As far as I can see, it looks completely redundant. Why not just remove it? It is also very confusing, because it is similar to property scope (P5314). — Ivan A. Krestinin (talk) 20:49, 21 June 2021 (UTC)[reply]
@Ivan A. Krestinin: I agree that it is redundant for you, given that you only check main values, which is why I'm asking if you could ignore it and possibly other properties not applicable in that situation when generating reports. I believe that P4680 is still useful for the gadget for which @Lucas Werkmeister (WMDE): proposed that property in the first place, and possibly for other future tools which are developed for constraint checking. (@MisterSynergy:, as the proposer of P5314, who might have more to say on that point.) Mahir256 (talk) 21:42, 21 June 2021 (UTC)[reply]
property scope (P5314) defines where a property might be used, while constraint scope (P4680) defines where the constraint should be checked.
As an example, consider identifier properties. They are usually allowed (via property scope (P5314)) as main values and references. A distinct-values constraint (Q21502410), however, should not be checked on references, as the reference value might occur on different claims and even different items.
MisterSynergy (talk) 09:39, 23 June 2021 (UTC)[reply]
distinct-values constraint (Q21502410) is checked for main values only. I do not see any reason to duplicate this fact on each property page. — Ivan A. Krestinin (talk) 22:56, 23 June 2021 (UTC)[reply]

resolved redirects

I came here to pat the bot. I had no idea that I had left so many links to my redirects. Today, there was a good bot, thanks for that.--RaboKarbakian (talk) 00:26, 2 May 2021 (UTC)[reply]

Lingering uses of redirects in statements

Hi! I understand your bot is responsible for fixing statements that refer to redirected items. I see that high-mass X-ray binary (Q71963720) was redirected on 2019-12-10T17:42:33‎, but there are still many statements using the old item (e.g. here is one I fixed today). Do you know why these are not getting fixed? (I noticed this when trying to work out why there were so many type violations on Wikidata:Database_reports/Constraint_violations/P59.) Cheers, Bovlb (talk) 16:30, 26 May 2021 (UTC)[reply]

Hello, currently the bot fixes each redirect only once. After that, the bot adds the item to a special "already fixed" list and ignores it. The bot fixed all redirects to Q71963720 on 2019-12-12, but after that the redirect was used again by User:Ghuron, see [4] as an example. Looks like I need to create a special algorithm to detect reuse of already-fixed redirects. — Ivan A. Krestinin (talk) 09:47, 6 June 2021 (UTC)[reply]
I need to fix that in my script, thanks Ghuron (talk) 15:45, 6 June 2021 (UTC)[reply]
I also added detection of such items to my bot. Links to high-mass X-ray binary (Q71963720) were fixed. — Ivan A. Krestinin (talk) 05:59, 7 June 2021 (UTC)[reply]
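The re-check Ivan describes amounts to intersecting the "already fixed" list with the values currently in use; a minimal sketch with made-up data structures (statements as (item, property, value) tuples):

```python
def find_reused_redirects(already_fixed: set, statements: list) -> set:
    """Redirect items that reappear as statement values
    after having been resolved once."""
    used_values = {value for _item, _prop, value in statements}
    return already_fixed & used_values

already_fixed = {"Q71963720"}                # resolved back in 2019
statements = [("Q100", "P59", "Q71963720"),  # a later edit reusing the redirect
              ("Q101", "P31", "Q523")]
assert find_reused_redirects(already_fixed, statements) == {"Q71963720"}
```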

Removal of "occupation" for Peter and Rosemary Grant

Hi, your bot is removing one of the occupations ("evolutionary biologist", the most important one!) of Peter and Rosemary Grant (Q3657692), as in this edit, and I cannot understand why. (By the way, a bot should probably not repeat the same edit after it has been manually reverted; that should rather lead to a discussion.) Cheers, Schutz (talk) 09:11, 28 May 2021 (UTC)[reply]

Hi, the bot removes person-specific properties (birth/death dates, nationality, spoken language, occupation, etc.) from items about groups of people. It is a very common mistake, repeated many times by different bots, semi-automatic procedures, and some users. — Ivan A. Krestinin (talk) 09:53, 6 June 2021 (UTC)[reply]
In any case, the bot should not blindly remove the same information several times; it should alert the user instead. But I don't really see why "writer" is kept in this case, while "evolutionary biologist" is not. Here the latter is not an error, as the couple worked together as evolutionary biologists. The problem is that by simply removing the property, nothing meaningful appears in the infobox at w:fr:Peter et Rosemary Grant (at the moment it is only "writer", translated into French). If you have any suggestion about how the interesting information can be displayed (in other words, how the Wikidata item can record that the pair worked as evolutionary biologists, not "writers", so that this information can trickle down to the infobox), I'd love to hear it. Otherwise, could you change your bot so that it does not remove this useful (and correct) information? Many thanks in advance, Schutz (talk) 13:54, 5 July 2021 (UTC)[reply]
Wikidata is not just a collection of information; the information should also be structured. The discussed type of error is very common, and many users make it from time to time. The bot has already fixed 4649 cases of this type... 4649 notifications on user pages would look like a spam bot :) I changed the article a bit to present the information more cleanly. — Ivan A. Krestinin (talk) 22:17, 12 July 2021 (UTC)[reply]

Request

Can this bot replace %20 with a space in all current and future values of Namuwiki ID (P8885) (for example, from 머라이어%20캐리 to 머라이어 캐리)? Thanks. Hddty (talk) 01:05, 9 June 2021 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 17:00, 10 June 2021 (UTC)[reply]
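The requested transformation is effectively percent-decoding the space; for illustration only (not KrBot's code):

```python
from urllib.parse import unquote

# The narrow fix, exactly as requested:
assert "머라이어%20캐리".replace("%20", " ") == "머라이어 캐리"

# General percent-decoding gives the same result here:
assert unquote("머라이어%20캐리") == "머라이어 캐리"
```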

Resolving redirect (Q12368077_Q5334199)

As a consequence of a bad merge, there are now 292 erroneous links[5]. These links should be "unresolved". May I also suggest that your bot wait longer after a merge before resolving links, perhaps a week or even a month, so that bad merges are likelier to be caught beforehand. 2001:7D0:81DA:F780:A8E0:C965:57F9:B464 06:53, 24 June 2021 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 07:59, 26 June 2021 (UTC)[reply]

Update the report of P2991

Hi Ivan. Could you maybe let your KrBot run over IBSF athlete ID (P2991) again, regarding Wikidata:Database reports/Constraint violations/P2991? I have fixed several errors, but I also noticed that there are definitely false reports, and I would be interested to know whether these will now be removed. --Gymnicus (talk) 11:25, 25 June 2021 (UTC)[reply]

Thank you --Gymnicus (talk) 14:54, 25 June 2021 (UTC)[reply]

To merge

Why are there no updates of User:Ivan A. Krestinin/To merge anymore? - FakirNL (talk) 08:24, 2 July 2021 (UTC)[reply]

Hi! The bot failed on wrong different from (P1889) values like this and this. I added some checks to skip such values. The reports should be updated in 1-2 days. — Ivan A. Krestinin (talk) 09:26, 3 July 2021 (UTC)[reply]

Wrong statements based on wrong constraint

Hey, could you please revert edits based on this statement? The constraint was erroneous. – Máté (talk) 04:50, 4 July 2021 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 10:45, 4 July 2021 (UTC)[reply]


Hi Ivan,

Similarly to replacement property (P6824), can you ignore this property when it is found in a property constraint? Otherwise KrBot will generate an error. --- Jura 11:33, 11 July 2021 (UTC)[reply]

Hi Jura, interesting property. I added it to the ignore list for conflicts-with constraint (Q21502838) and none-of constraint (Q52558054). But it looks like the bot should do something more than just ignore this property. Currently the property is used a bit randomly. For example, I do not understand its usage for location of creation (P1071):
Do you have some ideas how to make its usage more structured maybe? — Ivan A. Krestinin (talk) 21:22, 12 July 2021 (UTC)[reply]
I had seen that use too, but wasn't sure what to think of it. The problem seems to be that the replacement isn't always applicable. Personally, I'd remove that.
The samples at replacement value (P9729) are closer to how I'd use them. @Dhx1: documented them at Help:Property_constraints_portal/None_of.
I noticed platform (P400) has plenty of constraints that can use it. --- Jura 14:54, 13 July 2021 (UTC)[reply]

Thank you

Just wanted to drop a big "thank you" for cleaning up the apparent mess that Pi bot made with all the duplication. Great catch. Huntster (t @ c) 23:52, 15 August 2021 (UTC)[reply]

New constraint Label in Language

Hi Ivan,

Please see phab:T195178. To test the future deployment, I added label in language constraint (Q108139345) at [6]. You might need to have Krbot skip it. --- Jura 12:30, 7 September 2021 (UTC)[reply]

Hi Jura, I added a fake implementation for the constraint. It would not be hard to add a real implementation, but that requires loading information about labels, which takes some memory, and unfortunately memory is a critical resource now. See my message below for details. — Ivan A. Krestinin (talk) 20:58, 8 September 2021 (UTC)[reply]
I had previously implemented it with complex constraints, see Help:Property_constraints_portal/Label_language. --- Jura 07:59, 9 September 2021 (UTC)[reply]

Wikidata:Database_reports/Constraint_violations/P2088 has not been regenerated for 16 days.

Is there a way to force it to revalidate this property? --Vladimir Alexiev (talk) 14:24, 8 September 2021 (UTC)[reply]

# Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial
SELECT DISTINCT ?item ?itemLabel ?value WHERE {
	?statement wikibase:hasViolationForConstraint wds:P2088-DD4CDCEA-B3F6-4F02-9CFB-4A9E312B73A8 .
	?item p:P2088 ?statement .
	?statement ps:P2088 ?value.
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
  • Unfortunately this returns fewer violations compared to the pages generated by KrBot. See the comment in the query: "Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial"

"Unique value" violations due to duplicate external-id

Looking at https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P2088#%22Unique_value%22_violations, we see many Qnnn values that are the same.

I described them as "false positives", but then looked at some instances, e.g. https://www.wikidata.org/wiki/Q5013693#P2088, and saw that there is indeed a problem: the same external id is recorded both with and without a reference. The one without a reference should be removed --Vladimir Alexiev (talk) 12:55, 9 September 2021 (UTC)[reply]
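The fix Vladimir describes (for duplicated external-id values, drop the unreferenced copy) could be sketched like this, with a made-up statement structure:

```python
def dedupe_external_ids(statements: list) -> list:
    """Keep one statement per external-id value, preferring the referenced one."""
    best = {}
    for s in statements:
        kept = best.get(s["value"])
        if kept is None or (s["references"] and not kept["references"]):
            best[s["value"]] = s
    return list(best.values())

statements = [
    {"value": "12345", "references": []},
    {"value": "12345", "references": ["imported from ..."]},
]
assert dedupe_external_ids(statements) == [
    {"value": "12345", "references": ["imported from ..."]},
]
```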

Culture Bas-Saint-Laurent

Hi Ivan,

As an organization, we have taken care to complete the entry for Culture Bas-Saint-Laurent (Q108475391). We will request the removal of entry Q87727973 since it is now obsolete.

Thank you

 – The preceding unsigned comment was added by [[User:|?]] ([[User talk:|talk]] • contribs). {{Section resolved}}

Hi Ivan,

{{Autofix}} allows adding additional statements based on existing values.

An interesting enhancement could be to do this as a constraint as well, e.g.

if currentProperty + currentPropertyValue then requiredProperty + requiredPropertyValue

Also:

if currentProperty + currentPropertyValue then requiredProperty

Maybe the property in the condition could be an argument as well:

if currentProperty + otherProperty + otherPropertyValue then requiredProperty
if currentProperty + otherProperty + otherPropertyValue then requiredProperty + requiredPropertyValue

--- Jura 09:46, 12 September 2021 (UTC)[reply]

Hi Jura, could you provide some examples for testing? — Ivan A. Krestinin (talk) 10:46, 19 September 2021 (UTC)[reply]
How about these ? --- Jura 14:28, 19 September 2021 (UTC)[reply]
I misread your message the first time; now I understand your idea. It is possible to create such a constraint, but maybe {{Autofix}} is enough? Do we need to additionally control such cases using constraints? — Ivan A. Krestinin (talk) 14:46, 19 September 2021 (UTC)[reply]
I will try to dig up better examples. When Autofix is (safely) possible (for item datatype properties with a predefined value), the constraint wouldn't be that useful. --- Jura 14:53, 19 September 2021 (UTC)[reply]

Samples:

Sorry for the delay. Happy holidays. --- Jura 00:32, 24 December 2021 (UTC)[reply]

COSPAR and Coordinates

Hello Ivan. Regarding your reverts to my removals at COSPAR ID (P247) and coordinate location (P625), you have to realize that more than just satellites are sent into space and receive COSPAR IDs. Probes sent to other worlds have need for coordinates (primarily Mars since that planet is supported in our system, but others are as well), and it's entirely possible that Earth-bound spacecraft may potentially have a need for it as well. My point is, making these two properties mandatorily conflicting doesn't make sense in modern spaceflight. Huntster (t @ c) 13:25, 19 September 2021 (UTC)[reply]

A landing point is just one of the points in a spacecraft's life. So we can specify coordinates for some event, but not for the spacecraft as a whole. It is the same as specifying geographic coordinates for a human. Just add geo coordinates as a qualifier to some event. Like this or this. — Ivan A. Krestinin (talk) 13:49, 19 September 2021 (UTC)[reply]


spouse (P26) duplicate statements

Hi Ivan,

What do you think of Wikidata:Bot_requests#Merge_multiple_P26_statements? Didn't your bot merge some statements? --- Jura 21:15, 22 September 2021 (UTC)[reply]

Hi Jura, usually my bot does not clean such duplicates because the values have different qualifiers. I started a special job for this case. It is in progress now. — Ivan A. Krestinin (talk) 23:01, 22 September 2021 (UTC)[reply]
✓ Done, please check the remaining 18 items manually. The bot failed to resolve the data conflicts in them. — Ivan A. Krestinin (talk) 21:49, 23 September 2021 (UTC)[reply]

Good afternoon. A strange update has arrived here. The first and second objects lack the second value, and the third has no such property at all, and it is like this for almost all of them. It feels as if the bot generated the report back on September 17 but only published it now, on September 28. 185.16.139.123 20:52, 28 September 2021 (UTC)[reply]

DOI format restriction

Hi, I noticed something strange at the DOI property, maybe you can identify the root of the problem? As you can see in Wikipedia in Health Professional Schools: from an Opponent to an Ally (Q108747926), the DOI property is falling under a format restriction, [a-z]*. Not sure how to fix it. Good contributions, Ederporto (talk) 00:13, 30 September 2021 (UTC)[reply]

Hello, just use upper case: [7]Ivan A. Krestinin (talk) 16:09, 30 September 2021 (UTC)[reply]

Men's basketball

The bot is adding "men's basketball" to male basketball players. This property is not for individuals; it is for clubs, teams, competitions. Therefore all those bot contributions create an exclamation mark (!) which can be avoided by stopping this activity. When I remove the thingy from the individual sportsmen's items, the bot comes and adds it again! (Now I used an exclamation mark. :) Cheers. --E4024 (talk) 15:23, 3 October 2021 (UTC)[reply]

Hi, could you provide link to the edit sample? — Ivan A. Krestinin (talk) 19:26, 3 October 2021 (UTC)[reply]
Ömer Faruk Yurtseven (Q18129444) and many others... --E4024 (talk) 23:13, 3 October 2021 (UTC)[reply]
This bot behavior is caused by this edit. I deleted the constraint, so the bot will not make such edits anymore. Also I added a conflicts-with constraint for better protection. What should we do with the existing values of competition class (P2094) in human (Q5) items? 1. Delete all such values. 2. Delete only those added by my bot. 3. Move all values to sports discipline competed in (P2416). 4. Something else? — Ivan A. Krestinin (talk) 16:30, 4 October 2021 (UTC)[reply]

Merge overlaps

Hi. I am trying to work out how we can avoid reporting different versions of a work as duplicates, without having to put them on a "do not merge" list or mark them as different. Example

Here we have the parent work and the respective versions or translations across wikis, and they are listed on the parent. We can expect this to become a widespread situation as more and more works are transcribed. Is it worthwhile not listing items as duplicates where they are both listed on the parent with has edition or translation (P747)? Thanks for the consideration.  — billinghurst sDrewth 23:47, 10 October 2021 (UTC)[reply]

Hello, it is better to ask User:Pasleim about this. He is the report's author. Possibly adding different from (P1889) would help. — Ivan A. Krestinin (talk) 17:09, 11 October 2021 (UTC)[reply]

Добрый день. В отчётах по GeoNames творится какой-то ад, разбирать на две-три жизни. Вы не могли бы проверить точность настроек? Как вариант, поделить списки на страны, позвать добровольцев… В общем проблема сама себя не решает. 194.50.15.241 05:35, 12 October 2021 (UTC)[reply]

Greetings, similar hell is going on here in practically every popular property. Unfortunately, I cannot cover all the properties myself. You can find the report generation settings on Property:P1566. Less common problems, for example incorrect format, are easier to fix by hand; there are only 10 items there. For mass problems, you can try to identify groups of errors and propose some automated procedures for fixing them. Wikidata:Bot requests can help you with this. — Ivan A. Krestinin (talk) 05:59, 12 October 2021 (UTC)[reply]
I was more interested in whether your bot could sort the problems by object type (rivers, lakes, mountains) and by country (Russia, CIS). That would already be manageable to work through. 194.50.15.241 20:13, 12 October 2021 (UTC)[reply]
Yes, that can be achieved by adding the group by (P2304) property. Take a look at how it is done, for example, here: Property:P1538. There is one unpleasant caveat: grouping by two properties at once is not possible, so it is either by country or by type. You can also try building a custom report using SPARQL. — Ivan A. Krestinin (talk) 20:48, 12 October 2021 (UTC)[reply]
Anonymous users are not allowed to edit properties. Could you help? Types: lake (Q23397), river (Q4022), mountain (Q8502); countries: Russia (Q159), Ukraine (Q212), Belarus (Q184). 194.50.15.241 18:48, 13 October 2021 (UTC)[reply]
I added grouping by country. But you had better create an account and continue yourself, or use SPARQL. Here is an example query that returns all objects in Russia that have more than one code:
SELECT ?item ?itemLabel
WHERE
{
  {
    SELECT DISTINCT ?item {
      ?item wdt:P1566 ?value1 .
      ?item wdt:P1566 ?value2 .
      ?item wdt:P17 wd:Q159 .
      FILTER( ?value1 != ?value2 )
    }
  } .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,en" } .
}
Try it!
Ivan A. Krestinin (talk) 21:33, 14 October 2021 (UTC)[reply]
Thank you. As I understand it, the grouping will appear when the report is updated? It is currently dated October 8. 194.50.15.241 03:18, 16 October 2021 (UTC)[reply]
Yes, at the next update. The Wikidata database has grown considerably lately; unfortunately, the bot now needs about ten days to generate the next version of the reports. SPARQL is more convenient in that respect. — Ivan A. Krestinin (talk) 15:14, 16 October 2021 (UTC)[reply]

Q10497835

Is it possible to reverse this replacement? Eurohunter (talk) 15:48, 15 October 2021 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 09:53, 16 October 2021 (UTC)[reply]

An actress changes sex and becomes an actor... :-)

Hello Ivan. In the item Maurane (Q509029), a Belgian female singer, KrBot repeatedly changes another one of her occupations, actress, into its male counterpart: "actor". I don't know why, but could you please solve this? Thanks a lot in advance: Tatvam (talk) 16:10, 15 October 2021 (UTC)[reply]

@Tatvam: This processing is quite intentional. The data object actor (Q33999) does not describe only male actors but also female actors, i.e. actresses. In that respect the item is no different from the data object singer-songwriter (Q488205), which also describes both female and male persons. --Gymnicus (talk) 18:45, 15 October 2021 (UTC)[reply]
Thank you for your answer, but I used the data object actress (Q21169216), not actor (Q33999), and I would like it to stay like that. It is KrBot which repeatedly changes actress (Q21169216) to actor (Q33999) without reason. Tatvam (talk) 18:57, 15 October 2021 (UTC)[reply]
@Tatvam: If you do not want this change, you should raise this concern on the discussion page of occupation (P106), because that is where the bot is asked to make these changes. --Gymnicus (talk) 19:19, 15 October 2021 (UTC)[reply]
Hello, @Tatvam:, KrBot makes these changes because Property talk:P106 has an {{Autofix}} template for this value. Please discuss the case on Property talk:P106 and delete the autofix if required. — Ivan A. Krestinin (talk) 10:47, 16 October 2021 (UTC)[reply]

Problematic bot

Your bot (KrBot) is replacing proper item (Q4354683) with some nonsensical disambiguation page (Q3537858), tens of pages are affected. Please stop that. --Orijentolog (talk) 18:17, 15 October 2021 (UTC)[reply]

@Orijentolog: There is now very little Ivan can do about these edits. The bot is programmed to resolve redirect links, and the two data objects mentioned were merged by a user on May 18, 2021 and only separated from each other again on October 4th. In the meantime, the bot did its job and replaced the redirect links. The bot cannot see that the merge was wrong. --Gymnicus (talk) 18:41, 15 October 2021 (UTC)[reply]
Thanks for the info, it's mostly OK now because I fixed most mistakes manually. I just want to be sure that bot won't repeat the same mistakes. Greetings to both. :) Orijentolog (talk) 18:45, 15 October 2021 (UTC)[reply]
@Orijentolog: The bot should not change it back, since it is no longer a redirect link, and in principle the bot now ignores these links. If such changes happen, then something is wrong with the programming. --Gymnicus (talk) 19:06, 15 October 2021 (UTC)[reply]
Hi, Orijentolog, I reverted the bot's changes. The bot waits for some time before resolving redirects. But a wrong merge that exists for a long time creates issues not only for my bot. Humans and other bots use the target item only, so reverting an old wrong merge requires reviewing all links in any case. — Ivan A. Krestinin (talk) 14:35, 16 October 2021 (UTC)[reply]

Lexeme constraint

Did you see Property talk:P1296#Lexeme language? FogueraC (talk) 07:43, 16 October 2021 (UTC)[reply]

Hi, I have too many notifications, so I did not see the {{Ping}} mentions. Sorry. — Ivan A. Krestinin (talk) 15:10, 16 October 2021 (UTC)[reply]
No problem. And thanks! FogueraC (talk) 15:59, 16 October 2021 (UTC)[reply]

detected wrong merge

I just discovered a faulty merge, which unfortunately also led to edits by your bot. Could you see if you can undo the edits your bot made where it changed the data object Chowdhury (Q30971895) to Chaudhry (Q1068345)? --Gymnicus (talk) 22:59, 16 October 2021 (UTC)[reply]

I've created a new item - Q108911685 for non-Latin surnames. Probably it also needs separation. --Infovarius (talk) 23:39, 16 October 2021 (UTC)[reply]
@Infovarius: Thank you, I also see that as useful. But shouldn't we also separate the individual languages (Bengali, Nepalese, Urdu and Newari)? At least the names look very different to me as a layman who has no idea about these languages. --Gymnicus (talk) 11:07, 17 October 2021 (UTC)[reply]
I reverted my bot edits. — Ivan A. Krestinin (talk) 01:56, 17 October 2021 (UTC)[reply]
Thank you very much --Gymnicus (talk) 11:07, 17 October 2021 (UTC)[reply]

Hi;

Why have you removed this property? I would ask you to undo the change, please. —Ismael Olea (talk) 10:58, 31 October 2021 (UTC)[reply]

Hi, both have too many violations (more than four hundred). That is too many for mandatory constraint (Q21502408). This flag was created for monitoring and manually fixing a small number of unexpected cases. But the mechanism is broken now: Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations stopped updating because its current size is ~7 MB (the page size limit is 2 MB). — Ivan A. Krestinin (talk) 11:26, 31 October 2021 (UTC)[reply]

Why is KrBot removing Swedish Open Cultural Heritage URI (P1260) like this? Swedish Open Cultural Heritage URI (P1260) is allowed to have duplicate values. /ℇsquilo 13:40, 4 November 2021 (UTC)[reply]

Why would you want a duplicate value? — Martin (MSGJ · talk) 15:14, 4 November 2021 (UTC)[reply]
Several absolutely identical values are a mistake in most cases. I can add Swedish Open Cultural Heritage URI (P1260) as an exception. But maybe it is better to add some qualifier to the values? For example applies to part, aspect, or form (P518). Currently the values also look very strange to humans. It looks like Wikidata has a single item for the lighthouse and the Swedish database has a single record as well, so it is not obvious why the identifier should be specified twice. — Ivan A. Krestinin (talk) 15:24, 4 November 2021 (UTC)[reply]
I guess the squirrel didn't notice that the values were exactly the same! — Martin (MSGJ · talk) 18:09, 4 November 2021 (UTC)[reply]

Constraint Violation Statistics

Hi Ivan,

I'm currently conducting research on the constraint violations of Wikidata and I have found your bot KrBot2. My question is whether the queries/scripts for violation counting are available in some git repo, or if there is another way to get them. Thank you for your time!

Cheers, Nicolas

Hi Nicolas, the code is currently located in a private repo. The code loads full and incremental Wikidata dumps. This process takes ~9 days and requires a significant amount of memory, so I am not sure the code would be useful for your task. Maybe this report will be enough for your research. — Ivan A. Krestinin (talk) 14:36, 5 November 2021 (UTC)[reply]

Hi! Why this bot edit? The gallery and the category are two completely different concepts; the existence of one doesn't imply the existence of the other, and even if they both exist, their names may not match (cf. c:Category:Moscow vs c:Москва). Now no statement carries the information that there's a gallery about Evolution. And I don't even see anything on Property talk:P935 that would instruct the bot to do so, so I can't stop it. (Which is unfortunately often an issue with your bots: not open source, so I can't look at the code; often unclear edit summaries; no community control over certain tasks.) —Tacsipacsi (talk) 14:07, 7 November 2021 (UTC)[reply]

Hello, Evolution (Q336251) does not have a gallery on Commons. "Evolution (software)" is a redirect to a category. I agree, the edit summary is a bit confusing for this case. — Ivan A. Krestinin (talk) 16:24, 7 November 2021 (UTC)[reply]
I see. Yes, please use an appropriate edit summary in such cases: the bot didn't move the statement value, it removed it because it was no longer appropriate. As I explained above, I can't imagine a case where this edit summary would be appropriate, assuming the edits themselves are correct. —Tacsipacsi (talk) 00:55, 8 November 2021 (UTC)[reply]

Update on constraints reports

Nice bot. Any chance to get an update on the constraints report for Norwegian historical register of persons ID (P4574)? I might report a few more properties with recently added constraints soon assuming that's ok. Thanks. --Infrastruktur (T | C) 07:45, 18 November 2021 (UTC)[reply]

Hello, the bot does not touch a report if the only change is the item count or the date. I updated it manually. The bot did not detect any constraint violations. — Ivan A. Krestinin (talk) 22:07, 18 November 2021 (UTC)[reply]

How to undo the consequences of an incorrect merge?

Hi,

There was an incorrect merge of Alsace wine (Q80114014) and Alsatian Vineyard Route (Q1334019) (done by Andrew Dalby and undone by Jon Harald Søby last October, thanks).

But I noticed today that your bot replaced the first by the second, which makes the current situation a mess: 700+ wines are now defined as a road... (and this also causes 700+ constraint violations), see Q41437058 for one example.

My question: is there a simple way to undo the replacement? (Or at least to get the list, so I can do an overwrite with QuickStatements.)

Cheers, VIGNERON (talk) 12:45, 20 November 2021 (UTC)[reply]

Sorry about that. It seemed useful at the time and I had no idea that this chain reaction would happen. Andrew Dalby (talk) 14:05, 20 November 2021 (UTC)[reply]
@Andrew Dalby: no problem, it happens... that's why I'm very careful when merging; there can quickly be dire consequences, but errare humanum est, so shikata ga nai. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)[reply]
Thanks for your reply, @VIGNERON:. Thinking it over, I guess one should consider before merging whether the pages have, or ought to have, the same instance of (P31) values. Normally they will. But in this case they wouldn't have, and that would have been a warning. Andrew Dalby (talk) 15:38, 23 November 2021 (UTC)[reply]
Hello, I rolled back the link changes. Previously links like this were used for rollbacks as well. See User_talk:Ivan_A._Krestinin/Archive#suggestion: using edit groups for solving redirects. for details. @Pintoch:, @Pasleim: currently the edit group tool shows an 'Edit group "Q80114014_Q1334019" not found' error. Is this a bug? — Ivan A. Krestinin (talk) 22:00, 20 November 2021 (UTC)[reply]
@Ivan : yes, I noticed this error too, I was wondering if it was just me or not. Anyway, thanks for the quick answer and I'll let you look into it. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)[reply]
@VIGNERON, Ivan A. Krestinin: thanks for the notification. KrBot still seems to have its edits tracked by EditGroups (https://editgroups.toolforge.org/?user=KrBot) but somehow this batch seems to have been missed, it is not clear to me why. I will look into the problem. − Pintoch (talk) 14:36, 24 November 2021 (UTC)[reply]

Unitless range constraint (Q21510860) isn't checked "correctly" for quantities with units

Many properties are naturally expressed with units, but have unitless range constraints. (This should be deprecated and fixed, but that's another discussion).

For example, duration (P2047) has a constraint which Wikibase interprets as meaning that the maximum allowed duration is a billion seconds.

KrBot2 currently doesn't list violations of the billion-seconds constraint, such as 70 years or more after author(s) death (Q29870196). Wikibase does. (In this case, the constraint is inappropriate and should be removed, IMHO).

If it's not too hard to do, it would be nice if KrBot2 and Wikibase could agree on how to interpret such constraints.

Streetmathematician (talk) 13:29, 22 November 2021 (UTC)[reply]

Originally the "Range" constraint checked the value only; units are ignored by this constraint. It looks like it was reimplemented in Wikibase using a somewhat strange normalization algorithm. I agree that some properties may require taking units into account, but there are few examples of such cases. duration (P2047) looks like an erroneous usage of the "Range" constraint: for example, the lifetime of the Sun is more than 1000000000 seconds. I fixed the constraint. The reason why units are not supported by the Range constraint is very simple: conversion from one unit to another might be non-trivial, for example Mach number (Q160669) -> kilometre per hour (Q180154). Also, the set of all units used on Wikidata is not defined. I can implement unit support for some specific cases, but not for all possible units. — Ivan A. Krestinin (talk) 17:05, 23 November 2021 (UTC)[reply]
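The conversion problem can be illustrated with a short sketch: the time-unit factors below are exact, while a unit such as Mach deliberately has no entry, so the checker skips rather than guesses. This is an illustration of the difficulty, not KrBot2's actual code:

```python
# Conversion factors to the base unit (second). These factors are exact;
# Mach is deliberately absent: it depends on altitude and temperature,
# so no single factor is correct.
TO_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "year": 31557600,  # Julian year
}

def in_range(value, unit, lo, hi):
    """Range check in base units; returns None when the unit cannot be
    converted (the checker must skip, not guess)."""
    factor = TO_SECONDS.get(unit)
    if factor is None:
        return None
    return lo <= value * factor <= hi

print(in_range(70, "year", 0, 1_000_000_000))  # 70 years > 1e9 s -> False
print(in_range(5, "Mach", 0, 1_000_000_000))   # unknown unit -> None
```

This matches the "70 years after death" case: it violates a billion-second upper bound only if the checker converts years to seconds.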

Please don't edit Q4233718

The edits to anonymous (Q4233718) by your bot are incorrect. Please make sure that your bot doesn't edit that item. Multichill (talk) 16:45, 27 November 2021 (UTC)[reply]

Fixed: [8], [9], [10], [11], [12]. — Ivan A. Krestinin (talk) 21:11, 27 November 2021 (UTC)[reply]

Hello! Your bot used the old dead link vwo.osm.rambler.ru for this list: Wikidata:Database reports/Constraint violations/P884. Машъал (talk) 11:25, 8 December 2021 (UTC)[reply]

Greetings, the bot always takes the first value of the formatter URL (P1630) property, regardless of its rank. That is a limitation. I worked around the problem by swapping the masks. Come to think of it, what is the point of keeping obsolete masks on the property page anyway... History belongs on the history page... — Ivan A. Krestinin (talk) 22:33, 10 December 2021 (UTC)[reply]
Thank you. I do not know why either, especially since the link is dead. But someone set it up that way; perhaps the rules require it? Машъал (talk) 19:06, 14 December 2021 (UTC)[reply]
No, there are no special rules about this; someone simply did not want to delete the obsolete link. — Ivan A. Krestinin (talk) 21:18, 14 December 2021 (UTC)[reply]

Scholarly article duplicates

Hi - I've gone through the list at User:Ivan A. Krestinin/To merge/Scholarly articles as it was a few weeks ago, and merged a large number of them (over 1000). However I see the list has been updated. Would it be possible to sort this list by recently created items (for example, Q109?????? duplicates at the top, etc.)? Or is the list possibly incomplete, and might items be added that were created a long time ago but weren't caught by your checks yet? ArthurPSmith (talk) 21:29, 13 December 2021 (UTC)[reply]

Oh - I just realized you put it in a sortable table so I could have sorted on Qid from the start! Anyway, I guess I'll wait for the next update to this to see what I may have missed. I've contacted one person who was creating duplicates and that seems to have ceased, so hopefully we won't get so many going forward. ArthurPSmith (talk) 21:35, 13 December 2021 (UTC)[reply]
Hi Arthur, you did a great job, thank you! The list is incomplete of course. It is limited by size (1 MB); the full report is 36 MB now. The bot sorts the report by an internal rank, so the end of the full report contains mostly false positives. I update the report after Wikidata:Database reports/Constraint violations/P356 and some other pages update. — Ivan A. Krestinin (talk) 21:17, 14 December 2021 (UTC)[reply]

Database reports/identical birth and death dates

The valuable report Wikidata:Database reports/identical birth and death dates/1 seems to contain a lot of matches with 1 January for date of birth or date of death at the moment, most of which are probably spurious precision. Would it be worth suppressing these, as they very seldom represent an actual match ? Jheald (talk) 13:00, 23 December 2021 (UTC)[reply]

In my opinion, the best solution here is just to fix the wrong values. Did you contact the user who created the values with wrong precision? Maybe he has some tool to fix them. — Ivan A. Krestinin (talk) 22:26, 23 December 2021 (UTC)[reply]
You're probably right, that it's better to try to fix issues than just to hide them. I've been in touch with Ghuron, who created about 760 entries like this as part of a recent upload of data from The Righteous Among the Nations Database (Q77598447) at Yad Vashem. Some others may have been created back in 2014 by User:GPUBot (since blocked). There may also be others again, created in other uploads (cf https://w.wiki/4bcR - quite a diverse set of items); but with luck the situation should become clearer once the Yad Vashem ones are sorted out. Jheald (talk) 17:23, 24 December 2021 (UTC)[reply]
I started a task that fixes January 1 values from The Righteous Among the Nations Database (Q77598447); edit example: [13]. — Ivan A. Krestinin (talk) 08:31, 25 December 2021 (UTC)[reply]
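For context, a Wikibase time value carries a numeric precision (9 = year, 11 = day). A simplified sketch of this kind of fix, not the bot's actual code:

```python
YEAR, DAY = 9, 11  # Wikibase precision codes

def fix_spurious_jan1(tv):
    """If a day-precision value is January 1, assume only the year was
    actually known in the source and drop the precision to year (the
    time string is kept; month/day are ignored at year precision)."""
    if tv["precision"] == DAY and tv["time"].endswith("-01-01T00:00:00Z"):
        return {**tv, "precision": YEAR}
    return tv

print(fix_spurious_jan1({"time": "+1903-01-01T00:00:00Z", "precision": 11}))
print(fix_spurious_jan1({"time": "+1903-06-15T00:00:00Z", "precision": 11}))  # unchanged
```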

Autofix - P17

Hi, can you stop the automated edits changing Catalonia to Spain? While I am totally against these edits, as established by our Wikipedia community, Catalonia, like other nations such as Kurdistan, is not located in only one sovereign state; it is divided, in our case, between two (Spain and France). Therefore, certain technically wrong edits may be made. Regards, --KajenCAT (talk) 10:12, 7 January 2022 (UTC)[reply]

Hello, just remove the {{Autofix|pattern=Q5705|replacement=Q29}} line from Property talk:P17. I do not know the situation with Catalonia in detail. Maybe it would be good to start a discussion on Property talk:P17 before or after removing the autofix template. — Ivan A. Krestinin (talk) 17:46, 7 January 2022 (UTC)[reply]
Thank you for your response. I will provisionally withdraw it and open the subject. Thank you again. KajenCAT (talk) 23:16, 7 January 2022 (UTC)[reply]

Remove audio podcast

can you please remove distribution format: audio podcast from JRE episodes such as JRE #312 - Steve Rinella, Bryan Callen (Q109306593), as most episodes are video podcasts and only very few of them are audio only Germartin1 (talk) 10:41, 8 January 2022 (UTC)[reply]

Hello, I added a rollback task to the bot's task list. The bot should roll back 1420 items today or tomorrow. — Ivan A. Krestinin (talk) 18:16, 8 January 2022 (UTC)[reply]
✓ DoneIvan A. Krestinin (talk) 19:44, 9 January 2022 (UTC)[reply]
Thanks, what about these ones, some of them are video podcasts https://www.wikidata.org/w/index.php?title=Q101011923&type=revision&diff=1408455138&oldid=1335982903 Germartin1 (talk) 11:28, 14 January 2022 (UTC)[reply]
✓ Done I reverted also edits based on Spotify show ID (P5916). — Ivan A. Krestinin (talk) 01:10, 15 January 2022 (UTC)[reply]

Adding internet archive identifiers to items for people

Hello,

Can your bot stop adding an internet archive identifier to items for people such as Q110486431. Thank you. Gamaliel (talk) 16:59, 10 January 2022 (UTC)[reply]

Hello, just delete the line {{Autofix|pattern=<nowiki>https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)</nowiki>|replacement=\1|move_to=P724}} from Property talk:P973. This job was added by Jura1 several years ago; maybe it is good to discuss it with him. Also I added one more conflicting value, which should prevent such edits as well. — Ivan A. Krestinin (talk) 17:48, 10 January 2022 (UTC)[reply]
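For illustration, the effect of that autofix line can be reproduced with Python's re module (the sample identifier below is made up): values matching the URL pattern are reduced to the bare identifier, which the template would then move to Internet Archive ID (P724).

```python
import re

# Pattern from the {{Autofix}} line on Property talk:P973.
PATTERN = r"https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)"

def autofix(value):
    """Return (new_value, moved): the extracted identifier and whether
    the autofix applies; non-matching values pass through unchanged."""
    m = re.fullmatch(PATTERN, value)
    if m:
        return m.group(1), True   # would be moved to P724
    return value, False

print(autofix("https://archive.org/details/firstbooks00good"))  # identifier extracted
print(autofix("some-plain-identifier"))                          # unchanged
```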

Good evening! I would like to invite you, as an engineer of the Russian Wikipedia, to WikiProject Russia. You can help the community integrate ruwiki data more closely into the common data bank, create needed properties, and so on. MasterRus21thCentury (talk) 16:54, 18 January 2022 (UTC)[reply]

Greetings, if there are specific tasks, feel free to contact me. I do not have much free time, but I may be able to take on some tasks. — Ivan A. Krestinin (talk) 17:01, 18 January 2022 (UTC)[reply]
For example, right now you can take part in the discussion of proposed Wikidata properties based on Russian sources. MasterRus21thCentury (talk) 17:35, 18 January 2022 (UTC)[reply]

Closing Wikidata property proposals

Ivan, hello! Could you close some of the Wikidata property proposals? No new properties have been created since Monday, and 61 proposals have piled up awaiting a decision from an administrator or property creator. MasterRus21thCentury (talk) 17:12, 21 January 2022 (UTC)[reply]

Greetings, I create new properties quite rarely; you had better ask other users. I mainly specialize in automated procedures for maintaining data integrity and quality. — Ivan A. Krestinin (talk) 17:15, 21 January 2022 (UTC)[reply]

Many articles with PubMed ID = 9541661

Hi. I find there are 24 articles with PubMed ID = 9541661, e.g., Indirect (repeat) prescribing (Q84597236) and The pharmaceutical industry (Q84597219). Can we recover the edits? And was this a one-time event? Kanashimi (talk) 06:23, 23 January 2022 (UTC)[reply]

Hello, it was a once-run task. The edit looks correct: both 19790808 and 19790797 were deleted by PubMed, which marked them as duplicates of 9541661. It looks like all these IDs were merged because it is actually a single large work, and the Wikidata items correspond to chapters of this work. Usually we have no separate item for each chapter on Wikidata, so I suggest merging all these items, following PubMed. — Ivan A. Krestinin (talk) 10:28, 23 January 2022 (UTC)[reply]
Thank you. Kanashimi (talk) 10:36, 23 January 2022 (UTC)[reply]

Taxonomy bug?

Hi, no idea how this happened, just reporting: https://www.wikidata.org/w/index.php?title=Q469652&diff=1564671901&oldid=1564583128&diffmode=source

Best, AdrianoRutz (talk) 12:52, 28 January 2022 (UTC)[reply]

regular constraint reports

Hi Ivan,

Seems the constraint reports are updated much more frequently now, almost daily. Excellent news. Thanks!

Maybe we should mention it on Wikidata:Status_updates/Next#Other_Noteworthy_Stuff --- Jura 08:22, 10 February 2022 (UTC)[reply]

Cool. I added a note to the weekly news. --- Jura 13:59, 11 February 2022 (UTC)[reply]
Update: +$280 for a RAM upgrade, and the update cycle is now 9 hours. In practice the update frequency is limited by the 24-hour period of incremental dump generation. — Ivan A. Krestinin (talk) 21:21, 15 February 2022 (UTC)[reply]

stats on of (P642) as qualifier by property

Maybe you have seen Property talk:P642.

I think it would be helpful to have statistics about the properties currently using it as qualifiers.

As there are 14 million uses, this is hard to do on query service.

I noticed the constraint report for P31 has them (197165).

Do you have a simple way to generate a summary for all properties (even those without allowed-qualifier constraints, e.g. P279)? --- Jura 12:01, 10 February 2022 (UTC)[reply]

It is not so simple, but I am thinking about a possible implementation. — Ivan A. Krestinin (talk) 16:33, 11 February 2022 (UTC)[reply]
In the meantime we got some approximation with the query Vojtěch provided.
Maybe stats on each property/qualifier pair could be interesting, beyond P642.
OTOH, the most problematic ones might not necessarily be the most used ones. Personally, I think "applies to part" is the most problematic one. --- Jura 17:01, 11 February 2022 (UTC)[reply]
@Jura1: usage report: User:Jura1/P642 usage. — Ivan A. Krestinin (talk) 10:48, 13 February 2022 (UTC)[reply]
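Generating such a summary amounts to tallying (main property, qualifier) pairs across all statements. A toy sketch with collections.Counter over made-up sample pairs, not the bot's dump-processing code:

```python
from collections import Counter

# Each entry: (main snak property, qualifier property) — made-up sample data;
# in practice these pairs would be streamed out of a Wikidata dump.
statements = [
    ("P31", "P642"),
    ("P31", "P642"),
    ("P279", "P642"),
    ("P31", "P518"),
]

# Overall counts per (property, qualifier) pair:
pair_counts = Counter(statements)

# Usage of P642 as a qualifier, broken down by main property:
p642_by_property = Counter(p for p, q in statements if q == "P642")

print(pair_counts.most_common())
print(p642_by_property)
```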

instance of (P31) removal of maintained by wikiproject

Why would you do this? Lectrician1 (talk) 01:11, 11 February 2022 (UTC)[reply]

Just because it throws an error and is not something commonly used. Question: can the discussed cases be fixed automatically? Or do they require non-trivial manual work? Maybe it is better to add {{Autofix}} or something like it? — Ivan A. Krestinin (talk) 16:42, 11 February 2022 (UTC)[reply]
Then why don't we just make it an allowed qualifier? I don't think we should autofix this stuff. Lectrician1 (talk) 17:47, 12 February 2022 (UTC)[reply]
I just followed the old principle: entities should not be multiplied beyond necessity. Is the qualifier used for some automated work? — Ivan A. Krestinin (talk) 21:09, 15 February 2022 (UTC)[reply]
@Ivan A. Krestinin It's to give people an idea about whom to contact if they have questions about the constraint. A lot of the constraints are for managing the WikiProject Music data model, which is complex, and new contributors might have questions about it. Lectrician1 (talk) 01:06, 16 February 2022 (UTC)[reply]
I added the qualifier to the ignored-qualifiers list. Please roll back my edit. — Ivan A. Krestinin (talk) 21:41, 16 February 2022 (UTC)[reply]

identical dates and deprecated January 1

Hi Ivan,

As we kept getting entries with deprecated January 1 dates, I started listing them at False_positives#pairs_with_deprecated_January_1_date. I left some notes about it at #January_1_as_date.

Since then, more get created with deprecated rank directly added (sample: Q110842925#P569).

Accordingly, I'd filter out any deprecated "January 1" date by default. --- Jura 10:35, 11 February 2022 (UTC)[reply]

DOI removal

Good afternoon. Why does your bot remove DOI codes that, while non-working, are confirmed by a source? Among other things, they help avoid duplicate items. --INS Pirat ( t | c ) 05:11, 12 February 2022 (UTC)[reply]

Greetings, on the contrary, they rather get in the way of finding and merging duplicate items. An article usually has only one correct code, while there can be any number of incorrect ones. The result is two items describing the same article but with different DOI codes. At one point a mass cleanup of incorrect codes was carried out; as a result, the number of merged items is already over 50 thousand. Some information about this work: Property talk:P356#15138 wrong values. — Ivan A. Krestinin (talk) 07:07, 12 February 2022 (UTC)[reply]
Вы говорите о некорректных кодах в целом, а не о приведённом случае. И каким образом мешают? После создания повторного элемента, опираясь на тот же источник, точно так же DOI будет помечен как уже используемый (хоть и сам он нерабочий). И у материала может быть и верных кодов несколько. --INS Pirat ( t | c ) 09:53, 12 February 2022 (UTC)[reply]
Было много пар элементов, где в одном из них был указан корректный код, а в другом - некорректный. Робот или человек видел два элемента с разными кодами и делал логичный вывод: это разные статьи, объединять нельзя. Не знаете кстати какова природа этих некорректных кодов? Откуда берется такое большое количество невалидных значений? Кстати идентификатор DOI ведь не единственный по которому можно искать дубликаты. Для статьи, что мы обсуждаем вполне можно искать дубликаты по вполне валидному значению свойства Cairn publication ID (P4700). — Ivan A. Krestinin (talk) 16:10, 12 February 2022 (UTC)[reply]
Я не совсем понимаю вашу позицию. Да, не единственный идентификатор. Но я не считаю, что это препятствует использованию других. Есть факт: в первоисточнике указан определённый DOI. Я оформил его соответствующим образом (для того ранг нерекомендуемости и квалификаторы и нужны). Если, повторюсь, у материала больше одного рабочего DOI, ситуация получается такой же, как вы описываете. --INS Pirat ( t | c ) 16:54, 12 February 2022 (UTC)[reply]
Позиция простая: если превращать Wikidata в коллекцию заблуждений (пусть и помеченных соответствующим образом), то станет крайне сложно выполнять даже такие простые операции, как поиск элементов-дубликатов. Проблема еще осложняется тем, что некоторые участники начинают массово правильные коды помечать рангом нерекомендуемости и выставлять те же квалификаторы. И тут вообще полный ад начинается. Давайте просто не будем без особой на то необходимости тянуть в Wikidata невалидные значения. То, что мы можем это сделать не значит же, что мы должны это делать. — Ivan A. Krestinin (talk) 17:24, 12 February 2022 (UTC)[reply]
С чего бы кому-то таким заниматься, ещё и массово? Где вы такое увидели? И действия участников не должны влиять на допустимость информации. И всё ещё не ясно, какие вы видите помехи поиску дубликатов (как я уже сказал, наоборот, должно помогать). Более глобально: боты вообще не должны касаться целенаправленно внесённой невандальной информации (допустим, с рангами/квалификаторами), по крайней мере уж повторно, при отмене. --INS Pirat ( t | c ) 20:45, 12 February 2022 (UTC)[reply]
Посмотрите, например, эти правки: [14], [15], [16], там правда не DOI, а другой идентификатор, но суть та же. Давайте подробно опишу всю историю: среди элементов описывающих научные статьи дубликаты заливались и продолжают заливаться тысячами. Я решил заняться массовыми мержеваниями. Работа важная, т. к. из-за такого обилия данных движок SPARQL в ближайшем будущем умрет, вычистка дубликатов хоть немного отсрочит его смерть. Главная опасность этой работы - не объединить лишнего, т. к. откат одного неправильного объединения - дело тяжелое, а откат пары сотен неправильных объединений - вообще беда. Потому алгоритмы приходится делать сильно параноидальными, малейшее различие и объединение надо прерывать. Ориентироваться здесь на ранги идентификаторов не получается, т. к. ранг "нерекомендуемый" расставлен достаточно случайным образом (см. примеры выше). Бот успешно отработал и объединил что-то около 10000 пар элементов. После этого я начал анализировать случаи, когда бот не объединял похожие элементы. Выяснилось, что в большом количестве случаев была ситуация, когда в одном элементе был корректный код, а в другом - невалидный. Или в обоих невалидные коды. Это касалось не только DOI. Но DOI был один из самых информативных и самых "замусоренных". Дальше началась долгая история с вычисткой невалидных кодов. Часть кривых кодов была удалена как полный треш, который неизвестно откуда взялся и даже по формату на DOI не был похож. Потом нашлись коды похожие на DOI по формату, но кодами DOI не являющиеся. Пришлось договариваться с организацией CNRI, которая обслуживает этот идентификатор относительно валидации всех 27 миллионов имеющихся у нас кодов. Бот работал больше месяца, но в итоге вычистил почти 80 тысяч кодов, не являющихся DOI. После всех этих работ удалось объединить уже больше 50 тысяч пар элементов и работа продолжается. Только сегодня робот объединил больше 2000 пар элементов. 
По поводу того, чтобы не делать какие-то изменения повторно, если они были отменены кем-то: с одной стороны я с вами согласен, было бы наверное здорово действовать именно так. Но тут три проблемы: 1. это технически весьма сложно, а технически сложные системы обычно содержат много багов и как следствие склонны к невалидному поведению. 2. многие типы ошибок повторяются многократно разными пользователями. 3. на Викиданных приходится оперировать десятками тысяч, если не миллионами элементов, вручную исправлять кейсы, где были откаты нет никакой возможности. — Ivan A. Krestinin (talk) 22:47, 12 February 2022 (UTC)[reply]
Это несколько перпендикулярно теме опоры на источники, но уже более убедительно. (Почему сразу-то к этому не перешли?)
1) Умеет ли алгоритм объединять элементы, в одном из которых указан верный DOI, а в другом - нет никакого (и нет иного уникального идентификатора)? 2) Умеет ли алгоритм объединять элементы с разными, но валидными DOI? --INS Pirat ( t | c ) 17:45, 17 February 2022 (UTC)[reply]
Бессмысленно же опираться на источник, когда в нем указан очевидно ошибочный идентификатор. Опечатались, неправильно данные подтянули, чего только не бывает на больших объемах. Что же нам теперь превращаться в собрание всех ошибок на свете... 1) Если в элементе нет никакого идентификатора, то бот может еще найти по совпадению title (P1476). 2) Нет, сейчас алгоритм ориентируется только на точные совпадения. Есть еще отчет про то, что боту показалось похожим, но объединить автоматически он "не решился": User:Ivan A. Krestinin/To merge/Scholarly articles. — Ivan A. Krestinin (talk) 20:20, 17 February 2022 (UTC)[reply]
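The exact-match strategy described in this thread can be sketched in a few lines of Python (an illustrative toy, not KrBot's actual C++ code; the item IDs and DOI below are made up): group article items by identifier value and treat any group with more than one member as a merge candidate.

```python
from collections import defaultdict

def merge_candidates(items, id_prop="P356"):
    """Group article items by exact identifier value; any group with
    more than one item is a candidate pair for merging."""
    groups = defaultdict(list)
    for qid, claims in items.items():
        for value in claims.get(id_prop, []):
            groups[value].append(qid)
    return {v: qids for v, qids in groups.items() if len(qids) > 1}

items = {
    "Q101": {"P356": ["10.1000/example.doi"]},
    "Q102": {"P356": ["10.1000/example.doi"]},
    "Q103": {"P356": ["10.1000/another.doi"]},
}
print(merge_candidates(items))  # {'10.1000/example.doi': ['Q101', 'Q102']}
```

This is also why a single invalid code is so damaging: it moves an item out of its correct group, so the paranoid exact-match pass never sees the pair.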
doi.org uses a very high-quality dataset, but, like any dataset, it contains some errors. Is there a way to mark an "invalid" DOI (P356) so that the bot skips it? For example, How much of the solar system should we leave as wilderness? (Q63858167) → DOI (P356) → 10.1016/J.ACTAASTRO.2019.03.014: we can verify this by following this link, even though https://doi.org/10.1016/J.ACTAASTRO.2019.03.014 returns 404
The same goes for Photometric and spectroscopic observations of the neglected near-contact binary Cl* Melotte 111 AV 1224 (Q68976229) → DOI (P356) → 10.1088/1538-3873/AAD5D9 (see https://iopscience.iop.org/article/10.1088/1538-3873/aad5d9).
Perhaps some reason for deprecation could be used? Ghuron (talk) 12:07, 17 October 2022 (UTC)[reply]
Colleague, take a look at the history of Randomization-based inference for Bernoulli trial experiments and implications for observational studies (Q49942301). What you and I are doing is pointless. Ghuron (talk) 07:14, 23 October 2022 (UTC)[reply]
Colleague, I understand that you have many messages, but please do notice this request. Ghuron (talk) 08:27, 27 December 2022 (UTC)[reply]
Yes, my apologies, I indeed missed the message. I have added DOI-wise correct codes to these items. Usually, when such a situation arises, I search Google for the exact article title plus "DOI"; the correct code is usually among the first few results. — Ivan A. Krestinin (talk) 16:52, 27 December 2022 (UTC)[reply]
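A small helper illustrates why the letter-case mismatches in the examples above are harmless for matching purposes: DOI names are case-insensitive by specification, so comparisons should fold case before testing equality (a sketch, not KrBot's code):

```python
def same_doi(a: str, b: str) -> bool:
    """DOI names are compared case-insensitively, so fold case
    (and trim stray whitespace) before testing equality."""
    return a.strip().casefold() == b.strip().casefold()

print(same_doi("10.1088/1538-3873/AAD5D9", "10.1088/1538-3873/aad5d9"))  # True
```

Whether a given resolver or publisher site actually honors case-insensitivity is a separate question, as the 404 examples above show.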

Regarding Wikidata Q3105247

‪You recently changed the SBN author ID‬ ‪(P396)‬: IT\ICCU\BVEV\090371 to BVEV090371, but now the authority control of en.Wikipedia says: "The ICCU id BVEV090371 is not valid". Why?--Ruotailfoglio (talk) 16:29, 20 February 2022 (UTC)[reply]

Hello, it is some kind of cache most probably. I do not see any marks at Q3105247#P396. Please try F5 and Ctrl+F5 in your browser. Maybe it is browser cache. Or wait for day or two. — Ivan A. Krestinin (talk) 16:33, 20 February 2022 (UTC)[reply]
Thank you! Ruotailfoglio (talk) 18:24, 21 February 2022 (UTC)[reply]

Reverting an Autofix?

Hello,

Last week a mistaken {{Autofix}} was added to platform (P400) (replacing personal computer (Q16338) with Microsoft Windows (Q1406)), and KrBot duly applied it to thousands of items. Is there a way to revert all these autofixes? See Property_talk:P400#Autofixes for details. Thanks! Jean-Fred (talk) 20:29, 21 February 2022 (UTC)[reply]

Did you review the source? Arlo Barnes (talk) 22:12, 24 February 2022 (UTC)[reply]

Very quickly. A skeleton is not a sex or gender; it looks like an inappropriate value for sex or gender (P21), please see the property constraints. — Ivan A. Krestinin (talk) 23:30, 24 February 2022 (UTC)[reply]

Code sample

Hi Ivan, hope you're having a great day :) Any hope of seeing the code that KrBot2 uses to update the constraint violation reports? I'd really appreciate it Joseph202 (talk) 17:17, 1 March 2022 (UTC)[reply]

Hi Joseph, the code has not been published. The bot is written in C++. The constraints report update task shares a lot of code with other wiki-related tasks (200+ tasks). Please write me an email; I can send some parts. Or ask your questions here if you need some implementation details rather than the code itself. — Ivan A. Krestinin (talk) 03:54, 2 March 2022 (UTC)[reply]
@Ivan A. Krestinin: Thank you for your reply, actually, I want to use the code on a third-party installation of wikibase, that's why I was asking Joseph202 (talk) 17:18, 6 May 2022 (UTC)[reply]
The bot works with dumps. Do you have plans to provide dumps in a compatible format? What amount of data is planned? We can think about connecting my bot to your project too. — Ivan A. Krestinin (talk) 18:47, 6 May 2022 (UTC)[reply]
@Ivan A. Krestinin: Currently, the only way we get/generate dumps is via our Special:DataDump, although there seems to be a way via the API that I haven't tried before.
But you can have a look if you wish. Joseph202 (talk) 20:40, 6 May 2022 (UTC)[reply]
@Ivan A. Krestinin: Hi, I trust you're having a great day. Per the above, we actually get dumps via the Special:EntityData special page, and we can export the same formats that Wikidata supports. How can we begin to work on this?
Hope to hear from you soon. Joseph202 (talk) 08:26, 12 May 2022 (UTC)[reply]
Hi Joseph, I took a look at the project. One problem is the differing identifiers. For example P31, P279, and Q21502404 are all hard-coded now, so some parametrization is needed. Another question is project size. The constraints system requires some effort to deploy and maintain. I am not sure that it is reasonable to use it on small projects. Maybe it is better to focus on increasing the data volume first. — Ivan A. Krestinin (talk) 19:15, 12 May 2022 (UTC)[reply]
Hello, thanks for replying.
Yes, the IDs are different; is it not possible to configure it to fit Gratisdata? And yes, there are over 3000 entities available currently and still counting; is that not considerable in terms of volume?
I'd love to hear from you Ivan, thank you! Joseph202 (talk) 18:37, 13 May 2022 (UTC)[reply]
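To make the hard-coded identifiers configurable, the bot's dump processing could take the core property IDs as parameters, roughly like this (an illustrative Python sketch; KrBot itself is C++, and the third-party wiki IDs shown are placeholders, not Gratisdata's real ones):

```python
import json
from dataclasses import dataclass

@dataclass
class WikibaseConfig:
    """Core IDs differ between Wikibase installations, so keep them
    configurable instead of hard-coding Wikidata's values."""
    instance_of: str = "P31"
    subclass_of: str = "P279"

def claim_item_values(entity, prop):
    """Extract item-id values of one property from a Wikibase JSON entity."""
    values = []
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"]["id"])
    return values

# A trimmed-down entity in the Wikibase JSON dump format
entity = json.loads("""{"id": "Q7", "claims": {"P1": [{"mainsnak":
    {"snaktype": "value", "datavalue": {"value": {"id": "Q2"}}}}]}}""")

cfg = WikibaseConfig(instance_of="P1")  # hypothetical third-party wiki ID
print(claim_item_values(entity, cfg.instance_of))  # ['Q2']
```

The constraint-definition items (like Q21502404 on Wikidata) would need the same treatment, since a third-party wiki will mint its own item IDs for them.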

Explain why constraints aren't applicable on P7883

Hi, I noticed you removed mandatory constraint (Q21502408) from multiple properties including Historical Marker Database ID (P7883). Could you clarify why these were removed? Given that they were removed without comment and I'm the one who added them, I'd like to know how to apply these more accurately in the future. Wolfgang8741 (talk) 19:25, 2 March 2022 (UTC)[reply]

Hello, the constraints have 100+ violations and do not look easy to fix. Could you add the flag after fixing most of the violations? Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations was made as a tool for quickly reverting vandalism or wrong edits. But the report is unmaintainable now; its size is too large. I am trying to improve the quality situation using different approaches. — Ivan A. Krestinin (talk) 19:42, 2 March 2022 (UTC)[reply]
Ah, thanks for the explanation. Prior to adding the type constraint, there was no means of ensuring consistent use across the IDs, which is why applying the type constraints generated a large report. I started a discussion to constrain and clean up the marker IDs at WikiProject_Cultural_heritage#Adding_item_Type_Constraints_for_Historic_Marker_Properties, ideally leading to a model for IDs related to markers. I'm still digging deeper into Wikidata's structure for nudging data consistency and preventing conflation of concepts. Does one of your approaches rely upon Wikidata:Database_reports/EntitySchema_directory? These constraints weren't meant to be left in place, but to prompt cleanup of the IDs. So what I'm hearing is that once the IDs are cleaned up, adding the mandatory constraint would be appropriate. Wolfgang8741 (talk) 20:32, 2 March 2022 (UTC)[reply]
I do not use Wikidata:Database_reports/EntitySchema_directory directly. My approaches use property constraint (P2302), {{Autofix}}, and many different custom bot tasks, for example automatic cleanup of duplicate values. Most tasks are focused on fixing popular types of mistakes. Adding the "mandatory" mark is good practice after completing work on a property. — Ivan A. Krestinin (talk) 22:07, 2 March 2022 (UTC)[reply]
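For reference, an {{Autofix}} rule combines a PCRE pattern with a replacement. The example below is hypothetical, loosely modeled on the SBN author ID normalization mentioned earlier on this page (IT\ICCU\BVEV\090371 → BVEV090371); it is not an actual rule from any property talk page:

```wikitext
{{Autofix|pattern=^IT\\ICCU\\([A-Z]{4})\\(\d{6})$|replacement=\1\2}}
```

The bot reads such rules from the property talk page and rewrites every matching statement value, so a wrong rule propagates instantly; removing the rule stops the edits.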

Yandex.Zen ID (P8816)

Good day. For a number of people (example; in total I personally know of at least 14 such people), the bot changes the value of the Yandex.Zen ID (P8816), removing the "id/" part, after which the Yandex.Zen page no longer opens. Formally the bot acts according to the defined URL mask https://zen.yandex.ru/$1, but in fact the working links in these articles stop working. As I understand it, when the identifier was being discussed, it was simply overlooked that, besides the main mask used for most people on Yandex.Zen, some also have one with "id/". Hence the question: how can the problem be solved? Could the bot be configured not to change the identifier on such articles, or what other options are possible? Thanks. --Uchastnik1 (talk) 11:12, 3 March 2022 (UTC)[reply]

  • As I understand it, the replacement started happening after this edit. --Uchastnik1 (talk) 11:40, 3 March 2022 (UTC)[reply]
  • I looked through the bot's contributions for the relevant period; the number of such articles/WD items has grown to about 20. It also turned out that this concerns not only people but other entities (subjects) as well, for example: Kion, Вокруг ТВ, Холодильник.ру. So one can hardly say that, in the spirit of the original conditions for creating the identifier, these entities were not meant to be covered by it (the difference here is purely technical, just that extra "id/", nothing more; the same applies to the people). --Uchastnik1 (talk) 14:55, 3 March 2022 (UTC)[reply]
    • Greetings, yes, you correctly found the edit that told the bot to strip the "id" prefix. If it is reverted, the bot will stop doing this. I will start reverting those changes a bit later. It would also be good to fix the description and properties on Property:P8816 so that they allow such a prefix. — Ivan A. Krestinin (talk) 16:48, 3 March 2022 (UTC)[reply]
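A format constraint that tolerates both URL shapes only needs a single alternation. The sketch below is a guess at the two ID shapes (a plain channel name or an "id/"-prefixed hexadecimal profile ID); the property's real regex may differ:

```python
import re

# Hypothetical format pattern for Yandex.Zen IDs: either a plain
# channel name, or an "id/"-prefixed hexadecimal profile id.
ZEN_ID = re.compile(r"^(?:id/[0-9a-f]+|[\w.]+)$")

for value in ("moya_planeta", "id/5a8981a79b403cc9e7856917"):
    print(value, bool(ZEN_ID.match(value)))  # both True
```

With such a pattern, the Autofix rule that strips "id/" becomes unnecessary, and the URL formatter can keep using https://zen.yandex.ru/$1 unchanged.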

Remove also non-existing files from references and qualifiers

Could your bot also remove non-existing files from references and qualifiers? I have found a few of them and fixed them manually (example), but it would be nice if this were done automatically. I have seen your bot make similar changes, so maybe this could be handled by it as well?

Similarly, the bot could change the filename of a Commons file used in a reference or a qualifier when the file is moved on Commons (example). Mitar (talk) 14:22, 11 March 2022 (UTC)[reply]

Hi, I am working on fixing qualifiers. But processing references will be a harder task. — Ivan A. Krestinin (talk) 13:55, 12 March 2022 (UTC)[reply]
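The extra difficulty with references is visible in the entity JSON itself: qualifiers hang directly off a claim as a property-to-snaks map, while references add one more nesting level (a list of reference blocks, each with its own snak map). A minimal sketch with a fabricated claim:

```python
def commons_files_in_claim(claim):
    """Collect Commons-media snak values from a claim's qualifiers
    and references; note the extra nesting level for references."""
    def media_values(snaks):
        return [s["datavalue"]["value"] for s in snaks
                if s.get("snaktype") == "value"
                and s.get("datatype") == "commonsMedia"]

    files = []
    for snaks in claim.get("qualifiers", {}).values():
        files += media_values(snaks)
    for ref in claim.get("references", []):   # one more level deep
        for snaks in ref.get("snaks", {}).values():
            files += media_values(snaks)
    return files

claim = {
    "qualifiers": {"P18": [{"snaktype": "value", "datatype": "commonsMedia",
                            "datavalue": {"value": "Old name.jpg"}}]},
    "references": [{"snaks": {"P143": [{"snaktype": "value",
                                        "datatype": "wikibase-item",
                                        "datavalue": {"value": {"id": "Q328"}}}]}}],
}
print(commons_files_in_claim(claim))  # ['Old name.jpg']
```

The found filenames would then be checked against Commons (and its move log) before any automated edit; that lookup step is omitted here.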

Check P2190 constraint config as it moves from string to numeric ID

Hi, could you double check if I migrated the property constraints correctly to numeric from string on C-SPAN person ID (P2190). This move was discussed on the property talk page and project chat. Wolfgang8741 (talk) 15:11, 14 March 2022 (UTC)[reply]

Hi, everything is fine. I just improved the property a bit. — Ivan A. Krestinin (talk) 20:49, 14 March 2022 (UTC)[reply]
Hi, thanks for that. Looking at the constraint report, the deprecated string values are flagged for format violations. Shouldn't a deprecated value be exempt from the currently accepted format checks as well as from the single-value constraint? Two reasons for retaining the deprecated IDs are
1. matching with archived versions of the data using the old ID affording checks of the data consistency over time or when the data was initially added
2. assisting in matching existing data to convert to the new identifier.
This is partially a technical question and partially a statement, as I noticed a few IDs had been removed instead of being deprecated per Help:Deprecation, even though the values were valid prior to the transition, just less reliable. They could still be matched against the Internet Archive or other archives. Wolfgang8741 (talk) 15:05, 19 March 2022 (UTC)[reply]
More ideas:
I fixed the constraints; now we have zero violations. But splitting the property is the more correct way. — Ivan A. Krestinin (talk) 22:00, 21 March 2022 (UTC)[reply]
Thank you, that would have been helpful guidance a while back. Where is splitting the property as the more correct way documented? This is an important process to have documented. I asked both on the property talk page and on Project chat for this guidance, and for nearly a month no one responded to my questions about the process to change the property format with certainty or a clear path. How is one supposed to find the "more correct way" documentation, or learn it? Should we go about splitting the property so the ID can be properly constrained for monitoring? Wolfgang8741 (talk) 16:37, 24 March 2022 (UTC)[reply]
I know too little about the Wikidata project documentation. Actually, different approaches were used for different properties in the past. I just highlighted the best approach from my point of view, and listed the reasons why it is the best. You are right, splitting allows improving the constraints. — Ivan A. Krestinin (talk) 19:39, 24 March 2022 (UTC)[reply]

Languages statistics on a lexeme property

Hello,

On Wikidata:Database reports/Constraint violations/P10338#Languages_statistics, it is stated that the property Dico en ligne Le Robert ID (P10338) is used 9 times. In fact, it is used much more than that. Do you know why the statistics seem incorrect? Maybe it is related to the fact that lexemes are in a separate dump (I don't know how your bot works, so it's just a blind guess)?

Cheers, — Envlh (talk) 21:48, 27 March 2022 (UTC)[reply]

Hello, this looks like a real bug. I will investigate it. Thank you. — Ivan A. Krestinin (talk) 23:00, 31 March 2022 (UTC)[reply]
FixedIvan A. Krestinin (talk) 20:37, 8 April 2022 (UTC)[reply]
Thank you! I confirm it is properly working since you fixed it :) Cheers, — Envlh (talk) 16:23, 25 April 2022 (UTC)[reply]

Must revert some KrBot changes

Hello!

There is a problem: two different entries, dispersed settlement (Q1372205) and dispersed settlement in Latvia (Q16352482), were merged. That was on March 26. On March 28 KrBot set claim values (see this change). There are many changes which must be reverted. Can these changes be cancelled with the bot? --Treisijs (talk) 12:19, 30 March 2022 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 23:01, 31 March 2022 (UTC)[reply]

Does KrBot still update P214 monthly ?

Hi, this article https://ejournals.bc.edu/index.php/ital/article/view/12959 states that KrBot "updates links in Wikidata items to redirected VIAF clusters and removes links to abandoned VIAF clusters." on a monthly basis. Is it still the case? I'm not sure, based on the statistics I got here: https://bambots.brucemyers.com/NavelGazer.php. Thanks!

Hello, the bot had some troubles and not all items were processed. I have fixed the issue and all items have now been processed. Everything should be fine now. Thank you! — Ivan A. Krestinin (talk) 20:18, 8 April 2022 (UTC)[reply]
Thanks for the info and for the great job! 193.52.26.94 06:49, 11 April 2022 (UTC)[reply]

Wikimedia import URL constraint violations

Edits like this one trigger this constraint violation. The constraint was added by user:Tacsipacsi (and last edited by user:Nikki). Not sure how to best solve it? You could use http://www.wikidata.org/entity/Q111645043#P10039 & http://www.wikidata.org/entity/P10039#P2302 (or even http://www.wikidata.org/entity/P10039#P10039$4d1d74b4-4a3f-cc5b-c760-d133e2ac8fd9)? Or we can just remove the whole constraint... Multichill (talk) 16:36, 18 April 2022 (UTC)[reply]

Looks like a valid use case (more or less: as an outsider, it may not be obvious at first what those URLs mean). I definitely don’t recommend working around the constraint by using a different URL—if a constraint is wrong, it should be fixed, not worked around. However, I still see a constraint here: using Wikimedia import URL (P4656) with a Wikidata URL is valid if, and only if, a inferred from (P3452) statement is also present (which it makes more precise). Unfortunately this cannot be expressed with the current constraint system, so this constraint either needs to be replaced by a complex constraint that handles this situation, or a new property needs to be created for this purpose (which could have—non-complex—constraints that require that it points to a Wikidata URL, and to require that it’s always used together with P3452). The former avoids creating yet another property, the latter lets us continue to use non-complex constraints, which provide feedback to the user in context, not only on some hidden constraint report pages. —Tacsipacsi (talk) 15:41, 19 April 2022 (UTC)[reply]

Reverting redirect resolutions after wrong merger

Hi Ivan, human sexual behavior (Q608) and Sex (Q98040930) were wrongly merged. Could you please revert the redirect resolutions? Thanks, Máté (talk) 05:45, 22 April 2022 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 16:58, 23 April 2022 (UTC)[reply]

Hello! Your bot does very nice things; however, it's making an understandable mistake. It is removing the statement has part(s) (P527) YouTube comment (Q110875209) on YouTube comment (Q110875209). I undid it with the reason "YouTube comments can have other YouTube comments as replies", but it did it again. AntisocialRyan (Talk) 14:56, 24 April 2022 (UTC)[reply]

Hello, usually a recursive link in has part(s) (P527) is just a mistake. That is the reason why the bot removes it. But your case is interesting. I think a reply is another comment; it is linked to the original comment by the specific relation "reply to". This relation is not the same as the "has part" relation. Let's take an analogy from another area to check this: "human has part human", because one human is the child of another human. That statement looks wrong. So I think the statement "YouTube comment has part YouTube comment" is wrong too. — Ivan A. Krestinin (talk) 16:12, 24 April 2022 (UTC)[reply]
Alright, I see where you're coming from actually. YouTube comments can have replies, but replies to YouTube comments can't have more replies. I will create a new item for this, thanks! AntisocialRyan (Talk) 16:59, 24 April 2022 (UTC)[reply]
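For what it's worth, the self-loop the bot removes in such cases is easy to detect in an entity's JSON; a sketch of the check (illustrative, not KrBot's actual logic):

```python
def has_self_link(entity, prop="P527"):
    """Return True if the item lists itself among its own values
    for the given property (a recursive "has part" link)."""
    own_id = entity["id"]
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim["mainsnak"]
        if snak.get("snaktype") == "value" and \
           snak["datavalue"]["value"]["id"] == own_id:
            return True
    return False

entity = {"id": "Q110875209", "claims": {"P527": [
    {"mainsnak": {"snaktype": "value",
                  "datavalue": {"value": {"id": "Q110875209"}}}}]}}
print(has_self_link(entity))  # True
```

As the thread concludes, the fix is modeling: a separate "reply" item (or a "reply to" relation) rather than a self-referential part-of statement.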

Monks

Hi Ivan! In these edits, couldn't your bot replace the value with monk (Q733786) instead of deleting it?

I understand why you remove the particular monastic order from the occupation property, but if "monk" is not written there instead, the item can be left without any occupation (see). Thanks Palotabarát (talk) 21:52, 26 April 2022 (UTC)[reply]

Hi @Palotabarát:! Thanks for raising the point. We are trying to establish a standard for data regarding members of Catholic religious orders at Wikidata_talk:WikiProject_Religions#Members_of_Catholic_religious_orders; since there wasn't any objection, I applied this change, but of course it is reversible and improvement can be made (e.g. adding new P106s in order to fill the gap). We can continue the discussion there. Good night, --Epìdosis 22:42, 26 April 2022 (UTC)[reply]

Laurence Olivier Award for Best xxxx

Hi, Ivan. The KrBot changed a number of entries I made yesterday for the series of Laurence Olivier Awards (Laurence Olivier Award for Best Actor in a Supporting Role in a Musical (Q19870586), Laurence Olivier Award for Best Actress in a Supporting Role in a Musical (Q19870588), Laurence Olivier Award for Outstanding Achievement in Music (Q16995976), Laurence Olivier Award for Best Actress (Q6500774), etc.), changing the country from "England" to "United Kingdom".

The Olivier Awards are not presented to theatre productions in the entire United Kingdom, made up of the individual countries of England, Scotland and Wales, along with Northern Ireland. These awards are only presented for theatre work in England (specifically London's West End theatre district), while the other countries of the United Kingdom have their own theatre awards.

When I chose "England", it even comes up with the phrase "constituent country of the United Kingdom", as it is, in fact, an actual country within the UK.

I tried using the statement for "applies to jurisdiction", set to England, but that makes my statement for London raise the flag "An entity with located in the administrative territorial entity should also have a statement country." I want the country to be England, as that is the country that matters, but the bot will just change it to United Kingdom.

I do not think that the KrBot needs to be changed, nor anything like that. Just need your help. Do you know a way that I can tell it "England" and "London", and not have any flags nor have a bot make a change? Thanks. Jmg38 (talk) 02:09, 28 April 2022 (UTC)[reply]

Hi again. I think I'll use "country (P17)" of United Kingdom, and "applies to jurisdiction (P1001)" of London. That captures everything I need, and avoids having to mention England at all while also avoiding having to fuss about England being a country, as the real important part is the "applies to jurisdiction (P1001)" of London. The KrBot was helpful in ways you may not have expected, as it forced me to think through what I was doing, which is never a bad thing! Thank you. Jmg38 (talk) 05:16, 28 April 2022 (UTC)[reply]
Hello! Could you review the previous England-related discussions on Property talk:P17? The bot just executes an {{Autofix}} rule from the property talk page. You may delete the rule and the bot will stop making such edits. — Ivan A. Krestinin (talk) 17:03, 28 April 2022 (UTC)[reply]

Q7686436 and Q105550321 merge and replace

Hello Ivan! @Howard61313: has incorrectly merged the occupation (military aviator (Q105550321)) and the category (Category:Military aviators (Q7686436)), and KrBot2 has changed the occupation of all affected persons (example). Can you undo that? (links) Thanks. Palotabarát (talk) 10:57, 1 May 2022 (UTC)[reply]

✓ DoneIvan A. Krestinin (talk) 18:46, 1 May 2022 (UTC)[reply]
Thank You! Palotabarát (talk) 07:05, 2 May 2022 (UTC)[reply]

Single value constraints for Dutch municipalities

Dear Ivan A. Krestinin,

In your report, Wikidata:Database reports/Constraint violations/P382#"Single value" violations, I notice that it probably searches for items with more than one claim of an identifier (which should be unique by definition). A Dutch CBS municipality code can become obsolete (then it should have an end date), be changed into a new code (and then there are doubles), or be a whole new code when a new municipality is created. When we look, for instance, at Etten-Leur Q9833#P382, we see that two CBS codes are claimed, which is correct: the old one has an end date and the current one is the actual one. Recently I also updated its rank in an attempt to work with it from a script.

Would it be possible, in your database report, to include a check on the end date and perhaps a double check on the rank, and only report when more than one code is claimed without an end date or with equal rank?

Furthermore, if you are interested, perhaps the check on end dates could be automated. I recently tried something with the CBS API and found a table (70739ned) with the dates registered. See a script for population data and a script for surface areas where the table is used for mapping. Please forgive my level of programming; I gave it my best try. If you have any questions, want to work together or have remarks for me, please let me know. (My availability varies, but I intend to respond asap.)

Best regards, Démarche Modi (talk) 18:02, 3 May 2022 (UTC)[reply]

And see https://nl.wikipedia.org/wiki/Gebruiker:D%C3%A9marche_Modi/Kladblok/python/cbs_codes for the whole table I mentioned. Démarche Modi (talk) 18:54, 3 May 2022 (UTC)[reply]
If more values are possible over time, the property should have single-best-value constraint (Q52060874) instead of single-value constraint (Q19474404). And the item statement should have normal rank instead of deprecated, as deprecated rank is not for once-correct information. —Tacsipacsi (talk) 01:03, 4 May 2022 (UTC)[reply]
Specifying separator (P4155) for the constraint fixes the issue. See [17]. Of course, start time (P580) and end time (P582) need to be specified in the items. — Ivan A. Krestinin (talk) 21:45, 4 May 2022 (UTC)[reply]
All done (for the ones reported with violations), statements have normal rank again and are provided with a start and end date. For the remaining bulk (without warnings in the report) I will consider a bot request. Démarche Modi (talk) 14:55, 6 May 2022 (UTC)[reply]
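The separator-aware check discussed above boils down to: count only the values that lack an end time (P582) qualifier, and flag a violation only when more than one value is still current. A sketch under that assumption (the CBS codes below are made up):

```python
def single_value_violation(claims, end_time="P582"):
    """Flag a violation only when more than one value is current,
    i.e. lacks an end time qualifier."""
    current = [c for c in claims if end_time not in c.get("qualifiers", {})]
    return len(current) > 1

# Hypothetical municipality with one obsolete and one current CBS code
municipality = [
    {"value": "0123", "qualifiers": {"P582": "1996-12-31"}},  # ended
    {"value": "0456", "qualifiers": {}},                       # current
]
print(single_value_violation(municipality))  # False
```

A rank-based variant would instead require exactly one preferred-rank value among the non-deprecated claims; both policies can coexist in the same report.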

what right does your robot have to deny reality?

https://www.wikidata.org/w/index.php?title=Q28105001&diff=prev&oldid=1634524482

?????????????

A.BourgeoisP (talk) 14:53, 8 May 2022 (UTC)[reply]

The statement is already on "significant event"; the bot would've added it if it wasn't already there. Whether that is the right property is beyond me. AntisocialRyan (Talk) 15:32, 8 May 2022 (UTC)[reply]
https://www.wikidata.org/w/index.php?title=Q28105001&action=history
A.BourgeoisP (talk) 12:11, 14 May 2022 (UTC)[reply]
@A.BourgeoisP: position held (P39) shouldn't be used for oldest human (Q254917) (it's for formal positions), use significant event (P793) instead. Reverting the bot over and over again is as useful as banging your head against a wall. Multichill (talk) 12:42, 14 May 2022 (UTC)[reply]
Why are oldest person in France (Q107344155) and list of European supercentenarians (Q1637694) in P39 while oldest human (Q254917) is in P793? What justifies separating them? Read the infobox of the French version of the article... we see oldest person in France (Q107344155) and list of European supercentenarians (Q1637694) but not oldest human (Q254917)! Why? That is ridiculous and stupid! Please solve this problem. Wikipedia.fr should not be vandalized by the action of a robot on Wikidata! A.BourgeoisP (talk) 19:03, 14 May 2022 (UTC)[reply]
Helloooo!? A.BourgeoisP (talk) 19:26, 17 May 2022 (UTC)[reply]
Hello, bot makes such changes because page Property talk:P39 has template {{Autofix|pattern=Q254917|replacement=Q254917|move_to=P793}}. Just delete the template and bot stops this activity. But I recommend to discuss the issue on Property talk:P39 or on Wikidata:Project chat first. — Ivan A. Krestinin (talk) 20:41, 17 May 2022 (UTC)[reply]

compositions ≠ musical works

en:Category:2020 compositions (Q97275139) doesn't contain en:Category:2020 albums. be:Катэгорыя:Музычныя_творы_2020_года, ru:Категория:Музыкальные_произведения_2020_года and zh:Category:2020年音樂作品 (Q111684969) do contain en:Category:2020 albums. Compositions ≠ musical works; musical works contain compositions. -- 15:32, 15 May 2022 (UTC)[reply]

Q9059213 ≠ Q5626704; Q9059213 contains Q5626704. -- 15:36, 15 May 2022 (UTC)[reply]
  • I merged the items because "ru:Категория:Музыкальные произведения XXXX года" is the same as "en:Category:XXXX compositions". I do not see a difference between "composition" and "musical work"; they look like full synonyms after translation into my native language. Could you fix the issue? Fixing this looks like too hard a task for me. You may use different from (P1889) to prevent wrong merges in the future. — Ivan A. Krestinin (talk) 15:51, 15 May 2022 (UTC)[reply]

"Determination method" error

Hello! determination method or standard (P459) was added as a main statement to just setting up my twttr (Q64790997). Instead, it should be added as a qualifier to number of likes (P10649), number of comments (P10651), number of dislikes (P10650), and number of reblogs (P10756).

https://www.wikidata.org/w/index.php?title=Q64790997&diff=1640345856&oldid=1639539094

This would actually be super helpful because it is annoying to add! AntisocialRyan (Talk) 23:52, 16 May 2022 (UTC)[reply]

Hello! The bot will stop such edits after this change. Could you discuss this edit with User:Trade? — Ivan A. Krestinin (talk) 20:56, 17 May 2022 (UTC)[reply]
Hello! That isn't the problem, the problem is that it was added as a main statement to the item and not a qualifier of one of the properties.
See its revision here, which I have since reversed: https://www.wikidata.org/w/index.php?title=Q64790997&oldid=1640622098
It did it to a number of items. AntisocialRyan (Talk) 21:10, 17 May 2022 (UTC)[reply]

Wrongly adding the country "France"

https://www.wikidata.org/w/index.php?title=Q1713379&type=revision&diff=1648001537&oldid=1647679444 seems wrong to me.--Danny Diggins (talk) 12:15, 27 May 2022 (UTC)[reply]

@Danny Diggins: as you can see in the edit, this is based on Mérimée ID (P380) which was incorrectly added by User:Ayack. Undid the edits. Multichill (talk) 17:00, 27 May 2022 (UTC)[reply]

Constraint violations P3503

Hello, your KrBot2 is not detecting any violations at Wikidata:Database reports/Constraint violations/P3503. Even though new items with this property are being added and some violations are being fixed, the bot is not updating the report. Do you know what is wrong with it? Elemar (WMIT) (talk) 09:32, 30 May 2022 (UTC)[reply]

Hi, the bot uses incremental dumps as its data source. They are generated daily, but the data in them is delayed. Anyway, everything looks fine now. — Ivan A. Krestinin (talk) 20:23, 10 June 2022 (UTC)[reply]

FandangoNow

Hi. Apparently, deriving the statement distributed by (P750) = FandangoNow (Q80948336) from the presence of a FandangoNow ID (P7970) is not correct. For example, Bugsy (Q241085) has a correct identifier, yet it does not look like it was available for viewing either on FandangoNow before or on Vudu now. And judging by what people write in the discussion at ru:Википедия:Форум/Общий#Викиданные, there are a great many such cases. It is probably best to remove all additions of FandangoNow as a distributor that were made based on the presence of this property. —putnik 17:14, 30 May 2022 (UTC)[reply]

See also ru:Обсуждение участника:KrBot#FandangoNow. colt_browning (talk) 11:02, 31 May 2022 (UTC)[reply]
Hello, I have rolled back all the changes. — Ivan A. Krestinin (talk) 12:46, 11 June 2022 (UTC)[reply]
Excuse me, all of which? In Beetlejuice (Q320384), for example, the statement is still there. (Though that statement was also edited by another user.) colt_browning (talk) 10:56, 12 June 2022 (UTC)[reply]
Naturally the bot did not touch statements that people had edited; it would not be quite right for a bot to wipe out users' contributions. Only 10 values remain now; could you look them over and delete them if necessary? — Ivan A. Krestinin (talk) 19:20, 16 July 2022 (UTC)[reply]

Please do not remove deprecated statements

They are present for a reason: when the source database contains incorrect information, it prevents the bad data from being mistakenly reimported to Wikidata. I had to revert several edits you made removing deprecated NTIS accession number (P7791) statements on 27 February 2022. John P. Sadowski (NIOSH) (talk) 19:02, 4 June 2022 (UTC)[reply]

Hello John, I think data validation during import is a better way; it would make our project clearer for users. Collecting all the mistakes in the world makes nobody happy. But that is a long discussion... — Ivan A. Krestinin (talk) 19:04, 16 July 2022 (UTC)[reply]

Autofix for qualifiers

Hi! In Property talk:P1013 I added two {{Autofix}} six months ago to fix some thousands of occurrences in qualifiers (see https://w.wiki/4tkM and https://w.wiki/4tkU). Is it normal, I am missing something or there is a problem with the bot? Thanks very much in advance, --Epìdosis 20:01, 4 June 2022 (UTC)[reply]

Hello, {{Autofix}} with the move_to parameter does not process qualifiers; it is unclear how that case should be processed. But your requests report only two items now. It looks like the cases have already been fixed. — Ivan A. Krestinin (talk) 20:51, 10 June 2022 (UTC)[reply]
Thanks for the reply. These cases should be processed simply in this way, and many thousands are still to be fixed (this is one of the two cases). Maybe I should write {{Autofix}} differently, perhaps only {{Autofix|pattern=Q1376230|move_to=P3831}} instead of {{Autofix|pattern=Q1376230|replacement=Q1376230|move_to=P3831}}? Thanks, --Epìdosis 11:11, 12 June 2022 (UTC)[reply]
Moving qualifiers is unfortunately not supported by {{Autofix}}. Could you create a request on Wikidata:Bot requests? — Ivan A. Krestinin (talk) 19:24, 16 July 2022 (UTC)[reply]
Thanks, request created. And I specified that qualifiers moving is not supported in the documentation. --Epìdosis 07:48, 17 July 2022 (UTC)[reply]

Please don't merge academic theses with scientific publications

Hi there, I've found three instances where you have merged a dissertation with a publication, they are not the same thing and it makes a real mess trying to untangle them again. Please don't merge any thesis (especially those that are a part of the NZThesisProject) with publications, even if they have the same title. Thank you.

Here is one of them - you merged this doctoral thesis https://www.wikidata.org/wiki/Q111965723 (one author, held at the university it was submitted to) into this edited book with three authors, published by Taylor and Francis: https://www.wikidata.org/wiki/Q57409615 Obviously these are not the same thing, but if you could find a way for your bot to avoid merging entirely different entities like this it would be helpful. I have had to go back and fix the references from the author items that you also changed, which broke constraints because the "academic thesis" statement was pointing at a publication rather than a thesis. People often publish a paper with the same title as their thesis; you need to be able to distinguish them. DrThneed (talk) 23:21, 8 June 2022 (UTC)[reply]
It would be helpful, Ivan, if you could outline for us the exact criteria by which you judged a P31=scholarly article to be the same as a P31=doctoral thesis - in the case of Psychological Aspects of Inflammatory Bowel Disease (Q111965723). From the outside it seems inexplicable that these would be merged merely by a label-string match. --Tagishsimon (talk) 00:35, 9 June 2022 (UTC)[reply]
The bot merged the items because they have the same title, the same publication year and a similar type, and no conflicting properties were detected. I have now changed the bot's algorithm; such items will not be merged. Also, different from (P1889) will protect the items. — Ivan A. Krestinin (talk) 21:14, 10 June 2022 (UTC)[reply]
Thank you, Ivan. The P31 values on the merged items were not "similar type"; I hope that that is the area of your algorithm which has been improved. You are right that different from (P1889) will protect the items, but users should not have to take extra measures to protect their items from bots, in a situation in which there are clear distinctions - P31; number of author statements &c - between the items the bot erroneously merged. --Tagishsimon (talk) 21:47, 10 June 2022 (UTC)[reply]
Thank you Ivan. There were plenty of conflicting properties between the items - different numbers of authors, publisher, DOI, etc. A different publisher should be all that it takes not to merge items, as a single author could legitimately publish the same titled article in different venues in the same year, and those items should not be merged. So it would be reassuring to know that the bot algorithm will not touch items like that? Further it is not possible to protect every thesis item against merging with false duplicates that have not even been added to Wikidata yet, we just need your bot to be more discriminating. Thank you, DrThneed (talk) 22:04, 10 June 2022 (UTC)[reply]
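For readers wondering what "same title, same publication year and similar type, with no conflicting properties" amounts to, here is a hypothetical sketch of such a merge heuristic. All names and data shapes are invented for illustration; the real bot compares Wikidata statements, and its notion of "similar type" is looser than the strict equality used here:

```python
# Illustrative duplicate-detection heuristic, not KrBot's actual algorithm.

def may_merge(item_a, item_b, compared=("title", "year", "type"),
              conflicting=("authors", "publisher", "doi")):
    # All compared fields must be present and equal.
    for key in compared:
        if item_a.get(key) is None or item_a.get(key) != item_b.get(key):
            return False
    # Any pair of differing values in a "conflicting" field blocks the merge;
    # this is the check that should separate a thesis from a same-titled book.
    for key in conflicting:
        a, b = item_a.get(key), item_b.get(key)
        if a is not None and b is not None and a != b:
            return False
    return True

thesis = {"title": "Psychological Aspects of Inflammatory Bowel Disease",
          "year": 1990, "type": "doctoral thesis", "authors": 1}
book = {"title": "Psychological Aspects of Inflammatory Bowel Disease",
        "year": 1990, "type": "scholarly article", "authors": 3}
print(may_merge(thesis, book))  # → False
```

Under this stricter check the thesis and the edited book are kept apart twice over: the P31 types differ, and the author counts conflict.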

multi-value constraint and constraint scope:

Hi Ivan, looks like this edit made your bot unhappy. I didn't know it either, but according to Help:Property constraints portal/Multi value, constraint scope (P4680) is a valid qualifier. Can you update your bot to handle this? Thanks, Multichill (talk) 16:57, 20 June 2022 (UTC)[reply]

Hello, that is a mistake, or incomplete documentation. Single- and multi-value constraints are checked for main values only, so the qualifier is redundant. — Ivan A. Krestinin (talk) 18:46, 16 July 2022 (UTC)[reply]
The documentation describes how the current constraint software behaves. You can see at Q46316#Q46316$FBABC74E-0209-4129-96E4-6931F395B8E6 and Q134942#Q134942$2e7aa178-4ae3-0c0e-4912-d98d2130f83f that the constraint is currently triggered. If you add the constraint scope, it will no longer be triggered. Can you update the bot to do the same?
@Lucas Werkmeister (WMDE), Wostr: ^^ Multichill (talk) 08:28, 17 July 2022 (UTC)[reply]
There are multiple implementations of the constraints system. Generally their behavior is the same, but some details may differ, and the documentation does not describe the differences. Let's assume that the multi-value constraint is supported for qualifiers. Q46316 has 3 values of depicts (P180), so it is unclear whether the multi-value constraint should be triggered or not: 3 > 1, so the current behavior of the integrated Wikidata implementation looks like a bug. This is the reason why most constraints were supported for main values only. Another question is why P180 should have multiple values. For example, in Q1170315#P180, what value should be added? — Ivan A. Krestinin (talk) 10:46, 17 July 2022 (UTC)[reply]

It looks like this happened again with these edits breaking Wikidata:Database reports/Constraint violations/P1651. I have no experience with constraints, and it appears to be a complicated topic, so I'd rather have someone more experienced look at this than trying to understand it myself. @Middle river exports: ^ –JustAnotherArchivist (talk) 05:07, 29 August 2022 (UTC)[reply]

This does become a problem for editing lexemes because there are a number of properties which generally are not necessary for lexemes/senses/forms but which can make sense on occasion in a reference on one of these types. (The reason I made this edit is because especially for spoken language / pronunciation, video references can be helpful. Also most news media in Pakistan is in video format rather than articles, so citing these makes sense on lexemes for Pakistani languages.) Middle river exports (talk) 05:18, 29 August 2022 (UTC)[reply]

Constraint violations P373

Hi, Wikidata:Database reports/Constraint violations/P373 has not been updated since 4 July and gives the somewhat cryptic error message "too many files for existance check". Any details or a workaround to fix it? Jklamo (talk) 11:16, 4 July 2022 (UTC)[reply]

Hello, I am investigating the issue. — Ivan A. Krestinin (talk) 18:48, 16 July 2022 (UTC)[reply]
Fixed. — Ivan A. Krestinin (talk) 18:58, 17 July 2022 (UTC)[reply]

Please leave "invalid" DOIs on the item and flag them with a qualifying statement instead

I had input a list of DOIs into Magnus Manske's SourceMD tool to create items for, and unfortunately it appears to have created a couple of items for articles whose titles it could not find. In these cases the DOI link did not lead to the document, but the DOI was not incorrect. The problem with removing the identifier is that there is then no way to correct the issue, and an item is left blank without any metadata. The "incorrect" DOI link leads to a form to contact the DOI database maintainers to fix the error, which is a useful link to have.

Here is an example of what I mean: A Basic Parts of Speech (POS) Tagset for morphological, syntactic and lexical annotations of Saraiki language (Q113190216) was added with a DOI and nothing else. The article can be found here, and the DOI for it is at the top of the page, and consistent with the format for other articles from this journal: https://scholar.archive.org/work/p7avuxo46nagdlollxvnf5xone/access/wayback/http://journal.buitms.edu.pk/j/index.php/bj/article/download/459/281

The fact that the DOI didn't work is an error on either the DOI maintainers' or the publishers' part, so I have contacted them to fix this. However, the DOI will likely be retained whenever it is fixed so it does not make sense to remove, especially when there was no other way to tell what article the item was for. --Middle river exports (talk) 21:02, 24 July 2022 (UTC)[reply]

Hello, it looks like the identifier was really wrong. I have specified the correct DOI value in the item. — Ivan A. Krestinin (talk) 21:27, 24 July 2022 (UTC)[reply]
Ah, thank you very much. How did you find the correct one? Is there a way for the bot to tell this? Middle river exports (talk) 21:28, 24 July 2022 (UTC)[reply]
I found it using Google. This exact case looks hard to automate; maybe some other cases will be easier. — Ivan A. Krestinin (talk) 21:34, 24 July 2022 (UTC)[reply]

P968 no longer automatically fixed

Format issues for email address (P968) for which an autofix rule exists are no longer fixed by your bot. As far as I can tell all regexes are valid. Can you check what's causing this? Mbch331 (talk) 13:39, 25 July 2022 (UTC)[reply]

Hello, the bot has ~650000 items to fix now, and this needs time. Mostly it is caused by this issue. I will disable DOI reference fixing temporarily to give the other autofixes a chance. — Ivan A. Krestinin (talk) 17:31, 25 July 2022 (UTC)[reply]

incorrect

https://www.wikidata.org/w/index.php?title=Q49133&type=revision&diff=1278140635&oldid=1277973898&diffmode=source Oursana (talk) 10:33, 6 August 2022 (UTC)[reply]

Hi Ivan. Regarding your edit here, while I partially agree, I've been thinking about the utility of having both instance of and subclass of being requirements for rockets. All rockets should realistically be instances of either rocket family (Q109542585), rocket series (Q111722634), or rocket model (Q110055303), and I've been filling these in as I find them, but the majority of rocket items are still missing it. Having instance of be a requirement would serve as a useful indicator that the field is missing. Just a thought. Huntster (t @ c) 16:57, 7 August 2022 (UTC)[reply]

Hi, you may add a second value-type constraint (Q21510865) to space launch vehicle (P375). For me, instance of (P31) is not as useful as the rocket-families hierarchy, but you may create and use a parallel classification. space launch vehicle (P375) has one more issue: usually this property is used to specify the type of vehicle, but sometimes it is used to specify an exact instance of a vehicle, for example: Ulysses (Q156081). — Ivan A. Krestinin (talk) 21:42, 8 August 2022 (UTC)[reply]

When does this bot stop?

When does this bot stop deleting correct information? Roelof Hendrickx (talk) 21:45, 8 August 2022 (UTC)[reply]

Hello, the bot executes many different tasks. Could you provide a link to a specific edit? — Ivan A. Krestinin (talk) 22:12, 8 August 2022 (UTC)[reply]
See: https://www.wikidata.org/w/index.php?title=Q29400988&action=history&curid=31048137 which is the latest deletion of correct information. Roelof Hendrickx (talk) 08:09, 9 August 2022 (UTC)[reply]
@Roelof Hendrickx, is this what you are looking for? Michgrig (talk) 17:43, 9 August 2022 (UTC)[reply]
That should have been the end result yes. Moving instead of deleting. Roelof Hendrickx (talk) 22:04, 9 August 2022 (UTC)[reply]
The bot executes a move in two steps: 1. adding the new value; 2. deleting the original value. If the new value already exists, the bot just removes the original value. — Ivan A. Krestinin (talk) 22:25, 9 August 2022 (UTC)[reply]
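The two-step move described above can be modelled in a few lines. This is an illustrative sketch with invented data structures (plain dicts of property → value lists), not the bot's real code:

```python
# Sketch of a two-step property move: add under the target, then delete
# from the source. When the target already holds the value, step 1 is a
# no-op, so the visible edit looks like a plain removal.

def move_value(item, source_prop, target_prop, value):
    item.setdefault(target_prop, [])
    if value not in item[target_prop]:
        item[target_prop].append(value)      # step 1: add the new value
    if value in item.get(source_prop, []):
        item[source_prop].remove(value)      # step 2: delete the original
    return item

item = {"P39": ["Q254917"], "P793": ["Q254917"]}  # value already on target
print(move_value(item, "P39", "P793", "Q254917"))
# → {'P39': [], 'P793': ['Q254917']}
```

This reproduces the situation Roelof Hendrickx observed: the value already existed under the target property, so only the deletion was visible in the history.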

Good afternoon! It looks like something has broken, see the history. Машъал (talk) 19:28, 19 August 2022 (UTC)[reply]

Single best value constraint

Hi Ivan, some time ago single-best-value constraint (Q52060874) was introduced as a refinement of single-value constraint (Q19474404) so that cases like this one don't trigger a constraint. Can you update your bot to report single best value instead of single value violations? The query is only slightly different. I just noticed by the way that the single value constraint query (example) filters out statements with deprecated ranks. Maybe you can do the same? Thanks, Multichill (talk) 10:17, 21 August 2022 (UTC)[reply]

+1 to this. This report's single-value violations report will grow to thousands of invalid violations over time. —seav (talk) 11:04, 18 December 2022 (UTC)[reply]
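For clarity, the reporting rule being requested (ignore deprecated statements, and treat exactly one preferred-rank value as the "single best" value) can be sketched compactly. This is an illustrative model of the semantics, not the actual constraint-checking code or query:

```python
# Sketch of single-best-value checking over (value, rank) pairs,
# where rank is one of "normal", "preferred", "deprecated".

def single_best_value_violation(statements):
    """True if the statements violate single-best-value semantics."""
    live = [s for s in statements if s[1] != "deprecated"]
    if len(live) <= 1:
        return False                      # at most one live value: fine
    preferred = [s for s in live if s[1] == "preferred"]
    return len(preferred) != 1            # exactly one best value satisfies it

print(single_best_value_violation([("A", "normal"), ("B", "deprecated")]))  # → False
print(single_best_value_violation([("A", "preferred"), ("B", "normal")]))   # → False
print(single_best_value_violation([("A", "normal"), ("B", "normal")]))      # → True
```

The first two cases are exactly those the bot currently flags as single-value violations even though editors have already resolved them via ranks.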

Q283299 and Q21491395 merge and replace

Hello Ivan! @Terot: has incorrectly merged the family (Bethlen family (Q283299)) and the family name (Bethlen (Q21491395)), and KrBot2 has changed the family name on all the affected persons' items (example). Can you undo that? Pallor (talk) 15:15, 21 August 2022 (UTC)[reply]

Poland for GREL correct?

Hi Ivan,

is this change correct?-- Négercsókk (talk) 12:00, 28 August 2022 (UTC)[reply]

English labels for newly created items for Russian railway stations

Ivan, greetings!
Since November of last year I have been creating and filling in Wikidata items for railway stations and stopping points in Russia. When creating them, I specify the name and label in Russian. At some point Edoderoobot was adding English-language descriptions to the items I created, but then its operator decided to stop participating in Wikidata and, accordingly, the bot stopped performing this task.
May I ask you to pick up this task? And perhaps also add a name transliterated from Russian right away. Michgrig (talk) 19:40, 28 August 2022 (UTC)[reply]

Good evening, could you share an example of that bot's edits, or a more precise description of the task? If transliteration is needed, then also some documentation on how to do it correctly. In any case, I cannot promise to do this quickly. — Ivan A. Krestinin (talk) 19:37, 31 August 2022 (UTC)[reply]
Ivan, thank you for the reply!
Here is my original request for this task. As a bare minimum, it could be limited to Russia.
The only exception to what is described there: at the moment (at least for Russian stations) instance of (P31) = railway stop (Q55678) and instance of (P31) = passing loop (Q784159) are not specified; everywhere it is only railway station (Q55488).
I cannot find an example of the bot's edits; I looked through the last 30 or so pages and did not see one.
Regarding transliteration: there is code with the rules written down here.
But after I wrote to you, I thought that I could approach another user, User:Mr. Ibrahem, who has a bot that adds descriptions in Arabic. Perhaps it will be easier for him to extend his scripts. I will write to him now, and if he declines, I will come back to you. OK? Michgrig (talk) 09:35, 1 September 2022 (UTC)[reply]

Hi, are you aware that the Mac App Store and the App Store share the same identifier, App Store app ID (P3861)? There are currently no Macs with a touchscreen, so adding input device (P479) -> touchscreen (Q165970) based on this identifier is not recommended and should be fixed/reverted. If the application is for multiple platforms, like tvOS, macOS, etc., then input device (P479) should also have a qualifier for the applicable platform. Premeditated (talk) 10:37, 7 September 2022 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 23:06, 25 September 2022 (UTC)[reply]

Note that adding app-related properties like input device (P479) or distributed by (P750) to the items about companies is not appropriate and also violates the relevant type constraints. Please revert these edits. Thanks. Jklamo (talk) 09:28, 9 September 2022 (UTC)[reply]

Hello, could you give links to some wrong edits? — Ivan A. Krestinin (talk) 08:58, 25 September 2022 (UTC)[reply]

No updates for Wikidata:Database reports/Constraint violations/Summary

Hi, could you check why there are no updates for Wikidata:Database reports/Constraint violations/Summary since 2 May 2022? Thank you in advance! Lockal (talk) 15:51, 9 September 2022 (UTC)[reply]

Hello, the size of this page exceeded the 2 MB limit. I made some size optimizations, so now it should be fine, but we will face the same issue again in the future. — Ivan A. Krestinin (talk) 08:56, 25 September 2022 (UTC)[reply]
What about splitting the page, e.g. into three parts P1..P5000, P5001..P10000 and P10001+? And as the number of properties grows, just add more baskets. —Tacsipacsi (talk) 00:34, 26 September 2022 (UTC)[reply]
Unfortunately, column sorting will not work in that case. — Ivan A. Krestinin (talk) 19:33, 26 November 2022 (UTC)[reply]
True, but a list which can be sorted only in parts is still better than a page that’s not updated at all, isn’t it? By the way, what does this 2 MB limit refer to? Page source size or post-expand size? Because if it’s the former, a Lua module (which stores data on several subpages and then combines them to one wiki table) could work this limit around without any loss to functionality on user side. —Tacsipacsi (talk) 17:19, 27 November 2022 (UTC)[reply]

Thanks for removing my duplicated ADS entries.

I don't know how I'm making these erroneous duplications. Thanks for fixing them with your bot. Trilotat (talk) 00:47, 20 September 2022 (UTC)[reply]

You are welcome. It is a very common type of mistake; the bot has already fixed 2590657 mistakes of this type. — Ivan A. Krestinin (talk) 08:41, 25 September 2022 (UTC)[reply]

Removing brackets from article names

Hi. Why did you make these changes (e.g. [18])? The brackets properly indicate that the name indicated is a translation and not the original name. Nadav S (Talk) 05:14, 7 October 2022 (UTC)[reply]

Deprecated rank not taken into account for P648 "single value" report?

It looks like this report was updated yesterday, but it still contains an entry for Q107000 that I fixed back in July https://www.wikidata.org/w/index.php?title=Q107000&action=history

Any idea why? The report is great, but it's much less useful if things don't get removed from it when they are fixed. Tfmorris1 (talk) 23:33, 23 October 2022 (UTC)[reply]

Hello, it is a very old limitation of the bot: it takes into account and validates all values, including deprecated ones. The only way to avoid this is adding the item to an exclusion list. — Ivan A. Krestinin (talk) 19:10, 26 November 2022 (UTC)[reply]

Hello! Most of the violations on this page are actually deprecated values. Would it be possible for the bot to exclude values marked deprecated? The same applies to the single value violations. Thanks — Martin (MSGJ · talk) 18:03, 27 February 2023 (UTC)[reply]

I have merged my thread into Tfmorris1's as I notice this has been brought up before. Please can you change your bot's code so deprecated statements are ignored? — Martin (MSGJ · talk) 18:14, 27 February 2023 (UTC)[reply]
And reading this talk page, this issue has been brought up many many times. Are we any closer to making progress on this issue? — Martin (MSGJ · talk) 18:17, 27 February 2023 (UTC)[reply]

A similar concern re ROR values is discussed at https://github.com/ror-community/ror-updates/issues/9823. So instead of the constraint violations page, I advised them to use this query: https://w.wiki/9J5g --Vladimir Alexiev (talk) 15:29, 26 February 2024 (UTC)[reply]

Stop that nonsense, please!

Hi, congratulations: your KrBot has won the award for Netflix Employee of the Year! It puts Netflix into just about every movie-related data object as the only distributor, and it seems to know only one distribution channel at all: video-on-demand. None of this can be true. "Artificial intelligence" like that makes me angry. To be clear, Wikidata is not an advertising platform for Netflix spam, so please stop that nonsense! J.-H. Janßen (talk) 19:32, 24 October 2022 (UTC)[reply]

P1749

Hello Ivan A. Krestinin, since Gymnicus did the edit https://www.wikidata.org/w/index.php?title=Property:P1749&diff=1735245748&oldid=1735244422 there is a problem with reporting constraint violations of P1749 (see https://www.wikidata.org/w/index.php?title=Wikidata:Database_reports/Constraint_violations/P1749&diff=1735576132&oldid=1735206149 ). I have no idea what is wrong there. I contacted Gymnicus on his/her talk page, but Gymnicus claims the real problem is your bot or so (see https://www.wikidata.org/w/index.php?title=Topic:X5snndri9flk48xu&topic_showPostId=x5sqhmolo0qw9osr&fromnotif=1#flow-post-x5sqhmolo0qw9osr ). Please have a look at this issue. - Robotje (talk) 16:26, 27 October 2022 (UTC)[reply]

With this modification the problem was solved so the constraint violation report is working again for P1749. - Robotje (talk) 06:24, 8 November 2022 (UTC)[reply]
Hello, glad to hear that the issue was solved. Apologies for the long response time. — Ivan A. Krestinin (talk) 18:25, 26 November 2022 (UTC)[reply]

Wait longer before resolving redirects?

Sometimes a merge is incorrect and is undone, but KrBot has already resolved the redirects generated by the merge (example: [19]) leaving them as links to the wrong item. I would suggest waiting at least a week before resolving redirects. Thanks! Swpb (talk) 18:08, 8 November 2022 (UTC)[reply]

Please undo the redirect of Q19961860 to Q1001344

They were merged in error. Thank you. William Graham (talk) 00:18, 19 November 2022 (UTC)[reply]

A mistaken bot fix

Your bot mistakenly replaced a correct personal-name item with a disambiguation name. See here. DGtal (talk) 11:24, 20 November 2022 (UTC)[reply]

Is CrossRef bullshit?

At Once-daily dolutegravir-based antiretroviral therapy in infants and children living with HIV from age 4 weeks: results from the below 14 kg cohort in the randomised ODYSSEY trial (Q114119963) there is the same author 2 times.-- Jewmensch (talk) 20:02, 2 December 2022 (UTC)[reply]

Hello, are you asking about "Avy Violari"? It would be better to discuss this with the owner of User:Research Bot, who originally created the item with the duplicate names, or with User:Daniel Mietchen, who replaced the string with the item Avy Violari (Q91877369). — Ivan A. Krestinin (talk) 06:28, 3 December 2022 (UTC)[reply]

Bad merge Q43874039_Q22969563

Hi. Could you please undo this. Cultural heritage designations of different countries were erroneously mixed with this merge. 2001:7D0:81FD:BC80:31CA:4A24:4144:E64B 11:03, 3 December 2022 (UTC)[reply]

Label language

Any chance of support for counting the number of violations of label in language constraint (Q108139345) on the constraint reports? I'm changing "complex constraints label language" into "label in language constraint"s, and someone commented to me this was one feature they were missing. It would be a really nice feature to have. Infrastruktur (talk) 18:44, 8 December 2022 (UTC)[reply]

Maybe if you published the source codes? Probably C++ is not the most well-known language in the Wikimedia technical community, but I’m sure there are some people who are willing to help, and could have an idea how to solve the memory issues you mentioned in #New constraint Label in Language. —Tacsipacsi (talk) 19:29, 9 December 2022 (UTC)[reply]
@Pasleim: If memory is tight, it might be handled temporarily or permanently by use of SPARQL. Though is this better handled by Deltabot? Opinions?  – The preceding unsigned comment was added by Infrastruktur (talk • contribs) at 19:29, 10 December 2022 (UTC).[reply]
Hi Ivan, I noticed these edits. Note that multiple qualifiers are allowed and that they mean or. I just haven't got around to fixing the template.
If you are working on supporting the constraints, don't use SPARQL, use search, it's much faster. Check the API documentation. Multichill (talk) 10:52, 27 December 2022 (UTC)[reply]
Yes, I worked on the constraints support; it is operational now. Thank you for the example API call. You are right, search is more stable than SPARQL. As for multiple languages per single constraint: I checked several properties, and such usage looked like a mistake in most cases. It is unclear why the presence of an 'fr' label should influence an 'nl' label, for example. A stricter data model (like "always one language") is usually better. Currently only 5 properties with multiple languages remain. — Ivan A. Krestinin (talk) 16:40, 28 December 2022 (UTC)[reply]
Good to hear you got it working. Did you also implement the multiple languages part? That's how it's defined and documented so your bot should follow that.
I updated the LUA part and also fixed Module:Constraints so Property talk:P650 works again. Why is Wikidata:Database reports/Constraint violations/P650 still empty? Multichill (talk) 21:17, 28 December 2022 (UTC)[reply]
I did not implement support of P424 values. Do we really need such constraint for RKDartists ID (P650)? — Ivan A. Krestinin (talk) 19:39, 1 January 2023 (UTC)[reply]
Yes we do need your bot to not crash with an "ERROR: Error while Q108139345 constraint parameters loading: Too many P424 values. " when perfectly valid constraints are set. Should be only a minor change. Multichill (talk) 20:52, 1 January 2023 (UTC)[reply]
To me it looks like a real mistake, so we just need to fix the constraint documentation and fix the 4 remaining properties. Stricter validation usually produces better data quality. — Ivan A. Krestinin (talk) 21:04, 1 January 2023 (UTC)[reply]
No, it's not a mistake. Hence the description at Help:Property constraints portal/Label in language Contains the required language(s). If more than one language is listed, the constraint is satisfied if the item has a label in any of the listed languages (“or” condition). To require labels in several languages (“and” condition), define several separate constraints of this type, with a single language parameter each. based on what was implemented in phab:T195178.
You should implement the reports how it's defined, not make up a slightly different definition because you think you know better. Multichill (talk) 21:15, 1 January 2023 (UTC)[reply]
Why not? The implementation from T195178 might have bugs or things that were not implemented in a good way. We should understand why it was done this way; after that we can decide whether it is a bug or a needed feature. — Ivan A. Krestinin (talk) 21:23, 1 January 2023 (UTC)[reply]
It's a feature, stop being so stubborn and just implement it. Multichill (talk) 22:06, 1 January 2023 (UTC)[reply]
Maybe truncate the report at 1000 entries? Infrastruktur (talk) 15:26, 1 January 2023 (UTC)[reply]
Do we have some issue related to the large number of items in a report? Sometimes I (and maybe other users) use the reports as fix lists for other bots. Reducing the number of reported items would reduce the number of fixed items. — Ivan A. Krestinin (talk) 19:39, 1 January 2023 (UTC)[reply]
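The label-in-language semantics quoted earlier in this thread (several language values on one constraint mean "or"; several separate constraints mean "and") can be illustrated with a small sketch. Function names and the plain-dict label representation are invented for the example:

```python
# Sketch of label-in-language constraint semantics as documented at
# Help:Property constraints portal/Label in language.

def satisfies_label_constraint(labels, languages):
    """OR: a label in any one of the listed languages satisfies the constraint."""
    return any(lang in labels for lang in languages)

def satisfies_all_constraints(labels, constraints):
    """AND: each separate constraint must be satisfied on its own."""
    return all(satisfies_label_constraint(labels, c) for c in constraints)

labels = {"nl": "schilder"}                                  # item has only a Dutch label
print(satisfies_label_constraint(labels, ["fr", "nl"]))      # → True  (fr OR nl)
print(satisfies_all_constraints(labels, [["fr"], ["nl"]]))   # → False (fr AND nl)
```

Requiring labels in both French and Dutch is therefore expressed as two single-language constraints, not one constraint with two languages.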

Your bot is malfunctioning on Wikidata:Database reports/Constraint violations/P650 and overwriting it with an empty page. Please fix. Multichill (talk) 12:25, 15 January 2023 (UTC)[reply]

Where does "separate spacecraft" come from? I cannot find any information that suggests more action than simple refurbishment and replacement of some body panels damaged by micrometeorites. Like Westar 6, it is the same physical object, just repaired. It doesn't make sense to have two items. Huntster (t @ c) 19:23, 26 December 2022 (UTC)[reply]

Hello, [20] says that the spacecraft was modernized, and it got a new name after the modernization. Most databases have separate records for Palapa B2R and Palapa B2, and Wikipedia has separate articles for the two entities. It is common practice for Wikidata to have separate items in such cases, for example: OTV-3 (Q2163196), OTV-1 (Q980708). — Ivan A. Krestinin (talk) 06:53, 27 December 2022 (UTC)[reply]
I would agree in some situations, though the OTV items are not good examples as those items (and articles) are about the missions rather than the spacecraft, and truly are distinct entities. We do have Wikimedia permanent duplicate item (Q21286738) specifically for this sort of circumstance. Further, TBS (an enthusiast site) is the only one I've seen mentioning modernization; Multiple other publications say refurbished (plus many more), and NASA consistently uses this term. Huntster (t @ c) 11:13, 27 December 2022 (UTC)[reply]
I think we may just add the type space mission (Q2133344) to the items. Actually, the border between refurbishing and reworking is very thin. I think we should make the decision based on our data model. For example, Wikidata items contain only one COSPAR ID (P247) property in most cases. I think it is better to continue using this model. — Ivan A. Krestinin (talk) 21:49, 27 December 2022 (UTC)[reply]

Data import from lt.wikipedia

Hello/Здравствуйте, @Ivan A. Krestinin.

Could You make a Lithuanian geodata import from lt.wikipedia? In detail:

If you are not up to this, then maybe you could recommend somebody who performs this kind of task? Regards, Hugo.arg (talk) 11:36, 11 January 2023 (UTC)[reply]

Hello, it is better to create a request on Wikidata:Bot requests. I apologize for the delayed response. — Ivan A. Krestinin (talk) 18:41, 9 April 2023 (UTC)[reply]

Inconsistent application of autofix for P106

Per Property talk:P106, KrBot is removing occupation claims of "Q102039658" preferring to place them in "position held" instead. I am sure there is some justification for this, but it is inappropriate for an ontological vocation such as bishop, priest or deacon. Furthermore, KrBot is being inconsistent in its application, as it has not moved claims of "[Latin] Catholic priest" or "[Latin] Catholic deacon" from the occupation property, so why is "[Latin] Catholic bishop" special, and why is it not an "occupation"? Elizium23 (talk) 16:19, 15 January 2023 (UTC)[reply]

Hello, it is better to discuss this on Property talk:P106. You may add or remove some {{Autofix}} on the page to fix the situation if needed. — Ivan A. Krestinin (talk) 18:37, 9 April 2023 (UTC)[reply]

Bot malfunction on P2070

Hi,

I think that KrBot is trying to remove non-numerical characters from the start of Fellow of the Royal Society ID (P2070) values. However it appears that the match pattern is faulty: even if the value is wholly numerical, the first digit is removed (see e.g. https://www.wikidata.org/w/index.php?title=Q23979128&action=history ). As a result I think the entire store of P2070 values have been corrupted. Could you do a mass revert please? Thanks. Dogfennydd (talk) 14:21, 17 January 2023 (UTC)[reply]

Caused by me; I'm already in the process of a mass revert. Lockal (talk) 12:44, 18 January 2023 (UTC)[reply]

Single value constraint

Hi! Shouldn't single value check be changed to single best value? Example. --Bean49 (talk) 10:25, 11 February 2023 (UTC)[reply]

Several format constraints not fixed

I've been noticing recently that your bot no longer fixes format constraints for which an autofix template exists and hasn't been altered recently. Examples: Wikidata:Database reports/Constraint violations/P4985 and Wikidata:Database reports/Constraint violations/P968. What's the reason for this? Mbch331 (talk) 12:29, 20 February 2023 (UTC)[reply]

Hello, sometimes somebody adds a wrong {{Autofix}} that covers a huge number of items. The bot has protection, and such edits fail in most cases. But the bot spends too much time attempting to apply the wrong fix and stops working on a timeout. Such a case happened some time ago. But now everything looks fine as far as I can see. — Ivan A. Krestinin (talk) 18:31, 9 April 2023 (UTC)[reply]

P10689 way OpenStreetMap

Hi

You replaced an old property with this one, keeping only the numeric value (ID).

But in the current definition of P10689, we still have to put "way/" before the numeric value (I don't know why).

For the moment you should keep the format way/XXXX as the value. You can see that if you click the link: it is broken if you leave only the numeric value. Benoît (discussion) 06:29, 25 March 2023 (UTC)[reply]

Hello, the fix was added by User:GZWDer. It is better to discuss it with him. Good place for discussion is Property talk:P10689. — Ivan A. Krestinin (talk) 18:23, 9 April 2023 (UTC)[reply]

Importing coordinates

Hello. I may be mistaken, but I think I have seen your bot copy coordinates from local projects to Wikidata. Would it be possible to run such a procedure for the lists below: here and here? Doing this manually is not realistic. Машъал (talk) 11:09, 30 March 2023 (UTC)[reply]

Greetings, Машъал. This was handled by code that I have not run for quite a long time; I don't know how well it matches the current reality. Could you check whether the bot did everything correctly in this test edit? — Ivan A. Krestinin (talk) 15:07, 9 April 2023 (UTC)[reply]
The coordinates and attribution fully match Ruwiki; I believe everything in the edit is correct. It is also worth checking the handling of infoboxes with the formats 1/[NS]/2/[EW], 1/2/[NS]/3/4/[EW], 1/2/3/[NS]/4/5/6/[EW]. Машъал (talk) 19:30, 9 April 2023 (UTC)[reply]
There is also a third list, but for some reason the bot reports an error: Wikidata:Database reports/Constraint violations/P7678. The property P7678 is used in only 4400 articles, so in theory the limit (10,000?) cannot be exceeded. Машъал (talk) 19:36, 9 April 2023 (UTC)[reply]
I have just fixed it; it may work now. Машъал (talk) 19:40, 9 April 2023 (UTC)[reply]
✓ Done. The bot skipped some items to avoid possible errors, for example those where stream (Q47521) was specified instead of the type "river". — Ivan A. Krestinin (talk) 07:05, 16 April 2023 (UTC)[reply]
Thank you. But stream (Q47521) is no different from "river" in terms of properties; it is the same kind of watercourse with a source and a mouth. Moreover, stream (Q47521) is set incorrectly in many places because of the Cebuano imports. There are still many lakes and reservoirs left in the lists. Машъал (talk) 03:22, 25 April 2023 (UTC)[reply]

KrBot2

Hello. KrBot2 seems to have been stopped since 05:23, 05 April 2023 (UTC). Could you check it? Thanks. ウース (talk) 06:00, 8 April 2023 (UTC)[reply]

Hello, yes, ISP issue, working on it. — Ivan A. Krestinin (talk) 08:15, 9 April 2023 (UTC)[reply]
Fixed. Reports should be updated in ~20 hours. — Ivan A. Krestinin (talk) 14:53, 9 April 2023 (UTC)[reply]

Reverted merge/ResolvingRedirect (Q27710721_Q254096)

Hello Ivan, in February 2022 Proceedings of the USSR Academy of Sciences (Q27710721) was wrongly merged with Proceedings of the Academy of Sciences (Q254096). This was reverted some weeks later. In between, KrBot changed nearly 10,000 statements (mainly scholarly articles published ≤1992). Could you revert the bot's edits please? Regards AmsaTalla (talk) 00:53, 16 April 2023 (UTC)[reply]

??

is also a duplicate Dieter123 (talk) 09:02, 1 May 2023 (UTC)[reply]

@Dieter123, what's wrong? Two items, Microsoft Copilot (Q117289408) and Microsoft Copilot (Q116793893) were merged. In places where the former was used, the bot replaces it with the latter. Michgrig (talk) 11:17, 1 May 2023 (UTC)[reply]
@Michgrig > https://translate.google.com/?sl=auto&tl=ru&text=Unter%20%22besteht%20aus%22%20stand%20zweimal%20%22Bing%20Chat%22&op=translate Dieter123 (talk) 21:20, 2 May 2023 (UTC)[reply]

Resolving redirect

Hi! This edit seems to be not helpful because

  1. The statement is already marked as deprecated with reason redirect (Q45403344)
  2. The correct value is already present in a different statement

— Martin (MSGJ · talk) 10:15, 10 May 2023 (UTC)[reply]

Bot suspended

Hello.

KrBot and KrBot2 seem to be suspended since 22 May. Could you check it? Thanks. ウース (talk) 00:33, 24 May 2023 (UTC)[reply]

Sorting of entities in types statistics

Hi! Would it be possible to sort the values of the type statistics by usages desc, type instead of only by usages desc? I've seen multiple examples where the bot changes the order of the elements of the type report from one run to the next without changing the usage counts, which makes the diffs huge. For example, it changed this block of the previous usages list:

|-
| {{yes}} || [[Q1094617]] || 61
|-
| {{yes}} || [[Q1800833]] || 61

To this other one that appeared in the current one:

|-
| {{yes}} || [[Q1800833]] || 61
|-
| {{yes}} || [[Q1094617]] || 61

I don't know if it's a common use case, but when I check constraint reports, especially huge ones, I start by checking the new values added since the last time I checked (to make sure that I'm not checking the same values every time). Changing the sort to also use the type doesn't seem too hard, while huge diffs make skimming things like the type statistics annoying. Kind regards, Agabi10 (talk) 13:59, 1 June 2023 (UTC)[reply]

I also usually start by looking at the diff. While I don’t read the diffs of these tables, I appreciate all changes that make the diffs smaller, so that I can more easily find other, more interesting parts of the diff. —Tacsipacsi (talk) 22:48, 1 June 2023 (UTC)[reply]
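The deterministic ordering requested above amounts to a two-key sort. Below is a minimal illustrative sketch, not KrBot's actual code; the row tuples simply mirror the wikitable columns shown in the diff (exception flag, item, usage count).

```python
# Sketch only, not KrBot's real implementation.
# Each row mirrors the wikitable above: (exception flag, QID, usage count).
def stable_order(rows):
    """Sort by usage count descending, then by numeric QID ascending.

    The secondary key breaks ties deterministically, so rows with equal
    usage counts keep the same relative order from one run to the next."""
    return sorted(rows, key=lambda r: (-r[2], int(r[1][1:])))

rows = [
    ("yes", "Q1094617", 61),
    ("yes", "Q1800833", 61),
    ("yes", "Q42", 105),
]
```

With such a key, the two 61-usage rows from the example always come out in the same order, so a rerun with unchanged counts would produce an empty diff for the table.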

Repair of KrBot in university teaching items

Hi Ivan!

Epìdosis corrected P31 in Q25339110: title -> title of authority

This was perhaps correct, but as a result, KrBot moved all occupation (P106): full professor (Q25339110) statements to position held (P39): full professor (Q25339110), which is no longer correct in all cases. Unfortunately, it is difficult for me to describe the problem, but in Hungarian, for example, "university teacher" is a term with multiple meanings: it means both a person who teaches at an institution of higher education and a person who received the title of "full professor" after completing the required credits. For this reason, a significant part of the items actually had the wrong value for the occupation property (that is, it is not bad in Hungarian, but the person's occupation was indicated with the wrong item). KrBot has now appointed many hundreds of people as "full professors" without providing any sources for them. I admit I don't know if this problem occurs in other languages (Spanish and Catalan have similar terms); do you have any suggestions for a solution? Thanks. Pallor (talk) 00:45, 6 June 2023 (UTC)[reply]

@Pallor: If they do not have "full" professor appointments, I think it would be better to use university teacher (Q1622272) instead (and leave it as an occupation). --99of9 (talk) 00:58, 6 June 2023 (UTC)[reply]
@Pallor: thanks for raising the issue; it is an interesting problem to discuss. Just a small clarification first: KrBot was in fact triggered not by my edit on the instance of (P31) of full professor (Q25339110) (through which I made it the same as associate professor (Q9344260)), but by the new {{Autofix}}es that I added on Property talk:P106. So, there was an issue of inconsistent data modeling: values like full professor (Q25339110) (and others) were used randomly with P106 and P39, so that a user trying to query them had to take both possibilities into account, which is not desirable. Since, according to one of the examples at Property:P39#P1855, the types of professorship should use position held (P39), whilst of course I agree that university teacher (Q1622272) goes into occupation (P106), I decided to make the types of professorship uniform under position held (P39) through autofix. If you want, I can write a query looking for position held (P39): full professor (Q25339110) with no sources (should I add other conditions?) so that the cases you mentioned can either be checked manually or massively removed. Let me know. --Epìdosis 10:59, 6 June 2023 (UTC) P.S. If professor is considered an academic title in Hungary, maybe it should just go into academic degree (P512) in such cases?[reply]

To @Ivan A. Krestinin:: I see that in https://www.wikidata.org/w/index.php?title=Q51858005&action=history (and probably also elsewhere) the bot did something wrong: there were two statements of occupation (P106) with the same value associate professor (Q9344260) but different qualifiers, and the bot moved them to just one statement of position held (P39), conflating the qualifiers of the two previous statements into one statement. This should not happen! Could you fix this for the future? Thanks, --Epìdosis 10:59, 6 June 2023 (UTC)[reply]

Please also see
M2k~dewiki (talk) 17:31, 7 July 2023 (UTC)[reply]

KrBot inserts duplicate references

Hi, first of all, thanks for the great job done by your bot. Though, unless I am mistaken, here it adds a duplicate reference to BNF. Is there a problem with the format of the reference? CaféBuzz (talk) 20:07, 13 June 2023 (UTC)[reply]

Edit war with Frettiebot

Hello, KrBot is in an edit war with Frettiebot on Paulo Trigo Pereira (Q21072405). Frettiebot adds occupation (P106): member of parliament (Q486839), and KrBot removes it and moves it to position held (P39), leaving several identical references in P39. I left a similar message at User talk:Frettiebot. Joao4669 (talk) 06:40, 15 June 2023 (UTC)[reply]

I think the autofix is fine and the problem lies in the autofixed statement being readded: reported. --Epìdosis 07:32, 10 August 2023 (UTC)[reply]

ukr. Кравець not ru. Кравец

As I understand it, this was done deliberately [21]. The question is: why? Бучач-Львів (talk) 11:41, 15 June 2023 (UTC)[reply]

This is automatic: the bot fixes links pointing to redirects after items are merged (which happened here). --Infovarius (talk) 14:14, 16 June 2023 (UTC)[reply]

Bot fix needed

This was not a good edit; the value was already marked as deprecated, and the correct value (which your bot added) was already present. I have reverted it.

Please prevent the bot from repeating it, or making similar edits. Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:16, 30 July 2023 (UTC)[reply]

Hello, VIAF is a very specific database: it contains a huge number of redirects. We already discussed how to handle it in Wikidata: Property_talk:P214/Archive_2#Removal_of_redirected_and_deleted_IDs. — Ivan A. Krestinin (talk) 00:14, 10 August 2023 (UTC)[reply]

KrBot references

Hello,

The references added by KrBot are wrong. Please comply with Help:Sources/ru. For example, references added to The Boys (Q48796994) are wrong.

Cordially. ―Eihel (talk) 19:42, 1 August 2023 (UTC)[reply]

Hello Eihel, are you referring to this reference? I understand that it is not very good, but it is the best that I found. Could you suggest a better reference format for such edits? — Ivan A. Krestinin (talk) 00:03, 10 August 2023 (UTC)[reply]

Hello,

On Q121081132, the bot keeps changing correct links. There's nothing wrong with the other links; they are correct and work fine. On the Dutch Wikipedia we say something like “if something is already correct, don't overcorrect”. Please make it stop changing correct links.

Regards, Mondo (talk) 10:47, 2 August 2023 (UTC)[reply]

Hello Mondo, please see the AllMusic artist ID (P1728) property documentation. It describes a single format as correct; the examples and format patterns say nothing about a format with a name prefix. I understand that both formats produce correct links. But Wikidata is not just link storage: different formats make data manipulation more complex. For example, distinct-values constraint (Q21502410) will fail to find duplicate items if they have codes in different formats. — Ivan A. Krestinin (talk) 23:57, 9 August 2023 (UTC)[reply]
Hello Ivan, thanks for your answer. Be that as it may, this is about a Dutch band on the Dutch Wikipedia, and like I said, we think differently about these kinds of changes. And as you confirmed, the other links are correct too, so there's no reason to change them. I will continue to undo these kinds of changes, but it would be nice if the bot could stop doing this automatically, or if there were a way to disable the bot on certain pages. Mondo (talk) 10:35, 10 August 2023 (UTC)[reply]
The bot uses the {{Autofix}} rules from Property talk:P1728. Just remove the rules from the page and the bot will stop. But it would be nice to discuss the case with somebody else first: data normalization is important for successful property maintenance. — Ivan A. Krestinin (talk) 22:11, 21 August 2023 (UTC)[reply]
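To illustrate the normalization argument from this thread: a duplicate check compares values literally, so a bare ID and a name-prefixed form of the same ID never collide unless one form is canonicalized. The sketch below is only an illustration under the assumption that the bare ID is "mn" followed by ten digits; the helper name and patterns are made up for this example, not the actual autofix rules.

```python
import re
from typing import Optional

# Assumed canonical form: "mn" followed by ten digits.
CANONICAL = re.compile(r"^mn\d{10}$")
# Prefixed form as seen in URLs, e.g. "some-band-mn0001234567".
TRAILING_ID = re.compile(r"(mn\d{10})$")

def normalize_allmusic_id(value: str) -> Optional[str]:
    """Return the bare mn... code, or None if no code can be extracted."""
    if CANONICAL.match(value):
        return value
    match = TRAILING_ID.search(value)
    return match.group(1) if match else None
```

Once every stored value is in one canonical form, a literal comparison (which is all a distinct-values check can do) is enough to catch duplicate items.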

KrBot2 suspended (August 2023)

Hello,

KrBot2 seems to be suspended since 2 August. Could you fix it? Thank you. ウース (talk) 14:45, 5 August 2023 (UTC)[reply]

Still no activity. Is there any problem? - Robotje (talk) 20:21, 8 August 2023 (UTC)[reply]
Hello, yes, I am having trouble with the Internet connection for my bot. Everything should be fixed in 1-2 weeks. — Ivan A. Krestinin (talk) 23:40, 9 August 2023 (UTC)[reply]
Maybe time to move the bot to Toolforge or Cloud VPS? Internet connection should be no problem there. I know your bot is very resource-intensive, but I’d expect WMF to provide you the necessary resources as this bot is really crucial to ensuring the quality of Wikidata data. (Toolforge provides up to 4 GB of RAM, Cloud VPS more.) —Tacsipacsi (talk) 13:23, 10 August 2023 (UTC)[reply]
Oracle Cloud Always Free Tier also offers a 4-core ARM64 VM with 24GB RAM and up to 200GB SSD storage for free, however you might have trouble accessing that offer if your current location is Russia. GreenReaper (talk) 02:27, 12 August 2023 (UTC)[reply]
WMF infrastructure (Toolforge or Cloud VPS) has the additional benefit that should Ivan unable/unwilling to operate the bot any further (be it because of sanctions against Russia, medical issues or simply Ivan losing interest in contributing to Wikimedia), someone else can take it over. (For Toolforge, there is Abandoned tool policy, I assume something similar exists for Cloud VPS as well.) Oracle Cloud, or any other third-party cloud provider, doesn’t provide this benefit, which is why I’d like Ivan to move to WMF infrastructure. —Tacsipacsi (talk) 20:24, 12 August 2023 (UTC)[reply]
Thank you for your ideas; the situation is not so critical) The Internet channel issues were resolved several days ago, and the new channel is much better than the previous one. Currently I use hardware with 128 GB of RAM; such virtual environments are usually a bit expensive. A previous request to WMF for infrastructure drowned in discussions (it was made when I had hardware issues). But currently I use a new machine with enough memory, so everything should be fine in the near term. — Ivan A. Krestinin (talk) 22:04, 21 August 2023 (UTC)[reply]
In the near time perspective it should, but in the long term it won’t. Closed-source code running on a server only you know how to access – a perfect example of a bus factor (Q1812883) of one. Even though it may be hard to believe after only just over a decade of Wikidata running, all of us will leave the project sooner or later – and if you leave the project, let that be tomorrow or in half a century, constraint reports will stop being updated, not for a few days, but maybe months or years, until someone rewrites your code from scratch.
I didn’t know you already tried WMF infrastructure, but if you gave it another try, I’d be happy to advocate for it. Of course, another option for increasing the bus factor is publishing the source code: if the source code is public, anyone can restart the job (as an emergency solution, even a much less RAM + huge SSD swap area setup would suffice – it’s probably extremely slow, multiplying the time passing between two updates, but at least it works). And maybe a C++ expert looking at the source code finds a way to drastically decrease the memory footprint. —Tacsipacsi (talk) 19:18, 22 August 2023 (UTC)[reply]
Do not worry, the situation is not so dramatic) You may find at least one more bot in the constraint reports' history. At least it can continue the project in the worst "bus factor" case. Also I am trying to keep the constraints system from growing in complexity. So re-implementation should not be a very hard task. And of course I will try to publish the source code or find another botmaster if I am no longer able to support the bot. 32 GB RAM + 256 GB SSD swap was used some time ago; the report update cycle was more than a week. As for memory usage optimization: I did several optimizations already. Now the bot uses much less memory than before. But the Wikidata DB is really big: 1505302848 mainsnak values, 20163700 unique qualifiers... — Ivan A. Krestinin (talk) 19:53, 23 August 2023 (UTC)[reply]

Do not worry, the situation is not so dramatic) You may find at least one more bot in the constraint reports' history. At least it can continue the project in the worst "bus factor" case.

DeltaBot updates are often of inferior quality compared to yours. A recent, extreme example: DeltaBot report with one section, reporting “more than 5000” violations; KrBot2 update a few hours later, with seven sections, exact violation counts even for very common violations, and statistics about accepted and violating values where applicable. The only good thing about DeltaBot is that it’s more error-tolerant: it doesn’t stop doing everything if one constraint is misconfigured, so other, correctly configured constraints continue to be updated.

Also I am trying to keep the constraints system from growing in complexity.

Which is not necessarily a good thing. If the source code were public, you could, instead of preventing complexity growth, tell anyone who wants a new constraint to implement support for it in the bot (or have support implemented by someone).

So re-implementation should not be a very hard task. […] As for memory usage optimization: I did several optimizations already.

Exactly these optimizations are hard to re-implement. Of course anyone can write a bot that can create constraint reports if it has access to unlimited resources, but writing an efficient bot isn’t easy.

And of course I will try to publish the source code or find another botmaster if I am no longer able to support the bot.

What if you don’t see in advance that you won’t be able to continue the support?

As for memory usage optimization: I did several optimizations already. Now the bot uses much less memory than before. But the Wikidata DB is really big: 1505302848 mainsnak values, 20163700 unique qualifiers...

Of course there is no guarantee that anyone will find a way to reduce the memory footprint. However, there’s only one case in which it’s guaranteed that no one will: if you don’t publish the source code. There may be a smarter way to avoid reading data from disk that won’t be used, to avoid reading data before it’s used, to free memory once it’s no longer used, etc. Who knows. —Tacsipacsi (talk) 14:10, 24 August 2023 (UTC)[reply]
IMO the smart thing to do for Ivan would be to not publish the source code. I can't see how pestering him about this is going to help. He's providing a free service out of the goodness of his heart and out of his own pocket as well and all you can do is bitch about it. *shakes head in disapproval* Maybe next time show some appreciation? Infrastruktur (talk) 20:08, 11 September 2023 (UTC)[reply]

I support the work of Ivan's bot and therefore he should avoid WMF infrastructure because it belongs to the Western complex.--U. M. Owen (talk) 20:05, 20 September 2023 (UTC)[reply]

Useless edits about postal code (P281)

Hi there, I see you haven't been active on Wikidata since April, hope everything's ok.

I would like to let you know that your KrBot is making useless edits to postal codes with the message "см. / see autofix на / on Property talk:P281". By "useless edit" I mean that it changes nothing at all in the already-set value of postal code (P281). One example of such an edit. I've checked the hex values of all six characters of the value both before and after the edit, and they are identical.

Is there any reason why this occurs? Jimkats (talk) 17:34, 8 August 2023 (UTC)[reply]

The fourth character is not identical: the space (U+0020) was replaced by a non-breaking space (U+00A0). It’s the first {{Autofix}} in the section Property talk:P281#Autofix. (If you enable Navigation popups in your preferences and hover over your diff link, you’ll see that the space was replaced by \u00a0.) —Tacsipacsi (talk) 20:38, 8 August 2023 (UTC)[reply]
Hello Jimkats, Tacsipacsi, I have some troubles in non-wiki life. My bot has no suitable Internet connection right now; it should be back in 1-2 weeks. I am keeping existing tasks operational, but I have too little time for other activities now. As for your question: Tacsipacsi is right, this autofix replaces regular spaces with non-breaking ones. As I remember, a long time ago postal codes were imported with different space types; this autofix normalizes all codes to a single format. — Ivan A. Krestinin (talk) 23:38, 9 August 2023 (UTC)[reply]
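For anyone else puzzled by such seemingly no-op diffs: the two space characters render identically but differ at the code-point level. A minimal sketch of the normalization described above (the postal code "AB 123" is a made-up example, not a real value):

```python
def normalize_postal_code(code: str) -> str:
    """Replace regular spaces (U+0020) with non-breaking spaces (U+00A0)."""
    return code.replace("\u0020", "\u00a0")

before = "AB 123"              # fourth character is a regular space (U+0020)
after = normalize_postal_code(before)

assert before != after         # the strings differ...
assert after == "AB\u00a0123"  # ...only in the space character
```

This is why the diff looks empty to the eye while the hex values of the two versions do differ in one character.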

variable-order class

KrBot keeps reverting variable-order class (Q23958852) being an instance of itself. Please have the bot not do this, as the self-instance is correct. Peter F. Patel-Schneider (talk) 10:57, 13 September 2023 (UTC)[reply]

Hi Peter, thank you. I added variable-order class (Q23958852) to exceptions. — Ivan A. Krestinin (talk) 23:40, 13 September 2023 (UTC)[reply]

Please respect deprecated rank of identifiers

I add a deprecated rank and a reason for deprecation when there is a defective DOI in the dataset. Your bot then "fixes it", which is redundant because the DOI is already on the item, and it also causes my bot to simply edit it again. Please check for deprecated rank and whether the resulting transformed identifier is already on the item. (I've set up a new already-edited-items cache, so the bot should not edit-war indefinitely, but it may do so while working through items not yet registered in the cache.) Harej (talk) 01:03, 14 September 2023 (UTC)[reply]

Hello, DOI (P356) has more than 30 million uses. It is very hard to maintain this data set: misprints, incorrect values, duplicate items, several identifiers in one item, correct values marked as deprecated, incorrect values with normal ranks, etc... I merged ~34000 items, resolved ~2000 short DOI codes, and fixed ~100000 incorrectly formatted codes, but more and more errors keep coming... And wrong values with deprecated rank make maintenance much harder. The algorithms are too complex already, and deprecated-rank processing requires much harder algorithms. Could you just avoid using deprecated-rank values for DOI? Simple DOI code validation during import would make our life much easier. You may use this endpoint for validation: https://doi.org/api/handles/10.1037/0278-7393.23.5.1106 External databases contain many errors; it is nice to avoid importing wrong values. That is much easier than maintaining a huge number of erroneous values. — Ivan A. Krestinin (talk) 23:06, 16 September 2023 (UTC)[reply]
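A rough sketch of the pre-import validation suggested above, using the doi.org handles endpoint from the message. The function names and the syntax regex are illustrative assumptions, and the responseCode interpretation (1 = handle found) follows the public Handle System API documentation, so verify it before relying on it.

```python
import json
import re
import urllib.parse
import urllib.request

# Loose syntactic check: "10." + registrant code + "/" + suffix (assumption).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(code: str) -> bool:
    """Cheap format pre-check before touching the network."""
    return bool(DOI_PATTERN.match(code))

def is_registered(api_response: dict) -> bool:
    """Interpret a doi.org handle-API reply; responseCode 1 means found."""
    return api_response.get("responseCode") == 1

def check_doi(code: str) -> bool:
    """Full check: syntax first, then the resolver."""
    if not looks_like_doi(code):
        return False
    url = "https://doi.org/api/handles/" + urllib.parse.quote(code)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return is_registered(json.load(resp))
```

Running such a check at import time would keep most misprints out; values that pass the syntax check but fail the resolver step could be queued for manual review instead of being imported.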

Another question about the edits made by the bot

Why did the bot remove the claim in Q6296230 that Q251266 is the publisher? Link from the official website [22] of Q6296230 where the Journals Permissions page [23] states "In general, reusing or reproducing substantial portions of Sage Publishing content requires permission." I catalog these publications. Please restore the claim. --BethGuay (talk) 14:39, 19 September 2023 (UTC)[reply]

Cleaning up after a faulty merge

Hi Iván, please undo these exchanges: there was a wrong merge, which I have already corrected (Szucs (Q37447676) <> Szűcs (Q50374671)). Thanks Pallor (talk) 19:59, 20 September 2023 (UTC)[reply]

No need anymore, Tacsipacsi did it, thanks. Pallor (talk) 00:03, 21 September 2023 (UTC)[reply]

Bot adding Surname Human

Good afternoon! This morning I happened to create a wrong property constraint on Norwegian historical register of persons ID (P4574), which in turn made your bot import family name (P734) into human (Q5) for persons having country of citizenship (P27) Norway (Q20) and a Norwegian historical register of persons ID (P4574). Can you please revert these changes made by the bot? An example is Alv Lea (Q28358095). Best regards Pmt (talk) 16:41, 24 September 2023 (UTC)[reply]

A plea to leave DOIs that don't resolve

@Ivan A. Krestinin I've noticed that KrBot is deleting DOIs that don't resolve, e.g. MISCELLANEOUS BOTANICAL NOTES 4* (Q94145860). Can I ask you to please not do that? Is it possible to restore those that you have deleted? In many cases DOIs that don't resolve can be fixed, for example I regularly send lists to Crossref (Q5188229) and they work with publishers to fix DOIs that don't resolve. By deleting these KrBot is also making it harder to find these articles (for example, I have tools that check whether an article has been added by looking for a DOI, if KrBot has deleted a DOI then there's an increased chance of adding a duplicate). I understand that DOIs that are broken are annoying, but in my opinion deleting them makes the situation worse. Rdmpage (talk) 12:39, 10 October 2023 (UTC)[reply]

Maybe these bad DOIs can be added to a list before deletion? bdijkstra (overleg) 19:55, 15 October 2023 (UTC)[reply]
@Bdijkstra@Ivan A. Krestinin My point is that they shouldn't be deleted. To do so conflates the existence of an identifier with whether that identifier currently works as expected. I'm arguing that it is better to either (a) flag DOIs that don't resolve, perhaps by adding a qualifier to DOI, or (b) communicate with the DOI agency to see if the lack of resolution can be fixed. But deleting DOIs simply because they don't resolve is needlessly destructive of other people's work. Rdmpage (talk) 13:28, 16 October 2023 (UTC)[reply]
I agree with Rdmpage, this kind of removal is nothing short of vandalism when the reference supports the DOI, even if it doesn't resolve. --Jahl de Vautban (talk) 10:43, 11 November 2023 (UTC)[reply]
The code 10.5271/SJWEH.3662 looked like a correct code. I wrote a request to doi.org via the web form; they fixed the issue in one day. Deprecated rank is incorrect for such cases, because the code was not actually deprecated; it was just a mistake in the doi.org database. But this case is very rare. My bot deleted ~88000 codes, and most of them were really invalid. Valid codes were added by other bots later in many cases. Many items were merged after deleting wrong codes, and one of the merged items contains a valid code in many cases. Wrong values require much more maintenance effort than normal values. You need to answer many questions each time you see an invalid code: is it just a misprint? is it vandalism? was the code invalid in the past and valid now? was it marked as deprecated by mistake? was the mistake fixed in the external source database? etc. You can do this validation manually when you have 10 invalid codes, but it is absolutely impossible for 10000+ values. So you need some maintenance bot, and such a bot would need a very complex algorithm that takes all possible cases into account. Is somebody ready to create such a bot? I have seen such attempts in the past, and as a result we now have many valid codes wrongly marked as deprecated. My suggestion is: let's focus on the maintenance of valid DOI values. We do not have enough forces even for this task. For example, take a look at User:Ivan A. Krestinin/To merge/Scholarly articles. It is much more important to merge and fix duplicate items than to keep invalid codes. We really have no resources for invalid DOI code maintenance. — Ivan A. Krestinin (talk) 22:58, 13 November 2023 (UTC)[reply]
Excuse me for my earlier comment; it was unnecessarily rash. And thank you for having resolved this particular DOI case; I'll try that next time before reverting KrBot. However, I think the initial problem stands: by indiscriminately removing the statements, we have no way of recording that there may be a deeper problem than merely a misprint or vandalism, and one that we (as Wikidata) can help solve. Would it be possible not to remove the claim if there is a source attached to it? --Jahl de Vautban (talk) 12:00, 14 November 2023 (UTC)[reply]
My original comment still stands. Please don't delete DOIs simply because they do not resolve. Leave them for people (or bots) who are willing to invest the effort into discovering whether the DOIs have, for example, not been correctly registered. Deleting ~88,000 codes is not, in my opinion, a positive thing. Yes, it would be nice to have a clever bot to monitor DOI status and flag cases that need investigation, but in the meantime deletion is not a solution. Do you have a list of the 88,000 DOIs your bot deleted? That would help me recover the DOIs I added that have been deleted, and would also help me make bulk requests to CrossRef (and other agencies) to fix the broken DOIs. Rdmpage (talk) 14:11, 14 November 2023 (UTC)[reply]

ModDB entries are not automatically freeware

They can also be free software or anything else, so the bot should not add that on its own. Matthias M. (talk) 16:36, 12 October 2023 (UTC)[reply]

Hello, you have already fixed the issue, thank you. I rolled back several items, for example: [25]. — Ivan A. Krestinin (talk) 00:37, 13 November 2023 (UTC)[reply]

KrBot2 on OpenStreetMap Tag:species

The bot always replaces spaces with underscores. However, this makes no sense for plant species names such as Aesculus hippocastanum or Quercus robur. Please add an exception for the OSM tag species=* so that spaces are kept. Property_talk:P1282#"_" or " " as value Davileci (talk) 02:03, 15 October 2023 (UTC)[reply]

Hello, you may delete the {{Autofix}} rule from Property_talk:P1282 or change it. I added some notes to the Property_talk:P1282#"_" or " " as value discussion. — Ivan A. Krestinin (talk) 00:11, 13 November 2023 (UTC)[reply]
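If the autofix is kept in some form, the replacement rule would need to skip keys whose values are scientific names. A hypothetical sketch of such an exception (the key list and function are illustrative, not the {{Autofix}} syntax):

```python
import re

# OSM keys whose values are scientific names where spaces are
# meaningful; hypothetical exception list for illustration.
SPACE_PRESERVING_KEYS = {"species", "genus", "taxon"}

def normalize_osm_tag(key: str, value: str) -> str:
    """Replace spaces with underscores, except for keys whose values
    are scientific names, e.g. species=Quercus robur."""
    if key in SPACE_PRESERVING_KEYS:
        return value
    return re.sub(r" ", "_", value)
```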

I've raised the issue of the krbot deleting valid and referenced occupations at https://www.wikidata.org/wiki/Property_talk:P106#Value_Jesuit_(Q12319698)_will_be_automatically_replaced_to_value_Society_of_Jesus_(Q36380)_and_moved_to_religious_order_(P611)_property and Wikidata:Project chat#Why are bots allowed to delete valid referenced statements? and as stated there I believe the autofix tags should be removed for these. Piecesofuk (talk) 07:50, 8 November 2023 (UTC)[reply]

Hello, sorry, I have no time to dive deeply into the case discussions. I hope other users have helped you. If not, just delete the corresponding {{Autofix}} rule from Property talk:P106 if it works incorrectly. — Ivan A. Krestinin (talk) 23:40, 12 November 2023 (UTC)[reply]

Mass changing to disambiguation??

Why is your bot making mass changes from building interior to Interior? ([26] etc.) That's a mistake; please restore the previous values. --Orijentolog (talk) 05:35, 5 December 2023 (UTC)[reply]

I think the bot made those changes because someone (for reasons I don't understand) had merged the two items. I undid the merge, but the damage had already been done by that time. It would be great if the bot could revert those replacements. - Eureka Lott 00:22, 6 December 2023 (UTC)[reply]
The подробности / details link points to the EditGroups tool, which allows reverting all edits with a single click. You can do it yourself, but you can also ask any bot owner to do it if you don’t want the tool to make over 11k edits in your name. —Tacsipacsi (talk) 01:27, 6 December 2023 (UTC)[reply]
I reverted the edits, thank you for your message. — Ivan A. Krestinin (talk) 02:13, 27 December 2023 (UTC)[reply]

Bot does not consider rector (Q212071) to be an occupation (P106)

KrBot does not consider rector (Q212071) to be an occupation (P106) and instead puts it under position held (P39). This is incorrect: in the Swedish context, rector is considered an occupation. The bot keeps reverting my changes at Q19977282. Can this be resolved somehow? Monir Elias Bounadi (talk) 11:58, 11 December 2023 (UTC)[reply]

@MonirEliasBounadi: I have deactivated the autofix template which triggered the edits of KrBot about this; I suggest opening a discussion in Talk:Q212071 about the use of position held (P39) or occupation (P106), pinging the competent WikiProject (Wikidata:WikiProject Education). --Epìdosis 14:25, 11 December 2023 (UTC)[reply]

Language move mistake?

Hi Ivan, @Ivan A. Krestinin, Emu: can you look at why German (Q188) was moved from B. C. Schiller (Q15784916) to Barbara Schiller (Q100989423), even though B. C. Schiller (Q15784916) is also an instance of group of humans (Q16334295)? Even if German (Q188) is deprecated.

See this https://www.wikidata.org/w/index.php?title=Q15784916&action=history, if you want. --Frettie (talk) 15:16, 12 December 2023 (UTC)[reply]

  • Hi, languages spoken, written or signed (P1412) is a person-specific property. According to its description, it is inapplicable to groups of persons. We could of course relax the property description and include groups of persons, but in my opinion that is a bad idea. The relation between a language and a group of persons is too unclear in the general case. Does it mean that somebody from the group speaks the language? Or that everybody in the group speaks it? How can a group of people "speak"? It is not a choir in the general case. How would one specify a reference if there is no single source and the property was aggregated from per-person sources? Etc. — Ivan A. Krestinin (talk) 23:05, 26 December 2023 (UTC)[reply]

Do not merge

Hello, in the merge candidates list there is a huge group of items about psalms which are already on the do-not-merge lists [27] [28] but still appear in the list. Is that a bug or a feature? JAn Dudík (talk) 21:19, 13 January 2024 (UTC)[reply]

Wikidata:Database reports/Identical ISNI

Can you create Wikidata:Database reports/Identical ISNI (Property:P213) like Wikidata:Database reports/Identical VIAF ID and Wikidata:Database reports/Identical GND ID? CV213 (talk) 11:40, 22 January 2024 (UTC)[reply]

ISNI - Q21286738 as allowed type

Why is Wikimedia permanent duplicate item (Q21286738) listed as allowed type at https://www.wikidata.org/w/index.php?title=Wikidata:Database_reports/Constraint_violations/P213&oldid=2059605708#Types_statistics ? CV213 (talk) 13:27, 22 January 2024 (UTC)[reply]

Moving statements

These bot edits don't seem helpful. Removing instance of (P31) produces error messages. --Kolja21 (talk) 12:29, 8 February 2024 (UTC)[reply]

KrBot incorrectly removing leading dot in P7085 (TikTok username)

Hi, in this edit KrBot removed the leading dot from the TikTok username, which is incorrect; the link now leads to a completely different profile. PaperHuman (talk) 22:03, 28 March 2024 (UTC)[reply]

I realized KrBot might just be checking against the property's regex, so I created Property talk:P7085#TikTok username can start with a dot (which is currently not allowed) about this. PaperHuman (talk) 22:18, 28 March 2024 (UTC)[reply]
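The fix on the property side would be to widen the format constraint so a leading dot is accepted. An illustrative sketch of the difference; both patterns are assumptions for demonstration, not P7085's actual regexes:

```python
import re

# Hypothetical current pattern: must start with a letter, digit or
# underscore, so a leading dot gets "normalized" away.
OLD = re.compile(r"[a-z0-9_][a-z0-9_.]*")

# Widened pattern that also accepts an optional leading dot.
NEW = re.compile(r"\.?[a-z0-9_][a-z0-9_.]*")

def is_valid(username: str, pattern: re.Pattern) -> bool:
    """Format-constraint check: the whole value must match."""
    return pattern.fullmatch(username) is not None
```

With the widened pattern the bot would have no reason to strip the dot, since the stored value already satisfies the constraint.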

Revert bot

Can your bot revert all edits like this one? My guess is there are fewer than 100, so if not I will do it manually. It was an incorrect merge, since the two noble family (Q13417114) items are different classes in Sweden. Maundwiki (talk) 18:35, 12 May 2024 (UTC)[reply]

The edit summaries contain a подробности / details link. If you click on it, there will be a button to revert the entire edit group (in case of these redirect resolving edits, one edit group is resolving one redirect in all items). The revert edits will appear in your name, which you may or may not like (I don’t like it), but it’s no worse than manually doing all those edits, and it’s definitely faster. —Tacsipacsi (talk) 22:28, 12 May 2024 (UTC)[reply]
Thanks, I have no problem with the edits being in my name. I reverted the merge that triggered your bot. Maundwiki (talk) 15:00, 14 May 2024 (UTC)[reply]

revert of merge Q10910206 ( Wolong He)

Hello Ivan,

Can you undo/revert your bot's merging of Q10910206 with Q34933233?
Q10910206 describes a river in Sichuan, China (zh:卧龙河), while Q34933233 (a GeoNames entry on ceb- and sv-wiki) points to some river in Heilongjiang, China. --Katpatuka (talk) 11:47, 13 May 2024 (UTC)[reply]

Reverting an archived edit group

Hi, is it possible to revert an edit group by KrBot that has been archived such as [29]? I have only just undone the merging of these two items from 2019, which in my opinion was a bad merge, but I seem to be unable to revert this edit group because of how old it is. Monster Iestyn (talk) 20:23, 24 May 2024 (UTC)[reply]

KrBot2 suspended (May 2024)

Hello, KrBot2 seems to have been suspended since 16 May 2024. Could you check it? Thank you. ウース (talk) 12:35, 26 May 2024 (UTC)[reply]

Incorrect Batushka official website labeling

Hello, I put the Facebook address in the official website statement because it is the only website of Krzysztof Drabikowski's version of the Polish band Batushka. It is misleading that only the website of Bartłomiej Krysiuk appears as the sole "official website" of Batushka, all the more so since the Polish courts ruled in favor of Drabikowski. [30] Both URLs should appear under "official website". An entry for "Krzysztof Drabikowski" is also needed. Thank you very much. Grojasp44 (talk) 20:09, 8 June 2024 (UTC)[reply]

do - undo?

Hi Ivan A. Krestinin, your KrBot derives properties from other properties, but it does not undo the derived statement when the originating statement is removed. See [31].

14:50, 21 Feb 2024‎ Arch2all talk contribs‎ 36,588 bytes −1,415‎ ‎Removed claim: object ID of the database of Austrian cultural monuments (P2951): 119587
...
16:00, 1 Feb 2024‎ KrBot talk contribs‎ 37,902 bytes +956‎ ‎Added reference to claim: heritage designation (P1435): heritage-protected object in Austria (Q1188447)
16:00, 1 Feb 2024‎ KrBot talk contribs‎ 36,946 bytes +432‎ ‎Created claim: heritage designation (P1435): heritage-protected object in Austria (Q1188447), according to property constraint "item requires statement" (Q21503247) constraint from object ID of the database of Austrian cultural monuments (P2951)
12:46, 1 Feb 2024‎ Arch2bot talk contribs‎ 36,514 bytes +1,070‎ ‎Added reference to claim: object ID of the database of Austrian cultural monuments (P2951): 119587, Add archINFORM reference
12:46, 1 Feb 2024‎ Arch2bot talk contribs‎ 35,444 bytes +345‎ ‎Created claim: object ID of the database of Austrian cultural monuments (P2951): 119587, Lookup from archINFORM

Now we have an inconsistent state: heritage designation (P1435) is no longer relevant (only items corresponding 1:1 to protected objects are marked with this property, not any of their sub-objects). Can you teach your bot to also perform the reverse operation when the reason for the original operation has been undone? best --Herzi Pinki (talk) 13:37, 10 June 2024 (UTC)[reply]
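Conceptually the bot would need to remember why it added a derived statement, so it can retract it once the originating statement disappears. A hypothetical sketch of that bookkeeping, using plain dicts as stand-ins for items (not KrBot's internals):

```python
def sync_derived(item: dict, source_prop: str, derived_prop: str,
                 derived_value: str) -> dict:
    """Keep a constraint-derived statement in sync with its source.

    While source_prop exists, ensure derived_prop is set; once the
    source is gone, remove the derived statement again (the reverse
    operation requested above), but only if it still holds the value
    the bot itself derived.
    """
    if source_prop in item:
        item.setdefault(derived_prop, derived_value)
    elif item.get(derived_prop) == derived_value:
        # The reason for the derived statement was undone.
        del item[derived_prop]
    return item
```

The guard on the removal branch matters: if a human later changed the derived statement to a different value, the bot should leave it alone.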

Russian/English summary for KrBot

I think two languages is too much. Please keep it English only, as that field is not localized. It is really awkward and hard to read. I also don't put German in front of English in my edit summaries here. Matthias M. (talk) 20:37, 17 June 2024 (UTC)[reply]

KrBot2 not creating reports since 2024-07-02

https://www.wikidata.org/w/index.php?title=Special:Contributions/KrBot2&target=KrBot2&dir=prev&offset=20240702155715

Can you have KrBot2 create error reports again? It is not creating them for old properties, nor for new ones; e.g. Property talk:P12890 has several red links in the constraints section. BergwachtBern (talk) 21:27, 13 July 2024 (UTC)[reply]

Multiple languages in label in language constraint (Q108139345) results in error

Hi, I noticed after I added a "label in language constraint" with two language codes to Uppslagsverket Finland ID (P4854) that the constraint violation report results in an error due to "too many P424 values". The same error is present in 15 other reports as well. According to Help:Property constraints portal/Label in language if the constraint has multiple language codes it gets "satisfied if the item has a label in any of the listed languages (“or” condition)" but this is not reflected in the constraint violations reports. Could you have a look and fix the issue? Samoasambia 09:51, 27 August 2024 (UTC)[reply]
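The "or" semantics described on the help page can be stated simply: the constraint is satisfied as soon as the item has a label in any one of the listed languages. A minimal sketch of that check (the data shapes are illustrative):

```python
def satisfies_label_in_language(labels: dict, language_codes: list) -> bool:
    """True if the item has a non-empty label in ANY of the listed
    languages ("or" condition, per Help:Property constraints
    portal/Label in language); labels maps language code -> label."""
    return any(labels.get(code) for code in language_codes)
```

A report generator following this rule would flag an item only when the label is missing in all of the listed languages, instead of rejecting the constraint for having multiple P424 values.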

An even larger problem is that 97 reports fail to load because some of their constraints carry a constraint scope (P4680) qualifier. That should be accepted as well. Samoasambia 18:01, 29 August 2024 (UTC)[reply]

Incorrect autofix for genre statements

Hi Ivan, KrBot has replaced the values of "genre" statements on multiple photography items with "seinen" and then converted them into "intended public" statements, as can be seen here. I think this happened because KrBot was performing autofixes, which can be seen on the genre property talk page. Do you know how this autofix can be removed: Value black-and-white photography (Q3381576) will be automatically replaced to value seinen (Q237338)? I am not sure whether this needs to be discussed on the property talk page or can just be deleted. After that has been done, can you please undo the edits? Thanks! Emily Helm Hall (talk) 22:50, 11 September 2024 (UTC)[reply]