Shortcuts: WD:RFBOT, WD:BRFA, WD:RFP/BOT

Wikidata:Requests for permissions/Bot

Revision as of 08:32, 9 December 2024 by Pi bot (talk | contribs) (Archiving closed requests to Wikidata:Requests for permissions/RfBot/December 2024)


To request a bot flag, or approval for a new task, in accordance with the bot approval process, please input your bot's name into the box below, followed by the task number if your bot is already approved for other tasks. Then transclude that page onto this page, like this: {{Wikidata:Requests for permissions/Bot/RobotName}}.

Old requests go to the archive.

Once consensus is obtained in favor of granting the botflag, please post requests at the bureaucrats' noticeboard.



Bot Name | Request created | Last editor | Last edited
KlaraBot | 2024-12-08, 00:17:49 | Iamcarbon | 2024-12-09, 06:42:03
So9qBot 11 | 2024-12-03, 15:02:59 | So9q | 2024-12-11, 10:34:31
CarbonBot | 2024-10-16, 18:41:08 | Ymblanter | 2024-11-03, 07:46:31
QichwaBot | 2024-09-25, 17:03:35 | Wüstenspringmaus | 2024-10-02, 13:06:08
Leaderbot | 2024-08-21, 18:17:53 | Lymantria | 2024-09-10, 17:29:51
ZLBot | 2024-08-03, 12:45:33 | Wüstenspringmaus | 2024-08-28, 15:26:21
UmisBot | 2024-07-25, 16:44:40 | Ymblanter | 2024-08-16, 20:25:14
DannyS712 bot | 2024-07-21, 03:09:22 | Ymblanter | 2024-07-26, 04:29:22
TapuriaBot | 2024-06-03, 16:18:28 | BrokenSegue | 2024-06-07, 15:31:46
IliasChoumaniBot | 2024-06-03, 10:16:37 | IliasChoumaniBot | 2024-07-18, 11:01:28
Browse9ja bot | 2024-05-16, 02:16:04 | Browse9ja bot | 2024-05-25, 13:12:09
OpeninfoBot | 2024-04-16, 11:14:27 | Ymblanter | 2024-05-09, 19:22:52
MidleadingBot 5 | 2024-02-05, 13:04:20 | Ymblanter | 2024-11-05, 19:27:35
So9qBot 9 | 2024-01-05, 18:41:06 | Fnielsen | 2024-12-04, 21:46:59
So9qBot 8 | 2023-12-17, 15:07:59 | So9q | 2024-11-27, 19:05:16
HVSH-Bot | 2023-12-31, 12:37:18 | So9q | 2024-01-02, 10:35:04
RudolfoBot | 2023-11-29, 09:29:38 | TiagoLubiana | 2023-11-30, 23:47:22
GamerProfilesBot | 2023-10-05, 11:06:23 | Jean-Frédéric | 2024-05-19, 07:39:50
MangadexBot | 2023-08-06, 18:01:17 | RPI2026F1 | 2024-01-25, 16:22:21
WingUCTBOT | 2023-07-31, 10:07:51 | So9q | 2024-01-02, 10:50:02
MajavahBot | 2023-07-11, 19:54:55 | Wüstenspringmaus | 2024-08-29, 11:05:24
FromCrossrefBot 1: Publication dates | 2023-07-07, 14:31:17 | Succu | 2023-11-07, 20:19:56
UrbanBot | 2023-06-29, 16:04:49 | Urban Versis 32 | 2023-07-15, 02:40:06
AcmiBot | 2023-05-16, 00:36:49 | BrokenSegue | 2023-06-22, 20:40:33
WikiRankBot | 2023-05-12, 03:36:56 | BrokenSegue | 2024-02-22, 15:59:51
ForgesBot | 2023-04-26, 09:30:12 | BrokenSegue | 2023-04-26, 17:13:55
IngeniousBot 3 | 2023-03-22, 16:29:58 | Ymblanter | 2023-06-23, 19:04:15
LucaDrBiondi@Biondibot | 2023-02-28, 18:25:03 | LucaDrBiondi | 2023-03-31, 16:10:37
Kalliope 7.3 | 2022-12-07, 09:16:20 | DannyS712 | 2024-06-09, 07:00:55
DL2204bot 2 | 2022-11-30, 11:19:21 | DannyS712 | 2024-06-09, 07:02:03
Botcrux 11 | 2022-11-28, 09:05:27 | Wüstenspringmaus | 2024-08-30, 09:13:32
Cewbot 5 | 2022-11-15, 02:20:05 | Midleading | 2024-11-04, 15:42:28
Mr Robot | 2022-11-04, 14:09:41 | Liridon | 2023-03-02, 13:03:34
RobertgarrigosBOT | 2022-10-16, 19:43:23 | Robertgarrigos | 2022-10-16, 19:43:23
YSObot | 2021-12-16, 11:33:29 | So9q | 2024-01-02, 10:32:27
AradglBot | 2022-03-14, 19:43:27 | Wüstenspringmaus | 2024-08-29, 10:55:49
PodcastBot | 2022-02-25, 04:38:31 | Iamcarbon | 2024-10-16, 21:26:09


KlaraBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Iamcarbon (talk · contribs · logs)

Task/s:

Append a human's lifespan to descriptions when they can be authoritatively sourced and when they do not already exist.

for example: Lithuanian historian -> Lithuanian historian (1923–2017) [diff]

Code:

Human descriptions will be updated under the following conditions:

  • There is no existing lifespan
  • A birth and/or death date is available that:
  1. has at least one reference
  2. is precise to at least the nearest year
  3. uses the Gregorian calendar (or the default calendar)
  • Adding the lifespan does not introduce ambiguity in relation to reigns, terms in office, or other significant roles

Additional Rules:

  • Deprecated statements are excluded
  • Preferred values are prioritized over non-preferred ones
  • In cases of conflicting claims:
  1. if the item has a Wikipedia article that includes the lifespan in the first sentence, it will be used to corroborate an existing claim
  2. the item will be logged for publishing in a future public report, so we can manually improve the claims
  • Data from unreliable sources or individuals born before 1700 may be ignored
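The conditions and rules above can be sketched roughly as follows. This is a minimal illustration only, not KlaraBot's actual code: the `DateClaim` class, the regular expression for detecting an existing lifespan, and the "(born YYYY)" / "(died YYYY)" formats for one-sided lifespans are all assumptions. The Wikipedia corroboration step, the unreliable-source and pre-1700 rules, and the ambiguity check for reigns and terms in office are left out; conflicting claims are simply skipped.

```python
import re
from dataclasses import dataclass


@dataclass
class DateClaim:
    """Hypothetical, simplified date claim (not KlaraBot's real data model)."""
    year: int
    precision: int            # Wikibase time precision: 9 = year, 10 = month, 11 = day
    has_reference: bool
    gregorian: bool = True
    deprecated: bool = False
    preferred: bool = False


# Assumed pattern for an existing trailing lifespan, e.g. "(1923–2017)" or "(born 1950)".
LIFESPAN_RE = re.compile(r"\((?:born |died )?\d{3,4}(?:[–-]\d{3,4})?\)\s*$")


def usable(claim):
    """Referenced, non-deprecated, Gregorian, and at least year precision."""
    return (not claim.deprecated and claim.has_reference
            and claim.precision >= 9 and claim.gregorian)


def pick_year(claims):
    """Return (year, conflict); preferred claims win over non-preferred ones."""
    ok = [c for c in claims if usable(c)]
    pool = [c for c in ok if c.preferred] or ok
    years = {c.year for c in pool}
    if len(years) > 1:
        return None, True     # conflicting claims: leave for manual review
    return (years.pop() if years else None), False


def append_lifespan(description, births, deaths):
    """Return the new description, or None when the item should be skipped."""
    if LIFESPAN_RE.search(description):
        return None           # a lifespan already exists
    birth, birth_conflict = pick_year(births)
    death, death_conflict = pick_year(deaths)
    if birth_conflict or death_conflict:
        return None
    if birth is None and death is None:
        return None
    if birth is not None and death is not None:
        suffix = f"({birth}–{death})"
    elif birth is not None:
        suffix = f"(born {birth})"   # assumed format
    else:
        suffix = f"(died {death})"   # assumed format
    return f"{description} {suffix}"
```

Returning None for every skip case keeps the caller simple: it only writes a description when a string comes back, and can log the item otherwise.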

Function details: --Iamcarbon (talk) 00:17, 8 December 2024 (UTC)[reply]

 Comment On enwiki, such lifespan edits are also made to short descriptions. I would consider them "not-bot like". Side note @Iamcarbon: you got shock-blocked because you manually removed labels where a mul label existed. Estopedist1 (talk) 08:19, 8 December 2024 (UTC)[reply]
Bot description looks fine for the most part. I think AI assisted unattended edits are a bad idea in general, I would much prefer that in cases of multiple or conflicting claims it logs the claim and writes out a publicly available report on them at some developer-determined interval. These cases ought to be handled manually by a human. Edit: Or alternatively, depending on the amount, just ignore the ambiguous cases. Infrastruktur (talk) 17:53, 8 December 2024 (UTC)[reply]
Agreed. I have replaced the LLM with a Wikipedia parser to disambiguate conflicting claims, and added an SQLite database to log these items so they can be reported in the future. If we can't corroborate the year with Wikipedia, the item will be skipped. Iamcarbon (talk) 20:47, 8 December 2024 (UTC)[reply]


So9qBot 11 (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: So9q (talk · contribs · logs)

Task/s: Import of 564 hiking paths from Naturkartan.

Code: https://github.com/dpriskorn/NaturkartanScraper

Function details:

  • I extracted all hiking paths and cleaned the data (removed bicycle routes).
  • Then I exported all trails >= 5km => 601 trails
  • Then I cleaned away those with duplicate names in OpenRefine => 544 trails (CSV)
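The filtering steps above could be expressed in Python roughly as follows. This is only a sketch of the idea: the deduplication described above was done in OpenRefine, the field names (`name`, `type`, `length_km`) are hypothetical, and it assumes that "cleaned away those with duplicate names" means dropping every trail whose name occurs more than once rather than keeping one copy.

```python
from collections import Counter


def select_trails(trails, min_km=5.0):
    """Keep hiking trails of at least min_km whose name is unique."""
    # Steps 1 and 2: drop non-hiking routes and trails shorter than the cut-off.
    long_enough = [
        t for t in trails
        if t["type"] == "hiking" and t["length_km"] >= min_km
    ]

    def key(trail):
        # Compare names case-insensitively and ignore surrounding whitespace.
        return trail["name"].strip().casefold()

    # Step 3: drop every trail whose (normalised) name is not unique.
    counts = Counter(key(t) for t in long_enough)
    return [t for t in long_enough if counts[key(t)] == 1]
```

Changing `min_km` makes it easy to see how the 2 km, 5 km and 10 km cut-offs raised for discussion would affect the number of items.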

All the metadata about the trails are maintained via this service by public organizations. Unfortunately the public organizations have not been willing to publish open data by themselves directly, which would of course be preferred.

Some of the data can be found in a rather messy and unusable form from the länsstyrelser via Naturvårdsverket.

This import would help the Swedish OSM community because I match all Swedish hiking trails in Wikidata against OSM and link them using hiking trail matcher. This makes it very easy for anyone to see how many official trails are still missing in OSM.

I can upload a few example items if anyone wants to review the schema.

To be discussed:

  • are hiking paths around 5 km length notable? Would we rather make the cut off at 10 km? Or perhaps 2 km?

--So9q (talk) 15:02, 3 December 2024 (UTC)[reply]

Discussion

Ainali Belteshassar so9q Vätte Popperipopp Tulipasylvestris Esquilo Daniel Mietchen — with focus on topics related to research (Q42240) VisbyStar Haxpett QubeCube Marcus.linneberg Vitplister Spisen Sollentuna Myohmy671 Autom S4b1nuz E.656 JoranL

Notified participants of WikiProject Sweden So9q (talk) 08:30, 4 December 2024 (UTC)[reply]

The data source doesn't seem openly licensed, or am I missing something? Ainali (talk) 16:36, 4 December 2024 (UTC)[reply]
That can be debated, basically the only four bits of information I extracted are:
Name, length, municipality, type of route. None of that is protected by copyright in the US (it's metadata and not copyrightable). The same is most probably the case in Sweden, but then we have the database protection thing. I would argue that, since this is information from public bodies already released to the public through the service, no judge would admit a suit for copyright violation.
Scraping this information actually improves the service because it could increase the traffic and thus importance of the service.
I'm going to contact them and ask for a written approval. So9q (talk) 08:42, 7 December 2024 (UTC)[reply]
Naturkartan is only a publishing tool for principals who manage nature and outdoor life, primarily municipalities, county administrative boards and regions.
We (Outdoormap/Naturkartan) do not own any information. Just as Automattic/Wordpress does not own the information on a Wordpress site.
So we cannot release customers' data in the way you suggest.
However, a process is underway where more and more municipalities are opening up data as open data via https://www.dataportal.se/sv I assume that there may be more opportunities for more outdoor data to be available as open data that Wikidata can also link to. We welcome this development and look forward to more outdoor data becoming available as open data.
With kind regards,
Emil Hellström (CEO Outdoormap - owner of the website/service Naturkartan.se) Wdnk99 (talk) 11:49, 7 December 2024 (UTC)[reply]
Thanks for the response. I'll abandon this import and import them on a custom Wikibase instead.
@Ymblanter could you mark this abandoned? So9q (talk) 10:33, 11 December 2024 (UTC)[reply]



QichwaBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Elwinlhq (talk · contribs · logs)

Task/s: Creating wikidata lexemes for the Quechua languages

Code: lexeme_upload.py describes the code for creating lexemes for the quechua languages based on a list extracted from the Qichwabase, which is a Wikibase.cloud instance of Quechua lexemes.

Function details: The tasks carried out by the bot include mainly the creation of Lexemes for the Quechua Languages based on the Qichwabase. The lexemes were already modelled according to Wikidata Lexemes model.

A small subset of the lexemes were already imported into Wikidata using the lexeme_upload.py with the support of Kristbaum (talk · contribs · logs). Here is one example of a Quechua Lexeme: aparquy/aparquy (L1322219).

Afterwards, pronunciation audio was added to the lexemes, with the support of the LinguaLibre tool.

Now, I would like to continue this process, by continuing creating Lexemes, so the pronunciation audio for them can be recorded.

Thanks for your support and understanding.

--Elwinlhq (talk) 17:03, 25 September 2024 (UTC)[reply]

Leaderbot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Leaderboard (talk · contribs · logs)

Task/s: phab:T370842 and meta:Global reminder bot

Code: https://github.com/Leader-board/userrights-reminder-bot, though this is under development

Function details: See the above phabricator task. It should be noted that

  • I'm submitting near-identical requests on multiple wikis, and
  • I do not expect this bot to run that much (if at all) on Wikidata and will not require a bot flag; however, Wikidata:Bots explicitly mentions that approval is needed (and that the botflag be set, which I find unnecessary and even a bad idea), and
  • Here's a test edit (the text will be generalised for Wikidata). Users will be able to opt-out from this using a central page on Meta.

P.S: I also noticed that it says that bot "should never be used to make non-automated edits in the user talk namespace" which my bot will do - not sure if there's a way out of that.

--Leaderboard (talk) 18:17, 21 August 2024 (UTC)[reply]

Can you link examples of temporary rights on Wikidata? Sjoerd de Bruin (talk) 16:34, 26 August 2024 (UTC)[reply]
@Sjoerddebruin: [1], [2] and [3]. As noted above,
  • Wikidata does not make that much use of temporary rights (the flooder right is automatically ignored), and
  • many (but not all) of them are IPBE - some communities prefer that the bot exclude them. In that case it will run rarely, like in the case of the third example I shared above.
Leaderboard (talk) 05:29, 27 August 2024 (UTC)[reply]
I don't understand how a bot flag is needed for a bot that makes "non-automated edits in the user talk namespace"? This may be my confusion... --Lymantria (talk) 17:11, 9 September 2024 (UTC)[reply]
@Lymantria:, the edits are automated, just that the frequency is (very) low. Leaderboard (talk) 08:00, 10 September 2024 (UTC)[reply]
I'd prefer that you go for a global bot account. --Lymantria (talk) 13:00, 10 September 2024 (UTC)[reply]
@Lymantria But global bots are disabled on this wiki (see Meta:Special:WikiSets/14 where Wikidata is in the opt-out set). If there is consensus from the community that global bots should be allowed to run on Wikidata, that's fine by me as well. To reiterate, I don't even need a bot flag in the first place, just approval to run this bot (without one). Leaderboard (talk) 16:29, 10 September 2024 (UTC)[reply]
I'm sorry, you are right. --Lymantria (talk) 17:29, 10 September 2024 (UTC)[reply]

Wikidata:Requests for permissions (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Micrluo (talk · contribs · logs)

Task/s: request SPARQLs for RAG

Code:

Function details: --Micrluo (talk) 12:45, 3 August 2024 (UTC)[reply]

@Micrluo Could you give us some more information & fix your request? --Wüstenspringmaus talk 15:26, 28 August 2024 (UTC)[reply]

UmisBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Stuchalk (talk · contribs · logs)

Task/s: This bot will add string representations of units of measurement to units of measurement Wikidata pages.

Code: The Python project on the "Units of Measurement Interoperability Service" (UMIS), that this bot will support/enable, is at https://github.com/usnistgov/nist_umis .

Function details: String representations of different units of measurement are being aligned to allow translation between different unit representation systems. As the developer of the UMIS, I have concluded that Wikidata is the best place to organize/align unit representation strings. Once available at nist.gov later this year, the UMIS website will offer additional functionality to enable users to programmatically translate between unit of representation systems, and additional functionality is planned. There are already Wikidata properties for some of the unit representation systems (e.g. QUDT) and additional ones will be requested. This is my first bot permission request so if more info is needed please let me know. --Stuart Chalk (talk) 16:44, 25 July 2024 (UTC)[reply]

Please make some test edits. Ymblanter (talk) 20:25, 16 August 2024 (UTC)[reply]

DannyS712 bot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: DannyS712 (talk · contribs · logs)

Task/s: I want to get approval for a bot with translation admin rights that will automatically mark pages for translations if and only if the latest version is identical to the version that is already in the translation system, i.e. only pages with no "net" changes in the pending edits.
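The "if and only if" condition described in the task can be illustrated with a small sketch. No code exists for this bot yet, so this is purely an illustration of the stated rule; the function name and its four parameters are hypothetical, and fetching revisions or marking pages through the Translate extension is not shown.

```python
def should_mark_for_translation(latest_rev_id, marked_rev_id,
                                latest_text, marked_text):
    """Decide whether a page may be marked for translation automatically.

    A page qualifies only when there are pending edits (the latest revision
    is newer than the one in the translation system) and those edits cancel
    out, so the latest text equals the already-marked text.
    """
    has_pending_edits = latest_rev_id != marked_rev_id
    no_net_change = latest_text == marked_text
    return has_pending_edits and no_net_change
```

Comparing the full text rather than individual diffs means a vandalism edit followed by a clean revert qualifies, while any surviving change, however small, leaves the page for a human translation admin.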

Code: not yet

Function details: I am filing almost identical requests for bot approval on a bunch of wikis, and figured I should put some of the details in a central location. Please see meta:User:DannyS712/TranslationBot for further info. --DannyS712 (talk) 03:09, 21 July 2024 (UTC)[reply]

@Lymantria @Ymblanter just noting here that I cannot do test edits unless the bot is granted translation admin rights, unless you want me to test under my own account --DannyS712 (talk) 00:57, 26 July 2024 (UTC)[reply]
✓ Done Ymblanter (talk) 04:29, 26 July 2024 (UTC)[reply]

TapuriaBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: محک (talk · contribs · logs)

Task/s: interwiki

Code: interwikidata.py from PAW, Mainly from Mazandarani and Gilaki Wikipedias.

Function details: novice --محک (talk) 16:18, 3 June 2024 (UTC)[reply]

there isn't enough info here. i don't understand what this is doing or how it is doing it BrokenSegue (talk) 15:31, 7 June 2024 (UTC)[reply]

IliasChoumaniBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Ilias Choumani / IliasChoumaniBot (talk · contribs · logs)

Task/s: Automatic updating of data from JSON files on German scientists

Code: Will be in Python (not there yet)

Function details: --IliasChoumaniBot (talk) 10:16, 3 June 2024 (UTC)[reply]

what json files? we need more details BrokenSegue (talk) 15:31, 7 June 2024 (UTC)[reply]
We are students from TH Köln tasked with automating the process of updating data for scientists on Wikidata. Our objective includes verifying the presence of researchers and creating entries if they are not already listed. Similarly, we extend this process to projects, such as those found in GEPRIS, where these researchers have been involved. Subsequently, our goal is to establish connections between these projects and the respective researchers.
Our JSON files contain comprehensive data necessary for expanding information on researchers (QID, name) and their associated projects (project name, project ID) within Wikidata. This ensures that accurate and up-to-date information is seamlessly integrated into the Wikidata ecosystem.
This approach leverages automated tools and careful data handling to contribute valuable knowledge to the scientific community on Wikidata. IliasChoumaniBot (talk) 14:35, 17 June 2024 (UTC)[reply]
What is the ultimate source of the data, and where is it published such that TH Köln students can access it? Stuartyeates (talk) 19:19, 16 July 2024 (UTC)[reply]
We have the data from various online sources such as GEPRIS, ORCID or PubMed. We have extracted data on various German scientists and their publications and would like to automatically insert them into Wikidata as part of our studies. IliasChoumaniBot (talk) 11:01, 18 July 2024 (UTC)[reply]


Browse9ja bot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Browse9ja

Task/s: Automated data retrieval and updates for Browse9ja project, focusing on Nigerian and African-based information, integrating a chatbot, NLP API, knowledge graph, and machine learning model.

Code: (Not applicable, as I am using a combination of existing APIs and services)

Function details:

The Browse9ja bot is designed to perform the following tasks:

  • Retrieve and update data on Wikidata related to Nigerian and African-based information
  • Integrate with a chatbot to provide users with accurate and up-to-date information
  • Utilize natural language processing (NLP) API for text analysis and understanding
  • Contribute to the development of a knowledge graph for African-based information
  • Apply machine learning models to improve data accuracy and relevance

The bot will operate under the supervision of the operator (Browse9ja) and adhere to Wikidata's policies and guidelines. --Browse9ja (talk) 02:16, 16 May 2024 (UTC)[reply]

 Comment OP has no track record of contributions either here or on any other project.
 Question Can you please give more details of how the chatbot will be integrated? Do you intend to have an LLM suggest content to add to Wikidata? Bovlb (talk) 15:37, 21 May 2024 (UTC)[reply]
Details of Chat-bot Integration as requested: My chat-bot will be integrated into the Browse9ja.com as a bot to provide users with accurate and up-to-date information related to Nigerian and African-based data on Wikidata. The integration will involve utilizing a natural language processing (NLP) API for text analysis and understanding. The Chat-bot will enable users to interact with the Browse9ja bot in a conversational manner, allowing for seamless access to information and updates on Wikidata. Additionally, the chat-bot will play a role in contributing to the development of a knowledge graph for African-based information. While the chat-bot will facilitate user interaction, the machine learning models will be applied to improve data accuracy and relevance, ensuring that the information provided is of high quality and relevance to the users.
About LLM Content Suggestion: The chat-bot integrated with Browse9ja bot will have the capability to suggest content to add to Wikidata. Leveraging natural language processing (NLP) and machine learning models, the chat-bot will be able to analyze user queries and suggest relevant content for addition to Wikidata. This functionality aligns with the broader goal of the Browse9ja bot to automate data retrieval and updates for Nigerian and African-based information, ensuring that the information contributed to Wikidata is accurate, up-to-date, and relevant.
Hope this clarifies my intent and also increases my chances of approval. Thanks a lot.
Browse9ja bot (talk) 13:12, 25 May 2024 (UTC)[reply]

OpeninfoBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Fordaemdur (talk · contribs · logs)

Task/s: importing financial data (assets, equity, revenue, EBIT, net profit) from openinfo.uz to entries on public Uzbek companies in Wikidata.

Code:

Function details: I have a project going with openinfo.uz which is a state-owned public portal for financial disclosures of all public Uzbek companies. All joint-stock companies and banks in Uzbekistan have to disclose their financials there by law. I have created entries for all Uzbek banks at User:Fordaemdur/Uzbek banks and would like to test imports of financial data there (Openinfo is ready to provide API for that). If successful, the bot will import financials once per quarter. Next steps would also be creating entries for all other notable public Uzbek companies, not just banks, and import financials there too. --Fordaemdur (talk) 11:14, 16 April 2024 (UTC)[reply]

How many companies are we talking about? ChristianKl 18:57, 17 April 2024 (UTC)[reply]
@ChristianKl, currently there are items on about 50 public Uzbek companies (30+ are banks) - all can be found on my userpage. I am planning on creating items for all companies listed at the Tashkent Stock Exchange, so we'll end up with about 150 companies. There are about 600 joint-stock companies in Uzbekistan and I assume at least one third of them are notable. The test will be run on a few companies - a mix of banks and corporates, and I don't expect more than 100 edits on a test run. If the test run is successful, the bot will be occupied with populating these items that i'm manually creating rn (checking notability for each individual entry before creating it). Best, --Fordaemdur (talk) 19:17, 17 April 2024 (UTC)[reply]
Add:Openinfo.uz now has an entry to facilitate referencing its data: Unified Portal of Corporate Information Data (Q125505748) --Fordaemdur (talk) 19:19, 17 April 2024 (UTC)[reply]
 Support adding all joint-stock companies is fine given the kind of notability rules we have. If you wanted to add small businesses as well, it would be a harder call whether or not to allow it. ChristianKl 11:48, 18 April 2024 (UTC)[reply]
Thank you for clarification. I confirm that I won't be working on small businesses. Openinfo and Tashkent Stock Exchange (which i'm using for data imports) only have data on joint-stock companies. Best, --Fordaemdur (talk) 14:48, 18 April 2024 (UTC)[reply]

MidleadingBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Midleading (talk · contribs · logs)

Task/s: Create items for books in National Diet Library (Q477675).

Function details: As part of WikiProject, items for each book in National Diet Library (Q477675) have to be created so files in Wikimedia Commons can link to Wikidata. The items will have properties DOI (P356), NDL Bib ID (P1054), publication date (P577), author (P50), main subject (P921) and others. The number of items to be created is likely to be more than 100,000. --Midleading (talk) 13:03, 5 February 2024 (UTC)[reply]

Note: If item with specific NDL Authority ID (P349) does not yet exist, we need to also create an item for the author (which may include people and organizations). Also there are books not yet in public domain so can not be uploaded to Commons, but items for them can still be created. GZWDer (talk) 13:48, 5 February 2024 (UTC)[reply]
Also you may first create some example items.--GZWDer (talk) 13:56, 5 February 2024 (UTC)[reply]
Like local history of Kagoshima City (Q111372556)? This item was created by @Sakoppi:. Do you know any Japanese user who are interested in this topic? Some items may have already been created by these users. Midleading (talk) 15:14, 5 February 2024 (UTC)[reply]
I don't think that item is following the guidelines at Wikidata:WikiProject_Books. BrokenSegue (talk) 17:37, 5 February 2024 (UTC)[reply]
What it means: (1) any created item with a specific DOI (P356)/JPNO (P2687)/NDL Bib ID (P1054)/publication date (P577)/place of publication (P291) should be an "edition" (instance of (P31)=version, edition or translation (Q3331189)); (2) there are possibly multiple editions of the same work, so (in the next step, after we have the edition items) we create an item for the "work" and link the two items with edition or translation of (P629)/has edition or translation (P747). For example, c:Category:人事興信録 contains multiple different editions of Q109727675 (a "work"), and each edition has a different set of IDs. We should create items for each edition. In the future, the Commons category should be diffused into one category per edition, each linking to its edition item (instead of the current single category for the work item).--GZWDer (talk) 03:17, 6 February 2024 (UTC)[reply]
Note we have another dedicated property for NDL id (besides JPNO (P2687) and NDL Bib ID (P1054)): NDL Persistent ID (P9836). To prevent fragmentation of data, IDs should not be added to DOI (P356).--GZWDer (talk) 09:31, 6 February 2024 (UTC)[reply]
I have edited local history of Kagoshima City (Q111372508) as example of work item and local history of Kagoshima City (Q111372556) as example of edition item. Midleading (talk) 14:05, 7 February 2024 (UTC)[reply]
this still looks wrong? is it a book series? or a written work (Q47461344)? There's also constraint violations? BrokenSegue (talk) 18:33, 7 February 2024 (UTC)[reply]
It was a book that later became book series. Anyway, I will use written work (Q47461344) uniformly, because this information isn't in Commons. genre (P136), official website (P856), copyright status (P6216) for work item and follows (P155), followed by (P156), genre (P136), copyright holder (P3931), official website (P856) for edition item also can't be imported. has edition or translation (P747) statements also will not have any qualifiers when imported. Midleading (talk) 03:29, 16 February 2024 (UTC)[reply]

Hello, I want to do this for books in National Library of Spain. I see that following Wikidata rules for books, two items are needed at least, one for the written work (Q47461344) and one for each version, edition or translation (Q3331189). I created an example for this entry in datos.bne.es:

Do you think they are correct? I think that adding the "(PUBLISHER, YEAR)" in the label for each edition is useful, so you can see all that info quickly in the property has edition or translation (P747) in La vida es eterna: biografía de Víctor Jara (Q124538246). But I am open to suggestions. Of course, after we define that, I will open a request for my bot. Just wanted to use this discussion so we can unify the rules for all "book bots". Emijrp (talk) 18:12, 15 February 2024 (UTC)[reply]

generally looks good to me. personally I would like to see some more identifiers (see Wikidata:WikiProject_Books e.g. Library of Congress Control Number (LCCN) (bibliographic) (P1144) or Google Books ID (P675)) though having ISBN is good. Also a genre (P136) would be nice. BrokenSegue (talk) 18:27, 15 February 2024 (UTC)[reply]
oh also the description for the edition should say it's an edition (needs to be distinct from the work's description) BrokenSegue (talk) 18:29, 15 February 2024 (UTC)[reply]
"2023 edition of book by Mario Amorós"? What about writing the Spanish title in the English label? Is that OK or should I leave it blank when book hasn't been translated? Emijrp (talk) 19:19, 15 February 2024 (UTC)[reply]
The description for both items is that it's a "book", which is the least helpful label that could possibly be used. In Wikiproject:Books, we never use the word "book" because it could mean a work, an edition, a specific copy, a section within a work, or any of a dozen other meanings. Please do not use "book" as the description for a work or an edition; it isn't helpful and does not distinguish what it is. The data item for an edition should have "edition" in the description; not in the label. --EncycloPetey (talk) 17:18, 16 February 2024 (UTC)[reply]
I just fixed the labels and descriptions for both items (written work and edition). Is OK now? Btw, I repeat the same question, is OK to use Spanish title as English label when work hasn't been translated? Emijrp (talk) 16:15, 19 February 2024 (UTC)[reply]
Yes, it is ok to use the Spanish title if the work has not been translated. Other than that, are we ready for approval? Ymblanter (talk) 19:55, 6 March 2024 (UTC)[reply]
@Ymblanter This request was created by @Midleading: for his bot MidleadingBot. I don't know if his bot is ready for approval.
I am going to open a request for my own bot. Emijrp (talk) 19:43, 7 March 2024 (UTC)[reply]
Here is Wikidata:Requests for permissions/Bot/Emijrpbot 10. Emijrp (talk) 20:04, 7 March 2024 (UTC)[reply]

How is this still going on? The example items (local history of Kagoshima City (Q111372508) for work item and local history of Kagoshima City (Q111372556) for edition item) do not have constraint violations now. In the first phase, edition items will be created, with statements of instance of (P31)=version, edition or translation (Q3331189), title (P1476), country of origin (P495), publication date (P577), language of work or name (P407), document file on Wikimedia Commons (P996), DOI (P356), NDL Persistent ID (P9836), NDL Bib ID (P1054), JPNO (P2687). Other properties will depend on data source. If there is no interest in it, I will close this request in 2024. Midleading (talk) 15:38, 4 November 2024 (UTC)[reply]
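The first-phase statement set listed above could be assembled along these lines. This is a hedged sketch, not MidleadingBot's code: the property and item IDs are the ones named in this discussion, but the source record field names are hypothetical, and the result is a simplified property-to-value mapping rather than real Wikibase JSON.

```python
EDITION_QID = "Q3331189"  # version, edition or translation

# Hypothetical source-record field -> Wikidata property named in this request.
FIELD_TO_PROPERTY = {
    "title": "P1476",             # title
    "country": "P495",            # country of origin
    "publication_date": "P577",   # publication date
    "language": "P407",           # language of work or name
    "commons_file": "P996",       # document file on Wikimedia Commons
    "doi": "P356",                # DOI
    "ndl_persistent_id": "P9836", # NDL Persistent ID
    "ndl_bib_id": "P1054",        # NDL Bib ID
    "jpno": "P2687",              # JPNO
}


def build_edition_statements(record):
    """Map one source record to a simplified {property: value} dict."""
    statements = {"P31": EDITION_QID}  # every created item is an edition
    for field, prop in FIELD_TO_PROPERTY.items():
        value = record.get(field)
        if value:                      # omit properties the source lacks
            statements[prop] = value
    return statements
```

Because missing fields are simply omitted, "other properties will depend on data source" only requires extending the mapping table.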

I am fine with approving the bot, but I want to make sure there are no objections. Ymblanter (talk) 19:27, 5 November 2024 (UTC)[reply]

So9qBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: So9q (talk · contribs · logs)

Task/s: Add DDO identifier to Danish lexemes.

Code: https://github.com/dpriskorn/LexDDO

Function details: Checks whether there are multiple hits in DDO for a lemma. If yes, it is skipped. Checks whether there are multiple lexemes with the same lemma and lexical category in WD; if yes, it skips. Otherwise we have a match and the upload is done. If we get a 404 from DDO, a "not found in" + time statement is added. This is the easiest, low-hanging-fruit kind of matching. I vetted the edits and they seem good to me. See ~50 test edits here: https://www.wikidata.org/w/index.php?title=Special:Contributions/So9q&target=So9q&offset=20240105165217 --So9q (talk) 18:41, 5 January 2024 (UTC)[reply]
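The matching rules described above amount to a small decision function, sketched here for reviewers. It is not taken from the LexDDO repository: the names are hypothetical, and the order of the checks (Wikidata ambiguity first, then the 404 case) is an assumption, since the description does not say which takes precedence.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    ADD_ID = "add DDO identifier"
    ADD_NOT_FOUND = "add 'not found in' statement"


def decide(ddo_status, ddo_hits, wd_lexemes):
    """Pick an action for one lemma.

    ddo_status: HTTP status of the DDO lookup
    ddo_hits:   number of DDO entries found for the lemma
    wd_lexemes: number of Wikidata lexemes with this lemma and lexical category
    """
    if wd_lexemes != 1:
        return Action.SKIP           # ambiguous (or absent) on the Wikidata side
    if ddo_status == 404:
        return Action.ADD_NOT_FOUND  # DDO has no entry for the lemma
    if ddo_hits == 1:
        return Action.ADD_ID         # exactly one candidate on each side
    return Action.SKIP               # multiple DDO hits: needs a human
```

Only the one-to-one case results in an identifier being written, which is what keeps this kind of matching low-risk.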

What is this? Ymblanter (talk) 20:01, 11 January 2024 (UTC)[reply]
It is a placeholder. I add it when there are multiple choices for lexemes or no lexeme match, like in this case. If they were numbered (by a bot or a to-be-written user script, perhaps), one could read it as: in the second position, we don't know which lexeme corresponds. So9q (talk) 08:46, 7 October 2024 (UTC)[reply]
Are you still interested in the bot approval? Ymblanter (talk) 18:41, 8 October 2024 (UTC)[reply]
Yes, but I prefer that the community okay it first. Maybe @Fnielsen wants to support? So9q (talk) 15:07, 3 December 2024 (UTC)[reply]
@So9q, Ymblanter: This is fine by me. There is a Mix'n'Match-like tool here https://mishramilan.toolforge.org/#/catalogs/95 that also works on the DDO property. Finn Årup Nielsen (fnielsen) (talk) 21:46, 4 December 2024 (UTC)[reply]


So9qBot 8 (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: So9q (talkcontribslogs)

Task/s: Add missing names of European legal documents to labels and aliases of items with a CELEX identifier

Code: logic diagram, code

Function details: This is important for our coverage of EU legal documents. A bug is blocking creation of 50 test edits.--So9q (talk) 15:07, 17 December 2023 (UTC)[reply]

The bug has been fixed. See test edits So9q (talk) 17:41, 2 January 2024 (UTC)[reply]
@Samoasambia thanks for moving the test edits to title as suggested by the model and Ainali <3 So9q (talk) 08:56, 7 October 2024 (UTC)[reply]

Discussion

  •  Support looks useful, thanks! -Framawiki (please notify !) (talk) 14:34, 6 January 2024 (UTC)[reply]
  •  Question Wouldn't title (P1476) be better than official name (P1448)? (That is what we used for the Swedish parliamentarian documents.) Ainali (talk) 08:41, 11 January 2024 (UTC)[reply]
    Yes, thanks for the suggestion. So9q (talk) 08:49, 7 October 2024 (UTC)[reply]
  • @So9q: FYI, I created some data modeling for EU legal acts here. The EUR-Lex metadata is available through a SPARQL end point which gives us some additional data compared to scraping. –Samoasambia 18:38, 9 March 2024 (UTC)[reply]
    Oh, I was not aware of the WikiProject. Looks very nice and title is suggested there like Ainali did above. I'm not sure the SPARQL endpoint is needed nor desired for this task. I had a look back when I wrote this request and ditched it. Can't remember why, but this code works and is reasonably fast :) So9q (talk) 08:53, 7 October 2024 (UTC)[reply]
  • @Samoasambia, Ainali, Framawiki: I updated the code to use title. I also fixed a small bug which caused duplicate references when the script was rerunning. I also added editgroups so anyone can later undo the changes in bulk easily if needed. I'm ready to run it on all ~4000 items with CELEX id now.--So9q (talk) 21:32, 8 October 2024 (UTC)[reply]
    Are there some test edits with the updated code? Ainali (talk) 21:41, 8 October 2024 (UTC)[reply]
    I'm planning to add data to EU legal acts and to create new items via the EUR-Lex SPARQL endpoint but scraping the titles is fine for me. Makes my life a bit easier :). I'd still add stated in (P248) = EUR-Lex (Q1276282) to the references but otherwise looks great to me. Samoasambia 22:13, 8 October 2024 (UTC)[reply]
    Fixed, see Test edit.
    Note: no reference is added to existing title-statements (this is to avoid duplicate references with different dates on consecutive runs of the script).
    The script is idempotent. It only adds missing title-statements and never removes or changes existing statements.
    I added editgroups so a complete run of the script can be rolled back easily.--So9q (talk) 09:10, 18 October 2024 (UTC)[reply]
    I added extraction of "EUID" e.g. "(EU) 1979/110" from en descriptions in WD and add them as mul aliases. They make it easier to look up laws in Wikidata using the search bar and are used as IDs by e.g. the Swedish government. See test edit. So9q (talk) 12:16, 18 October 2024 (UTC)[reply]
    Looks good to me, So9q. However, there are some issues with the "EUID". The initialisms in the identifier stand for the legal domain under which the act was passed (European Union, European Economic Community, European Atomic Energy Community etc.). The current naming format of legal acts has been in use only since January 2015, so for example "(EU) 1979/110" is not correct, it should be "79/110/EEC" (in English, different in others). Since the Lisbon treaty most new acts have legal domain "EU" but some also have "EU, Euratom" or "CFSP". The legal domain abbreviations are language-specific, so while in English it's "EU", in French it's "UE" and in Irish "AE" etc. I added a table of all of them here. More information can be found at the Interinstitutional Style Guide.
    So I would recommend that the bot shouldn't add "EUIDs" with the legal domains to mul aliases because the format depends on language. However, adding only the year-and-number-part (e.g. "79/110", "2016/679") is fine and I support that. I have started working on Python code that would extract short labels for legal acts from the full titles in different languages using regex. Maybe we could work on that together if I add the code to GitHub? Samoasambia 19:38, 18 October 2024 (UTC)[reply]
    Oh, I was not aware that the EUID had a component that differs along both language and legal domains. Thanks for the table. I can use that to translate the legal domain part before adding the alias.
    This is becoming increasingly complicated. EU is so complicated :sweat smile:
    I dug a little and found a use of the "EUID" without the parentheses, "EU 2023/138", from a Swedish government agency.
    So now we have 5 different EUIDs used by governmental workers to refer to the same law:
    • long EUID with parens e.g. "(EU) 2023/138"
    • long EUID without parens e.g. "EU 2023/138"
    • short EUID without the legal domain e.g. "2023/138"
    • ELI IDs (we are missing a property, see Wikidata:Property proposal/European Legislation Identifier) (used in EUR-Lex, but not by e.g. the Swedish government)
    • CELEX ids (used in EUR-Lex and Cellar, but not by e.g. the Swedish government)
    So9q (talk) 12:24, 19 October 2024 (UTC)[reply]
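
    The first three spellings in the list above can be derived from one another. A minimal sketch, assuming the post-2015 English format "(EU) 2023/138" and a single all-caps legal domain; `EUID_RE` and `euid_variants` are hypothetical names, not taken from the bot's code, and older forms such as "79/110/EEC" are deliberately not handled:

```python
import re

# Matches the post-2015 English numbering, e.g. "(EU) 2023/138".
EUID_RE = re.compile(r"\((?P<domain>[A-Z]+)\)\s*(?P<year>\d{4})/(?P<number>\d+)")


def euid_variants(text: str) -> dict:
    """Extract the three EUID spellings from a description, or raise ValueError."""
    match = EUID_RE.search(text)
    if match is None:
        raise ValueError(f"no EUID found in {text!r}")
    short = f"{match['year']}/{match['number']}"
    domain = match["domain"]
    return {
        "long_with_parens": f"({domain}) {short}",
        "long_without_parens": f"{domain} {short}",
        "short": short,
    }
```

    Raising on anything the pattern does not recognise keeps unexpected formats visible instead of silently producing a wrong alias.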
    I added support for localized EUIDs according to the table provided by @Samoasambia and only add the "short EUID" to mul. I did not add support for Euratom and CFSP for now (I set the script to raise an exception if the EUID cannot be extracted and will implement it if needed when the script fails). See test edit
    Also added support for extracting and adding the localized "EECID" e.g. "80/1177/EEC" to aliases, see test edit
    @Ainali, @Samoasambia WDYT? :) --So9q (talk) 16:53, 19 October 2024 (UTC)[reply]
    Do we really need to add the same alias in multiple languages? If it exists in one language, it shows up in the search independent of what language one is using. Is there some added value for this that I am not seeing? Ainali (talk) 18:26, 19 October 2024 (UTC)[reply]
    It is the lightest way we have, so yes, it is necessary. If we added all the variants to mul as aliases instead, we would lose information. They are valid for each of the languages and deduplicated in the database, so there is nothing to worry about IMO. So9q (talk) 07:59, 20 October 2024 (UTC)[reply]
    I still have a couple of issues left. Firstly, I think we shouldn't use the full titles as labels; instead we should be using some sort of short titles. Unfortunately they are not directly available on EUR-Lex, but I did some regex magic for extracting them out of the full titles in all official languages. You can find it here. Currently it works in 22 out of 24 languages and for nearly all acts published since 1 January 2015. Adjusting it for earlier acts still needs some extra work. The second issue is that I don't think the "long EUID without parens" (e.g. EU 1980/1177) is anything official, so I wouldn't include that. EUR-Lex seems to use only the version with parens, and that is what the interinstitutional style guide says [4][5]. And finally I would put stated in (P248) before the URL in the references since it looks a bit nicer that way :). Otherwise looks good to me! Samoasambia 22:20, 28 October 2024 (UTC)[reply]
    I agree, short labels are nicer, thanks for working on that!
    I suggest we use the shortest. I know that "long EUID without parens" is not official, but it helps people who try to do entity recognition in case it is used in the wild, so I would still like to add it as an alias.
    Since your code does not work for all languages, how do you suggest we proceed? Should we proceed with what is currently working and add long labels for the ones where it does not? Or should we fix this first before proceeding?
    Could you detail how it fails so we can fix it?
    Is there a bug in the re module regarding IGNORECASE? Do you have a link to a bug report in that case? So9q (talk) 09:37, 27 November 2024 (UTC)[reply]
    @Samoasambia I added your logic to the Title class and added some tests too. It currently only seems to fail for Greek. Which other language doesn't work as expected?
    Would you be willing to provide a regex for Greek that works around the IGNORECASE bug? So9q (talk) 19:05, 27 November 2024 (UTC)[reply]
  • @Ymblanter: ready for approval?--So9q (talk) 21:34, 25 October 2024 (UTC)[reply]
    I will wait for a few days to see whether there are objections. Ymblanter (talk) 19:34, 26 October 2024 (UTC)[reply]

HVSH-Bot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Historischer_Verein_SH (talkcontribslogs)

Task/s: Import data about politicians from the Database of important Persons from Schaffhausen (Q119949776), now only partially available online

Code: N/A

Function details: Import of reconciled data from OpenRefine with given name, family name, date of birth, date of death, place of origin, sex or gender, position held, language spoken, country of citizenship. --HVSH-Bot (talk) 12:37, 31 December 2023 (UTC)[reply]

Could you explain the logic using a PlantUML activity diagram? Could you make 50 test edits and link them here? So9q (talk) 10:35, 2 January 2024 (UTC)[reply]


RudolfoBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: RudolfoMD (talkcontribslogs)

Task/s: Importing list of Drugs With Black Box Warnings; setting Property / legal status (medicine): boxed warning.

Code: N/A

Function details: Continue importing FDA list of Drugs With Black Box Warnings, as I've been doing, with OpenRefine. Ideally hope to create or have someone run a bot to maintain the data.

OpenRefine urges me to submit large edit batches for review. I've done ~400 in batches of ~200.
I want to do more, like https://www.wikidata.org/w/index.php?title=Q7939256&diff=prev&oldid=2019984699&diffmode=source.
This is what's set:
Property / legal status (medicine): boxed warning / rank
Property / legal status (medicine): boxed warning / reference
 reference URL: https://nctr-crs.fda.gov/fdalabel/ui/spl-summaries/criteria/343802
 title: FDA-sourced list of all drugs with black box warnings (Use Download Full Results and View Query links. (English)

Want to match more widely - on Q113145171, which has ~500 matches, and the other types which match and are drugs of some kind listed below.
Table has ~1600 rows, and the bulk have a matching drug in wikidata already.

Types: 
Q113145171 type of chemical entity (658)
Q59199015 group of stereoisomers (51)
Q12140 medication (DONE: first extract, I think; need to redo to add cites)
Q169336 mixture (45)
Q79529 chemical substance (40)
Q1779868 combo drug (28)
Q35456 essential med (13)
Q119892838 type of mixture of chem (3)
Q28885102 pharm prod (3)
Q467717 racemate (3)
Q8054 protein (biomolecule) (4)
Q422248 mab (12)
Q679692 biopharmaceutical (6)
Q213901 gene therapy (4)
Q2432100 vet drug (3)

I do not want to do for types 
Q13442814 article (NO)
Q30612 clinical trial (NO)
Q7318358 review article (NO)
Q16521 taxon (NO?) 

--RudolfoMD (talk) 09:29, 29 November 2023 (UTC)[reply]

  •  Comment Looks useful! Can we see some test edits with the actual bot code to be used?

GamerProfilesBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Parnswir (talkcontribslogs)

Task/s: Backfill GamerProfiles game IDs (P12001)

Code: https://github.com/GP-9000/GamerProfilesBot

Function details: The bot will regularly update existing video games with the GamerProfiles game ID (P12001) sourced from https://gamerprofiles.com. We plan to update the initial batch of around 55,000 games within a month of approval and then switch to a more relaxed (on-demand) update process.

--Parnswir (talk) 11:05, 5 October 2023 (UTC)[reply]

@Parnswir: Is Master Jaro (talkcontribslogs) also your account (uses "we", see Special:Diff/1960163586, Special:Diff/1968406273) or is it another employee? If so, he/she should also disclose the paid editing. Regards Kirilloparma (talk) 06:32, 10 November 2023 (UTC)[reply]
@Kirilloparma @Lymantria Thank you for the info everyone! I didn't know about the "paid contributions" info before. And yes, I am a different person :) Since high-quality edits are also in the interest of the company, I have added the paid contributions template to my page as well now. Just let me know if anything else is missing. I've learned quite a bit over the last months, and will keep doing my best to produce helpful edits. Master Jaro (talk) 15:33, 10 November 2023 (UTC)[reply]
Please make 50 test edits and link them here. So9q (talk) 10:38, 2 January 2024 (UTC)[reply]
The contributions were already made on October 5th 2023: https://m.wikidata.org/wiki/Special:Contributions/GamerProfilesBot Parnswir (talk) 16:40, 2 January 2024 (UTC)[reply]
@Kirilloparma @Jean-Frédéric @BrokenSegue @Lymantria @So9q Thank you for your efforts everyone! Is there anything more we can do to help move this project forward? We would love to add more of the relevant IDs next to the other game edits we make along the way. Any help is highly appreciated :) Master Jaro (talk) 16:35, 27 March 2024 (UTC)[reply]

MangadexBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Binarycat64 (talkcontribslogs)

Task/s: add metadata from mangadex to manga with Mangadex manga ID

Code: not yet implemented

Function details: many manga items have a mangadex id specified, but not the id for other sites (MangaUpdates, Kitsu, AniList). however, this data exists on mangadex, so this bot would simply copy over the data.

the initial scope is quite small, only focusing on ID tags. --Binarycat64 (talk) 18:01, 6 August 2023 (UTC)[reply]

I'm concerned you are too inexperienced with wikidata (<500 edits) to be granted bot permissions. At the very least I'm going to need to see some test edits. BrokenSegue (talk) 18:39, 6 August 2023 (UTC)[reply]
I can certainly provide test edits if that's your concern. I will also adjust my code according to any reasonable concerns that are raised.
I'll start working on the code, seeing as there are no objections to the goal of the bot.
Is there a certain way I should do test edits? I can test most of the functionality on sandbox items, but I need a query endpoint to test the functionality of finding items to update, and test.wikidata.org doesn't seem to provide that. Binarycat64 (talk) 16:23, 7 August 2023 (UTC)[reply]
it is ok to make a small number of test edits on main wikidata using the bot account before approval. just make sure it is relatively few at low speed. BrokenSegue (talk) 06:28, 8 August 2023 (UTC)[reply]
Please, do so, let your bot make some test edits. --Lymantria (talk) 15:41, 17 September 2023 (UTC)[reply]
I implemented something very similar last year: https://github.com/PythonCoderAS/wikidata-anime-import
I'll take some time and revive the codebase, as I've taken an extended break but am ready to come back again. RPI2026F1 (talk) 16:22, 25 January 2024 (UTC)[reply]

WingUCTBOT (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Tadiwa Magwenzi (talkcontribslogs)

Task/s: Batch Upload of Niger-Congo B Lexemes , including Senses and Forms.

Code:https://github.com/Boomcarti/WingUCTBOT

Function details: Upload of 550 isiZulu Nouns as Lexemes, Including their associated Forms and Senses. --WingUCTBOT (talk) 10:07, 31 July 2023 (UTC)[reply]

Please make some test edits. Ymblanter (talk) 19:19, 7 August 2023 (UTC)[reply]
Greetings! I hope you are well. I have performed 200 test edits, as seen on the Test Wikidata site, and am awaiting approval to split the 500 isiZulu nouns into batches and then upload them. WingUCTBOT (talk) 23:14, 15 August 2023 (UTC)[reply]
I am sorry but could you please provide a link to the test edits on Testwiki. Ymblanter (talk) 18:17, 7 September 2023 (UTC)[reply]
I've just redone about 250 test edits; they are on the TestWikidata recent changes page. Some examples: https://test.wikidata.org/wiki/Lexeme:L3768 , https://test.wikidata.org/wiki/Lexeme:L3753 . The link to the page: Recent changes - Wikidata . WingUCTBOT (talk) 18:14, 9 September 2023 (UTC)[reply]
I took a quick look at the code. Are you aware of the Python library WikibaseIntegrator, which supports lexemes?
I prefer if you would use that or a similar library to make sure you honor the max edit thing on the servers.
Would you be willing to do that? So9q (talk) 10:50, 2 January 2024 (UTC)[reply]


The Lexemes were sourced manually by Professor M.Keet and Langa Khumalo.

https://github.com/mkeet/GENIproject/tree/master/isiZulupluraliser/isiZulu

@WingUCTBOT, Tadiwa Magwenzi: Your code appears to add the same sense multiple times and, among forms, adds the plural of a noun multiple times without including a form for the singular. (You may wish to consider using tfsl for your import; once it is installed, an overview of how it is used may be found here.) Mahir256 (talk) 00:05, 16 August 2023 (UTC)[reply]
Understood, will fix it now. WingUCTBOT (talk) 17:21, 16 August 2023 (UTC)[reply]
Good evening. I have addressed your concerns with the code and have uploaded a test batch of 50+ lexemes (isiZulu nouns, along with their senses and forms). WingUCTBOT (talk) 22:36, 16 August 2023 (UTC)[reply]
In time, I do intend to refactor the code to use tfsl. WingUCTBOT (talk) 23:09, 16 August 2023 (UTC)[reply]

MajavahBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Taavi (talkcontribslogs)

Task/s: Import version and metadata information for Python libraries from PyPI.

Code: https://gitlab.wikimedia.org/toolforge-repos/majavah-bot-wikidata/-/blob/main/majavah_wd_bot/pypi_sync/main.py

Function details: For items with PyPI project (P5568) set, imports the following data from PyPI:

Additionally the PyPI project (P5568) value will be updated to the normalized name if it's not already in that form.

Taavi (talk) 19:54, 11 July 2023 (UTC)[reply]

how many statements do you think this will add? don't some packages have...lots of versions? BrokenSegue (talk) 20:05, 11 July 2023 (UTC)[reply]
Good point. There are about 200k releases it could import (for about 2k packages total, so about 90 per package on average). Taking an approach similar to github-wiki-bot and only importing the most recent ones could bring it down to 75k for the last 100 (33 per package on average) or 50k for the last 50 (22 per package on average). Taavi (talk) 20:50, 11 July 2023 (UTC)[reply]
i don't suppose major releases only is an option? BrokenSegue (talk) 20:54, 11 July 2023 (UTC)[reply]
I don't think there's a consistent enough definition for that. For example Home Assistant (Q28957018) now does year.month.patch type releases so the first digit changing isn't really meaningful.
However I can filter out all packages generated from https://github.com/vemel/mypy_boto3_builder, as those are all very similar and not intended for direct human use anyway. That cuts the total number of versions to a third (~70k) even before doing any other per-package limits. Taavi (talk) 21:15, 11 July 2023 (UTC)[reply]
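
The two limits discussed above (dropping generated packages and keeping only the last N releases per package) could look roughly like this. It is a sketch over in-memory data, not the bot's code: `select_releases`, `excluded_packages` and the `(version, upload date)` tuples are hypothetical, and ordering releases by ISO 8601 upload date rather than by version number is an assumption.

```python
from typing import Dict, Iterable, List, Tuple

Release = Tuple[str, str]  # (version string, ISO 8601 upload date)


def select_releases(
    releases_by_package: Dict[str, List[Release]],
    excluded_packages: Iterable[str] = (),
    keep_last: int = 50,
) -> Dict[str, List[Release]]:
    """Keep only the `keep_last` most recently uploaded releases per package."""
    excluded = set(excluded_packages)
    selected = {}
    for package, releases in releases_by_package.items():
        if package in excluded:
            continue  # e.g. generated packages that should not be imported
        # ISO 8601 dates in one consistent format sort correctly as strings.
        newest_first = sorted(releases, key=lambda r: r[1], reverse=True)
        selected[package] = newest_first[:keep_last]
    return selected
```

How generated packages are recognised is left open here; the caller supplies the names to exclude.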
See also Wikidata:Requests for permissions/Bot/RPI2026F1Bot 5 for discussion of a previous similar task (seems not active) and Github-wiki-bot imports version data from GitHub (see e.g. history of modelscope (Q120550399)); however you should care that version numbers may be different between GitHub and PyPI.--GZWDer (talk) 11:38, 12 July 2023 (UTC)[reply]
Oh yes, the RPI2026F1Bot task looks somewhat similar. I'm aware of Github-wiki-bot, but there are quite a few PyPI projects that are not hosted on GitHub, and I think my code should be able to handle items with data from both and ensure the two bots don't start edit warring for example. Taavi (talk) 17:23, 12 July 2023 (UTC)[reply]
@Taavi: Please make some test edits. --Wüstenspringmaus talk 11:05, 29 August 2024 (UTC)[reply]


FromCrossrefBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Carlinmack (talkcontribslogs)

Task/s: Using information from Crossref:

  1. Add publication date to items where they are not present in Wikidata
  2. Fix publication dates where they are erroneous

Code: Will be using Pywikibot in a similar way as I have done previously with this bot

Function details: Previously this bot has been used to add CC licenses to items, which has been successful. In March 2022 it was realised that other bots/tools were using the wrong date for publication date in Crossref. Since I am working with this dump, I will step up to try to fix this issue.

A simpler task is to fill in the details for items without publication dates. I've created a set of 80k items and once given the go-ahead I will contribute these dates.

The issue of the wrong dates is a little more complicated as there are some false positives on both sides of this, sometimes Crossref is wrong and sometimes Wikidata is wrong. I'm sure that Wikidata is wrong more often, however before doing any edits I will do some manual validation to check the prevalence of false positives. When I am fairly confident I will start editing and I'll see whether I can deprecate the existing statement, add a reason and add the new date as preferred. If not, due to limitations in Pywikibot, I'll remove the previous statement instead. --Carlinmack (talk) 14:31, 7 July 2023 (UTC)[reply]

  •  Support This seems useful. However I see only one example edit for this so far, maybe you should do some more just to verify it's doing what we expect? You will be using the "published" date-parts data in the Crossref json files for this? If an item already has the correct published date value will you add the reference? Maybe that should only be done if the published date doesn't already have a reference though... ArthurPSmith (talk) 18:17, 24 July 2023 (UTC)[reply]
Pls make some test edits.--Ymblanter (talk) 15:53, 9 August 2023 (UTC)[reply]
@User:Carlinmack: What about "erroneous" in Crossref and corrected in WD? --Succu (talk) 20:19, 7 November 2023 (UTC)[reply]

UrbanBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Urban Versis 32 (talkcontribslogs) Note: A discussion at Wikipedia about this bot took place: Wikipedia:Village_pump_(proposals)/Archive_202#Bot_to_add_short_descriptions_to_articles_in_a_category

Task/s: UrbanBot's task is to mass-add English descriptions to items that don't have one.

Code: Main repository for UrbanBot's code Source code file for task

Function details:

1. The bot operator will first enter a category name from the English Wikipedia. This category will be used to group similar pages (items on Wikidata) which will all have the same description added to them.

2. The bot operator will enter the description to be added to the pages in the Wikipedia category.

3. The bot will follow through these steps for each page:

3a. The bot will check if the Wikipedia page has a corresponding item.

3b. The bot will check if the item already has a description

3c. If the Wikipedia page has a corresponding item and the item does not already have a description, the bot will write the description specified by the bot operator in step 2 into the item.

3d. The bot will loop through to the next page in the category and run all steps in step 3 again.

Due to the bot requiring the bot operator to enter in the English Wikipedia category and the description for the items, the bot is semi-automated. I have already done the aforementioned process using the bot to add descriptions to items a few times to make sure the code was working properly.
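
Steps 3a to 3d can be sketched as a single loop. This is an illustration over in-memory stand-ins rather than UrbanBot's actual code: `add_descriptions`, `page_to_item` and `item_descriptions` are hypothetical names, and a real bot would read and write through a library such as Pywikibot instead of dictionaries.

```python
from typing import Dict, Iterable, List, Optional


def add_descriptions(
    pages: Iterable[str],
    description: str,
    page_to_item: Dict[str, str],
    item_descriptions: Dict[str, Optional[str]],
) -> List[str]:
    """Apply steps 3a-3d to in-memory stand-ins and return the edited item IDs."""
    edited = []
    for page in pages:  # 3d: loop over every page in the category
        item_id = page_to_item.get(page)
        if item_id is None:  # 3a: the page has no corresponding item
            continue
        if item_descriptions.get(item_id):  # 3b: a description already exists
            continue
        item_descriptions[item_id] = description  # 3c: write the description
        edited.append(item_id)
    return edited
```

Existing descriptions are never overwritten, so running the loop twice over the same category makes no further changes.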

Thanks, Urban Versis 32KB(talk | contribs) 16:04, 29 June 2023 (UTC)[reply]

 Support This sounds fine as long as you are aware of Wikidata's style guide for descriptions. Confirm that you've read Help:Description? BrokenSegue (talk) 16:24, 29 June 2023 (UTC)[reply]
@BrokenSegue Yes, I have and am aware of the formatting of descriptions. Urban Versis 32KB(talk | contribs) 03:55, 10 July 2023 (UTC)[reply]
 Support Looks fine to me too, at least if you'll be following pretty much the pattern you've tested with. One note - two items with the same primary label cannot have the same description string in Wikidata; I'm not sure if your bot would ever run into that but it might be an error condition you'll have to check for. ArthurPSmith (talk) 20:33, 29 June 2023 (UTC)[reply]
 Comment Another approach might be to add the short descriptions to enwiki, which are then automatically copied over here by Pi bot. That might help reduce the number of differences of descriptions here and there in the longer term. Thanks. Mike Peel (talk) 16:46, 3 July 2023 (UTC)[reply]
the style guides for wikipedia/wikidata descriptions are not the same though BrokenSegue (talk) 17:24, 5 July 2023 (UTC)[reply]
@Mike Peel Actually, this was my original plan and I discussed it at Wikipedia:Village_pump_(proposals)/Archive_202#Bot_to_add_short_descriptions_to_articles_in_a_category but it was suggested that I bring it here, as the bot would mainly edit Wikidata and editing Wikipedia would only create extra steps. Urban Versis 32KB(talk | contribs) 03:57, 10 July 2023 (UTC)[reply]
@BrokenSegue, Urban Versis 32: Those are both problems that should be fixed. English Wikipedia seems to want the extra steps; it would be useful if they didn't contradict themselves... Thanks. Mike Peel (talk) 21:24, 12 July 2023 (UTC)[reply]
those won't be fixed in this request for permission. BrokenSegue (talk) 22:47, 12 July 2023 (UTC)[reply]
@Mike Peel Not sure what you mean by English Wikipedia wanting the extra steps, but if an en-wiki article is linked to a Wikidata item with a description, the description takes the place of a short description on Wikipedia. For example, viewing this Wikipedia category with the shortdescs-in-category tool will reveal that some articles have a locally-added short description whereas one page doesn't have a short description but its corresponding Wikidata item did have a description, which took the place of a Wikipedia short description. Urban Versis 32KB(talk | contribs) 22:50, 13 July 2023 (UTC)[reply]
@Mike Peel Actually, I stand corrected. I was looking through the en-wiki Wikiproject Short Descriptions (link here) and it looks like Wikidata descriptions are actually not really used as a replacement for a Wikipedia short description. Therefore, I think I will submit a bot request to en-wiki as you were correct about Short descriptions being a much higher priority on Wikipedia compared to Wikidata descriptions. I will leave this request up however, in case I run into people saying the same thing at Wikipedia as they did before. After the bot (hopefully) gets approved, I will take this one down. Thanks again, Urban Versis 32KB(talk | contribs) 02:40, 15 July 2023 (UTC)[reply]

ACMIsyncbot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Pxxlhxslxn (talkcontribslogs)

Task/s: Sync links with ACMI API.

Code: https://github.com/ACMILabs/acmi-wikidata-bot/blob/main/acmi_bot.py

Function details: As part of an upcoming residency with the ACMI (Q4823962) I have written a small bot to pull Wikidata links from their public API and write back to Wikidata to ensure sync between the two resources. The plan was to integrate this as part of the build workflow for the ACMI API (https://github.com/ACMILabs/acmi-api). This is currently set to append only, not removing any links Wikidata-side. While the initial link count is only around 1500, there will likely be significant expansion in the coming weeks as we identify further overlaps. --Pxxlhxslxn (talk) 00:36, 16 May 2023 (UTC)[reply]

can you add a reference? can you set an edit summary (just add a "summary" arg to the write call)? Otherwise looks good. BrokenSegue (talk) 01:23, 16 May 2023 (UTC)[reply]
Oh dear, I have tried to change the bot name and now I see I have screwed things up a bit in relation to this form (ie the discussion is still under the old name). Should I just open a new request? I have also added the edit summary to the write function. Pxxlhxslxn (talk) 10:48, 16 May 2023 (UTC)[reply]
No need to open a new request as far as I am concerned. Ymblanter (talk) 19:06, 17 May 2023 (UTC)[reply]
We have now finished the test sample group for the bot and it is working as expected. Are there any other requirements or impediments to being added to the "bot" group? I also had a question about something we have encountered: code and credentials work fine when run alone as a standalone Python process, but when integrated as a GitHub Action (triggered by the ACMI API build) there is a "wikibaseintegrator.wbi_exceptions.MWApiError: 'You do not have the permissions needed to carry out this action.'" error message. Has anyone ever encountered this issue before? The only factor I can think of is maybe some kind of IP block. --Pxxlhxslxn (talk) 11:52, 2 June 2023 (UTC)[reply]
I don't think it's an IP block. BrokenSegue (talk) 20:40, 22 June 2023 (UTC)[reply]

WikiRankBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)

Operator: Danielyepezgarces (talkcontribslogs)

Task/s: Use Alexa rank (P1661)

Code: I will publish the code soon

Function details: I am making a bot that can track the monthly ranking of websites based on Similarweb Ranking. The bot will receive a list of websites with their corresponding Wikidata IDs and domains to keep the data accurate.

The bot will have to use the Similarweb Top Sites API to get the traffic ranking of each website and store it in a MySQL database along with the date of the ranking. If the website already exists in the database, the bot should update its ranking and date every time there is a new ranking update.
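
The "insert, or update if the website already exists" behaviour described above can be sketched as follows. The request mentions MySQL; this sketch swaps in Python's built-in `sqlite3` so that it is self-contained, and the table layout, the column names and `store_rank` are assumptions made for illustration only.

```python
import sqlite3


def store_rank(conn: sqlite3.Connection, qid: str, domain: str, rank: int, date: str) -> None:
    """Insert the ranking for a site, or overwrite the stored one for that item."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ranks ("
        "qid TEXT PRIMARY KEY, domain TEXT, site_rank INTEGER, rank_date TEXT)"
    )
    # The primary key on qid makes OR REPLACE overwrite the previous row.
    conn.execute(
        "INSERT OR REPLACE INTO ranks (qid, domain, site_rank, rank_date) "
        "VALUES (?, ?, ?, ?)",
        (qid, domain, rank, date),
    )
    conn.commit()
```

With this layout only the latest ranking per item is kept; keeping a history would need the date as part of the key.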

Soon the bot will include some new features that will be communicated in the future.

The Similarweb ranking is not this property. It is Similarweb ranking (P10768).--GZWDer (talk) 05:16, 12 May 2023 (UTC)[reply]
That is correct: the bot uses property P10768 and replaces the old property P1661, since the public Alexa Rank data has ceased to exist.
When I wrote "Similarweb Ranking" I did not mean property P10768, but that the bot takes its data from the similarweb.com website. Danielyepezgarces (talk) 16:15, 17 May 2023 (UTC)[reply]
what edits is this bot making? BrokenSegue (talk) 15:59, 22 February 2024 (UTC)[reply]

ForgesBot (talkcontribsnew itemsnew lexemesSULBlock logUser rights logUser rightsxtools)
Operator: Dachary (talkcontribslogs)

Task/s: Add licensing information to software forges entries in accordance with what is found in the corresponding Wikipedia page. It is used as a helper in the context of the Forges project

Code: https://lab.forgefriends.org/friendlyforgeformat/f3-wikidata-bot/

Function details: ForgesBot is a CLI tool designed to be used by participants in the Forges project in two steps. First it is run to do some sanity checks, such as verifying that forges are associated with a license. If some information is missing, the participant can add it manually or use ForgesBot to do so.

The implementation includes one plugin for each task. There is currently only one plugin to verify and edit the license information. The license is deduced by querying the wikipedia pages of each software: if they consistently mention the same license the edit can be done. If there are discrepancies they are reported and no action is done.
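
The "edit only if the pages agree" rule could be expressed like this. It is a sketch with a hypothetical helper, `consensus_license`, not the plugin's code, and it assumes that pages which mention no license at all are ignored rather than counted as a disagreement.

```python
from typing import Iterable, Optional


def consensus_license(licenses: Iterable[Optional[str]]) -> Optional[str]:
    """Return the license if every page that names one agrees, else None."""
    found = {lic for lic in licenses if lic}
    if len(found) == 1:
        return next(iter(found))
    return None  # nothing found, or the pages disagree: report, do not edit
```

A `None` result covers both the "no information" and the "discrepancy" case, so the caller reports it and makes no edit.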

--Dachary (talk) 09:29, 26 April 2023 (UTC)[reply]

I don't think I understand the task. Can you do some (~30) test edits? Or try to explain again? BrokenSegue (talk) 17:13, 26 April 2023 (UTC)[reply]

IngeniousBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Premeditated (talk · contribs · logs)

Task/s: Adding identifiers to album items, based on existing identifiers.

Code:

Function details: Adding Spotify album ID (P2205), Apple Music album ID (U.S. version) (P2281), YouTube playlist ID (P4300), SoundCloud ID (P3040), Pandora album ID (P10138), Amazon Standard Identification Number (P5749), Tidal album ID (P4577), Deezer album ID (P2723), Yandex Music release ID (P2819), Anghami album ID (P10972), Boomplay album ID (SOON), and Napster album ID (SOON). Based on previously mentioned properties. --Premeditated (talk) 16:29, 22 March 2023 (UTC)[reply]

can you go into more detail about how this lookup will be done? link to some test edits? BrokenSegue (talk) 16:36, 22 March 2023 (UTC)[reply]
@BrokenSegue: Test edits. Lookups are based on a given album identifier, for example Spotify album ID (P2205). UPC, Spotify artist ID (P1902), artist name, number of tracks, names of tracks, ISRC (P1243), and more are compared and looked up via other streaming services' APIs/scraping to match "identical" releases. I have made a scoring system where only releases that score 80% or better are added by the bot. The matches that do not get published will be saved to a file to be added to Mix'n'match later, maybe. - Premeditated (talk) 23:50, 22 March 2023 (UTC)[reply]
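
For readers following the thread, a scoring system of this kind could look roughly like the sketch below. It is not the operator's code, which has not been published here: the field names, the weights, and the use of set overlap for track names and ISRCs are all assumptions made for illustration. Only the 80% cut-off comes from the comment above.

```python
# Hypothetical weights, in points out of 100.
WEIGHTS = {"upc": 40, "artist": 20, "track_count": 10, "track_names": 20, "isrcs": 10}
THRESHOLD = 80  # the 80% cut-off mentioned in the discussion


def _norm(text):
    """Case-fold and collapse whitespace so trivial differences do not count."""
    return " ".join(text.casefold().split())


def _jaccard(a, b):
    """Share of common elements between two collections (0.0 to 1.0)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def match_score(source, candidate):
    """Score how well a candidate release matches the source release."""
    score = 0.0
    if source["upc"] and source["upc"] == candidate["upc"]:
        score += WEIGHTS["upc"]
    if _norm(source["artist"]) == _norm(candidate["artist"]):
        score += WEIGHTS["artist"]
    if len(source["tracks"]) == len(candidate["tracks"]):
        score += WEIGHTS["track_count"]
    score += WEIGHTS["track_names"] * _jaccard(
        map(_norm, source["tracks"]), map(_norm, candidate["tracks"])
    )
    score += WEIGHTS["isrcs"] * _jaccard(source["isrcs"], candidate["isrcs"])
    return score


def decide(source, candidate):
    """Publish matches at or above the threshold; hold the rest for review."""
    return "publish" if match_score(source, candidate) >= THRESHOLD else "hold"
```

Whether such weights are meaningful would still have to be checked against a hand-verified sample, which is the question raised in the replies.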
I believe you are misusing the inferred from (P3452) property. Look at the description of that property in English. Please go and fix all the test edits you made. Maybe you want stated in (P248) or similar.
I think you should add a based on heuristic (P887) statement in the reference? Maybe to record linkage (Q1266546) or similar. This whole workstream seems really similar to what is/was being done by User:Soweego bot. Can you explain how you are different/the same. Maybe we should get input from @Hjfocs:.
Can you go into more detail about what is creating these scores? How did you verify the scores are meaningful? What kind of model are you using? Is your source code available? What " looked up on other streaming services API/scrapping to match "identical" relases " are you using. Etc. BrokenSegue (talk) 16:59, 23 March 2023 (UTC)[reply]
Hey folks, happy to give my 2 cents. I second BrokenSegue's comments: (based on heuristic (P887), record linkage (Q1266546)) reference nodes sound good. @Premeditated: interesting project: it would be great if you could share the code and tell us something more about it. Cheers, Hjfocs (talk) 22:57, 25 March 2023 (UTC)[reply]
What is the situation here?--Ymblanter (talk) 19:04, 23 June 2023 (UTC)[reply]

LucaDrBiondi@Biondibot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: LucaDrBiondi (talk · contribs · logs)

Task/s: Import US patents from a CSV file

For example:

US11387028; Unitary magnet having recessed shapes for forming part of contact areas between adjacent magnets ;Patent number: 11387028;Type: Grant ;Filed: Jan 18, 2019;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210218300;Assignee Whylot SAS (Cambes) Inventors: Romain Ravaud (Labastide-Murat), Loic Mayeur (Saint Santin), Vasile Mihaila (Figeac) ;Primary Examiner: Mohamad A Musleh;Application Number: 16/769,182

US11387027; Radial magnetic circuit assembly device and radial magnetic circuit assembly method ;Patent number: 11387027;Type: Grant ;Filed: Dec 5, 2017;Date of Patent: Jul 12, 2022;Patent Publication Number: 20200075208;Assignee SHENZHEN GRANDSUN ELECTRONIC CO., LTD. (Shenzhen) Inventors: Mickael Bernard Andre Lefebvre (Shenzhen), Gang Xie (Shenzhen), Haiquan Wu (Shenzhen), Weiyong Gong (Shenzhen), Ruiwen Shi (Shenzhen) ;Primary Examiner: Angelica M McKinney;Application Number: 16/491,313

US11387026; Assembly comprising a cylindrical structure supported by a support structure ;Patent number: 11387026;Type: Grant ;Filed: Nov 21, 2018;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210183551;Assignee Siemens Healthcare Limited (Chamberley) Inventors: William James Bickell (Witney), Ashley Fulham (Hinkley), Martin Gambling (Rugby), Martin Howard Hempstead (Ducklington), Graeme Hyson (Milton Keynes), Paul Lewis (Witney), Nicholas Mann (Compton), Michael Simpkins (High Wycombe) ;Primary Examiner: Alexander Talpalatski;Application Number: 16/771,560
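
Records in this shape can be split into fields with a few lines of Python. The sketch below is an illustration only, not the operator's importer: the function name and the output keys are hypothetical, and it assumes titles never contain a semicolon and that the assignee and inventors always share one field, as in the three examples above. Date parsing relies on English month abbreviations.

```python
import re
from datetime import datetime

# The assignee and the inventors share one field and only "Inventors" has a colon.
ASSIGNEE_RE = re.compile(r"^Assignee (?P<assignee>.*?) Inventors: (?P<inventors>.*)$")

SAMPLE = (
    "US11387028; Unitary magnet having recessed shapes for forming part of "
    "contact areas between adjacent magnets ;Patent number: 11387028;Type: Grant ;"
    "Filed: Jan 18, 2019;Date of Patent: Jul 12, 2022;"
    "Patent Publication Number: 20210218300;"
    "Assignee Whylot SAS (Cambes) Inventors: Romain Ravaud (Labastide-Murat), "
    "Loic Mayeur (Saint Santin), Vasile Mihaila (Figeac) ;"
    "Primary Examiner: Mohamad A Musleh;Application Number: 16/769,182"
)


def parse_patent_line(line):
    """Split one semicolon-separated patent record into a dict."""
    fields = [part.strip() for part in line.split(";")]
    record = {"id": fields[0], "title": fields[1]}
    for field in fields[2:]:
        match = ASSIGNEE_RE.match(field)
        if match:
            record["Assignee"] = match.group("assignee")
            record["Inventors"] = match.group("inventors")
            continue
        key, sep, value = field.partition(":")
        if sep:  # skip fragments that are not "Key: value"
            record[key.strip()] = value.strip()
    # Convert the two date fields, e.g. "Jan 18, 2019", to date objects.
    for key in ("Filed", "Date of Patent"):
        if key in record:
            record[key] = datetime.strptime(record[key], "%b %d, %Y").date()
    return record
```

parse_patent_line(SAMPLE) gives a dict whose "Patent number" value could then feed the patent number statement, with the assignee string left for matching against existing items.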


Code:

I would like to learn to write my own bot to perform this operation. I am using cURL from C and I have a bot account (for which I am now requesting permission), but I get the following error message:

{"login":{"result":"Failed","reason":"Unable to continue login. Your session most likely timed out."}} {"error":{"code":"missingparam","info":"The \"token\" parameter must be set.","*":"See https://www.wikidata.org/w/api.php for API usage.

I think my bot account is probably not approved yet...

Function details:

Import items into Wikidata, starting with the title, the description, and these properties for now:

P31 (instance of): "United States patent"; P17 (country): "United States"; P1246 (patent number): "link to Google Patents or similar" --LucaDrBiondi (talk) 18:25, 28 February 2023 (UTC)[reply]

@LucaDrBiondi How many patents are you planning to add this way? ChristianKl 12:33, 17 March 2023 (UTC)[reply]
The bot account to which you link doesn't exist. ChristianKl 12:34, 17 March 2023 (UTC)[reply]


Hi, I am still writing and testing it; moreover, it is not yet a bot, because it is not automatic.

I have imported the patent data into a SQL Server database. I then read a patent and, with pywikibot, try for example to look up the assignee (the "owned by" property). If I do not find a match, I search manually. I insert the data into Wikidata only when I am sure, because I do not want to add data with errors. For example, look at item Q117193724. LucaDrBiondi (talk) 18:27, 17 March 2023 (UTC)[reply]




@ChristianKl
In the end I have developed a bot using pywikibot.
It is not fully automatic because the property Owned_id is mandatory for me.
So I check whether Wikidata already has an item to use for this property.
If I do not find one, I do not import the item (the patent).
I have already loaded a few hundred items, for example Q117349404.
Is there a limit on the number of items I can import each day?
At one point I received a warning message from the API.
Do I need to do something with my bot account?
Thank you for your help! LucaDrBiondi (talk) 16:08, 31 March 2023 (UTC)[reply]

Kalliope 7.3 (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Push-f (talk · contribs · logs)

Task/s: Update User:Kalliope 7.3/List of bots every hour.

Code: https://git.push-f.com/wikidata-bots/tree/bots.py

Function details:

I am planning on adding more features e.g. adding a parameter to {{Bot}} to allow bots to define which properties they edit and then generating a table like:

Property Bot
software version identifier (P348) Github-wiki-bot

but I still have to implement that.

--Push-f (talk) 09:16, 7 December 2022 (UTC)[reply]

@Push-f: You do not need the bot right if the bot only edits subpages of your or your bot's user pages.--GZWDer (talk) 09:26, 7 December 2022 (UTC)[reply]
@GZWDer: I think I do need something because any attempt to edit a subpage via the API is failing with a Captcha (and I did confirm the email address for the account). --Push-f (talk) 14:12, 7 December 2022 (UTC)[reply]
You need a confirmed flag for this. GZWDer (talk) 14:13, 7 December 2022 (UTC)[reply]
Ah ok thanks, then I hereby request the "confirmed" right for my bot. --Push-f (talk) 14:29, 7 December 2022 (UTC)[reply]
confirmed flags are requested at Wikidata:Requests for permissions/Other rights. BrokenSegue (talk) 06:45, 8 December 2022 (UTC)[reply]
oh I see you figured that out. never mind. BrokenSegue (talk) 06:45, 8 December 2022 (UTC)[reply]
@Push-f is this request still relevant? I saw that you got the confirmed flag for the bot at one point but the bot hasn't run since 2023. If you'd like to have a permanent confirmed flag (at least until the account gets autoconfirmed) and then close this request since you don't really need bot approval if you just edit your subpage, we can do that --DannyS712 (talk) 07:00, 9 June 2024 (UTC)[reply]

DL2204bot 2 (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: DL2204 (talk · contribs · logs)

Task/s: Correct messy entries for scholarly articles of Uztaro. Journal of Humanities and Social Sciences. (Q12268801) and Aldiri. Arkitektura eta abar (Q12253132) journals, add 2020-2022 articles.

Code: We are using WBI 12.0 for interaction with the source Wikibase, and with Wikidata.

Function details: In 2020-21, articles of the two journals (example Q108042527) were uploaded using OpenRefine (see Q108042527 history). That dataset has several problems, such as repeated author statements (with and without "series ordinal" qualifier), incorrect issue numbers, missing DOIs (although they exist), missing download URLs (although they exist), etc. This proposal consists of rewriting all entries (see all using this query), using data from the newly created Inguma Wikibase (see items for these two journals using this query). Before the operation, we will check the completeness and integrity of the data, and include some missing items (the original source is the SQL database behind https://inguma.eus). --DL2204 (talk) 11:18, 30 November 2022 (UTC)[reply]

If I'm understanding your query correctly you are planning on editing just 1000 items? Personally I would be comfortable letting you do that without bot approval. Seems like a manual audit would be possible to ensure the quality is acceptable. Either way  Support. BrokenSegue (talk) 16:43, 7 December 2022 (UTC)[reply]
Please make some test edits.--Ymblanter (talk) 20:04, 11 December 2022 (UTC)[reply]
@DL2204 reminder to make your test edits (or do you want this closed?) --DannyS712 (talk) 07:02, 9 June 2024 (UTC)[reply]


Cewbot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Kanashimi (talk · contribs · logs)

Task/s: Add sitelink to redirect (Q70893996) for sitelinks to redirects without intentional sitelink to redirect (Q70894304).

Code: github

Function details: Find redirects in wiki projects, and check if there is sitelink to redirect (Q70893996) / intentional sitelink to redirect (Q70894304) or not. Add sitelink to redirect (Q70893996) for sitelinks without sitelink to redirect (Q70893996) or intentional sitelink to redirect (Q70894304). Also see Wikidata:Sitelinks to redirects. --Kanashimi (talk) 02:19, 15 November 2022 (UTC)[reply]
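
The badge rule described above reduces to a small decision function. The sketch below only illustrates that rule and is not Cewbot's code (none had been written at the time of the request): the function name is hypothetical, and both detecting redirects on the client wiki and writing the badge back to Wikidata are left out.

```python
SITELINK_TO_REDIRECT = "Q70893996"
INTENTIONAL_SITELINK_TO_REDIRECT = "Q70894304"


def badges_after_check(is_redirect, badges):
    """Return the badges a sitelink should carry after the check.

    A sitelink that points to a redirect and has neither redirect badge
    gets "sitelink to redirect"; everything else is left unchanged.
    """
    badges = list(badges)
    if not is_redirect:
        return badges
    if SITELINK_TO_REDIRECT in badges or INTENTIONAL_SITELINK_TO_REDIRECT in badges:
        return badges
    return badges + [SITELINK_TO_REDIRECT]
```

A sitelink already marked as an intentional sitelink to redirect is therefore never given the plain badge as well.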

sounds good. link to the source? BrokenSegue (talk) 05:28, 15 November 2022 (UTC)[reply]
I haven't started writing code yet. I found that there is already another task Wikidata:Requests for permissions/Bot/MsynBot 10 running. What if I treat this task as a backup task? Or is this not actually necessary? Kanashimi (talk) 03:34, 21 November 2022 (UTC)[reply]
The complete source code of my bot is here: https://github.com/MisterSynergy/redirect_sitelink_badges. It is a bit of a work-in-progress since I need to address all sorts of special situations that my bot comes across during the initial backlog processing.
You can of course come up with something similar, but after the initial backlog has been cleared, there is actually not that much work left to do. Given how complex this task turned out to be, I am not sure whether it is worth making a completely separate implementation for this task. Yet, it's your choice.
Anyways, my bot would not be affected by the presence of another one in a similar field of work. —MisterSynergy (talk) 18:55, 21 November 2022 (UTC)[reply]

 Support Just another implementation of an approved task; why not trust this one? Midleading (talk) 15:42, 4 November 2024 (UTC)[reply]


Mr Robot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)

Operator: Liridon (talk · contribs · logs)

Task/s: Add descriptions/labels/aliases

Code: https://github.com/emijrp/wikidata

Function details: I have been using QuickStatements to work on large numbers of items and properties for many types of items, and have made 12+ million edits so far. I intend to continue to do so, and after this discussion I am applying for the bot flag for this account in order to avoid flooding Recent Changes/Watchlists.--Liridon (talk) 14:09, 4 November 2022 (UTC)[reply]

I don't think we grant blanket approval for bots. Can you specify what tasks you will be working on? BrokenSegue (talk) 16:31, 4 November 2022 (UTC)[reply]
I've already done some tasks with this account using scripts which are part of the github link, eg ([8], [9] ...) through paws.wmcloud.org. Liridon (talk) 17:35, 8 November 2022 (UTC)[reply]
that doesn't really answer the question. I don't think we grant blanket approval. BrokenSegue (talk) 17:25, 11 November 2022 (UTC)[reply]
You guys did approve this one, which had similar task description.--Liridon (talk) 16:46, 13 December 2022 (UTC)[reply]
@BrokenSegue Hello. Liridon is flooding my Watchlist with his edits adding sq labels to people items. And he's saying he cannot use the bot account because the bot request here was not approved. Can we grant him approval specifically for this kind of edits? Please - for the sake of my watchlist... Thanks... Vojtěch Dostál (talk) 18:28, 18 February 2023 (UTC)[reply]
@Vojtěch Dostál: I'm not a bcrat. I can't assign the bot flag. BrokenSegue (talk) 18:43, 18 February 2023 (UTC)[reply]
Or we can block the user for running an unapproved bot. Ymblanter (talk) 20:26, 19 February 2023 (UTC)[reply]
What? You can't block me because of this. I query items through https://query.wikidata.org/, find those without a specific label or description, then edit them all with QuickStatements. They are not bot edits. Liridon (talk) 13:42, 20 February 2023 (UTC)[reply]
the bot policy does not specify what technology the bot uses to make the edits. the point of the policy is to provide some oversight over large batch edits. BrokenSegue (talk) 21:46, 26 February 2023 (UTC)[reply]
I'm not doing these edits with the candidate bot user (Mr Robot), but with my non-bot account (Liridon). Except for flooding the watchlists of other users with my semi-automated edits (which I'm sure a lot of other users do), nothing here is against any rules of Wikidata. Liridon (talk) 13:03, 2 March 2023 (UTC)[reply]

RobertgarrigosBOT (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Robertgarrigos (talk · contribs · logs)

Task/s: I'm using OpenRefine to edit items related to Wikidata:Wikiproject_Lieder, beginning by adding the new subclass lyrico-musical work (Q114586269) to the lieder currently in WD. I hope to gain some experience before going on to further edits.

Code:

Function details:

--Robertgarrigos (talk) 19:42, 16 October 2022 (UTC)[reply]

YSObot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: YSObot (talk · contribs · logs)

Task/s: Account for mapping Wikidata with General Finnish Ontology (Q27303896) and the YSO-places ontology by adding YSO ID (P2347) and for creating new corresponding concepts in case there are no matches.

Code: n/a. Uploads will be done mainly with OpenRefine, Mix'n'Match and corresponding tools.

Function details: YSO includes over 40.000 concepts and about half of them are already mapped. The mapping includes

Matches are checked manually before upload. Double-checking is done afterwards by using the Constraint violations report

Flag/s: High-volume editing, Edit existing pages, Create, edit, and move pages

--YSObot (talk) 11:33, 16 December 2021 (UTC)[reply]

  • The bot was running without approval (this page was never included). I asked the operator to first get it approved. Can you please explain the creation of museum (Q113965327) & theatre (Q113965328) and similar duplicate items? Multichill (talk) 16:27, 15 September 2022 (UTC)[reply]
    museo (Q113965327) & teatteri (Q113965328) are part of the Finnish National Land Survey classification for places. These classes will be mapped with existing items if they are exact matches by using Property:P2959.
    Considering duplicate YSO-ID instances: these are most often due to modeling differences between Wikidata and YSO. Some concepts are split in the other one and vice versa. These are due to linguistic and cultural differences in vocabularies and concept formation. Currently the duplicates would be added to the exceptions list in the YSO-ID property P2347. However, lifting the single value constraint for this property is another option here.
    Anyway, YSObot is currently an important tool in efforts to complete the mappings of the 30.000+ concepts of YSO with Wikidata. Uploads of YSO-IDs are made to reconciled items from OpenRefine. See YSO-Wikidata mapping project and the log of YSObot. For the moment, uploads are usually done only to 10-500 items at a time, a few times per day at most. Saarik (talk) 13:46, 23 September 2022 (UTC)[reply]
    That's not really how Wikidata works. All your new creations look like duplicates of existing items so shouldn't have been created. Your proposed usage of {{P|P2959}} is incorrect. With the current explanation I  Oppose this bot. You should first clean up all these duplicates before doing any more edits with this bot. @Susannaanas: care to comment on this? Multichill (talk) 09:58, 24 September 2022 (UTC)[reply]
    This bot is very important, we just need to reach a common understanding about how to model the specific Finnish National Land Survey concepts. I have myself struggled with them previously. There is no need to oppose the bot itself. – Susanna Ånäs (Susannaanas) (talk) 18:02, 25 September 2022 (UTC)[reply]
    why do we want to maintain permanently duplicated items? this seems like a bad outcome. why not instead make these subclasses of the things they are duplicates of. or attach the identifier to already existing items. BrokenSegue (talk) 20:36, 11 October 2022 (UTC)[reply]
    I think this discussion went a little astray from the original purpose of YSObot.
    The Finnish National Land Survey place types were erroneously created with the YSObot account although they are not related to YSO at all. I was adding them manually with OpenRefine but forgot to change the user ID in my OpenRefine! I thought that would not be a big issue. The comments by @Multichill and @BrokenSegue are not really related to the original use of YSObot and do not belong here at all but rather on the Q106589826 Talk page.
    About the duplicate question - Earlier, I did exactly that and added these to already existing items with the "instance of" property. Then I received feedback and was told to create separate items for the types. So now I am getting totally opposite instructions from you guys. Let's move this discussion to its proper place.
    And please, add the correct rights for this bot account, if they are still missing as we still need to add the remaining 10.000+ identifiers. Saarik (talk) 11:32, 27 October 2022 (UTC)[reply]
  •  Oppose as per above. If you refrain from creating new items I would probably support it if I could easily see the flow of logic.
  • I strongly encourage you to publish a PlantUML activity diagram showing the logic of the matching.
  • Thanks in advance. So9q (talk) 10:26, 2 January 2024 (UTC)[reply]

PodcastBot (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: Germartin1 (talk · contribs · logs)

Task/s: Upload new podcast episodes, extracting: title, part of the series, has quality (explicit episode), full work available at (mp3), production code, Apple Podcasts episode ID, Spotify episode ID. Regex extraction: talk show guest, recording date (from description). It will be run manually and only for previously selected podcasts.

Code: https://github.com/mshd/wikidata-to-podcast-xml/blob/main/src/import/wikidataCreate.ts

Function details:

  • Read XML Feed
  • Read the Apple Podcasts feed and Spotify
  • Get the latest episode date available on Wikidata
  • Loop over all new episodes that do not exist in Wikidata yet
  • Extract data
  • Import to Wikidata using maxlath/wikidata-edit
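
The "read feed, compare with latest date, loop over new episodes" part of the steps above can be sketched as follows. The operator's actual implementation is the TypeScript file linked above; this Python version is only an illustration, with a hypothetical function name and output keys, and it assumes a standard RSS 2.0 feed with pubDate and enclosure elements. The Apple/Spotify lookups and the Wikidata import are not shown.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def new_episodes(feed_xml: str, latest_on_wikidata: datetime):
    """Return feed episodes published after the newest one already on Wikidata.

    latest_on_wikidata must be a timezone-aware datetime.
    """
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # cannot decide whether the episode is new
        published = parsedate_to_datetime(pub_date)
        if published.tzinfo is None:
            # Assumption: dates without a usable offset are treated as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published <= latest_on_wikidata:
            continue
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title"),
            "published": published,
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    # Oldest first, so episodes would be created in publication order.
    return sorted(episodes, key=lambda episode: episode["published"])
```

Each returned dict would then be the input to the extraction and import steps.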

--Germartin1 (talk) 04:38, 25 February 2022 (UTC)[reply]

How about episodes to podcasts with a Wikipedia article? @Ainali:--Trade (talk) 18:34, 12 June 2022 (UTC)[reply]