Wikipedia:Bots/Noticeboard: Difference between revisions
Revision as of 07:57, 15 November 2016
Bots noticeboard

Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments are welcome. Just make sure you are aware of our bot policy and know where to post your issue.

Bot-related archives
API Outage
FYI: There is/was an API outage today, see Wikipedia:Village_pump_(technical)#503_errors.2Fpossible_Twinkle_issues. — xaosflux Talk 16:20, 17 October 2016 (UTC)
Conflicting edits between Hazard-Bot and Cyberbot I
As of this edit, it looks like the two bots have different "opinions" on what a blank template sandbox should look like.
To me, one of the bots should either stop patrolling the template sandbox, or stop the task completely if it only patrols that page. NasssaNser (talk/edits) 04:07, 22 October 2016 (UTC)
- Pinging operators: Cyberpower678 and Hazard-SJ. — JJMC89 (T·C) 05:31, 22 October 2016 (UTC)
- Optimally, both bots should be able to get along just fine without arguing about what the content should be, which, as far as I'm aware, is why we have templates (in this case Template:Template sandbox reset) to guide what the content should be. My bot substitutes the aforementioned template, so that if there is a decision to change the content at some point, it could easily be done without my intervention; and if multiple bots (as is the case here) perform the same task, all of them would "update" to the most recent content immediately and simultaneously. Checking the page history, your link is not nearly the first occurrence of the issue; it had also happened with lowercase sigmabot II and Cyberbot I, as well as in this edit. Cyberpower678, would it be okay for you to also adopt usage of that template? Hazard SJ 21:07, 23 October 2016 (UTC)
- Sure. The text is predefined values when the script starts.—cyberpowerTrick or Treat:Offline 02:11, 24 October 2016 (UTC)
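Hazard-SJ's design above (substitute a shared template rather than hard-code the reset text) can be sketched as an action=edit call. This is an illustrative sketch only, not either bot's actual code; the parameter names follow the MediaWiki action API, while the page title and token in the usage example are placeholders.

```python
# Sketch: reset a sandbox page by substituting a shared template via the
# MediaWiki action API, so every bot converges on the same canonical text.
# Session handling and token acquisition are omitted for brevity.

def build_sandbox_reset_params(page_title, reset_template, edit_token):
    """Build POST parameters for an action=edit call that replaces the
    page with a substitution of the shared reset template."""
    return {
        "action": "edit",
        "title": page_title,
        # Substituting the template (rather than hard-coding its text) means
        # a change to the template propagates to all bots automatically.
        "text": "{{subst:%s}}" % reset_template,
        "summary": "Resetting sandbox",
        "bot": True,
        "token": edit_token,
        "format": "json",
    }

params = build_sandbox_reset_params(
    "Template:X1", "Template sandbox reset", "dummy+token")
```

Because the canonical content lives in one template, a decision to change the sandbox header requires no bot-side code change at all.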
High false positive rate for User:InternetArchiveBot
As its maintainer's talk page shows, User:InternetArchiveBot is currently not working very well. As the recent error reports have not been answered and the bot is still making edits, would it be useful to block it? − Pintoch (talk) 16:36, 22 October 2016 (UTC)
- @Cyberpower678: Any comments here? — xaosflux Talk 17:38, 22 October 2016 (UTC)
- The bot operator was active and fixing bugs as recently as 14 October 2016. See the talk page history. – Jonesey95 (talk) 19:31, 22 October 2016 (UTC)
- I wouldn't call this a high false positive rate. During testing it's been established that the false positive rate is 0.1%. Let's take 0.1% of 40 million links. That's 40,000 links. So it may seem high, but not if you look at the bigger picture.—cyberpowerTrick or Treat:Offline 22:32, 22 October 2016 (UTC)
- Ok, good to know. I was interested in this for my own bot. − Pintoch (talk) 07:47, 23 October 2016 (UTC)
C678 how was the 0.1% estimated? -- Magioladitis (talk) 23:02, 24 October 2016 (UTC)
- The community tech team and I went through a sample of 1000 URLs that were tested by the bot, during the trial, and found only 1 to be a false positive.—cyberpowerTrick or Treat:Offline 01:59, 25 October 2016 (UTC)
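The 0.1% figure above comes from observing 1 false positive in a sample of 1000 URLs. As a sanity check (not part of the bot's methodology), a standard Wilson score interval shows how wide the uncertainty is when extrapolating such a small count to 40 million links; the 40-million figure is taken from the discussion above.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# 1 false positive observed in a sample of 1000 checked URLs.
lo, hi = wilson_interval(1, 1000)

# The point estimate is 0.1%, but with n=1000 the 95% interval spans
# roughly 0.02% to 0.56%; scaled to 40M links that is a wide range.
projected = (lo * 40_000_000, hi * 40_000_000)
```

So "40,000 false positives" is best read as a rough central estimate: the true count consistent with the sample could plausibly be several times smaller or larger.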
The modification you tried to make was aborted by an extension hook
Good afternoon! When the bot edits pages, it frequently gets the error: "The modification you tried to make was aborted by an extension hook". What can cause this? Игорь Темиров (talk) 13:40, 29 October 2016 (UTC)
- @Игорь Темиров: this error message may be related to you hitting the abuse filter on a specific project. If this is on a WMF wiki, can you provide the account name, project name, and page you are trying to make the edit to? — xaosflux Talk 14:36, 29 October 2016 (UTC)
- @Xaosflux: No, this page. Some similar pages save properly, but others fail with this error. Игорь Темиров (talk) 15:39, 29 October 2016 (UTC)
- @Игорь Темиров: I have global abuse-filter access so was able to verify you are not hitting ruwiki's AbuseFilter, however you may be hitting their ru:Special:Log/spamblacklist - you will need to ask a ruwiki administrator to check that for you. — xaosflux Talk 15:55, 29 October 2016 (UTC)
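A bot hitting this failure can usually recover more detail from the API error code than from the generic message. The sketch below maps a few codes to likely causes; `hookaborted` is the MediaWiki core code behind the quoted message, and the AbuseFilter and SpamBlacklist codes listed are the ones those extensions have historically returned — verify them against the target wiki's API before relying on them.

```python
# Sketch: map MediaWiki action=edit failure codes to a likely cause, so a
# bot can log something more useful than the generic hook-abort message.

LIKELY_CAUSES = {
    "hookaborted": "an extension hook rejected the edit (generic; check "
                   "the AbuseFilter and spam blacklist logs)",
    "abusefilter-disallowed": "the edit matched an AbuseFilter rule",
    "spamblacklist": "a URL in the edit matched the spam blacklist",
}

def diagnose_edit_error(api_response):
    """Return a human-readable hint for a failed action=edit response."""
    error = api_response.get("error", {})
    code = error.get("code", "")
    return LIKELY_CAUSES.get(code, "unrecognized error code: %r" % code)

hint = diagnose_edit_error(
    {"error": {"code": "hookaborted",
               "info": "The modification you tried to make was aborted "
                       "by an extension hook"}})
```

As xaosflux notes above, a `hookaborted` that is not an AbuseFilter hit often turns out to be the spam blacklist, which only a local administrator's log check can confirm.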
New adminbot request
There is a new adminbot request open at WP:BRFA. Please see Wikipedia:Bots/Requests for approval/MusikBot II for details. — xaosflux Talk 04:16, 1 November 2016 (UTC)
Featured logs in mobile view
Per this discussion and this one on the Village pump (technical), I have raised the possibility of removing Template:Fa top's invocation of the metadata class (which in mobile view carries the declaration display:none;) from the div section. This will allow the featured log to be viewed in the "mobile view". However, there was some fear that it might break a bot. The FACBot is okay, and I'm not aware of any other bot that accesses these pages, but it's best to ask first rather than proceeding and seeing what breaks. (Notifying Ian Rose and Bonvol:) Hawkeye7 (talk) 20:14, 1 November 2016 (UTC)
Problematic edit and confusing minor edit summary by poorly identified bot - GreenC bot / WaybackMedic / Wayback Medic
https://en.wikipedia.org/w/index.php?title=Taser&diff=737265819&oldid=736467529 is tagged GreenC bot and (WaybackMedic 2).
- It shouldn't be at all ambiguous whether Wayback Medic or GreenC bot did the edit. Is there a good reason for the discrepancy or should it be fixed?
- The edit is problematic for several reasons. Information about archived content that isn't viewable due to IA's robots.txt policy is not useless: it is a) subject to legal discovery at any time, and b) viewable if the robots.txt changes; c) the existence of the IA URL suggests that the content was verifiable, d) deleting it means a normal reader of the article won't learn that, and e) the source is available at Archive.is: [1]. For these five reasons, the bot should not be removing non-working IA URLs, especially not without checking to see if there's a working Archive.is version.--Elvey(t•c) 01:02, 15 November 2016 (UTC)
- An observational comment: Archive.is has no API from what I can tell, making it nearly impossible for bots to use.—cyberpowerChat:Offline 01:16, 15 November 2016 (UTC)
- Well, then go fix our Archive.is article, User:Cyberpower678; it says they have an API. LOL. --Elvey(t•c) 02:04, 15 November 2016 (UTC)
- Memento isn't part of Archive.is. I mean that archive.is doesn't have its own API bots can use, and it sucks in general for delivering reliable results. It doesn't preserve HTTP status codes, making it impossible for bots to determine whether an archive is good or not.—cyberpowerChat:Offline 03:01, 15 November 2016 (UTC)
- In my experience it was something like 50% soft-404s, making it unusable unless we spam Wikipedia with broken links. Elvey, do you know someone at Archive.is we can speak with about that? -- GreenC 03:26, 15 November 2016 (UTC)
- Archive.is seems to be a rather unresponsive one man operation. Without reliable communications, I absolutely refuse to use archive.is beyond the current URL recognition.—cyberpower (unsigned)
- I'm confused. I hear you say WM can tell that many archive.is links were broken from the start - and that you need an API to be able to tell if the links work or not. Seems contradictory. What am I missing? --Elvey(t•c) 07:52, 15 November 2016 (UTC)
- @Green Cardamom: (ping to bot operator). — xaosflux Talk 01:24, 15 November 2016 (UTC)
- Thanks.--Elvey(t•c) 02:04, 15 November 2016 (UTC)
- There is no User:WaybackMedic. "WaybackMedic" is the name of the program being run by the bot account User:GreenC bot. The account runs multiple programs.
- WM is currently not removing 403 robots.txt links, but it is removing other types of 403s, 404s, and certain types of 301/302s. The reasons are many: some you described, some are other reasons. WM processed over a million links, deleted 30,000 (3%), and I am considering a project to restore the 403 robots.txt links that are back alive (about 3,000, or 0.3%). As for checking other archives, absolutely: WaybackMedic does it through Memento's API and it has saved a lot of links that way, mostly WebCite and LOC (see the project's Stats page, "New alt archive"). Archive.is is a special case and largely unusable by bots, as Cyberpower said. An IA URL doesn't prove the link was once verifiable, or even existed - WM found many links that never worked because they were added by bots that didn't have access to the IA API; the links were broken from the start. Furthermore, monitoring has shown these non-robots.txt links are permanently dead. Wayback itself, for other reasons (some intentional, some due to internal data problems), will delete links unrelated to robots.txt. -- GreenC 03:24, 15 November 2016 (UTC)
- Ok. "Wayback Medic is a bot" is what it says at User:Green_Cardamom/WaybackMedic_2 which you link to. Maybe that should change?
- Great. Appreciate all the tons of good edits you/your bot does.--Elvey(t•c) 07:52, 15 November 2016 (UTC)
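The alternative-archive check GreenC describes above (querying Memento for WebCite, LOC, and other copies) could look roughly like the sketch below. It builds a request for the Memento Time Travel JSON API and extracts the closest memento from the reply; no network call is made here, the sample response is a hand-trimmed illustration, and the exact field layout should be checked against the live service rather than taken from this sketch.

```python
# Sketch of a Memento lookup for finding an alternative archive of a dead
# link, in the spirit of what WaybackMedic does via Memento's API.

def timetravel_url(target_url, timestamp="20160101"):
    """Request URL for the Memento Time Travel JSON API
    (timetravel.mementoweb.org); timestamp is YYYYMMDD[hhmmss]."""
    return "http://timetravel.mementoweb.org/api/json/%s/%s" % (
        timestamp, target_url)

def closest_memento(response):
    """Extract the URI of the closest memento, or None if there is none."""
    closest = response.get("mementos", {}).get("closest", {})
    uris = closest.get("uri") or []
    return uris[0] if uris else None

sample = {  # hand-trimmed illustration of a Time Travel reply
    "mementos": {"closest": {
        "datetime": "2016-01-02T03:04:05Z",
        "uri": ["http://www.webcitation.org/example123"],
    }}
}
archive = closest_memento(sample)
```

Because Memento aggregates many archives behind one interface, a bot only needs this single lookup to discover a WebCite or LOC copy when the Wayback snapshot is gone.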