
Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Beetstra (talk | contribs) at 13:15, 14 August 2010 (travelsitesonline.net: Added). It may differ significantly from the current version.

Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged.

snippet for logging
{{sbl-log|2077557#{{subst:anchorencode:SectionNameHere}}}}

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

Chinese powerlevelling spam

















Similar domains

"The secondary markets permits some players to better enjoy their gaming experience by supplementing their in-game assets"







Spammers






NRV, pure spam. MER-C 09:05, 10 August 2010 (UTC)Reply

As far as I could check it was en.wiki only? Huib talk Abigor 09:16, 10 August 2010 (UTC)Reply
Nope. My experience is that pure spam gets listed here even if it is spammed on only one wiki (you've missed the blacklisting of 1500+ Chinese wholesale domains in the last two months). MER-C 09:21, 10 August 2010 (UTC)Reply
Added. Huib talk Abigor 09:24, 10 August 2010 (UTC)

sedoparking

Today I noticed COIBot giving the following remark:

And I've seen those before. Now, youporn.gr might go on the blacklist for obvious reasons .. but ...

  • sedoparking.com

They say:

Domain Parking is a simple way to earn money from your domains' natural traffic. If you have registered domain names, but they are not currently being used, then domain parking is a great way to put those domains to work, earning you revenue. You can make money without even lifting a finger! The idle domain is used to display relevant advertisements -every time a consumer clicks on one of the advertisements, you earn money.

Thoughts, what to do with 'sedoparked' sites? --Dirk Beetstra T C (en: U, T) 14:38, 10 August 2010 (UTC)Reply

Blacklist... Huib talk Abigor 14:44, 10 August 2010 (UTC)Reply
Blacklist on sight. There is no legitimate reason to add a parked domain to Wikipedia, and I have encountered several annoying sedo parked domain spammers on enwiki:
MER-C 07:12, 11 August 2010 (UTC)Reply

So:



--Dirk Beetstra T C (en: U, T) 21:47, 13 August 2010 (UTC)Reply

Added. --Dirk Beetstra T C (en: U, T) 21:48, 13 August 2010 (UTC)


densitygs.info/.com











--A. B. (talk) 04:15, 12 August 2010 (UTC)Reply

Added. --Dirk Beetstra T C (en: U, T) 08:43, 13 August 2010 (UTC)

bestchiropractoradelaide.com.au






Related to recently blacklisted site walkervillechiropractic.com.au. MER-C 07:55, 13 August 2010 (UTC)Reply

Added. --Dirk Beetstra T C (en: U, T) 08:35, 13 August 2010 (UTC)

travelsitesonline.net



Spammers
















































More in the COIBot report. MER-C 12:42, 14 August 2010 (UTC)Reply

Added. --Dirk Beetstra T C (en: U, T) 13:15, 14 August 2010 (UTC)

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports. Please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).
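The archiving rule in the parentheses above is a conjunction of three conditions. A minimal sketch in Python, assuming a report can be summarised by its link count, the time of its last edit and the name of its last editor. The function name and parameters are hypothetical, and this is not COIBot's actual code:

```python
from datetime import datetime, timedelta

def is_stale(link_count, last_edit, last_editor, now):
    """Return True when all three staleness conditions described above hold."""
    few_links = link_count < 5                           # fewer than 5 links reported
    untouched = (now - last_edit) > timedelta(days=7)    # not edited in the last 7 days
    bot_was_last = last_editor == "COIBot"               # nobody edited after the bot
    return few_links and untouched and bot_was_last

# Example: a report with 3 links, last edited by COIBot almost two weeks earlier.
print(is_stale(3, datetime(2010, 8, 1), "COIBot", datetime(2010, 8, 14)))
```

A report that a sysop has commented on is never stale under this reading, because the last editor is then no longer COIBot.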

Sysops
  • If the report contains links to less than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
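Of the four criteria above, the third is the only one stated precisely enough to write down directly. A rough Python sketch, assuming each link addition is recorded as a (username, wiki) pair and that an unregistered editor's username is simply their IP address. The function names, the data layout and the example values are hypothetical, not the LinkWatchers' real implementation:

```python
import ipaddress

def is_ip(username):
    """True if the username parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(username)
        return True
    except ValueError:
        return False

def all_ip_multiwiki(additions):
    """additions: list of (username, wiki) pairs for one domain.

    True when every addition came from an IP and more than one wiki is involved.
    """
    if not additions:
        return False
    wikis = {wiki for _, wiki in additions}
    return all(is_ip(user) for user, _ in additions) and len(wikis) > 1

# Documentation-range addresses and made-up wiki names, for illustration only.
sample = [("192.0.2.1", "en.wikipedia"), ("2001:db8::1", "de.wikipedia")]
print(all_ip_multiwiki(sample))
```

The other criteria depend on thresholds ("mainly", "not used too much", "small range") that the text does not quantify, so they are left out of the sketch.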
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
datingagency.com.hk 2024-12-13 16:07:15 COIBot 104.21.59.62 109.229.165.128, 176.138.101.174, 223.228.253.112, 45.232.83.8, 78.123.9.199, 82.64.65.152, 92.49.193.198 2024-12-13 15:11:00 7 2
nepalsuperleague.com 2024-12-13 16:17:16 COIBot 172.67.192.108 R Spazzolo 2024-12-13 08:46:19 2502 4 0 0 3
polnische-ostsee-urlaub.de 2024-12-10 11:28:24 COIBot 85.215.148.185 2A02:8109:9B82:800:1C76:452E:F96D:CA28, 2A02:8109:9B82:800:ADFB:148A:3F17:F6A8, 2A02:8109:9B82:800:D851:3110:B8CE:CD4A, 2A02:8109:9B82:800:F4A3:475:FADC:C8A5, 2A02:8109:9B82:800:F5E1:9B27:B33E:DBF, 217.89.121.67, 80.157.174.116 2024-12-10 10:37:57 18 3
syr-ray.by 2024-12-13 13:58:33 COIBot 93.125.24.156 212.98.189.95, 37.214.67.226 2024-12-13 13:53:56 10 3
topvzla.com 2024-12-13 16:03:40 COIBot 192.0.78.131 R Mauriziok 1970-01-01 05:00:00 2502 646 0 0 0
truyenthongphattrien.com.vn 2024-12-13 14:44:03 COIBot 112.78.2.100 1.52.185.27, 103.249.22.173, 125.235.232.230 2024-12-13 14:38:16 8 5
uk.zhujiworld.com 2024-12-13 13:01:26 COIBot 172.67.134.136 178.133.76.95, 188.130.220.16, 188.163.113.135, 188.163.113.16, 188.163.113.197, 194.32.237.242 2024-12-03 12:01:06 11 7
vid.puffyan.us 2024-12-13 16:18:13 COIBot 135.148.139.193 2601:281:D87E:84F0:9066:C18A:2EC9:638C, 2601:281:D87E:84F0:B4B1:D7C1:4FB:6D42, 173.162.220.17, 98.248.161.240 2024-12-13 16:11:48 10 2
yousantape.com 2024-12-13 06:41:42 COIBot 88.223.84.146 103.248.252.193, 47.75.98.142 2024-12-13 06:30:01 16 3


Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.

elusiva.com



Please remove from the spam-list software company elusiva.com — The preceding unsigned comment was added by 166.203.221.90 (talk)

Why? MER-C 08:02, 14 August 2010 (UTC)Reply

Whitelist cozycot.com



The article from English Wikipedia had a history of repeated creation and deletion, now its creation is permitted, after establishing the notability and undergoing various reviews. Relevant discussions:

Gatyonrew 20:48, 1 June 2010 (UTC)Reply

Here's the history here:
--A. B. (talk) 23:59, 9 June 2010 (UTC)Reply
I see, "this is a domain that has been the subject of paid editing", that's what I've found mentioned there. This issue has been discussed in the discussions mentioned above, together with the notability and the style of the article and the result is the current notable and neutral article. Gatyonrew 12:02, 11 June 2010 (UTC)Reply
It's been a while since this whitelist request remains here without a resolution, I wonder if there is any issue. To begin with, the case does not look similar to the average others from here, there was no actual spam, but a history of repeated creation of the article, then the domain name was lumped together with other domains appearing in a "paid editing" case. Gatyonrew 22:12, 23 June 2010 (UTC)Reply

I think I'd prefer that the domain be whitelisted on enwiki, since there's a real concern about this being pushed by editors being paid to do so. Deferred to enwiki.  – mike@meta:~$  10:57, 6 July 2010 (UTC)

However, I didn't see the website blacklisted locally on English Wikipedia, it is here on Meta where it has a global blacklist, that's why I came here (considering also that there are articles in 5 other languages). The issues pertaining to the English article of the website (which originated the Meta blacklist request found by A.B.) have been discussed for a few weeks and the new English article has been considered neutral and "above-and-beyond what's requested by WP:N" (citing the user Hobit). This while it appears that there was never an issue about actual spamming (i.e. filling up the article with links towards cozycot.com or spamming such links in other Wikipedia articles). I'm not sure what can be done further in the English Wikipedia, what should I point out for discussion. Gatyonrew 06:52, 7 July 2010 (UTC)Reply
No website blacklisted here on Meta needs be blacklisted locally - that is the point of this blacklist, to save individual wikis from having to deal with it.
If there is a valid reason to place links to the website on a specific wiki then the site should be whitelisted on that wiki as suggested above. I see no reason to remove it from here for now - if the folk at en wp wish to whitelist it then that is all you need and will be fine for your purpose (I presume). --Herby talk thyme 14:11, 13 July 2010 (UTC)
Then this raises the question why was this website blacklisted on Meta, with no actual spamming occurring. It was an issue of repeated creation and deletion limited only to English Wikipedia and the case pointed out by A.B. should have been dealt there in order to let me request its whitelist there (after the conclusion of the clarifications about the notability and the neutrality of the article). I saw that articles about CozyCot in non-English languages have links to its website. What is the reason to have this website blacklisted everywhere? Gatyonrew 15:28, 13 July 2010 (UTC)Reply
I had assumed you were an "ordinary" editor trying to write an article on en wp about something that required a link to the website. If that is the case then whitelisting there will sort your problem out. If you have some interest in the website concerned that is another matter. --Herby talk thyme 16:13, 13 July 2010 (UTC)
I had an important input in ascertaining the notability and the neutrality of the en.wp article (that's why I even considered to add it in my userpage list of articles created by me, I practically rescued it). I found out the blacklist and I followed back its steps with this whitelist. I thought it would be a formality, given that the English article that determined the blacklist became OK, while I saw no word of actual spamming, thus there were no apparent reasons for a Meta blacklist. It was not a formality, but there were no clear answers either. Regarding your last reply, I hope this "some interest" is your isolate presumption (and I should wait for serious users to consider this matter). Gatyonrew 18:00, 13 July 2010 (UTC)Reply
I have been a serious user here for quite some time actually. I just don't understand why whitelisting doesn't solve your problem. I'll leave it to others now I think. --Herby talk thyme 18:03, 13 July 2010 (UTC)Reply
OK, let's try to separate the issues. Indeed, first I had a problem when I tried to introduce the link as the "official website" in the article, thus I found out about the blacklist. But when considering to fill in the whitelist request, I thought it would be normal to turn back the previous overall blacklist. Once the issue was clarified, the situation should go back to normal. In my personal opinion, it was not necessary the blacklist of the non-English Wikipediae, it was unjustified. I am also a speaker of Japanese, and maybe some time in the future I'll want to create an article about CozyCot in the Japanese Wikipedia. I find really unreasonable to request there too a whitelist, because of what happened in the English Wikipedia. Gatyonrew 18:23, 13 July 2010 (UTC)Reply

compact.exe.su



Please remove the Russian online music magazine compact.exe.su from the spam list. It was blacklisted in 2007 due to persistent spamming: see the request. The magazine has since ceased publication; the site now holds only old material, some of which is useful for articles about Russian rock groups. The user who made the request is no longer active. The sysop of ru.wiki Andrei Romanenko 12:04, 12 July 2010 (UTC)

Obvious answer - whitelist it on ru wp surely? --Herby talk thyme 14:12, 13 July 2010 (UTC)Reply

powerplaymanager.com



I wrote a wiki page, but I could not add the official website because I am told that it is blacklisted. I don't know why it was included in that list. Italian administrators have told me that it might be because of the nature of the site (a management game), but other sites were allowed to post the link to their official website (I quote some examples below, and post a link to a wiki page that has been published).

Other management games that have a link to their official site: [[1]], [[2]], [[3]], [[4]].

The wiki page that I wrote: [[5]] --User:Desperino 03:19, 15 July 2010 (UTC+1)

It will have been excessively placed on Wiki sites some time in the past. The best/easiest answer is to seek whitelisting on it wp (if you know an admin there that will be easy!). Thanks --Herby talk thyme 15:23, 15 July 2010 (UTC)Reply

pln-autoaprender.webcindario.com



I've read the external links guideline, so now I know I've added inappropriate links to attract visitors to this website. I'm sorry about it and, as not all the links that I had added were removed, I've just completed the work and removed all my links. You can check it. Now, I would like this website not to be on the blacklist. Pintaio 14:04, 20 July 2010 (UTC)

Your site is not in the blacklist. I-20the highway 16:41, 25 July 2010 (UTC)Reply

opennasa.uservoice.com



This is used officially by NASA to get feedback... there's no reason to block it. Smallman12q 23:33, 20 July 2010 (UTC)

The 'problem' is the server it is on, uservoice.com. This is basically a petition/voting site, of which one was spammed cross-wiki. Similarly, many petition/voting sites are blacklisted on en.wikipedia, as they, generally, do not help in building the encyclopedia (violation of en:WP:SOAPBOX). Of course, exceptions exist, and this may be one. I would therefore suggest to whitelist specific links where necessary. Here Not done. --Dirk Beetstra T C (en: U, T) 07:10, 21 July 2010 (UTC)Reply

ascendercorp.com



I recently edited a few existing articles [6][7] [8] on English Wikipedia. I attempted to support my facts with a reference from an official site and found that it was blacklisted. I felt my posts were unbiased and the reference was relevant and factual. I am interested in finding out if Ascendercorp.com can be removed from the blacklist, as it is a legitimate site containing applicable information related to fonts and typography. Drewpoleon 18:40, 29 July 2010 (UTC)Reply

yfrog.com



Hello, Our site seems to be blacklisted for spam. I can not find any evidence to support this, but that's what I've been told here http://en.wikipedia.org/wiki/MediaWiki_talk:Spam-blacklist#Proposed_removals

Yfrog is a website and Twitter service that allows users to share photos and videos on Twitter and to broadcast their life as it happens. It is free for users, and no registration is required. Yfrog.com is owned and operated by Imageshack.us which is a top 100 site.

We did have some issues with email pill spams, but have since fixed the problem. We have worked with multiple anti-spam sites (http://inboxrevenge.com, http://www.infiltrated.net, http://www.arbornetworks.com, http://shadowserver.org, and http://uribl.com to name a few). And have helped spread information on pill scams by replacing images uploaded by spammers into a warning image. Here is an example: http://img683.imageshack.us/img683/3548/upoufacigeya.gif

We also have since hired a team of 8 content moderators that review images, and are able to flag the images as they come in to our servers.

Please let me know if there's any spam problem on Wikipedia from yfrog and I'll personally take action to resolve the issue. Thank you in advance. npettas 01:51:40, 5 Aug 2010 (UTC)

User:COIBot/XWiki/parascientifica.com



See discussion on User:COIBot/XWiki/parascientifica.com. --Dirk Beetstra T C (en: U, T) 08:33, 10 August 2010 (UTC)Reply

goo.gl

I don't see why it is blocked: it is a legit url-shortener - owned by google - which cannot be used by general public but only internally by google applications. e.g. google maps can (if you set it up appropriately) return a short url via goo.gl instead of the usual huge url. This is the regexp currently used:

\bgoo\.gl\b

--Noieraieri 11:29, 11 August 2010 (UTC)Reply

URL redirects do not need to be used, you can just copy/paste the whole link. Moreover, they can be used to circumvent specific blacklisting (e.g. the same google maps link, if that needed to be blacklisted for some obscure reason). And that is why all url redirect services go on the meta blacklist without discussion (or even abuse). I hope this explains. --Dirk Beetstra T C (en: U, T) 12:03, 11 August 2010 (UTC)Reply
Well, this specific redirect service cannot be used in the malicious ways you suggest. Anyway, nothing unexpected. --Noieraieri 14:51, 11 August 2010 (UTC)Reply
Declined. We don't accept URL shorteners. Huib talk Abigor 14:53, 11 August 2010 (UTC)

Just as a note, a part of Google is blacklisted as it could be used maliciously. Spamming often has nothing to do with the content being linked to; it is more about how it is brought to Wikipedia. URL shorteners are generally inappropriate, as the page they redirect to can just as easily be added. --Dirk Beetstra T C (en: U, T) 15:07, 11 August 2010 (UTC)
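The circumvention argument in this thread can be illustrated with a few lines of Python. This is only a sketch: the spammed domain and the short link are hypothetical, and Python's `re` module stands in for the PCRE matching the extension actually uses.

```python
import re

# Hypothetical blacklist entry for a hypothetical spammed domain.
target_entry = re.compile(r"\bspam-target\.example\b", re.IGNORECASE)
# The goo.gl entry quoted earlier in this thread.
shortener_entry = re.compile(r"\bgoo\.gl\b", re.IGNORECASE)

direct_link = "http://www.spam-target.example/buy"
short_link = "http://goo.gl/abc123"  # hypothetical short link to the same page

def is_blocked(url, entries):
    """True if any blacklist entry matches somewhere in the URL."""
    return any(entry.search(url) for entry in entries)

# With only the target listed, the short link slips through.
print(is_blocked(direct_link, [target_entry]), is_blocked(short_link, [target_entry]))
# Listing the shortener itself closes that gap.
print(is_blocked(short_link, [target_entry, shortener_entry]))
```

The blacklist only sees the text of the URL being saved, not where it redirects, which is why the redirect service itself ends up listed.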

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

'broken' regexes (worldwide)

COIBot in the reports notes that regexes are broken, however, in the code it also finds regexes which are not broken, but which are incorrect: I will try and find all of these (see in which black/white/revertlist they are) after this. I would need some help in repairing them in the end. Thanks. --Dirk Beetstra T C (en: U, T) 08:35, 3 June 2010 (UTC)Reply

Now I know why I have some duplicates, I will clear out my part of the problem first .. :-( --Dirk Beetstra T C (en: U, T) 09:47, 4 June 2010 (UTC)Reply

Dumping new set of errors, will clarify later:

  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE ehbisigorta\.com/ at LinkSaver.pl line 2570, <LIST> line 177.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE ehbisigorta\.com/ at LinkSaver.pl line 2573, <LIST> line 177.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE liegervaringen\.com\b/ at LinkSaver.pl line 1359.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE usiqueclassique\.c\.la/ at LinkSaver.pl line 1359.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 1359.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 1359.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 1359.
  • Unrecognized escape \k passed through in regex; marked by <-- HERE in m/\k <-- HERE anal911\.com/ at LinkSaver.pl line 1359.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE liegervaringen\.com\b/ at LinkSaver.pl line 2538.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE usiqueclassique\.c\.la/ at LinkSaver.pl line 2538.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 2538.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 2538.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE vg\.net/ at LinkSaver.pl line 2538.
  • Unrecognized escape \k passed through in regex; marked by <-- HERE in m/\k <-- HERE anal911\.com/ at LinkSaver.pl line 2538.
  • Illegal hexadecimal digit 'o' ignored at LinkSaver.pl line 2570, <LIST> line 14.
  • Illegal hexadecimal digit 'o' ignored at LinkSaver.pl line 2570, <LIST> line 14.
  • Illegal hexadecimal digit 'o' ignored at LinkSaver.pl line 2573, <LIST> line 14.
  • Illegal hexadecimal digit 'o' ignored at LinkSaver.pl line 2573, <LIST> line 14.
  • Unrecognized escape \o passed through in regex; marked by <-- HERE in m/sgipt.\o <-- HERE rg/ at LinkSaver.pl line 2570, <LIST> line 11.
  • Unrecognized escape \o passed through in regex; marked by <-- HERE in m/sgipt.\o <-- HERE rg/ at LinkSaver.pl line 2573, <LIST> line 11.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE ranhome\.net/ at LinkSaver.pl line 2570, <LIST> line 72.
  • Unrecognized escape \i passed through in regex; marked by <-- HERE in m/\i <-- HERE ranhome\.net/ at LinkSaver.pl line 2573, <LIST> line 72.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE etaliran\.tk/ at LinkSaver.pl line 2570, <LIST> line 72.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE etaliran\.tk/ at LinkSaver.pl line 2573, <LIST> line 72.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE ymap\.ir/ at LinkSaver.pl line 2570, <LIST> line 72.
  • Unrecognized escape \m passed through in regex; marked by <-- HERE in m/\m <-- HERE ymap\.ir/ at LinkSaver.pl line 2573, <LIST> line 72.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/technorati.com\blogs\v <-- HERE iagra-generics.net/ at LinkSaver.pl line 2570, <LIST> line 159.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/technorati.com\blogs\v <-- HERE iagra-generics.net/ at LinkSaver.pl line 2573, <LIST> line 159.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE ampierenspel.yourbb\.nl/ at LinkSaver.pl line 2570, <LIST> line 156.
  • Unrecognized escape \v passed through in regex; marked by <-- HERE in m/\v <-- HERE ampierenspel.yourbb\.nl/ at LinkSaver.pl line 2573, <LIST> line 156.
  • Illegal hexadecimal digit 'n' ignored at LinkSaver.pl line 2570, <LIST> line 18.
  • Illegal hexadecimal digit 'n' ignored at LinkSaver.pl line 2570, <LIST> line 18.
  • Illegal hexadecimal digit 'n' ignored at LinkSaver.pl line 2573, <LIST> line 18.
  • Illegal hexadecimal digit 'n' ignored at LinkSaver.pl line 2573, <LIST> line 18.

--Dirk Beetstra T C (en: U, T) 11:39, 4 June 2010 (UTC)Reply
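Most of the warnings above come from a backslash followed by a letter that is not a regex escape. In several entries it looks as if a `b` went missing after the backslash (for instance `\vehbisigorta` where `\bvehbisigorta` may have been intended), though that is only a guess. A small Python sketch of a check that could flag such entries before they reach the bot; the set of "allowed" escapes is an assumption about what a domain-matching entry plausibly needs, and the function name is hypothetical:

```python
import re

# Letter escapes that plausibly belong in a blacklist entry (an assumption).
ALLOWED_LETTER_ESCAPES = set("bBdDsSwW")

# Either an escaped backslash (skipped) or a backslash followed by one ASCII letter.
_ESCAPE = re.compile(r"\\\\|\\([A-Za-z])")

def suspicious_escapes(entry):
    """Return the letter escapes in `entry` that are not in the allowed set."""
    found = []
    for match in _ESCAPE.finditer(entry):
        letter = match.group(1)  # None when the match was an escaped backslash
        if letter is not None and letter not in ALLOWED_LETTER_ESCAPES:
            found.append("\\" + letter)
    return found

# Entries taken from the warnings above, plus one well-formed entry.
for entry in [r"\vehbisigorta\.com", r"sgipt.\org", r"\bgoo\.gl\b"]:
    print(entry, suspicious_escapes(entry))
```

A check like this does not catch everything: in `technorati.com\blogs\viagra-generics.net` it flags `\v` but not `\b`, because `\b` is a valid escape even where a slash was probably meant.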

Is there any way to know what page each regex comes from?  — mikelifeguard@meta:~$  13:39, 12 June 2010 (UTC)Reply
Still need help? -- seth 19:38, 23 July 2010 (UTC)Reply

Exception for backupurl



I need a backup for a UNHCR document (http://www.unhcr.org/3b9cc1144.pdf#page=234). WebCite does not work, but backupurl.com/site.php?key=qtr83b is a workaround. If you save the information from backupurl.com with the suffix .pdf on your hard disk, you can open the PDF document properly. Is there a chance for an exception? Or does somebody know a better workaround? --Diskriminierung 17:18, 30 June 2010 (UTC)

I don't understand what you're trying to do. The UNHCR website isn't disappearing any time soon, I would think, so why do you need a backup?  – mike@meta:~$  16:32, 4 July 2010 (UTC)Reply

It is my habit always to back up URLs. But it doesn't matter anymore. I found a backup at Google Books (which I used); furthermore, it is easy to circumvent this blacklist entry (which I didn't do here). Message me if you want to know how, because I will not post it here officially. End of discussion here. --Diskriminierung 09:42, 6 July 2010 (UTC)

O.K. You want it here: it is as simple as this. Take a link that WebCite will not archive and archive it with backupurl. You get a backupurl link. You can archive this backupurl link with WebCite, even though WebCite would never archive the original link because of a prohibition in robots.txt. If you then put the WebCite link, which is a cloaked backupurl link, into Wikipedia as a reference, the spam blacklist will not notice it. I suppose WebCite should put backupurl on its own blacklist. --Diskriminierung 14:04, 6 July 2010 (UTC)

So if I understand it correctly, webcite can be used as a redirect for blacklist circumvention?



This is interesting. --Dirk Beetstra T C (en: U, T) 14:20, 6 July 2010 (UTC)Reply

Yes, of course archive services (like WebCite or the Internet Archive) can be used as a workaround to the URL blacklist if you're only interested in pushing content rather than trying to get hits on your website. Blacklisting webcitation.org would break a lot of content on English Wikipedia, though, so I hope that doesn't need to be considered. Gavia immer 17:55, 6 July 2010 (UTC)Reply
Since WebCite pays attention to robots.txt, it is possible that they use a blacklist, too. It would be better to inform them first and to blacklist them here only if that fails. --Diskriminierung 09:17, 7 July 2010 (UTC)


way.com

The regex "\bway\.com\b" seems to be affecting all legitimate sites ending with "-way.com", for example http://www.german-way.com and http://www.con-way.com DHN 00:12, 12 August 2010 (UTC)Reply
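The behaviour reported above follows from how `\b` works: a hyphen is not a word character, so there is a word boundary between `-` and `w`, and the entry matches inside `german-way.com`. A minimal Python sketch; the lookbehind variant is just one possible tightening and is an assumption, not the fix that was applied here, and it has not been checked against how the extension assembles its entries:

```python
import re

# The entry as currently listed.
current = re.compile(r"\bway\.com\b")
# Hypothetical tighter entry: refuse a match when "way" is directly
# preceded by a word character or a hyphen.
tighter = re.compile(r"(?<![\w-])way\.com\b")

urls = [
    "http://www.way.com",
    "http://www.german-way.com",
    "http://www.con-way.com",
]

for url in urls:
    # Prints the URL, then whether each entry matches it.
    print(url, bool(current.search(url)), bool(tighter.search(url)))
```

The current entry matches all three URLs, while the tighter one matches only the first, since there the character before `way` is a dot.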

Discussion

This section is for discussion of Spam blacklist issues among other users.

COIBot down up

Due to power issues, the box that is running a.o. COIBot and the linkwatchers has been taken down until the power issues have been resolved. This means that there is a gap from last Saturday until that moment in the link-addition database, and that COIBot will not refresh any reports. I'm sorry for the inconvenience. --Dirk Beetstra T C (en: U, T) 08:17, 7 June 2010 (UTC)Reply

Update
The database repair has completed. I'd like to say that COIBot and the linkwatchers are running normally, and reports are available again, but looking at Special:Contributions/COIBot, I'm not sure that's the case. I may try to restart the bots, or get Beetstra to look at this. COIBot etc is now back to normal.
For the future, we're doing several things to increase reliability:
  1.  We will be using storage with better crash recovery characteristics. In the event of a crash, recovery will be quicker.
  2.  We will be replicating the database to provide a read-only copy for web access. Aside from hopefully providing a useful interface to the data, this serves as a backup guarding against most forms of failure, and helps reduce load on the master server.
  3.  We will be doing periodic database backups. In the event of a catastrophic crash or human error where recovery from the slave is not possible, we can recover from backup.
  4.  We will be setting up a backup version of COIBot (etc). In the event of server failure, the bots can run from another server, writing to what was previously the slave until master recovery is complete. Replicate the intervening changes back to master and you're golden.
 — mikelifeguard@meta:~$  03:28, 16 June 2010 (UTC)Reply
LinkSaver is back to normal. I am not sure where I am going wrong, but I seem to leave it in debug mode, which results in it auto-crashing when I leave (as usual; if I do not leave it in debug mode, it restarts normally ..). I'll keep an eye on it so it keeps running today. --Dirk Beetstra T C (en: U, T) 09:37, 16 June 2010 (UTC)

Another method of finding knockoff domains

In a rare instance where current copyright laws are actually useful, online takedown notices can be a fast way of finding Chinese knockoff spam domains. I found about 300 domains in 15 minutes by searching Google for "site:chillingeffects.org ugg google counterfeit" with very few duplicates. Now to find out whether these domains were spammed, the current state of the linkwatchers (i.e. dead) suggests this will take a while. There is no guarantee that said domains are related. MER-C 08:18, 24 July 2010 (UTC)Reply