
Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Courcelles (talk | contribs) at 22:23, 22 December 2012 (dynoguy.com: Added). It may differ significantly from the current version.

Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report; there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links (connect) - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged · current log 2024/12.

snippet for logging
{{sbl-log|4898055#{{subst:anchorencode:SectionNameHere}}}}


Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.
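As an illustration of the "basic URL" convention above, here is a minimal Python sketch that reduces a pasted link to its bare domain. The function name `bare_domain` is hypothetical and is not part of any blacklist tooling; it only uses the standard library.

```python
from urllib.parse import urlsplit


def bare_domain(url: str) -> str:
    """Return the host without scheme, leading 'www.', port, or path."""
    # urlsplit only recognises the host when a scheme or '//' is present,
    # so prepend '//' to inputs such as 'www.example.com/page'.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    # Drop a leading 'www.' so the report carries only the basic domain.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(bare_domain("http://www.example.com"))
```

The sketch deliberately keeps other subdomains (for example `blog.example.com`), since whether a subdomain matters is a judgment for the person filing the report.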

inforapid.org



This is just a meta page that hasn't got its own content; it's rather a Wikipedia mirror. So it's not compatible with WP:EL (anywhere). Links to the domain are used as external links and even as references across several projects. COIBot has got 55 records, but there are many more:

Top 10 editors who have added inforapid.org: [name not readable due to unicode-problems] (4), Wikiherder (3), Cinmad (3), 78.6.226.210 (3), Susann Schweden (2), KurtR (2), Bernd Rieke (2), Veronidae (2), CorenSearchBot (1), Aschroet (1).
Top 10 wikis where inforapid.org has been added: w:en (20), w:de (14), w:it (6), w:bg (4), w:es (3), wikt:de (2), w:ru (1), w:pt (1), w:fr (1), w:az (1).

I deleted a couple of links in w:en, w:fr, w:it, but there are still many links left. The problem is: if I blacklist the page at meta now, some problems may occur; e.g., some archive bots can't cope with threads containing blacklisted links. I could use CamelBot to remove the links from ANS and replace the links with URLs in non-ANS, but my bot has the bot flag in w:de only.
What do we normally do in such cases? Are there global bots that could help? -- seth (talk) 18:58, 2 June 2012 (UTC)

I have now deleted the ANS links by hand. What should we do with the rest of the links? -- seth (talk) 15:44, 3 June 2012 (UTC)
I think we should monitor this one. I see no direct evidence of spamming, but the link is not very useful. EdBever (talk) 07:19, 9 June 2012 (UTC)
It's just a wp-mirror, so it does not give any additional information to articles. Apart from that, it has even been used many times as a reference. So it's not intended spamming by one person, but unintended spamming by many people. -- seth (talk) 09:39, 9 June 2012 (UTC)
I don't think there is such a thing as unintended spamming. A number of users found the site useful and inserted it into various articles. We shouldn't judge the site's content on this page, since it is meant for fighting spam. EdBever (talk) 14:20, 14 June 2012 (UTC)
I understand your point. However, the users that link to that page just ignore/don't know our rules. Links to wp mirrors satisfy neither w:de:WP:EL nor w:en:WP:EL. I don't think that other wp-projects want such links.
In w:de it is common to blacklist wp mirrors. Is that different in other wiki projects? -- seth (talk) 15:57, 16 June 2012 (UTC)
I have now blacklisted the domain at w:de and asked for opinions at w:en. -- seth (talk) 09:20, 9 December 2012 (UTC)
It was only linked to in one article at enWP; the rest was mostly user pages. It has been added by many people, rather than through an evident concerted effort. I think that we should be leaving this to a local decision, not something where we are making a qualitative decision. -- billinghurst sDrewth 11:55, 9 December 2012 (UTC)

DGtraffic (Indonesia) spam on Wikipedia

DGTraffic is a large Indonesian SEO firm. [1] [2]

Reference:

Accounts

[4]

[5]

id wikipedia only

id wikipedia only

Domains spammed

These were spammed across the Indonesian and English Wikipedias. The spam added to en.wikipedia was done solely for "link love", since the links led to Indonesian-language sites. Some spam is known to have been added to Commons, Simple, Lombard and Ten. Unfortunately, global contributions search is down, so I don't know if there's more out there.

Related domains to blacklist

SEO blogs







Spammed domains not listed for blacklisting today


    • News publication possibly useful as a reference on id.wikipedia


    • Big Indonesian financial company
Related domain not listed for blacklisting today


    • SEO client; domain not known to have been spammed yet

--A. B. (talk) 01:33, 25 September 2012 (UTC)

More domains and accounts

--A. B. (talk) 02:51, 11 October 2012 (UTC)

Myspacetv.com



Noted this as caught by LiWa3/COIBot. This is a plain redirect site to MySpace's video section (myspace.com/tv). Generally, we blacklist redirect sites on sight, though for dedicated servers an exception may sometimes be made. However, we do have quite a few myspace pages blacklisted (10 myspace.com/<id> rules), and there is a possible issue there (links to videos should always be double-checked, and if a dedicated video server like YouTube has copyright violations on it, how about a social networking site?). Should we just monitor this, or should we consider blacklisting and cleaning/converting? --Dirk Beetstra T C (en: U, T) 11:08, 12 November 2012 (UTC)

Example of what this leads to: w:en:Avril_Lavigne's_Make_5_Wishes#Video_Episodes. For me they are all blocked due to a local firewall, and I hear that others can't see them either since they have to sign up. Therefore, also suggesting:

  • myspace.com/tv

That just fails our inclusion standards. We will need to have the 200 links cleaned up cross-wiki, though. --Dirk Beetstra T C (en: U, T) 11:18, 12 November 2012 (UTC)

Note that the Avril Lavigne's Make 5 Wishes section has been nuked. --Beetstra public (talk) 06:58, 14 November 2012 (UTC)

stellissimo.in redirects

Added. billinghurst sDrewth 03:16, 10 December 2012 (UTC)

dynoguy.com



billinghurst sDrewth 03:23, 10 December 2012 (UTC)

Added. billinghurst sDrewth 03:24, 10 December 2012 (UTC)
Added. --Courcelles 22:23, 22 December 2012 (UTC)

etourismsolutions.com

Nasty sockpuppet spam. MER-C (talk) 06:28, 22 December 2012 (UTC)

Chinese spammer



Added. --Courcelles 22:21, 22 December 2012 (UTC)

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports; please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).
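The staleness rule in the parenthesis above reads as a conjunction of three conditions. The sketch below is one interpretation of that sentence, not COIBot's actual code: the `Report` fields and the `is_stale` function are hypothetical names, and treating "not edited in the last 7 days" as "at least 7 days ago" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Report:
    links_reported: int    # number of link additions listed in the report
    last_edited: datetime  # time of the last edit to the report page
    last_editor: str       # user name of the last editor


def is_stale(report: Report, now: datetime) -> bool:
    """All three conditions from the text must hold at the same time."""
    return (
        report.links_reported < 5
        and now - report.last_edited >= timedelta(days=7)  # assumed boundary
        and report.last_editor == "COIBot"
    )


example = Report(3, datetime(2012, 12, 1), "COIBot")
print(is_stale(example, datetime(2012, 12, 22)))
```

Under this reading, a human edit to the report (for example a note in the discussion section) makes the last editor someone other than COIBot, which keeps the report from being archived automatically.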

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
nepalsuperleague.com 2024-12-13 13:32:21 COIBot 172.67.192.108 R Spazzolo 2024-12-13 08:46:19 2502 4 0 0 3
polnische-ostsee-urlaub.de 2024-12-10 11:28:24 COIBot 85.215.148.185 2A02:8109:9B82:800:1C76:452E:F96D:CA28, 2A02:8109:9B82:800:ADFB:148A:3F17:F6A8, 2A02:8109:9B82:800:D851:3110:B8CE:CD4A, 2A02:8109:9B82:800:F4A3:475:FADC:C8A5, 2A02:8109:9B82:800:F5E1:9B27:B33E:DBF, 217.89.121.67, 80.157.174.116 2024-12-10 10:37:57 18 3
syr-ray.by 2024-12-13 13:58:33 COIBot 93.125.24.156 212.98.189.95, 37.214.67.226 2024-12-13 13:53:56 10 3
topvzla.com 2024-12-13 14:34:56 COIBot 192.0.78.131 R Mauriziok 1970-01-01 05:00:00 2502 646 0 0 0
truyenthongphattrien.com.vn 2024-12-13 14:44:03 COIBot 112.78.2.100 1.52.185.27, 103.249.22.173, 125.235.232.230 2024-12-13 14:38:16 8 5
uk.zhujiworld.com 2024-12-13 13:01:26 COIBot 172.67.134.136 178.133.76.95, 188.130.220.16, 188.163.113.135, 188.163.113.16, 188.163.113.197, 194.32.237.242 2024-12-03 12:01:06 11 7
yousantape.com 2024-12-13 06:41:42 COIBot 88.223.84.146 103.248.252.193, 47.75.98.142 2024-12-13 06:30:01 16 3

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at MediaWiki talk:Spam-blacklist at that wiki.
  • Seth's blacklist check tool: remember to enter any relevant language code

pro-speleo.ru



Please EXCLUDE pro-speleo.ru from the \bpro-*\.ru\b spam series, since it has no relation to the pro-gorod series; instead, it is a major source for speleology-related articles in Russian. --Untifler (talk) 04:03, 2 November 2012 (UTC)

Removed. I have made a less-than-perfect change to the regex; however, it will resolve your issue. -- billinghurst sDrewth 07:08, 2 December 2012 (UTC)
In reply to Untifler's "The question is why there was a decision to ban all \bpro-*\.ru\b series": if I recall correctly, more and more popped up, so a blanket ban was applied, maybe under the false assumption that there were no other sites that could be of interest anyway (not thát big a change anyway), or maybe someone actually did a check to see whether there were other domains and did not find any at that time. Anyway, I think this has been resolved. --Beetstra public (talk) 14:00, 2 December 2012 (UTC)
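One common way to exempt a single domain from a blanket rule like the one discussed above is a negative lookahead. The Python sketch below is only an illustration and is not necessarily the change that was made to the blacklist: both patterns are hypothetical stand-ins (the character class `[a-z0-9-]*` is an assumption about what the blanket rule was meant to cover), and the way the extension wraps entries is not modelled.

```python
import re

# Hypothetical stand-in for the blanket rule: any pro-<something>.ru domain.
BLANKET = re.compile(r"\bpro-[a-z0-9-]*\.ru\b")

# Same rule, but the negative lookahead (?!speleo\.ru\b) refuses to match
# when the text right after "pro-" is exactly "speleo.ru".
NARROWED = re.compile(r"\bpro-(?!speleo\.ru\b)[a-z0-9-]*\.ru\b")


def is_blocked(pattern: re.Pattern, url: str) -> bool:
    """True when the pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


for url in ("http://pro-gorod.ru/news", "http://pro-speleo.ru/"):
    print(url, is_blocked(BLANKET, url), is_blocked(NARROWED, url))
```

The lookahead only exempts the exact domain; a similar-looking host such as `pro-speleology.ru` would still be caught by the narrowed pattern.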

youtu.be



This is officially used by YouTube, and when a link is shared, it is automatically given in this format:

http://youtu.be/3pZUCKt0RKc

First of all, there is generally absolutely NO reason to use link redirectors; the full link can simply be used. Youtu.be is a specific redirect site (at least it does not provide 'custom' links pointing to other sites); however, it can be used to circumvent the blacklisting of individual YouTube videos (on the individual language projects and on meta, several specific YouTube links are blacklisted because they were spammed or otherwise abused). That, combined with the normal problems that YouTube links have (although there is a lot of good material, there are still plain copyright violations, it is not available to all, etc.), is enough reason to decline this. You can use http://www.youtube.com/watch?v=3pZUCKt0RKc (IIRC, there is a checkbox in the share option that gives you the full link; otherwise, converting the 'youtu.be/' to 'youtube.com/watch?v=' does the trick). Thanks. --Dirk Beetstra T C (en: U, T) 11:06, 10 November 2012 (UTC)
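The conversion mentioned in the reply above (turning 'youtu.be/' into 'youtube.com/watch?v=') can be sketched in a few lines of Python. The function name `expand_youtu_be` is hypothetical, and the sketch assumes the video ID is the whole path of the short link; any query string on the short link is dropped.

```python
from urllib.parse import urlsplit


def expand_youtu_be(url: str) -> str:
    """Rewrite a youtu.be short link as a full youtube.com watch link."""
    parts = urlsplit(url)
    # Leave anything that is not a youtu.be link untouched.
    if parts.hostname != "youtu.be":
        return url
    video_id = parts.path.lstrip("/")
    if not video_id:
        return url
    return "http://www.youtube.com/watch?v=" + video_id


print(expand_youtu_be("http://youtu.be/3pZUCKt0RKc"))
```

A full link produced this way is still subject to any blacklist entries for specific YouTube videos, which is the point of not allowing the short form.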

cartconvert.allowed.org



I suggest whitelisting cartconvert.allowed.org. It is a RESTful cartography service for transforming geolocation-bearing points into other geolocation representations. I tried adding the service to the German page http://de.wikipedia.org/wiki/Bundesmeldenetz, but because *.allowed.org is blacklisted, it was not possible. I am also the author of cartconvert.allowed.org. -- The preceding unsigned comment was added by 86.33.210.43 (talk)

I am not sure why Wikipedia should link to this; could you elaborate? Note that global whitelisting is impossible; for that you'd need to place requests on the wiki where this link may be of interest. Thanks. --Beetstra public (talk) 05:38, 25 November 2012 (UTC)
The German article http://de.wikipedia.org/wiki/Bundesmeldenetz describes the former geodetic datum of Austria. Lots of legacy data in this format is still lingering around. I am also involved in open government data, and while a description of the conversion for this geodetic datum is available, no authoritative public service is available to my knowledge. I provided this service as an open source project, and it's available online as cartconvert.allowed.org. I would like to add an external link on the mentioned page so that people interested in putting points on, e.g., openstreetmap in the geodetic datum of Austria (Bundesmeldenetz) have a place to start. -- The preceding unsigned comment was added by 193.171.58.240 (talk)

www.bodybuilding-magazin.de



Hi, the reference in article de:Fouad Abiad cannot be formatted properly, due to the blacklist entry for "bodybuilding" on meta. --Valvetube (talk) 11:28, 27 November 2012 (UTC)

Updated the regex; you should be right to go. -- billinghurst sDrewth 12:39, 27 November 2012 (UTC)
Thanks. --Valvetube (talk) 13:43, 27 November 2012 (UTC)

Removed, well, in a fashion. -- billinghurst sDrewth 11:19, 30 November 2012 (UTC)

www.apelosurgentes.com.br



This is the site of the apparitions of the Virgin Mary in Brazil, and it was incorrectly marked as spam because I edited the same topic in pages in 5 languages in a short period of time. I did not know that it would be considered spam, because the pages were in different languages and on the same topic. Maybe this can be considered a bug in the spam classification. Please remove this site from the spam list. -- The preceding unsigned comment was added by 189.105.11.106 (talk)

That is quite an understatement, seen the from these IPs discussed here. --Beetstra public (talk) 08:19, 8 December 2012 (UTC)

pantheon.org



I would like to suggest that Pantheon.org, home of Encyclopedia Mythica, be whitelisted. There are thousands of good informational pages about mythological figures, mythological creatures and ancient faiths from around the world there. The site has been 100% informational and highly ranked since 1998.

The fact that Encyclopedia Mythica is such a large, well-established, popular site is probably why it tripped the filter. Considering the sheer amount of information in the Encyclopedia Mythica, the risk of breaking hundreds of citations, some of which cannot be replaced, is very high.

I found this block when trying to reference the site to restore a citation in the Al-mi'raj article on Wikipedia. The creature in question is obscure, and the information found in the Encyclopedia Mythica Witchcraft section about the "MIRAJ" is a valuable addition to the page. I have not found this information anywhere else online.

--70.174.126.195 08:46, 10 December 2012 (UTC)

This is only on the English Wikipedia blacklist, not on the global one. I suggest you ask for whitelisting there. Pmlineditor (t · c · l) 18:11, 14 December 2012 (UTC)

valleyofflowers.info



Please exclude valleyofflowers.info from the spam list. The website has great information about the Valley of Flowers. It also has great information about the flowers found in Valley of Flowers National Park. I don't know much about editing Wikipedia, but I feel that the site has great content and that the information can be useful for Wikipedia users.

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

I would like to add the following "published" template for documentation purposes. --LoKiLeCh (talk) 22:44, 1 November 2012 (UTC) {{published|cite=web|url=http://www.mechanical-engineering.suite101.com/article/ten-mechanical-failure-modes-a149010|title=Ten Mechanical Failure Modes |legal=yes|publisher=www.mechanical-engineering.suite101.com}}

Discussion

This section is for discussion of Spam blacklist issues among other users.

New cross-wiki linksearch wiki sets

Don't click on these, they are just examples and you will get a timeout.

  • This cannot go any higher; the maximum execution time on Google App Engine is only 60 seconds.
  • {en,de,fr}.{wikipedia,wikibooks,wikiquote,wiktionary} + Commons, Meta and mediawiki.org (15 wikis). I will cover the new travel guide project when it is ready.
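The brace notation in the second bullet expands to every language/project combination, which is where the count of 15 comes from (3 languages × 4 projects, plus 3 extra wikis). A short Python sketch of that expansion follows; the variable names are illustrative only.

```python
from itertools import product

languages = ["en", "de", "fr"]
projects = ["wikipedia", "wikibooks", "wikiquote", "wiktionary"]
extras = ["Commons", "Meta", "mediawiki.org"]

# Cartesian product gives every {language}.{project} pair, then add the extras.
wiki_set = [f"{lang}.{proj}" for lang, proj in product(languages, projects)]
wiki_set += extras

print(len(wiki_set))
print(wiki_set)
```

Each additional language adds four more wikis to query, which is relevant to the execution-time limit mentioned in the first bullet.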

Just a reminder, I also have a spam archive search (en and meta only). MER-C (talk) 13:37, 1 October 2012 (UTC)

Hello MER-C, thanks for this. I'd like to note that currently spamarchivesearch.jsp only gives links to en.wikipedia. For example http://wikipediatools.appspot.com/spamarchivesearch.jsp?query=whale.to (recurring request listed here). Regards, -- MarcoAurelio (talk) 13:52, 1 October 2012 (UTC)
I blame MediaWiki's sucky search engine for that. MER-C (talk) 00:50, 2 October 2012 (UTC)
I changed the search string. It should now work reasonably. MER-C (talk) 12:12, 8 October 2012 (UTC)

EzineArticles.com



What's the matter with this website? --Horcrux92 (talk) 08:32, 12 October 2012 (UTC)

It was blocked due to this request. -- billinghurst sDrewth 22:28, 12 October 2012 (UTC)
The following discussion is closed.

billinghurst sDrewth 06:30, 15 November 2012 (UTC)

Friendly reminder

Please don't comment out domains on the blacklist to remove them; just remove them, and please log the removals. Also, please log the regexp changes you make. I say this because I currently see some domains commented out, and I don't know if that's temporary or permanent. If those are permanent, they should be removed. Thanks. -- MarcoAurelio (talk) 13:37, 17 October 2012 (UTC)