Fighting spam in MediaWiki

Fighting spam in MediaWiki (C) WikiVote! 2012 Yury Katkov

description

SMWCon Fall 2012 conference tutorial on fighting spam. The video is available here: http://www.youtube.com/watch?v=rhC1DFeblik&list=PLwtfwT1GnUQRaLki-YcF-_n8ndayi--W5&index=3&feature=plpp_video

Transcript of Fighting spam in MediaWiki

Page 1: Fighting spam in MediaWiki

Fighting spam in MediaWiki

(C) WikiVote! 2012

Yury Katkov

Page 2: Fighting spam in MediaWiki

How effective is spam?

Effective enough to be profitable

• 5.6% click-through rate for porn spam
• 0.02% click-through rate for pharma spam
• 0.0075% click-through rate for Rolex watches spam

Page 3: Fighting spam in MediaWiki

Types of wiki-spam

By user
• Anonymous spam
• Spam from registered users

By page action
• Spamming on a user page
• Spamming by creating new pages
• Spamming on existing pages

By kind of spam
• Posting links to websites
• Posting text with links that do not look like spam, for example links to URL-shortener services
• Posting text without links

Page 4: Fighting spam in MediaWiki

The best anti-spam techniques

1. Active community
2. Bulk-editing cyborgs
3. Blacklists
4. Honeypots
5. Captcha
6. Reasonable delays and confirmations
7. Behaviour analysis (in the worst cases)

Page 5: Fighting spam in MediaWiki

Active community

When people need your wiki, they will get rid of the spam themselves.

Page 6: Fighting spam in MediaWiki

The best anti-spam techniques

Active community, cyborgs

If you have a healthy community, spam will be deleted by the participants themselves.
• Search for heroes
• Turn the superheroes into cyborgs:
– AutoWikiBrowser
– Secretaribot
– User:ClueBot_NG (just amazing)
– Nuke
– Delete Batch
• Allow and encourage the use of bots by your heroes:
– http://en.wikipedia.org/wiki/User:Emijrp/Anti-vandalism_bot_census
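Of the tools listed above, Nuke and Delete Batch are MediaWiki extensions, so they are switched on in LocalSettings.php. The fragment below is a minimal sketch, assuming a current MediaWiki release where extensions are loaded with wfLoadExtension (at the time of the talk the equivalent was a require_once of each extension's entry-point file). Granting both rights to sysops is an illustrative choice, not something the slides prescribe.

```php
// LocalSettings.php fragment (sketch, not a complete configuration)

// Mass-deletion tools for trusted users.
wfLoadExtension( 'Nuke' );        // Special:Nuke: delete recent pages by one user or IP
wfLoadExtension( 'DeleteBatch' ); // Special:DeleteBatch: delete a list of page titles

// Assumption for illustration: let sysops use both tools.
$wgGroupPermissions['sysop']['nuke'] = true;
$wgGroupPermissions['sysop']['deletebatch'] = true;
```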

Page 7: Fighting spam in MediaWiki

Honeypots

Page 8: Fighting spam in MediaWiki

The best anti-spam techniques

Honeypots

Extension:SimpleAntiSpam

Principle: adds hidden form fields that only a bot will fill in
Advantages: plug-and-play
Disadvantages: works only against the dumbest bots
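The honeypot principle can be sketched in a few lines of plain PHP. This is not the code of SimpleAntiSpam; the field name website_url and the function isHoneypotTriggered are hypothetical, chosen only to show the idea: the form contains an input that is hidden from humans with CSS, and any submission that arrives with that input filled in is treated as automated.

```php
<?php
// Hypothetical field name; the real form would hide this input with CSS
// so that human visitors never see or fill it.
const HONEYPOT_FIELD = 'website_url';

/**
 * Returns true when the submission looks automated, i.e. the hidden
 * honeypot field arrived with a non-empty (or non-string) value.
 */
function isHoneypotTriggered(array $postData): bool {
    $value = $postData[HONEYPOT_FIELD] ?? '';
    if (!is_string($value)) {
        // Browsers send a plain string here; anything else is suspicious.
        return true;
    }
    return trim($value) !== '';
}
```

A save handler would call isHoneypotTriggered($_POST) and reject the edit when it returns true. A bot that parses the form and skips hidden fields passes this check, which is why the technique stops only the simplest bots.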

Page 9: Fighting spam in MediaWiki

Blacklisting

Page 10: Fighting spam in MediaWiki

The best anti-spam techniques

Blacklisting

What can be blacklisted:
• Spam text patterns
– Extension:SpamBlacklist
• Spammers' IP addresses, by IP or by DNS:
– DNSBL

$wgEnableDnsBlacklist = true;
$wgDnsBlacklistUrls = array( 'xbl.spamhaus.org', 'opm.tornevall.org' );

$wgSpamBlacklistFiles = array(
    "[[m:Spam blacklist]]",
    "http://en.wikipedia.org/wiki/MediaWiki:Spam-blacklist"
);
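For readers unfamiliar with DNSBL, the lookup works by reversing the octets of the visitor's IPv4 address, appending the blacklist zone, and resolving the resulting hostname: if it resolves, the address is listed. The sketch below shows that general convention in plain PHP. It is not MediaWiki's internal implementation, and the function names dnsblQueryName and isListed are hypothetical.

```php
<?php
/**
 * Builds the hostname that a DNSBL lookup resolves for an IPv4 address:
 * the octets in reverse order, followed by the blacklist zone.
 */
function dnsblQueryName(string $ipv4, string $zone): string {
    if (filter_var($ipv4, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        throw new InvalidArgumentException("Not an IPv4 address: $ipv4");
    }
    $reversed = implode('.', array_reverse(explode('.', $ipv4)));
    return $reversed . '.' . $zone;
}

/**
 * Returns true when the query name has an A record, which is how a
 * DNSBL signals that the address is listed. Needs network access.
 */
function isListed(string $ipv4, string $zone): bool {
    return checkdnsrr(dnsblQueryName($ipv4, $zone) . '.', 'A');
}
```

For example, dnsblQueryName('192.0.2.1', 'xbl.spamhaus.org') yields 1.2.0.192.xbl.spamhaus.org.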

Page 11: Fighting spam in MediaWiki

The best anti-spam techniques

Blacklisting

Death to URL shorteners!
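One way to act on this without any extension is the core $wgSpamRegex setting, which rejects every edit whose text matches the given regular expression. The fragment is a sketch: the three shortener domains are examples picked for illustration and are not taken from the slides, so the list should be adapted to whatever actually shows up on your wiki.

```php
// LocalSettings.php fragment (sketch)

// Reject any edit whose text contains a link to one of these URL shorteners.
// The domain list is an illustrative assumption; extend it as needed.
$wgSpamRegex = '/\b(bit\.ly|tinyurl\.com|goo\.gl)\//i';
```

There is no on-wiki override for this setting, so an overly broad pattern blocks administrators as well.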

Page 12: Fighting spam in MediaWiki

CAPTCHA

Page 13: Fighting spam in MediaWiki

The best anti-spam techniques

CAPTCHA

Many CAPTCHA extensions exist, but you won't go wrong if you choose ConfirmEdit:
• Used by all Wikimedia sites
• Includes several types of CAPTCHA
• Easily configurable and flexible

Page 14: Fighting spam in MediaWiki

The best anti-spam techniques

Which CAPTCHA to choose?

QuestyCaptcha

Advantages
• Great against autonomous MediaWiki spambots (asks meaningful questions)
• It's as smart as you
• Can be adapted to your wiki!

Disadvantages
• No good against guided spambots

Usage
• For small and medium-sized wikis

ReCaptcha

Advantages
• Unlimited set of captchas
• Works most of the time

Disadvantages
• May be tricky for users

Usage
• For big public wikis, or if you know that someone is hunting you

You may also like: Asirra
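QuestyCaptcha is one of the CAPTCHA modules that ship with ConfirmEdit. The fragment below is a minimal sketch of enabling it, assuming a current MediaWiki release that loads extensions with wfLoadExtensions (at the time of the talk, ConfirmEdit.php and QuestyCaptcha.php were included with require_once instead). The two questions are invented placeholders; the point of the module is that you write questions only your own audience can answer.

```php
// LocalSettings.php fragment (sketch)

wfLoadExtensions( array( 'ConfirmEdit', 'ConfirmEdit/QuestyCaptcha' ) );
$wgCaptchaClass = 'QuestyCaptcha';

// Placeholder questions: replace them with ones specific to your wiki.
$wgCaptchaQuestions[] = array(
    'question' => 'What is the name of the software this wiki runs on?',
    'answer'   => 'MediaWiki',
);
$wgCaptchaQuestions[] = array(
    'question' => 'Type the second word of the site name shown in the logo.',
    'answer'   => 'example', // hypothetical answer
);
```

Changing the questions from time to time helps once a spammer has answered them by hand.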

Page 15: Fighting spam in MediaWiki

The best anti-spam techniques

What should be protected by a CAPTCHA?

A CAPTCHA is usually needed when:
• An anonymous user is trying to register
• An anonymous user is adding a link
• There are too many login attempts
– BTW, it's good to turn on $wgPasswordAttemptThrottle
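With ConfirmEdit, the three cases above map onto entries of $wgCaptchaTriggers. The fragment is a sketch of that mapping. Letting autoconfirmed users skip the CAPTCHA and the specific throttle numbers are illustrative assumptions. The single-array form of $wgPasswordAttemptThrottle is the one in use around the time of the talk; newer releases document a list of such arrays, so check the manual for your version.

```php
// LocalSettings.php fragment (sketch, ConfirmEdit must already be loaded)

$wgCaptchaTriggers['createaccount'] = true;  // someone registers an account
$wgCaptchaTriggers['addurl']        = true;  // an edit adds a new external link
$wgCaptchaTriggers['badlogin']      = true;  // after failed login attempts
$wgCaptchaTriggers['edit']          = false; // do not challenge every edit
$wgCaptchaTriggers['create']        = false; // do not challenge every new page

// Assumption: established accounts are trusted enough to skip the CAPTCHA.
$wgGroupPermissions['autoconfirmed']['skipcaptcha'] = true;

// Illustrative values: at most 5 failed logins per 300 seconds.
$wgPasswordAttemptThrottle = array( 'count' => 5, 'seconds' => 300 );
```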

Page 16: Fighting spam in MediaWiki

The best anti-spam techniques

Express yourself in a CAPTCHA

Page 17: Fighting spam in MediaWiki

Tricks and tips

Page 18: Fighting spam in MediaWiki

The best anti-spam techniques

Configuration tricks: permissions

Depending on how desperate you feel, you can do one of the following:

• Force people to register before they are allowed to edit:

$wgGroupPermissions['*']['edit'] = false;
$wgShowIPinHeader = false;

• Add a waiting period after signing up:

$wgAutoConfirmAge = 3600*24;
$wgGroupPermissions['*']['createpage'] = false;
$wgGroupPermissions['user']['createpage'] = false;
$wgGroupPermissions['autoconfirmed']['createpage'] = true;

• Require e-mail confirmation to edit:

$wgEmailConfirmToEdit = true;
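The same "reasonable delays" idea can be pushed a little further with two other core settings: $wgAutoConfirmCount, which adds an edit-count requirement to the age requirement, and $wgRateLimits, which caps how fast new or anonymous users can edit. The fragment is a sketch and all numbers in it are illustrative assumptions, not recommendations from the slides.

```php
// LocalSettings.php fragment (sketch)

// Autoconfirmed status needs a one-day-old account AND at least 5 edits.
$wgAutoConfirmAge   = 3600 * 24;
$wgAutoConfirmCount = 5;

// Rate limits are array( number of actions, period in seconds ).
$wgRateLimits['edit']['newbie'] = array( 4, 60 ); // not yet autoconfirmed accounts
$wgRateLimits['edit']['ip']     = array( 4, 60 ); // anonymous editors, per IP address
```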

Page 19: Fighting spam in MediaWiki

The best anti-spam techniques

Configuration tricks: permissions

Depending on how desperate you feel, you can do one of the following:

• Require the approval of new accounts by a bureaucrat:

require_once("$IP/extensions/ConfirmAccount/SpecialConfirmAccount.php");

• Turn off registration for everyone:

$wgGroupPermissions['*']['createaccount'] = false;

• Turn off the server

Page 20: Fighting spam in MediaWiki

The best anti-spam techniques

Email configuration

The wiki allows people to send e-mails to each other. The following is always useful:
• Require e-mail authentication before any e-mail function can be used (except the password reminder):

$wgEnableEmail = true;
$wgEmailAuthentication = true;

Page 21: Fighting spam in MediaWiki

Behavior analysis

Page 22: Fighting spam in MediaWiki

The best anti-spam techniques

Behavior analysis

In the worst cases you can install AbuseFilter:
• Define heuristics of suspicious behavior
• Can also handle vandalism
• Use with great care!
• Tip: you can copy filters for bad behavior from Wikipedia: http://en.wikipedia.org/wiki/Special:AbuseFilter
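Installing AbuseFilter is only half of the setup, because nobody can write or inspect filters until the corresponding rights are granted. The fragment below is a minimal sketch, assuming a current MediaWiki release with wfLoadExtension. Restricting filter editing to sysops while keeping the filters and the log public is one possible policy chosen for illustration.

```php
// LocalSettings.php fragment (sketch)

wfLoadExtension( 'AbuseFilter' );

// Only sysops may create and change filters ("use with great care").
$wgGroupPermissions['sysop']['abusefilter-modify'] = true;

// Assumption: filters and the abuse log are readable by everyone.
$wgGroupPermissions['*']['abusefilter-view'] = true;
$wgGroupPermissions['*']['abusefilter-log']  = true;

// Detailed log entries stay with sysops.
$wgGroupPermissions['sysop']['abusefilter-log-detail'] = true;
```

A cautious rollout is to create each new filter with no action other than logging and to add warnings or blocks only after the log shows it matches what you expect.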

Page 23: Fighting spam in MediaWiki

Tel.: 8 (499) 506 74 31
E-mail: [email protected]

Website: wikivote.ru