[wp-hackers] Blocking SEO robots

David Anderson david at wordshell.net
Wed Aug 6 09:50:52 UTC 2014


This isn't specifically a WP issue, but I think it will be relevant to 
lots of us, trying to maximise our resources...

Issue: I find that a disproportionate amount of server resources is 
consumed by a certain subset of crawlers/robots which contribute nothing. 
I'd like to just block them. I have in mind the various semi-private 
search engines run by SEO companies/backlink-checkers, e.g. 
http://en.seokicks.de/, https://ahrefs.com/. These things happily spider 
a few thousand pages: every author, tag, category and other archive. Some 
of them refuse to obey robots.txt (what particularly annoys me is when 
they ignore the Crawl-Delay directive. I even came across one that 
proudly had a section on its website explaining that robots.txt was a 
stupid idea, so they always ignored it!).
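
To illustrate what a well-behaved crawler is expected to read, here is a 
minimal sketch using Python's standard urllib.robotparser. The robots.txt 
content and the user-agent names (AhrefsBot, ExampleBot) are assumptions 
made for illustration, not taken from any site's real configuration. 
Nothing here enforces anything: a crawler that ignores robots.txt simply 
never runs this kind of check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the bot name and paths are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a polite crawler would conclude before fetching anything.
delay_for_others = parser.crawl_delay("ExampleBot")
seo_bot_allowed = parser.can_fetch("AhrefsBot", "/tag/news/")
other_bot_allowed = parser.can_fetch("ExampleBot", "/tag/news/")

print("Crawl-delay for other bots:", delay_for_others)
print("AhrefsBot may fetch /tag/news/:", seo_bot_allowed)
print("ExampleBot may fetch /tag/news/:", other_bot_allowed)
```

Because these rules are purely advisory, robots that disregard them have 
to be stopped on the server side instead.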

I'd like to just block such crawlers. So: does anyone know where a 
reliable list of the IP addresses used by these services is kept? 
Specifically, I want to block the semi-private or obscure crawlers that 
do nothing useful for my sites. I don't want to block mainstream search 
engines, of course. I've done some Googling, and haven't managed to find 
anything that makes this distinction.
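
For anyone wondering what the blocking logic might look like once such a 
list exists, here is a rough sketch. Everything specific in it is an 
assumption: the user-agent tokens are guesses at how these crawlers 
identify themselves, the networks are reserved documentation ranges used 
as placeholders (not real crawler addresses), and should_block is a 
hypothetical helper name.

```python
import ipaddress

# Assumed user-agent substrings; check your own access logs before relying on them.
BLOCKED_AGENT_TOKENS = ("ahrefsbot", "seokicks")

# Placeholder networks from the reserved documentation ranges, NOT real crawler IPs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def should_block(remote_addr, user_agent):
    """Return True if the request matches a blocked agent token or network."""
    agent = (user_agent or "").lower()
    if any(token in agent for token in BLOCKED_AGENT_TOKENS):
        return True
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: do not block on the IP check alone.
        return False
    return any(addr in network for network in BLOCKED_NETWORKS)


print(should_block("192.0.2.17", "Mozilla/5.0"))
print(should_block("203.0.113.5", "Mozilla/5.0 (compatible; Googlebot/2.1)"))
```

Matching on the User-Agent header only catches crawlers that identify 
themselves honestly, which is why an IP list would still be useful for 
the ones that do not.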

Or alternatively - anyone think this is a bad idea?

Best wishes,
David

-- 
UpdraftPlus - best WordPress backups - http://updraftplus.com
WordShell - WordPress fast from the CLI - http://wordshell.net


