[wp-hackers] Blocking SEO robots

Blue Chives info at bluechives.com
Wed Aug 6 10:15:05 UTC 2014


If you are running Apache (or another web server that honours .htaccess files), you can use the .htaccess file to block users/bots based on their User-Agent string.

This article should help:

http://www.javascriptkit.com/howto/htaccess13.shtml
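
As a rough sketch of the approach, something along these lines in .htaccess would refuse requests whose User-Agent matches a pattern. This assumes Apache with mod_rewrite enabled. The two bot names are only illustrative guesses based on the services mentioned in this thread, so check your access logs for the exact strings before relying on them:

```apache
# Sketch only: requires mod_rewrite. The names in the pattern are
# assumptions; replace them with the User-Agent substrings seen in your logs.
<IfModule mod_rewrite.c>
RewriteEngine On
# [NC] makes the match case-insensitive
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|SEOkicks) [NC]
# [F] answers with 403 Forbidden, [L] stops processing further rules
RewriteRule .* - [F,L]
</IfModule>
```

On a WordPress site this would normally go above the "# BEGIN WordPress" block so that it is evaluated before the permalink rules. Note that it only catches crawlers that send an honest User-Agent; anything that spoofs a browser string will still get through.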

Alternatively do drop me a line if you would like a hand with this as we manage the hosting for a number of WordPress blogs/websites.


Cheers
John

> On 6 Aug 2014, at 10:58, Eric Hendrix <hendronix at gmail.com> wrote:
> 
> This is not a bad idea at all - and I'd like to second the request if
> anyone has researched this previously. David is correct as I've found the
> same issue with valuable server resources - especially when you're running
> a handful of heavy WP sites.
> 
> So, bot experts, what say you?
> 
> 
>> On Wed, Aug 6, 2014 at 5:50 AM, David Anderson <david at wordshell.net> wrote:
>> 
>> This isn't specifically a WP issue, but I think it will be relevant to
>> lots of us, trying to maximise our resources...
>> 
>> Issue: I find that a disproportionate amount of server resources are
>> consumed by a certain subset of crawlers/robots which contribute nothing. I'd
>> like to just block them. I have in mind the various semi-private search
>> engines run by SEO companies/backlink-checkers, e.g.
>> http://en.seokicks.de/, https://ahrefs.com/. These things happily spider
>> a few thousand pages, every author, tag, category, etc., archive. Some of
>> them refuse to obey robots.txt (what specifically annoys me is when
>> they ignore the Crawl-Delay directive. I even came across one that proudly
>> had a section on its website explaining that robots.txt was a stupid idea,
>> so they always ignored it!).
>> 
>> I'd like to just block such crawlers. So: does anyone know of where a
>> reliable list of the IP addresses used by these services is kept?
>> Specifically, I want to block the semi-private or obscure crawlers that do
>> nothing useful for my sites. I don't want to block mainstream search
>> engines, of course. I've done some Googling, and haven't managed to find
>> something that makes this distinction.
>> 
>> Or alternatively - anyone think this is a bad idea?
>> 
>> Best wishes,
>> David
>> 
>> --
>> UpdraftPlus - best WordPress backups - http://updraftplus.com
>> WordShell - WordPress fast from the CLI - http://wordshell.net
>> 
>> _______________________________________________
>> wp-hackers mailing list
>> wp-hackers at lists.automattic.com
>> http://lists.automattic.com/mailman/listinfo/wp-hackers
> 
> 
> 
> -- 
> 
> 
> *Eric A. Hendrix*
> *USA, MSG(R)*
> hendronix at gmail.com
> (910) 644-8940
> 
> *"Non Timebo Mala"*

