[wp-trac] [WordPress Trac] #16893: Stop or reduce crawling of comment reply ?replytocom URLs
WordPress Trac
wp-trac at lists.automattic.com
Wed Apr 6 07:09:25 UTC 2011
#16893: Stop or reduce crawling of comment reply ?replytocom URLs
------------------------------------+------------------------------
 Reporter:  joelhardi               |       Owner:
     Type:  enhancement             |      Status:  new
 Priority:  normal                  |   Milestone:  Awaiting Review
Component:  Comments                |     Version:  3.1
 Severity:  normal                  |  Resolution:
 Keywords:  has-patch dev-feedback  |
------------------------------------+------------------------------
Comment (by joelhardi):
Reporting back after running attachment:general-template.php.17522.diff
(which adds the robots noindex,nofollow meta tag to ?replytocom URLs) on 2
live sites for the couple of weeks since this ticket was opened.
It's worked as well as (or better than) I expected and I'd recommend
adding this functionality to a future release.
?replytocom pages have not been indexed by Google and there's been no
increase in Googlebot crawling of these sites (previously I'd had
robots.txt block access to these URLs). So even the hypothesis that
Googlebot is smart enough not to recrawl these URLs once it encounters
the meta tag has been borne out.
Also, in Google Webmaster Tools there's a "crawl errors" section which
normally lists URLs blocked by robots.txt. These URLs aren't included (in
fact they don't show up anywhere in Webmaster Tools) since they're blocked
by the meta tag. So the end-user goal of not having these URLs litter
the screen when logging into Webmaster Tools is also achieved. I think
this is a good improvement that should quiet the complaints on the other
thread about Google now crawling these pages since the rel="nofollow"
attribute was dropped from <a> tags, and I don't see any potential
downsides.
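
For readers without the attachment at hand, the behavior described above
(emit a robots noindex,nofollow meta tag only when the request carries the
replytocom query parameter) can be sketched roughly as below. This is an
illustrative, language-neutral sketch in Python, not the patch itself,
which is a PHP change to general-template.php. The function name
robots_meta_for, the constant ROBOTS_META, and the exact tag markup are
assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed markup for the tag; the comment only says "robots noindex,nofollow".
ROBOTS_META = '<meta name="robots" content="noindex,nofollow" />'


def robots_meta_for(url: str) -> str:
    """Return the robots meta tag for comment-reply URLs, else an empty string.

    Hypothetical helper: a URL counts as a comment-reply URL when its query
    string contains a `replytocom` parameter, with or without a value.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return ROBOTS_META if "replytocom" in query else ""


if __name__ == "__main__":
    # example.com URLs are placeholders, not taken from the ticket.
    print(repr(robots_meta_for("http://example.com/a-post/?replytocom=12#respond")))
    print(repr(robots_meta_for("http://example.com/a-post/")))
```

Because the tag is served in the page rather than enforced through
robots.txt, a crawler has to fetch the URL at least once to see it, which
is why the crawl-volume observation above was worth checking.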
--
Ticket URL: <http://core.trac.wordpress.org/ticket/16893#comment:3>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software