[wp-trac] [WordPress Trac] #23070: Generated robots.txt file with privacy on should have Allow: /robots.txt

WordPress Trac noreply at wordpress.org
Fri Dec 28 17:08:32 UTC 2012


#23070: Generated robots.txt file with privacy on should have Allow: /robots.txt
-----------------------------+-------------------------
 Reporter:  iamfriendly      |       Type:  enhancement
   Status:  new              |   Priority:  normal
Milestone:  Awaiting Review  |  Component:  General
  Version:                   |   Severity:  normal
 Keywords:                   |
-----------------------------+-------------------------
 Scenario:

 You're developing a site online (perhaps a new version of an existing
 site) and you have privacy settings disallowing (or, rather, discouraging)
 crawling. This generates a robots.txt file with


 {{{
 User-agent: *
 Disallow: /
 }}}


 Now, you've finished the new site and want Google to have a look, so you
 stop discouraging spiders. It can take Google up to 48 hours to respider
 your robots.txt file. There's a way 'around' that: the 'Fetch as Google'
 tool (under 'Health') in Google Webmaster Tools.

 If the generated 'private' robots.txt file had


 {{{
 Allow: /robots.txt
 }}}

 in it, then you would be able to force Google to respider your robots.txt
 file (and hence let it spider your sitemap.xml file) when you need it to,
 rather than waiting an indeterminate amount of time.
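
 For reference, the complete 'private' file with the proposed one-liner
 added would then read something like (the placement shown here is just a
 suggestion; Google's parser picks the most specific matching rule, so the
 order of the directives doesn't matter to it):


 {{{
 User-agent: *
 Disallow: /
 Allow: /robots.txt
 }}}
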

 Granted, there are ways to avoid this scenario, but I also know there are
 circumstances where you can't, and adding that simple one-liner can help
 you out.

 Am I missing a trick? Am I being stupid? If not, then I'll write the very
 quick patch.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/23070>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

