[wp-meta] [Making WordPress.org] #5105: Remove bot blocking (403 responses) on *.trac.wordpress.org sites.

Making WordPress.org noreply at wordpress.org
Fri Mar 20 13:47:45 UTC 2020


#5105: Remove bot blocking (403 responses) on *.trac.wordpress.org sites.
----------------------------+--------------------
 Reporter:  jonoaldersonwp  |      Owner:  (none)
     Type:  defect          |     Status:  new
 Priority:  normal          |  Milestone:
Component:  Trac            |   Keywords:  seo
----------------------------+--------------------
 We have systems in place which actively prevent Google (and other agents?)
 from accessing `*.trac.wordpress.org` sites/URLs. We return a 403 response
 (and a raw NGINX template) in these scenarios.

 This 'solution' prevents these agents from seeing/accessing the
 robots.txt file on those respective sites, and thus results in them
 continuing to attempt to crawl/index them (especially as these URLs are
 heavily linked to throughout the wp.org ecosystem).

 I propose that we remove the 403 behaviour, and rely on the robots.txt
 file to do its job.

 If we believe that it's necessary to restrict crawling behaviour for
 performance reasons, then we can consider tailoring the robots.txt rule(s)
 to be more restrictive, and/or implementing performance improvements
 throughout the site(s) (of which there are myriad available and
 achievable, both front-end and back-end).
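
 As a rough illustration of what tailored rules could look like, the sketch
 below checks a stricter robots.txt against two URLs using Python's standard
 `urllib.robotparser`. The rules and the disallowed paths are hypothetical
 examples, not the current `*.trac.wordpress.org` configuration, and the
 `build_parser` helper is only a name chosen for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the live robots.txt.
# The idea: keep tickets crawlable, keep crawlers out of expensive views.
ROBOTS_TXT = """\
User-agent: *
Disallow: /changeset/
Disallow: /browser/
Disallow: /log/
Disallow: /query
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A ticket page is not covered by any Disallow rule.
print(parser.can_fetch("Googlebot", "https://meta.trac.wordpress.org/ticket/5105"))

# A changeset page falls under "Disallow: /changeset/".
print(parser.can_fetch("Googlebot", "https://meta.trac.wordpress.org/changeset/1"))
```

 Under these assumed rules a well-behaved crawler can still reach ticket
 pages but skips the heavier repository views. This only works if the
 crawler is allowed to fetch robots.txt in the first place.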

-- 
Ticket URL: <https://meta.trac.wordpress.org/ticket/5105>
Making WordPress.org <https://meta.trac.wordpress.org/>

