[wp-trac] [WordPress Trac] #18465: Prevent search engines from indexing wp-admin and wp-includes
WordPress Trac
wp-trac at lists.automattic.com
Fri May 18 07:23:00 UTC 2012
#18465: Prevent search engines from indexing wp-admin and wp-includes
--------------------------+---------------------
Reporter: Viper007Bond | Owner: ryan
Type: enhancement | Status: closed
Priority: lowest | Milestone: 3.4
Component: General | Version: 3.2.1
Severity: trivial | Resolution: fixed
Keywords: has-patch |
--------------------------+---------------------
Comment (by koebenhavn event):
Just to clarify.
It is only partially true that robots.txt does not stop crawlers from
indexing. While the URL and anchor text may still appear in the Google
index if you search for them specifically, the crawler will not index the
actual content of the page. What the Google help page says is that you
would have to know and search for the specific URL or anchor text to find
it in the Google index, and you would not see the actual content.
/Event
Replying to [comment:28 joostdevalk]:
> This is a valid problem but the "fix" doesn't actually fix it. While the
> addition to robots.txt blocks the crawler from opening the URL, a URL that
> cannot be opened CAN still be listed in the index if Google finds enough
> links pointing to it, see the note on
> [http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
> this Google help]. An example of this can be seen on my Dutch domain:
>
> https://www.google.com/search?q=site%3Ayoast.nl++inurl%3Awp-admin
>
> The solution is to not exclude the admin directory in robots.txt, but to
> send an X-Robots-Tag HTTP header of value noindex (the HTTP version of a
> robots meta tag [http://www.events-københavn.dk events]) for the files
> in admin and for admin-ajax.php, will add a patch.
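
To make the distinction in the quoted comment concrete, here is a minimal Python sketch. It is not the patch from this ticket, and it is not WordPress code. It only illustrates the two mechanisms under discussion: a robots.txt Disallow rule stops a compliant crawler from fetching a URL (so the crawler never sees any noindex signal on it), whereas leaving the path crawlable and sending an X-Robots-Tag header lets the crawler read the noindex instruction. The helper names `can_crawl` and `robots_headers`, the sample robots.txt strings, and the example.com URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a compliant crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def robots_headers(path: str) -> dict:
    """Hypothetical helper: extra response headers for a request path.

    Anything under /wp-admin/ (which includes admin-ajax.php) gets a
    noindex header; all other paths get no extra header.
    """
    if path.startswith("/wp-admin/"):
        return {"X-Robots-Tag": "noindex"}
    return {}


# Approach 1 (assumed sample file): block crawling of the admin directory.
BLOCKING = "User-agent: *\nDisallow: /wp-admin/\n"
# Approach 2 (assumed sample file): allow crawling, rely on the header.
OPEN = "User-agent: *\nDisallow:\n"

admin_url = "http://example.com/wp-admin/index.php"

# With the Disallow rule the page is never fetched, so a header on it is never seen.
blocked = not can_crawl(BLOCKING, admin_url)
# Without the rule the page is fetched and the noindex header can take effect.
header_visible = can_crawl(OPEN, admin_url) and "X-Robots-Tag" in robots_headers("/wp-admin/index.php")
```

In this sketch the two approaches are mutually exclusive for a given path: the header can only do its job on URLs that robots.txt does not disallow.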
--
Ticket URL: <http://core.trac.wordpress.org/ticket/18465#comment:38>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software