[wp-trac] [WordPress Trac] #52457: WordPress vulnerable to search-reflected webspam

Fri Feb 5 20:30:25 UTC 2021

#52457: WordPress vulnerable to search-reflected webspam
-------------------------+-----------------------------
 Reporter:  abagtcs      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  Awaiting Review
Component:  General      |    Version:
 Severity:  normal       |   Keywords:
  Focuses:               |
-------------------------+-----------------------------
 WordPress echoes back searched-for terms on its search results page. For
 example, a search on an installation on www.foo.edu for "scholarship
 programs" would have the URL of:

      https://www.foo.edu/?s=scholarship+programs

 The resulting page would include the text:

      Search results for: scholarship programs

 Web spammers have started to abuse the search features of such sites by
 passing in spam terms and hostnames, hoping to boost the search rankings
 of the spammers’ sites. For example, www.foo.edu might be abused with
 URLs that look like:

 https://www.foo.edu/?s=Buy%20cheap%20viagra%20without%20prescrition%20-%20www.getcheapdrugs.com

 and that produce a page that includes:

      Search results for: Buy cheap viagra without prescrition -
      www.getcheapdrugs.com
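
 As a rough illustration of the reflection described above, the following
 Python sketch decodes the `s` query parameter and builds the heading text.
 It is not WordPress code; the function name `reflected_heading` is
 hypothetical, and it assumes the search term arrives in `s` as in the
 example URLs.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def reflected_heading(url: str) -> str:
    """Return the heading a search results page would echo for this URL.

    Hypothetical helper: assumes the search term is carried in the `s`
    query parameter, as in the example URLs in this ticket.
    """
    query = parse_qs(urlsplit(url).query)
    term = query.get("s", [""])[0]
    # HTML-escaping stops markup injection, but the attacker-chosen words
    # and hostname still end up as indexable page text.
    return "Search results for: " + escape(term)


spam_url = (
    "https://www.foo.edu/?s=Buy%20cheap%20viagra%20without"
    "%20prescrition%20-%20www.getcheapdrugs.com"
)
print(reflected_heading(spam_url))
```

 Escaping the term is not enough on its own: the spam text is plain words
 and a hostname, so it survives escaping unchanged.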

 The spammers place these links in open wikis, blog comments, forums, and
 other link farms, relying on search engines to crawl the links and then
 visit and index the resulting search results pages, spammy content
 included.

 This attack is surprisingly widespread and affects many websites around
 the world. Some other CMSs and sites powered by custom-written code may
 also be vulnerable to this technique, but preliminary investigation
 suggests that, at least in the .edu space, the most targeted web platform
 by far is WordPress. To see many examples of U.S. educational websites
 targeted by the attack, you can do a Google search for:

           site:edu inurl:s "buy"

 There are several possible ways to prevent a website from being abused by
 this method, but adding a noindex response header or robots meta tag (see
 https://developers.google.com/search/docs/advanced/crawling/block-indexing
 ) appears to be the most suitable and effective, especially with
 respect to getting the spam URLs removed from search engine indexes.
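
 The header variant can be sketched as follows. This is a minimal,
 framework-independent Python example and not WordPress code: the function
 name `robots_headers` is hypothetical, and it assumes that any request
 carrying an `s` query parameter is a search results page. `X-Robots-Tag`
 is the response header described in the Google documentation linked above.

```python
from urllib.parse import parse_qs, urlsplit


def robots_headers(url: str) -> dict:
    """Return extra response headers for the given request URL.

    Hypothetical helper: treats any URL with an `s` query parameter
    (even an empty one) as a search results page.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "s" in query:
        # Ask crawlers not to index the page but still follow its links.
        return {"X-Robots-Tag": "noindex, follow"}
    return {}


print(robots_headers("https://www.foo.edu/?s=scholarship+programs"))
print(robots_headers("https://www.foo.edu/about/"))
```

 A header has the side benefit of working for non-HTML responses too, but
 it has to be set before any output is sent, which is one reason a meta
 tag may be easier to add in a theme-driven system.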

 I submitted this problem as a security concern for WordPress on
 HackerOne, but it was closed as not a vulnerability. I am therefore
 suggesting that core be modified to add the appropriate meta tag to the
 <head> of search results pages, either always or by default (with the
 ability to disable it):

    <meta name='robots' content='noindex,follow' />

 This will tell crawlers that the page is not to be indexed, which keeps
 web spammers from using the site's search results pages to promote their
 own sites.
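
 To check whether a given search results page already carries such a tag,
 something like the following Python sketch could be used. It relies only
 on the standard library; the class and function names are hypothetical,
 and fetching the page is left out so the example stays self-contained.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex,follow".
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def is_noindex(html_text: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


page = "<html><head><meta name='robots' content='noindex,follow' /></head></html>"
print(is_noindex(page))
```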

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/52457>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform