[wp-trac] [WordPress Trac] #52457: WordPress vulnerable to search-reflected webspam
WordPress Trac
noreply at wordpress.org
Fri Feb 5 20:30:25 UTC 2021
#52457: WordPress vulnerable to search-reflected webspam
-------------------------+-----------------------------
Reporter: abagtcs | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version:
Severity: normal | Keywords:
Focuses: |
-------------------------+-----------------------------
WordPress echoes back searched-for terms on its search results page. For
example, a search for "scholarship programs" on an installation at
www.foo.edu would have the URL:
https://www.foo.edu/?s=scholarship+programs
The resulting page would include the text:
Search results for: scholarship programs
Web spammers have started to abuse the search features of WordPress sites
by passing in spam terms and hostnames in hopes of boosting the search
rankings of the spammers’ sites. For example, www.foo.edu might be abused
through URLs that look like:
https://www.foo.edu/?s=Buy%20cheap%20viagra%20without%20prescrition%20-%20www.getcheapdrugs.com
and that produce a page that includes:
Search Results for: Buy cheap viagra without prescrition -
www.getcheapdrugs.com
The spammers place these links in open wikis, blog comments, forums, and
other link farms, relying upon search engines crawling their links, and
then visiting and indexing the resulting search results pages and included
spammy content.
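
The reflection described above can be sketched in a few lines. This is a
hypothetical Python illustration, not WordPress code: the function name
reflected_heading is an assumption, and the heading text simply follows
the example output quoted above.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def reflected_heading(url: str) -> str:
    """Return the heading a search results page might show for this URL."""
    # parse_qs decodes the query string; "+" and "%20" both become spaces.
    params = parse_qs(urlsplit(url).query)
    term = params.get("s", [""])[0]
    # HTML-escaping blocks markup injection but leaves the spam text readable,
    # which is why this is webspam rather than a classic XSS issue.
    return "Search Results for: " + escape(term)


spam_url = (
    "https://www.foo.edu/?s=Buy%20cheap%20viagra%20without%20prescrition"
    "%20-%20www.getcheapdrugs.com"
)
print(reflected_heading(spam_url))
```

Even with correct escaping, the spammer's text and hostname end up on a
page served from the victim's domain, which is all the attack needs.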
This attack is surprisingly widespread, affecting many websites around
the world. Although other CMSs and sites powered by custom-written code
may also be vulnerable to this technique, preliminary investigation
suggests that, at least in the .edu space, the most targeted web platform
by far is WordPress. To see many examples of U.S. educational websites
targeted by the attack, you can do a Google search for:
site:edu inurl:s "buy"
There are several possible ways to prevent a website from being abused by
this method, but adding the appropriate header or meta tag (see
https://developers.google.com/search/docs/advanced/crawling/block-indexing
) appears to be the most appropriate and effective, especially with
respect to getting the spam URLs removed from search engine indexes.
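
As a rough sketch of the header variant, the logic amounts to "if the
request is a search, send X-Robots-Tag: noindex". The Python below is
illustrative only: robots_headers is a hypothetical helper, and treating
any request that carries an "s" query parameter as a search is an
assumption based on the example URLs in this ticket.

```python
from urllib.parse import parse_qs, urlsplit


def robots_headers(url: str) -> dict:
    """Return extra response headers for the given request URL."""
    # keep_blank_values=True so that an empty "?s=" still counts as a search
    # (an assumption for this sketch).
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "s" in params:
        # Ask crawlers not to index this response.
        return {"X-Robots-Tag": "noindex"}
    return {}


print(robots_headers("https://www.foo.edu/?s=scholarship+programs"))
print(robots_headers("https://www.foo.edu/about/"))
```

A header has the advantage of being applied at the server or proxy layer
without touching the page templates.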
I submitted this problem as a security concern for WordPress on
HackerOne, but it was closed as not a vulnerability. I am therefore
suggesting that core be modified to add the appropriate meta tag to the
head of search results pages, either always or by default (with the
ability to disable it):
<meta name='robots' content='noindex,follow' />
This will indicate to crawlers that the content is not to be indexed and
prevent the site from being abused by web spammers.
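
The "by default, with the ability to disable" behaviour could look
roughly like the following. This is a language-neutral sketch written in
Python, not a proposed core patch: the names search_robots_meta,
is_search_page, and noindex_enabled are all hypothetical.

```python
# The tag proposed in this ticket, emitted only on search results pages.
NOINDEX_META = "<meta name='robots' content='noindex,follow' />"


def search_robots_meta(is_search_page: bool, noindex_enabled: bool = True) -> str:
    """Return the robots meta tag for the page head, or "" if none applies."""
    # On by default; a site owner can opt out by passing noindex_enabled=False.
    if is_search_page and noindex_enabled:
        return NOINDEX_META
    return ""


print(search_robots_meta(is_search_page=True))
print(repr(search_robots_meta(is_search_page=True, noindex_enabled=False)))
```

Defaulting to enabled protects sites whose owners never look at the
setting, while the flag leaves room for sites that want their search
results pages indexed.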
--
Ticket URL: <https://core.trac.wordpress.org/ticket/52457>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform