[wp-meta] [Making WordPress.org] #4138: PROPOSAL: Maintain a blacklist of obviously nefarious traffic sources

Making WordPress.org noreply at wordpress.org
Fri Feb 1 10:45:48 UTC 2019


#4138: PROPOSAL: Maintain a blacklist of obviously nefarious traffic sources
----------------------------+-----------------------
 Reporter:  jonoaldersonwp  |      Owner:  (none)
     Type:  defect          |     Status:  new
 Priority:  normal          |  Milestone:
Component:  General         |   Keywords:  analytics
----------------------------+-----------------------
 We monitor where traffic to wordpress.org comes from, in order to help
 understand our marketing performance and to prioritise strategies/tactics.
 ~40% of our traffic comes in the form of referrals from other sites.

 A large proportion of that traffic is obviously fake and/or nefarious by
 nature: general spam bots or, more maliciously, targeted fake traffic
 designed to confuse, elicit clicks, or obfuscate other behaviours.

 Some of this is easy to spot. For example, between Jan 22nd and 24th:
 - Nearly 8% of our referral traffic (46,000 users) originated from
 [https://menpower.life this **NSFW** site].
 - 2.5% of our referral visits (12,000 users) came from [gamefullpc.net]
 - 1.5% (9,000 users) came from [http://onlinetrafficbot.com]

 Further investigation into the behaviour of these particular visits shows
 significant evidence that they're bots - browsing patterns and metadata
 associated with the visits stick out like a sore thumb.

 These are bad sites, sending fake users, costing us time and money, and
 muddying our understanding of _actual_ user behaviour.

 My original suggestion was going to be that we should filter out visits
 from these blacklisted sources/domains in Google Analytics ("ignore
 traffic with X referring source") - however, upon reflection, perhaps we
 should block these from the site entirely?

 My suggestion, then, is that:

 - We conduct a regular review of referring sources in GA.
 - If they're obviously _bad_, we add them to a list to block them at the
 load balancer level, based on the `referrer`.
 - If it's less obvious what's going on, we consider filtering out in GA on
 a case-by-case basis.
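
 To illustrate the blocking step, here is a minimal sketch of a referrer
 check. It is not a proposal for the actual load balancer config: the
 names `BLOCKED_REFERRER_DOMAINS`, `referrer_host` and
 `is_blocked_referrer` are hypothetical, and the list is seeded only with
 the three domains mentioned above. It assumes the referrer arrives as an
 absolute URL in the HTTP `Referer` header and that subdomains of a listed
 domain should be blocked too.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist, seeded with the domains named in this ticket.
BLOCKED_REFERRER_DOMAINS = {
    "menpower.life",
    "gamefullpc.net",
    "onlinetrafficbot.com",
}


def referrer_host(referer_header):
    """Return the lowercase hostname from a Referer header value, or None."""
    if not referer_header:
        return None
    try:
        host = urlsplit(referer_header.strip()).hostname
    except ValueError:
        # Malformed URL: treat as "no usable referrer".
        return None
    return host.lower().rstrip(".") if host else None


def is_blocked_referrer(referer_header, blocked=BLOCKED_REFERRER_DOMAINS):
    """True if the referrer's host is a blocked domain or a subdomain of one."""
    host = referrer_host(referer_header)
    if host is None:
        # Missing or unparseable referrers are allowed through (an assumption).
        return False
    return any(host == d or host.endswith("." + d) for d in blocked)
```

 Matching on the parsed hostname, rather than a substring of the whole
 header, avoids blocking an unrelated site that merely contains a listed
 name in its path or as part of a longer domain. The same check could be
 reused to decide which visits to exclude from reports for the less
 obvious cases.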

 Thoughts appreciated.

-- 
Ticket URL: <https://meta.trac.wordpress.org/ticket/4138>
Making WordPress.org <https://meta.trac.wordpress.org/>