[buddypress-trac] [BuddyPress] #4045: Activity Feed queries fail with millions of rows

buddypress-trac at lists.automattic.com buddypress-trac at lists.automattic.com
Wed Mar 21 00:31:57 UTC 2012


#4045: Activity Feed queries fail with millions of rows
-----------------------------------+--------------------
 Reporter:  zacechola              |       Owner:
     Type:  defect (bug)           |      Status:  new
 Priority:  normal                 |   Milestone:  1.6
Component:  Activity               |     Version:  1.5.4
 Severity:  normal                 |  Resolution:
 Keywords:  2nd-opinion has-patch  |
-----------------------------------+--------------------

Comment (by cnorris23):

 I did some testing of different scenarios over the weekend, and here's
 what I found.

 Test database was about 1,060,000 activity rows.

 Test servers
 1) local xampp
 2) standard shared hosting
 3) low-mid range VPS

 * All queries run with SQL_NO_CACHE

 Indexes added the most benefit, especially on the count query. Indexing
 alone cut the count by half across all testing environments. Using `USE
 INDEX (PRIMARY,date_recorded)` rather than just `USE INDEX
 (date_recorded)` produced even better results, although the most benefit
 came on the count query.

 I also tested selecting ids first, running a second query to fill in the
 objects based on those ids. The purpose of this would be to prevent the
 nasty filesort seen by mym6. All my test servers had plenty of memory, so
 this was difficult to test. The shared server offered me no ability to
 change settings. The VPS server was a production server, so I didn't want
 to do much there. I eventually tricked MySQL on my xampp server into
 thinking it was memory starved. I was able to reproduce the filesort, and
 the two query solution fixed this. However, across all environments, this
 was slower than the single query we have now, even with `USE INDEX`. But
 by slower, it was consistently, across all platforms, only .5-.8
 microseconds (.0005-.0008 seconds) slower.

 As much as I'd like to use the SQL_COUNT_FOUND_ROWS/SELECT FOUND_ROWS()
 combo, it produced varying results, the vast majority of them being 2-3
 times slower than the count query we have now.

 @boonebgorges I think making the count query optional, or caching it in a
 transient would be a good solution. While researching, this was a solution
 thrown out multiple times. As you said, the counts don't really matter,
 especially when you reach thousands, or in the case of this ticket,
 millions, of rows.

-- 
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/4045#comment:23>
BuddyPress <http://buddypress.org/>
BuddyPress


More information about the buddypress-trac mailing list