[wp-hackers] Extending query_posts?

David Chait davebytes at comcast.net
Wed Feb 1 15:14:14 GMT 2006


If you will only ever have say 200-300 posts with the category filtering, 
then you are probably okay -- if you really don't care about the overhead.

Note that I don't know the internals of MySQL, but I'd assume that in order 
to do an ORDER BY RAND(), I'd expect it has to randomize the ENTIRE table. 
Even with a LIMIT in place, I think it still computes the entire result 
table first.  That's why you are actually better off doing something like 
(butchering it here) select post id,post title from posts where category in 
(list of categories) order by post timestamp desc limit 100, then do a 
random routine that picks N items from the results.  Why?

1. select less data whenever possible.  with post id and title, you can make 
a link.

2. select a smaller result set.  category is your filter already, we'll 
stick with that.

3. don't randomize the entire set.  even in PHP, many of the randomization 
approaches will, say, randomize an entire array -- whereas you just want N 
random entries out of the array.  randomizing the entire set is wasteful.

4. overhead will continue to grow.  remember that this thing happens every 
pageload of a single page.  my approach could actually be further optimized 
by caching the results of the query for say 12h or something, and doing only 
the random-pick from the cache (I don't do this internal to CG-SameCat 
currently, but I think my home-grown query caching does it one level higher, 
for some number of hours).

best of luck,
-d

----- Original Message ----- 
From: "Mattias Winther" <mattias at winthernet.se>
To: <wp-hackers at lists.automattic.com>
Sent: Wednesday, February 01, 2006 12:18 AM
Subject: Re: [wp-hackers] Extending query_posts?


|
| Well, my thought is to get completely random from all posts in matching 
categor(y|ies), so I'm somewhere in between. But, since ordering by random 
works without hurting up to about 1000 hits, and considering the amount of 
posts I have, combined with the amount of posts/day I write, I figure I'll 
have a few years before this becomes an actual problem for me personally... 
And I'm guessing that once I start hitting the numbers, I will indeed choose 
to pick between more recent posts.
|
| Since I'm modifying the SQL ordering, from what I understand, the posts 
are first selected from the criteria, and then ordered according to, in my 
case, random order, which should mean that with selective query_posts-usage 
one should be able to use the simple version in a lot of cases, even with 
large number of posts, without suffering any real performance problems. 
Optimizing seems like a great idea, and once I get my new iMac, I'll have 
the hardware to test on too. :) What is the easiest way to populate a test 
instance of the database with, say, 10000 posts?
|
| On Tue, 31 Jan 2006 20:13:53 -0500, "David Chait" <davebytes at comcast.net> 
wrote:
| > If you are trying to do anything like CG-SameCat, the approach I used 
was
| > to
| > not use random in the select, but select posts (well, post ID and title
| > really...) where the category matches, ordered by post ID (or date)
| > descending, and limit of some factor time the number of random items 
asked
| > for... Thus, you want say three random, but recent, related posts -- so
| > grab
| > the last 10 or 20 and then use PHP to pick a random few from the set.
| > Just
| > a thought.
| >
| > Completely random from entire posts table != completely random from 
recent
| > posts in matching category... ;)
| >
| > -d
| >
| > ----- Original Message -----
| > From: "Mattias Winther" <mattias at winthernet.se>
| > To: <wp-hackers at lists.automattic.com>
| > Sent: Tuesday, January 31, 2006 7:12 PM
| > Subject: Re: [wp-hackers] Extending query_posts?
| >
| >
| > |
| > | Thanks, great link!
| > |
| > | Well, as for my own blog, I'm using to generate recent posts in the 
same
| > category, meaning that I'll have quite the long way to go before 1000 
hits
| > is becoming an issue. Then again, I would like to do a nice plug-in that
| > can
| > be trusted to perform at about the same speed with a ton of rows.
| > |
| > | I guess it's time to brush up on my SQL.
| > |
| > | /Mattias
| > |
| > | On Tue, 31 Jan 2006 13:36:45 -0500, "David Chait"
| > <davebytes at comcast.net>
| > wrote:
| > | > Yes, RAND is awful for ORDER BY.
| > | >
| > | > http://jan.kneschke.de/projects/mysql/order-by-rand/
| > | >
| > | > At the end, he shows times for 100, 1000, ... , 1M records.  RAND
| > starts
| > | > to
| > | > 'hurt' even when just 1000 records.  He walks through building an
| > | > alternate
| > | > approach, still in SQL (though using sub-selects, so MySQL 4.1+).
| > | >
| > | > Do you need a completely random entry from the entire data set?  Or
| > would
| > | > a
| > | > random-but-recent entry work? (i.e., select post_id limit 100, pick 
a
| > | > random
| > | > entry or entries, then do the 'full' query with post_id in {list of
| > | > selected
| > | > ids}...)
| > | >
| > | > -d
| > | >
| > | > ----- Original Message -----
| > | > From: "Mattias Winther" <mattias at winthernet.se>
| > | > To: <wp-hackers at lists.automattic.com>
| > | > Sent: Tuesday, January 31, 2006 12:27 PM
| > | > Subject: Re: [wp-hackers] Extending query_posts?
| > | >
| > | >
| > | > |
| > | > | I can't find any specific references to the performance problems 
you
| > are
| > | > referring to... Anyone else out there who recognizes this, 
preferrably
| > | > with
| > | > some kind of link for further analysis? Or, someone with a huge
| > database
| > | > that can run the numbers?
| > | > |
| > | > | /Mattias
| > | > |
| > | > | On Mon, 30 Jan 2006 08:10:03 -0500, Scott johnson
| > <fuzzygroup at gmail.com>
| > | > wrote:
| > | > | > I'd be careful with that.  My understanding of randomization in
| > mysql
| > | > | > is that the performance is awful.  I haven't done it myself but
| > that's
| > | > | > hwat I understand.
| > | > | >
| > | > | > Scott
| > | >
| > | > _______________________________________________
| > | > wp-hackers mailing list
| > | > wp-hackers at lists.automattic.com
| > | > http://lists.automattic.com/mailman/listinfo/wp-hackers
| > |
| > | _______________________________________________
| > | wp-hackers mailing list
| > | wp-hackers at lists.automattic.com
| > | http://lists.automattic.com/mailman/listinfo/wp-hackers
| >
| > _______________________________________________
| > wp-hackers mailing list
| > wp-hackers at lists.automattic.com
| > http://lists.automattic.com/mailman/listinfo/wp-hackers
|
| _______________________________________________
| wp-hackers mailing list
| wp-hackers at lists.automattic.com
| http://lists.automattic.com/mailman/listinfo/wp-hackers
| 



More information about the wp-hackers mailing list