[wp-trac] [WordPress Trac] #8351: Poorly optimized queries in
wp_get_archives() and wp_get_calendar()
WordPress Trac
wp-trac at lists.automattic.com
Wed Nov 26 09:14:08 GMT 2008
#8351: Poorly optimized queries in wp_get_archives() and wp_get_calendar()
---------------------------------+------------------------------------------
Reporter: vladimir_kolesnikov | Owner: anonymous
Type: defect | Status: new
Priority: normal | Milestone: 2.7
Component: General | Version:
Severity: normal | Keywords: has-patch
---------------------------------+------------------------------------------
Performance of the queries in is too bad wp_get_archives() and
wp_get_calendar() if you have a lot of records in wp_posts table (so bad
that hosting company suspends the account).
'''1. wp_get_archives():'''
{{{
EXPLAIN SELECT DISTINCT YEAR(post_date) AS `year`, MONTH(post_date) AS
`month`, count(ID) as posts FROM wp_posts WHERE post_type = 'post' AND
post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER
BY post_date DESC
+----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra
|
+----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | wp_posts | ref | type_status_date | type_status_date
| 124 | const,const | 3594 | Using where; Using index; Using
temporary; Using filesort |
+----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+
}}}
Nothing you can do with this query, as grouping by expression cannot use
indices by definition. The only thing is to remove DISTINCT from the
query, as grouping by year and month guarantees that every (year, month)
pair in the result set will be unique. Same for yearly and daily archives.
'''2. wp_get_calendar():'''
Although we can do almost nothing to wp_get_archives() (due to DB design
deficiencies), we can significantly boost wp_get_calendar().
First, wp_get_calendar() checks if we have any posts at all:
{{{
SELECT ID from wp_posts WHERE post_type = 'post' AND post_status =
'publish' ORDER BY post_date DESC LIMIT 1
}}}
Since we only need to check the existence of the record, we don't have any
reason to use ORDER BY (although it does nothing harmful as MySQL is able
to use index).
If the query succeeded, WP starts searching for the posts in past months:
{{{
SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM
wp_posts WHERE post_date < '2008-11-01' AND post_type = 'post' AND
post_status = 'publish' ORDER BY post_date DESC LIMIT 1
}}}
What is wrong: since we need only ONE record (see LIMIT 1), it will be
unique in the result set anyway, DISTINCT is not needed. Here's the
EXPLAIN of the query:
{{{
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 3238 | Using where; Using index; Using
temporary |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
}}}
As you can see, MySQL needs a temporary table (it doesn't matter that it
will have only one row, it will be created). If we remove DISTINCT, we can
get rid of the temporary table:
{{{
EXPLAIN SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM
wp_posts WHERE post_date < '2008-11-01' AND post_type = 'post' AND
post_status = 'publish' ORDER BY post_date DESC LIMIT 1
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 3238 | Using where; Using index |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
}}}
Then WP looks for the posts written in the next months:
{{{
SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM
wp_posts
WHERE post_date > '2008-11-01' AND MONTH( post_date ) != MONTH(
'2008-11-01' )
AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date ASC
LIMIT 1
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 356 | Using where; Using index; Using
temporary |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
}}}
Same as above: DISTINCT is not needed. But we still can rewrite the query
in a better way:
{{{
EXPLAIN SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM
wp_posts WHERE post_date > '2008-11-31 23:59:59' AND post_type = 'post'
AND post_status = 'publish' ORDER BY post_date ASC LIMIT
1;+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 1 | Using where; Using index |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
}}}
We got rid of Using temporary AND we were able to make MySQL scan much
less rows!
Next, WP looks for the days in the month when at least one post was
published:
{{{
SELECT DISTINCT DAYOFMONTH(post_date)
FROM wp_posts WHERE MONTH(post_date) = '11'
AND YEAR(post_date) = '2008'
AND post_type = 'post' AND post_status = 'publish'
AND post_date < '2008-11-24 18:30:04'
}}}
Only range optimization can be used here. But, as I have already
mentioned, MySQL is unable to use indices for expressions. Thus, our range
search will only be limited from the top:
{{{
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 3594 | Using where; Using index; Using
temporary |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
}}}
It is very easy to rewrite the query:
{{{
SELECT DISTINCT DAYOFMONTH(post_date)
FROM wp_posts
WHERE post_date >= '2008-11-01 00:00:00'
AND post_type = 'post' AND post_status = 'publish'
AND post_date < '2008-11-24 18:30:04'
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | wp_posts | range | type_status_date,post_type |
type_status_date | 132 | NULL | 267 | Using where; Using index; Using
temporary |
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
}}}
As you can see, MySQL needed to scan only 267 rows instead 3594, and the
more records you have in your wp_posts table, the more performance is.
Finally, WP fetches the days and post titles:
{{{
SELECT post_title, DAYOFMONTH(post_date) as dom
FROM wp_posts
WHERE YEAR(post_date) = '2008'
AND MONTH(post_date) = '11'
AND post_date < '2008-11-24 18:30:04'
AND post_type = 'post'
AND post_status = 'publish'
}}}
The above optimization can be applied here.
--
Ticket URL: <http://trac.wordpress.org/ticket/8351>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software
More information about the wp-trac
mailing list