[wp-trac] [WordPress Trac] #52559: When doing JOINs for meta queries the meta_key should be in the ON clause whenever possible.
WordPress Trac
noreply at wordpress.org
Wed Feb 17 17:17:35 UTC 2021
#52559: When doing JOINs for meta queries the meta_key should be in the ON clause
whenever possible.
--------------------------+-----------------------------
Reporter: herregroen | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version:
Severity: normal | Keywords:
Focuses: performance |
--------------------------+-----------------------------
Currently when doing `JOIN`s for meta queries the meta_key is only
included in the `ON` clause if the meta compare is `NOT EXISTS`. In all
other cases only the `post_id` is included.
This means that the meta key is only filtered after the JOIN is done.
Case in point, a meta_query that with 5 different keys generates the
following query:
{{{#!sql
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
LEFT JOIN wp_postmeta AS mt1 ON ( wp_posts.ID = mt1.post_id )
LEFT JOIN wp_postmeta AS mt2 ON ( wp_posts.ID = mt2.post_id )
LEFT JOIN wp_postmeta AS mt3 ON ( wp_posts.ID = mt3.post_id )
LEFT JOIN wp_postmeta AS mt4 ON (wp_posts.ID = mt4.post_id AND
mt4.meta_key = 'events_time_frame_end' )
WHERE 1=1 AND (
( wp_postmeta.meta_key = 'events_date_till' AND wp_postmeta.meta_value
>= '20210217' )
AND
(
( mt1.meta_key = 'events_date_till' AND mt1.meta_value > '20210217' )
OR
(
( mt2.meta_key = 'events_date_till' AND mt2.meta_value = '20210217'
)
AND
( mt3.meta_key = 'events_time_frame_end' AND mt3.meta_value >=
'14:59:19' )
)
OR
mt4.post_id IS NULL
)
) AND wp_posts.post_type = 'events' AND (wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'acf-disabled' OR wp_posts.post_status = 'dp-
rewrite-republish') GROUP BY wp_posts.ID ORDER BY wp_postmeta.meta_value
ASC LIMIT 0, 10;
}}}
This query takes 60 seconds on very well provisioned database with posts
that each have 50 postmeta rows in a database with 50 million total
postmeta rows.
In contrast the following query:
{{{#!sql
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id AND
wp_postmeta.meta_key = 'events_date_till' )
LEFT JOIN wp_postmeta AS mt1 ON ( wp_posts.ID = mt1.post_id AND
mt1.meta_key = 'events_date_till' )
LEFT JOIN wp_postmeta AS mt2 ON ( wp_posts.ID = mt2.post_id AND
mt2.meta_key = 'events_date_till' )
LEFT JOIN wp_postmeta AS mt3 ON ( wp_posts.ID = mt3.post_id AND
mt3.meta_key = 'events_time_frame_end' )
LEFT JOIN wp_postmeta AS mt4 ON (wp_posts.ID = mt4.post_id AND
mt4.meta_key = 'events_time_frame_end' )
WHERE 1=1 AND (
( wp_postmeta.meta_value >= '20210217' )
AND
(
( mt1.meta_value > '20210217' )
OR
(
( mt2.meta_value = '20210217' )
AND
( mt3.meta_value >= '14:59:19' )
)
OR
mt4.post_id IS NULL
)
) AND wp_posts.post_type = 'events' AND (wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'acf-disabled' OR wp_posts.post_status = 'dp-
rewrite-republish') GROUP BY wp_posts.ID ORDER BY wp_postmeta.meta_value
ASC LIMIT 0, 10;
}}}
Takes 400ms and returns identical results.
I believe the cause here is that because in the first query filtering is
only done after the join and each event has 50 postmeta fields MySQL is
first creating a virtual table with 50^4^ ( 4 because the last JOIN does
include the meta_key ) rows joined resulting in 60.000.000 total that are
then filtered back down.
Adding the meta_key condition to the `ON` clause means we're only joining
1 ( or a small number in the case of the meta_key existing in multiple
rows ) rows for each JOIN meaning the virtual table being created is
significantly smaller thus having much better query performance.
At the very least when there is any LEFT JOIN occurring then ALL
`meta_key` conditions should be moved in the `ON` clause. Although it may
be simpler to always have the `meta_key` condition in the `ON` clause.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/52559>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list