[wp-hackers] Meta tables: Take 5

Fri Aug 7 13:56:44 UTC 2009

On Aug 6, 2009, at 1:12 AM, Mike Schinkel wrote:

> Are you saying that meta is not needed because of the taxonomy  
> system?  If yes, that's ironic because some of what I'm doing lately  
> really has me lamenting that limits of the taxonomy systems.
>
> Meta is used for attributes and values. Name/value pairs in other  
> words.  Taxonomy is used for classification.  But wait; what is  
> classification? Is it not simply attributes with a value of "Yes",  
> "No", or "Unknown?"
>
> Of course it is.  Everything boils down to attributes as name/value  
> pairs where the classification is also an attribute with potential  
> values of Yes/No/Unknown.  I believe even Tim Bray now agrees[1].
>
> So IMO the taxonomy system, while a good start, has a long way to go  
> before it can fully usable to replace metadata.

I think you misread my previous email, and our understanding of the  
purpose and best uses of the taxonomy and post_meta tables is different.

The taxonomy system is well optimized as a mechanism to querying for  
objects that match various criteria. The meta tables are good places  
to store bulk data for objects. The point of my previous email was not  
to suggest the taxonomy tables could or should replace meta, but that  
it would be counterproductive to expect the meta tables to be  
optimized for the types of queries that are better suited to the  
taxonomy system.

For some applications, this may mean that we'll have to store data in  
both the taxonomy and meta systems. Some people will argue that this  
defies the ideals of normalization, but others will point out that we  
should never let ideals get in the way of performance. Structured data  
that you want to display within a post is a good use of postmeta  
(perhaps saved as a serialized array). But if you want to select posts  
based on that data, you'd do well to include it in the taxonomy tables.

> Educate me please. Explain to me how the database engine has already  
> eliminated all but a couple/few rows. Honestly, I hope to find out  
> you have a point but I'm very skeptical.

The following, from update_post_meta_cache() represents the most  
frequent query to postmeta:

SELECT post_id, meta_key, meta_value
FROM $wpdb->postmeta
WHERE post_id IN ($id_list)

If we modify that to work with the proposed general table, it might  
look like this:

SELECT object_id, object_type, meta_key, meta_value
FROM $wpdb->generalmeta
WHERE object_id IN ($id_list)
AND object_type IN ($type_list)

In addition to the primary key for meta_id, the current postmeta table  
has a indexes on both post_id and meta_key. For the general meta  
table, I would suggest using those indexes plus an index on object_type.

When MySQL ( > 4.something when it became possible to use multiple  
indexes for a single query) executes that query, the query parser is  
optimized to use indexes in such a way that it eliminates the most  
rows first. So for most queries it would likely execute the object_id  
portion first, leaving only a few rows remaining to be compared.

--Casey
http://maisonbisson.com/