[wp-hackers] Meta tables: Take 5

Mike Schinkel mikeschinkel at newclarity.net
Fri Aug 7 14:59:42 UTC 2009

On Aug 7, 2009, at 9:56 AM, Casey Bisson wrote:
> On Aug 6, 2009, at 1:12 AM, Mike Schinkel wrote:
>> Are you saying that meta is not needed because of the taxonomy  
>> system?  If yes, that's ironic because some of what I'm doing  
>> lately really has me lamenting that limits of the taxonomy systems.
>> Meta is used for attributes and values. Name/value pairs in other  
>> words.  Taxonomy is used for classification.  But wait; what is  
>> classification? Is it not simply attributes with a value of "Yes",  
>> "No", or "Unknown?"
>> Of course it is.  Everything boils down to attributes as name/value  
>> pairs where the classification is also an attribute with potential  
>> values of Yes/No/Unknown.  I believe even Tim Bray now agrees[1].
>> So IMO the taxonomy system, while a good start, has a long way to  
>> go before it can fully usable to replace metadata.
> I think you misread my previous email, and our understanding of the  
> purpose and best uses of the taxonomy and post_meta tables is  
> different.

Forgive me if I misunderstood your purpose, not my intent.

> The taxonomy system is well optimized as a mechanism to querying for  
> objects that match various criteria. The meta tables are good places  
> to store bulk data for objects. The point of my previous email was  
> not to suggest the taxonomy tables could or should replace meta, but  
> that it would be counterproductive to expect the meta tables to be  
> optimized for the types of queries that are better suited to the  
> taxonomy system.

OTOH I think you misunderstood my purpose which is to say that  
separation of taxonomy from attributes becomes arbitrary for any given  
use case and thus would be better if it were unified.  For example, I  
have a website for project management solutions and each solution is  
tagged with attributes like "open source", "has api" etc. However, it  
would be much better if those tags could carry details about why a  
solution is open-source such as what language it's programmed in or  
what type of API it has.

Basically, forcing a distinction between taxonomy and attributes is  
arbitrary and results in important lack of flexibility as a system  
evolves.  I've run into that problem time and again in other systems  
I've worked with including open-source CMS and custom developed CMS.  
What would I like to see in it's place?  Maybe to have taxonomy have  
option meta data.

> For some applications, this may mean that we'll have to store data  
> in both the taxonomy and meta systems. Some people will argue that  
> this defies the ideals of normalization, but others will point out  
> that we should never let ideals get in the way of performance.

I'll let others debate the purity vs. performance. My concern is  
functionality and maintainability.

> Structured data that you want to display within a post is a good use  
> of postmeta (perhaps saved as a serialized array). But if you want  
> to select posts based on that data, you'd do well to include it in  
> the taxonomy tables.

That's a perfect abstract example of where separating taxonomy from  
meta forces someone to choose between two incomplete options.

>> Educate me please. Explain to me how the database engine has  
>> already eliminated all but a couple/few rows. Honestly, I hope to  
>> find out you have a point but I'm very skeptical.
> The following, from update_post_meta_cache() represents the most  
> frequent query to postmeta:
> SELECT post_id, meta_key, meta_value
> FROM $wpdb->postmeta
> WHERE post_id IN ($id_list)
> If we modify that to work with the proposed general table, it might  
> look like this:
> SELECT object_id, object_type, meta_key, meta_value
> FROM $wpdb->generalmeta
> WHERE object_id IN ($id_list)
> AND object_type IN ($type_list)
> In addition to the primary key for meta_id, the current postmeta  
> table has a indexes on both post_id and meta_key. For the general  
> meta table, I would suggest using those indexes plus an index on  
> object_type.
> When MySQL ( > 4.something when it became possible to use multiple  
> indexes for a single query) executes that query, the query parser is  
> optimized to use indexes in such a way that it eliminates the most  
> rows first. So for most queries it would likely execute the  
> object_id portion first, leaving only a few rows remaining to be  
> compared.

Between you and Otto I know understand this, thanks.  BTW, wouldn't  
you want to drop the index on post_id/object_id+meta_key since it will  
be large and won't really be needed?

-Mike Schinkel
WordPress Custom Plugins

More information about the wp-hackers mailing list