[wp-hackers] questioning the efficiency of using custom post types
Marcus Pope
Marcus.Pope at springbox.com
Tue Nov 1 18:10:22 UTC 2011
Agreed Christopher, both on flame wars and use cases for joins. You can't make a blanket statement about joins being slower in all cases as that purely depends on how you slice the data. Row level scans can be horrifically slow. You can improve this by adding indexes on your key table columns but when those indexes become extremely large themselves the benefit is lost. Joins allow you, in many cases, to break those relationships down into more manageable chunks on several levels. So the result is a faster query. But the notion that all data should be relational is a misnomer too and has proven more troublesome (inflexible) for the data set sizes wordpress typically deals with. kvp architectures tend to support far more variations of data structures than problem domain specific relationships, so the result for a generalized community is a better platform. And given the performance of modern databases and price of modern technology it will continue to become even easier as a general rule.
In reality wordpress is still built on those performance realizing joins, just not more than one or two join for this particular case (meta data is joined with posts and possibly taxonomies in most of the queries, just typically not between the data itself that people put in the meta table)
Marcus Pope
On Oct 30, 2011, at 2:06 PM, "Christopher O'Connell" <jwriteclub at gmail.com> wrote:
> I don't want to start a flame war, but, storing the data as key value pairs
> defeats the whole purpose of using a relational database. If you want a K-V
> store, then user a K-V store.
>
> In general, Wordpress' approach works, because any given post meta table is
> relatively short and/or the querying is of only modest complexity. However,
> relational schema and normalization exists for a reason. Yes, they're
> extremely complex and often overkill for a project. If you have large,
> complex data, however, using a well normalized database will result in
> significant speed improvements assuming that you need to perform arbitrary
> queries over the data.
>
> How's this apply to Wordpress? The K-V version of post meta doesn't scale
> past a couple hundred million entries. Once this limit is reached, the fact
> that a non-K-V data-store is being used causes the DB to fall over. In this
> case, storing metadata in something more suited (such as a CouchDB or
> Amazon SimpleDB) works far, far better.
>
> ~ Christopher
>
> P.S. Before anyone takes me to task for this post, I *don't* think that the
> WordPress core should change. As already stated for small to "modestly
> large" amounts of data it works splendidly and does not require developers
> to worry about the integrity or quality of their data model.
>
> On Sun, Oct 30, 2011 at 11:40, Mike Bijon <mike at etchsoftware.com> wrote:
>
>> Haluk,
>> One of the really non-intuitive things about relational DBs like MySQL is
>> that looking up data spread across more tables is less-efficient than fewer
>> tables with lots of rows. It got me at first too, but doing JOINs just
>> isn't fast, and more tables means more For an example of how efficient
>> long tables like WPs are, take a look at Brett Taylor's documentation of
>> FriendFeed's speedup:
>> * http://bret.appspot.com/entry/how-friendfeed-uses-mysql
>>
>> The longer tables also give us a lot of flexibility and prevent errors.
>> It's odd until you get to know MySQL, and the coding to deal with
>> meta-table isn't simple. And, since these things are already solved in
>> WordPress, it's also why we use WP over-and-over and why a lot of good devs
>> contribute to the project.
>>
>>
>> For an example of flexibility:
>> If you had a table of restaurants and later on you wanted to store extra
>> info about ones that served coffee, then... In a multi-table system you
>> could add columns to the restaurants table, or add a new table just for
>> those cafes. In a meta-table system you would add a meta-row and then add
>> the rows to the data table.
>>
>> On a live site the multi-table changes could cause locking and performance
>> issues while they're happening, and they might leave you with either empty
>> columns for non-cafes or a whole other table to join in. On a live site
>> with the meta-tables the update would be the same as any other insert and
>> performance would be almost unchanged.
>>
>> For an example of preventing errors:
>> Let's say you make some changes on your dev server and need to apply them
>> to the live site. On a multi-table system this could be adding columns or
>> adding tables. On a meta-table system this would be adding rows.
>>
>> While the update queries run on the live site, the multi-table system could
>> return no data when queries see columns or tables "missing" because they
>> aren't added yet. The meta-table will return the "old" data, so there may
>> be missing fields, but no errors.
>>
>> In the event someone makes a mistake in the update process (which happens
>> way more than it should, unless you're doing test-migrations and have
>> staging, test, pre-deploy and/or other versions of the site)... The
>> multi-table system will return errors to live users up until the mistake is
>> corrected. The meta-table system just leaves a few fields blank and (as
>> long as your devs are checking whether or nat a value exists) live site
>> visitors should just see some empty fields on the live pages.
>> _______________________________________________
>> wp-hackers mailing list
>> wp-hackers at lists.automattic.com
>> http://lists.automattic.com/mailman/listinfo/wp-hackers
>>
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
More information about the wp-hackers
mailing list