[wp-hackers] Taxonomy Schema Proposal

Jamie Talbot wphackers at jamietalbot.com
Wed Apr 18 12:45:27 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Steve Lewis wrote:
> On 4/16/07, Jamie Talbot <wphackers at jamietalbot.com> wrote:
>>
>>
> At this point the API and upgrade patterns suggest that adding columns to
> existing tables is a dangerous thing for plugins to attempt.  I wanted
> to at
> least air that here.  It could be addressed a number of ways, but making
> sure the topic is explored earlier then later seemed wise.

Sure.  I think this column might actually be a valuable addition to the core myself, but will leave
that decision to those with authority.

> term_group bigint(10) NOT NULL default 0
> 
> I would suggest alias_term_id as a column name, but a self-join was my
> first thought.  Using the word 'group' is very often a gateway to pain IME.  That
> word is just too easily overloaded and vague, though the idea of using this
> as a lever to achieve translation is clever I have to admit.  This solution
> is distinct from the category type of hierarchy in that you only need to
> manage a level of depth of 0 and 1.  That means when you add a new alias,
> you only alias to root terms, not to aliases.  That simplifies the
> interface and the data model tremendously.

I think perhaps my original example was a little unclear.  The alias_term_id is not intended to be a
reference to a term_id, it is its own value that only makes sense in that column.  I would normally
agree with you about 'group', In this context, I would say that alias group is conceptually clearer.
 For example

1 Cities	cities		0
2 Products	products 	1
3 Projects	projects	1

The '1' in the third column signifies that they are in the same 'alias group' - it doesn't relate to
the row with a term_id of '1'.  The general pseudocode for adding an alias is:

Is the term I want to make an alias of already in an 'alias group'?
Yes:
	Add my new term with that same 'alias group'.
No:
	Select the next available 'alias group' (MAX() + 1 maybe)
	Update the existing term with the new 'alias group'
	Add my new term with that same 'alias group'.

Too many inverted commas but you get the idea.  There's no depth level at all - you're in an alias
group, or you're not.  Maintaining a 1-level hierarchy implies that the first term is the 'master'.
 What happens if you delete it?  In this scheme, it doesn't matter.  That's my take on it anyway.
You could make the column NULL by default, instead of 0, which simplifies certain operations, but is
slower (NOT NULL is virtually always faster).

>> Lookup by slug is easy.
>
> Notice I slipped a trick into that one.  You don't know if 'products' is a
> tag or a post category or ... based on the URL.  It isn't clear that we
> should care.

That's true, if you ignore the fact that category URLs will be prepended by category/, and tag URLs
by tag/.  Given that those bases exist though, I think we should be safe.

>> Lookup by both is still easy, but a bit more inefficient as you'd have to
>> do an IN(term_ids) based on all the returned tags from the above query.
>  
> Couldn't you just self join, take the 'parent' term_id, and then lookup
> objects based on that term_id?  No need for an IN clause if the 'parent'
> term_id points at itself, as you suggested in your reply. Of course getting
> that inserted id into the alias column is it's own challenge.  :)

There isn't a parent term_id (see above).  You'd do your self join to get all ids of the aliases of
the specified term, then use that in an IN() clause to filter only the posts/links that you wanted
from the term_references table.

Not many people have commented on this thread, despite the large number of people who were vocally
opposed to the existing schema.  Does this mean people approve of this new, suggested direction, or
are there any better solutions out there?

Cheers,

Jamie.

- --
http://jamietalbot.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGJhLnrovxfShShFARAmqEAJ4g1g7s1cVytkHbpzqDJiavj7EQPACfV6lt
lYyrWsKErMOdFIZIzosoay0=
=W+GW
-----END PGP SIGNATURE-----


More information about the wp-hackers mailing list