[wp-trac] [WordPress Trac] #16282: PHP catchable error with get_term_link and WP3.1RC2
WordPress Trac
wp-trac at lists.automattic.com
Mon Jan 24 21:36:44 UTC 2011
#16282: PHP catchable error with get_term_link and WP3.1RC2
--------------------------+-----------------------
Reporter: illutic | Owner:
Type: defect (bug) | Status: reopened
Priority: normal | Milestone: 3.1
Component: Multisite | Version: 3.1
Severity: major | Resolution:
Keywords: has-patch |
--------------------------+-----------------------
Comment (by dd32):
Ok, so the root cause here is that in <= 3.0 some slugs have been
stored URL-encoded, unlike 3.1, which is currently applying it correctly
(due to a bug in sanitize_title not recognising all accents - can we just
replace any non a-z characters already after sanitizing?).
This results in `sanitize_title_for_query()` actually working for those
old slugs in 3.1. What it does not allow for is terms added in 3.1 to be
queried correctly, which is why scribu has added the extra
`get_term_by('name'..`
As a result, `get_term_by()` -cannot- use either `sanitize_title()` OR
`sanitize_title_for_query()` alone; it must use both for 100% backwards
compatibility. attachment:both.16282.2.diff handles the fact that the
database may contain incorrect data at a high level (i.e. in the calling
function: if the lookup by slug didn't work, try it by name). It doesn't
account for the fact that the slug may have been sanitized wrong, and
therefore doesn't take care of it in get_term_by().
We can now ignore `get_term_link()` as that's not the problem; it is merely
showing a problem deeper in the Taxonomy API.
attachment:both.16282.2.diff will allow get_term_link to work, but it does
not help with the root cause, which could show up in any function which
uses `get_term_by('slug',..)`.
The problem is that the input sanitization for `get_term_by()` has
changed in a backwards-incompatible way. We can use `sanitize_title()`
for terms added ''in 3.1'', whereas terms added before 3.1 might not be
found using that (if they include accented characters which were not
being removed), and `sanitize_title_for_query()` would need to be used
instead. But once again, it's cat and mouse, as the latter cannot query
the former, and the former cannot query the latter.
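To make the mismatch concrete, here is a minimal sketch of the two stored formats. The helpers `sketch_slug_30()` and `sketch_slug_31()` are hypothetical stand-ins written for this illustration; they are not the WordPress implementations of `sanitize_title()` or `sanitize_title_for_query()`, and the accent map only covers the one character used in this ticket.

```php
<?php
// Hypothetical stand-ins for illustration only; NOT WordPress code.

// Models a slug as stored by <= 3.0 when the accent was not recognised:
// the character is left in place and ends up percent-encoded.
function sketch_slug_30(string $title): string {
    return strtolower(rawurlencode($title));
}

// Models a slug as stored by 3.1: the accent is stripped before encoding.
// The map is an assumption covering only U+0219 (s with comma below).
function sketch_slug_31(string $title): string {
    $map = ["\u{0219}" => 's'];
    return strtolower(rawurlencode(strtr($title, $map)));
}

$title = "\u{0219}ir"; // the term name used in the test case

echo sketch_slug_30($title), "\n"; // %c8%99ir
echo sketch_slug_31($title), "\n"; // sir
```

The same input yields two different strings, so a lookup that sanitizes the input one way can never match a row stored the other way.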
So, test case: created a term 'șir' in 3.0 sanitized format, and created a
term 'uș' in 3.1 sanitized format. Test code:
{{{
var_dump( get_term_by('slug', 'șir', 'category') );
var_dump( get_term_by('slug', 'uș', 'category') );
}}}
Under 3.1, that causes the first (3.0 format) to fail and the second
(3.1 format) to succeed. Prior to the revert, it caused the first
(3.0 format) to succeed and the second (3.1 format) to fail.
So this brings up the question: if a slug is being requested, we sanitize
the input to match what we call a "slug"; in doing so we've allowed
internationalised slugs to be requested, which aids non-English coders.
The solutions here are:
* Require the exact slug format to be passed. This means no accented
  characters, and it means that the input will differ depending on when
  the term was created (i.e. 'sir' in 3.1, or '%c8%99ir' in 3.0)
* Make terms created in previous versions inaccessible
* Query for both old and new taxonomy slugs
  * attachment:16282.diff as an example of this, a POC which allows both
    test cases above to pass
  * This results in up to 2 queries per taxonomy request:
    * ASCII-only requests will result in 1 query
    * international characters with a slug created in 3.1 will result in
      1 query if found, else 2
    * international characters with a slug created prior to 3.1 will
      result in 2 queries
* Store the slug somehow differently, or update previous terms (using the
  alias functionality (term_group) could allow for that, but that's just
  complicating things)
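The query counts for the "query both" option can be sketched as follows. This is a rough model only, not the code in attachment:16282.diff: the array stands in for the terms table, and `sketch_new_slug()`, `sketch_old_slug()` and `sketch_find_term()` are hypothetical names invented for this example.

```php
<?php
// Hypothetical model of the "query both formats" option; NOT WordPress code.

// 3.1-style slug: strip the one accent this sketch knows about, then encode.
function sketch_new_slug(string $input): string {
    return strtolower(rawurlencode(strtr($input, ["\u{0219}" => 's'])));
}

// 3.0-style slug for unrecognised accents: percent-encode the raw character.
function sketch_old_slug(string $input): string {
    return strtolower(rawurlencode($input));
}

// Returns [matched slug or null, number of lookups performed].
function sketch_find_term(array $stored_slugs, string $input): array {
    $new = sketch_new_slug($input);
    $queries = 1;
    if (in_array($new, $stored_slugs, true)) {
        return [$new, $queries];
    }
    $old = sketch_old_slug($input);
    if ($old === $new) {
        // ASCII-only input: both formats agree, so skip the second lookup.
        return [null, $queries];
    }
    $queries++;
    return [in_array($old, $stored_slugs, true) ? $old : null, $queries];
}

// One term stored in 3.0 format, one in 3.1 format, one plain ASCII.
$stored = ['%c8%99ir', 'us', 'news'];

var_dump(sketch_find_term($stored, "\u{0219}ir")); // found on the second lookup
var_dump(sketch_find_term($stored, "u\u{0219}"));  // found on the first lookup
var_dump(sketch_find_term($stored, 'news'));       // ASCII: one lookup
```

Trying the 3.1 format first keeps the common case at one lookup; only terms that predate 3.1 (or genuinely missing international slugs) pay for the second one.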
We need proper unit tests for taxonomy queries; this would've shown up in
a proper test case scenario.
--
Ticket URL: <http://core.trac.wordpress.org/ticket/16282#comment:47>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software