[wp-trac] [WordPress Trac] #16282: PHP catchable error with get_term_link and WP3.1RC2

WordPress Trac wp-trac at lists.automattic.com
Mon Jan 24 21:36:44 UTC 2011


#16282: PHP catchable error with get_term_link and WP3.1RC2
--------------------------+-----------------------
 Reporter:  illutic       |       Owner:
     Type:  defect (bug)  |      Status:  reopened
 Priority:  normal        |   Milestone:  3.1
Component:  Multisite     |     Version:  3.1
 Severity:  major         |  Resolution:
 Keywords:  has-patch     |
--------------------------+-----------------------

Comment (by dd32):

 OK, so the root cause here is that in <= 3.0 some slugs were stored
 URL-encoded, unlike 3.1, which currently sanitizes them correctly. (This
 was due to a bug in `sanitize_title()` not recognising all accents. Can we
 just replace any non a-z characters after sanitizing?)

 This results in `sanitize_title_for_query()` actually working for those
 old slugs in 3.1. What it does not allow for is terms added in 3.1 to be
 queried correctly, which is why scribu has added the extra
 `get_term_by('name'..`

 As a result, `get_term_by()` cannot use only `sanitize_title()` or only
 `sanitize_title_for_query()`; it must use both for 100% backwards
 compatibility. attachment:both.16282.2.diff handles the fact that the
 database may contain incorrect data at a high level (i.e. in the calling
 function, if the lookup by slug didn't work, try it by name). It doesn't
 account for the fact that the slug may have been sanitized wrongly, and
 therefore doesn't take care of it in `get_term_by()`.

 We can now ignore `get_term_link()`, as that's not the problem; it is
 merely showing a problem deeper in the Taxonomy API.
 attachment:both.16282.2.diff will allow `get_term_link()` to work, but it
 does not help with the root cause, which could show up in any function
 which uses `get_term_by('slug',..)`.

 The problem is that the input sanitization for `get_term_by()` has
 changed in a backwards-incompatible way. We can use `sanitize_title()`
 for terms added ''in 3.1'', whereas terms added before 3.1 might not be
 found using that (if they include accented characters whose accents were
 not being removed), and `sanitize_title_for_query()` would need to be used
 instead. Once again it's cat and mouse: neither function can find terms
 stored in the other's format.
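
 To make the two stored formats concrete, here is a rough sketch. The two
 helper functions are hypothetical stand-ins, not the real
 `sanitize_title()` / `sanitize_title_for_query()`, and the one-entry
 accent map is an assumption that only covers the example character used
 in this ticket:

```php
<?php
// Hypothetical stand-ins; these are NOT the real WordPress functions.

// 3.0-style: the accent is not recognised, so the character survives and
// ends up percent-encoded (lowercased) in the stored slug.
function sketch_slug_30( $title ) {
    return strtolower( rawurlencode( $title ) );
}

// 3.1-style: the accent is removed first, leaving a plain ASCII slug.
// "\xC8\x99" is the UTF-8 byte sequence of U+0219 (s with comma below).
function sketch_slug_31( $title ) {
    $map = array( "\xC8\x99" => 's' );
    return strtolower( rawurlencode( strtr( $title, $map ) ) );
}

$title    = "\xC8\x99ir";               // the term name from the test case
$old_slug = sketch_slug_30( $title );   // '%c8%99ir'
$new_slug = sketch_slug_31( $title );   // 'sir'

var_dump( $old_slug, $new_slug );
```

 The same input yields two different slugs, so a lookup that sanitizes
 only one way can never match a row that was stored the other way.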

 So, a test case: I created a term 'șir' in 3.0 sanitized format and a
 term 'uș' in 3.1 sanitized format. Test code:
 {{{
 var_dump( get_term_by('slug', 'șir', 'category') );
 var_dump( get_term_by('slug', 'uș', 'category') );
 }}}
 Under 3.1, the first call fails (3.0 format) and the second succeeds
 (3.1 format). Prior to the revert, the first succeeded (3.0 format) and
 the second failed (3.1 format).

 So this raises a question. When a slug is requested, we sanitize the
 input to match what we call a "slug"; in doing so we've allowed
 internationalised slugs to be requested, which aids non-English coders.

 The solutions here are:
  * Require the exact slug format to be passed. This means no accented
 characters, and it means that the input will differ depending on when the
 term was created (i.e. 'sir' in 3.1, or '%c8%99ir' in 3.0)
  * Make terms created in previous versions inaccessible
  * Query for both old and new taxonomy slugs
   * attachment:16282.diff is an example of this, a POC which allows both
 test cases above to pass
   * This results in up to 2 queries per taxonomy request:
    * ASCII-only requests will result in 1 query
    * international characters with a slug created in 3.1 will result in 1
 query if found, else 2
    * international characters with a slug created prior to 3.1 will result
 in 2 queries
  * Store the slug differently somehow, or update previous terms (using the
 alias functionality (term_group) could allow for that, but that's just
 complicating things)
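
 The "query for both" option and its query counts can be sketched roughly
 as follows. This is not the code in attachment:16282.diff; every name
 here is hypothetical, the array stands in for the terms table, and the
 accent map only covers the example character:

```php
<?php
// Hypothetical sketch; none of these names exist in WordPress.

// Stand-in for the terms table: stored slug => term name.
$terms = array(
    '%c8%99ir' => "\xC8\x99ir", // created under 3.0, accent percent-encoded
    'us'       => "u\xC8\x99",  // created under 3.1, accent removed
    'news'     => 'News',       // ASCII-only term
);

// 3.1-style slug: strip the accent (one-entry map, enough for this example).
function compat_new_slug( $input ) {
    $map = array( "\xC8\x99" => 's' );
    return strtolower( rawurlencode( strtr( $input, $map ) ) );
}

// 3.0-style slug: the accent survives and is percent-encoded.
function compat_old_slug( $input ) {
    return strtolower( rawurlencode( $input ) );
}

// Try the new format first, then fall back to the old one.
// $queries reports how many simulated lookups were needed.
function compat_find_term( array $terms, $input, &$queries ) {
    $queries = 1;
    $new     = compat_new_slug( $input );
    if ( isset( $terms[ $new ] ) ) {
        return $terms[ $new ];
    }
    $old = compat_old_slug( $input );
    if ( $old === $new ) {
        return null; // ASCII-only input: there is no second format to try
    }
    $queries = 2;
    return isset( $terms[ $old ] ) ? $terms[ $old ] : null;
}

$q_old = $q_new = $q_ascii = 0;
$found_old   = compat_find_term( $terms, "\xC8\x99ir", $q_old );  // 3.0 term
$found_new   = compat_find_term( $terms, "u\xC8\x99", $q_new );   // 3.1 term
$found_ascii = compat_find_term( $terms, 'missing', $q_ascii );   // no match

var_dump( $q_old, $q_new, $q_ascii );
```

 Trying the 3.1 format first keeps the common cases at one lookup; only
 pre-3.1 terms with international characters, or international misses,
 pay for the second one.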

 We need proper unit tests for taxonomy queries; this would have shown up
 in a proper test case scenario.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/16282#comment:47>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

