[wp-trac] [WordPress Trac] #28058: Taxonomies defined with UTF8 encoded names cause notices when adding a new term

WordPress Trac noreply at wordpress.org
Sun May 17 01:06:30 UTC 2015


#28058: Taxonomies defined with UTF8 encoded names cause notices when adding a new
term
--------------------------+-----------------------------
 Reporter:  mikejolley    |       Owner:
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Future Release
Component:  Taxonomy      |     Version:  3.9
 Severity:  normal        |  Resolution:
 Keywords:                |     Focuses:
--------------------------+-----------------------------
Changes (by boonebgorges):

 * milestone:  Awaiting Review => Future Release


Comment:

 To clarify: the issue here is not with *terms*, it's with the taxonomy
 name itself, correct? E.g.:

 {{{
 register_taxonomy( 'pa_資料庫版本', $args );
 }}}

 Looking through the component, it looks like we don't explicitly support
 UTF-8 characters in taxonomy names, though we don't forbid them either; in
 most places, using these characters in taxonomy names will work fine, but
 clearly there are some edge cases where things break. (The same thing is
 almost certainly true of post types.)

 It would be great to clean this up and provide full support for
 taxonomies/post types with non-ASCII characters in their names. This will
 take a pretty thorough review, however. Some things to check:

 - The 'taxonomy' field in the 'wp_term_taxonomy' table is `VARCHAR(32)`,
 which imposes an absolute maximum length on taxonomy names. We throw a
 related `_doing_it_wrong()` notice in `register_taxonomy()` based on
 `strlen()`, which counts bytes rather than characters. This check would
 need to use `mb_strlen()` instead.
 - Non-ASCII characters will be stored differently (or sometimes not at
 all) in databases with different character encodings. This means that a
 taxonomy name that works properly on one WP installation may not work
 properly on another one, just due to the DB charset/collation. This might
 be an education issue for plugin authors; or it might suggest that core
 should be stricter about not allowing certain character types in certain
 fields that are used as keys in plugins/themes.
 - We should take special care testing rewrite issues, as non-ASCII
 characters will be encoded in various places in the context of URLs.
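
 As a rough illustration of the length and URL points above, here is a
 minimal sketch in plain PHP (no WordPress loaded). It assumes the
 `mbstring` extension is available and that the file is saved as UTF-8; the
 variable names are only for illustration and are not taken from core.

```php
<?php
// Example name taken from the register_taxonomy() snippet above.
$taxonomy = 'pa_資料庫版本';

// strlen() counts bytes: each of these CJK characters is 3 bytes in UTF-8.
$bytes = strlen( $taxonomy );

// mb_strlen() counts characters (assumes mbstring is installed).
$chars = mb_strlen( $taxonomy, 'UTF-8' );

// Percent-encoded form, roughly how the name would appear in a URL path.
$encoded = rawurlencode( $taxonomy );

printf( "bytes: %d, chars: %d\n", $bytes, $chars );
printf( "encoded: %s (%d chars)\n", $encoded, strlen( $encoded ) );
```

 For this name the byte count is 18 and the character count is 8, so a
 `strlen()`-based check can reject names that would still fit in the
 column, assuming the column's charset can store them. The encoded form is
 much longer than either, which is worth keeping in mind when testing
 rewrite rules.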

--
Ticket URL: <https://core.trac.wordpress.org/ticket/28058#comment:6>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

