[wp-trac] [WordPress Trac] #13590: Inserting a 4-byte UTF-8 character truncates data

WordPress Trac wp-trac at lists.automattic.com
Wed Jan 19 20:21:53 UTC 2011


#13590: Inserting a 4-byte UTF-8 character truncates data
--------------------------+------------------------------
 Reporter:  sardisson     |       Owner:
     Type:  defect (bug)  |      Status:  reopened
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  Database      |     Version:  3.0.4
 Severity:  normal        |  Resolution:
 Keywords:  utf8          |
--------------------------+------------------------------

Comment (by aercolino):

 About the performance hit: I tried a post with 160KB of 1-byte UTF-8
 (ASCII) text and a single 4-byte UTF-8 character (a G clef) near the end.
 It works fine; by my rough estimate the round trip took about one second
 on localhost. I could time it more precisely, but I don't feel the urge.
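
 For anyone who wants to repeat the test, a payload like the one described
 could be built roughly as follows. This is only a sketch:
 build_test_payload() is a hypothetical helper, not part of the patch, and
 the insert / read-back step is left as a placeholder.

```php
<?php
// Hypothetical helper (not part of the patch): ASCII filler with a single
// 4-byte UTF-8 character near the end, $bytes bytes in total.
function build_test_payload(int $bytes = 160 * 1024): string
{
    $g_clef = "\xF0\x9D\x84\x9E"; // U+1D11E MUSICAL SYMBOL G CLEF, 4 bytes in UTF-8
    $tail   = ' end of post';
    $filler = str_repeat('a', $bytes - strlen($g_clef) - strlen($tail));

    return $filler . $g_clef . $tail;
}

$payload = build_test_payload();

$start = microtime(true);
// ... save $payload as post content and read it back here ...
$elapsed = microtime(true) - $start;

printf("round trip: %.3f s\n", $elapsed);
```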

 I had to change the internal workings of my class a bit, because the
 pattern matching I chose in the unescape method was driving PHP crazy
 once the text grew past 16KB. This is why I updated the patch. Thanks to
 westi for suggesting this test.
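
 To illustrate the general idea of escaping and unescaping 4-byte
 characters, here is a minimal sketch. It is not the code from the patch:
 the function names and the &#x...; escape format are assumptions chosen
 for the example. Both patterns are short and fixed-width, which is one
 way to avoid expensive matching on large inputs.

```php
<?php
// Hypothetical helpers for illustration only; the patch may work differently.

/** Replace every 4-byte UTF-8 sequence with a reference such as &#x1D11E;. */
function escape_4byte_utf8(string $text): string
{
    return preg_replace_callback(
        '/[\xF0-\xF4][\x80-\xBF]{3}/', // lead byte of a 4-byte sequence + 3 continuation bytes
        function (array $m): string {
            $b  = array_values(unpack('C4', $m[0]));
            $cp = (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
            return sprintf('&#x%X;', $cp);
        },
        $text
    ) ?? $text; // fall back to the input if PCRE reports an error
}

/** Turn references produced by escape_4byte_utf8() back into UTF-8 bytes. */
function unescape_4byte_utf8(string $text): string
{
    return preg_replace_callback(
        '/&#x([0-9A-F]{5,6});/',
        function (array $m): string {
            $cp = hexdec($m[1]);
            if ($cp < 0x10000 || $cp > 0x10FFFF) {
                return $m[0]; // outside the 4-byte range, leave untouched
            }
            return chr(0xF0 | ($cp >> 18))
                . chr(0x80 | (($cp >> 12) & 0x3F))
                . chr(0x80 | (($cp >> 6) & 0x3F))
                . chr(0x80 | ($cp & 0x3F));
        },
        $text
    ) ?? $text;
}
```

 One limitation of this sketch: a reference such as &#x1D11E; that was
 typed literally in the original text would also be converted on the way
 back, so a real implementation needs an unambiguous escape format.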

 ----
 About the license, I chose the same one used by Zend Framework, the New
 BSD License, which AFAIK is compatible with both GPLv2 and GPLv3: see this
 [http://en.wikipedia.org/wiki/BSD_licenses Wikipedia article] and this
 [http://www.gnu.org/licenses/license-list.html#ModifiedBSD GNU page].

 ----
 About the WP coding standards, I'd leave my class as is, because it's
 intended as a drop-in: it isn't WP-specific, and it's written in my own
 style on purpose. Anyone can easily adapt the code in the full-utf8.php
 file to the WP coding standards. It's just 7 misplaced open braces away,
 I think.

 ----
 About installs that don't use UTF-8 at all, I know that's possible in WP,
 but I cannot understand why, except for historical reasons. Anyway, I
 think it's very simple to adapt the patch so that it only escapes /
 unescapes if the install is UTF-8, something like:
 {{{
 if (DB_CHARSET == 'utf8') ...
 }}}
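
 Spelled out a little more, the guard could look like the sketch below.
 The wrapper name maybe_escape() and its callable parameter are
 hypothetical; $escape stands in for whatever escaping the patch applies.
 The defined() check is an extra precaution in case DB_CHARSET is not set
 in wp-config.php.

```php
<?php
// Hypothetical wrapper for illustration; not part of the patch.
function maybe_escape(string $text, callable $escape): string
{
    // Treat a missing DB_CHARSET constant as "not a UTF-8 install".
    $is_utf8 = defined('DB_CHARSET') && strtolower(DB_CHARSET) === 'utf8';

    return $is_utf8 ? $escape($text) : $text;
}
```

 On a non-UTF-8 install the text passes through unchanged, so the patch
 would add no overhead there beyond the constant check.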

 ----
 Incidentally, during my 160KB test, I discovered
 [http://core.trac.wordpress.org/ticket/16311 this bug].

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/13590#comment:8>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

