[wp-trac] [WordPress Trac] #32136: strip_invalid_text removes all russian utf8 chars

WordPress Trac noreply at wordpress.org
Sun Jun 7 22:54:19 UTC 2015

#32136: strip_invalid_text removes all russian utf8 chars
 Reporter:  Fahrain       |       Owner:
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  4.2.3
Component:  Database      |     Version:  4.2
 Severity:  normal        |  Resolution:
 Keywords:  needs-patch   |     Focuses:

Comment (by Fahrain):

 I've solved this problem.

 You can reproduce this bug with these steps:
 1. Install WordPress with a database in utf8.
 2. Create a file in windows-1251 encoding and call $wpdb->insert from
 that file with text in this encoding. The strip_invalid_text function
 will strip any text in windows-1251 encoding, because it assumes the
 text is utf-8. Patch 32136 doesn't fix this problem, because it uses
 mb_internal_encoding(), which of course returns utf-8.

 I used
   $wpdb->query("SET CHARACTER_SET_CLIENT='cp1251'");
   $wpdb->query("SET CHARACTER_SET_RESULTS='cp1251'");
 to set the input/output encoding before calling wpdb functions, and this
 worked (the data was written to the db in the correct encoding, utf8), but
 with the new function (strip_invalid_text) this can no longer work.

 I converted all my included files to utf8 and fixed the encoding of the
 input data. This solved the problem.
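
 For anyone hitting the same thing, here is a minimal sketch of the
 input-data part of that fix, assuming the mbstring extension is loaded.
 The helper name to_utf8_from_cp1251 is made up for illustration and is not
 part of WordPress; the sample bytes are the word "Привет" in windows-1251.
 The validity check is only a heuristic: plain ASCII passes through
 unchanged, and a cp1251 string that happens to form valid utf-8 would not
 be converted.

```php
<?php
// Hypothetical helper (not a WordPress function): convert windows-1251
// input to UTF-8 unless the bytes are already valid UTF-8.
function to_utf8_from_cp1251(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8 (or plain ASCII), leave as is
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1251');
}

// "Привет" as raw windows-1251 bytes; this is not valid UTF-8.
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";
$utf8   = to_utf8_from_cp1251($cp1251);

// With WordPress loaded, the converted value would then be passed on, e.g.
// $wpdb->insert($table, array('title' => $utf8));
// ($table and 'title' are placeholders, not names from this ticket.)
```

 With input converted like this there is no need to change
 CHARACTER_SET_CLIENT or CHARACTER_SET_RESULTS on the connection.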

 I think this bug report can be closed.

Ticket URL: <https://core.trac.wordpress.org/ticket/32136#comment:7>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
