[wp-trac] [WordPress Trac] #27733: wpautop(): \s in regex destroys some UTF-8 characters

WordPress Trac noreply at wordpress.org
Wed Apr 9 09:17:40 UTC 2014


#27733: wpautop(): \s in regex destroys some UTF-8 characters
--------------------------+-----------------------------
 Reporter:  tenpura       |      Owner:
     Type:  defect (bug)  |     Status:  new
 Priority:  normal        |  Milestone:  Awaiting Review
Component:  Formatting    |    Version:
 Severity:  normal        |   Keywords:
  Focuses:                |
--------------------------+-----------------------------
 In wpautop(), \s in a preg_replace() pattern without the /u modifier can match a single byte inside a multibyte UTF-8 character and corrupt that character.

 '''Steps to reproduce:'''
 1. Create a post with
 {{{
 ム
 new line
 }}}
   as its content.

 2. It will be output as
 {{{
 <p>�<br>
 new line</p>
 }}}

 '''Quick Test:'''
 {{{
 $pee = "<p>ム\n";
 $pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee);
 echo $pee; // outputs <p>�<br />\n
 }}}
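
 '''Why the character breaks (sketch):'''
 The output above is consistent with the last byte of the character being consumed as whitespace. The following is a minimal sketch, assuming the file is saved as UTF-8: ム is U+30E0, encoded as the bytes E3 83 A0, and 0xA0 is the non-breaking space in ISO-8859-1. Whether \s matches that byte in byte mode depends on the PCRE build and the active locale, so that check is printed without an expected result.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$char = "ム";

// Show the raw bytes of the character.
echo bin2hex($char), "\n"; // e383a0

// The trailing byte is 0xA0 (non-breaking space in ISO-8859-1).
$last = substr($char, -1);
var_dump($last === "\xA0"); // bool(true)

// Byte mode: the result depends on the PCRE build and locale.
var_dump(preg_match('/\s/', $last));

// UTF-8 mode: the character is treated as one code point, not whitespace.
var_dump(preg_match('/\s/u', $char)); // int(0)
```

 If the byte-mode check prints int(1) on a given system, \s* in the Quick Test pattern can strip the 0xA0 byte and leave an invalid two-byte sequence, which is displayed as �.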

 '''Solution:'''
 Use the explicit character class [\r\n\t ] rather than \s, so that only ASCII whitespace bytes can be matched.
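
 The proposed change can be sketched against the Quick Test input as follows. This is an illustration rather than a patch to wpautop(); the variable names $fixed and $fixed_u are hypothetical, and the /u variant is an alternative that the ticket does not propose.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$pee = "<p>ム\n";

// Explicit ASCII whitespace class: it cannot match the 0xA0 byte of ム.
$fixed = preg_replace('|(?<!<br />)[\r\n\t ]*\n|', "<br />\n", $pee);
var_dump($fixed === "<p>ム<br />\n"); // bool(true)

// Alternative (not from the ticket): keep \s but add the /u modifier.
// Caveat: with /u, preg_replace() returns null if the subject is not valid UTF-8.
$fixed_u = preg_replace('|(?<!<br />)\s*\n|u', "<br />\n", $pee);
var_dump($fixed_u === "<p>ム<br />\n"); // bool(true)
```

 The explicit class behaves the same regardless of locale and does not require the input to be valid UTF-8, which may matter for content of unknown encoding.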

--
Ticket URL: <https://core.trac.wordpress.org/ticket/27733>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

