[wp-trac] Re: [WordPress Trac] #1955: Outgoing trackback ping could include 'charset' attribute for international trackbacks

WordPress Trac wp-trac at lists.automattic.com
Tue Sep 5 14:34:50 GMT 2006


#1955: Outgoing trackback ping could include 'charset' attribute for international
trackbacks
--------------------------------------------+-------------------------------
 Reporter:  matopc                          |        Owner:  anonymous
     Type:  defect                          |       Status:  new      
 Priority:  normal                          |    Milestone:  2.1      
Component:  Optimization                    |      Version:  2.0      
 Severity:  normal                          |   Resolution:           
 Keywords:  trackback charset bg|has-patch  |  
--------------------------------------------+-------------------------------
Comment (by Tae-young):

 majelbstoat// Unfortunately, we can't make this pluggable: there's no
 do_action(...) call in the trackback function. I made a Korean trackback
 plugin, but it's almost a hack! do_action('trackback_post') is called only
 after the trackback has been inserted into the database (in wp-trackback.php).

 drssay// Here's some UTF-8 validation code:
 {{{
 // Returns true if $string is valid UTF-8 and false otherwise.
 // Validation uses a regular expression; ASCII control characters other
 // than tab, LF and CR are rejected as well.
 function is_utf8($string) {
 return (bool) preg_match('%^(?:
 [\x09\x0A\x0D\x20-\x7E] # ASCII
 | [\xC2-\xDF][\x80-\xBF] # non-overlong 2-byte
 | \xE0[\xA0-\xBF][\x80-\xBF] # excluding overlongs
 | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2} # straight 3-byte
 | \xED[\x80-\x9F][\x80-\xBF] # excluding surrogates
 | \xF0[\x90-\xBF][\x80-\xBF]{2} # planes 1-3
 | [\xF1-\xF3][\x80-\xBF]{3} # planes 4-15
 | \xF4[\x80-\x8F][\x80-\xBF]{2} # plane 16
 )*$%xs', $string);
 }
 }}}
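
 A shorter check is also possible by leaning on PCRE's own UTF-8 validation.
 This is only a sketch of an alternative, not a replacement for the pattern
 above: the name is_utf8_pcre is hypothetical, and unlike that pattern this
 check accepts ASCII control characters. It relies on preg_match() returning
 false (rather than 0) when the /u modifier is set and the subject is not
 valid UTF-8.

```php
<?php
// Hypothetical helper: validates UTF-8 via the /u pattern modifier.
function is_utf8_pcre($string) {
    // The empty pattern matches any valid subject, so the result is 1.
    // For an invalid UTF-8 subject, preg_match() fails and returns false.
    return @preg_match('//u', $string) === 1;
}

// "\xED\x95\x9C" is U+D55C in UTF-8; "\xC7\xD1" is the same syllable in EUC-KR.
var_dump(is_utf8_pcre("\xED\x95\x9C"));
var_dump(is_utf8_pcre("\xC7\xD1"));
```

 The first call should report true and the second false, since 0xD1 is not a
 continuation byte.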

 Here's another one. I referenced glib's UTF-8 validation interface.
 {{{
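 function utf8_validation( $str, &$i ){
     // Returns true if $str is structurally valid UTF-8. On failure, returns
     // false and leaves $i at the offset of the first bad sequence.
     // Overlong forms and surrogates are not rejected.

     $i = 0;
     $len = strlen($str);

     while( $i < $len ){

         $c = ord($str[$i]);

         // 4Byte
         if( ($c & 0xF8) == 0xF0 ){
             if( $i <= ($len-4) &&
                 ((ord($str[$i+1]) & 0xC0) == 0x80) &&
                 ((ord($str[$i+2]) & 0xC0) == 0x80) &&
                 ((ord($str[$i+3]) & 0xC0) == 0x80) )
                     $i += 4;
             else
                 return false;
         }
         // 3Byte
         else if( ($c & 0xF0) == 0xE0 ){
             if( $i <= ($len-3) &&
                 ((ord($str[$i+1]) & 0xC0) == 0x80) &&
                 ((ord($str[$i+2]) & 0xC0) == 0x80) )
                     $i += 3;
             else
                 return false;
         }
         // 2Byte
         else if( ($c & 0xE0) == 0xC0 ){
             if( $i <= ($len-2) &&
                 ((ord($str[$i+1]) & 0xC0) == 0x80) )
                     $i += 2;
             else
                 return false;
         }
         // 1Byte (ASCII)
         else if( $c < 0x80 ){
             $i++;
         }
         // stray continuation byte or invalid lead byte
         else
             return false;

     }

     return true;

 }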
 }}}
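
 One thing worth double-checking in byte tests like these is operator
 precedence: in PHP, == binds tighter than &, so `$b & 0x80 == 0x80` is
 evaluated as `$b & (0x80 == 0x80)`. The sketch below shows the difference;
 both function names are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical: the unparenthesized form. It reduces to $byte & 1, so it
// only tests the lowest bit and says nothing about the high bits.
function loose_check($byte) {
    return (bool) ($byte & 0x80 == 0x80);
}

// Hypothetical: a continuation byte has the bit pattern 10xxxxxx, so mask
// the top two bits and compare inside explicit parentheses.
function is_continuation_byte($byte) {
    return ($byte & 0xC0) == 0x80;
}

var_dump(loose_check(0x41));          // 'A' is ASCII, yet this is true
var_dump(is_continuation_byte(0x41)); // false
var_dump(is_continuation_byte(0x95)); // true
```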

 You can test the second function at the URL below.
 http://mytears.org/resources/mysrc/php/unicode/

-- 
Ticket URL: <http://trac.wordpress.org/ticket/1955>
WordPress Trac <http://wordpress.org/>
WordPress blogging software

