[wp-trac] [WordPress Trac] #2980: Improvements to wptexturize

WordPress Trac wp-trac at lists.automattic.com
Thu Jul 27 20:33:53 GMT 2006


#2980: Improvements to wptexturize
--------------------------+-------------------------------------------------
 Reporter:  ecb29         |       Owner:  anonymous           
     Type:  enhancement   |      Status:  new                 
 Priority:  normal        |   Milestone:  2.0.4               
Component:  Optimization  |     Version:  2.0.3               
 Severity:  normal        |    Keywords:  wptexturize optimize
--------------------------+-------------------------------------------------
 The wptexturize function in functions-formatting.php can be made
 significantly faster with some simple refactoring.  In my measurements,
 the time spent inside wptexturize drops from 24% to 16% of total wp(),
 and the time of the function itself from 600ms to 200ms.  The number of
 preg_replace calls falls from 10,439 to 3,289 (total time 74ms to 36ms),
 and str_replace goes from 6,410 calls taking 54ms to 1,405 calls taking
 29ms.
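
 A plausible reading of the lower call counts is that each group of
 replacements now goes through one array-based call rather than one call
 per pattern.  The sketch below illustrates that idea with a few of the
 static pairs.  The function name texturize_dashes_sketch is hypothetical
 and not part of WordPress; it is an illustration under that assumption,
 not the patch itself.

```php
<?php
// Hypothetical helper for illustration only; not part of WordPress.
function texturize_dashes_sketch($text) {
    // str_replace applies the pairs in array order, each on the result of
    // the previous one, so '---' has to come before '--'.
    $search  = array('---', ' -- ', '--', '...');
    $replace = array('&#8212;', ' &#8212; ', '&#8211;', '&#8230;');

    // One call handles every pair instead of one call per pattern.
    return str_replace($search, $replace, $text);
}

echo texturize_dashes_sketch('Wait---what? Pages 3--5...'), "\n";
```

 If '--' were listed before '---', the triple dash would be consumed as an
 en dash plus a stray hyphen, so the order of the arrays is part of the
 behaviour.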

 {{{
 function wptexturize($text) {
         $next = true;
         $output = '';
         $curl = '';
         $textarr = preg_split('/(<.*>)/Us', $text, -1,
 PREG_SPLIT_DELIM_CAPTURE);
         $stop = count($textarr);

         for($i = 0; $i < $stop; $i++){
                 $curl = $textarr[$i];

         if (isset($curl[0]) && '<' != $curl[0] && $next) { // If it's not a tag
                 // static strings
                 $static_characters = array('---', ' -- ', '--', 'xn&#8211;',
                         '...', '``', '\'tain\'t', '\'twere', '\'twas', '\'tis',
                         '\'twill', '\'til', '\'bout', '\'nuff', '\'round',
                         '\'cause', '\'s', '\'\'', ' (tm)');
                 $static_replacements = array('&#8212;', ' &#8212; ', '&#8211;',
                         'xn--', '&#8230;', '&#8220;', '&#8217;tain&#8217;t',
                         '&#8217;twere', '&#8217;twas', '&#8217;tis',
                         '&#8217;twill', '&#8217;til', '&#8217;bout',
                         '&#8217;nuff', '&#8217;round', '&#8217;cause',
                         '&#8217;s', '&#8221;', ' &#8482;');
                 $curl = str_replace($static_characters, $static_replacements,
                         $curl);

                 // regular expressions
                 $dynamic_characters = array('/\'(\d\d(?:&#8217;|\')?s)/',
 '/(\s|\A|")\'/', '/(\d+)"/', '/(\d+)\'/', '/(\S)\'([^\'\s])/',
 '/(\s|\A)"(?!\s)/', '/"(\s|\S|\Z)/', '/\'([\s.]|\Z)/', '/(\d+)x(\d+)/');
                 $dynamic_replacements = array('&#8217;$1','$1&#8216;',
 '$1&#8243;', '$1&#8242;', '$1&#8217;$2', '$1&#8220;$2', '&#8221;$1',
 '&#8217;$1', '$1&#215;$2');
                 $curl = preg_replace($dynamic_characters,
 $dynamic_replacements, $curl);
         } elseif (strstr($curl, '<code') || strstr($curl, '<pre') ||
                 strstr($curl, '<kbd') || strstr($curl, '<style') ||
                 strstr($curl, '<script')) {
                 // strstr is fast
                 $next = false;
         } else {
                 $next = true;
         }

         $curl = preg_replace('/&([^#])(?![a-zA-Z1-4]{1,8};)/', '&#038;$1',
 $curl);
         $output .= $curl;
         }

         return $output;
 }
 }}}
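
 To make the control flow easier to follow, here is a minimal sketch of
 just the splitting and skipping logic.  The helper name
 texturizable_chunks_sketch is hypothetical, and only two of the protected
 tags are checked to keep it short.  It returns the chunks that would
 reach the replacement step.

```php
<?php
// Hypothetical helper for illustration only; not part of WordPress.
function texturizable_chunks_sketch($text) {
    // Split into alternating text and tag chunks; the capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result.
    $chunks = preg_split('/(<.*>)/Us', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $next   = true;    // whether the following text chunk may be texturized
    $picked = array();

    foreach ($chunks as $chunk) {
        if (isset($chunk[0]) && '<' != $chunk[0] && $next) {
            // Plain text outside a protected tag.
            $picked[] = $chunk;
        } elseif (strstr($chunk, '<code') || strstr($chunk, '<pre')) {
            // Opening protected tag: leave the next text chunk alone.
            $next = false;
        } else {
            $next = true;
        }
    }

    return $picked;
}

print_r(texturizable_chunks_sketch('a "b" <code>x "y"</code> c'));
```

 For this input only the text before and after the code element is
 selected; the quoted text inside it is left untouched.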

 I'll try and attach a patch...

-- 
Ticket URL: <http://trac.wordpress.org/ticket/2980>
WordPress Trac <http://wordpress.org/>
WordPress blogging software

