[wp-trac] [WordPress Trac] #28575: Eliminate redundant preg_match() to improve wptexturize() performance

WordPress Trac noreply at wordpress.org
Tue Jun 24 14:38:34 UTC 2014


#28575: Eliminate redundant preg_match() to improve wptexturize() performance
------------------------------+-----------------------------
 Reporter:  dllh              |       Owner:  SergeyBiryukov
     Type:  defect (bug)      |      Status:  closed
 Priority:  normal            |   Milestone:  4.0
Component:  Formatting        |     Version:  3.9
 Severity:  normal            |  Resolution:  fixed
 Keywords:  has-patch commit  |     Focuses:  performance
------------------------------+-----------------------------

Comment (by miqrogroove):

 My performance results:

 Looking strictly at regex performance, patching 3.9.1 for the one
 variable name gives a speed increase of about 10% for a sample post.

 Substituting the $dynamic patterns now used in trunk requires about 50%
 more time.

 Substituting the new HTML/shortcode avoidance logic requires about 4% more
 time.

 Patching trunk for the one variable name requires about 8% less time.

 Adding the patch from #28483 requires another 4% less time.

 Here is a rough breakdown of the time being spent on regex:

 17% - Dashes and spaces (4 patterns)
 8% - '99' and '99" (2 patterns)
 8% - "42" or '42.00' (2 patterns)
 8% - 9" and 9' (2 patterns)
 5% - HTML/shortcode avoidance
 5% - '99
 5% - Single quote at start
 5% - Apostrophe in a word
 5% - Double quote at start
 4% - Any remaining double quotes
 4% - Single quotes followed by spaces
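
 One way a breakdown like this could be produced is to time each group of
 patterns separately and express each as a share of the total. The sketch
 below is a Python illustration only, not the PHP in wptexturize() and not
 the method actually used for the figures above. The pattern groups and the
 time_shares() helper are hypothetical stand-ins.

```python
import re
import time

# Hypothetical stand-in pattern groups; NOT the WordPress $dynamic patterns.
GROUPS = {
    "dashes": [re.compile(r"---"), re.compile(r"--")],
    "primes": [re.compile(r"\d\""), re.compile(r"\d'")],
    "apostrophe in a word": [re.compile(r"(?<=\w)'(?=\w)")],
}


def time_shares(text, groups, repeats=200):
    """Return {label: percent of the total measured regex time}."""
    elapsed = {}
    for label, patterns in groups.items():
        start = time.perf_counter()
        for _ in range(repeats):
            for pattern in patterns:
                pattern.sub("", text)  # replacement text is irrelevant for timing
        elapsed[label] = time.perf_counter() - start
    total = sum(elapsed.values())
    return {label: 100.0 * spent / total for label, spent in elapsed.items()}
```

 Calling time_shares(post_text, GROUPS) on a representative post returns
 percentages that sum to 100. The absolute values depend on the sample text
 and the regex engine, so they are not comparable to the PCRE figures above.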

 Conclusions:

 The difference between running with and without the affected if statement
 altogether is on the order of 1%.  So, the patch for this ticket was only
 slightly better than deleting the bugged code.

 Each of the $dynamic patterns is optimal, using almost exactly the same
 amount of time.  Reducing the number of patterns is paramount for
 improving performance.

 For patterns other than the HTML/shortcode avoidance pattern, about 95% of
 the time consumed is devoted to looping through $textarr and re-executing
 the patterns, rather than running each pattern one time on the entire post.
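
 The two strategies being compared can be sketched as follows. This is a
 Python illustration under stated assumptions, not the WordPress code: the
 RULES list, the tag-splitting regex, and all function names are
 hypothetical, and the split is a simplified stand-in for how $textarr is
 built.

```python
import re
import timeit

# Hypothetical stand-in rules (pattern, replacement); NOT the WordPress patterns.
RULES = [
    (re.compile(r"---"), "\u2014"),        # triple hyphen -> em dash
    (re.compile(r"(\d)\""), "\\1\u2033"),  # 9" -> 9 followed by a double prime
    (re.compile(r"(\d)'"), "\\1\u2032"),   # 9' -> 9 followed by a prime
]

# Simplified tag splitter; the capture group keeps the tags in the result.
TAG_SPLIT = re.compile(r"(<[^>]*>)")


def texturize_per_segment(text):
    """Split on tags, then run every rule on every text segment."""
    parts = TAG_SPLIT.split(text)
    for i, part in enumerate(parts):
        if i % 2 == 1:  # odd indices are the captured tags; leave them alone
            continue
        for pattern, repl in RULES:
            part = pattern.sub(repl, part)
        parts[i] = part
    return "".join(parts)


def texturize_whole(text):
    """Run every rule exactly once over the entire text, tags included."""
    for pattern, repl in RULES:
        text = pattern.sub(repl, text)
    return text


def compare(text, repeats=100):
    """Return (per_segment_seconds, whole_text_seconds) for the given text."""
    per_segment = timeit.timeit(lambda: texturize_per_segment(text), number=repeats)
    whole = timeit.timeit(lambda: texturize_whole(text), number=repeats)
    return per_segment, whole
```

 Note that texturize_whole() also rewrites characters inside tags, so the
 two functions only agree when the tags contain nothing the rules match. A
 single-pass approach would therefore still need some way to protect HTML
 and shortcodes.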

 The calls to _wptexturize_pushpop_element() are not trivial, consuming 7%
 of the total time.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/28575#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

