[wp-trac] [WordPress Trac] #28575: Eliminate redundant preg_match() to improve wptexturize() performance
WordPress Trac
noreply at wordpress.org
Wed Jun 18 14:59:05 UTC 2014
#28575: Eliminate redundant preg_match() to improve wptexturize() performance
-------------------------+-----------------------------
 Reporter:  dllh         |      Owner:
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  Awaiting Review
Component:  General      |    Version:
 Severity:  normal       |   Keywords:
  Focuses:  performance  |
-------------------------+-----------------------------
`wptexturize()` performs very poorly on large posts. While investigating
ways to speed it up, I noticed that when looping over `$textarr` to do the
replacements, we run the `preg_match()` that checks for strings like `9x9`
on every iteration of the loop. That check runs against all of `$text`, not
against the current item, so repeating it for every item is redundant. For
large text, `$textarr` has a lot of items, and we rescan the whole of
`$text` once per item. It seems to me that we should do this check only
once.
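
The shape of the problem can be shown with a minimal sketch. It is written in Python purely for illustration and is not the code from the attached patch: the function names, the two regular expressions, and the `&#215;` replacement are all assumptions chosen to mimic a `9x9`-style check.

```python
import re

# Hypothetical stand-ins; these are not the patterns used by wptexturize().
DIMENSION_HINT = re.compile(r"\dx\d")
DIMENSION = re.compile(r"\b(\d+)x(\d+)\b")


def replace_per_chunk_check(text, chunks):
    """Slow shape: test the whole text again on every iteration."""
    out = []
    for chunk in chunks:
        # This scans `text`, not `chunk`, so the answer is the same each time.
        if DIMENSION_HINT.search(text):
            chunk = DIMENSION.sub(r"\1&#215;\2", chunk)
        out.append(chunk)
    return out


def replace_hoisted_check(text, chunks):
    """Same output, but the whole-text test runs once, before the loop."""
    has_dimension = DIMENSION_HINT.search(text) is not None
    out = []
    for chunk in chunks:
        if has_dimension:
            chunk = DIMENSION.sub(r"\1&#215;\2", chunk)
        out.append(chunk)
    return out
```

Both functions return the same list. The second avoids rescanning the full text once per chunk, which matters most when the text is large and the pattern is absent or appears late.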
I did some rough profiling of the core code against the attached patch.
Methodology:
* Put a chunk of text approaching 1MB in a file. The text I used came
from an actual large post I encountered this problem with.
* Put the wptexturized version of that text in a second file.
* For 10 iterations, call `wptexturize()` on the contents of the original
file.
* Compare the result to the contents of the second file to validate that
my modifications have not changed the way `wptexturize()` actually
processes text.
* Measure the total time (I used the `time` command on my Linux box).
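
The steps above could be sketched roughly as follows. This is a hypothetical Python harness, not the profiling script attached to the ticket: `profile`, its parameters, and the use of `time.perf_counter()` in place of the `time` command are all assumptions.

```python
import time


def profile(func, source, expected, iterations=10):
    """Call func(source) repeatedly, verify each result, return total seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        result = func(source)
        # Guard against a "faster" version that changes the output.
        if result != expected:
            raise AssertionError("output differs from the reference text")
    return time.perf_counter() - start
```

Here `func` stands in for the function under test, `source` for the contents of the first file, and `expected` for the contents of the second. Running it once against the unmodified function and once against the patched one gives two totals to compare.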
With no modifications, core consistently took about 1m9s to run through my
little test. With the attached patch, it consistently took about 6s,
roughly 11 times faster. For just 1 iteration, the difference was more like
8s versus 1.4s, so the saving is appreciable even for a single run of the
function.
Arguably, users shouldn't be making posts of 1MB, but it happens. On pages
like search results that will generate an excerpt and thus fire
`wptexturize()` potentially many times, the performance increase here
stands to be fairly significant provided there are large posts among the
results.
I'm attaching both a proposed patch and the script I used for the rough
profiling. The post I tested isn't mine, so I can't share it, but a
comparable test file should be easy enough to generate.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/28575>
WordPress Trac <https://core.trac.wordpress.org/>