[wp-trac] [WordPress Trac] #29557: PHP ≤ 5.4.8 Crashes on '[' Character in Posts
WordPress Trac
noreply at wordpress.org
Tue Oct 7 22:21:10 UTC 2014
#29557: PHP ≤ 5.4.8 Crashes on '[' Character in Posts
-----------------------------------+--------------------
Reporter: MrBobDobolina | Owner:
Type: defect (bug) | Status: new
Priority: highest omg bbq | Milestone: 4.0.1
Component: Formatting | Version: 4.0
Severity: blocker | Resolution:
Keywords: wptexturize has-patch | Focuses:
-----------------------------------+--------------------
Comment (by kitchin):
Replying to [comment:87 kovshenin]:
> * If we're going to match shortcodes, why aren't we using the existing
> cached shortcode regex?
See [#comment62] through [#comment67]. The cached regex matches only
registered shortcodes, and unregistered shortcodes are allowed here in
order not to break Jetpack Contact.
> * "Adding preg_split() inside a loop does not seem like a good direction
> for performance." - why not? How is it different from using preg_match()
> inside a loop?
I don't know for certain; it may be the larger arrays that get created.
`preg_split` behaves more like `preg_match_all`, whereas the heuristic
here is to fetch only the next match.
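As a rough illustration of that difference, here is a small Python analogy (not the PHP from the patch). `re.split` with a capturing group stands in for `preg_split()` with `PREG_SPLIT_DELIM_CAPTURE`, and `re.search` stands in for a single `preg_match()`. The `DELIM` pattern and both function names are made up for this sketch.

```python
import re

# Hypothetical delimiter pattern for illustration only:
# anything shaped like [...] or <...>.
DELIM = re.compile(r'(\[[^\[\]<>]*\]|<[^<>]*>)')

def pieces_via_split(text):
    # One call builds the whole list of pieces, delimiters included.
    return DELIM.split(text)

def next_delimiter(text, offset=0):
    # Finds only the next delimiter at or after `offset`.
    m = DELIM.search(text, offset)
    return (m.start(), m.group(0)) if m else None
```

Inside a loop, the first form allocates a full array on every iteration, while the second produces one match per call and lets the caller stop early.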
> * I like @azaozz's idea of splitting the regex into smaller manageable,
> consistent and possibly reusable chunks
That would require changing the spec. As far as I know, the `[]` and `<>`
parsing must happen in the same pass, as it does now.
I do have a simplified algorithm if you want it:
1. parse for html comments
2. parse for shortcodes
3. parse for html tags
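The three passes above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the PHP from any attached patch: the three regexes are deliberately simple stand-ins, and `PASSES` and `three_pass_split` are hypothetical names. Chunks tagged `text` are the ones that would go on to be texturized.

```python
import re

# Assumed, simplified patterns; the ticket does not show the real ones.
PASSES = [
    ("comment", re.compile(r'<!--.*?-->', re.S)),
    ("shortcode", re.compile(r'\[\[?[^\[\]<>]+\]\]?')),
    ("tag", re.compile(r'<[^<>]+>')),
]

def three_pass_split(text):
    """Return (kind, chunk) pairs after the comment, shortcode and tag passes."""
    segments = [("text", text)]
    for kind, pattern in PASSES:
        result = []
        for seg_kind, chunk in segments:
            if seg_kind != "text":
                # Already claimed by an earlier pass; leave it alone.
                result.append((seg_kind, chunk))
                continue
            pos = 0
            for m in pattern.finditer(chunk):
                if m.start() > pos:
                    result.append(("text", chunk[pos:m.start()]))
                result.append((kind, m.group(0)))
                pos = m.end()
            if pos < len(chunk):
                result.append(("text", chunk[pos:]))
        segments = result
    return segments
```

With these assumed patterns, `three_pass_split('<br [gallery ...] ... />')` claims `[gallery ...]` in the shortcode pass, so the tag pass never sees a complete `<br ... />` and the trailing ` ... />` is left as plain text.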
That algorithm currently fails three tests in trunk:
{{{
Test: '<br [gallery ...] ... />'
Expect: '<br [gallery ...] ... />'
Actual: '<br [gallery ...] … />'
Test: '<br [[gallery ...]] ... />'
Expect: '<br [[gallery ...]] ... />'
Actual: '<br [[gallery ...]] … />'
Test: '[ regex catches this <a href="[quote]">here</a> ]'
Expect: '[ regex catches this <a href="[quote]">here</a> ]'
Actual: '[ regex catches this <a href=”[quote]“>here</a> ]'
}}}
I don't consider those tests important, by the way. But `[]` is common in
PHP links (array-style query parameters), and my patch would break:
{{{
<a href="http://example.com/?action[string]=1">
}}}
A compromise would be to restrict what can go in that `string` to be
considered a possible shortcode.
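One possible shape for such a restriction, sketched in Python. The rule itself is an assumption made for illustration (a name must start with a letter and may be followed by attributes), and `POSSIBLE_SHORTCODE` and `possible_shortcodes` are hypothetical names.

```python
import re

# Hypothetical restriction: the name starts with a letter, continues with
# letters, digits, underscores or hyphens, and may be followed by attributes.
POSSIBLE_SHORTCODE = re.compile(
    r'\[([A-Za-z][A-Za-z0-9_-]*)(?:\s[^\[\]<>]*)?\]'
)

def possible_shortcodes(text):
    """Return the names of bracketed spans that pass the assumed restriction."""
    return [m.group(1) for m in POSSIBLE_SHORTCODE.finditer(text)]
```

Under this assumed rule, query keys such as `ids[0]` or `tags[]` are no longer treated as shortcodes, but a literal `action[string]` still is. A stricter rule would be needed to cover that case.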
There is an active ticket on allowing shortcodes in ''more'' places,
[ticket:24990], which I think my 3-way algorithm would also break.
Other re-arrangements of parsing comments, shortcodes and tags break more
important tests.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/29557#comment:88>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform