[wp-trac] [WordPress Trac] #39153: Bug in wp_html_split with unclosed PHP tag (or HTML tag <)
WordPress Trac
noreply at wordpress.org
Wed Dec 7 17:48:33 UTC 2016
#39153: Bug in wp_html_split with unclosed PHP tag (or HTML tag <)
--------------------------------------+-----------------------------
Reporter: crosp | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Formatting | Version: 4.6.1
Severity: normal | Keywords:
Focuses: administration, template |
--------------------------------------+-----------------------------
The bug surfaces in ''shortcodes.php'', but the actual cause is the
function ''wp_html_split'' in ''formatting.php''.
The bug is described in full in this support forum thread:
https://wordpress.org/support/topic/bug-in-wp_html_split-with-unclosed-php-tag/
Consider the following post content:
{{{
Some amount of useless text <!--more-->
[code-highlight line-numbers="table" linenostart="53" highlight-lines="1,3,8" style="native" lang="html+php" pyg-id="1" ]
<?php
//This callback registers our plug-in
function wpse72394_register_tinymce_plugin($plugin_array) {
$plugin_array['wpse72394_button'] = 'path/to/shortcode.js';
return $plugin_array;
}
//This callback adds our button to the toolbar
function wpse72394_add_tinymce_button($buttons) {
//Add the button ID to the $button array
$buttons[] = "wpse72394_button";
return $buttons;
}
?
[/code-highlight]
Some amount of useless text <strong>checkstyle</strong>
[code-highlight style="native" lang="perl" pyg-id="2" ]
(?:s+)(?:(/*([^*]|[rn]|(*+([^*/]|[rn])))**+/)|(//(?!.*(CHECKSTYLE)).*))
[/code-highlight]
}}}
Here is a dump taken right after this line:
{{{
$textarr = wp_html_split( $content );
var_dump($textarr);
exit;
}}}
{{{
array(25) {
[0]=>
string(0) ""
[1]=>
string(3) "<p>"
[2]=>
string(28) "Some amount of useless text "
[3]=>
string(11) "<!--more-->"
[4]=>
string(0) ""
[5]=>
string(4) "</p>"
[6]=>
string(1) "
"
[7]=>
string(3) "<p>"
[8]=>
string(121) "[code-highlight line-numbers="table" linenostart="53" highlight-lines="1,3,8" style="native" lang="html+php" pyg-id="1" ]"
[9]=>
string(6) "<br />"
[10]=>
string(1) "
"
[11]=>
string(464) "<?php
//This callback registers our plug-in
function wpse72394_register_tinymce_plugin($plugin_array) {
$plugin_array['wpse72394_button'] = 'path/to/shortcode.js';
return $plugin_array;
}
//This callback adds our button to the toolbar
function wpse72394_add_tinymce_button($buttons) {
//Add the button ID to the $button array
$buttons[] = "wpse72394_button";
return $buttons;
}
?
[/code-highlight]
Some amount of useless text <strong>"
[12]=>
string(10) "checkstyle"
[13]=>
string(9) "</strong>"
[14]=>
string(0) ""
[15]=>
string(4) "</p>"
[16]=>
string(56) "
[code-highlight style="native" lang="perl" pyg-id="2" ]"
[17]=>
string(6) "<br />"
[18]=>
string(72) "
(?:s+)(?:(/*([^*]|[rn]|(*+([^*/]|[rn])))**+/)|(//(?!.*(CHECKSTYLE)).*))"
[19]=>
string(6) "<br />"
[20]=>
string(19) "
[/code-highlight]
"
[21]=>
string(3) "<p>"
[22]=>
string(15) "Some Text Again"
[23]=>
string(4) "</p>"
[24]=>
string(1) "
"
}
}}}
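As you can see, the first shortcode was not split out: its closing tag
''[/code-highlight]'' is swallowed into element 11 together with the text
up to ''<strong>'', and that is the problem. If the PHP closing tag (?>)
is present, everything works fine.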
The function that provides the problematic regex:
{{{#!php
<?php
function get_html_split_regex() {
    static $regex;

    if ( ! isset( $regex ) ) {
        $comments =
              '!'           // Start of comment, after the <.
            . '(?:'         // Unroll the loop: Consume everything until --> is found.
            .     '-(?!->)' // Dash not followed by end of comment.
            .     '[^\-]*+' // Consume non-dashes.
            . ')*+'         // Loop possessively.
            . '(?:-->)?';   // End of comment. If not found, match all input.

        $cdata =
              '!\[CDATA\['  // Start of comment, after the <.
            . '[^\]]*+'     // Consume non-].
            . '(?:'         // Unroll the loop: Consume everything until ]]> is found.
            .     '](?!]>)' // One ] not followed by end of comment.
            .     '[^\]]*+' // Consume non-].
            . ')*+'         // Loop possessively.
            . '(?:]]>)?';   // End of comment. If not found, match all input.

        $escaped =
              '(?='         // Is the element escaped?
            .    '!--'
            . '|'
            .    '!\[CDATA\['
            . ')'
            . '(?(?=!-)'    // If yes, which type?
            .     $comments
            . '|'
            .     $cdata
            . ')';

        $regex =
              '/('              // Capture the entire match.
            .     '<'           // Find start of element.
            .     '(?'          // Conditional expression follows.
            .         $escaped  // Find end of escaped element.
            .     '|'           // ... else ...
            .         '[^>]*>?' // Find end of normal element.
            .     ')'
            . ')/';
    }

    return $regex;
}
}}}
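The behavior can be reproduced outside WordPress with a minimal sketch. It assumes that ''wp_html_split'' hands the regex above to preg_split() with PREG_SPLIT_DELIM_CAPTURE, and it keeps only the final "normal element" branch ('<' followed by '[^>]*>?'). The constant SIMPLIFIED_SPLIT_REGEX, the helper demo_html_split() and the sample strings are hypothetical names and data made up for this illustration; they are not part of WordPress.

```php
<?php
// Simplified stand-in for the "normal element" branch of the regex above.
// The comment and CDATA branches are left out (an assumption made for
// brevity; the sample content contains neither).
const SIMPLIFIED_SPLIT_REGEX = '/(<[^>]*>?)/';

// Hypothetical helper, not a WordPress function: split the input into
// alternating text and tag pieces, keeping the matched tags in the result.
function demo_html_split( string $input ): array {
    return preg_split( SIMPLIFIED_SPLIT_REGEX, $input, -1, PREG_SPLIT_DELIM_CAPTURE );
}

// Made-up sample content, much shorter than the post in this ticket.
// The first string has no closing PHP tag, the second one has it.
$unclosed = "[code]\n<?php\necho 1;\n?\n[/code]\ntext <strong>bold</strong>";
$closed   = "[code]\n<?php\necho 1;\n?>\n[/code]\ntext <strong>bold</strong>";

$unclosed_parts = demo_html_split( $unclosed );
$closed_parts   = demo_html_split( $closed );

// Unclosed case: the element that starts at the PHP open tag only ends at
// the next greater-than sign, which belongs to the strong tag. The closing
// shortcode tag is therefore buried inside a tag piece.
var_dump( $unclosed_parts[1] );

// Closed case: the element ends at the closing PHP tag, so the closing
// shortcode tag lands in an ordinary text piece.
var_dump( $closed_parts[2] );
```

In the unclosed case the piece that begins with the PHP open tag also contains "[/code]", which matches element 11 of the dump above: any code that looks for shortcodes only in the text pieces will not find the closing tag.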
Without a doubt, this case should be handled by the regex.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/39153>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform