[wp-trac] [WordPress Trac] #10566: Possibly wrong regex to strip shortcodes

WordPress Trac wp-trac at lists.automattic.com
Fri Aug 7 20:29:54 UTC 2009


#10566: Possibly wrong regex to strip shortcodes
--------------------------+-------------------------------------------------
 Reporter:  jgbustos      |       Owner:                  
     Type:  defect (bug)  |      Status:  new             
 Priority:  normal        |   Milestone:  Unassigned      
Component:  Shortcodes    |     Version:  2.8.3           
 Severity:  normal        |    Keywords:  regex, shortcode
--------------------------+-------------------------------------------------
 In file wp-includes/shortcodes.php, the function get_shortcode_regex() is
 generating the following regex:

 {{{
 return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
 }}}

 The (.?) capture groups at the beginning and the end are meant to match
 possible double brackets [[ and ]], according to the comments above the
 function. If so, they should probably be replaced with (\[)? and (\])?,
 because (.?) matches any single character, not just a bracket. As
 written, stripping shortcodes in strings such as

 {{{
 [caption foo="bar"]data[/caption]Text
 }}}

 causes the initial "T" in "Text" to be removed.
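
 The behaviour can be reproduced outside WordPress. The sketch below uses
 Python's re module as a stand-in for PCRE (the constructs involved behave
 the same way in both), hard-codes "caption" in place of $tagregexp, and
 assumes that stripping means replacing each match with an empty string in
 dot-all mode. The names LEGACY, PROPOSED and strip are illustrative only.

```python
import re

# Pattern from the ticket, with "caption" standing in for $tagregexp.
LEGACY = r'(.?)\[(caption)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'

# Same pattern with the outer groups narrowed to optional brackets.
PROPOSED = r'(\[)?\[(caption)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(\])?'


def strip(pattern, text):
    # Assumption: the whole match is deleted, and "." also matches newlines.
    return re.sub(pattern, '', text, flags=re.S)


sample = '[caption foo="bar"]data[/caption]Text'

legacy_result = strip(LEGACY, sample)      # trailing (.?) swallows the "T"
proposed_result = strip(PROPOSED, sample)  # (\])? matches nothing here

print(repr(legacy_result))    # 'ext'
print(repr(proposed_result))  # 'Text'
```

 With the narrowed groups, only a literal [ or ] adjacent to the shortcode
 can become part of the match, so the surrounding text is left alone.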

 Also, the (.+?) group in the middle is causing problems with blocks of
 text like this:

 {{{
 [shortcode foo="bar"][/shortcode]Text[shortcode foo="bar"][/shortcode]
 }}}

 Since there are no characters between the opening and closing tags, (.+?)
 cannot match the empty content of the first shortcode. It runs on to the
 closing tag of the second shortcode instead, and the complete string is
 removed. In this admittedly fringe case, with two consecutive shortcodes,
 the first of which closes just after opening, we remove too much text.
 The group should probably be (.*?) to allow the possibility of an empty
 string.
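
 A minimal sketch of this second case, again using Python's re module in
 place of PCRE and assuming matches are replaced with an empty string in
 dot-all mode. "shortcode" is hard-coded in place of $tagregexp, and the
 names LEGACY and RELAXED are illustrative. RELAXED applies both changes
 (optional brackets on the outside, an empty-capable group in the middle),
 since changing the middle group alone would still let (.?) consume
 neighbouring characters.

```python
import re

# Pattern from the ticket, with "shortcode" standing in for $tagregexp.
LEGACY = r'(.?)\[(shortcode)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'

# Content group allows an empty string; outer groups only match brackets.
RELAXED = r'(\[)?\[(shortcode)\b(.*?)(?:(\/))?\](?:(.*?)\[\/\2\])?(\])?'

sample = '[shortcode foo="bar"][/shortcode]Text[shortcode foo="bar"][/shortcode]'

# Group 5 is the shortcode content. With (.+?) it has to take at least one
# character, so it skips the first closing tag and stops at the second one.
overrun = re.search(LEGACY, sample, flags=re.S).group(5)

legacy_result = re.sub(LEGACY, '', sample, flags=re.S)
relaxed_result = re.sub(RELAXED, '', sample, flags=re.S)

print(repr(overrun))         # '[/shortcode]Text[shortcode foo="bar"]'
print(repr(legacy_result))   # ''
print(repr(relaxed_result))  # 'Text'
```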

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/10566>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

