[wp-trac] [WordPress Trac] #23307: shortcode_parse_atts may return empty string

WordPress Trac noreply at wordpress.org
Thu Feb 20 13:08:56 UTC 2014


#23307: shortcode_parse_atts may return empty string
------------------------------------------+------------------
 Reporter:  GaryJ                         |       Owner:
     Type:  defect (bug)                  |      Status:  new
 Priority:  normal                        |   Milestone:  3.9
Component:  Shortcodes                    |     Version:
 Severity:  minor                         |  Resolution:
 Keywords:  needs-unit-tests needs-patch  |     Focuses:
------------------------------------------+------------------

Comment (by GaryJ):

 Replying to [comment:12 TobiasBg]:

 > Without having actually tested your patch: What happens for `[foobar ]`
 or `[foobar /]`?
 > And do you happen to have an example where the regex would fail for a
 non-empty string? I couldn't find one yet.

 From the regex in `get_shortcode_regex()`, both `[foobar ]` and `[foobar
 /]` give `$m[3]` as `' '` (string containing a single space). It's `$m[3]`
 that is provided as the argument to `shortcode_parse_atts()`.

 Here's the (slightly modified - see below) regex in a single string (with
 "foobar" added in place of the tag name variable):

 `\[(\[?)(foobar)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*(?:\[(?!\/\2\])[^\[]*)*)\[\/\2\])?)(\]?)`
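 To make the `$m[3]` claim easy to check, here is a minimal sketch that runs the pattern above through Python's `re` module. It is not WordPress's PCRE code path; the assumption is that the two engines agree for this particular pattern. The helper name `attribute_text` is made up for illustration.

```python
import re

# The pattern quoted above, with "foobar" hard-coded as the tag name.
# Assumption: Python's re behaves like PCRE for this pattern.
SHORTCODE_PATTERN = re.compile(
    r"\[(\[?)(foobar)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)"
    r"(?:(\/)\]|\](?:([^\[]*(?:\[(?!\/\2\])[^\[]*)*)\[\/\2\])?)(\]?)"
)


def attribute_text(content):
    """Return capture group 3 (the raw attribute string), or None if no match.

    Hypothetical helper: group 3 here plays the role of $m[3] in the PHP code.
    """
    match = SHORTCODE_PATTERN.search(content)
    return match.group(3) if match else None


if __name__ == "__main__":
    for sample in ["[foobar]", "[foobar ]", "[foobar /]", '[foobar x="y"]']:
        print(repr(sample), "->", repr(attribute_text(sample)))
```

 With this sketch, `[foobar]` yields an empty string for group 3, while `[foobar ]` and `[foobar /]` both yield a single space.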

 ----

 The `shortcode_parse_atts()` regex allows shortcode attributes of the
 form:
  * x=y
  * x =y (optional space before =)
  * x= y (optional space after =)
  * x="y" (use of double-quotes, can have optional spaces around =)
  * x='y' (use of single quotes, can have optional spaces around =)
  * "y" (just attribute value in double quotes)
  * y (one or more non-whitespace characters, without quotes)

 That means that a string of one or more consecutive spaces WILL fail the
 regex, which is why there's a trim in the current `else`.

 So, `[foobar ]` should act the same as `[foobar]`.
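 The whitespace-only case can be sketched as follows. The pattern is an approximation rebuilt from the attribute forms listed above, not a copy of the one in core, and it runs on Python's `re` rather than PCRE. The names `ATTR_PATTERN` and `find_attributes` are hypothetical.

```python
import re

# Assumption: one alternative per attribute form in the list above.
# This is a reconstruction for illustration, not the pattern from core.
ATTR_PATTERN = re.compile(
    "|".join(
        [
            r'''(\w+)\s*=\s*"([^"]*)"(?:\s|$)''',   # x="y"
            r"""(\w+)\s*=\s*'([^']*)'(?:\s|$)""",   # x='y'
            r'''(\w+)\s*=\s*([^\s'"]+)(?:\s|$)''',  # x=y, x =y, x= y
            r'''"([^"]*)"(?:\s|$)''',               # "y"
            r"""(\S+)(?:\s|$)""",                   # y
        ]
    )
)


def find_attributes(text):
    """Return each matched attribute chunk, stripped of surrounding whitespace."""
    return [m.group(0).strip() for m in ATTR_PATTERN.finditer(text)]


if __name__ == "__main__":
    for sample in ["", " ", "   ", " x=y z", "x = 'y'"]:
        print(repr(sample), "->", find_attributes(sample))
```

 Every alternative needs at least one non-whitespace character, so an empty string or a run of spaces produces no matches in this sketch.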

 One case where the string wouldn't be empty (if I'm reading everything
 right) would be something like `[foobar "]` (single double quote) or
 `[foobar '']` (two single quotes) - they give strings of `'"'` and `''''`
 respectively, which are then run through `stripcslashes()`.

 ----

 As a side question, the Rad Software Regular Expression Designer that I
 was testing in reported that the `get_shortcode_regex()` regex had nested
 quantifiers and wouldn't run without fixing them. Looking at the original
 regex, there do seem to be three instances of `*+`, which seems odd. Can
 anyone explain that logic?

--
Ticket URL: <https://core.trac.wordpress.org/ticket/23307#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

