[wp-trac] [WordPress Trac] #23307: shortcode_parse_atts may return empty string
WordPress Trac
noreply at wordpress.org
Thu Feb 20 13:08:56 UTC 2014
#23307: shortcode_parse_atts may return empty string
------------------------------------------+------------------
Reporter: GaryJ | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: 3.9
Component: Shortcodes | Version:
Severity: minor | Resolution:
Keywords: needs-unit-tests needs-patch | Focuses:
------------------------------------------+------------------
Comment (by GaryJ):
Replying to [comment:12 TobiasBg]:
> Without having actually tested your patch: What happens for `[foobar ]`
> or `[foobar /]`?
> And do you happen to have an example where the regex would fail for a
> non-empty string? I couldn't find one yet.
From the regex in `get_shortcode_regex()`, both `[foobar ]` and
`[foobar /]` give `$m[3]` as `' '` (string containing a single space).
It's `$m[3]` that is provided as the argument to `shortcode_parse_atts()`.
Here's the (slightly modified - see below) regex in a single string (with
"foobar" added in place of the tag name variable):
`\[(\[?)(foobar)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*(?:\[(?!\/\2\])[^\[]*)*)\[\/\2\])?)(\]?)`
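A minimal sketch for checking the `$m[3]` values with the single-string regex above. The `/` delimiters, the `s` modifier, and the helper name `foobar_attribute_string()` are assumptions made for illustration, not anything taken from core:

```php
<?php
// The single-string regex quoted above, wrapped in "/" delimiters.
// The "s" modifier is an assumption about how the pattern is applied.
$regex = '/\[(\[?)(foobar)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*(?:\[(?!\/\2\])[^\[]*)*)\[\/\2\])?)(\]?)/s';

// Hypothetical helper: returns the raw attribute string ($m[3]),
// or null when the content contains no [foobar] shortcode.
function foobar_attribute_string( $regex, $content ) {
    return preg_match( $regex, $content, $m ) ? $m[3] : null;
}

// Dump the attribute string that would be handed to shortcode_parse_atts().
foreach ( array( '[foobar]', '[foobar ]', '[foobar /]' ) as $shortcode ) {
    echo $shortcode, ' => ';
    var_dump( foobar_attribute_string( $regex, $shortcode ) );
}
```

With this sketch, `[foobar]` should yield an empty string, while `[foobar ]` and `[foobar /]` should both yield a single space.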
----
The `shortcode_parse_atts()` regex allows shortcode attributes of the
form:
* x=y
* x =y (optional space before =)
* x= y (optional space after =)
* x="y" (use of double-quotes, can have optional spaces around =)
* x='y' (use of single quotes, can have optional spaces around =)
* "y" (just attribute value in double quotes)
* y (one or more non-whitespace characters without quotes)
That means that a string of one or more consecutive spaces WILL fail the
regex, which is why there's a trim in the current `else`.
So, `[foobar ]` should act the same as `[foobar]`.
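A rough sketch of how the attribute forms listed above and a whitespace-only string could be checked. The pattern is what I believe `shortcode_parse_atts()` currently uses, so treat it as an assumption and verify it against core; `count_attribute_matches()` is a hypothetical helper, and `ltrim()` stands in for the trim in the `else`:

```php
<?php
// Assumed copy of the attribute pattern in shortcode_parse_atts(); verify against core.
$pattern = '/(\w+)\s*=\s*"([^"]*)"(?:\s|$)|(\w+)\s*=\s*\'([^\']*)\'(?:\s|$)|(\w+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/';

// Hypothetical helper: number of attribute matches found in $text.
function count_attribute_matches( $pattern, $text ) {
    return preg_match_all( $pattern, $text, $match, PREG_SET_ORDER );
}

// The seven forms from the list above, followed by whitespace-only strings.
$samples = array( 'x=y', 'x =y', 'x= y', 'x="y"', "x='y'", '"y"', 'y', ' ', '   ' );
foreach ( $samples as $text ) {
    printf( "%s => %d match(es)\n", var_export( $text, true ), count_attribute_matches( $pattern, $text ) );
}

// When nothing matches, trimming the text (assumed here to be ltrim())
// leaves an empty string rather than an array.
var_dump( ltrim( ' ' ) ); // string(0) ""
```

Under these assumptions, each of the seven forms should report one match and the whitespace-only strings should report none, which is how an empty string ends up being returned.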
One case where the string wouldn't be empty (if I'm reading everything
right) would be something like `[foobar "]` (a single double quote) or
`[foobar '']` (two single quotes). They give the strings `'"'` and `''''`,
which are then run through `stripcslashes()`.
----
As a side question, the Rad Software Regular Expression Designer that I
was testing in reported that the `get_shortcode_regex()` regex had nested
quantifiers and wouldn't run it until they were fixed. Looking at the
original regex, there do seem to be three instances of `*+`, which seems
odd. Can anyone explain that logic?
--
Ticket URL: <https://core.trac.wordpress.org/ticket/23307#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform