[wp-trac] [WordPress Trac] #33121: wp_kses_attr_check fails to process html data-* attributes
WordPress Trac
noreply at wordpress.org
Thu Oct 11 22:55:20 UTC 2018
#33121: wp_kses_attr_check fails to process html data-* attributes
--------------------------------------+-----------------------
Reporter: isoftware | Owner: (none)
Type: defect (bug) | Status: reopened
Priority: normal | Milestone: 5.0
Component: Editor | Version: 4.2.3
Severity: major | Resolution:
Keywords: has-patch has-unit-tests | Focuses:
--------------------------------------+-----------------------
Comment (by peterwilsoncc):
tl;dr in [attachment:"33121.4.diff"]:
* Test added to ensure prefix is followed by a hyphen
* Test added to ensure attributes with multiple hyphens are allowed
* Regex altered to require one or more instances of `(-[a-z0-9_]+)`, i.e.
`'/^' . preg_quote( $prefix ) . '(-[a-z0-9_]+)+$/'`
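As an illustration of the pattern above, here is a minimal sketch rather than the code from the patch. The helper name `example_matches_wildcard_prefix()` is hypothetical, lowercasing the name first is assumed from the `$name_low` variable quoted later in this comment, and passing `'/'` as the delimiter to `preg_quote()` is an addition for safety.

```php
<?php
// Hypothetical helper, for illustration only; not part of WordPress core.
function example_matches_wildcard_prefix( $prefix, $name ) {
	$name_low = strtolower( $name );
	// The prefix must be followed by one or more groups of "-" plus [a-z0-9_]+.
	$pattern = '/^' . preg_quote( $prefix, '/' ) . '(-[a-z0-9_]+)+$/';
	return 1 === preg_match( $pattern, $name_low );
}

var_dump( example_matches_wildcard_prefix( 'data', 'data-wp-id' ) );         // bool(true)
var_dump( example_matches_wildcard_prefix( 'data', 'data--invalid' ) );      // bool(false)
var_dump( example_matches_wildcard_prefix( 'data', 'data-also-invalid-' ) ); // bool(false)
var_dump( example_matches_wildcard_prefix( 'data', 'datamining' ) );         // bool(false)
```

Each hyphen has to be followed by at least one allowed character, which is why the double-hyphen and trailing-hyphen names are rejected.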
Replying to [comment:16 azaozz]:
> @peterwilsoncc thanks for adding the test :)
And thanks for the review.
> Looking at `data--invaild="gone"` and `data-also-invaild-="gone"`, it
seems having two hyphens or a hyphen as last char of the data-* attribute
name is valid per
https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/data-*
and https://www.w3.org/TR/REC-xml/#NT-Name. Also seems quite a few chars
are valid there, but still thinking we should only support a-z0-9_-.
This is true, but it does some strange things to the `element.dataset`
property available in JavaScript, so I decided to prevent it. I've created
a bin with an example
https://jsbin.com/muloxeq/edit?html,js,console,output
I'm happy to change this if needs be.
> The TL;DR: don't think allowing wildcard attributes in KSES is a good
thing. It brings us to a pretty dangerous place and at the same time
reduces some of the existing functionality: sanitizing attribute values.
I'm not sure this is the case if we require a hyphen after any prefix: a
developer won't be able to add `href-*` and bypass checking. The regex
change I mention below hardens against this.
It's also possible I'm misunderstanding something, so please let me know
if that's the case.
> That would mean --somebody-- can add `on-*` or even `o-*` and allow all
`onerror`, `onclick`, `onmouseover`, etc. attributes.
This isn't the case, as the regex group `(-[a-z0-9_]+)` requires a hyphen
before any further characters.
As it's important to block this, I've added a test in
[attachment:"33121.4.diff"] to guard against it.
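To make the hyphen requirement concrete, here is a rough sketch (not the test from the patch) that checks the event handler names mentioned above against the `on` and `o` prefixes. The helper name `example_prefix_allows()` is hypothetical, and the `'/'` delimiter passed to `preg_quote()` is an addition.

```php
<?php
// Hypothetical helper, for illustration only; not part of WordPress core.
function example_prefix_allows( $prefix, $name ) {
	$pattern = '/^' . preg_quote( $prefix, '/' ) . '(-[a-z0-9_]+)+$/';
	return 1 === preg_match( $pattern, strtolower( $name ) );
}

foreach ( array( 'on', 'o' ) as $prefix ) {
	foreach ( array( 'onerror', 'onclick', 'onmouseover' ) as $name ) {
		// None of these names has a hyphen directly after the prefix.
		var_dump( example_prefix_allows( $prefix, $name ) ); // bool(false)
	}
}

// Only a name with a hyphen right after the prefix can match.
var_dump( example_prefix_allows( 'on', 'on-click' ) ); // bool(true)
```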
> Also `preg_match( '/^' . preg_quote( $prefix ) . '(-[a-z0-9_]+)*$/',
$name_low )` would mean we don't allow attribute names containing two
hyphens like `data-wp-id` (which is somewhat common).
This is also incorrect; I've added such an example as a test in
[attachment:"33121.4.diff"].
However, the zero-or-more quantifier (`*`) did allow users to add
`data="something"`, so I've changed it to one-or-more (`+`) in the latest
patch.
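The difference between the two quantifiers can be seen in a short sketch. The variable names are illustrative only, and the `'/'` delimiter passed to `preg_quote()` is an addition.

```php
<?php
$prefix = 'data';

// Earlier pattern: the hyphenated group may appear zero times.
$zero_or_more = '/^' . preg_quote( $prefix, '/' ) . '(-[a-z0-9_]+)*$/';
// Latest pattern: the hyphenated group must appear at least once.
$one_or_more = '/^' . preg_quote( $prefix, '/' ) . '(-[a-z0-9_]+)+$/';

// The bare prefix slips through with "*" but not with "+".
var_dump( preg_match( $zero_or_more, 'data' ) ); // int(1)
var_dump( preg_match( $one_or_more, 'data' ) );  // int(0)

// Names with two hyphens match either way.
var_dump( preg_match( $zero_or_more, 'data-wp-id' ) ); // int(1)
var_dump( preg_match( $one_or_more, 'data-wp-id' ) );  // int(1)
```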
--
Ticket URL: <https://core.trac.wordpress.org/ticket/33121#comment:17>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform