[wp-trac] [WordPress Trac] #56504: `sanitize_html_class()` is both too restrictive, and too permissive so it may return an invalid class name
WordPress Trac
noreply at wordpress.org
Fri Sep 2 20:47:05 UTC 2022
#56504: `sanitize_html_class()` is both too restrictive, and too permissive so it
may return an invalid class name
--------------------------+-----------------------------
Reporter: anrghg | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version:
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
`sanitize_html_class()` returns an invalid class name when its argument
starts with a digit (0-9), or with a hyphen followed by a digit, neither
of which the CSS spec allows at the start of an identifier
(https://www.w3.org/TR/CSS21/syndata.html#characters). Brave and Chrome do
not support these invalid classes. Since they do not work, they are not
“sane”, and the return value is not actually sanitized.
At the other end, `sanitize_html_class()` needlessly degrades class names
containing or made of non-ASCII Unicode. Characters from no-break space
(U+00A0) and above, including accented letters and emoji, are allowed in
class names (and IDs) and work perfectly, provided of course that they are
not URL-encoded. They may also be backslash-escaped, but it is best to
have them as plain Unicode characters (e.g. in UTF-8), per
https://www.w3.org/International/questions/qa-escapes
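Both failure modes can be reproduced outside WordPress. The sketch below
assumes that the core function boils down to two `preg_replace()` calls
(strip percent-encoded octets, then keep only `A-Z`, `a-z`, `0-9`, `_` and
`-`). The names `core_like_sanitize()` and `starts_like_invalid_css_ident()`
are hypothetical and used for illustration only; the fallback argument and
the filter are left out.

```php
<?php
// Hypothetical stand-in for the core function. Assumes the two regex steps
// described above and omits the fallback argument and the filter.
function core_like_sanitize( $classname ) {
    // Drop percent-encoded octets such as %20.
    $sanitized = preg_replace( '/%[a-fA-F0-9]{2}/', '', $classname );
    // Keep ASCII letters, digits, underscore and hyphen only.
    return preg_replace( '/[^A-Za-z0-9_-]/', '', $sanitized );
}

// True when the string starts with a digit, two hyphens, or a hyphen
// followed by a digit, which CSS 2.1 disallows at the start of an identifier.
function starts_like_invalid_css_ident( $s ) {
    return 1 === preg_match( '/^(?:[0-9]|-[-0-9])/', $s );
}

// Too permissive: the leading digit survives sanitizing.
$digit_first = core_like_sanitize( '123abc' );          // '123abc'
$still_bad   = starts_like_invalid_css_ident( $digit_first ); // true

// Too restrictive: the accented letter is removed.
$accented = core_like_sanitize( 'café' );               // 'caf'
```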
A sanitizing function conforming to the spec and providing a better user
experience could be coded for example like so:
{{{
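function anrghg_sanitize_html_id_class( $p_s_string, $p_s_prefix = '_', $p_b_decode = true ) {
    // Prefix names that start with a digit, or with a hyphen followed by a
    // hyphen or a digit. The anchored pattern is also safe for strings
    // shorter than two characters.
    if ( preg_match( '/^(?:[0-9]|-[-0-9])/', $p_s_string ) ) {
        $p_s_string = $p_s_prefix . $p_s_string;
    }
    if ( $p_b_decode ) {
        // Turn percent-encoded sequences back into plain characters.
        $p_s_string = urldecode( $p_s_string );
    } else {
        // Or drop percent-encoded octets altogether.
        $p_s_string = preg_replace( '/%[0-9A-Fa-f]{2}/', '', $p_s_string );
    }
    // Strip whitespace (unless it follows a two-digit backslash escape) and
    // ASCII punctuation (unless it is backslash-escaped).
    $p_s_string = preg_replace(
        '/((?<!\\\\[0-9A-Fa-f]{2})\s|(?<!\\\\)[%^{}~@`\'"&#$()+[\]|\/*<>=?;:!,.])/',
        '',
        $p_s_string
    );
    return $p_s_string;
}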
}}}
Prepending an underscore probably provides a better UX than escaping the
first digit.
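For comparison, here is a minimal sketch of the two options for a name with
a leading digit. Both function names are hypothetical, and the escape
helper only handles a leading digit (not the hyphen-plus-digit case). It
writes the digit as a CSS hex escape followed by a space, which is the form
a stylesheet selector would need.

```php
<?php
// Option 1: prepend a prefix, so the same string works in markup and CSS.
function prefix_leading_digit( $name, $prefix = '_' ) {
    return preg_match( '/^[0-9]/', $name ) ? $prefix . $name : $name;
}

// Option 2: CSS-escape the first digit for use in a selector.
// The digits 0-9 are U+0030..U+0039, so '1' becomes '\31 ' (the trailing
// space ends the escape sequence).
function css_escape_leading_digit( $name ) {
    if ( ! preg_match( '/^[0-9]/', $name ) ) {
        return $name;
    }
    return '\\3' . $name[0] . ' ' . substr( $name, 1 );
}

echo prefix_leading_digit( '123abc' ), "\n";     // _123abc
echo css_escape_leading_digit( '123abc' ), "\n"; // \31 23abc
```

The escaped form is only meaningful inside a stylesheet: the `class`
attribute would still hold `123abc`, so markup and CSS end up with
different spellings, whereas the prefixed name is identical in both places.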
The `apply_filters()` call is skipped for brevity. The filter can be used
to override the default processing, but the issue is less about
customization than about conformance to the CSS specification.
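As a hedged illustration of such an override, the sketch below hooks the
existing `sanitize_html_class` filter to prefix results that start with a
digit, or with a hyphen followed by a hyphen or a digit. The callback name
is hypothetical, the sketch assumes the filter passes the sanitized string
as its first argument, and the `function_exists()` guard is only there so
that the snippet also loads outside WordPress.

```php
<?php
// Hypothetical callback: prefix results that CSS would reject as identifiers.
function my_prefix_invalid_class_start( $sanitized ) {
    if ( preg_match( '/^(?:[0-9]|-[-0-9])/', $sanitized ) ) {
        return '_' . $sanitized;
    }
    return $sanitized;
}

// Register the callback only when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'sanitize_html_class', 'my_prefix_invalid_class_start' );
}
```

This works around the leading-character problem only; non-ASCII characters
have already been removed by the time the filter runs.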
--
Ticket URL: <https://core.trac.wordpress.org/ticket/56504>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform