[wp-trac] [WordPress Trac] #53910: `sanitize_title_with_dashes` returns partial encoded values in permalink
WordPress Trac
noreply at wordpress.org
Tue Aug 10 23:33:28 UTC 2021
#53910: `sanitize_title_with_dashes` returns partial encoded values in permalink
-------------------------------------+-------------------------------------
Reporter: costdev | Owner: SergeyBiryukov
Type: defect (bug) | Status: reviewing
Priority: normal | Milestone: 5.9
Component: Permalinks | Version: 5.8
Severity: major | Resolution:
Keywords: has-patch has-unit-tests dev-feedback | Focuses: ui, rtl, administration
-------------------------------------+-------------------------------------
Comment (by costdev):
Replying to [comment:6 audrasjb]:
> Aside, I feel a bit hesitant to move it to the next minor milestone. I
don't think it could break anything, but I'm not sure it could not be
considered as a ''breaking change'' since it changes the behavior of the
URL sanitization…
Absolutely, it ''is'' a ''breaking change'', in certain circumstances.
I've also now realised that this bug doesn't just occur with encoded
substrings like `%e2%80%93`, but with all sanitized substrings, including
`&copy;`, that straddle the 200 character boundary.
----
Before applying the patch, create a post with this title:
> This very long title is to help demonstrate that partial encoded values
remain when you try to use sanitize title with dashes on encoded strings
trimmed to 200 chars instead of using max and strlen&copy;
It will be sanitized to:
> this-very-long-title-is-to-help-demonstrate-that-partial-encoded-values-
remain-when-you-try-to-use-sanitize-title-with-dashes-on-encoded-strings-
trimmed-to-200-chars-instead-of-using-max-and-strlenco
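The trailing `co` can be reproduced outside WordPress with a rough sketch. It assumes the title ends with the literal six-character entity `&copy;` and that truncation to 200 characters happens before entities are removed. The function `sketch_truncate_then_sanitize()` is hypothetical and only mimics the relevant steps; it is not the WordPress implementation.

```php
<?php
// Hypothetical, simplified stand-in for the relevant steps of
// sanitize_title_with_dashes(). Assumption: the cut to $max characters
// happens BEFORE entities and disallowed characters are removed.
function sketch_truncate_then_sanitize( string $title, int $max = 200 ): string {
    $title = strtolower( $title );
    $title = substr( $title, 0, $max );                     // Cut first.
    $title = preg_replace( '/&.+?;/', '', $title );         // Removes complete entities only.
    $title = preg_replace( '/[^%a-z0-9 _-]/', '', $title ); // Drops the orphaned "&".
    $title = preg_replace( '/\s+/', '-', $title );
    return trim( $title, '-' );
}

// 197 characters of text followed by "&copy;", so the cut lands after "&co".
$title = 'This very long title is to help demonstrate that partial encoded values '
    . 'remain when you try to use sanitize title with dashes on encoded strings '
    . 'trimmed to 200 chars instead of using max and strlen&copy;';

$slug = sketch_truncate_then_sanitize( $title );
echo substr( $slug, -8 ), "\n"; // strlenco
```

Because the entity is cut to `&co` before the entity-removal step runs, the pattern `/&.+?;/` no longer matches, and only the `&` is stripped afterwards.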
In your theme:
{{{
<?php
// Compare the stored slug with a freshly sanitized copy of the title.
if ( have_posts() ) : while ( have_posts() ) : the_post();
    if ( get_post_field( 'post_name', get_the_ID() ) ===
        sanitize_title_with_dashes( get_the_title() ) ) {
        echo 'Yes';
    } else {
        echo 'No';
    }
endwhile; endif;
?>
}}}
Prior to the patch, it should output 'Yes'. After the patch, the
previously stored slug will still end with a partial fragment, for example
`%e2`, while the freshly sanitized title will not, resulting in 'No'.
----
This is a very specific edge case: an old post whose title is over 200
characters long and happens to have a sanitized substring that crosses the
200 character boundary. Even so, fixing `sanitize_title_with_dashes()` is
still a ''breaking change''.
Throw into the mix that `sanitize_title_with_dashes()` can be, and is,
used to sanitize other strings that may be stored in the database and used
later in comparative operations, and this is a ''breaking change'' with
some potential reach.
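One way to gauge that reach is to look for stored values that end in an incomplete percent-encoded sequence. The helper below is a minimal sketch under that assumption; `slug_has_partial_encoding()` is a hypothetical name, not a WordPress function, and the sample slugs are made up for illustration.

```php
<?php
// Hypothetical helper: reports whether a stored slug ends in an incomplete
// percent-encoded UTF-8 sequence such as "%e2" or "%e2%8".
function slug_has_partial_encoding( string $slug ): bool {
    // A trailing "%" or "%e" is an octet that was cut part-way through.
    if ( preg_match( '/%[0-9a-f]?$/i', $slug ) === 1 ) {
        return true;
    }
    // Decode the octets; with the /u modifier preg_match() does not
    // return 1 when the subject is invalid UTF-8.
    return preg_match( '//u', rawurldecode( $slug ) ) !== 1;
}

var_dump( slug_has_partial_encoding( 'dashes-%e2%80%93' ) ); // bool(false)
var_dump( slug_has_partial_encoding( 'dashes-%e2' ) );       // bool(true)
var_dump( slug_has_partial_encoding( 'dashes-%e2%8' ) );     // bool(true)
```

This only catches truncated percent-encoded octets. A truncated entity that leaves plain letters behind, such as the trailing `co` above, cannot be told apart from ordinary text this way.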
--
Ticket URL: <https://core.trac.wordpress.org/ticket/53910#comment:8>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform