[wp-trac] [WordPress Trac] #56531: Aiming to “kill” entities, `sanitize_title_with_dashes()` happens to eat content
WordPress Trac
noreply at wordpress.org
Thu Sep 8 23:17:53 UTC 2022
--------------------------+------------------------------
 Reporter:  anrghg        |       Owner:  (none)
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  Formatting    |     Version:
 Severity:  major         |  Resolution:
 Keywords:                |     Focuses:
--------------------------+------------------------------
Comment (by anrghg):
As I don’t have the resources to submit patches, here is what the
suggested rewrite could look like. It also converts the punctuation
apostrophe to a hyphen, and the okina and letter apostrophe to an
underscore. That is an important enhancement, but it would require a
separate ticket and a dev note:
{{{#!php
<?php
function sanitize_title_with_dashes( $title, $raw_title = '', $context = 'display' ) {
	$title = strip_tags( $title );
	// Unifies `%2B` with a literal plus sign; `urldecode()` turns both into a space.
	$title = str_replace( '%2B', '+', $title );
	// URL-decodes to avoid screwing up percent sign removal.
	$title = urldecode( $title );
	// Removes percent signs.
	$title = str_replace( '%', '', $title );
	// Decodes HTML entities.
	$title = html_entity_decode( $title );
	// Reencodes <, >, &.
	$title = htmlspecialchars( $title, ENT_NOQUOTES );
	// Converts to lowercase, multibyte-aware where possible.
	if ( seems_utf8( $title ) && function_exists( 'mb_strtolower' ) ) {
		$title = mb_strtolower( $title, 'UTF-8' );
	} else {
		$title = strtolower( $title );
	}
	if ( 'save' === $context ) {
		// Converts okina, letter apostrophe to underscore.
		$title = str_replace( array( 'ʻ', 'ʼ' ), '_', $title );
		// Converts punctuation apostrophe to hyphen-minus.
		$title = str_replace( array( '’', '\'' ), '-', $title );
		// Converts spaces and dashes to hyphen-minus.
		$title = preg_replace(
			'/[\p{Zs}\p{Zl}\p{Zp}\x{2010}-\x{2015}\x{2212}]/u',
			'-',
			$title
		);
		// Converts &, @, /, * and dots to hyphen-minus.
		$title = str_replace( array( '&', '@', '/', '*', '·', '‧' ), '-', $title );
		// Converts times to 'x'.
		$title = str_replace( '×', 'x', $title );
		// Removes entirely format controls, punctuation, symbols, modifier letters.
		$title = preg_replace(
			'/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u',
			'',
			$title
		);
	}
	// Converts period to hyphen-minus.
	$title = str_replace( '.', '-', $title );
	// Collapses and trims hyphen-minus.
	$title = preg_replace( '/-+/', '-', $title );
	$title = trim( $title, '-' );
	// Percent-encodes non-ASCII.
	if ( seems_utf8( $title ) ) {
		$title = utf8_uri_encode( $title, 200 );
	}
	// Deletes unsafe ASCII. (No more space.)
	$title = preg_replace( '/[^%a-z0-9_-]/', '', $title );
	return $title;
}
}}}
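
To see what the 'save'-context conversions do on their own, here is a
minimal sketch that runs outside WordPress. `demo_slug_conversions()` is a
hypothetical name, not a core function. It assumes the mbstring extension
and UTF-8 input, and it leaves out the URL/entity decoding and the final
percent-encoding steps of the rewrite above:

```php
<?php
// Hypothetical helper for illustration only; not part of WordPress.
// Assumes UTF-8 input and the mbstring extension.
function demo_slug_conversions( string $title ): string {
	// Lowercases with multibyte awareness.
	$title = mb_strtolower( $title, 'UTF-8' );
	// Okina (U+02BB) and letter apostrophe (U+02BC) become underscores.
	$title = str_replace( array( "\u{02BB}", "\u{02BC}" ), '_', $title );
	// Punctuation apostrophes (U+2019 and ASCII ') become hyphen-minus.
	$title = str_replace( array( "\u{2019}", "'" ), '-', $title );
	// Spaces and dashes become hyphen-minus.
	$title = preg_replace(
		'/[\p{Zs}\p{Zl}\p{Zp}\x{2010}-\x{2015}\x{2212}]/u',
		'-',
		$title
	);
	// Format controls, punctuation, symbols and modifier letters are removed.
	$title = preg_replace(
		'/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u',
		'',
		$title
	);
	// Collapses and trims hyphen-minus.
	$title = preg_replace( '/-+/', '-', $title );
	return trim( $title, '-' );
}

echo demo_slug_conversions( "Hawaiʻi: Don’t Miss It!" ), "\n";
// prints "hawai_i-don-t-miss-it"
```

The underscore survives the removal step because it is connector
punctuation (`\p{Pc}`), which the removal class does not list. The okina
has to be replaced before that step because it is itself a modifier letter
(`\p{Lm}`).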
--
Ticket URL: <https://core.trac.wordpress.org/ticket/56531#comment:5>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform