[wp-trac] [WordPress Trac] #48106: Revisit post GUID sanitization on `&`
WordPress Trac
noreply at wordpress.org
Mon Sep 23 08:26:37 UTC 2019
#48106: Revisit post GUID sanitization on `&`
--------------------------+-----------------------------
Reporter: zzxiang | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Post Formats | Version: 5.2.3
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
=== The core source code that needs to be revisited
When a new post/attachment is inserted into the database, the post GUID is
sanitized, and the `&` character in the GUID is converted to `&amp;`.
More specifically, in `wp-includes/default-filters.php`, function
`wp_filter_kses` is added as a default `pre_post_guid` filter.
{{{#!php
// Save URL
foreach ( array( 'pre_comment_author_url', 'pre_user_url', 'pre_link_url', 'pre_link_image',
	'pre_link_rss', 'pre_post_guid' ) as $filter ) {
	add_filter( $filter, 'wp_strip_all_tags' );
	add_filter( $filter, 'esc_url_raw' );
	add_filter( $filter, 'wp_filter_kses' );
}
}}}
Before a post GUID is saved, the function `wp_filter_kses` in
`wp-includes/kses.php` is called, and eventually the function
`wp_kses_normalize_entities` performs the actual conversion, so that `&` is
converted to `&amp;`.
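
The effect of that conversion on a URL can be imitated in plain PHP. The sketch below is a simplified stand-in, not the core implementation: the function name `sketch_normalize_ampersands` and its short entity allowlist are assumptions made for illustration. It escapes every `&` and then restores only well-formed named entities, which is why a query-string separator such as `&format` ends up as `&amp;format`.

```php
<?php
// Hypothetical, simplified stand-in for the ampersand handling that happens
// during kses entity normalization. This is NOT the WordPress core code.
function sketch_normalize_ampersands( string $value ): string {
    // Step 1: escape every ampersand.
    $escaped = str_replace( '&', '&amp;', $value );

    // Step 2: restore a few well-formed named entities (illustrative allowlist).
    $allowed = array( 'amp', 'lt', 'gt', 'quot' );

    return preg_replace_callback(
        '/&amp;([A-Za-z]{2,8});/',
        function ( array $m ) use ( $allowed ): string {
            // Keep the escaped form unless the name is an allowed entity.
            return in_array( $m[1], $allowed, true ) ? '&' . $m[1] . ';' : $m[0];
        },
        $escaped
    );
}

$url = 'https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png';
echo sketch_normalize_ampersands( $url ), PHP_EOL;
```

Because `&format=png` is not followed by a semicolon, it is never treated as an entity, so the separator stays escaped in the stored value.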
=== The problem it causes
The plugin External Media without Import
(https://wordpress.org/plugins/external-media-without-import/) inserts
external image URLs as post GUIDs into the database so that users can add
external images to their media libraries without actually uploading the
image files to their WordPress servers. If the image URL contains `&`,
such as
https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png
it is converted to
https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&amp;format=png
The result is that the image is not correctly displayed in some places,
such as the media library page of the admin dashboard.
Other plugins, such as Imposer (https://github.com/dirtsimple/imposer/)
and Postmark (https://github.com/dirtsimple/postmark/), have run into the
same issue. Imposer works around it by saving post GUIDs a second time
with the unsanitized value. I think that is equivalent to removing
`wp_filter_kses` from the default `pre_post_guid` filters.
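
As a rough illustration of what removing `wp_filter_kses` from the `pre_post_guid` filters could look like in a plugin, here is a minimal sketch. The helper name `sketch_disable_guid_kses` is hypothetical; `remove_filter()` is the standard WordPress function for unregistering a hook callback, and priority 10 is assumed because the core snippet above registers the filter without an explicit priority. This is not how Imposer implements its workaround.

```php
<?php
/**
 * Hypothetical helper (the name is illustrative, not from core or a plugin).
 * Unregisters the default kses pass on GUID saving so `&` is stored as-is.
 * Returns true if the filter was removed, false otherwise.
 */
function sketch_disable_guid_kses(): bool {
    // Outside WordPress (for example a plain CLI run) there is nothing to remove.
    if ( ! function_exists( 'remove_filter' ) ) {
        return false;
    }

    // Assumption: the default filter was added at the default priority of 10.
    return remove_filter( 'pre_post_guid', 'wp_filter_kses', 10 );
}

sketch_disable_guid_kses();
```

The other two default callbacks, `wp_strip_all_tags` and `esc_url_raw`, stay registered. Since the original reason for the kses pass is unknown, a change like this affects every post GUID and should be treated as an experiment rather than a recommended fix.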
=== The reason for post GUID sanitization
Post GUID sanitization was added in a 2011 commit:
https://github.com/WordPress/WordPress/commit/81a5f821fbfb63be6c5517d033b8e7a0a4172f07
The commit message does not state why post GUIDs need to be sanitized on
save and display. The commit is also old enough that even members of the
core channel of the WordPress Slack group could not explain the reason.
At first it was thought that the reason was RSS feed export, where `&`
must be escaped to satisfy the XML specification. But after some
experiments and inspecting the core source code, I found that WordPress
core already converts `&` to `&amp;` while exporting an RSS2 feed, even
after I changed the `&amp;` back to `&` in the database via a MySQL
client. The conversion is done by the function `wptexturize` in
`wp-includes/formatting.php`, which is added as a default `the_content`
filter.
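
The general idea behind that observation, storing the raw URL and escaping only when it is written into XML, can be sketched in plain PHP. The helper name `sketch_guid_for_xml` is hypothetical, and this is not the code path WordPress uses for its feeds; it only shows that an unescaped stored value can still produce well-formed feed output.

```php
<?php
// Hypothetical helper: escapes a raw stored GUID at the moment it is written
// into XML. Plain PHP for illustration, not the WordPress feed code.
function sketch_guid_for_xml( string $raw_guid ): string {
    return htmlspecialchars( $raw_guid, ENT_XML1 | ENT_QUOTES, 'UTF-8' );
}

// The value as it would sit in the database without the kses pass.
$stored = 'https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png';

echo '<guid isPermaLink="false">', sketch_guid_for_xml( $stored ), '</guid>', PHP_EOL;
```

With this split, the database keeps the URL exactly as the remote server expects it, and each output context applies its own escaping.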
So I really don't understand why post GUIDs should be sanitized,
especially for the `&` issue. This might be a core issue rather than a
plugin issue. It might be fine not to add `wp_filter_kses` as a default
`pre_post_guid` filter, i.e. to skip the post GUID sanitization.
This issue has also been discussed here:
https://github.com/zzxiang/external-media-without-import/issues/17
--
Ticket URL: <https://core.trac.wordpress.org/ticket/48106>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform