[wp-trac] [WordPress Trac] #19998: Feeds can contain characters that are not valid XML
WordPress Trac
wp-trac at lists.automattic.com
Fri Feb 10 06:57:45 UTC 2012
#19998: Feeds can contain characters that are not valid XML
--------------------------+------------------------------
Reporter: westi | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Feeds | Version: 3.3.1
Severity: normal | Resolution:
Keywords: needs-patch |
--------------------------+------------------------------
Comment (by solarissmoke):
One approach could be to filter using the set of valid characters from the
[http://www.w3.org/TR/REC-xml/#charsets spec]:
{{{
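function strip_for_xml( $utf8 ) {
	// Replace each run of characters outside the XML 1.0 Char production
	// (#x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF)
	// with a single space.
	return preg_replace(
		'/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]+/u',
		' ',
		$utf8
	);
}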
}}}
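This assumes that the feed is served as UTF-8: the /u modifier makes PCRE
treat the subject as UTF-8, and preg_replace() returns NULL if the subject
is not valid UTF-8. I've no idea what it would do to XML in other charsets.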
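
For feeds in other charsets, one option might be to convert the text to
UTF-8 before filtering and convert it back afterwards. The following is a
minimal sketch, assuming the mbstring extension is available;
strip_for_xml_charset() and its $charset parameter are hypothetical names
used for illustration, not part of WordPress. The pattern also allows the
#x10000-#x10FFFF range from the spec's Char production.

```php
<?php
// Hypothetical helper for illustration only; not part of WordPress core.
// Assumes the mbstring extension is loaded.
function strip_for_xml_charset( $text, $charset = 'UTF-8' ) {
	// Convert to UTF-8 first so the /u pattern receives UTF-8 input.
	$utf8 = mb_convert_encoding( $text, 'UTF-8', $charset );

	// Replace each run of characters outside the XML 1.0 Char production
	// with a single space.
	$clean = preg_replace(
		'/[^\x{0009}\x{000A}\x{000D}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]+/u',
		' ',
		$utf8
	);

	// preg_replace() returns null on error (for example, malformed UTF-8
	// with the /u modifier); fall back to an empty string in that case.
	if ( null === $clean ) {
		return '';
	}

	// Convert back to the charset the feed is served in.
	return mb_convert_encoding( $clean, $charset, 'UTF-8' );
}
```

For example, strip_for_xml_charset( $content, 'ISO-8859-1' ) would filter
a Latin-1 string and return it in Latin-1. Whether the round trip is safe
for every charset a feed might use is untested here.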
--
Ticket URL: <http://core.trac.wordpress.org/ticket/19998#comment:1>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software