[wp-trac] Re: [WordPress Trac] #4794: xml-rpc should identify encoding

WordPress Trac wp-trac at lists.automattic.com
Sun Jun 14 20:46:58 GMT 2009


#4794: xml-rpc should identify encoding
--------------------------+-------------------------------------------------
 Reporter:  redsweater    |       Owner:  josephscott
     Type:  defect (bug)  |      Status:  new        
 Priority:  normal        |   Milestone:  2.9        
Component:  XML-RPC       |     Version:  2.2.2      
 Severity:  normal        |    Keywords:             
--------------------------+-------------------------------------------------

Old description:

> WordPress provides users with a preference to identify the text encoding
> of the blog's content. But this encoding format is not used to identify
> the content expectations for (most) XML documents generated by
> xmlrpc.php.
>
> Notice that when RSD support was added, the developer who wrote that code
> *did* include the blog's encoding in the document header. But for all
> other XML documents generated (i.e. replies to XML-RPC queries), the
> encoding is omitted.
>
> When the encoding is omitted, as I understand it, the presumed encoding
> is UTF8. In my limited experience with customers running non-UTF8 blogs,
> they tend to use ISO-8859-1 encoding. When they use this encoding and
> also take advantage of some of the accented characters in that set, such
> as 0xE9 or 0xc9, the resulting document is illegal XML because it
> contains characters that are not part of the presumed UTF8 set.
>
> This failure to identify properly the encoding of XML documents can lead
> blog clients to fail to parse the XML, and therefore cause the XML-RPC to
> more or less completely fail for a certain class of users.
>
> I propose that xmlrpc.php be modified such that every XML document it
> generates for the purposes of exposing blog content, be identified as
> being of the encoding specified by the user in Options -> Reading.

New description:

 WordPress provides users with a preference to identify the text encoding
 of the blog's content. But this encoding is not declared in (most) XML
 documents generated by xmlrpc.php.

 Notice that when RSD support was added, the developer who wrote that code
 *did* include the blog's encoding in the document header. But for all
 other XML documents generated (i.e. replies to XML-RPC queries), the
 encoding is omitted.

 When the encoding is omitted, as I understand it, the presumed encoding is
 UTF-8. In my limited experience with customers running non-UTF-8 blogs,
 they tend to use ISO-8859-1 encoding. When they use this encoding and also
 use some of the accented characters in that set, such as 0xE9 (é) or
 0xC9 (É), the resulting document is not well-formed XML because it
 contains byte sequences that are not valid in the presumed UTF-8 encoding.
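
 The failure described above can be reproduced outside WordPress. The
 following is a minimal sketch in Python, not code from xmlrpc.php; the
 helper name try_parse and the sample payload are assumptions made for
 illustration. It parses the same ISO-8859-1 bytes with and without an
 encoding declaration.

```python
import xml.etree.ElementTree as ET

# "café" encoded as ISO-8859-1: the é becomes the single byte 0xE9.
body = "<string>caf\u00e9</string>".encode("iso-8859-1")


def try_parse(document: bytes):
    """Return the element text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


# No encoding declared: the parser assumes UTF-8, and a lone 0xE9 byte
# followed by "<" is not a valid UTF-8 sequence.
undeclared = b'<?xml version="1.0"?>' + body

# Encoding declared: the same bytes are decoded as ISO-8859-1.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

print(try_parse(undeclared))  # None (parse error)
print(try_parse(declared))    # café
```

 Only the XML declaration differs between the two documents; the payload
 bytes are identical.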

 This failure to properly identify the encoding of XML documents can cause
 blog clients to fail to parse the XML, so XML-RPC more or less completely
 fails for a certain class of users.

 I propose that xmlrpc.php be modified so that every XML document it
 generates to expose blog content is identified as using the encoding
 specified by the user in Options -> Reading.
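
 As a rough illustration of the proposal, the sketch below builds an
 XML-RPC style response whose declaration names the configured encoding.
 It is written in Python rather than PHP and is not the WordPress
 implementation; build_response and its blog_charset parameter are
 hypothetical names standing in for the setting under Options -> Reading.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_response(text: str, blog_charset: str = "UTF-8") -> bytes:
    """Serialize a one-string response, declaring blog_charset in the header."""
    declaration = f'<?xml version="1.0" encoding="{blog_charset}"?>\n'
    payload = (
        "<methodResponse><params><param><value><string>"
        f"{escape(text)}"
        "</string></value></param></params></methodResponse>"
    )
    # Characters the charset cannot represent become numeric references,
    # so the output stays parseable whatever the blog contains.
    return (declaration + payload).encode(blog_charset, errors="xmlcharrefreplace")


doc = build_response("\u00c9t\u00e9", blog_charset="ISO-8859-1")
parsed = ET.fromstring(doc).find(".//string").text
print(parsed)
```

 Because the declaration and the byte encoding come from the same setting,
 a client can decode the accented characters without guessing.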

--

Comment(by Denis-de-Bernardy):

 seems still valid.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/4794#comment:8>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

