[wp-trac] Re: [WordPress Trac] #7563: html_entity_decode at RSS Feed import doesn't respect charset of Blog

WordPress Trac wp-trac at lists.automattic.com
Sun Sep 14 20:09:33 GMT 2008


#7563: html_entity_decode at RSS Feed import doesn't respect charset of Blog
------------------------------------------+---------------------------------
 Reporter:  codestyling                   |        Owner:  anonymous
     Type:  defect                        |       Status:  new      
 Priority:  high                          |    Milestone:  2.7      
Component:  General                       |      Version:  2.5.1    
 Severity:  critical                      |   Resolution:           
 Keywords:  rss bug feed encoding damage  |  
------------------------------------------+---------------------------------
Changes (by codestyling):

  * keywords:  => rss bug feed encoding damage
  * version:  => 2.5.1

Comment:

 I have created a patch for the MagpieRSS class so that it handles
 imported feeds correctly.
 The patch covers PHP4, which does not detect the feed's encoding (UTF-8
 feeds are handled as ISO feeds), and PHP5 (with detection), where it
 ensures that ISO-based HTML entities are converted to the UTF-8 target.
 Here are two feeds that get damaged when added to the dashboard:

 -> ISO-8859-1 feed
 {{{
 http://www.maerkischeallgemeine.de/cms/list/6947650?style_only=J&cms_encoding=iso
 }}}

 -> UTF-8 feed with ISO entities (like &auml;)
 {{{
 http://blog.wordpress-deutschland.org/feed
 }}}

 The patch has been tested on PHP4 and PHP5 with both example feeds, and
 both are now displayed correctly. The database also no longer stores
 damaged option values (the original rss.php sometimes produced broken
 serialized data, depending on the feed content).
 The input encoding is detected by applying a regular expression to the
 raw data, and the output encoding is set to the blog's charset, taken
 from the corresponding option value.
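
 As a rough illustration of the approach described above (not the patch
 itself, which changes the PHP MagpieRSS class), the following Python
 sketch detects the input encoding with a regular expression on the raw
 bytes and converts a single text field to the blog's charset. The names
 detect_feed_encoding, convert_field and blog_charset are hypothetical
 and chosen only for this example.

```python
import html
import re

# Matches the encoding attribute of an XML declaration in the raw bytes.
XML_ENCODING_RE = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def detect_feed_encoding(raw, default="utf-8"):
    """Return the encoding named in the XML declaration, or a default."""
    match = XML_ENCODING_RE.search(raw[:1024])
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()


def convert_field(field, source_encoding, blog_charset="utf-8"):
    """Decode one text field, resolve HTML entities, re-encode for the blog.

    Assumption: this runs on individual fields (title, description), not on
    the whole XML document, so unescaping does not break the markup.
    """
    text = field.decode(source_encoding, errors="replace")
    text = html.unescape(text)  # e.g. "&auml;" becomes "ä"
    # Characters the target charset cannot represent become numeric references.
    return text.encode(blog_charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    raw = b'<?xml version="1.0" encoding="ISO-8859-1"?><title>M\xe4rkische</title>'
    source = detect_feed_encoding(raw)
    print(source)
    print(convert_field(b"M\xe4rkische", source, "utf-8"))
```

 The two cases correspond to the two example feeds: ISO-8859-1 bytes are
 decoded with the detected encoding, and named entities in a UTF-8 feed
 are resolved before the text is encoded in the target charset.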

-- 
Ticket URL: <http://trac.wordpress.org/ticket/7563#comment:2>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software

