[wp-hackers] Need internationalization-issue help, UTF-8 RSS feeds...

Ryan Boren ryan at boren.nu
Mon Aug 9 18:39:04 UTC 2004


> So... How programmatically do I keep from stomping the UTF-8 chars?  Even
> when debugging through the feed processing, it looks like it is too late and
> the UTF-8 to ascii (or something) 'stomp' has already occurred.  I wouldn't
> be surprised to find that it is in fact certain PHP XML library calls that I
> am making to convert the XML into structured arrays that is part of the
> problem (and would, painfully, write my own XML converter if need be).  I do
> understand that some of the other string functions I am using will
> completely bork UTF-8 strings at this point (trim/strip/substring functions,
> for example).

The expat parser allows setting the source and target encodings to
UTF-8.  I believe it uses UTF-8 for its internal representation.  Do you
use expat?  Does xml_parser_create("UTF-8") not help?
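
Roughly what I have in mind, as an untested-against-your-feed sketch.  It
assumes PHP's stock xml extension; parse_feed_utf8() and the sample feed
string are made-up names for illustration, not anything in WordPress:

```php
<?php
// Hypothetical helper: parse an XML string into a flat array of nodes
// while asking the parser to keep everything in UTF-8.
function parse_feed_utf8($xml) {
    // Source encoding: tell the parser the input is UTF-8.
    $parser = xml_parser_create('UTF-8');
    // Target encoding: what the parser hands back to PHP code.
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    $values = array();
    $index  = array();
    $ok = xml_parse_into_struct($parser, $xml, $values, $index);
    return $ok ? $values : false;
}

// Made-up sample feed; "\xC3\xA9" is the two-byte UTF-8 form of e-acute.
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
     . "<rss><channel><title>Caf\xC3\xA9</title></channel></rss>";

$title = null;
$nodes = parse_feed_utf8($xml);
if ($nodes !== false) {
    foreach ($nodes as $node) {
        if ($node['tag'] === 'title' && isset($node['value'])) {
            $title = $node['value'];
        }
    }
}

// Dump the bytes so the terminal's own encoding cannot hide a stomp.
echo bin2hex($title), "\n";   // 436166c3a9
```

If the hex dump still shows c3a9 for the accented character, the parser
is not the part doing the stomping, and the byte-oriented string
functions you mention are the next thing to look at.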

Ryan




