[wp-hackers] Content-Type conflict (Mosquito #857)

K Suominen ksuominen at gmail.com
Fri Feb 25 05:07:09 GMT 2005


On Thu, 24 Feb 2005 17:29:10 -0800, Matthew Mullenweg <m at mullenweg.com> wrote:
> K Suominen wrote:
> > - readme.html (yuck, impossible to fix, please make it readme.php instead)
> 
> This will always be a straight HTML file, sorry.

Which means that the only safe way to produce translated versions of
that file is to use entity codes.  Sorry, Ryan.

> > For the files in wp-admin, I guess an acceptable solution would be to
> > hard-code utf-8 as the charset (as is already done with the
> > (ignored-by-browser) <meta> tag), although it means the user could
> > input a blog title that becomes incorrect when the character set is
> > changed.
> 
> What browsers are this?

Firefox 1.0 for sure.

Just to make sure: when I say "ignored-by-browser" I'm referring to
the statement in the Mosquito issue #857 that information from the
response headers is preferred over the meta tag by browsers.  I can
confirm exactly that behaviour on Firefox 1.0.
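
As a rough illustration of that precedence, here is a minimal Python
sketch.  The function name effective_charset is made up for this
example, and the logic is a simplified model of what I observe, not
how any browser is actually implemented:

```python
import re

def effective_charset(header_content_type, html_bytes):
    """Hypothetical helper: pick the charset a browser would likely use.

    Simplified model: a charset in the HTTP Content-Type header wins;
    the <meta> tag is consulted only when the header names no charset.
    """
    m = re.search(r'charset=["\']?([\w-]+)', header_content_type or "", re.I)
    if m:
        return m.group(1).lower()
    m = re.search(rb'<meta[^>]+charset=["\']?([\w-]+)', html_bytes, re.I)
    if m:
        return m.group(1).decode("ascii").lower()
    return None

page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

# Header names a charset: the <meta> tag is effectively ignored.
print(effective_charset("text/html; charset=iso-8859-1", page))

# Header names no charset: the <meta> tag gets its say.
print(effective_charset("text/html", page))
```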

You should be able to reproduce the problem on your own server and browser:
- tell Apache:  AddDefaultCharset iso-8859-1  (applies to html and txt files)
- tell PHP: default_charset="iso-8859-1"  (applies to php files)

Then browse a file with UTF-8 characters and see how the <meta> tag
has no effect, and you see garbage instead of the expected multibyte
characters...

To view a sample from my server, try this one:

http://kimmo.suominen.com/stuff/readme.html

You should be seeing characters like "ä" and "ö", but you will
see sequences like "Ã¤" and "Ã¶" instead.  This is because your browser
is showing each byte of the multibyte characters, assuming ISO-8859-1
encoding as indicated by the response headers, and effectively
ignoring the <meta http-equiv="Content-Type" content="text/html;
charset=utf-8"> tag inside the document.
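
The byte-level effect can be sketched in a few lines of Python.  This
only mimics the decoding step and assumes nothing about the browser
beyond "decode the bytes with the charset from the header":

```python
# What the server sends: the UTF-8 encoding of two characters.
text = "\u00e4\u00f6"                 # "ä" and "ö"
raw = text.encode("utf-8")            # four bytes: C3 A4 C3 B6

# What is displayed if the bytes are decoded as ISO-8859-1,
# as the response header claims: every byte becomes one character.
misread = raw.decode("iso-8859-1")

# What is displayed if the utf-8 declaration is honoured.
correct = raw.decode("utf-8")

print(raw)
print(misread)
print(correct)
```

Each two-byte UTF-8 sequence turns into two ISO-8859-1 characters,
which is the garbage described above.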


Regards,
+ Kim

