[wp-hackers] 8bit ASCII characters in mail-headers

Sebastian Herp newsletter at scytheman.net
Wed Sep 8 14:18:14 UTC 2004


drDave wrote:

>
> On Sep 8, 2004, at 9:25 PM, Sebastian Herp wrote:
>
>> Hello again,
>>
>> I played around with the comment notification function and as drDave 
>> suspected, WordPress doesn't respect RFCs :-)
>
>
> damn. I didn't do anything... not me sir...

I am sorry, I am not a native English speaker :-) "suspected" was the 
wrong word ... you "asked" if WordPress respects the RFCs :-)

>> So I wrote a little function that converts Subject and From headers 
>> correctly into the "Q" encoding described in RFC1522 
>> (http://rfc.net/rfc1522.html). I have tested it with WordPress 1.2 
>> and it works perfectly :-) Every ÖÄÜß is encoded ... hurray!!!! Is 
>> there enough interest on the developer side to implement 
>> this in WordPress
>
>
> I say there definitely should be one...
> however, one important question: you are encoding these headers using 
> UTF-8, right?
> If we want WP to be truly usable by non-English speakers, supporting 
> non-ASCII titles is vital, even more so for non-Latin character sets 
> (kanji, Arabic, Hebrew etc.) where the whole sequence would be absolutely 
> unreadable. So such a function must make sure it supports all the 
> encodings it could get fed, convert to UTF-8 (if mb_string is 
> available) and then encode according to RFC1522.
>
Encoding is a strong word ... I am not really encoding anything and if 
you look at the "Q" encoding thing, we don't have to :-) I am only 
converting all chars which have the 8th bit set into hex code (e.g. 
"=F6") and declaring the charset to be whatever encoding the WordPress 
admin has set in his options. That should work for everyone ...

Example:
"Ich liebe Umlaute: äöüß~whatever trala!πक" becomes
=?UTF-8?Q?Ich liebe Umlaute: =C3=A4=C3=B6=C3=BC=C3=9F~whatever trala!=CF?=
=?UTF-8?Q?=80=E0=A4=95?=
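
For readers who want to try the idea, here is a minimal sketch in Python (not the WordPress function itself, which is not shown in this mail). The name q_encode_header and its parameters are made up for illustration. It hex-encodes every byte with the 8th bit set, as described above. Unlike the example output above, it also writes spaces as "_" and escapes "=", "?" and "_", which RFC1522 asks for inside an encoded word, and it does not split long headers into several encoded words.

```python
def q_encode_header(text: str, charset: str = "UTF-8") -> str:
    """Wrap a header value in a single RFC1522-style "Q" encoded word.

    Hypothetical helper for illustration only; it does not fold long
    values into multiple encoded words.
    """
    raw = text.encode(charset)
    parts = []
    for byte in raw:
        if byte == 0x20:
            # A literal space is not allowed inside an encoded word.
            parts.append("_")
        elif byte >= 0x7F or byte < 0x20 or byte in b"=?_":
            # 8-bit bytes, control characters and the characters that
            # have a special meaning in "Q" encoding become =XX.
            parts.append(f"={byte:02X}")
        else:
            parts.append(chr(byte))
    return f"=?{charset}?Q?{''.join(parts)}?="


if __name__ == "__main__":
    print(q_encode_header("Ich liebe Umlaute: äöüß"))
    print(q_encode_header("ö", "ISO-8859-1"))
```

The charset label is simply whatever the text was encoded with, so the same function serves a blog configured for ISO-8859-1 as well as one configured for UTF-8.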

>> (had a bad experience with the wp-calendar which _still_ does not 
>> know that there are nations not using Sunday as the first day of the 
>> week)? If yes, I'll get it to work with the CVS version and post the 
>> diffs ...
>
>
> that sounds to me like something which should be implemented. there 
> are definitely a lot of countries (actually, pretty much anywhere else 
> beside the US) that start the week on Monday and I can imagine the 
> US-style of wp-calendar makes it mostly useless for them.
>
> Hope this gets implemented in the core one way or another (support for 
> UTF-8 needs to be more consistent, imho)...

Calendar: yes
UTF-8: this header thing has nothing to do with UTF-8 ... it doesn't 
matter what charset is used, only the fact that it IS "encoded" ...


