[wp-hackers] Internationalized URIs

Ryan Boren ryan at boren.nu
Sat Nov 20 06:39:07 UTC 2004


On Fri, 2004-11-19 at 17:32 -0800, Matt Mullenweg wrote:
> Are these working for anyone else?
> 
> It's not for me:
> 
> http://photomatt.net/2004/09/07/
> 
> But I wanted to confirm that this wasn't just a bug on my site.

I have posts on my testbeds that mix various languages together, and
everything works fine.  Check out my usual testbed to see
Iñtërnâtiônàlizætiøn permalinked.

I think you're having a problem because that URI was created with the
old plugin, which didn't perform decomposition of Latin-1 Supplement and
Latin Extended-A characters into unaccented characters.  The new,
built-in I18N URI sanitizer does perform decomposition.  This difference
presents a problem because the server decodes encoded URIs before
WordPress gets them.  Thus, WP must run the URI through the sanitizer
before comparing to the DB.  Since the title in your DB was sanitized
without decomposition and the incoming URI is sanitized with
decomposition, they don't match.  Yes, this sucks since permalinks can
break if the sanitizer algorithm changes.  If the server simply passed
the encoded URI through, we would be okay.
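
A minimal sketch of the mismatch described above, written in Python
rather than WordPress's PHP.  Both sanitizers are hypothetical stand-ins,
not the actual plugin or core code: sanitize_old only percent-encodes,
while sanitize_new first strips accents (here via Unicode NFKD
normalization, which is an assumption about how decomposition could be
done, not how WordPress does it).  The title "Café Olé" is made up for
illustration.

```python
import unicodedata
from urllib.parse import quote, unquote


def sanitize_old(title: str) -> str:
    """Hypothetical old-plugin behaviour: percent-encode, no decomposition."""
    slug = title.lower().replace(" ", "-")
    return quote(slug, safe="-").lower()


def sanitize_new(title: str) -> str:
    """Hypothetical built-in behaviour: drop accents, then percent-encode."""
    decomposed = unicodedata.normalize("NFKD", title)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    slug = unaccented.lower().replace(" ", "-")
    return quote(slug, safe="-").lower()


# Slug stored in the DB when the post was saved under the old sanitizer.
stored = sanitize_old("Caf\u00e9 Ol\u00e9")   # "caf%c3%a9-ol%c3%a9"

# The server decodes the percent-encoded request before WP sees it.
decoded = unquote(stored)                      # "café-olé"

# WP re-sanitizes the decoded path with the new sanitizer before the lookup.
lookup = sanitize_new(decoded)                 # "cafe-ole"

matches = (lookup == stored)                   # False: the permalink breaks
```

If the server passed the encoded request through untouched, the lookup
key would simply be the stored slug itself and the comparison would
succeed, whichever sanitizer produced it.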

Ryan



