[wp-trac] Re: [WordPress Trac] #5964: Multi-word tags encoded incorrectly

WordPress Trac wp-trac at lists.automattic.com
Fri Feb 29 19:35:02 GMT 2008


#5964: Multi-word tags encoded incorrectly
---------------------------+------------------------------------------------
 Reporter:  DavidMeade     |        Owner:  anonymous
     Type:  defect         |       Status:  new      
 Priority:  normal         |    Milestone:  2.6      
Component:  General        |      Version:           
 Severity:  normal         |   Resolution:           
 Keywords:  tags, tagging  |  
---------------------------+------------------------------------------------
Comment (by andreashaugstrup):

 +1 for fixing

 As has been pointed out, this is a technical issue and not an aesthetic
 one. As has also been pointed out, URL encodings were standardized long
 before anyone ever thought about WordPress. RFC 1738 from 1994 (!)
 designates the space as an unsafe character that must always be encoded
 within a URL as %20. Later, the HTML specification also allowed a plus
 sign to stand for a space in form-encoded data.
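
 As a rough illustration of the two encodings, here is a minimal sketch
 using Python's standard urllib.parse module. Python is only a stand-in
 for demonstration and is not what WordPress itself runs on; the variable
 names are made up for this example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

tag = "my wet suit"

percent_encoded = quote(tag)       # space becomes %20
plus_encoded = quote_plus(tag)     # space becomes +

# Both forms decode back to the original tag, as long as the matching
# decoder is used: unquote() leaves "+" alone, unquote_plus() does not.
decoded_percent = unquote(percent_encoded)
decoded_plus = unquote_plus(plus_encoded)

print(percent_encoded, plus_encoded)
print(decoded_percent == tag, decoded_plus == tag)
```

 Either encoding is reversible, so the consumer of the URL can always
 recover the exact tag text.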

 No matter what you think, dashes are not a standard just because a popular
 blogging engine uses them. WordPress can of course use any format it
 wishes for internal use, making any kind of substitution. However, when
 it interacts with other parties, WordPress should always follow the
 established standards rather than making up its own. WordPress cannot
 choose to adopt half the rel-tag specification but not the other half.
 Either this bug should be fixed or rel-tag support in WordPress should be
 dropped.

 While dashes are used in WordPress as a substitute for spaces in tags, it
 is not a good solution, and that's why RFC 1738 describes a different
 route (%20). Once a tag contains a hyphenated compound word, it becomes
 impossible to tell whether a dash in the URL stands for a space or for a
 literal hyphen.

 Take for example the tags "my wet suit" and "my wet-suit". These are
 distinct tags that carry different meanings (feel free to make up other
 examples; my native language is not English). But in WordPress they would
 both result in the same tag URL, "my-wet-suit", even though they are
 separate tags. Using the only correct way to encode URLs (RFC 1738), they
 would remain distinct: "my%20wet%20suit" and "my%20wet-suit" respectively,
 or as accepted in HTML: "my+wet+suit" and "my+wet-suit".
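
 The collision can be reproduced with a minimal Python sketch. The
 dash_slug helper below is a hypothetical stand-in for a dash-based
 substitution, written only for this example; it is not WordPress's
 actual slug code, and its exact rules are an assumption.

```python
import re
from urllib.parse import quote

def dash_slug(tag):
    # Hypothetical stand-in for a dash-based slug: lowercase the tag and
    # replace each run of whitespace with a single dash.
    return re.sub(r"\s+", "-", tag.strip().lower())

tags = ["my wet suit", "my wet-suit"]

dash_urls = [dash_slug(t) for t in tags]     # dash substitution
percent_urls = [quote(t) for t in tags]      # percent-encoding

print(dash_urls)
print(percent_urls)
```

 Under the dash substitution both tags map to one URL segment, while
 percent-encoding keeps them apart.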

 This should be fixed in WordPress. Other software systems like Drupal get
 this right.

-- 
Ticket URL: <http://trac.wordpress.org/ticket/5964#comment:11>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software

