[wp-trac] Re: [WordPress Trac] #5964: Multi-word tags encoded incorrectly
WordPress Trac
wp-trac at lists.automattic.com
Fri Feb 29 19:35:02 GMT 2008
#5964: Multi-word tags encoded incorrectly
---------------------------+------------------------------------------------
Reporter: DavidMeade | Owner: anonymous
Type: defect | Status: new
Priority: normal | Milestone: 2.6
Component: General | Version:
Severity: normal | Resolution:
Keywords: tags, tagging |
---------------------------+------------------------------------------------
Comment (by andreashaugstrup):
+1 for fixing
As has been pointed out, this is a technical issue and not an aesthetic
one. As has also been pointed out, URL encodings were standardized long
before anyone ever thought about WordPress. RFC 1738 from 1994 (!)
designates the space as an unsafe character that must always be encoded
within a URL, as %20. Later, the HTML specification also allowed the use
of a plus sign.
No matter what you think, dashes are not standard just because a popular
blogging engine uses them. WordPress can of course use any format it wishes
for internal use, making any kind of substitution. However, when it
interacts with other parties, WordPress should always follow the
established standards rather than making up its own. WordPress cannot
choose to adopt half the rel-tag specification but not the other half.
Either this bug should be fixed or rel-tag support in WordPress should be
dropped.
While dashes are used in WordPress as a substitute for spaces in tags, that
is not a good solution, and that's why RFC 1738 describes a different route
(%20). Once a tag contains a hyphenated compound word, it becomes
impossible to tell from the URL whether a dash stands for a space or
belongs to the compound.
Take for example the tags "my wet suit" and "my wet-suit". These are
distinct tags that carry different meanings (feel free to make up other
examples; my native language is not English). But in WordPress they would
both result in the same tag URL, "my-wet-suit", even though they are
separate tags. Using the only correct way to encode URLs (RFC 1738), they
would remain distinct: "my%20wet%20suit" and "my%20wet-suit" respectively,
or as accepted in HTML, "my+wet+suit" and "my+wet-suit".
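To make the collision concrete, here is a minimal sketch in Python. It is not WordPress code: dash_slug is a hypothetical stand-in that only replaces spaces with dashes, which is an assumption about the substitution described above rather than the actual WordPress function. The percent and plus encodings come from Python's standard urllib.parse module.

```python
from urllib.parse import quote, quote_plus, unquote


def dash_slug(tag: str) -> str:
    """Hypothetical stand-in for the dash substitution: spaces become '-'."""
    return tag.replace(" ", "-")


tags = ["my wet suit", "my wet-suit"]

# Dash substitution: both tags collapse into the same URL segment.
dashed = [dash_slug(t) for t in tags]

# Percent-encoding (space -> %20): the two tags stay distinct.
percent = [quote(t) for t in tags]

# Plus-encoding (space -> +): the two tags also stay distinct.
plus = [quote_plus(t) for t in tags]

# Percent-encoded segments decode back to the original tags.
decoded = [unquote(p) for p in percent]

for tag, d, p, q in zip(tags, dashed, percent, plus):
    print(f"{tag!r}: dash={d} percent={p} plus={q}")
```

Because the dash form maps two different tags to one string, no decoder can recover which tag was meant, while the percent-encoded form round-trips.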
This should be fixed in WordPress. Other software systems like Drupal get
this right.
--
Ticket URL: <http://trac.wordpress.org/ticket/5964#comment:11>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software