[wp-trac] [WordPress Trac] #7099: Comments in POT file should ideally explain meaning of character entities

WordPress Trac wp-trac at lists.automattic.com
Thu Jun 5 12:04:03 GMT 2008


#7099: Comments in POT file should ideally explain meaning of character entities
---------------------+------------------------------------------------------
 Reporter:  leuce    |       Owner:  anonymous                                                        
     Type:  defect   |      Status:  new                                                              
 Priority:  normal   |   Milestone:  2.7                                                              
Component:  General  |     Version:                                                                   
 Severity:  normal   |    Keywords:  i18n comment, translation, l10n, POT, PO, entities, pipe, msgctxt
---------------------+------------------------------------------------------
 The wordpress.pot file contains HTML character entities where one might
 have expected Unicode characters.  I suppose this is normal, since the
 strings come from PHP source whose output is, after all, meant to be
 viewed in a browser.  However, it is not always clear to translators from
 the context what these entity codes mean.

 A translator who has previously translated HTML may be aware of &nbsp;,
 &amp;, &gt; and &lt;, and it is fairly easy to guess what &quot; and
 &mdash; stand for, but there are also many numbered entities.  I suggest
 that these be explained to translators in a comment or in msgctxt
 whenever they occur.

 Well, in bug 7090 nbachiyski said that msgctxt is for context only, not
 for comments, so I'm not sure what the ideal solution might be.

 Here is a list of the entity codes used in the POT file:


 {{{
 &#039; = single quote
 &#8217; = right single quotation mark
 &#8220; = left double quotation mark
 &#8221; = right double quotation mark

 &amp; = ampersand
 &copy; = copyright sign
 &gt; = greater than, closing angle bracket
 &laquo; = left double angle quote
 &lt; = less than, opening angle bracket
 &nbsp; = non-breaking space
 &quot; = straight double quotation mark

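 &#187; and &raquo; = right double angle quote
 &#8212; and &mdash; = em dash
 &#8230; and &hellip; = ellipsis (three dots)
 &#8250; or &rsaquo; = single right angle quote
 &mdash; (see &#8212; above)
 &raquo; (see &#187; above)
 &hellip; (see &#8230; above)
 &rsaquo; (see &#8250; above)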
 }}}
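
A list like the one above could also be generated rather than compiled by hand. The following is only a minimal sketch in Python, not part of WordPress or its POT tooling; the function name `describe_entity` is hypothetical. It relies on the standard library to decode an entity and look up the Unicode name of the resulting character.

```python
import html
import unicodedata


def describe_entity(entity):
    """Return e.g. 'U+2026 HORIZONTAL ELLIPSIS' for '&hellip;' or '&#8230;'.

    Hypothetical helper for illustration only.
    """
    char = html.unescape(entity)
    # html.unescape leaves unknown entities unchanged, and a few named
    # entities decode to more than one code point; reject both cases.
    if char == entity or len(char) != 1:
        raise ValueError("not a single-character entity: %r" % entity)
    return "U+%04X %s" % (ord(char), unicodedata.name(char))


if __name__ == "__main__":
    for code in ("&hellip;", "&#8230;", "&laquo;", "&rsaquo;"):
        print(code, "=", describe_entity(code))
```

The official Unicode names are rather formal, so a hand-written description may still be friendlier to translators; the sketch mainly shows that the named and numbered forms resolve to the same character.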

 Here is an example of three such cases, and how they might be made more
 useful to translators if msgctxt is used (but see also bug 7090):


 {{{
 #: wp-includes/script-loader.php:98
 msgid "Crunching&hellip;"
 msgstr ""

 #: wp-includes/script-loader.php:164
 msgid "&laquo; Back"
 msgstr ""

 #: wp-includes/script-loader.php:176
 msgid "Send to editor &raquo;"
 msgstr ""
 }}}

 to:

 {{{
 #: wp-includes/script-loader.php:98
 msgctxt "The entity &hellip; is an ellipsis, or three dots"
 msgid "Crunching&hellip;"
 msgstr ""

 #: wp-includes/script-loader.php:164
 msgctxt ""
 "The entity &laquo; is a left angle quote, similar to "
 "<, which is a less-than sign, or a stemless "
 "arrow pointing left."
 msgid "&laquo; Back"
 msgstr ""

 #: wp-includes/script-loader.php:176
 msgctxt ""
 "The entity &raquo; is a right angle quote, similar to "
 ">, which is a greater-than sign, or a stemless "
 "arrow pointing right."
 msgid "Send to editor &raquo;"
 msgstr ""
 }}}
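
Since msgctxt may not be the right place for explanations, one alternative worth considering is the `#.` extracted-comment line that the PO format provides for notes to translators. The sketch below is an assumption about how such comments could be produced automatically, not a description of how the WordPress POT file is actually built; `entity_comments` and `ENTITY_RE` are hypothetical names.

```python
import html
import re
import unicodedata
from typing import List

# Matches decimal (&#8230;), hexadecimal (&#x2026;) and named (&hellip;) entities.
ENTITY_RE = re.compile(r"&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def entity_comments(msgid: str) -> List[str]:
    """Return one '#.' comment line per distinct entity found in msgid."""
    comments = []
    seen = set()
    for match in ENTITY_RE.finditer(msgid):
        entity = match.group(0)
        char = html.unescape(entity)
        # Skip repeats, unknown entities, and multi-code-point entities.
        if entity in seen or char == entity or len(char) != 1:
            continue
        seen.add(entity)
        name = unicodedata.name(char, "unnamed character")
        comments.append("#. %s is %s (U+%04X %s)" % (entity, char, ord(char), name))
    return comments


if __name__ == "__main__":
    for text in ("Crunching&hellip;", "&laquo; Back", "Send to editor &raquo;"):
        for line in entity_comments(text):
            print(line)
        print('msgid "%s"' % text)
        print('msgstr ""')
        print()
```

Each comment shows the entity, the literal character and its Unicode name, so the translator sees the character itself while the msgid remains unchanged.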

 Of course, the ideal would be simply to display the Unicode character in
 the comments, but I'm not sure that is possible.  If it were, why not
 just use the Unicode characters directly in the PHP files anyway?

-- 
Ticket URL: <http://trac.wordpress.org/ticket/7099>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software

