[wp-trac] [WordPress Trac] #34677: Inline comments for remove_accents()

WordPress Trac noreply at wordpress.org
Mon Nov 16 18:39:11 UTC 2015


#34677: Inline comments for remove_accents()
--------------------------+------------------------------
 Reporter:  John_Schlick  |       Owner:
     Type:  enhancement   |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  Formatting    |     Version:  trunk
 Severity:  normal        |  Resolution:
 Keywords:                |     Focuses:  docs
--------------------------+------------------------------

Comment (by John_Schlick):

 Here is the completed version with all the comments; the characters are in
 Unicode order, with their full Unicode names:
 {{{#!php
 <?php
                         $chars = array(
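                         // Note on the keys (a sketch, assuming each key is the UTF-8 byte
                         // sequence of the character, as the U+ annotations suggest):
                         //   2-byte key chr(a).chr(b):
                         //     code point = ((a & 0x1F) << 6) | (b & 0x3F)
                         //   3-byte key chr(a).chr(b).chr(c):
                         //     code point = ((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F)
                         // Worked examples:
                         //   chr(195).chr(128)          -> (0x03 << 6) | 0x00 = 0x00C0
                         //   chr(225).chr(186).chr(160) -> (0x01 << 12) | (0x3A << 6) | 0x20 = 0x1EA0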
                                 // U+00A3 | £ | POUND SIGN <- why is this eliminated?
                                 chr(194).chr(163) => '',
                         // Decompositions for Latin-1 Supplement
                                 // U+00AA | ª | FEMININE ORDINAL INDICATOR
                                 chr(194).chr(170) => 'a',
                                 // U+00BA | º | MASCULINE ORDINAL INDICATOR
                                 chr(194).chr(186) => 'o',
                                 // U+00C0 | À | LATIN CAPITAL LETTER A WITH GRAVE
                                 chr(195).chr(128) => 'A',
                                 // U+00C1 | Á | LATIN CAPITAL LETTER A WITH ACUTE
                                 chr(195).chr(129) => 'A',
                                 // U+00C2 | Â | LATIN CAPITAL LETTER A WITH CIRCUMFLEX
                                 chr(195).chr(130) => 'A',
                                 // U+00C3 | Ã | LATIN CAPITAL LETTER A WITH TILDE
                                 chr(195).chr(131) => 'A',
                                 // U+00C4 | Ä | LATIN CAPITAL LETTER A WITH DIAERESIS
                                 chr(195).chr(132) => 'A',
                                 // U+00C5 | Å | LATIN CAPITAL LETTER A WITH RING ABOVE
                                 chr(195).chr(133) => 'A',
                                 // U+00C6 | Æ | LATIN CAPITAL LETTER AE
                                 chr(195).chr(134) => 'AE',
                                 // U+00C7 | Ç | LATIN CAPITAL LETTER C WITH CEDILLA
                                 chr(195).chr(135) => 'C',
                                 // U+00C8 | È | LATIN CAPITAL LETTER E WITH GRAVE
                                 chr(195).chr(136) => 'E',
                                 // U+00C9 | É | LATIN CAPITAL LETTER E WITH ACUTE
                                 chr(195).chr(137) => 'E',
                                 // U+00CA | Ê | LATIN CAPITAL LETTER E WITH CIRCUMFLEX
                                 chr(195).chr(138) => 'E',
                                 // U+00CB | Ë | LATIN CAPITAL LETTER E WITH DIAERESIS
                                 chr(195).chr(139) => 'E',
                                 // U+00CC | Ì | LATIN CAPITAL LETTER I WITH GRAVE
                                 chr(195).chr(140) => 'I',
                                 // U+00CD | Í | LATIN CAPITAL LETTER I WITH ACUTE
                                 chr(195).chr(141) => 'I',
                                 // U+00CE | Î | LATIN CAPITAL LETTER I WITH CIRCUMFLEX
                                 chr(195).chr(142) => 'I',
                                 // U+00CF | Ï | LATIN CAPITAL LETTER I WITH DIAERESIS
                                 chr(195).chr(143) => 'I',
                                 // U+00D0 | Ð | LATIN CAPITAL LETTER ETH
                                 chr(195).chr(144) => 'D',
                                 // U+00D1 | Ñ | LATIN CAPITAL LETTER N WITH TILDE
                                 chr(195).chr(145) => 'N',
                                 // U+00D2 | Ò | LATIN CAPITAL LETTER O WITH GRAVE
                                 chr(195).chr(146) => 'O',
                                 // U+00D3 | Ó | LATIN CAPITAL LETTER O WITH ACUTE
                                 chr(195).chr(147) => 'O',
                                 // U+00D4 | Ô | LATIN CAPITAL LETTER O WITH CIRCUMFLEX
                                 chr(195).chr(148) => 'O',
                                 // U+00D5 | Õ | LATIN CAPITAL LETTER O WITH TILDE
                                 chr(195).chr(149) => 'O',
                                 // U+00D6 | Ö | LATIN CAPITAL LETTER O WITH DIAERESIS
                                 chr(195).chr(150) => 'O',
                                 // U+00D8 | Ø | LATIN CAPITAL LETTER O WITH STROKE
                                 chr(195).chr(152) => 'O',
                                 // U+00D9 | Ù | LATIN CAPITAL LETTER U WITH GRAVE
                                 chr(195).chr(153) => 'U',
                                 // U+00DA | Ú | LATIN CAPITAL LETTER U WITH ACUTE
                                 chr(195).chr(154) => 'U',
                                 // U+00DB | Û | LATIN CAPITAL LETTER U WITH CIRCUMFLEX
                                 chr(195).chr(155) => 'U',
                                 // U+00DC | Ü | LATIN CAPITAL LETTER U WITH DIAERESIS
                                 chr(195).chr(156) => 'U',
                                 // U+00DD | Ý | LATIN CAPITAL LETTER Y WITH ACUTE
                                 chr(195).chr(157) => 'Y',
                                 // U+00DE | Þ | LATIN CAPITAL LETTER THORN
                                 chr(195).chr(158) => 'TH',
                                 // U+00DF | ß | LATIN SMALL LETTER SHARP S
                                 chr(195).chr(159) => 's',
                                 // U+00E0 | à | LATIN SMALL LETTER A WITH GRAVE
                                 chr(195).chr(160) => 'a',
                                 // U+00E1 | á | LATIN SMALL LETTER A WITH ACUTE
                                 chr(195).chr(161) => 'a',
                                 // U+00E2 | â | LATIN SMALL LETTER A WITH CIRCUMFLEX
                                 chr(195).chr(162) => 'a',
                                 // U+00E3 | ã | LATIN SMALL LETTER A WITH TILDE
                                 chr(195).chr(163) => 'a',
                                 // U+00E4 | ä | LATIN SMALL LETTER A WITH DIAERESIS
                                 chr(195).chr(164) => 'a',
                                 // U+00E5 | å | LATIN SMALL LETTER A WITH RING ABOVE
                                 chr(195).chr(165) => 'a',
                                 // U+00E6 | æ | LATIN SMALL LETTER AE
                                 chr(195).chr(166) => 'ae',
                                 // U+00E7 | ç | LATIN SMALL LETTER C WITH CEDILLA
                                 chr(195).chr(167) => 'c',
                                 // U+00E8 | è | LATIN SMALL LETTER E WITH GRAVE
                                 chr(195).chr(168) => 'e',
                                 // U+00E9 | é | LATIN SMALL LETTER E WITH ACUTE
                                 chr(195).chr(169) => 'e',
                                 // U+00EA | ê | LATIN SMALL LETTER E WITH CIRCUMFLEX
                                 chr(195).chr(170) => 'e',
                                 // U+00EB | ë | LATIN SMALL LETTER E WITH DIAERESIS
                                 chr(195).chr(171) => 'e',
                                 // U+00EC | ì | LATIN SMALL LETTER I WITH GRAVE
                                 chr(195).chr(172) => 'i',
                                 // U+00ED | í | LATIN SMALL LETTER I WITH ACUTE
                                 chr(195).chr(173) => 'i',
                                 // U+00EE | î | LATIN SMALL LETTER I WITH CIRCUMFLEX
                                 chr(195).chr(174) => 'i',
                                 // U+00EF | ï | LATIN SMALL LETTER I WITH DIAERESIS
                                 chr(195).chr(175) => 'i',
                                 // U+00F0 | ð | LATIN SMALL LETTER ETH
                                 chr(195).chr(176) => 'd',
                                 // U+00F1 | ñ | LATIN SMALL LETTER N WITH TILDE
                                 chr(195).chr(177) => 'n',
                                 // U+00F2 | ò | LATIN SMALL LETTER O WITH GRAVE
                                 chr(195).chr(178) => 'o',
                                 // U+00F3 | ó | LATIN SMALL LETTER O WITH ACUTE
                                 chr(195).chr(179) => 'o',
                                 // U+00F4 | ô | LATIN SMALL LETTER O WITH CIRCUMFLEX
                                 chr(195).chr(180) => 'o',
                                 // U+00F5 | õ | LATIN SMALL LETTER O WITH TILDE
                                 chr(195).chr(181) => 'o',
                                 // U+00F6 | ö | LATIN SMALL LETTER O WITH DIAERESIS
                                 chr(195).chr(182) => 'o',
                                 // U+00F8 | ø | LATIN SMALL LETTER O WITH STROKE
                                 chr(195).chr(184) => 'o',
                                 // U+00F9 | ù | LATIN SMALL LETTER U WITH GRAVE
                                 chr(195).chr(185) => 'u',
                                 // U+00FA | ú | LATIN SMALL LETTER U WITH ACUTE
                                 chr(195).chr(186) => 'u',
                                 // U+00FB | û | LATIN SMALL LETTER U WITH CIRCUMFLEX
                                 chr(195).chr(187) => 'u',
                                 // U+00FC | ü | LATIN SMALL LETTER U WITH DIAERESIS
                                 chr(195).chr(188) => 'u',
                                 // U+00FD | ý | LATIN SMALL LETTER Y WITH ACUTE
                                 chr(195).chr(189) => 'y',
                                 // U+00FE | þ | LATIN SMALL LETTER THORN
                                 chr(195).chr(190) => 'th',
                                 // U+00FF | ÿ | LATIN SMALL LETTER Y WITH DIAERESIS
                                 chr(195).chr(191) => 'y',
                         // Decompositions for Latin Extended-A
                                 // U+0100 | Ā | LATIN CAPITAL LETTER A WITH MACRON
                                 chr(196).chr(128) => 'A',
                                 // U+0101 | ā | LATIN SMALL LETTER A WITH MACRON
                                 chr(196).chr(129) => 'a',
                                 // U+0102 | Ă | LATIN CAPITAL LETTER A WITH BREVE
                                 chr(196).chr(130) => 'A',
                                 // U+0103 | ă | LATIN SMALL LETTER A WITH BREVE
                                 chr(196).chr(131) => 'a',
                                 // U+0104 | Ą | LATIN CAPITAL LETTER A WITH OGONEK
                                 chr(196).chr(132) => 'A',
                                 // U+0105 | ą | LATIN SMALL LETTER A WITH OGONEK
                                 chr(196).chr(133) => 'a',
                                 // U+0106 | Ć | LATIN CAPITAL LETTER C WITH ACUTE
                                 chr(196).chr(134) => 'C',
                                 // U+0107 | ć | LATIN SMALL LETTER C WITH ACUTE
                                 chr(196).chr(135) => 'c',
                                 // U+0108 | Ĉ | LATIN CAPITAL LETTER C WITH CIRCUMFLEX
                                 chr(196).chr(136) => 'C',
                                 // U+0109 | ĉ | LATIN SMALL LETTER C WITH CIRCUMFLEX
                                 chr(196).chr(137) => 'c',
                                 // U+010A | Ċ | LATIN CAPITAL LETTER C WITH DOT ABOVE
                                 chr(196).chr(138) => 'C',
                                 // U+010B | ċ | LATIN SMALL LETTER C WITH DOT ABOVE
                                 chr(196).chr(139) => 'c',
                                 // U+010C | Č | LATIN CAPITAL LETTER C WITH CARON
                                 chr(196).chr(140) => 'C',
                                 // U+010D | č | LATIN SMALL LETTER C WITH CARON
                                 chr(196).chr(141) => 'c',
                                 // U+010E | Ď | LATIN CAPITAL LETTER D WITH CARON
                                 chr(196).chr(142) => 'D',
                                 // U+010F | ď | LATIN SMALL LETTER D WITH CARON
                                 chr(196).chr(143) => 'd',
                                 // U+0110 | Đ | LATIN CAPITAL LETTER D WITH STROKE
                                 chr(196).chr(144) => 'D',
                                 // U+0111 | đ | LATIN SMALL LETTER D WITH STROKE
                                 chr(196).chr(145) => 'd',
                                 // U+0112 | Ē | LATIN CAPITAL LETTER E WITH MACRON
                                 chr(196).chr(146) => 'E',
                                 // U+0113 | ē | LATIN SMALL LETTER E WITH MACRON
                                 chr(196).chr(147) => 'e',
                                 // U+0114 | Ĕ | LATIN CAPITAL LETTER E WITH BREVE
                                 chr(196).chr(148) => 'E',
                                 // U+0115 | ĕ | LATIN SMALL LETTER E WITH BREVE
                                 chr(196).chr(149) => 'e',
                                 // U+0116 | Ė | LATIN CAPITAL LETTER E WITH DOT ABOVE
                                 chr(196).chr(150) => 'E',
                                 // U+0117 | ė | LATIN SMALL LETTER E WITH DOT ABOVE
                                 chr(196).chr(151) => 'e',
                                 // U+0118 | Ę | LATIN CAPITAL LETTER E WITH OGONEK
                                 chr(196).chr(152) => 'E',
                                 // U+0119 | ę | LATIN SMALL LETTER E WITH OGONEK
                                 chr(196).chr(153) => 'e',
                                 // U+011A | Ě | LATIN CAPITAL LETTER E WITH CARON
                                 chr(196).chr(154) => 'E',
                                 // U+011B | ě | LATIN SMALL LETTER E WITH CARON
                                 chr(196).chr(155) => 'e',
                                 // U+011C | Ĝ | LATIN CAPITAL LETTER G WITH CIRCUMFLEX
                                 chr(196).chr(156) => 'G',
                                 // U+011D | ĝ | LATIN SMALL LETTER G WITH CIRCUMFLEX
                                 chr(196).chr(157) => 'g',
                                 // U+011E | Ğ | LATIN CAPITAL LETTER G WITH BREVE
                                 chr(196).chr(158) => 'G',
                                 // U+011F | ğ | LATIN SMALL LETTER G WITH BREVE
                                 chr(196).chr(159) => 'g',
                                 // U+0120 | Ġ | LATIN CAPITAL LETTER G WITH DOT ABOVE
                                 chr(196).chr(160) => 'G',
                                 // U+0121 | ġ | LATIN SMALL LETTER G WITH DOT ABOVE
                                 chr(196).chr(161) => 'g',
                                 // U+0122 | Ģ | LATIN CAPITAL LETTER G WITH CEDILLA
                                 chr(196).chr(162) => 'G',
                                 // U+0123 | ģ | LATIN SMALL LETTER G WITH CEDILLA
                                 chr(196).chr(163) => 'g',
                                 // U+0124 | Ĥ | LATIN CAPITAL LETTER H WITH CIRCUMFLEX
                                 chr(196).chr(164) => 'H',
                                 // U+0125 | ĥ | LATIN SMALL LETTER H WITH CIRCUMFLEX
                                 chr(196).chr(165) => 'h',
                                 // U+0126 | Ħ | LATIN CAPITAL LETTER H WITH STROKE
                                 chr(196).chr(166) => 'H',
                                 // U+0127 | ħ | LATIN SMALL LETTER H WITH STROKE
                                 chr(196).chr(167) => 'h',
                                 // U+0128 | Ĩ | LATIN CAPITAL LETTER I WITH TILDE
                                 chr(196).chr(168) => 'I',
                                 // U+0129 | ĩ | LATIN SMALL LETTER I WITH TILDE
                                 chr(196).chr(169) => 'i',
                                 // U+012A | Ī | LATIN CAPITAL LETTER I WITH MACRON
                                 chr(196).chr(170) => 'I',
                                 // U+012B | ī | LATIN SMALL LETTER I WITH MACRON
                                 chr(196).chr(171) => 'i',
                                 // U+012C | Ĭ | LATIN CAPITAL LETTER I WITH BREVE
                                 chr(196).chr(172) => 'I',
                                 // U+012D | ĭ | LATIN SMALL LETTER I WITH BREVE
                                 chr(196).chr(173) => 'i',
                                 // U+012E | Į | LATIN CAPITAL LETTER I WITH OGONEK
                                 chr(196).chr(174) => 'I',
                                 // U+012F | į | LATIN SMALL LETTER I WITH OGONEK
                                 chr(196).chr(175) => 'i',
                                 // U+0130 | İ | LATIN CAPITAL LETTER I WITH DOT ABOVE
                                 chr(196).chr(176) => 'I',
                                 // U+0131 | ı | LATIN SMALL LETTER DOTLESS I
                                 chr(196).chr(177) => 'i',
                                 // U+0132 | Ĳ | LATIN CAPITAL LIGATURE IJ
                                 chr(196).chr(178) => 'IJ',
                                 // U+0133 | ĳ | LATIN SMALL LIGATURE IJ
                                 chr(196).chr(179) => 'ij',
                                 // U+0134 | Ĵ | LATIN CAPITAL LETTER J WITH CIRCUMFLEX
                                 chr(196).chr(180) => 'J',
                                 // U+0135 | ĵ | LATIN SMALL LETTER J WITH CIRCUMFLEX
                                 chr(196).chr(181) => 'j',
                                 // U+0136 | Ķ | LATIN CAPITAL LETTER K WITH CEDILLA
                                 chr(196).chr(182) => 'K',
                                 // U+0137 | ķ | LATIN SMALL LETTER K WITH CEDILLA
                                 chr(196).chr(183) => 'k',
                                 // U+0138 | ĸ | LATIN SMALL LETTER KRA
                                 chr(196).chr(184) => 'k',
                                 // U+0139 | Ĺ | LATIN CAPITAL LETTER L WITH ACUTE
                                 chr(196).chr(185) => 'L',
                                 // U+013A | ĺ | LATIN SMALL LETTER L WITH ACUTE
                                 chr(196).chr(186) => 'l',
                                 // U+013B | Ļ | LATIN CAPITAL LETTER L WITH CEDILLA
                                 chr(196).chr(187) => 'L',
                                 // U+013C | ļ | LATIN SMALL LETTER L WITH CEDILLA
                                 chr(196).chr(188) => 'l',
                                 // U+013D | Ľ | LATIN CAPITAL LETTER L WITH CARON
                                 chr(196).chr(189) => 'L',
                                 // U+013E | ľ | LATIN SMALL LETTER L WITH CARON
                                 chr(196).chr(190) => 'l',
                                 // U+013F | Ŀ | LATIN CAPITAL LETTER L WITH MIDDLE DOT
                                 chr(196).chr(191) => 'L',
                                 // U+0140 | ŀ | LATIN SMALL LETTER L WITH MIDDLE DOT
                                 chr(197).chr(128) => 'l',
                                 // U+0141 | Ł | LATIN CAPITAL LETTER L WITH STROKE
                                 chr(197).chr(129) => 'L',
                                 // U+0142 | ł | LATIN SMALL LETTER L WITH STROKE
                                 chr(197).chr(130) => 'l',
                                 // U+0143 | Ń | LATIN CAPITAL LETTER N WITH ACUTE
                                 chr(197).chr(131) => 'N',
                                 // U+0144 | ń | LATIN SMALL LETTER N WITH ACUTE
                                 chr(197).chr(132) => 'n',
                                 // U+0145 | Ņ | LATIN CAPITAL LETTER N WITH CEDILLA
                                 chr(197).chr(133) => 'N',
                                 // U+0146 | ņ | LATIN SMALL LETTER N WITH CEDILLA
                                 chr(197).chr(134) => 'n',
                                 // U+0147 | Ň | LATIN CAPITAL LETTER N WITH CARON
                                 chr(197).chr(135) => 'N',
                                 // U+0148 | ň | LATIN SMALL LETTER N WITH CARON
                                 chr(197).chr(136) => 'n',
                                 // U+0149 | ŉ | LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
                                 chr(197).chr(137) => 'n',
                                 // U+014A | Ŋ | LATIN CAPITAL LETTER ENG
                                 chr(197).chr(138) => 'N',
                                 // U+014B | ŋ | LATIN SMALL LETTER ENG
                                 chr(197).chr(139) => 'n',
                                 // U+014C | Ō | LATIN CAPITAL LETTER O WITH MACRON
                                 chr(197).chr(140) => 'O',
                                 // U+014D | ō | LATIN SMALL LETTER O WITH MACRON
                                 chr(197).chr(141) => 'o',
                                 // U+014E | Ŏ | LATIN CAPITAL LETTER O WITH BREVE
                                 chr(197).chr(142) => 'O',
                                 // U+014F | ŏ | LATIN SMALL LETTER O WITH BREVE
                                 chr(197).chr(143) => 'o',
                                 // U+0150 | Ő | LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
                                 chr(197).chr(144) => 'O',
                                 // U+0151 | ő | LATIN SMALL LETTER O WITH DOUBLE ACUTE
                                 chr(197).chr(145) => 'o',
                                 // U+0152 | Œ | LATIN CAPITAL LIGATURE OE
                                 chr(197).chr(146) => 'OE',
                                 // U+0153 | œ | LATIN SMALL LIGATURE OE
                                 chr(197).chr(147) => 'oe',
                                 // U+0154 | Ŕ | LATIN CAPITAL LETTER R WITH ACUTE
                                 chr(197).chr(148) => 'R',
                                 // U+0155 | ŕ | LATIN SMALL LETTER R WITH ACUTE
                                 chr(197).chr(149) => 'r',
                                 // U+0156 | Ŗ | LATIN CAPITAL LETTER R WITH CEDILLA
                                 chr(197).chr(150) => 'R',
                                 // U+0157 | ŗ | LATIN SMALL LETTER R WITH CEDILLA
                                 chr(197).chr(151) => 'r',
                                 // U+0158 | Ř | LATIN CAPITAL LETTER R WITH CARON
                                 chr(197).chr(152) => 'R',
                                 // U+0159 | ř | LATIN SMALL LETTER R WITH CARON
                                 chr(197).chr(153) => 'r',
                                 // U+015A | Ś | LATIN CAPITAL LETTER S WITH ACUTE
                                 chr(197).chr(154) => 'S',
                                 // U+015B | ś | LATIN SMALL LETTER S WITH ACUTE
                                 chr(197).chr(155) => 's',
                                 // U+015C | Ŝ | LATIN CAPITAL LETTER S WITH CIRCUMFLEX
                                 chr(197).chr(156) => 'S',
                                 // U+015D | ŝ | LATIN SMALL LETTER S WITH CIRCUMFLEX
                                 chr(197).chr(157) => 's',
                                 // U+015E | Ş | LATIN CAPITAL LETTER S WITH CEDILLA
                                 chr(197).chr(158) => 'S',
                                 // U+015F | ş | LATIN SMALL LETTER S WITH CEDILLA
                                 chr(197).chr(159) => 's',
                                 // U+0160 | Š | LATIN CAPITAL LETTER S WITH CARON
                                 chr(197).chr(160) => 'S',
                                 // U+0161 | š | LATIN SMALL LETTER S WITH CARON
                                 chr(197).chr(161) => 's',
                                 // U+0162 | Ţ | LATIN CAPITAL LETTER T WITH CEDILLA
                                 chr(197).chr(162) => 'T',
                                 // U+0163 | ţ | LATIN SMALL LETTER T WITH CEDILLA
                                 chr(197).chr(163) => 't',
                                 // U+0164 | Ť | LATIN CAPITAL LETTER T WITH CARON
                                 chr(197).chr(164) => 'T',
                                 // U+0165 | ť | LATIN SMALL LETTER T WITH CARON
                                 chr(197).chr(165) => 't',
                                 // U+0166 | Ŧ | LATIN CAPITAL LETTER T WITH STROKE
                                 chr(197).chr(166) => 'T',
                                 // U+0167 | ŧ | LATIN SMALL LETTER T WITH STROKE
                                 chr(197).chr(167) => 't',
                                 // U+0168 | Ũ | LATIN CAPITAL LETTER U WITH TILDE
                                 chr(197).chr(168) => 'U',
                                 // U+0169 | ũ | LATIN SMALL LETTER U WITH TILDE
                                 chr(197).chr(169) => 'u',
                                 // U+016A | Ū | LATIN CAPITAL LETTER U WITH MACRON
                                 chr(197).chr(170) => 'U',
                                 // U+016B | ū | LATIN SMALL LETTER U WITH MACRON
                                 chr(197).chr(171) => 'u',
                                 // U+016C | Ŭ | LATIN CAPITAL LETTER U WITH BREVE
                                 chr(197).chr(172) => 'U',
                                 // U+016D | ŭ | LATIN SMALL LETTER U WITH BREVE
                                 chr(197).chr(173) => 'u',
                                 // U+016E | Ů | LATIN CAPITAL LETTER U WITH RING ABOVE
                                 chr(197).chr(174) => 'U',
                                 // U+016F | ů | LATIN SMALL LETTER U WITH RING ABOVE
                                 chr(197).chr(175) => 'u',
                                 // U+0170 | Ű | LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
                                 chr(197).chr(176) => 'U',
                                 // U+0171 | ű | LATIN SMALL LETTER U WITH DOUBLE ACUTE
                                 chr(197).chr(177) => 'u',
                                 // U+0172 | Ų | LATIN CAPITAL LETTER U WITH OGONEK
                                 chr(197).chr(178) => 'U',
                                 // U+0173 | ų | LATIN SMALL LETTER U WITH OGONEK
                                 chr(197).chr(179) => 'u',
                                 // U+0174 | Ŵ | LATIN CAPITAL LETTER W WITH CIRCUMFLEX
                                 chr(197).chr(180) => 'W',
                                 // U+0175 | ŵ | LATIN SMALL LETTER W WITH CIRCUMFLEX
                                 chr(197).chr(181) => 'w',
                                 // U+0176 | Ŷ | LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
                                 chr(197).chr(182) => 'Y',
                                 // U+0177 | ŷ | LATIN SMALL LETTER Y WITH CIRCUMFLEX
                                 chr(197).chr(183) => 'y',
                                 // U+0178 | Ÿ | LATIN CAPITAL LETTER Y WITH DIAERESIS
                                 chr(197).chr(184) => 'Y',
                                 // U+0179 | Ź | LATIN CAPITAL LETTER Z WITH ACUTE
                                 chr(197).chr(185) => 'Z',
                                 // U+017A | ź | LATIN SMALL LETTER Z WITH ACUTE
                                 chr(197).chr(186) => 'z',
                                 // U+017B | Ż | LATIN CAPITAL LETTER Z WITH DOT ABOVE
                                 chr(197).chr(187) => 'Z',
                                 // U+017C | ż | LATIN SMALL LETTER Z WITH DOT ABOVE
                                 chr(197).chr(188) => 'z',
                                 // U+017D | Ž | LATIN CAPITAL LETTER Z WITH CARON
                                 chr(197).chr(189) => 'Z',
                                 // U+017E | ž | LATIN SMALL LETTER Z WITH CARON
                                 chr(197).chr(190) => 'z',
                                 // U+017F | ſ | LATIN SMALL LETTER LONG S
                                 chr(197).chr(191) => 's',
                         // XXX Add remainder of 198-128 (U+0180) thru 199-191 (U+01FF)
                                 // U+01A0 | Ơ | LATIN CAPITAL LETTER O WITH HORN
                                 chr(198).chr(160) => 'O',
                                 // U+01A1 | ơ | LATIN SMALL LETTER O WITH HORN
                                 chr(198).chr(161) => 'o',
                                 // U+01AF | Ư | LATIN CAPITAL LETTER U WITH HORN
                                 chr(198).chr(175) => 'U',
                                 // U+01B0 | ư | LATIN SMALL LETTER U WITH HORN
                                 chr(198).chr(176) => 'u',
                                 // U+01CD | Ǎ | LATIN CAPITAL LETTER A WITH CARON
                                 chr(199).chr(141) => 'A',
                                 // U+01CE | ǎ | LATIN SMALL LETTER A WITH CARON
                                 chr(199).chr(142) => 'a',
                                 // U+01CF | Ǐ | LATIN CAPITAL LETTER I WITH CARON
                                 chr(199).chr(143) => 'I',
                                 // U+01D0 | ǐ | LATIN SMALL LETTER I WITH CARON
                                 chr(199).chr(144) => 'i',
                                 // U+01D1 | Ǒ | LATIN CAPITAL LETTER O WITH CARON
                                 chr(199).chr(145) => 'O',
                                 // U+01D2 | ǒ | LATIN SMALL LETTER O WITH CARON
                                 chr(199).chr(146) => 'o',
                                 // U+01D3 | Ǔ | LATIN CAPITAL LETTER U WITH CARON
                                 chr(199).chr(147) => 'U',
                                 // U+01D4 | ǔ | LATIN SMALL LETTER U WITH CARON
                                 chr(199).chr(148) => 'u',
                                 // U+01D5 | Ǖ | LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
                                 chr(199).chr(149) => 'U',
                                 // U+01D6 | ǖ | LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
                                 chr(199).chr(150) => 'u',
                                 // U+01D7 | Ǘ | LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
                                 chr(199).chr(151) => 'U',
                                 // U+01D8 | ǘ | LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
                                 chr(199).chr(152) => 'u',
                                 // U+01D9 | Ǚ | LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
                                 chr(199).chr(153) => 'U',
                                 // U+01DA | ǚ | LATIN SMALL LETTER U WITH DIAERESIS AND CARON
                                 chr(199).chr(154) => 'u',
                                 // U+01DB | Ǜ | LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
                                 chr(199).chr(155) => 'U',
                                 // U+01DC | ǜ | LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
                                 chr(199).chr(156) => 'u',
                         // XXX Review of unimplemented codes below this point is necessary.
                                 // U+0218 | Ș | LATIN CAPITAL LETTER S WITH COMMA BELOW
                                 chr(200).chr(152) => 'S',
                                 // U+0219 | ș | LATIN SMALL LETTER S WITH COMMA BELOW
                                 chr(200).chr(153) => 's',
                                 // U+021A | Ț | LATIN CAPITAL LETTER T WITH COMMA BELOW
                                 chr(200).chr(154) => 'T',
                                 // U+021B | ț | LATIN SMALL LETTER T WITH COMMA BELOW
                                 chr(200).chr(155) => 't',
                         // Vowels with diacritic (Chinese, Hanyu Pinyin)
                                 // U+0251 | ɑ | LATIN SMALL LETTER ALPHA
                                 chr(201).chr(145) => 'a',
                                 // U+1EA0 | Ạ | LATIN CAPITAL LETTER A WITH DOT BELOW
                                 chr(225).chr(186).chr(160) => 'A',
                                 // U+1EA1 | ạ | LATIN SMALL LETTER A WITH DOT BELOW
                                 chr(225).chr(186).chr(161) => 'a',
                                 // U+1EA2 | Ả | LATIN CAPITAL LETTER A WITH HOOK ABOVE
                                 chr(225).chr(186).chr(162) => 'A',
                                 // U+1EA3 | ả | LATIN SMALL LETTER A WITH HOOK ABOVE
                                 chr(225).chr(186).chr(163) => 'a',
                                 // U+1EA4 | Ấ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(186).chr(164) => 'A',
                                 // U+1EA5 | ấ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(186).chr(165) => 'a',
                                 // U+1EA6 | Ầ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(186).chr(166) => 'A',
                                 // U+1EA7 | ầ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(186).chr(167) => 'a',
                                 // U+1EA8 | Ẩ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(186).chr(168) => 'A',
                                 // U+1EA9 | ẩ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(186).chr(169) => 'a',
                                 // U+1EAA | Ẫ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(186).chr(170) => 'A',
                                 // U+1EAB | ẫ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(186).chr(171) => 'a',
                                 // U+1EAC | Ậ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(186).chr(172) => 'A',
                                 // U+1EAD | ậ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(186).chr(173) => 'a',
                                 // U+1EAE | Ắ | LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
                                 chr(225).chr(186).chr(174) => 'A',
                                 // U+1EAF | ắ | LATIN SMALL LETTER A WITH BREVE AND ACUTE
                                 chr(225).chr(186).chr(175) => 'a',
                                 // U+1EB0 | Ằ | LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
                                 chr(225).chr(186).chr(176) => 'A',
                                 // U+1EB1 | ằ | LATIN SMALL LETTER A WITH BREVE AND GRAVE
                                 chr(225).chr(186).chr(177) => 'a',
                                 // U+1EB2 | Ẳ | LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
                                 chr(225).chr(186).chr(178) => 'A',
                                 // U+1EB3 | ẳ | LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
                                 chr(225).chr(186).chr(179) => 'a',
                                 // U+1EB4 | Ẵ | LATIN CAPITAL LETTER A WITH BREVE AND TILDE
                                 chr(225).chr(186).chr(180) => 'A',
                                 // U+1EB5 | ẵ | LATIN SMALL LETTER A WITH BREVE AND TILDE
                                 chr(225).chr(186).chr(181) => 'a',
                                 // U+1EB6 | Ặ | LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
                                 chr(225).chr(186).chr(182) => 'A',
                                 // U+1EB7 | ặ | LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
                                 chr(225).chr(186).chr(183) => 'a',
                                 // U+1EB8 | Ẹ | LATIN CAPITAL LETTER E WITH DOT BELOW
                                 chr(225).chr(186).chr(184) => 'E',
                                 // U+1EB9 | ẹ | LATIN SMALL LETTER E WITH DOT BELOW
                                 chr(225).chr(186).chr(185) => 'e',
                                 // U+1EBA | Ẻ | LATIN CAPITAL LETTER E WITH HOOK ABOVE
                                 chr(225).chr(186).chr(186) => 'E',
                                 // U+1EBB | ẻ | LATIN SMALL LETTER E WITH HOOK ABOVE
                                 chr(225).chr(186).chr(187) => 'e',
                                 // U+1EBC | Ẽ | LATIN CAPITAL LETTER E WITH TILDE
                                 chr(225).chr(186).chr(188) => 'E',
                                 // U+1EBD | ẽ | LATIN SMALL LETTER E WITH TILDE
                                 chr(225).chr(186).chr(189) => 'e',
                                 // U+1EBE | Ế | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(186).chr(190) => 'E',
                                 // U+1EBF | ế | LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(186).chr(191) => 'e',
                                 // U+1EC0 | Ề | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(187).chr(128) => 'E',
                                 // U+1EC1 | ề | LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(187).chr(129) => 'e',
                                 // U+1EC2 | Ể | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(187).chr(130) => 'E',
                                 // U+1EC3 | ể | LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(187).chr(131) => 'e',
                                 // U+1EC4 | Ễ | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(187).chr(132) => 'E',
                                 // U+1EC5 | ễ | LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(187).chr(133) => 'e',
                                 // U+1EC6 | Ệ | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(187).chr(134) => 'E',
                                 // U+1EC7 | ệ | LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(187).chr(135) => 'e',
                                 // U+1EC8 | Ỉ | LATIN CAPITAL LETTER I WITH HOOK ABOVE
                                 chr(225).chr(187).chr(136) => 'I',
                                 // U+1EC9 | ỉ | LATIN SMALL LETTER I WITH HOOK ABOVE
                                 chr(225).chr(187).chr(137) => 'i',
                                 // U+1ECA | Ị | LATIN CAPITAL LETTER I WITH DOT BELOW
                                 chr(225).chr(187).chr(138) => 'I',
                                 // U+1ECB | ị | LATIN SMALL LETTER I WITH DOT BELOW
                                 chr(225).chr(187).chr(139) => 'i',
                                 // U+1ECC | Ọ | LATIN CAPITAL LETTER O WITH DOT BELOW
                                 chr(225).chr(187).chr(140) => 'O',
                                 // U+1ECD | ọ | LATIN SMALL LETTER O WITH DOT BELOW
                                 chr(225).chr(187).chr(141) => 'o',
                                 // U+1ECE | Ỏ | LATIN CAPITAL LETTER O WITH HOOK ABOVE
                                 chr(225).chr(187).chr(142) => 'O',
                                 // U+1ECF | ỏ | LATIN SMALL LETTER O WITH HOOK ABOVE
                                 chr(225).chr(187).chr(143) => 'o',
                                 // U+1ED0 | Ố | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(187).chr(144) => 'O',
                                 // U+1ED1 | ố | LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
                                 chr(225).chr(187).chr(145) => 'o',
                                 // U+1ED2 | Ồ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(187).chr(146) => 'O',
                                 // U+1ED3 | ồ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
                                 chr(225).chr(187).chr(147) => 'o',
                                 // U+1ED4 | Ổ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(187).chr(148) => 'O',
                                 // U+1ED5 | ổ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
                                 chr(225).chr(187).chr(149) => 'o',
                                 // U+1ED6 | Ỗ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(187).chr(150) => 'O',
                                 // U+1ED7 | ỗ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
                                 chr(225).chr(187).chr(151) => 'o',
                                 // U+1ED8 | Ộ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(187).chr(152) => 'O',
                                 // U+1ED9 | ộ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
                                 chr(225).chr(187).chr(153) => 'o',
                                 // U+1EDA | Ớ | LATIN CAPITAL LETTER O WITH HORN AND ACUTE
                                 chr(225).chr(187).chr(154) => 'O',
                                 // U+1EDB | ớ | LATIN SMALL LETTER O WITH HORN AND ACUTE
                                 chr(225).chr(187).chr(155) => 'o',
                                 // U+1EDC | Ờ | LATIN CAPITAL LETTER O WITH HORN AND GRAVE
                                 chr(225).chr(187).chr(156) => 'O',
                                 // U+1EDD | ờ | LATIN SMALL LETTER O WITH HORN AND GRAVE
                                 chr(225).chr(187).chr(157) => 'o',
                                 // U+1EDE | Ở | LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE
                                 chr(225).chr(187).chr(158) => 'O',
                                 // U+1EDF | ở | LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE
                                 chr(225).chr(187).chr(159) => 'o',
                                 // U+1EE0 | Ỡ | LATIN CAPITAL LETTER O WITH HORN AND TILDE
                                 chr(225).chr(187).chr(160) => 'O',
                                 // U+1EE1 | ỡ | LATIN SMALL LETTER O WITH HORN AND TILDE
                                 chr(225).chr(187).chr(161) => 'o',
                                 // U+1EE2 | Ợ | LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW
                                 chr(225).chr(187).chr(162) => 'O',
                                 // U+1EE3 | ợ | LATIN SMALL LETTER O WITH HORN AND DOT BELOW
                                 chr(225).chr(187).chr(163) => 'o',
                                 // U+1EE4 | Ụ | LATIN CAPITAL LETTER U WITH DOT BELOW
                                 chr(225).chr(187).chr(164) => 'U',
                                 // U+1EE5 | ụ | LATIN SMALL LETTER U WITH DOT BELOW
                                 chr(225).chr(187).chr(165) => 'u',
                                 // U+1EE6 | Ủ | LATIN CAPITAL LETTER U WITH HOOK ABOVE
                                 chr(225).chr(187).chr(166) => 'U',
                                 // U+1EE7 | ủ | LATIN SMALL LETTER U WITH HOOK ABOVE
                                 chr(225).chr(187).chr(167) => 'u',
                                 // U+1EE8 | Ứ | LATIN CAPITAL LETTER U WITH HORN AND ACUTE
                                 chr(225).chr(187).chr(168) => 'U',
                                 // U+1EE9 | ứ | LATIN SMALL LETTER U WITH HORN AND ACUTE
                                 chr(225).chr(187).chr(169) => 'u',
                                 // U+1EEA | Ừ | LATIN CAPITAL LETTER U WITH HORN AND GRAVE
                                 chr(225).chr(187).chr(170) => 'U',
                                 // U+1EEB | ừ | LATIN SMALL LETTER U WITH HORN AND GRAVE
                                 chr(225).chr(187).chr(171) => 'u',
                                 // U+1EEC | Ử | LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE
                                 chr(225).chr(187).chr(172) => 'U',
                                 // U+1EED | ử | LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE
                                 chr(225).chr(187).chr(173) => 'u',
                                 // U+1EEE | Ữ | LATIN CAPITAL LETTER U WITH HORN AND TILDE
                                 chr(225).chr(187).chr(174) => 'U',
                                 // U+1EEF | ữ | LATIN SMALL LETTER U WITH HORN AND TILDE
                                 chr(225).chr(187).chr(175) => 'u',
                                 // U+1EF0 | Ự | LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW
                                 chr(225).chr(187).chr(176) => 'U',
                                 // U+1EF1 | ự | LATIN SMALL LETTER U WITH HORN AND DOT BELOW
                                 chr(225).chr(187).chr(177) => 'u',
                                 // U+1EF2 | Ỳ | LATIN CAPITAL LETTER Y WITH GRAVE
                                 chr(225).chr(187).chr(178) => 'Y',
                                 // U+1EF3 | ỳ | LATIN SMALL LETTER Y WITH GRAVE
                                 chr(225).chr(187).chr(179) => 'y',
                                 // U+1EF4 | Ỵ | LATIN CAPITAL LETTER Y WITH DOT BELOW
                                 chr(225).chr(187).chr(180) => 'Y',
                                 // U+1EF5 | ỵ | LATIN SMALL LETTER Y WITH DOT BELOW
                                 chr(225).chr(187).chr(181) => 'y',
                                 // U+1EF6 | Ỷ | LATIN CAPITAL LETTER Y WITH HOOK ABOVE
                                 chr(225).chr(187).chr(182) => 'Y',
                                 // U+1EF7 | ỷ | LATIN SMALL LETTER Y WITH HOOK ABOVE
                                 chr(225).chr(187).chr(183) => 'y',
                                 // U+1EF8 | Ỹ | LATIN CAPITAL LETTER Y WITH TILDE
                                 chr(225).chr(187).chr(184) => 'Y',
                                 // U+1EF9 | ỹ | LATIN SMALL LETTER Y WITH TILDE
                                 chr(225).chr(187).chr(185) => 'y',
                                 // U+20AC | € | EURO SIGN
                                 chr(226).chr(130).chr(172) => 'E',
                         );

 }}}
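
 Each chr() key in the table above is the UTF-8 encoding of the code point
 named in the comment before it, so the two can be checked against each
 other mechanically. What follows is only a minimal sketch of such a check,
 not part of the proposed change: codepoint_to_utf8() is a hypothetical
 helper name (it is not a WordPress function), and it assumes a valid code
 point between U+0000 and U+10FFFF.
 {{{#!php
 <?php
 // Hypothetical helper for illustration only (not a WordPress API):
 // encode a Unicode code point as the UTF-8 byte string the chr() keys spell out.
 function codepoint_to_utf8( $cp ) {
     if ( $cp < 0x80 ) {
         // 1 byte: 0xxxxxxx
         return chr( $cp );
     }
     if ( $cp < 0x800 ) {
         // 2 bytes: 110xxxxx 10xxxxxx
         return chr( 0xC0 | ( $cp >> 6 ) )
             . chr( 0x80 | ( $cp & 0x3F ) );
     }
     if ( $cp < 0x10000 ) {
         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         return chr( 0xE0 | ( $cp >> 12 ) )
             . chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) )
             . chr( 0x80 | ( $cp & 0x3F ) );
     }
     // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
     return chr( 0xF0 | ( $cp >> 18 ) )
         . chr( 0x80 | ( ( $cp >> 12 ) & 0x3F ) )
         . chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) )
         . chr( 0x80 | ( $cp & 0x3F ) );
 }

 // Spot checks: code point from a comment => chr() key from the same entry.
 $checks = array(
     0x01D3 => chr(199).chr(147),
     0x1EA0 => chr(225).chr(186).chr(160),
     0x20AC => chr(226).chr(130).chr(172),
 );
 foreach ( $checks as $cp => $bytes ) {
     $status = ( codepoint_to_utf8( $cp ) === $bytes ) ? 'ok' : 'MISMATCH';
     printf( "U+%04X %s\n", $cp, $status );
 }
 }}}
 Running every comment/key pair through a loop like this would flag any
 entry whose U+XXXX label does not match its byte sequence.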

--
Ticket URL: <https://core.trac.wordpress.org/ticket/34677#comment:2>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

