[wp-trac] [WordPress Trac] #35293: Emoji Regex in wp_encode_emoji() is wildly inaccurate
WordPress Trac
noreply at wordpress.org
Tue Jul 19 14:07:35 UTC 2016
#35293: Emoji Regex in wp_encode_emoji() is wildly inaccurate
--------------------------+-----------------------------
Reporter: pento | Owner: pento
Type: defect (bug) | Status: assigned
Priority: normal | Milestone: Future Release
Component: Emoji | Version: 4.2
Severity: normal | Resolution:
Keywords: | Focuses:
--------------------------+-----------------------------
Changes (by pento):
* keywords: emoji =>
Comment:
[attachment:35293.2.diff] is the framework for generating the PHP regex
from the `twemoji.js` regex.
Proceeding from here is... tricky. The Twemoji regex uses UTF-16 code
units (surrogate pairs), which PHP didn't support until PCRE 8.32 (PHP
5.4.14). There's no way to nicely convert the code point ranges to a
PHP-compatible regex.
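
To illustrate why this is awkward, here is a minimal sketch of the easy
half of the conversion: turning a single literal surrogate pair from a
JavaScript regex source into a PCRE `\x{...}` code point escape. The
helper names (`surrogatePairToCodePoint`, `toPcreEscape`, `convertPairs`)
are hypothetical and are not part of `twemoji.js`, the attached diff, or
WordPress core. It deliberately leaves character classes such as
`\ud83d[\udc00-\udfff]` untouched, since those ranges are the part that
does not convert cleanly.

```javascript
// Hypothetical helpers for illustration only; not from twemoji.js or core.

// Combine a UTF-16 high/low surrogate pair into one Unicode code point.
function surrogatePairToCodePoint(high, low) {
  if (high < 0xD800 || high > 0xDBFF || low < 0xDC00 || low > 0xDFFF) {
    throw new RangeError('Not a valid surrogate pair');
  }
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

// Format a code point as a PCRE escape, usable with the /u modifier in PHP.
function toPcreEscape(codePoint) {
  return '\\x{' + codePoint.toString(16).toUpperCase() + '}';
}

// Rewrite literal "\uD83D\uDE00"-style pairs in a regex source string.
// A high surrogate followed by a character class is left as-is.
function convertPairs(source) {
  var pair = /\\u(d[89ab][0-9a-f]{2})\\u(d[c-f][0-9a-f]{2})/gi;
  return source.replace(pair, function (match, hi, lo) {
    var codePoint = surrogatePairToCodePoint(parseInt(hi, 16), parseInt(lo, 16));
    return toPcreEscape(codePoint);
  });
}
```

For example, `convertPairs('\\ud83d\\ude00')` yields the string
`\x{1F600}`, while `convertPairs('\\ud83d[\\udc00-\\udfff]')` comes back
unchanged and would still need range-aware handling.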
The main problem with using the method from `twemoji-generator.js` is that
it requires a local copy of the Twemoji images, to check which images
Twemoji supports. It also takes us a further step away from the actual
regex we need to build, creating potential inconsistencies.
I would not be averse to providing an accurate regex for PHP versions
that support it, and a more approximate fallback for those that don't.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/35293#comment:9>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform