[wp-trac] [WordPress Trac] #55603: PHP 8.2: address deprecation of the utf8_encode() and utf8_decode() functions

WordPress Trac noreply at wordpress.org
Thu Apr 4 10:10:30 UTC 2024


#55603: PHP 8.2: address deprecation of the utf8_encode() and utf8_decode()
functions
-------------------------------------------------+-------------------------
 Reporter:  jrf                                  |       Owner:  (none)
     Type:  task (blessed)                       |      Status:  assigned
 Priority:  normal                               |   Milestone:  6.6
Component:  General                              |     Version:  6.0
 Severity:  normal                               |  Resolution:
 Keywords:  2nd-opinion php82 dev-feedback has-  |     Focuses:  coding-
  patch has-unit-tests                           |  standards
-------------------------------------------------+-------------------------

Old description:

> == Context
>
> The [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode PHP RFC
> to remove the `utf8_encode()` and `utf8_decode()` functions] from PHP in
> PHP 9.0 has recently been accepted.
>
> This means in effect that as of PHP 8.2, those functions will be
> deprecated and a deprecation notice will be thrown whenever they are
> called.
>
> The reasoning behind the deprecation and removal is that these functions
> are confusing and rarely used correctly.
>
> See the
> [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode#usage Usage
> section] of the RFC for an analysis of the various (mostly incorrect)
> uses of the functions.
>
> == The Problem
>
> The
> [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode#alternatives_to_removed_functionality
> typical replacements for these functions] are using the
> [https://www.php.net/manual/en/book.mbstring.php MBString extension]
> and/or the [https://www.php.net/manual/en/book.iconv.php Iconv
> extension].
>
> As these extensions are both ''optional'' extensions in PHP, they cannot
> be relied on to be available in an open source context.
>
> WordPress uses the `utf8_encode()` function a few times in the codebase:
> * 1 x `utf8_encode()` in `src/wp-admin/includes/export.php`
> * 2 x `utf8_encode()` in `src/wp-admin/includes/image.php`
> * 1 x `utf8_encode()` in `tests/phpunit/tests/kses.php`
>
> Aside from that the external dependency
> [https://github.com/JamesHeinrich/getID3 GetID3] also uses both these
> functions a number of times.
>
> A search of the plugin and theme directory shows more worrying results
> with a plenitude of matches:
> * [https://wpdirectory.net/search/01G16P0SWHB37G2965MP8R4ZYK 11247
> matches in 3315 plugins], including 15 plugins with over a million
> installs.
> * [https://wpdirectory.net/search/01G16P2K39TQ538M9KRTVXT4CA 40 matches
> in 22 themes].
>

> == Options
>
> So, what options do we have?
>
> In my opinion, especially seeing how these functions are used so often in
> plugins, there are only two realistic options:
>
> === 1. We could polyfill these functions.
>
> While some functions which may not be available are polyfilled by WP,
> this is generally only done to have access to ''new'' PHP functionality
> or to allow for using functions which require certain optional extensions
> to be enabled.
>
> As far as I know, no PHP native function has ever been polyfilled due to
> it being removed from PHP.
>
> **Pro**:
> Relatively simple solution and everything keeps working (deprecation
> notices will still show when running on PHP 8.x, though these could be
> silenced).
>
> **Con**:
> As most uses of these functions are likely to be incorrect (especially in
> plugins), these "bugs" will remain unreviewed and unaddressed,
> undercutting the improvement PHP is trying to make.
>
> === 2. We could make the MbString (or the Iconv) extension a requirement
>
> At this moment,
> [https://core.trac.wordpress.org/browser/trunk/src/wp-admin/includes/class-wp-site-health.php#L876
> both the MbString as well as the Iconv extension are recommended, but not
> required by WP].
>
> A couple of MbString functions are also polyfilled in WP, so now might be
> a good time to make the MbString extension a requirement for WP.
>
> **Pro**:
> MbString being available will allow for fixing the deprecations in a
> forward-compatible manner. It will also allow for other code improvements
> to be made to improve WP's support for languages using non-Latin-based
> scripts.
>
> **Con**:
> A new requirement would be added to WP which should not be taken lightly.
> At the same time, it should be noted that MbString is generally enabled
> already anyway, so this will impact only a small percentage of users.
>
> ==== Why MbString instead of Iconv?
>
> While both are bundled with PHP (though not necessarily enabled), Iconv
> [https://www.php.net/manual/en/iconv.requirements.php requires the
> `libiconv` library], which may not be available, while MbString has
> [https://www.php.net/manual/en/mbstring.requirements.php no external
> dependencies].
>
> MbString is [https://www.php.net/manual/en/mbstring.installation.php not
> enabled by default in PHP], but generally ''is'' enabled in practice.
> [https://www.php.net/manual/en/iconv.installation.php Iconv is enabled
> by default] in PHP, but can be disabled.
>
> Having said that, MbString offers much more functionality than the
> limited functionality offered by Iconv and - as indicated by a number of
> functions being polyfilled - is already in use in WP.
>
> Still, it would be helpful if someone with access to the underlying
> statistics data collected by WP could add figures to this issue showing
> how often either extension is enabled on systems running WP.
>

> == Recommendation
>
> I'd strongly recommend option 2, but would like to hear the opinions of
> additional Core devs.
>

> == Action lists
>
> === General
>
> - [ ] Report the issue to GetID3
>

> === Action list for option 1
>
> - [ ] Polyfill the functions.
> - [ ] Review the uses of the functions in WP Core anyway to see whether
> they could/should be removed or the code using them refactored.
> - [ ] Add a note about the polyfills in a dev-note with a recommendation
> for plugin/theme authors to review their use of these functions anyhow.
>
> === Action list for option 2
>
> - [ ] Make the MbString extension a requirement for installing WP/in the
> WP bootstrapping.
> - [ ] Change the MbString extension from optional to required in the Site
> Health component.
> - [ ] Remove the current MbString related polyfills from the `compat.php`
> file.
> - [ ] Review the uses of the functions in WP Core and replace with more
> appropriate alternatives.
> - [ ] Add a note about the deprecation in the PHP 8.2 dev-note with a
> recommendation for plugin/theme authors to review their use of these
> functions and noting that the MbString extension can be relied upon to be
> available (as of WP 6.1).

New description:

 == Context

 The [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode PHP RFC
 to remove the `utf8_encode()` and `utf8_decode()` functions] from PHP in
 PHP 9.0 has recently been accepted.

 This means in effect that as of PHP 8.2, those functions will be
 deprecated and a deprecation notice will be thrown whenever they are
 called.

 The reasoning behind the deprecation and removal is that these functions
 are confusing and rarely used correctly.

 See the [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode#usage
 Usage section] of the RFC for an analysis of the various (mostly
 incorrect) uses of the functions.

 == The Problem

 The
 [https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode#alternatives_to_removed_functionality
 typical replacements for these functions] are using the
 [https://www.php.net/manual/en/book.mbstring.php MBString extension]
 and/or the [https://www.php.net/manual/en/book.iconv.php Iconv extension].
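
 As an illustration of such a replacement, here is a minimal sketch. It
 assumes the input really is ISO-8859-1, which is the only source encoding
 `utf8_encode()` ever handled. The helper name `example_latin1_to_utf8()`
 is hypothetical and not part of WordPress.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): convert an ISO-8859-1
 * string to UTF-8 with whichever optional extension is available.
 *
 * Returns null when neither extension is loaded or the conversion fails.
 */
function example_latin1_to_utf8( string $latin1 ): ?string {
    // MBString: the target encoding comes first, then the source encoding.
    if ( function_exists( 'mb_convert_encoding' ) ) {
        $converted = mb_convert_encoding( $latin1, 'UTF-8', 'ISO-8859-1' );
        return is_string( $converted ) ? $converted : null;
    }

    // Iconv: the source encoding comes first, then the target encoding.
    if ( function_exists( 'iconv' ) ) {
        $converted = iconv( 'ISO-8859-1', 'UTF-8', $latin1 );
        return is_string( $converted ) ? $converted : null;
    }

    // Neither optional extension is loaded.
    return null;
}

// "\xE9" is the single byte for "é" in ISO-8859-1.
$utf8 = example_latin1_to_utf8( "caf\xE9" );
```

 The `null` return for "neither extension loaded" is exactly the case that
 makes these replacements hard to rely on.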

 As these extensions are both ''optional'' extensions in PHP, they cannot
 be relied on to be available in an open source context.

 WordPress uses the `utf8_encode()` function a few times in the codebase:
 * 1 x `utf8_encode()` in `src/wp-admin/includes/export.php`
 * 2 x `utf8_encode()` in `src/wp-admin/includes/image.php`
 * 1 x `utf8_encode()` in `tests/phpunit/tests/kses.php`

 Aside from that the external dependency
 [https://github.com/JamesHeinrich/getID3 GetID3] also uses both these
 functions a number of times.

 A search of the plugin and theme directory shows more worrying results
 with a plenitude of matches:
 * [https://wpdirectory.net/search/01G16P0SWHB37G2965MP8R4ZYK 11247 matches
 in 3315 plugins], including 15 plugins with over a million installs.
 * [https://wpdirectory.net/search/01G16P2K39TQ538M9KRTVXT4CA 40 matches in
 22 themes].


 == Options

 So, what options do we have?

 In my opinion, especially seeing how these functions are used so often in
 plugins, there are only two realistic options:

 === 1. We could polyfill these functions.

 While some functions which may not be available are polyfilled by WP, this
 is generally only done to have access to ''new'' PHP functionality or to
 allow for using functions which require certain optional extensions to be
 enabled.

 As far as I know, no PHP native function has ever been polyfilled due to
 it being removed from PHP.

 **Pro**:
 Relatively simple solution and everything keeps working (deprecation
 notices will still show when running on PHP 8.x, though these could be
 silenced).

 **Con**:
 As most uses of these functions are likely to be incorrect (especially in
 plugins), these "bugs" will remain unreviewed and unaddressed,
 undercutting the improvement PHP is trying to make.
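
 For reference, a pure-PHP polyfill along these lines could look roughly
 like the sketch below. The function name `example_utf8_encode()` is
 hypothetical and is used so the sketch does not clash with the native
 function on PHP 8.x. This is not an implementation proposed in this
 ticket.

```php
<?php
/**
 * Hypothetical sketch: pure-PHP equivalent of utf8_encode(), converting
 * ISO-8859-1 bytes to UTF-8 without any optional extension.
 */
function example_utf8_encode( string $latin1 ): string {
    $utf8   = '';
    $length = strlen( $latin1 );

    for ( $i = 0; $i < $length; $i++ ) {
        $byte = ord( $latin1[ $i ] );

        if ( $byte < 0x80 ) {
            // ASCII bytes are identical in ISO-8859-1 and UTF-8.
            $utf8 .= $latin1[ $i ];
        } else {
            // U+0080..U+00FF become two bytes: 110000xx 10xxxxxx.
            $utf8 .= chr( 0xC0 | ( $byte >> 6 ) ) . chr( 0x80 | ( $byte & 0x3F ) );
        }
    }

    return $utf8;
}

// "\xE9" ("é" in ISO-8859-1) becomes the two bytes "\xC3\xA9".
$encoded = example_utf8_encode( "caf\xE9" );
```

 An actual polyfill would presumably be declared under the native name
 inside an `if ( ! function_exists( 'utf8_encode' ) )` guard, so that it
 only takes effect once PHP no longer provides the function.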

 === 2. We could make the MbString (or the Iconv) extension a requirement

 At this moment,
 [https://core.trac.wordpress.org/browser/trunk/src/wp-admin/includes/class-wp-site-health.php#L876
 both the MbString as well as the Iconv extension are recommended, but not
 required by WP].

 A couple of MbString functions are also polyfilled in WP, so now might be
 a good time to make the MbString extension a requirement for WP.

 **Pro**:
 MbString being available will allow for fixing the deprecations in a
 forward-compatible manner. It will also allow for other code improvements
 to be made to improve WP's support for languages using non-Latin-based
 scripts.

 **Con**:
 A new requirement would be added to WP which should not be taken lightly.
 At the same time, it should be noted that MbString is generally enabled
 already anyway, so this will impact only a small percentage of users.
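
 To illustrate what enforcing such a requirement involves at the PHP
 level, here is a minimal sketch of an extension check. The helper name
 `example_missing_extensions()` is hypothetical. How and where WP would
 run such a check during bootstrapping is left open here.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): report which of the
 * given PHP extensions are not loaded.
 *
 * @param string[] $required Extension names, e.g. array( 'mbstring' ).
 * @return string[] The names of the extensions that are missing.
 */
function example_missing_extensions( array $required ): array {
    $missing = array();

    foreach ( $required as $extension ) {
        if ( ! extension_loaded( $extension ) ) {
            $missing[] = $extension;
        }
    }

    return $missing;
}

$missing = example_missing_extensions( array( 'mbstring' ) );

if ( array() !== $missing ) {
    // A real bootstrap check would stop here with a clear error message.
    echo 'Missing required PHP extension(s): ' . implode( ', ', $missing ) . PHP_EOL;
}
```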

 ==== Why MbString instead of Iconv?

 While both are bundled with PHP (though not necessarily enabled), Iconv
 [https://www.php.net/manual/en/iconv.requirements.php requires the
 `libiconv` library], which may not be available, while MbString has
 [https://www.php.net/manual/en/mbstring.requirements.php no external
 dependencies].

 MbString is [https://www.php.net/manual/en/mbstring.installation.php not
 enabled by default in PHP], but generally ''is'' enabled in practice.
 [https://www.php.net/manual/en/iconv.installation.php Iconv is enabled
 by default] in PHP, but can be disabled.

 Having said that, MbString offers much more functionality than the limited
 functionality offered by Iconv and - as indicated by a number of functions
 being polyfilled - is already in use in WP.

 Still, it would be helpful if someone with access to the underlying
 statistics data collected by WP could add figures to this issue showing
 how often either extension is enabled on systems running WP.


 == Recommendation

 I'd strongly recommend option 2, but would like to hear the opinions of
 additional Core devs.


 == Action lists

 === General

 - [x] Report the issue to GetID3


 === Action list for option 1

 - [ ] Polyfill the functions.
 - [ ] Review the uses of the functions in WP Core anyway to see whether
 they could/should be removed or the code using them refactored.
 - [ ] Add a note about the polyfills in a dev-note with a recommendation
 for plugin/theme authors to review their use of these functions anyhow.

 === Action list for option 2

 - [ ] Make the MbString extension a requirement for installing WP/in the
 WP bootstrapping.
 - [ ] Change the MbString extension from optional to required in the Site
 Health component.
 - [ ] Remove the current MbString related polyfills from the `compat.php`
 file.
 - [ ] Review the uses of the functions in WP Core and replace with more
 appropriate alternatives.
 - [ ] Add a note about the deprecation in the PHP 8.2 dev-note with a
 recommendation for plugin/theme authors to review their use of these
 functions and noting that the MbString extension can be relied upon to be
 available (as of WP 6.1).

--

Comment (by afercia):

 > Report the issue to GetID3

 Looks like this action item is no longer necessary. GetID3 stopped using
 these functions as of version 1.9.23, released on 2023-10-19. See
 https://github.com/JamesHeinrich/getID3/pull/402

 The new GetID3 version is already included in WordPress 6.5, see
 https://core.trac.wordpress.org/changeset/56975

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/55603#comment:74>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

