[wp-trac] [WordPress Trac] #51525: Add new functions apply_filters_single_type() and apply_filters_ref_array_single_type()

WordPress Trac noreply at wordpress.org
Tue Oct 20 11:01:40 UTC 2020


#51525: Add new functions apply_filters_single_type() and
apply_filters_ref_array_single_type()
------------------------------------------+---------------------
 Reporter:  jrf                           |       Owner:  (none)
     Type:  feature request               |      Status:  new
 Priority:  normal                        |   Milestone:  5.6
Component:  General                       |     Version:  trunk
 Severity:  normal                        |  Resolution:
 Keywords:  2nd-opinion needs-patch php8  |     Focuses:
------------------------------------------+---------------------

Comment (by giuseppe.mazzapica):

 Thanks @jrf, instead of answering your points one by one, let me try to
 put in words what I tried to put in code because I think there was some
 misunderstanding.

 To be honest, I'm afraid the direction this ticket is taking lands to now
 where, and that is for the "single type" principle that is being self-
 imposed.

 First: this does not play well with PHP: a variable can be more than one
 type. An array or a string both can also be callable. And the concept of
 "single type" is completely applicable to objects: `object` is pretty much
 useless as it guarantees exactly zero safety.

 Second: it does not play well with WordPress. If I understand correctly
 the idea is to use of these functions you propose in the core. I'm fairly
 certain that there's no way you can do that if a "single type" is
 accepted. There are a huge amount of filter hooks in WordPress that
 depending on several factors can pass `null` or a string or an integer.
 The first example into my mind: a post ID that comes from the DB will be a
 string, if it comes from cache might very well be an integer. Sometimes a
 `$post` variable passed by a core filter is a `WP_Post`, sometimes is an
 integer and sometimes a string. To make it short: if you want to use these
 functions in WP core, that will just not possible without giving up on the
 "single type" thing.

 Second: it makes the whole thing much less useful and powerful. As a
 developer who advocates type-safe code for years, I know very well as the
 "single type" approach that PHP itself had for a long time resulted in
 untyped code. There's no possibility to develop easily with a type system
 that only allows a single and invariant type and so many times the
 alternative to that is use no type, something that PHP allows. What I
 foresee if the "single type" approach stick is that not only will not be
 possible to use it in the core, but it will be much less useful for plugin
 developers.

 I'll try to expand on these topics below, and also answer some of the
 things you posted.


 == Typed hooks

 When I first saw this ticket I thought "too good to be true", because I
 thought about the possibility to finally have **typed hooks** in
 WordPress.

 At Inpsyde, we are very strict about types. We have a couple of PHPCS
 rules that force us to add a type declaration (argument and return) to
 ''every function'' and method. We automatically exclude from those rules
 functions attached to hooks for the simple reason that hooks are not type-
 safe.
 If we do something like this:

 {{{#!php
 <?php
 add_filter('hook', function (string $x): string { /*...*/ });
 }}}

 We can be sure that this will explode because the core itself of some
 other plugin will pass something that is not a string.

 If we analyze the code above, the piece of code that calls `apply_filters`
 has (in theory for now) the possibility to define the **type of the
 hook**.

 In fact, there's a single place where the result of `apply_filters` is
 consumed. That means **the code that calls `apply_filters` is the code
 that decides the type of the hook**, the various places where `add_filter`
 is called should stick with that type.

 Unfortunately, this does not happen right now: `apply_filters` declares a
 type, but code that uses `add_action` return whatever, and that is happily
 accepted by WordPress.

 I thought that this could be a chance to stop this.

 The way I imagined this was the same mechanism of a function having the
 so-called "type hints" for parameters. When that happens, whoever calls
 the function has to pass a type that is ''compatible'' with the type the
 function declared to accept.

 Similarly to that, whoever calls `apply_filters_typesafe` (let me stick
 with this naming for now) would be able to declare the **type of the
 hook**, and then callbacks added to it would be required to return
 compatible types.

 {{{#!php
 <?php
 /** @var string $my_string */
 $my_string = apply_filters_typesafe( 'hook', /* ... */ );
 }}}

 Doing so, I would finally be able to safely things like this:


 {{{#!php
 <?php
 add_filter('hook', function (string $x): string { /*...*/ });
 }}}

 this would be safe because core would refuse to pass to the next hooking
 callback something that does not respect the declared hook type. This
 would be **a huge** improvement to WordPress, besides the support of PHP
 8.


 == Determining the type

 Now that is (I hope) clear what for me is the purpose of an
 `apply_filters_typesafe`, that is having ''typed hooks'', we can surely
 proceed in saying that we could enforce the rule that a typed hook must
 have a **single type**.

 I'll come back to this later, but for now, let's assume we decide to stick
 with that.

 There would still be two problems:

 - it will not always be possible to infer the type from the given filter
 "subject"
 - it will not work in a multitude of existing WordPress hooks


 == The `callable` issue

 You said:

 > I can't think of a single reason for a callable being filterable.
 > To be honest, that just sounds like a security issue

 The security issue argument is not existent. A callable to be executed
 must be loaded, that is the file that declares it must be loaded by
 WordPress. And if a file is loaded, it could do anything, including wiping
 the entire database, insert new super-admin users, and so on. The fact
 that it could also filter a callback is not the problem there. If
 unchecked PHP files can be included in the application that is a huge
 security issue, but that's a different story and has nothing to do with
 the fact that a callable is filterable.
 WordPress even allows in several places to re-write declared functions
 (drop-ins, child-themes), there's no security issue there.

 Regarding "a single reason for a callable being filterable", I don't have
 to think. I can show you some examples in the WP core:
 https://developer.wordpress.org/?s=_handler&post_type%5B%5D=wp-parser-hook

 Any of those "handler" hooks accepts a `callable`. The link above was the
 one I find more effective to show a list of functions that expect
 callbacks, but I can tell you that "filterable callback" is a pattern that
 I myself use very often, and because I often review my colleague's code, I
 can assure you I'm not the only one. Moreover, I'm very sure that
 extremely popular plugins do it here and there.

 Now, considering that WordPress passes a string to those functions if WP
 would auto-determine the type that would be `string`. Which of course is
 not valid because an anonymous function or an object method that is
 correct would be refused and a string that is not callable would be
 accepted.


 == Objects

 How do we determine the "single type" for objects? You said to stick with
 the `object` type.

 How this is useful?

 When using object-oriented programming the piece of code that declares the
 type of the hook (that is the piece of code that calls
 `apply_filters_single_type`) can decide that anything implementing an
 interface will be fine. For example:

 {{{#!php
 <?php
 /** @var Iterator $my_iterator */
 $my_iterator = apply_filters_single_type( /* ... */ );

 while( $my_iterator->valid() ) {
   $my_iterator->next();
 }
 }}}

 If I pass an `ArrayIterator` the code that tries to determine the "single
 accepted type" will have a few choices:

 - accept with `object`: this will obviously break the snippet above
 because the hook type is not respected. The problem this issue tries to
 solve is not solved. And in any case, the "typed hook" is not typed
 anymore: if I add a filter using a function that declares `Iterator` as
 param type a fatal error will happen, and my plugin will be blamed even if
 it was another plugin to return an object that is not an iterator as
 desired.
 - accept only instances of `ArrayIterator`. Could work, but it is
 extremely limiting. As a dev who practices OOP in the everyday job for a
 few years, the most common reason why you pass an object to a filter is to
 allow consumers to ''replace'' that object with a different implementation
 of the same interface. In fact, if you want consumers to only interact
 with the object you use an action, not a filter. (What, for example,
 WordPress does with `pre_get_posts` passing a `WP_Query` instance).
 - accept ''any'' of the interfaces/parent classes of `ArrayIterator`.
 Again not useful at all. A class having only the `Serializable` interface
 would be accepted, but the "hook single type" here is `Iterator` not
 `Serializable`, and the code snippet above would end up in a fatal error.


 == Other types

 For other types, e.g. the auto-casting from `int` to `float` and vice-
 versa, that is something PHP happily does even in PHP 8
 (https://3v4l.org/alj77) and I really don't see any reason to be stricter
 than PHP, but if that lands in core... ''I'' would not complain.

 The only reason why in my proof-of-concept I accepted any `numeric` value
 when an `int` or a `float` was given to the filter was that this is how
 WordPress currently handles those values.

 When a post ID is passed around in hooks it might be very well a numeric
 string or an integer. So I thought that supporting `is_numeric` instead of
 `is_int` would have made the integration in existing WordPress hooks much
 easier.

 After all, if I know that the type is ''numeric'', I can safely do a
 casting downstream:

 {{{#!php
 <?php
 /** @var int $post_id */
 $post_id = (int)apply_filters_single_type( 'hook', /* ... */ );
 }}}

 Cast to int will never fail if I have the guarantee that the filter
 returns a value that passed the `is_numeric` check.

 Regarding functions hooking there, in PHP 8 they could even have type-
 declarations:


 {{{#!php
 <?php
 /** @var int $post_id */
 add_filter( 'hook', fn(int|string|float $id) => related_id( $id ) );
 }}}

 That said, once again, if the desired direction is to be very strict about
 this ''I'' will not complain but suspect that this will reduce the
 possibility of this ticket to be merged.


 == About `null`

 I'll directly start with an example from core:
 https://developer.wordpress.org/reference/hooks/posts_pre_query/

 That filter passes `null` because it allows overriding the results of the
 query before the query runs and `null` which is passed as the default
 value is a signal that it has not being overridden. The reason is simple:
 if a filter returns an empty array what it means is that the query is
 forced to have no results at all. So `null` allow distinguishing "not
 filtered" from "no results".

 This kind of "pre" filtering is very common in WordPress:
 https://developer.wordpress.org/?s=pre_&post_type%5B%5D=wp-parser-hook

 Sometimes `false` is used as the discriminant in place of `null` but the
 idea is the same. How are we going to handle this very common case
 sticking with "the single type principle"?

 If we think in terms of "typed hook" it makes total sense that WordPress
 declares a hook to have the type `array|null`. This is something that for
 PHP supports since PHP 5.1 (https://3v4l.org/fVVnX).

 If we stick with the "single type" approach, there will be no possibility
 to leverage the new `apply_filters_single_type` for such cases, which
 means that WP would stick with the `apply_filters`: as I said above when
 the type system is not flexible the only possibility is having not types.
 That means that all the filters like `posts_pre_query` would be
 "condemned" to be untyped and receive something that is nor null nor array
 and break WP with a fatal error.


 == Configurable type

 All the examples above try to prove that having sensitive defaults to
 automatically infer the hook type is very fine, but there should be a way
 to support the ''explicit declaration'' of the hook type.

 Let's call it `$type_spec` as @jrf proposed so we don't have yet to define
 the form it takes.

 How to do that? Because defaulting to the type of the filter ''subject''
 is a sensitive choice it would make totally sense to have this as the last
 parameter of the functions, making it possible to have a default value.

 Unfortunately, `apply_filters` is a variadic function so if we want to
 have a signature that resembles that function it is not possible to
 ''append'' a new parameter. If we agree that having a different approach
 for the two functions discussed in this ticket is not desirable, I see two
 possibilities:

 - Prepend the argument, having something like
 `apply_filters_single_type($tag, $type_spec, $subject, ...$args)` and
 `apply_filters_ref_array_single_type($tag, $type_spec, $args)`
 - Only have the `ref_array` variant:
 `apply_filters_ref_array_single_type($tag, $args, $type_spec)`. Which at
 that point could be renamed `apply_filters_typed($tag, $args,
 $type_spec)`. That means that if one wants a typed filter they has to use
 this function, which is not anymore a ''variant'' of the existing
 `apply_filters` / `apply_filters_ref_array` but it is a new function whose
 happen to accept in its simplest form the same two parameters of
 `apply_filters_ref_array`: the hook name and an array of args (using the
 default for `$type_spec`). Considering that **any** occurrence of
 `apply_filters` could be replaced with this function, and considering that
 any code doing `add_filter` could be left unchanged, I don't see this as
 an issue, actually, it would reduce the additional API introduced which
 can only be good.

 The examples about `null` above IMO proved there's the necessity to, at
 the very least, allow for nullable types to make this ''typed hook''
 function anywhere useful.

 If we assume there's a single type, and so `$type_spec` can be a string,
 we could still support nullable types. We don't have to think up a syntax,
 PHP did that for us:

 {{{#!php
 <?php
 $this->posts = apply_filters_typed( 'posts_pre_query', array( null, $this
 ), 'array|null' );
 }}}

 or an example with a filter currently using `apply_filters` instead of
 `apply_filters_ref_array`:

 {{{#!php
 <?php
 // was: $check = apply_filters( 'pre_delete_post', null, $post,
 $force_delete );
 // see: https://github.com/WordPress/WordPress/blob/master/wp-
 includes/post.php#L3013

 $check = apply_filters_typed( 'pre_delete_post', array( null, $post,
 $force_delete ), 'bool|null' );
 }}}

 This would be IMO a very effective solution:

 - Reduced new API: one functions instead of two
 - Sensitive defaults: the third parameter could default to `null`, and
 when that's the case, the accepted type would be determined by WordPress.

 Regarding ''how'' to calculate the default, e.g. if passing an `int`
 strictly accept `int` only, or `int|float` or `numeric` that is an
 implementation detail and, once again, I would be very fine with any of
 the choices but I ''suspect'' that the more permissive choice is what will
 make this more easily integrated into the core codebase.

 == Single type?

 The question now is: if we allow nullable types which are a form of union
 types, why don't accept the ''full'' union types specification,
 considering that WordPress has **many** places where the filtered types
 can assume several types and also considering that this ticket happens in
 the context of support for PHP 8, a version that introduces union types?

 In other words, if we have an implementation that allows for:

 {{{#!php
 <?php
 apply_filters_typed( 'hook', array( null, $value ), 'bool|null' );
 }}}

 there would be close to zero complications to also making it allow for:


 {{{#!php
 <?php
 apply_filters_typed( 'hook', array( null, $value ), 'bool|int' );
 }}}

 but that would provide **huge** benefits to plugin developers and at the
 same time will, once again, ease the process of using this kind approach
 in the core, when there are filters whose type is
 `int|string|\WP_Post|null` and moving away from that would require an
 immense refactoring that will never happen (at least not in the next
 decade).

 Regarding using the string form like `bool|int` or the array form
 `array('bool', 'int')` is an implementation detail that I think is not
 really relevant now.

 Permit me to say that the "single type principle" is a self-imposed
 principle that not only ignores the current state of WP code, but also the
 evolution of PHP that in recent versions has put a lot of effort in moving
 away from the "single type" approach, introducing pseudo-types like
 `callable` (PHP 5.4) and `iterable` (PHP 7.1), covariant and contravariant
 type declarations (PHP 7.4), and union types (PHP 8.0).
 All these things were done for the only reason that "single type" is just
 not flexible enough and that developers to overcome this lack of
 flexibility were forced to have no-type. If that teaches us anything, that
 is that the only consequence of having a not flexible type system is
 having a no-type system.

 == About pseudo-types

 I personally see a lot of value in having a "type spec" that allows for
 things like `numeric`, that is types that are not valid PHP type
 declarations but have a core way to be checked (`is_numeric`), but if the
 decision is made to don't support them, no strong feelings from ''me''.

 Only consider that allowing for `numeric` will provide the possibility to
 be much more tolerant with existing code. I'm very sure out there, besides
 core, exist thousands of plugins that return a numeric string where a post
 ID is expected. Making "typed hook function(s)" very strict about the type
 will be such a huge breaking change that I hardly see it even
 ''thinkable'' in WordPress, so either "typed hook function(s)" would be
 used much less in the core (and by plugins), or they would not see the
 light at all.

 That said, considering that to make "typed hooks" anywhere useful the
 support for pseudo-type like `callable` is kind-of mandatory, I don't
 really see the reason why a pseudo-type like `iterable`, that is also a
 valid type declaration in PHP, should not be allowed.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/51525#comment:8>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list