[wp-trac] [WordPress Trac] #40825: Re-addressing validation/sanitization of IDs to allow filtering before WP_Post (and others) database query

WordPress Trac noreply at wordpress.org
Sat May 20 00:00:06 UTC 2017


#40825: Re-addressing validation/sanitization of IDs to allow filtering before
WP_Post (and others) database query
-----------------------------------+-----------------------------
 Reporter:  LindsayBSC             |      Owner:
     Type:  enhancement            |     Status:  new
 Priority:  normal                 |  Milestone:  Awaiting Review
Component:  Posts, Post Types      |    Version:  4.7.5
 Severity:  normal                 |   Keywords:
  Focuses:  template, performance  |
-----------------------------------+-----------------------------
 The following ticket spawned from a desire to import content from outside
 of the WordPress database but have it treated as a native content type. A
 concept that has a clear audience that desires this functionality (see
 ticket: #12955) and has been addressed in a number of different ways. I
 believe the least impactful way to address this ''without'' removing the
 'final' keyword from the WP_Post class that  also improves standards for
 validation and sanitization of the ID value typically passed to
 get_posts() is to use the ID as a sort-of 'decorator' unto itself.

 When merging the content from multiple sources to be displayed in a theme
 the biggest conflict to arise is duplication of IDs. Since the remote
 source is ignorant to the ID numbers already in use in the wp_posts table,
 a requirement for a "decorated" ID determined.

 WordPress currently does not have a standard method for validating the
 format of the variable that will ultimately be passed to get_post() to
 create a new WP_Post object. The following methods are implemented in core
 files to attempt to sanitize '''''or''''' validate the value passed as an
 ID ^(*see links for pro/cons of usage in WP)^:
      *
 [https://gist.github.com/LinzardMac/b27e738aee52cb3e45c1909fb555cec5 (int)
 Typecast / intval()]
      *
 [https://gist.github.com/LinzardMac/0d1915dfe78fc68f0b3c64d50cf2cb41
 is_numeric()]
      *
 [https://gist.github.com/LinzardMac/31ed99f8faa34ffad6e666f213e99870
 absint()]

 == '''My Suggestion''' ==

 I recommend creating a new function that will standardize the validation
 and sanitization of ID numbers that are being passed to a database query.
 All instances of is_numeric, (int), intval(), and absint() that are used
 as a way to validate or sanitize (or both validate and sanitize) an ID
 number that is passed to a query from an external function should be
 replaces with a new function that will serve both purposes. The new
 function will return a falsey response OR throw an exception when
 validation fails or if validation passes, will sanitize the value to a
 format compatible with the typical MYSQL type for the ID column (bigint).

 Inside of this new function we can include a filter that will allow
 developers to override certain restrictions, specifically for allowing
 external content to be treated as a WP_Post object or some other native
 content that commonly would exist in WordPress’ database.

 Since WP_Post will always look for a cached version of the object before
 querying the database, we make sure to store all necessary values in the
 cache before the template is loaded after we run our remote_get. We
 utilize the concatenated ID which is formatted like 12345-REMOTE as the ID
 in the cache so as to avoid conflicts w/ existing post IDs that are also
 stored in the memory cache. The only hurdle to this was the fact that core
 files were forcibly casting IDs as integers long before a query of any
 sort were to be made.

 The argument for sanitizing early was to catch malformations early, but
 all it seemed to do was force the type early and never truly "caught" a
 bad value passed as an ID. A true “early catch” would either sanitize
 early w/ a falsey response or Exception and/or find the cached version as
 early as the sanitization so as to avoid the rest of the process of
 getting the WP_Post instance anyways.

 Available in the following gist is my suggestion for a better validation
 function I called ```is_valid_id()```, an example of how filters can be
 used on this sanitization function to allow external content to be treated
 as if it was a WP_Post object, and it's usage within a core file that
 previously used one of the subpar validation functions ( in this case
 meta.php using is_numeric() )

 *comments in the file are just opinions and alternative thoughts I had
 while crafting this

 https://gist.github.com/LinzardMac/38bbe22feb0b0a3fbabfcf64d797cd80

 !** It could be worthy of note that I have been using some version of this
 code in a live production site for the last 4 months without any changes
 needed to plugins or template files to account for this "non native"
 content.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/40825>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list