[wp-trac] [WordPress Trac] #22325: Abstract GPCS away from the superglobals

WordPress Trac noreply at wordpress.org
Sun Sep 29 06:57:15 UTC 2024


#22325: Abstract GPCS away from the superglobals
----------------------------------------------------+---------------------
 Reporter:  rmccue                                  |       Owner:  (none)
     Type:  enhancement                             |      Status:  new
 Priority:  normal                                  |   Milestone:
Component:  Bootstrap/Load                          |     Version:
 Severity:  normal                                  |  Resolution:
 Keywords:  has-patch 2nd-opinion needs-unit-tests  |     Focuses:
----------------------------------------------------+---------------------

Comment (by dmsnell):

 [https://www.php.net/releases/5_4_0.php Magic quotes were removed] in
 March 2012 with PHP 5.4. Thought this might be relevant here; no supported
 version of WordPress supports running on PHP with magic quotes.

 Perhaps some of the original motivation for this discussion is gone, but I
 think there's still value in handling `$_GET` and `$_POST` more
 explicitly.  I've grown skeptical of what happens automatically and what
 //doesn't// happen automatically.

 I'm in favor of encapsulating access to these variables in a way that runs
 in parallel to existing code (so as not to break it) but presents a more
 explicit and defined interface. Some things I find surprising:

  - dots and spaces in the query args are transformed into underscores
  - duplicate params not using PHP's array syntax overwrite previous copies
 of the same name
  - array-named args explicitly create array structure, but…
  - duplicates of some subset of array names overwrite previous nested
 parameters
  - names and values are accepted as byte streams, not as encoded text
  - GET values are submitted as percent-encoded bytes (which are not
 guaranteed to be UTF-8), but…
  - POST values may be transformed to HTML character references, e.g. when
 a browser is set to `latin1`

 These are also the kind of "gotchas" that I tend to see people struggle
 with, because the basic mental model they form while getting started
 paints an inaccurate picture; the reality is much more complicated. And
 this comes only from an examination of legitimate uses, ignoring
 malicious attacks.
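
 As a rough illustration of the first few items, `parse_str()` applies the
 same name handling that PHP uses to populate `$_GET`. The query strings
 below are made up for the example.

 {{{#!php
 <?php
 // Dots and spaces (sent here as "+") in names become underscores.
 parse_str( 'first.name=a&last+name=b', $mangled );
 // $mangled is array( 'first_name' => 'a', 'last_name' => 'b' )

 // A repeated plain name keeps only the last value.
 parse_str( 'q=one&q=two', $duplicates );
 // $duplicates is array( 'q' => 'two' )

 // Array syntax collects every value.
 parse_str( 'q[]=one&q[]=two', $list );
 // $list is array( 'q' => array( 'one', 'two' ) )

 // A later, shallower name replaces the nested structure built earlier.
 parse_str( 'a[b][c]=1&a[b]=2', $nested );
 // $nested is array( 'a' => array( 'b' => '2' ) )
 }}}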

 Although a bit less magical, I find a number of other standard approaches
 to query args and `POST` values simpler to reason about and to teach.

 {{{#!php
 <?php
 // ?q=one&q=two
 'one'                 === wp_get( 'q' );
 array( 'one', 'two' ) === wp_get_all( 'q' );
 null                  === wp_get( 'r' );

 // ?q=😄&r=&#x1f604;&s=%F0%9F%98%84
 '😄' === wp_get( 'q' );
 '😄' === wp_get( 'r' );
 '😄' === wp_get( 's' );
 }}}
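
 A minimal sketch of how helpers like these could work, assuming they parse
 the raw query string themselves rather than reading `$_GET`. The
 `sketch_get()` and `sketch_get_all()` names and signatures are
 hypothetical, and this sketch does not decode HTML character references
 as in the `r` example above.

 {{{#!php
 <?php
 // Hypothetical helper: every value for $name, in the order it was sent.
 function sketch_get_all( string $query_string, string $name ): array {
 	$values = array();
 	foreach ( explode( '&', $query_string ) as $pair ) {
 		if ( '' === $pair ) {
 			continue;
 		}
 		$parts = explode( '=', $pair, 2 );
 		// urldecode() handles percent-encoding and turns "+" into a space.
 		// Names are compared as sent: no underscore mangling, no arrays.
 		if ( urldecode( $parts[0] ) === $name ) {
 			$values[] = urldecode( $parts[1] ?? '' );
 		}
 	}
 	return $values;
 }

 // Hypothetical helper: the first value for $name, or null when absent.
 function sketch_get( string $query_string, string $name ): ?string {
 	return sketch_get_all( $query_string, $name )[0] ?? null;
 }

 $first   = sketch_get( 'q=one&q=two', 'q' );
 $all     = sketch_get_all( 'q=one&q=two', 'q' );
 $missing = sketch_get( 'q=one&q=two', 'r' );
 $dotted  = sketch_get( 'first.name=a', 'first.name' );
 }}}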

  - Providing a default value in the absence of the query arg seems very
 reasonable. Of course we can use `??` now, so it's less of a big deal, and
 there's no way a query param can be set to `null`, only `"null"`, which
 is different.
  - I'm having trouble understanding the stop values, or the values to
 compare against for detecting the presence of a query arg. The `get()`
 function can serve the purpose of `has()` because it can return `null`
 when the arg is missing.
  - We might want to consider rejecting values that are invalid UTF-8. It's
 quite possible that non-UTF-8 data comes in anyway, because text in many
 non-UTF-8 encodings can also form valid UTF-8 byte streams. So we can't
 ensure the right decoding, but we can reject //some// invalid ones.
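
 One way to sketch that rejection, assuming the `mbstring` extension is
 available; `sketch_utf8_or_null()` is a hypothetical name.

 {{{#!php
 <?php
 // Hypothetical helper: return the value only if its bytes are valid UTF-8.
 function sketch_utf8_or_null( string $bytes ): ?string {
 	return mb_check_encoding( $bytes, 'UTF-8' ) ? $bytes : null;
 }

 // "é" as a lone latin1 byte is not valid UTF-8, so it is rejected.
 $rejected = sketch_utf8_or_null( "caf\xE9" );

 // "é" encoded as UTF-8 passes. The same two bytes are also valid latin1
 // text ("Ã©"), so passing the check does not prove the intended encoding.
 $accepted = sketch_utf8_or_null( "caf\xC3\xA9" );
 }}}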

 It's late and I'm tired so I'm stopping for now, but I'd like to add some
 more illustrative examples. For one, names can be weird and wild. For two,
 PHP's native system for these values is wild: it makes so many params
 unavailable, and makes it really hard for developers to get the right
 values when they want to.

 As a reminder, **always send `accept-charset=utf8`** on your `<form>`
 elements! Even if a page has `<meta charset=utf8>`, the browser will still
 send other encodings in the POST body, including HTML character
 references, if the browser is set to an encoding override.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/22325#comment:52>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform