[wp-trac] [WordPress Trac] #46199: REST: Escape already decoded values when adding query args

WordPress Trac noreply at wordpress.org
Thu Feb 7 03:03:09 UTC 2019


#46199: REST: Escape already decoded values when adding query args
--------------------------+-------------------------------------
 Reporter:  dmsnell       |      Owner:  (none)
     Type:  defect (bug)  |     Status:  new
 Priority:  normal        |  Milestone:  Awaiting Review
Component:  REST API      |    Version:  trunk
 Severity:  normal        |   Keywords:  has-patch needs-testing
  Focuses:                |
--------------------------+-------------------------------------
 `add_query_args()` assumes that when given an array of args as the first
 parameter that they are already `urlencode`ed but in various places we are
 passing `$_GET` (indirectly) as that first parameter; unfortunately
 ([http://php.net/manual/en/reserved.variables.get.php per the PHP docs])
 "//The GET variables are passed through urldecode().//"

 On a site with over two hundred tags a search was made in the Gutenberg
 editor for tags matching `#` - the site used tags like `#cool-thing` and
 `#important` - as part of its autocompleting of tag names. This produced a
 URL with `&search=%23` in the query.

 Inside of `WP_REST_Terms_Controller::get_items()` we call `add_query_args(
 $request->get_query_params(), … )` which by that time passes on `$_GET`
 which contains `$_GET['search'] = '#'` because PHP decoded it.

 Although `add_query_args()` runs //existing// args from its URI through
 `urlencode_deep()` it doesn't modify the args passed into the function
 itself. This leaves us calling `build_query()` which calls
 `_http_build_query()` with the `$urlencode` parameter set to `false`, thus
 the //decoded// value gets directly `implode()`ed into the final query
 string and we're left with a raw `#` //before// any actual fragment part
 of the URL.

 This resulted in the response for the site's tag getting a next link whose
 query args are like this…
 `?per_page=100&orderby=count&order=desc&_fields=id%2Cname&search&page=2#&_envelope=1`
 …when they should be like this…
 `?per_page=100&orderby=count&order=desc&_fields=id%2Cname&search=%23&_envelope=1`

 We can note that it //looks// like we've done something funky moving
 around the value for the `search` arg but actually we truncated at
 `search=#` and then treated everything after it as if it were the
 fragment, finally adding it back in at the end of `add_query_args()`.

 In Gutenberg this resulted in a crash after typing `#` into the tag box
 because the API call came back with invalid data as the `Link` for the
 next page of tag results (and that link was automatically followed by
 Gutenberg's attempt to get all of the results in a paged set). Comically
 the valid data was there but because WordPress stuck `_envelope=1`
 //after// the `#` it was ignored by PHP as not being a `$_GET` arg.

 ----

 To fix this I propose wrapping `$request->get_query_args()` in
 `urldecode_deep()` because we should be able to reliably know that they
 are decoded and thus need re-encoding. When I scanned around I found that
 a few other REST controllers were affected by this same bug and so am
 including corrections there in this patch as well.

 We could approach this with other means and do more to detect if incoming
 args to `add_query_args()` were already encoded or decoded but I don't
 think the additional complexity is worth it.

 We could have always run the array args to `add_query_args()` through
 `urlencode_deep()` but I don't think we can safely assume that it's not
 being called with already-encoded args; I was able to find places where we
 //do already encode them// when we're passing variables that //aren't//
 `$_GET` so I think it's inconceivable to paint broadly here and encode
 everything.

 There may be more places affected by this.

 I searched the codex to see if I could find any ban on including `#` in
 tag names but couldn't find any. Further, I hypothesize that this will
 affect many different API calls which allow user-input search terms.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/46199>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list