[wp-trac] [WordPress Trac] #30099: fetch_feed does not properly parse URLs with query strings

WordPress Trac noreply at wordpress.org
Fri Oct 24 21:54:47 UTC 2014


#30099: fetch_feed does not properly parse URLs with query strings
---------------------------+-----------------------------
 Reporter:  leereamsnyder  |      Owner:
     Type:  defect (bug)   |     Status:  new
 Priority:  normal         |  Milestone:  Awaiting Review
Component:  General        |    Version:  4.0
 Severity:  normal         |   Keywords:
  Focuses:                 |
---------------------------+-----------------------------
 Using fetch_feed with this URL:


 {{{
 http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?&contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
 }}}

 the function SimplePie_IRI::parse_iri parses it into this (output of
 print_r($match)):


 {{{
 Array
 (
     [0] => http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
     [1] => http:
     [scheme] => http
     [2] => http
     [3] => //www.ibm.com
     [authority] => www.ibm.com
     [4] => www.ibm.com
     [path] => /developerworks/views/global/rss/libraryview.jsp
     [5] => /developerworks/views/global/rss/libraryview.jsp
     [6] => ?contentarea_by=global&
     [query] => contentarea_by=global&
     [7] => contentarea_by=global&
     [8] => #038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
     [fragment] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
     [9] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
 )
 }}}

 $match[query] stops after the first parameter, and the rest of the query
 string incorrectly ends up in $match[fragment].

 Weirder: the first ampersand (but only the first) appears to be
 HTML-encoded to '&#038;'. That entity contains a '#', which is read as
 the start of the fragment, which is why the fragment starts with '038;'.
 I can't for the life of me explain where in parse_iri it's being escaped.
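
 The split can be reproduced outside WordPress. What follows is a minimal
 sketch in Python, not SimplePie's code: it uses the generic URI regex from
 RFC 3986, Appendix B, on the assumption that parse_iri splits components
 in a similar way. The URL is a shortened form of the one above, and the
 names split_uri, raw and escaped are made up for this illustration.

```python
import re

# Generic URI-splitting regex from RFC 3986, Appendix B, with named groups.
# Assumption: SimplePie's parse_iri separates components along the same lines.
URI_RE = re.compile(
    r"^(?:(?P<scheme>[^:/?#]+):)?"
    r"(?://(?P<authority>[^/?#]*))?"
    r"(?P<path>[^?#]*)"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$"
)


def split_uri(uri):
    """Return the URI components as a dict (hypothetical helper)."""
    return URI_RE.match(uri).groupdict()


# Shortened version of the feed URL from this ticket.
raw = (
    "http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp"
    "?contentarea_by=global&topic_by=BlueMix&sort_by=Date"
)

# Simulate the first ampersand (and only the first) being HTML-encoded.
escaped = raw.replace("&", "&#038;", 1)

print(split_uri(raw)["query"])         # contentarea_by=global&topic_by=BlueMix&sort_by=Date
print(split_uri(escaped)["query"])     # contentarea_by=global&
print(split_uri(escaped)["fragment"])  # 038;topic_by=BlueMix&sort_by=Date
```

 If this matches what parse_iri does, the parser itself is behaving as
 specified, and the URL is probably already escaped by the time it gets
 there.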

--
Ticket URL: <https://core.trac.wordpress.org/ticket/30099>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

