[wp-trac] [WordPress Trac] #16330: media_sideload_image() broken with URLs containing spaces

WordPress Trac wp-trac at lists.automattic.com
Fri Jan 21 13:56:19 UTC 2011


#16330: media_sideload_image() broken with URLs containing spaces
--------------------------+-----------------------------
 Reporter:  Coolkevman    |      Owner:
     Type:  defect (bug)  |     Status:  new
 Priority:  normal        |  Milestone:  Awaiting Review
Component:  HTTP          |    Version:  3.1
 Severity:  normal        |   Keywords:
--------------------------+-----------------------------
 I'm using the {{{media_sideload_image()}}} function in the upcoming version
 of my [http://wordpress.org/extend/plugins/e107-importer/ e107 Importer
 script] (see:
 http://github.com/kdeldycke/e107-importer/blob/b7925fdac6aa43db4be5b7925265a83d95fc62ad/e107-importer.php#L277
 ) to upload remote images into WordPress.

 This function works as expected with many images from many different
 sources, but fails on URLs containing spaces.

 Let me illustrate this bug with an example.

 When trying to upload the image located at
 {{{
 http://home.nordnet.fr/francois.jankowski/pochette avant thumb.jpg
 }}}
 the result looks like this on the file system: http://twitpic.com/3s0dk7 .
 As you can see, the image file names are clean. But in the Media Manager,
 here is what you get: http://twitpic.com/3s0e5d . No thumbnail seems to
 have been created.

 Now, trying to fix this, I modified the original URL before calling
 {{{media_sideload_image()}}} with the following code:
 {{{
   $img_url = str_replace(' ', '%20', html_entity_decode($img_url));
   $new_tag = media_sideload_image($img_url, $post_id);
 }}}
 With this patch, here is the result on the filesystem:
 http://twitpic.com/3s0ets . I was surprised that WordPress does not
 sanitize URLs. Is that normal?
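
 The workaround above only deals with spaces. A minimal sketch of a more
 general pre-encoding step follows, assuming the goal is simply to
 percent-encode each path segment before the URL is handed to
 {{{media_sideload_image()}}}. The helper name {{{e107_encode_url_path()}}}
 is hypothetical (it is not a WordPress function), and only the string
 handling is shown; it has not been tested against the sideloading code
 itself.
 {{{
   // Hypothetical helper, not part of WordPress: percent-encode every
   // path segment of a URL and leave scheme, host, port and query as-is.
   // Simplification: user info and fragment are dropped on rebuild.
   function e107_encode_url_path($url) {
     $parts = parse_url($url);
     if ($parts === false || empty($parts['path'])) {
       return $url; // Nothing to encode, or URL too malformed to parse.
     }
     // Decode first so already-encoded segments are not encoded twice.
     $segments = explode('/', $parts['path']);
     $segments = array_map('rawurldecode', $segments);
     $segments = array_map('rawurlencode', $segments);
     $path   = implode('/', $segments);
     $scheme = isset($parts['scheme']) ? $parts['scheme'] . '://' : '';
     $host   = isset($parts['host'])   ? $parts['host']           : '';
     $port   = isset($parts['port'])   ? ':' . $parts['port']     : '';
     $query  = isset($parts['query'])  ? '?' . $parts['query']    : '';
     return $scheme . $host . $port . $path . $query;
   }

   // Assumed usage, mirroring the two lines above:
   // $img_url = e107_encode_url_path(html_entity_decode($img_url));
   // $new_tag = media_sideload_image($img_url, $post_id);
 }}}
 Encoding segment by segment keeps the {{{/}}} separators intact, which a
 plain {{{rawurlencode()}}} on the whole URL would not.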

 But the most surprising part is in the Media Manager:
 http://twitpic.com/3s0hup . It looks like, thanks to this hack, WordPress
 somehow succeeded in downloading the remote file but messed up the
 filesystem naming. What makes me think so? The Media Manager gets the
 right thumbnail dimensions but not the binary payload of the thumbnail
 (unlike the case above, where neither the binary nor the dimensions of
 the thumbnail are available).

 All of this was tested in WordPress 3.1-RC2.

 As for the idea of the patch above, it comes from a very old version of my
 plugin (v0.9) that was based on WordPress 2.3.2. There, I somehow found
 the root cause of the problem,
 [http://github.com/kdeldycke/e107-importer/blob/e107-importer-0.9/e107.php#L410
 according to the comment I wrote 3 years ago]:
 {{{
  // The fopen() function in wp_remote_fopen() don't like URLs with space
 chars not translated to html entities
 }}}

 I should have posted this bug report sooner, as now I've forgotten
 everything about this issue... :(

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/16330>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

