[wp-trac] [WordPress Trac] #21767: Remove stripslashes from API functions
WordPress Trac
wp-trac at lists.automattic.com
Sun Sep 2 03:14:57 UTC 2012
#21767: Remove stripslashes from API functions
--------------------------+-------------------------------------
Reporter: alexkingorg | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version: trunk
Severity: normal | Keywords: has-patch needs-testing
--------------------------+-------------------------------------
This ticket attempts to bring consistency to the way slashes are handled
throughout the WordPress API functions that persist data.
Currently these API functions expect slashed data and strip slashes within
their code. Most of the code that invokes these functions understands this
and generally passes slashed data to them (sometimes calling
`add_magic_quotes()` when necessary). Most of the code understands this,
but not all. While auditing the code for the purposes of creating this
patch I found a variety of places that do not slash data before passing it
to the model functions as it currently should be done. Further, most
developers do not think to manually add slashes to the data they
programatically create and pass to these functions. Finally, it is not
documented in the Codex that all data passed to these functions should be
slashed before being passed in.
As such, we can generally consider such behavior a bug. There are a few
potential caveats, which I will cover shortly.
Some of the functions that are being affected here:
* wp_insert_post()
* wp_update_post()
* add_post_meta() [and other meta functions that call add_metadata()]
* update_post_meta() [and other meta functions that call
update_metadata()]
* wp_insert_attachment()
* wp_insert_term()
* wp_update_term()
* up_insert_user()
* wp_update_user()
* wp_new_comment()
* etc...
Point being, while it's cool that this stuff can work:
{{{
update_post_meta($post->ID, 'key', $_POST['key']);
}}}
It's not worth the cost of losing data when someone quite reasonably does
something like this:
{{{
$post_data = array(
'post_status' => 'publish',
'post_type' => 'post',
'post_title' => 'This is my title',
'post_content' => 'Crazy with the backslashes \\\\\\',
);
wp_insert_post($post_data);
}}}
With the current code they would instead need to do this (to not lose
data):
{{{
wp_insert_post( add_magic_quotes($post_data) );
}}}
Developers just don't think to do this, and they shouldn't have to.
Here's the real life example that got me diving into this:
{{{
$my_api_data = get_twitter_data();
update_post_meta($post->ID, '_my_meta_key', $my_api_data);
}}}
The data I was saving was in JSON format. I was putting it in like this:
{{{
{"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/17084282\/alex_king_normal.jpg"}
}}}
and getting it back liks this:
{{{
{"profile_image_url":"http://a0.twimg.com/profile_images/17084282/alex_king_normal.jpg"}
}}}
As you can imagine, `json_decode()` fails on this data.
The functions that do have examples around of passing $_POST data directly
to it are the post meta functions. I think changing them might break a
fair bit of custom code that is passing in slashed data. I've created
`wp_update_post_meta()` and `wp_add_post_meta()` accordingly that do not
expect slashed data and replaced all instances of `update_post_meta()` and
`add_post_meta()` with these calls accordingly.
It's possible that we may need to do a similar treatment of the other meta
functions (user, comment, etc.). I haven't put anything in place for this
in the current patch.
One additional change that was needed was to the default kses filters on a
few properties. wp-includes/default-filters.php
{{{
// Strip, trim, kses, special chars for string saves
foreach ( array( 'pre_term_name', 'pre_comment_author_name',
'pre_link_name', 'pre_link_target', 'pre_link_rel',
'pre_user_display_name', 'pre_user_first_name', 'pre_user_last_name',
'pre_user_nickname' ) as $filter ) {
add_filter( $filter, 'sanitize_text_field' );
add_filter( $filter, 'wp_filter_kses' );
add_filter( $filter, '_wp_specialchars', 30 );
}
}}}
The `wp_filter_kses` filter has been replaced with `wp_kses_data` - thanks
to @markjaquith for this tip here. I also applied this change to the
filters initialized in wp-includes/kses.php:
{{{
function kses_init_filters() {
// Normal filtering
add_filter('title_save_pre', 'wp_kses_data');
// Comment filtering
if ( current_user_can( 'unfiltered_html' ) )
add_filter( 'pre_comment_content', 'wp_kses_data' );
else
add_filter( 'pre_comment_content', 'wp_kses_data' );
// Post filtering
add_filter('content_save_pre', 'wp_kses_data');
add_filter('excerpt_save_pre', 'wp_kses_data');
add_filter('content_filtered_save_pre', 'wp_kses_data');
}
}}}
As this has security implications, it could use extra review and testing.
I changed `wp_rel_nofollow()` to no longer strip slashes then add them
again (and it was adding them wrong, via a call to `esc_sql()`). Now you
can pass unslashed strings to `wp_rel_nofollow()` without losing data as
well.
I have not thoroughly tested XMLRPC calls yet, that is definitely a needed
next step. The `escape()` method within the XMLRPC code looks hinky to me,
likely some changes (as well as thorough testing )are needed there.
Obviously, this is a big change. I've done my best to test appropriately.
I've tested via the admin UI and with simple code examples (attached) and
so far things look good. Unit tests that exercise various slash conditions
across all objects would be another smart addition to the mix.
Related: #18322, #20569
Whew! I think that's it for now - when you find something that breaks,
please comment accordingly so it can be fixed/discussed. I expect there
are likely a few edge cases I missed, but it's definitely time to get more
eyeballs on this (as well as help testing).
--
Ticket URL: <http://core.trac.wordpress.org/ticket/21767>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software
More information about the wp-trac
mailing list