[wp-trac] [WordPress Trac] #51372: Race condition causes an autoload option to leak outside of alloptions
WordPress Trac
noreply at wordpress.org
Tue Sep 22 07:55:45 UTC 2020
#51372: Race condition causes an autoload option to leak outside of alloptions
--------------------------+-----------------------------
Reporter: kovshenin | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Cache API | Version: 2.2
Severity: normal | Keywords: dev-feedback
Focuses: |
--------------------------+-----------------------------
There is a problem with `add_option()` which causes an autoloaded item to
leak outside of the `alloptions` cache key and into its own item under the
`options` group.
This becomes pretty broken with persistent object caching, because it
leads to a state, where the option is stuck under its own cache key in the
`options` group, and is unable to be deleted or updated, while the
underlying database value is completely gone.
I've been able to reproduce this in multiple ways, with persistent object
caching turned on or off. The backend I used in my testing is
[https://github.com/Automattic/wp-memcached wp-memcached], but it should
be reproducable with any other backend.
The first and easiest way is to just add an `error_log()` to wp-
includes/options.php in `get_option()` right before `wp_cache_add()` that
writes to the `options` group:
{{{
function get_option() {
...
if ( is_object( $row ) ) {
...
if ( $option == 'foo' ) { error_log( 'leaked' ); }
wp_cache_add( $option, $value, 'options' );
}}}
Now in a simple plugin we can start writing the `foo` option, like this:
{{{
add_action( 'init', function() {
delete_option( 'foo' );
add_option( 'foo', 'bar' );
die();
} );
}}}
The `autoload` flag defaults to `true`, so in this scenario the `foo` item
should always end up in the `alloptions` cache key. However, if you run
this in multiple parallel threads (with `ab` for example), you'll see the
`leaked` message in your error log, which means `foo` has been written to
the `foo` key under the `options` group.
{{{
ab -c 100 -n 1000 http://localhost/
}}}
Another way to reproduce this is to turn on a persistent object caching
plugin, visit the site once to trigger a single `add_option()`, then check
some of site options and cache keys:
{{{
wp option get foo # says bar
wp cache get foo options # error, because foo is autoloaded and stored in
alloptions
wp cache get alloptions options # big array with foo => bar at the end
}}}
You can also confirm the database value is there:
In MySQL:
{{{
SELECT * FROM wp_options WHERE option_name = 'foo';
+-----------+-------------+--------------+----------+
| option_id | option_name | option_value | autoload |
+-----------+-------------+--------------+----------+
| 2403 | foo | bar | yes |
+-----------+-------------+--------------+----------+
}}}
Looks good so far.
Now run the same `ab` test with concurrent requests, and check again.
{{{
wp option get foo # says bar
wp cache get foo options # says bar, because the value leaked from
alloptions into its own item
wp cache get alloptions options # big array, but NO foo => bar
}}}
And finally, in MySQL shell:
{{{
SELECT * FROM wp_options WHERE option_name = 'foo';
Empty set (0.00 sec)
}}}
At this point the value is gone, and only remains as a stale item in
Memcached under the wrong key. Deleting, adding, or updating the item will
not work:
{{{
$ wp option delete foo
Warning: Could not delete 'foo' option. Does it exist?
$ wp option add foo bar
Error: Could not add option 'foo'. Does it already exist?
$ wp option update foo baz
Error: Could not update option 'foo'.
}}}
Flushing Memcached (or having the key evicted) will "fix" the stalemate,
but will cause the data to be lost forever.
Here's a brief overview of what could happen inside `add_option()` in two
concurrent threads:
{{{
Thread 1: add_option( 'foo', 'bar' );
Thread 2: add_option( 'foo', 'bar' );
1: .. get_option( 'foo' ) // false
1: .. INSERT INTO // true
2: .. get_option( 'foo' )
2: .. .. isset( $alloptions[ 'foo' ] ) // false
2: .. .. ->get_row()
2: .. .. wp_cache_add( 'foo', 'bar', 'options' ); // LEAKED
1: $alloptions[] = ...
1: wp_cache_set( 'alloptions', $alloptions, 'options' );
}}}
This is then followed by the next `delete_option()` call, which
successfully deletes the data from the database and the `alloptions` key,
so then future calls to `add_option()` will fail, because `get_option()`
will always return the data from cache.
I'm not sure about the best way to fix it. Here are a few thoughts:
* Maybe `add_option()` should not rely so heavily on `get_option()`, and
do its own checks with cache functions, depending on the `autoload`
function argument
* `delete_option()` could clear both the `alloptions` array item as well
as the `$option` key in the `options` group
* In `get_option()` if we were unable to retrieve the data from
`$alloptions` and option cache, maybe query the `autoload` column together
with `option_value`, before just assuming it's a no:
{{{
$row = $wpdb->get_row( $wpdb->prepare( "SELECT option_value, autoload ...
}}}
Then handle the cache addition differently, based on that autoload flag.
This sounds the most reasonable to me, because this is exactly the place
where the item is being put into the wrong key, which causes the other
problems.
I haven't actually tested this with 2.2, but it seems like that's where
the `alloptions` vs `options` cache keys appeared inside `get_option()`
right around r4855.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/51372>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list