[wp-trac] [WordPress Trac] #26050: Continual Admin Page POST (HeartBeats?) Can Cause SQL Connection Issues

Sun Feb 16 07:46:25 UTC 2014

#26050: Continual Admin Page POST (HeartBeats?) Can Cause SQL Connection Issues
-------------------------------------+------------------------------
 Reporter:  optimized-marketing.com  |       Owner:
     Type:  defect (bug)             |      Status:  new
 Priority:  normal                   |   Milestone:  Awaiting Review
Component:  Autosave                 |     Version:  3.7.1
 Severity:  normal                   |  Resolution:
 Keywords:                           |     Focuses:
-------------------------------------+------------------------------

Comment (by nacin):

 > Replying to [comment:11 lisota]:

 > We'll will recreate the load in a semi-controlled manner by starting
 with no post editing and then suddenly editing lots of post simultaneously
 so that we can demonstrate the effect that the post locking heartbeat is
 having on our CPU.

 Please don't waste too much time simulating this. I believe you that
 you're having problems, but a CPU graph alone won't help us. We need to
 know what specifically is causing the problem.

 Replying to [comment:10 heydonovan]:
 > Replying to [comment:8 azaozz]:
 >
 > That's exactly what we were thinking. Instead of writing to the
 database, check for persistent caching, and write the `_edit_lock`
 modifications straight to memcached.
 >
 > Replying to [comment:9 nacin]:
 >
 > We do know that when a ton of authors are editing a bunch of different
 posts, the load average on the server almost quadruples. I've enabled POST
 logging, and about 95% of the requests within that time period are from
 "wp-refresh-post-lock". Looking at some of our logs, I can see it was
 requested more than 5000+ times, so that would account for a lot of posts
 being edited at once. I'll see if I can dig up some more data.

 Each call has up to three SELECT queries and one UPDATE query for the
 request, as a result of get_post() and update_postmeta(). The SELECT
 queries (and the WHERE clause of the UPDATE) all hit indexes that are both
 well-established and small. The UPDATE query does not write to an indexed
 field and thus won't force an update of the index. These counts do not
 include the loading of WordPress, which depending on the environment,
 setup, plugins, etc., could range anywhere from dirt cheap to absurdly
 expensive.

 At the moment, I am more inclined to think this problem is peculiar to the
 site or environment, and not to the load. I have no qualms with being
 proven wrong, but I haven't heard a complaint from other hosting companies
 and this is the first inkling I've heard from WPEngine. 5000 times a day
 is still only one request every 17 seconds and just 5000 UPDATE queries,
 which even a crappy shared hosting box would absorb without blinking. I
 ran these queries on a site with a few tens of thousands of postmeta
 records and the indexes laughed at me. How fast are these queries running
 for you? How many rows are in the wp_postmeta table? I don't have access
 to my massive test database until I'm home later this week, but I don't
 expect it to be a problem; these are some of the most basic queries
 WordPress runs. Does the table CHECK out okay? Is the table optimized?

 The issue I see here with going to memcached (which, from an API
 perspective, is totally doable) is that would imply the DB write queries
 are the problem. The limited evidence thus far only points to generic load
 issues. We need to know what's actually causing that load. Is it the FPM
 (?) process that's taking a long time, or is MySQL the bottleneck? Is it
 network lag in hitting the DB server? (Assuming the DB is on a separate
 server.) Is it somehow fetching and invalidating data from memcached
 that's actually somehow slow?

 If you remove the post locking action, are other heartbeat requests
 affected? Are other admin-ajax requests affected? Are other POST requests
 affected? I've heard stories of people having post saves take forever, and
 then a developer goes in and realizes that about fifteen seconds worth of
 queries and calculations are being performed on every save by a plugin,
 entirely accidentally. Is it even *these* requests that are the slow ones,
 or is the issue cache invalidation and then some kind of race condition
 for frontend requests (which is one way traffic could play a role in
 this)?

 A periodic POST request that results in a few select queries and a single,
 simple write query should not in and of itself cause problems. Many
 different avenues to pursue here.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/26050#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform