[wp-trac] [WordPress Trac] #7068: cron improvement

WordPress Trac wp-trac at lists.automattic.com
Fri May 30 17:02:57 GMT 2008


#7068: cron improvement
---------------------+------------------------------------------------------
 Reporter:  hailin   |       Owner:  anonymous
     Type:  defect   |      Status:  new      
 Priority:  high     |   Milestone:  2.7      
Component:  General  |     Version:           
 Severity:  normal   |    Keywords:           
---------------------+------------------------------------------------------
 There are several key issues with the current cron implementation.

 1. cron is not atomic.

   Every page load calls wp_cron(), which checks the first timestamp in the
 cron array. If that timestamp has expired, it calls spawn_cron, which calls
 wp-cron.php to fire the jobs.

 This causes serious concurrency problems on a large system with hundreds
 of servers, where millions of page views are generated every day.

 The current method to address this issue is in wp-cron.php:

 if ( get_option('doing_cron') > $local_time )
         exit;

 update_option('doing_cron', $local_time + 30);

 However, this check does not solve the problems that result from concurrency.

 Example:

 On a busy site, in the particular second when the first cron timestamp
 expires, there are 10 blog page loads on 10 different servers.

 Suppose process#1 on server#1 goes first, but before it reaches
 update_option('doing_cron', $local_time + 30), process#2 on server#2
 begins the same sequence.

 Since 'doing_cron' is still being updated by process#1, or the updated
 value has not taken effect yet (due to db or cache delays, usually several
 milliseconds or longer), process#2 will pass the
 if ( get_option('doing_cron') > $local_time ) check and will also call
 update_option('doing_cron', $local_time + 30). So both processes will
 proceed to fire the cron job.
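
 The interleaving above can be replayed deterministically. The following is
 a minimal Python sketch, not WordPress code: the options dict stands in for
 the shared options table, and passes_check, take_lock and
 replay_interleaving are hypothetical names that mirror the two PHP
 statements quoted earlier.

```python
# Deterministic replay of the check-then-set race described in the ticket.
# `options` is a hypothetical stand-in for the shared options table.

LOCK_SECONDS = 30


def passes_check(options, local_time):
    """Mirror of: if ( get_option('doing_cron') > $local_time ) exit;"""
    return not (options.get("doing_cron", 0) > local_time)


def take_lock(options, local_time):
    """Mirror of: update_option('doing_cron', $local_time + 30);"""
    options["doing_cron"] = local_time + LOCK_SECONDS


def replay_interleaving(local_time):
    """Run two processes whose checks both happen before either write."""
    options = {"doing_cron": 0}
    spawned = []

    # Both processes read the option before either one has written it.
    p1_ok = passes_check(options, local_time)
    p2_ok = passes_check(options, local_time)

    if p1_ok:
        take_lock(options, local_time)
        spawned.append("process#1")
    if p2_ok:
        take_lock(options, local_time)
        spawned.append("process#2")
    return spawned


SPAWNED = replay_interleaving(local_time=1000)
print(SPAWNED)
```

 Both processes end up in the list, because the check and the update are two
 separate steps and nothing ties them together.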

 I’ve observed that on a popular blog on a busy production site, ANY cron
 job was executed 5-7 times!  That may be ok for the publish_future_post
 operation, but may not be good for other cron tasks.

 An ideal solution would guarantee that every cron job is executed once and
 only once.
 I can envision storing all cron jobs in a central table, with a daemon
 processing them on ONE PARTICULAR server. Yet this approach may be less
 flexible, as it may not handle blog-specific jobs well.

 A practical solution is to make the cron operation as atomic as possible,
 knowing that we can never make it truly atomic as there will be database
 and cross-data center communication delays.
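
 One way to narrow the window is to fold the check and the update into a
 single conditional UPDATE and let the database decide which caller wins.
 The sketch below is only an illustration under assumptions: it uses SQLite
 so that it runs on its own, the options table layout is simplified, and
 make_store and try_acquire are hypothetical helpers, not WordPress
 functions. Replication or cache delays between servers, as noted above,
 can still weaken the guarantee.

```python
import sqlite3

LOCK_SECONDS = 30


def make_store():
    """Create an in-memory stand-in for the options table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE options (name TEXT PRIMARY KEY, value INTEGER)")
    conn.execute("INSERT INTO options VALUES ('doing_cron', 0)")
    conn.commit()
    return conn


def try_acquire(conn, local_time):
    """Check and set 'doing_cron' in one statement.

    The row is updated only if the stored expiry is not in the future,
    so at most one caller sees rowcount == 1 for a given lock period.
    """
    cur = conn.execute(
        "UPDATE options SET value = ? "
        "WHERE name = 'doing_cron' AND value <= ?",
        (local_time + LOCK_SECONDS, local_time),
    )
    conn.commit()
    return cur.rowcount == 1


conn = make_store()
FIRST = try_acquire(conn, 1000)   # first caller in this second
SECOND = try_acquire(conn, 1000)  # concurrent caller in the same second
print(FIRST, SECOND)
```

 A caller that gets False simply exits, as the existing check in wp-cron.php
 does today.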

 2.  Server timers are not always correct

 Because the cron job condition is tested on every blog page load on every
 server, any server with a bad clock can ruin the cron jobs, causing
 future posts to be published early or never published at all.

 We can build in some protection mechanism to guard against this.
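
 One possible form of such protection is to compare the local clock with a
 reference timestamp before trusting it. This is a rough Python sketch under
 assumptions: the ticket does not say where a reference time would come
 from (the database server's clock is one candidate), and the 60-second
 tolerance and both function names are hypothetical.

```python
MAX_SKEW_SECONDS = 60  # assumed tolerance, to be tuned per deployment


def clock_is_plausible(local_time, reference_time, max_skew=MAX_SKEW_SECONDS):
    """Return True if the local clock is within max_skew of the reference."""
    return abs(local_time - reference_time) <= max_skew


def should_spawn_cron(first_timestamp, local_time, reference_time):
    """Spawn only if the job is due AND the local clock looks sane."""
    if not clock_is_plausible(local_time, reference_time):
        # A server with a bad clock stays out of the cron decision entirely.
        return False
    return first_timestamp <= local_time


# A server whose clock runs far ahead would otherwise publish early.
print(should_spawn_cron(first_timestamp=1000, local_time=5000, reference_time=900))
```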

 3. Minor issue

 Calling time() in multiple places in the cron operation chain can be tricky
 on a busy server, as each call can return a different value if the server
 is overloaded.  Taking the timestamp once at the cron entry point and
 passing it along the chain is logically sound.
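
 As a small illustration of passing one timestamp down the chain, here is a
 Python sketch. The function names are hypothetical and do not correspond to
 the actual WordPress call chain; the point is only that the clock is read
 once and every later step receives that same value.

```python
import time

LOCK_SECONDS = 30


def is_due(first_timestamp, now):
    """Due check that uses the caller's timestamp instead of the clock."""
    return first_timestamp <= now


def lock_expiry(now, lock_seconds=LOCK_SECONDS):
    """Lock expiry derived from the same timestamp as the due check."""
    return now + lock_seconds


def cron_entry(first_timestamp, now=None):
    """Entry point: read the clock at most once, then pass the value on."""
    if now is None:
        now = int(time.time())
    if not is_due(first_timestamp, now):
        return None
    return {"now": now, "lock_until": lock_expiry(now)}


RESULT = cron_entry(first_timestamp=1000, now=1000)
print(RESULT)
```

 Accepting the timestamp as a parameter also makes each step easy to test
 with a fixed value.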

 4. Lack of a central standard time source

 Server clock drift caused by power outages, etc. poses a
 fundamental challenge. Software cannot prevent hardware failure, and can
 only do so much to adapt to such failures.

-- 
Ticket URL: <http://trac.wordpress.org/ticket/7068>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software

