[wp-trac] [WordPress Trac] #12935: Evolve the URL routing system

WordPress Trac wp-trac at lists.automattic.com
Thu May 27 19:34:27 UTC 2010


#12935: Evolve the URL routing system
--------------------------+-------------------------------------------------
 Reporter:  mikeschinkel  |       Owner:  ryan
     Type:  enhancement   |      Status:  new 
 Priority:  normal        |   Milestone:  3.1 
Component:  Permalinks    |     Version:  3.0 
 Severity:  normal        |    Keywords:      
--------------------------+-------------------------------------------------

Comment(by mikeschinkel):

 I spent all day Wednesday the 26th working on this.  My goal for this
 first step was to write code that lets me specify routes in a new way
 while still generating a list of regular expressions that matches, with
 100% fidelity, the existing list created by a vanilla install of WP 3.0.
 At this stage I'm not worried about code compatibility yet; getting the
 output to be compatible is enough of a first hurdle.

 After a day of coding I had created a `WP_Routes` class and was able to
 get close to 100% fidelity, but the code is still very messy (it's
 recursive and, as anyone who has done complex recursion knows, hard to get
 just right).  I think I may have gotten the code to match URLs with almost
 100% fidelity, but I'm struggling to get it to match the output regular
 expressions with 100% fidelity. The next step will likely be to set up
 code that can compare the rewrite rules' regular expressions with the
 regular expressions output by `WP_Routes` (instead of doing it ''by
 hand'') and see if I can work through the remaining incompatibilities.
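
 A minimal sketch of that comparison, assuming both rule sets are available
 as plain arrays of regular expression strings. `compare_rule_regexes()` is
 a hypothetical helper, and how `WP_Routes` would expose its generated list
 is not settled, so sample data stands in for both sides here:

 {{{
 // Hypothetical helper: list the regexes that appear in only one of the sets.
 function compare_rule_regexes($expected, $actual) {
         return array(
                 // In the existing rewrite rules but not generated by the new code.
                 'missing' => array_values(array_diff($expected, $actual)),
                 // Generated by the new code but absent from the existing rules.
                 'extra'   => array_values(array_diff($actual, $expected)),
         );
 }

 // Sample data only; on a real install the expected list could be taken
 // from array_keys(get_option('rewrite_rules')).
 $expected = array('robots\.txt$', 'feed/(feed|rdf|rss|rss2|atom)/?$');
 $actual   = array('robots\.txt$', '(feed|rdf|rss|rss2|atom)/?$');
 $diff = compare_rule_regexes($expected, $actual);
 }}}

 Empty `missing` and `extra` lists would mean both sets contain the same
 expressions. This sketch does not check rule order, which also matters
 because the first matching rewrite rule wins.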

 More importantly, I want to set up an exhaustive set of unit test cases
 that run each `$_SERVER['REQUEST_URI']` through both the rewrite system
 and my new route system and compare the queries generated by both, as
 those matter more than matching the actual regular expressions. After all,
 one of the problems with regular expressions is that they are almost
 impossible to inspect robustly with code in order to decipher and
 recombine them.  By focusing on path segments we'll get a lot more
 flexibility and a much greater ability to control routing via hooks than
 is currently possible with the rewrite system. For example, here is the
 fragile code I had to write to do something that appeared to the client to
 be a relatively simple change to the URL. I fear it will break when they
 add a plugin, as they are likely to do:

 {{{
 // 'rewrite_rules_array' is a filter, so register with add_filter().
 add_filter('rewrite_rules_array', 'tyc_rewrite_rules_array');
 function tyc_rewrite_rules_array($rules) {
         // Drop every rule whose query is index.php?restaurant-dummy=$matches[1],
         // optionally followed by &paged=$matches[2] or &feed=$matches[2].
         $pattern = '#^index\.php\?restaurant-dummy=\$matches\[1\](\&(paged|feed)=\$matches\[2\])?$#';
         foreach ($rules as $key => $rule) {
                 if (preg_match($pattern, $rule)) {
                         unset($rules[$key]);
                 }
         }
         return $rules;
 }
 }}}

 BTW, the goal of my efforts is not to generate the same regular
 expressions as the existing system but instead to load the same pages
 for the same URLs. The plan is to inspect URL path segments rather than
 match the entire URL path, but if I can get my code to generate a 100%
 fidelity match for the existing rewrite rules then I'll know I'm on the
 right track regarding what is required to specify the routes.

 '''If anyone wants to help me generate that list of test cases that maps
 `$_SERVER['REQUEST_URI']` to resultant WordPress URL-encoded queries it
 will be greatly appreciated.'''
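
 A rough sketch of what such a test harness could look like, assuming both
 systems can be reduced to a table of `regex => query template` pairs in
 the style of the existing rewrite rules. `resolve_request()` and
 `find_mismatches()` are hypothetical names, the matching is a simplified
 stand-in for what WordPress does (no URL-encoding of matched values), and
 the rule tables are sample data:

 {{{
 // Hypothetical: resolve a request URI to a query string using the first
 // rule that matches, or null when no rule matches.
 function resolve_request($request_uri, $rules) {
         // Only the path is matched; drop the query string and outer slashes.
         $path = trim(parse_url($request_uri, PHP_URL_PATH), '/');
         foreach ($rules as $regex => $query) {
                 if (preg_match('#^' . $regex . '#', $path, $matches)) {
                         // Fill each $matches[N] placeholder from the captured groups.
                         return preg_replace_callback('/\$matches\[(\d+)\]/', function ($m) use ($matches) {
                                 $index = (int) $m[1];
                                 return isset($matches[$index]) ? $matches[$index] : '';
                         }, $query);
                 }
         }
         return null;
 }

 // Hypothetical: return the URIs for which the two rule tables disagree.
 function find_mismatches($uris, $old_rules, $new_rules) {
         $mismatches = array();
         foreach ($uris as $uri) {
                 $old = resolve_request($uri, $old_rules);
                 $new = resolve_request($uri, $new_rules);
                 if ($old !== $new) {
                         $mismatches[$uri] = array('old' => $old, 'new' => $new);
                 }
         }
         return $mismatches;
 }

 // Sample data: the "new" table is missing the paged month archive rule.
 $old_rules = array(
         '([0-9]{4})/([0-9]{1,2})/page/?([0-9]{1,})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&paged=$matches[3]',
         '([0-9]{4})/([0-9]{1,2})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]',
 );
 $new_rules = array(
         '([0-9]{4})/([0-9]{1,2})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]',
 );
 $mismatches = find_mismatches(array('/2010/05/', '/2010/05/page/2/'), $old_rules, $new_rules);
 }}}

 With this sample data only `/2010/05/page/2/` is reported, because the
 second table has no rule for it. A real test list would simply be a much
 longer array of URIs.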

 FYI, I'm hosting this conference[1] so I may simply not have enough time
 to do any more work on this until July but if I can I will.

 [1] http://www.thebusinessof.net/wordpress/

 Anyway, here is the code I currently have for specifying (almost)
 compatible vanilla WordPress 3.0 URL routing. It will need to change
 somewhat. I envision it being one of several ways to specify routes, and
 it could also be used behind the scenes for a transitional compatibility
 layer; i.e. functions like `add_rewrite_tag()` and `add_permastruct()`
 could end up calling these in a transitional compatibility mode.  Note
 that I envision (and would prefer) a legacy compatibility mode (which code
 similar to this could support) and then an optional set of routes that
 could be used to clean up the routing complexity that was only required
 because of how the rewrites had to be implemented.

 {{{
 register_query_var('%*%',array('pattern'=>'.*'));
 register_query_var('%?%',array('pattern'=>'.+?'));
 register_query_var('%robots%',array('literal'=>true));
 register_query_var('%attachment%');
 register_query_var('%tb%',array('literal'=>true));
 register_query_var('%withcomments%',array('literal'=>true,'pattern'=>'.*wp-commentsrss2.php'));
 register_query_var('%feed%', array('pattern'=>'(feed|rdf|rss|rss2|atom)'));
 register_query_var('%cpage%', array('pattern'=>'([0-9]{1,})'));
 register_query_var('%s%', array('pattern'=>'(.+)','expand'=>true));
 register_query_var('%author_name%');
 register_query_var('%tag%', array('expand'=>true));
 register_query_var('%ssort%', array('expand'=>true));
 register_query_var('%pagename%', array('post_type'=>'page','expand'=>true));
 register_query_var('%category_name%', array('expand'=>true));
 register_query_var('%year%', array('pattern'=>'([0-9]{4})'));
 register_query_var('%monthnum%', array('pattern'=>'([0-9]{1,2})'));
 register_query_var('%day%', array('pattern'=>'([0-9]{1,2})'));
 register_query_var('%paged%', array('pattern'=>'([0-9]{1,})'));
 register_query_var('%page%', array('pattern'=>'([0-9]+)?'));
 register_query_var('%dummy%', array('pattern'=>'[^/]+'));
 register_query_var('%name%', array('post_type'=>'post'));

 register_query_literal('robots.txt', array('append'=>'robots=1'));
 register_query_literal('trackback', array('append'=>'tb=1'));
 register_query_literal('comments', array('append'=>'withcomments=1'));
 register_query_literal('wp-atom.php',array('append'=>'feed=atom'));
 register_query_literal('wp-commentsrss2.php',array('append'=>'feed=rss2&withcomments=1'));
 register_query_literal('wp-feed.php',array('append'=>'feed=feed'));
 register_query_literal('wp-rdf.php',array('append'=>'feed=rdf'));
 register_query_literal('wp-rss.php',array('append'=>'feed=rss'));
 register_query_literal('wp-rss2.php',array('append'=>'feed=rss2'));

 register_route_group('%%feed%%',array(
         'feed/%feed%',
         '%feed%'));

 register_route_group('%%page%%',array(
   '%page%'));

 register_route_group('%%paged%%',array(
   'page/%paged%'));

 register_route_group('%%feed_paged%%',array(
   '%%feed%%',
   '%%paged%%'),true);

 register_route_group('%%sort_feed_paged%%',array(
   'sort/%ssort%/%%feed_paged%%'),true);

 register_route_group('%%comment_page%%',array(
   'comment-page-%cpage%'));

 register_route_group('%%trackback_feed_comment_page%%',array(
         'trackback',
         '%%feed%%',
         '%%comment_page%%'),true);

 register_route_group('%%trackback_feed_paged_comment_page%%',array(
         '%%trackback_feed_comment_page%%',
         '%%paged%%'),true);

 register_route_group('%%trackback_feed_page_paged_comment_page%%',array(
         '%%trackback_feed_comment_page%%',
         '%%paged%%',
         '%%page%%'),true);

 register_route_group('%%attachment%%',array(
         'attachment/%attachment%'));

 register_route_group('%%attachment_trackback_feed_comment_page%%',array(
         '%%attachment%%/%%trackback_feed_comment_page%%'));

 register_route_path('%year%/%monthnum%/%day%/%%feed_paged%%');
 register_route_path('%year%/%monthnum%/%day%/%%sort_feed_paged%%');
 register_route_path('%year%/%monthnum%/%%sort_feed_paged%%');
 register_route_path('%year%/%%sort_feed_paged%%');
 register_route_path('author/%author_name%/%%sort_feed_paged%%');
 register_route_path('author/%author_name%/%%feed_paged%%');
 register_route_path('tag/%tag%/%%sort_feed_paged%%');
 register_route_path('category/%category_name%/%%sort_feed_paged%%');
 register_route_path('%%sort_feed_paged%%');
 register_route_path('robots.txt');
 register_route_path('%*%wp-atom.php');
 register_route_path('%*%wp-commentsrss2.php');
 register_route_path('%*%wp-feed.php');
 register_route_path('%*%wp-rdf.php');
 register_route_path('%*%wp-rss.php');
 register_route_path('%*%wp-rss2.php');
 register_route_path('%%feed_paged%%');
 register_route_path('comments/%%feed_paged%%');
 register_route_path('search/%s%/%%feed_paged%%');
 register_route_path('%year%/%monthnum%/%dummy%/%%attachment_trackback_feed_comment_page%%');
 register_route_path('%year%/%monthnum%/%dummy%/%name%/%%trackback_feed_paged_comment_page%%');
 register_route_path('%year%/%monthnum%/%dummy%/%%trackback_feed_paged_comment_page%%');
 register_route_path('%year%/%monthnum%/%%comment_page%%');
 register_route_path('%year%/%%comment_page%%');
 register_route_path('%?%/%%attachment_trackback_feed_comment_page%%');
 register_route_path('%pagename%/%%trackback_feed_page_paged_comment_page%%');

 }}}

 One thing I have yet to figure out is how best to specify optional path
 segments. I'm loath to introduce new special characters to the URL
 template, but '''I am considering using square brackets like so and would
 like to get others' input on this:'''

 {{{
 register_route_path('%pagename%/[%%trackback_feed_page_paged_comment_page%%]');
 }}}

 BTW, if it is not obvious, anything surrounded by doubled percent signs
 (e.g. `%%foo%%`) is a macro that expands to one or more URL path suffixes
 (which I am currently calling a "route group", but I am open to a better
 name).

 For example, these:

 {{{
 register_route_group('%%page_suffix%%',array(
         'feed/%feed%',
         '%feed%',
         'page/%paged%',
         'page%paged%',
         ));

 register_route_path('pages/%pagename%/%%page_suffix%%');
 }}}

 would expand to:

 {{{
 register_route_path('pages/%pagename%/feed/%feed%');
 register_route_path('pages/%pagename%/%feed%');
 register_route_path('pages/%pagename%/page/%paged%');
 register_route_path('pages/%pagename%/page%paged%');
 }}}
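
 The expansion shown above can be sketched in a few lines. This is only an
 illustration, not the `WP_Routes` implementation: `expand_route_path()` is
 a hypothetical function, and the groups are passed in as a plain array
 rather than registered through `register_route_group()`:

 {{{
 // Hypothetical: expand every %%group%% macro in a route path into one
 // concrete path per suffix registered for that group.
 function expand_route_path($path, $groups) {
         foreach ($groups as $name => $suffixes) {
                 if (strpos($path, $name) !== false) {
                         $expanded = array();
                         foreach ($suffixes as $suffix) {
                                 // Recurse so that groups nested inside a suffix are expanded too.
                                 $expanded = array_merge(
                                         $expanded,
                                         expand_route_path(str_replace($name, $suffix, $path), $groups)
                                 );
                         }
                         return $expanded;
                 }
         }
         // No macros left: the path is already concrete.
         return array($path);
 }

 $groups = array(
         '%%page_suffix%%' => array('feed/%feed%', '%feed%', 'page/%paged%', 'page%paged%'),
 );
 $paths = expand_route_path('pages/%pagename%/%%page_suffix%%', $groups);
 }}}

 Here `$paths` holds the same four route paths listed above, in the same
 order. A group must not refer to itself, directly or indirectly, or the
 recursion would never terminate.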

 Hopefully you can see why it would be helpful to have an optional
 specifier, i.e.

 {{{
 register_route_path('pages/%pagename%/[%%page_suffix%%]');
 }}}

 could then expand to:

 {{{
 register_route_path('pages/%pagename%/feed/%feed%');
 register_route_path('pages/%pagename%/%feed%');
 register_route_path('pages/%pagename%/page/%paged%');
 register_route_path('pages/%pagename%/page%paged%');
 register_route_path('pages/%pagename%/');
 }}}
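
 One possible reading of the bracket syntax, sketched under the assumption
 that a bracketed group simply adds "no suffix at all" to the list of
 alternatives. `expand_optional_route_path()` is a hypothetical function
 that handles a single bracketed group per path, and the square brackets
 themselves are still only a proposal:

 {{{
 // Hypothetical: expand one optional group such as [%%page_suffix%%] into
 // every suffix of that group plus the path with the suffix left out.
 function expand_optional_route_path($path, $groups) {
         // Find a bracketed macro; leave the path alone if there is none
         // or if the group is unknown.
         if (!preg_match('/\[(%%[a-z_]+%%)\]/', $path, $m) || !isset($groups[$m[1]])) {
                 return array($path);
         }
         $paths = array();
         foreach ($groups[$m[1]] as $suffix) {
                 $paths[] = str_replace($m[0], $suffix, $path);
         }
         // The optional form also matches when no suffix is present.
         $paths[] = str_replace($m[0], '', $path);
         return $paths;
 }

 $groups = array(
         '%%page_suffix%%' => array('feed/%feed%', '%feed%', 'page/%paged%', 'page%paged%'),
 );
 $paths = expand_optional_route_path('pages/%pagename%/[%%page_suffix%%]', $groups);
 }}}

 With the sample group this yields the five paths listed above, ending with
 the bare `pages/%pagename%/`.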

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/12935#comment:25>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

