[wp-trac] [WordPress Trac] #18276: Implement URL Routing System to "Front-End" WordPress' existing Rewrite System
WordPress Trac
wp-trac at lists.automattic.com
Thu Jul 28 07:35:05 UTC 2011
#18276: Implement URL Routing System to "Front-End" WordPress' existing Rewrite
System
--------------------------+------------------------------------
Reporter: mikeschinkel | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Permalinks | Version: 3.2.1
Severity: normal | Keywords: dev-feedback has-patch
--------------------------+------------------------------------
As per [http://wpdevel.wordpress.com/2011/07/27/wordpress-3-3-proposed-scope/#comment-22028 scribu's reply to my comment on "WordPress 3.3
Proposed Scope"], I am attaching a plugin I called ''"WP URL Routes"'' in
the file `wp-url-routes.php`. It is a '''proof-of-concept''' that works as a
plugin today even though it was designed to be integrated into core
''(i.e. it does not use function name prefixes.)''
WP URL Routes uses the `register_query_var()` and `register_url_path()`
functions to build a tree of `URL_Node` objects. Each URL Node ultimately
contains the metadata required to represent one path segment, and the path
`'/'` is the root URL Node. It is designed so that path matching can be
done not only with regular expressions, as in the WordPress Rewrite system,
but also with keyed arrays, database lookups, potentially specific
callbacks, and global hooks. For high-traffic systems WP URL Routes could
be optimized in a variety of ways, for example by using Memcached.
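To make the tree idea concrete, here is a minimal standalone sketch. It is '''not''' the code from the attached plugin: `Sketch_URL_Node`, `sketch_route()` and `sketch_walk()` are hypothetical names invented for this illustration, and only three matcher types (keyed list, regex, callback) are shown.

{{{
// Hypothetical sketch; Sketch_URL_Node is NOT the plugin's URL_Node class.
class Sketch_URL_Node {
    public $type;                 // 'list', 'regex' or 'callback'
    public $value;                // array of slugs, regex string, or callable
    public $query_var;            // query var this node populates, or null
    public $is_endpoint = false;  // true if a path may end at this node
    public $children = array();

    function __construct( $type, $value, $query_var = null ) {
        $this->type      = $type;
        $this->value     = $value;
        $this->query_var = $query_var;
    }

    function add_child( Sketch_URL_Node $child ) {
        $this->children[] = $child;
        return $child;
    }

    // Does this node accept a single path segment?
    function matches( $segment ) {
        switch ( $this->type ) {
            case 'list':
                return in_array( $segment, $this->value, true );
            case 'regex':
                return 1 === preg_match( $this->value, $segment );
            case 'callback':
                return (bool) call_user_func( $this->value, $segment );
        }
        return false;
    }
}

// Returns an array of query vars on a match, or false if nothing matched.
function sketch_route( Sketch_URL_Node $root, $path ) {
    $segments = array_values( array_filter( explode( '/', $path ), 'strlen' ) );
    return sketch_walk( $root, $segments, array() );
}

function sketch_walk( Sketch_URL_Node $node, array $segments, array $query_vars ) {
    if ( empty( $segments ) ) {
        return $node->is_endpoint ? $query_vars : false;
    }
    $segment = array_shift( $segments );
    foreach ( $node->children as $child ) {
        if ( ! $child->matches( $segment ) ) {
            continue;
        }
        $vars = $query_vars;
        if ( null !== $child->query_var ) {
            $vars[ $child->query_var ] = $segment;
        }
        $result = sketch_walk( $child, $segments, $vars );
        if ( false !== $result ) {
            return $result;
        }
    }
    return false; // no child accepted this segment
}

// Usage: a /CATEGORY/POST/ tree with two made-up category slugs.
$root     = new Sketch_URL_Node( 'list', array() );
$category = $root->add_child( new Sketch_URL_Node( 'list', array( 'news', 'reviews' ), 'category_name' ) );
$post     = $category->add_child( new Sketch_URL_Node( 'regex', '#^[a-z0-9-]+$#', 'name' ) );
$post->is_endpoint = true;

$routed = sketch_route( $root, '/news/hello-world/' ); // array( 'category_name' => 'news', 'name' => 'hello-world' )
$failed = sketch_route( $root, '/sports/hello/' );     // false
}}}

A `false` result is the point where a real implementation would hand the request to the existing Rewrite system.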
WP URL Routes is designed to be as easy as possible for a themer to add to
a theme's `functions.php` in an `'init'` hook. Here is what such an
`'init'` hook might look like to define the oft-requested
`/CATEGORY/POST/` URL structure:
{{{
add_action( 'init', 'mysite_url_init' );
function mysite_url_init() {
// Allow categories to be specified at the beginning of a path
register_url_path( '%category_name%/%name%' );
}
}}}
A very important aspect of its architecture is that about the
'''''only''''' thing that the WP URL Routes concept overrides is that it
effectively replaces `$wp->parse_request()` by subclassing the `WP` class
''(which of course WordPress core need not do.)'' It actually might be
easier just to show you the code for the relevant parts of the
`WP_Urls_WP` class rather than try to explain it ''(see below)'':
{{{
/*
 * Extends the WP class in WordPress and assigns an instance to the global $wp variable.
 * Notes: This is needed because WordPress does not (yet?) have a hook for
 * $wp->parse_request() as proposed in trac ticket #XXXXX
 */
class WP_Urls_WP extends WP {
    static function on_load() {
        // 'setup_theme' is 1st hook run after WP is created.
        add_action( 'setup_theme', array( __CLASS__, 'setup_theme' ) );
    }
    static function setup_theme() {
        global $wp;
        $wp = new WP_Urls_WP(); // Replace the global $wp
    }
    function parse_request( $extra_query_vars = '' ) {
        if ( apply_filters( 'wp_parse_request', false, $extra_query_vars ) ) {
            WP_Urls::$result = 'routed';
        } else {
            WP_Urls::$result = 'fallback';
            if ( WP_Urls::$fallback ) {
                parent::parse_request( $extra_query_vars ); // Delegate to WP class
            } else {
                wp_die( 'URL Routing failed.' );
            }
        }
        return;
    }
}
WP_Urls_WP::on_load();
}}}
WP URL Routes is ''currently'' designed to ''"front-end"'' the WordPress
Rewrite system: any URL path patterns defined take precedence over the
standard rewrite system, but if no URL path pattern matches the HTTP
request's URL then the WordPress Rewrite system takes over. A developer
designing a custom CMS with WordPress also has the option to bypass the
WordPress Rewrite system completely on a failed match so they can fully
control the URLs of their site ''(if the WordPress team chooses
to use this proof-of-concept as a base for WordPress 3.3 URL routing they
could conceivably have a "no-backward-compatibility" mode for URL routing
that could be enabled with a constant in `/wp-config.php`.)''
WP URL Routes allows URL paths to be defined using exactly the same URL
path format found on the Permalinks page, e.g. `%year%/%month%/%day%`
''(although I don't think this route is implemented in my plugin
just yet.)'' It also relies heavily on `array_merge()` to allow
several levels of metadata to be merged down to the actual
metadata for each URL Node.
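As a rough illustration of that merging (the three layer names below are hypothetical and not taken from the plugin), later arguments to `array_merge()` override earlier ones for the same string key, so the most specific layer wins:

{{{
// Hypothetical layers, most general first; the names are assumptions for this sketch.
$engine_defaults = array(
    '@multi_segment' => false,
    '@validate'      => null,
);
$query_var_defaults = array(
    '@multi_segment' => true,
    '@validate'      => 'is_valid_page',
    'page'           => '',
    'pagename'       => '%this%',
);
$themer_args = array(
    '@validate' => 'mysite_is_valid_page', // hypothetical themer override
);

// For duplicate string keys the value from the later array replaces the earlier one.
$node_meta = array_merge( $engine_defaults, $query_var_defaults, $themer_args );

// $node_meta is now:
// array(
//     '@multi_segment' => true,
//     '@validate'      => 'mysite_is_valid_page',
//     'page'           => '',
//     'pagename'       => '%this%',
// )
}}}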
Let me illustrate with `%pagename%`. With the existing WordPress Rewrite
system `$wp->query_vars` ends up with `['pagename']` being set to the URL
slug ''(single or multi-path segments)'' and `['page']` is set to `''`
''(this latter is unimportant, but we want to match WordPress' Rewrite
behavior exactly.)'' So here is how we might define the `%pagename%` query
variable and the `%pagename%` path:
{{{
add_action( 'init', 'mysite_url_init' );
function mysite_url_init() {
    register_query_var( '%pagename%', array(
        '@validate'      => 'is_valid_page', // Callback; '@' means don't put into query_vars
        '@multi_segment' => true,            // Because /foo/bar/baz/ is a valid path
        'page'           => '',              // This is to match WordPress' behavior
        'pagename'       => '%this%',        // '%this%' gets replaced by the current path segment
    ));
    register_url_path( '%pagename%' );
}
}}}
But that requires a lot of learning on the themer's part, so I hardcoded
the metadata for `%pagename%` into the function `register_query_var()` in
a `switch-case` statement; here's the `case`:
{{{
case 'pagename':
    $defaults = array(
        '@validate'      => 'is_valid_page', // Callback; '@' means don't put into query_vars
        '@multi_segment' => true,            // Because /foo/bar/baz/ is a valid path
        'page'           => '',              // This is to match WordPress' behavior
        'pagename'       => '%this%',        // '%this%' gets replaced by the current path segment
    );
    break;
}}}
Which means we can simplify it for the themer to be like this:
{{{
add_action( 'init', 'mysite_url_init' );
function mysite_url_init() {
register_query_var( '%pagename%' );
register_url_path( '%pagename%' );
}
}}}
Of course the plugin can check to see if the query variable `'%pagename%'`
found in the path `'%pagename%'` has been registered yet and if not
register it, so our `'init'` hook simply becomes:
{{{
add_action( 'init', 'mysite_url_init' );
function mysite_url_init() {
register_url_path( '%pagename%' );
}
}}}
'''Very easy for the themer, no? Of course, for the person that really
needs power they can build all the metadata from scratch, but the
functions `register_query_var()` and `__register_url_path()` contain all
the default metadata for common query variables and common paths.'''
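For anyone wondering how the `'@'` keys and `'%this%'` in that metadata could end up as query variables, here is a minimal sketch of one way to do it. The function name `sketch_query_vars_from_meta()` is hypothetical and this is not the plugin's actual code; it only assumes the two conventions described in the comments above.

{{{
// Hypothetical helper: turn a node's metadata into query vars for one matched segment.
function sketch_query_vars_from_meta( array $meta, $segment ) {
    $query_vars = array();
    foreach ( $meta as $key => $value ) {
        if ( '@' === substr( (string) $key, 0, 1 ) ) {
            continue; // '@' keys configure the router and never reach query_vars
        }
        // '%this%' is a placeholder for the matched path segment(s)
        $query_vars[ $key ] = ( '%this%' === $value ) ? $segment : $value;
    }
    return $query_vars;
}

$meta = array(
    '@validate'      => 'is_valid_page',
    '@multi_segment' => true,
    'page'           => '',
    'pagename'       => '%this%',
);

$query_vars = sketch_query_vars_from_meta( $meta, 'about/team' );
// $query_vars is array( 'page' => '', 'pagename' => 'about/team' )
}}}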
What you have here is a fully''(?)'' working URL routing engine, but only a
handful of the standard query variables and standard paths have been
defined so far. Remember, this is a '''proof-of-concept''', not something
ready to be included into WordPress core ''(though I'll be happy to help
get it ready for core once I get agreement from the team that it is
wanted.)''
A few other points to argue for this approach:
1. The core code is working today.
2. Building a tree of URL path pattern nodes is a very close fit to the
structure of the URL path that it is modeling. Can we really do better?
3. It is very flexible in its URL matching and does not rely solely on
RegEx.
4. It really fits the WordPress architecture because its entire goal is
to populate `$wp->query_vars` correctly, '''''and nothing more'''''.
5. It integrates with existing WordPress architecture with very few
changes; only URL rewrites are affected and probably only the
lower level hooks.
6. There will be very few hooks that will need to be deprecated, and
warnings can be generated for those hooks when WP_DEBUG is defined.
7. The tree of URL Nodes also provides metadata needed for ''automated
breadcrumb generation'' and for ''automated sitemap generation'' (both for
XML Sitemaps and sitemaps for humans.)
----
To try the attached plugin, copy the `wp-url-routes.php` file into your
site's `/wp-content/plugins/` directory and then activate the ''"WP URL
Routes"'' plugin. Also be sure to `define('WP_DEBUG',true);` in
`/wp-config.php`.
The two URL paths defined for the demo are `'%category_name%/%name%'` and
`'%pagename%'`, so type in any URL that should match one of these and you
should see ''"URL Routing Result: routed"'' displayed in the top left
corner. What follows is the code for the test config you will find at the
bottom of the `wp-url-routes.php` file:
{{{
/*
 * Define OMIT_URL_ROUTES_TEST_CONFIG if you want to omit this test configuration.
 */
if ( ! defined( 'OMIT_URL_ROUTES_TEST_CONFIG' ) ) {
    add_action( 'init', '_wp_url_routes_test_config' );
    function _wp_url_routes_test_config() {
        register_url_path( '%category_name%/%name%' );
        register_url_path( '%pagename%' );
    }
}
}}}
I'd love to see this used as a base to finally ''"fix"'' the URL routing
in WordPress to make it performant and to provide full flexibility. Again,
it's a proof-of-concept so it is just a starting point and we can evolve
it significantly if needed. However, if the team is not interested I'll be
publishing this on wordpress.org sometime in the next 90-120 days, albeit
with a different name. But it really needs to be integrated with core and
not be a plugin in order to provide full value to everyone who needs
something like this.
--
Ticket URL: <http://core.trac.wordpress.org/ticket/18276>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software