[wp-trac] [WordPress Trac] #36292: Rewrites: Next Generation

Sun May 15 23:36:29 UTC 2016

#36292: Rewrites: Next Generation
-----------------------------+-----------------------------
 Reporter:  rmccue           |       Owner:  rmccue
     Type:  feature request  |      Status:  assigned
 Priority:  normal           |   Milestone:  Future Release
Component:  Rewrite Rules    |     Version:
 Severity:  normal           |  Resolution:
 Keywords:                   |     Focuses:
-----------------------------+-----------------------------

Comment (by MikeSchinkel):

 Replying to [comment:12 giuseppe.mazzapica]:
 > You'll be surprised that my plugins are the very few open source
 WordPress plugins on the entire web that are tested '''without''' loading
 WordPress. Every WordPress class and function and hook is mocked.

 The problem with that approach is that it assumes all of your mocks
 behave exactly as WordPress would behave. Given the combinatorial number
 of functions and hooks, that is a statistical impossibility, especially
 when you consider that behavior changes with each release of WordPress.

 Yes, what you have is better than not having any testing -- by far -- but
 it is still far from perfect.

 > As best I can tell from what you proposed the callback is not called
 until ''after'' the regex matches, so unless I misunderstand your
 suggestion the matches would still be context free.
 >
 > Maybe your definition of ''"context"'' is different than I intended when
 I used it? The context I meant was that state of the system configuration
 and/or data currently in the system.

 You are correct, you are using your definition of context ''in this
 context (no pun intended.)'' My definition of ''"context"'' here is the
 context of the URL segments when compared to the architected structure of the
 site and of the values persisted in the database.

 For example, with context-free URLs if the URL path is `/about/` then we
 both would probably assume a `$post_type='page'` with a `slug` of `about`.

 However, with the context-sensitive URLs that I am advocating for, what a
 URL represents might not be determinable by merely looking at it. Instead
 the router would need to determine which post or taxonomy term the URL
 represents, or whether it represents data in the database at all.  Maybe
 `/api/` would represent a fully code-generated response.

 > For example, if we want Categories in the URL root and we have a
 `sports` category slug then that URL would route to the Category Archive
 instead of routing to a Page with a URL slug of `sports`.  ''(And with the
 approach I proposed we'd need to make sure users were warned when path
 segments became ambiguous.)''

 Yes, but I am advocating for more than that. For example, what of the URL
 paths `/barack-obama/`, `/obamacare/`, `/google/`, `/mercedes-benz/` or
 `/giuseppe-mazzapica/`?  With only context-free URLs we know that those
 would look up those slugs using `$post_type='page'`.

 But with context-sensitive URLs the URLs above might represent,
 respectively, a person type post, a news category, a company type post, an
 automaker type post and a user account.  It is currently possible to do
 this in WordPress; most of the sites I work on do this.

 However, it is currently a hack to do this in WordPress. '''I think it
 should be built into the core of WordPress to enable this, and then for
 core to optimize the 80% case as best possible.'''
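
 As a rough illustration of the lookup described above, the sketch below
 resolves a single-segment path by asking each object type in turn whether
 it owns the slug. Everything in it is an assumption made for the example:
 the `resolve_path()` function, the `$slugs_by_type` array standing in for
 database lookups, and the IDs. It is not existing WordPress API.

```php
<?php
// Hypothetical in-memory stand-ins for database lookups; a real site
// would query posts, terms, and users instead.
$slugs_by_type = [
    'person'   => ['barack-obama' => 101],
    'category' => ['obamacare' => 7],
    'company'  => ['google' => 202],
    'page'     => ['about' => 2],
];

/**
 * Resolve a single-segment path by checking each object type in the
 * order the array declares them. Returns ['type' => ..., 'id' => ...]
 * for the first type that owns the slug, or null if none does.
 */
function resolve_path(string $path, array $slugs_by_type): ?array
{
    $slug = trim($path, '/');
    foreach ($slugs_by_type as $type => $slugs) {
        if (isset($slugs[$slug])) {
            return ['type' => $type, 'id' => $slugs[$slug]];
        }
    }
    return null; // no object owns this slug; the caller decides what to do
}

$match = resolve_path('/obamacare/', $slugs_by_type);
```

 The declaration order doubles as the priority order, which is one place
 where an ambiguous slug would have to be detected and reported.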

 Yes, in some cases the above would cause slow performance, but so does a
 poorly written SQL query.  Just as the SQL developer should be responsible
 for ensuring their queries are fast or cached, so should the site
 architect/developer be responsible for ensuring that sites that need to
 scale to huge amounts of content either do not have URLs that cause issues
 or optimize those URLs.

 For example, if a site has 1000 categories -- bad practice though it
 probably is -- those category slugs could be loaded into memcache and
 scanned rapidly for a match before moving on to other candidates.
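
 A minimal sketch of that idea, assuming the slugs are cached as a
 slug-to-ID map so that a match is a single hash lookup. The `$cache` array
 is a stand-in for memcache, and `get_category_slug_map()` and the loader
 are hypothetical names used only for this example.

```php
<?php
// Stand-in for an object cache such as memcached (an assumption for this
// sketch; a plain array keeps the example self-contained).
$cache = [];

/**
 * Return a slug => term ID map, building it once via $loader and
 * serving it from the cache on every later call.
 */
function get_category_slug_map(array &$cache, callable $loader): array
{
    if (!isset($cache['category_slugs'])) {
        $cache['category_slugs'] = $loader(); // expensive query runs once
    }
    return $cache['category_slugs'];
}

// Hypothetical loader: 1000 categories named category-1 ... category-1000.
$loader = function (): array {
    $map = [];
    for ($i = 1; $i <= 1000; $i++) {
        $map['category-' . $i] = $i;
    }
    return $map;
};

$map = get_category_slug_map($cache, $loader);
$term_id = $map['category-42'] ?? null; // hash lookup, no scan of 1000 entries
```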

 > In my first comment here
 (https://core.trac.wordpress.org/ticket/36292?replyto=11#comment:4) I
 proposed to provide context when adding hooks (via the introduction of a
 new hook `add_rewrite_rules` that pass an instance of request object)
 '''and''' to pass context route callbacks.

 Certainly, but we were not discussing the same thing, as described above.

 > So, when a route is ''slightly'' different based on context, is probably
 convenient just add it and then handle differences while performing route
 action, which is easy and straightforward if route action is a callback
 that receives context as argument.

 Adding routes in code makes it effectively impossible to get the
 reciprocal unless people write both sets of code, which many people will
 not do ''(or even realize they need to.)''  Better to build a system that allows
 us to find out that `/barack-obama/` is `$post_id=123` '''AND''' that
 `$post_id=123` has a URL of `/barack-obama/`; '''we should be able to
 derive that from the system from declarative information, not with
 hooks.'''
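
 To make the reciprocal lookup concrete, here is a small sketch in which
 both directions are derived from a single declarative table. The `$routes`
 structure and the two helper functions are hypothetical; they only show
 the shape of the idea, not a proposed core API.

```php
<?php
// Hypothetical declarative route table: each entry states which object
// lives at which path. Nothing here is existing WordPress API.
$routes = [
    ['path' => '/barack-obama/', 'post_id' => 123],
    ['path' => '/google/',       'post_id' => 456],
];

// Both lookup directions are derived from the same declaration.
$id_by_path = array_column($routes, 'post_id', 'path');
$path_by_id = array_column($routes, 'path', 'post_id');

/** Forward lookup: URL path to post ID, or null if the path is unknown. */
function path_to_post_id(string $path, array $id_by_path): ?int
{
    return $id_by_path[$path] ?? null;
}

/** Reverse lookup: post ID to URL path, or null if the ID is unknown. */
function post_id_to_path(int $post_id, array $path_by_id): ?string
{
    return $path_by_id[$post_id] ?? null;
}
```

 Because neither direction is written by hand, the two cannot drift apart
 the way a route callback and a separate permalink filter can.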

 > FastRoute is committed to speed, so it reduces features to avoid
 overhead, but I decided that some overhead is an acceptable trade off to
 have those features.
 > ...
 > Consider, first of all, that FastRoute performs a singular
 preg_match_all call to find a match among all rules.

 You avoided my response to your statement that ''"segment chunking will
 absolutely not be an optimization compared to FastRoute"'' which you
 justified by saying ''"Consider, first of all, that FastRoute performs a
 singular preg_match_all call to find a match among all rules."''  And I
 proved your justification to be false because FastRoute chunks arbitrarily
 in groups of 10 URLs which means it can end up running many `preg_match()`
 calls to match a URL, especially for WordPress sites with at least 100
 regex rules and often 200 or more.

 '''Thus I still assert that segment chunking is likely a proper
 optimization over FastRoute ''as it is currently implemented''.'''
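
 The difference can be sketched by counting `preg_match()` calls under the
 two strategies. This is not FastRoute's code: the rule set, the chunk size
 of 10, and both function names are assumptions chosen to mirror the
 numbers above, and the segment buckets would normally be precomputed
 rather than rebuilt on every request.

```php
<?php
// 200 hypothetical rules: 20 first segments with 10 rules each.
$rules = [];
foreach (range(1, 20) as $s) {
    foreach (range(1, 10) as $r) {
        $rules[] = 'section' . $s . '/item' . $r . '/(\d+)';
    }
}

/** Count preg_match() calls when rules are combined in fixed-size chunks. */
function count_calls_fixed_chunks(array $rules, string $path, int $size): int
{
    $calls = 0;
    foreach (array_chunk($rules, $size) as $chunk) {
        $calls++;
        if (preg_match('~^(?:' . implode('|', $chunk) . ')$~', $path) === 1) {
            break; // stop at the first chunk that matches
        }
    }
    return $calls;
}

/** Count preg_match() calls when rules are bucketed by first path segment. */
function count_calls_segment_chunks(array $rules, string $path): int
{
    $buckets = [];
    foreach ($rules as $rule) {
        $buckets[explode('/', $rule, 2)[0]][] = $rule;
    }
    $first = explode('/', $path, 2)[0];
    if (!isset($buckets[$first])) {
        return 0; // no rule starts with this segment, so no regex runs
    }
    preg_match('~^(?:' . implode('|', $buckets[$first]) . ')$~', $path);
    return 1;
}

$path    = 'section20/item10/42'; // matches the last of the 200 rules
$fixed   = count_calls_fixed_chunks($rules, $path, 10);
$segment = count_calls_segment_chunks($rules, $path);
```

 With this worst-case path the fixed-size strategy runs one regex per
 chunk until it reaches the last one, while the segment strategy runs a
 single regex against the one bucket that can match.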

--
Ticket URL: <https://core.trac.wordpress.org/ticket/36292#comment:14>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform