[buddypress-trac] [BuddyPress] #4086: Friends, Groups & Co are handled as 404 & nothing gets indexed by search engines

buddypress-trac at lists.automattic.com
Mon Mar 19 14:10:11 UTC 2012


#4086: Friends, Groups & Co are handled as 404 & nothing gets indexed by search
engines
------------------------------------+------------------
 Reporter:  wpdennis                |       Owner:
     Type:  defect (bug)            |      Status:  new
 Priority:  normal                  |   Milestone:  1.6
Component:  Core                    |     Version:
 Severity:  major                   |  Resolution:
 Keywords:  dev-feedback has-patch  |
------------------------------------+------------------
Changes (by boonebgorges):

 * keywords:  dev-feedback reporter-feedback => dev-feedback has-patch
 * milestone:  Awaiting Review => 1.6


Comment:

 Confirmed. Here's an outline of what's happening.

 When you set permalinks to /%postname%/, a flag called
 `use_verbose_page_rules` is set in `$wp_rewrite`. The purpose of
 `$wp_rewrite->use_verbose_page_rules` is to verify the existence of a
 requested page before returning it as a match in `WP::parse_request()`. WP
 uses the function `get_page_by_path()` to do this verification.
 `get_page_by_path()`, in turn, uses the following logic:

 1) bust the query request up into chunks (so that `/members/admin/`
 becomes 'members' and 'admin')
 2) do a direct query for pages whose `post_name` matches any of the chunks
 3) cycle through the results to see if a page's `post_name` matches the
 last part of the request, as in cases where a URL is
 `http://example.com/boone/is/very/cool/` and the `post_name` is `cool`
 (with `very` as `post_parent`, `is` as `post_parent` of `very`, etc. - the
 standard WP page setup)

 In the case of BP pages, steps (1) and (2) go just fine - WP successfully
 finds our page called `members` - but it fails at step (3). That's because
 `members` is not the *last* chunk of the request, which in turn is because
 we are constructing our URLs in a non-standard way. As a result, the
 function returns false, and `WP::parse_request()` fails to locate a match,
 which is the source of the 404. (BTW, having an attachment or a manually
 created page with `post_name = 'admin'` will trick `get_page_by_path()`
 into returning true, which explains wpdennis's perplexing attachment
 report in the OP.)
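
 To make the failure concrete, here is a small Python model of the three
 steps above. It is only a sketch, not WordPress's actual code: `Page` and
 `get_page_by_path_model` are made-up names, the "database" is a plain
 list, and the `post_parent` ancestry check is assumed away for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    # Hypothetical stand-in for a row in the posts table.
    post_name: str
    post_type: str = "page"


def get_page_by_path_model(path: str, pages: List[Page]) -> Optional[Page]:
    """Toy stand-in for the lookup described above (not WordPress's code)."""
    # (1) Break the request into chunks.
    chunks = [chunk for chunk in path.strip("/").split("/") if chunk]
    if not chunks:
        return None

    # (2) "Query" for pages whose post_name equals any one of the chunks.
    candidates = [page for page in pages if page.post_name in chunks]

    # (3) Accept only a candidate named after the *last* chunk.
    # Assumption: the post_parent ancestry check is left out of this model.
    for page in candidates:
        if page.post_name == chunks[-1]:
            return page
    return None


site = [Page("members"), Page("cool")]

# A standard WP page URL resolves, because `cool` is the last chunk.
standard = get_page_by_path_model("/boone/is/very/cool/", site)

# The BP URL does not: `members` is found in step (2) but is not the last chunk.
bp_url = get_page_by_path_model("/members/admin/", site)

# An unrelated attachment named `admin` makes the same URL resolve.
tricked = get_page_by_path_model(
    "/members/admin/", site + [Page("admin", "attachment")]
)
```

 In this model `bp_url` is `None` while `tricked` is the stray `admin`
 attachment, which mirrors the behaviour reported in the OP.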

 There are a couple of things we could do to fix this.

 a) Manually set `$wp_rewrite->use_verbose_page_rules = false` in our
 catchuri routine, after we have done our own checks to make sure that we
 are, indeed, viewing a BP page. (See 4086.01.diff.) This should be safe,
 since by this point, we have already decided conclusively whether or not
 we'll be displaying BP content; if not, we return
 (`if ( empty( $matches ) ) return false`). The downside is that it seems
 a bit hackish.
 b) Pass a patch upstream to WP that filters the return value of
 `get_page_by_path()`. We can filter it, check to see whether we are
 viewing a BP page (we'll already have this information in the `$bp`
 global, because of the existing catchuri routine), and then return a
 relevant value. This solution has the advantage of actually addressing the
 underlying cause of the discrepancy (namely, that we match page URIs to
 pages in the DB in a different way than WP does). However, we'll have to
 convince the WP team that it's a good idea (we are, after all, "doing it
 wrong"), and it will add about 20 lines of code to our codebase where (a)
 will only add one.
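
 For comparison, both options can be sketched against a toy Python model.
 Every name here (`PAGES`, `BP_PAGES`, `Rewrite`, `catch_uri`,
 `parse_request`, `filtered_page_lookup`, `bp_page_filter`) is a
 hypothetical stand-in rather than a WordPress or BuddyPress API, and the
 filter in (b) does not exist upstream; it is the thing being proposed.

```python
from typing import Callable, List, Optional, Set

# Every name below is a hypothetical stand-in, not a WordPress or BuddyPress API.
PAGES: Set[str] = {"members", "about"}  # post_name values in the toy "database"
BP_PAGES: Set[str] = {"members"}        # pages that BuddyPress has claimed


def split_path(path: str) -> List[str]:
    return [chunk for chunk in path.strip("/").split("/") if chunk]


def page_lookup(path: str) -> Optional[str]:
    """Toy get_page_by_path(): only the last chunk can match a page."""
    chunks = split_path(path)
    return chunks[-1] if chunks and chunks[-1] in PAGES else None


class Rewrite:
    """Toy $wp_rewrite carrying just the flag discussed above."""

    def __init__(self) -> None:
        self.use_verbose_page_rules = True


def catch_uri(path: str, rewrite: Rewrite) -> bool:
    """Option (a): switch off verification once a BP page is confirmed."""
    matches = [chunk for chunk in split_path(path)[:1] if chunk in BP_PAGES]
    if not matches:
        return False
    rewrite.use_verbose_page_rules = False
    return True


def parse_request(path: str, rewrite: Rewrite) -> str:
    """Toy WP::parse_request(): verify the page only while the flag is set."""
    if rewrite.use_verbose_page_rules and page_lookup(path) is None:
        return "404"
    # With the flag off, the rewrite match is trusted without verification.
    return "matched"


Filter = Callable[[Optional[str], str], Optional[str]]


def filtered_page_lookup(path: str, filters: List[Filter]) -> Optional[str]:
    """Option (b): let callbacks adjust the lookup's return value."""
    result = page_lookup(path)
    for callback in filters:
        result = callback(result, path)
    return result


def bp_page_filter(result: Optional[str], path: str) -> Optional[str]:
    """Return the BP page when the stock lookup found nothing."""
    if result is not None:
        return result
    chunks = split_path(path)
    return chunks[0] if chunks and chunks[0] in BP_PAGES else None
```

 In the model, (a) is the single assignment inside `catch_uri`, while (b)
 needs both the upstream hook and a callback on our side, which matches
 the rough size difference noted above.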

 Thoughts?

-- 
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/4086#comment:3>
BuddyPress <http://buddypress.org/>