[wp-hackers] XHTML Strict Mode

Scott Reilly coffee2code at scottreilly.net
Tue Aug 3 16:31:08 UTC 2004


[List admins: go ahead and kill my previous message that is sitting in the 
queue; I sent it via the wrong e-mail address...]

[Before I begin, to get everyone who is interested on the same page: Jamie is 
referring to two posts I made about the fix to balanceTags().

My analysis of the various problems with the function:
http://www.coffee2code.com/archives/2004/08/02/examining-balancetags/

An explanation of my patch:
http://www.coffee2code.com/archives/2004/08/03/fixing-balancetags/
]

On Tuesday 03 August 2004 06:23 am, Jamie Talbot wrote:
>
> if($tag != 'br' && $tag != 'img' && $tag != 'hr' && $tag != 'input' &&
> $tag != '')
>
> seems a little strange.  I take it this is to ignore self-closing
> tags?  It works as it is, but what about future self closing tags, or
> tags you haven't thought of?  A better way IMHO would be to check for
> the self closing-ness directly, - something like...
>
> if (substr($tag, -1) == "/")
>
> or something similar.  This would make it future proof, without having
> to add to the list.
>

True; one of the things I was trying to do was minimize my footprint in the 
function, especially since it has the potential to mangle user text if not 
done properly.  However, the preg_match() does not include the "/" of a 
self-closing tag as part of the match, so you'd have to modify the matching 
pattern and then add even more code to take that into account, which adds up 
to more risk.
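
To illustrate the point about the pattern, here is a minimal sketch.  Both 
patterns below are made up for this example (they are not copied from 
balanceTags()), and is_self_closing() is a hypothetical helper:

```php
<?php
// Hypothetical name-only pattern, for illustration; NOT the real balanceTags() regex.
// Group 1 is an optional leading "/" plus the tag name, group 2 is everything else.
$name_only = '/<(\/?\w*)\s*([^>]*)>/';
preg_match($name_only, '<br />', $m);
$tag_name = $m[1]; // "br": the trailing "/" ends up in group 2, not in the tag name

// Hypothetical helper: a modified pattern that captures the trailing "/" on its own,
// so self-closing tags can be detected without a hard-coded list of tag names.
function is_self_closing($tag_markup) {
    if (!preg_match('/<(\/?\w+)([^>]*?)\s*(\/?)>/', $tag_markup, $m)) {
        return false; // not a tag at all
    }
    return $m[3] === '/';
}
```

With a name-only match, substr($tag, -1) == "/" would never be true, which is 
why the pattern itself (and the code reading its groups) would have to change.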

> You mention it could fix the 'more' tag problem easily.  Not sure this
> is the case.  I presume (tsk tsk) that you would just pop the stack
> before the more tag?  I think the only way it can be done at post time
> is to make a store of the unbalanced tags somewhere in the db.
> Otherwise, the inserted tags would break the structure when the entire
> post was displayed.  

Actually, that is not necessary.  balanceTags() is currently only called to 
process data on its way into the db.  To fix 'more', you just have to call it 
against the pre-'more' text on its way out of the db (when retrieved and 
parsed by get_the_content()) ONLY when it's just the pre-'more' text being 
returned.  It'll balance whatever lies before the 'more', but it doesn't 
affect the actual post content, and won't be called when the whole post is 
retrieved, so no structure will be broken.  I've written it up here:
http://www.coffee2code.com/archives/2004/08/03/patch-balancing-pre-more-tags/
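
Roughly, the idea looks like the sketch below.  This is not the patch itself: 
close_open_tags() is a deliberately simplistic, hypothetical stand-in for 
balanceTags(), and sketch_get_content() is a hypothetical stand-in for the 
relevant part of get_the_content().

```php
<?php
// Hypothetical stand-in for balanceTags(): appends closing tags for anything
// still open at the end of $text. The real function does far more than this.
function close_open_tags($text) {
    $singular = array('br', 'img', 'hr', 'input'); // tags that never need closing
    $stack = array();
    preg_match_all('/<(\/?)(\w+)[^>]*>/', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $tag = strtolower($m[2]);
        if (in_array($tag, $singular)) {
            continue;
        }
        if ($m[1] === '/') {
            // Closing tag: pop only if it matches the most recently opened tag.
            if (end($stack) === $tag) {
                array_pop($stack);
            }
        } else {
            $stack[] = $tag;
        }
    }
    // Close whatever is still open, innermost first.
    while ($stack) {
        $text .= '</' . array_pop($stack) . '>';
    }
    return $text;
}

// Hypothetical stand-in for the 'more' handling in get_the_content().
function sketch_get_content($content, $teaser_only) {
    $parts = explode('<!--more-->', $content, 2);
    if ($teaser_only && count($parts) > 1) {
        // Balance ONLY the teaser on its way out; the stored post is untouched.
        return close_open_tags($parts[0]);
    }
    // Whole post requested: return it without any balancing at this stage.
    return implode('', $parts);
}
```

For a post like '<p>Intro <em>text<!--more--> continues</em> here.</p>', the 
teaser comes back with the <em> and <p> closed, while the full post comes back 
with its original structure intact.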

> The alternative option is to divorce the 'more' 
> tag problem from post-time checking and do it at display-time which I
> originally thought was a good idea, but now am not so sure of.  More
> at mosquito bug 178.
>
> Balance Tags should also probably remove multiple instances of
> the 'more' tag as it causes any text after the second tag to not be
> displayed in "get-the-content" due to the explode(content) line.
> Mosquito bug 113.
>

Since get_the_content() is already the one responsible for parsing post 
content and splitting out the 'more', I'd make the change there.  In fact, 
this is a pretty easy fix.  The line that explodes <!--more-->:

$content = explode('<!--more-->', $content);

Just make it:

$content = explode('<!--more-->', $content, 2);

The 'limit' arg to explode() was introduced in PHP 4.0.1, and the WP site says 
we support 4.1, so that should do it.  I've made a patch for this also.
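
A quick sketch of the difference, using a made-up post body:

```php
<?php
// Made-up post content containing two 'more' tags.
$content = 'Intro<!--more-->Middle<!--more-->End';

// Without a limit the post is split into three pieces, so code that only
// looks at the first two pieces never sees 'End'.
$unlimited = explode('<!--more-->', $content);

// With a limit of 2 the split happens at the first 'more' only; the rest of
// the post, including the second 'more' tag, stays together in piece two.
$limited = explode('<!--more-->', $content, 2);
```

The second <!--more--> is left in the text, but since it is an HTML comment 
the browser won't display it.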

> Like you say, this doesn't sort out wp-autop problems.  I was thinking
> more along the lines of taking all the formatting code (balance tags,
> wp-autop, 'more' handling, etc..) and tying them all together in one
> go.  It makes sense to me at least to have all the code to format text
> in one place, even if the only reason is future manageability.  This
> would involve a substantial rewrite though, with alterations to a
> couple of files.  Your code could easily be used as a base for this.
>
> Will go away and look again at this, and try to get back to you with
> feedback tomorrow. (Japan time!)
>
> Jamie.


Let me know if you were able to "break" balanceTags()!


-Scott
http://www.coffee2code.com


