[wp-hackers] XHTML Strict Mode

Jamie Talbot wphackers at jamietalbot.com
Thu Aug 5 05:40:54 UTC 2004


I've released a plugin that implements better tag balancing and 
addresses the three concerns I raised below.  If people could take a 
little time to beta test it, that would be great.  You should disable 
WP's own tag balancing first, as the two will probably conflict.  The 
output is still some way from actually validating, but I think it's an 
improvement on what there is at the moment.

Plugin: http://www.jamietalbot.com/counterbalance.zip
Source: http://www.jamietalbot.com/counterbalance.phps

To save you reading all the way down, the three concerns:

* Inflexibility toward future (or overlooked) self-closing tags.
* Regenerating the same balanced output every time a post is 
displayed.
* Correct balancing of something like <b><i>Wrong</b></i>

Functionality:

* Only runs when a post is published or edited, saving redundant 
processing.
* Ignores comments and self-closing tags.
* Correctly nests tags closed in the wrong order.
* Removes closing tags that were never opened.
* Never deletes user text, only adds to it.
* Attempts to intelligently handle <p> tags, preventing nesting, but 
not cutting too early.
* Allows for multiple nested divs.
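
For anyone who wants the general idea without opening the source, here is a minimal sketch of stack-based balancing covering a few of the points above (comments and self-closing tags passed through, wrong-order closes re-nested, unopened closing tags dropped).  It is not the Counterbalance code: the function name sketch_balance_tags() is made up for illustration, it assumes self-closing tags are written in the XHTML "/>" form, and it does none of the <p> handling.

```php
<?php
// Illustrative sketch only; sketch_balance_tags() is a hypothetical name,
// not a function from the plugin or from WordPress.
function sketch_balance_tags(string $text): string
{
    // Split into comments, tags and plain text, keeping the delimiters.
    $tokens = preg_split(
        '/(<!--.*?-->|<[^>]*>)/s',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    $stack = [];  // names of the currently open tags, innermost last
    $out = '';

    foreach ($tokens as $token) {
        // Comments, plain text and anything that does not look like a
        // tag are copied through untouched.
        if (strpos($token, '<!--') === 0
            || !preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>$/', $token, $m)) {
            $out .= $token;
            continue;
        }
        $name = strtolower($m[2]);
        if ($m[3] === '/') {
            // Self-closing tag such as <br />: nothing to balance.
            $out .= $token;
        } elseif ($m[1] === '') {
            // Opening tag: remember it.
            $stack[] = $name;
            $out .= $token;
        } elseif (in_array($name, $stack, true)) {
            // Closing tag: first close anything opened inside it.
            while (($open = array_pop($stack)) !== $name) {
                $out .= "</$open>";
            }
            $out .= $token;
        }
        // Otherwise: a closing tag that was never opened, so it is dropped.
    }

    // Close whatever is still open, innermost first.
    while ($stack) {
        $out .= '</' . array_pop($stack) . '>';
    }
    return $out;
}
```

With this sketch, <b><i>Wrong</b></i> comes out as <b><i>Wrong</i></b>: the </i> is emitted early to keep the nesting valid, and the late </i> is dropped because nothing is open by then.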

Comments, bug reports and anything else are welcome; submission info 
is in the plugin file.

There is also a post in the support forum:

http://wordpress.org/support/10/10430

Jamie.

> > True; I guess one of the things I was trying to do was to minimize my
> > footprint into the function.  Especially since this thing has the
> > potential to mangle user text if not done properly.  However, the
> > preg_match() does not include the "/" for singular tags as part of
> > the match, so you'd have to modify the matching pattern, and then
> > even more code to take that into account, which adds up to more risk.
> > 
> 
> Personally, I think this is a better solution, which if coded properly
> wouldn't (shouldn't!) be dangerous.  But it could easily be altered
> later if required, right?
> 
> 
> > Actually, that is not necessary.  balanceTags() is currently only
> > called to process data on its way into the db.  To fix 'more', you
> > just have to call it against the pre-'more' text on its way out of
> > the db (when retrieved and parsed by get_the_content()) ONLY when
> > it's just the pre-'more' text being returned.  It'll balance whatever
> > lies before the 'more', but it doesn't affect the actual post
> > content, and won't be called when the whole post is retrieved, so no
> > structure will be broken.  I've written it up here:
> > http://www.coffee2code.com/archives/2004/08/03/patch-balancing-pre-more-tags/
> 
> Had a look at that.  Nifty. :D  Of course, this still causes
> balancetags to be executed every time a post is displayed.  For
> multiple posts on one page, lots of viewers, page refreshes etc, this
> could add up to quite a lot of work.  The post text is unchanging
> until someone edits it, so the rebalancing on each occasion is
> redundant.  I still think a better solution is to call balancetags one
> time at post/edit time and save the output to a custom field.  This
> could easily be achieved using a few filters.  This statically created
> closing tag list could then be incorporated into the content using
> your own get custom field plugin!  What do you think?
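
To make the "balance once, reuse on display" idea concrete, here is a rough sketch in plain PHP.  Everything in it is assumed for illustration: the class TeaserCloserCache and its methods are hypothetical, a private array stands in for the custom field, and the closer calculation is deliberately naive (it treats every non-closing tag as an opener).  In WordPress the two methods would be wired to the save and display filters instead of being called by hand.

```php
<?php
// Hypothetical sketch: none of these names exist in WordPress.
final class TeaserCloserCache
{
    private $closers = [];     // stand-in for a per-post custom field
    public $balanceCalls = 0;  // counts how often balancing actually runs

    // Work out which closing tags the text before 'more' is missing.
    private function missingClosers(string $html): string
    {
        $this->balanceCalls++;
        preg_match_all('/<(\/?)([a-z][a-z0-9]*)\b[^>]*>/i', $html, $tags, PREG_SET_ORDER);
        $stack = [];
        foreach ($tags as $tag) {
            $name = strtolower($tag[2]);
            if ($tag[1] === '') {
                $stack[] = $name;              // opening tag
            } elseif (end($stack) === $name) {
                array_pop($stack);             // matching closing tag
            }
        }
        $closers = '';
        while ($stack) {
            $closers .= '</' . array_pop($stack) . '>';
        }
        return $closers;
    }

    // Runs at post/edit time: compute the closing tag list once and store it.
    public function onSave(int $postId, string $content): void
    {
        $teaser = explode('<!--more-->', $content, 2)[0];
        $this->closers[$postId] = $this->missingClosers($teaser);
    }

    // Runs at display time: append the stored closers, no rebalancing.
    public function teaser(int $postId, string $content): string
    {
        $teaser = explode('<!--more-->', $content, 2)[0];
        return $teaser . ($this->closers[$postId] ?? '');
    }
}

$cache = new TeaserCloserCache();
$post = '<p><b>Intro<!--more-->rest</b></p>';
$cache->onSave(1, $post);
$first = $cache->teaser(1, $post);
$second = $cache->teaser(1, $post);
```

However many times teaser() is called, balanceCalls stays at 1 until the post is saved again, which is the whole point of storing the closing tag list.
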
> 
> > Since get_the_content() is already the one responsible for parsing
> > post content and splitting out the 'more', I'd make the change
> > there.  In fact, this is a pretty easy fix.  The line that explodes
> > <!--more-->:
> > 
> > $content = explode('<!--more-->', $content);
> > 
> > Just make it:
> > 
> > $content = explode('<!--more-->', $content, 2);
> > 
> > The 'limit' arg to explode() was introduced in PHP 4.0.1, and the WP
> > site says we support 4.1, so that should do it.  I've made a patch
> > for this also.
> 
> Nice and easy.  Was thinking *way* too much about that one!
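
For reference, a small standalone example of what the limit argument changes.  The sample post text is made up; only explode() itself is real.

```php
<?php
// Made-up post content containing two 'more' markers.
$content = 'Intro<!--more-->Middle<!--more-->End';

// Without a limit, every marker splits the text.
$unlimited = explode('<!--more-->', $content);
// ['Intro', 'Middle', 'End']

// With a limit of 2, only the first marker splits; the rest of the
// post, including any later markers, stays together in element 1.
$limited = explode('<!--more-->', $content, 2);
// ['Intro', 'Middle<!--more-->End']
```

So with the limit in place, the teaser is always element 0 and everything after the first marker is always element 1, however many markers the post contains.
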
> 
> > Let me know if you were able to "break" balanceTags()!
> 
> The only thing I found was the issue you document yourself - that of
> multiple inline unbalanced tags being balanced all at the end instead
> of individually.  However, it would be difficult to second-guess what
> the user wanted...
> 
> All in all, good stuff!  Let me know what you think re the ongoing 
> more tags saga :D
> 
> Jamie
> 
> --
> http://www.jamietalbot.com
> 

-- 
http://www.jamietalbot.com/


