[wp-hackers] RSS/Atom excerpt and filters

David House dave at xmouse.ithium.net
Sat Jul 3 22:03:30 UTC 2004


Quoting Stephen O'Connor <steve at stevarino.com>:

> What happens when the author includes escaped HTML code in the entry, as
> many authors on this list do? This could make things a whole lot worse. (I
> can't stand working with character encoding... ew)

I don't follow. Any escaped angle brackets in the original excerpt would be
converted back into their actual characters, then escaped again. We would
have to convert any &amp;lt; and &amp;gt; that htmlentities produces back into
&lt; and &gt;, but that's no problem: there is no net loss of data. Even if
there were another function (call it foo), say in a later version of WP, that
escaped < and >, the process would look like this:

* Original excerpt with &lt; and &gt;
* Foo called, excerpt still with &lt; and &gt;
* str_replace part of our function called, searches for < and >, finds none,
excerpt still contains &lt; and &gt;
* htmlentities converts &lt; and &gt; into &amp;lt; and &amp;gt;
* second str_replace converts &amp;lt; into &lt; and &amp;gt; into &gt;
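
As a rough sketch, the steps above could be wrapped into one function. The name
escape_excerpt_once is made up for illustration, and the ENT_QUOTES flag and
UTF-8 charset are my assumptions, not anything WP dictates:

```php
<?php
// Hypothetical helper: escape an excerpt so that angle brackets end up
// escaped exactly once, whether or not they were already escaped.
function escape_excerpt_once($text) {
    // Step 1: turn already-escaped brackets back into literal characters.
    $text = str_replace(array('&lt;', '&gt;'), array('<', '>'), $text);

    // Step 2: escape everything (flags and charset are assumptions).
    $text = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Step 3: safety net, collapse any double-escaped bracket that
    // slipped past step 1 back to a single level.
    $text = str_replace(
        array('&amp;lt;', '&amp;gt;'),
        array('&lt;', '&gt;'),
        $text
    );

    return $text;
}

// escape_excerpt_once('a < b')    returns 'a &lt; b'
// escape_excerpt_once('a &lt; b') returns 'a &lt; b'
```

For angle brackets, feeding the output back in gives the same string, so it
does no harm if some foo has already escaped them beforehand.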

If we start getting more complicated entities, like &amp;amp;lt;, then we could
use a preg_replace instead of a second str_replace:

$text = preg_replace('/&(?:amp;)+(lt;|gt;)/', '&$1', $text);
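
To illustrate the idea, here is a small sketch that collapses any number of
stacked amp; prefixes in front of lt; or gt; down to a single entity. The
function name is made up, and the pattern is just one way to write it:

```php
<?php
// Hypothetical helper: reduce &amp;lt;, &amp;amp;lt;, ... to &lt;
// (and likewise for gt), leaving every other entity untouched.
function collapse_escaped_brackets($text) {
    // (?:amp;)+ matches one or more stacked "amp;" prefixes;
    // $1 is the captured "lt;" or "gt;" that follows them.
    return preg_replace('/&(?:amp;)+(lt;|gt;)/', '&$1', $text);
}

// collapse_escaped_brackets('&amp;amp;lt;b&amp;gt;') returns '&lt;b&gt;'
// collapse_escaped_brackets('AT&amp;T')              returns 'AT&amp;T'
```

Only ampersands that lead into lt; or gt; are touched, so an ordinary &amp;
elsewhere in the excerpt stays as it is.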


> > So make a function that str_replaces all &lt; into < and all &gt; into >,
> > before calling htmlentities on it.
> 
> Perhaps a "best practice" would be to parse $wp_filter for the existence of
> htmlentities. It would only work if everyone agreed on it, but it's a
> solution you can use today.



