[wp-hackers] RSS/Atom excerpt and filters

David House dave at xmouse.ithium.net
Sat Jul 3 21:07:34 UTC 2004


Quoting Michel Fortin <michel.fortin at michelf.com>:

> First point: If another plugin does the same thing after mine, the 
> excerpt will be escaped twice and the result won't be too good to look 
> at. Wouldn't it be better if WordPress was applying "htmlentities" by 
> itself?

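So write a function that uses str_replace to turn every &lt; back into < and
every &gt; back into >, and then calls htmlentities on the result. That way,
angle brackets that were already escaped do not get escaped a second time.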
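
A minimal sketch of that idea, for illustration only. The name
escape_excerpt_once is made up here and is not part of WordPress, and the
add_filter call is guarded so the snippet also runs outside WordPress. Note
that it only covers angle brackets: ampersands and quotes in the text would
still be escaped again on a second pass.

```php
<?php
// Hypothetical helper (assumed name, not a WordPress function).
function escape_excerpt_once($text) {
    // Undo any earlier escaping of angle brackets...
    $text = str_replace(array('&lt;', '&gt;'), array('<', '>'), $text);
    // ...then escape exactly once.
    return htmlentities($text);
}

// For comparison, plain htmlentities applied twice mangles the markup:
$twice = htmlentities(htmlentities('<b>Hello</b> world!'));
// $twice is '&amp;lt;b&amp;gt;Hello&amp;lt;/b&amp;gt; world!'

// Register the filter only when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('the_excerpt_rss', 'escape_excerpt_once', 100);
}
```

With this in place, a second plugin running the same filter leaves
'&lt;b&gt;Hello&lt;/b&gt; world!' unchanged instead of escaping it again.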

> Hi, I'm new on this list and I came because of something that may 
> require some developer discussion. I do not use WordPress much myself, 
> but I maintain [PHP Markdown][1], a text formatter tool. The Markdown 
> plugin made by Matt and bundled with WordPress 1.2 is based on my work, 
> which didn't include natively a plugin interface at the time.
> 
> [1]: http://www.michelf.com/projects/php-markdown/
> 
> Now to the main subject...
> It's easy to filter excerpts in RSS and Atom feeds using the 
> "the_excerpt_rss" hook, but I believe there is a problem with the way 
> it works currently. While I'm going to talk about Markdown, everything 
> is also valid for Textile.
> 
> WordPress automatically creates an excerpt from a post when the excerpt 
> field is left empty. This is used in RSS and Atom feeds for the 
> "description" and "summary" tags. But what happens if the excerpt 
> contains HTML? In this case, the HTML needs to be encoded (changed into 
> entities) and with Atom the type and mode attributes need to be set on 
> the summary tag like this:
> 
> 	<summary type="text/html" mode="escaped">
> 		&lt;b&gt;Hello&lt;/b&gt; world!
> 	</summary>
> 
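
A rough sketch of how a template could produce that element, assuming a
hypothetical helper called atom_summary (not a WordPress function). It uses
htmlspecialchars rather than htmlentities, which is a substitution made for
this sketch: htmlspecialchars only emits entities that XML predefines, whereas
htmlentities can emit named HTML entities that an XML parser would not know.

```php
<?php
// Hypothetical helper (assumed name). The attribute values are the ones
// shown in the quoted example.
function atom_summary($html) {
    return '<summary type="text/html" mode="escaped">'
         . htmlspecialchars($html)
         . '</summary>';
}

echo atom_summary('<b>Hello</b> world!'), "\n";
```
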
> So if I want to use Markdown as the filter for the excerpt made 
> automatically from my Markdown-formatted posts, I can do this:
> 
> 	add_filter('the_excerpt_rss', 'Markdown', 6);
> 	add_filter('the_excerpt_rss', 'htmlentities', 100);
> 
> First point: If another plugin does the same thing after mine, the 
> excerpt will be escaped twice and the result won't be too good to look 
> at. Wouldn't it be better if WordPress was applying "htmlentities" by 
> itself?
>
> Second point: that's great, it may work if I take care that no other 
> plugin does the same and I change the Atom template to add 
> the required attributes. So how should I distribute a plugin that does 
> this in a user-friendly manner?
> 
> The answer to the second question is simple: in the current 
> implementation of WordPress, there is only one way to be sure a filter 
> will not prevent the RSS or Atom feed from validating: remove HTML 
> tags! ... like this:
> 
> 	add_filter('the_excerpt_rss', 'Markdown', 6);
> 	add_filter('the_excerpt_rss', 'strip_tags', 100);
> 
> Or, if I am more concerned about forward compatibility:
> 
> 	add_filter('the_excerpt_rss', 'Markdown', 6);
> 	if ($wp_version == 1.2)
> 		add_filter('the_excerpt_rss', 'strip_tags', 100);
> 
> This last solution assumes that the problem will be solved in some 
> way in the next release of WordPress.
> 
> Instead of escaping the text with entities, we could use a CDATA 
> section. This does not help much since we still have to add CDATA block 
> delimiters around the text (instead of escaping) and add the same 
> attributes to the summary tag in Atom. Still, it may be slightly better, 
> since it would only require changes to the templates.
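
The CDATA variant could look roughly like this. The name wrap_cdata is
hypothetical and not part of WordPress. The one catch is that the text itself
must not contain the sequence "]]>", so the sketch splits any occurrence
across two CDATA sections.

```php
<?php
// Hypothetical helper (assumed name, not a WordPress function).
function wrap_cdata($text) {
    // A CDATA section ends at the first "]]>", so break that sequence up:
    // close the section after "]]" and reopen a new one before ">".
    $text = str_replace(']]>', ']]]]><![CDATA[>', $text);
    return '<![CDATA[' . $text . ']]>';
}

echo wrap_cdata('<b>Hello</b> world!'), "\n";
```

The HTML passes through untouched here, which is why only the templates would
need to change.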
> 
> Any ideas?
> 
> Michel Fortin
> michel.fortin at michelf.com
> http://www.michelf.com/
> 
> 



